
For a lot of you guys, I think at B-Sides a couple of years ago in San Francisco, they even had a rage quit event where whoever wanted to could get on stage and complain about their job. So it's not easy, right? And I always try to think about why it's not easy. Why are the things that we're doing so tedious
and uncomfortable and unnatural? In the end, what it boils down to is that the data we get is mostly logs, and we're supposed to get insights out of them somehow. So how does that happen? How do we get from this overload of crappy data to finding the bad guy, finding the threat, being able to complete our investigation? That's worth thinking about, right? It's worth thinking about why we're doing these things the way that we're doing them. And one of the main reasons we're relying on logs so much is because that's kind of all we have. I mean, we can do data dumps from
Active Directory. We can correlate them maybe with some dumps from an asset management database. Maybe, if we're really cool, we have a robust enough endpoint solution that will give us some sort of documentation about the systems and what's running on them, so we can correlate that information with the information in the logs. But in reality, I've been in the log management space for a very long time. Both Rafael and I used to work at HP ArcSight, maybe at slightly different times. Oh, sorry, he doesn't like to talk about it. He's recovering as well. And besides the scaling problem for the logs themselves, you also have a problem of context, right? You have to have a
system that's not only fast enough and flexible enough and intuitive enough to let you analyze billions of events per day, but one that also correlates or joins that information with a lot of these other things that matter. Because logs are just half of the picture, and we don't think about that hard enough. Security vendors don't do us justice by not giving us the information we need in a way that's easy for us to process. So that's what I mean by "what's in the box" and "who the hell knows." So let's take one example from this, let's call it a log management solution,
which I'm sure some of you guys are very familiar with. I have obscured the logos, but everybody knows it's Splunk. And let's see what's going on here. This is actually an alert from a threat management tool that a lot of organizations are probably paying big money for, to receive these indicators of compromise, the hostile entities, and the connections between them. That allows you to correlate that information with the information in your logs: outgoing traffic, network connections, proxy, whatever.
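To make concrete what that correlation amounts to, outside of any particular product, here's a minimal sketch in Python. The file names and field names are made up for illustration; they're not from Splunk or any real threat feed.

```python
import csv

# Hypothetical inputs: an IOC feed and an export of proxy logs.
# File names and field names are illustrative only.
with open("threat_intel.csv") as f:
    bad_destinations = {row["indicator"] for row in csv.DictReader(f)}

with open("proxy_logs.csv") as f:
    hits = [row for row in csv.DictReader(f)
            if row["dest_host"] in bad_destinations]

# All this tells you is which internal IPs talked to a known-bad host;
# everything about *what* those IPs are is still missing.
for row in hits:
    print(row["timestamp"], row["src_ip"], "->", row["dest_host"])
```

Note that the output is exactly the complaint that follows: a bare list of IP addresses with no context attached.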
And so as a result, first of all, can we talk about the query in there? I consider myself somewhat technical. I'm definitely not a developer. But that's a steep learning curve just to be able to put that query together. I hope you guys agree with that. On the other hand, sure, it could be a saved search or something like that. But the fact that getting something as simple as "which of my systems went to a hostile website, or a phishing website, or a known command and control server" takes all that, there's something wrong with our tools. They're just not good enough. This is not OK. And now, after I
finagle this query and I get these results, voila, I am blessed with a list of IP addresses, each of which
I have to go and freaking investigate. All I get is an IP address. I have no idea what's there. Is it a server? Is it a desktop? Is it somebody's cell phone? Who owns it? What's on it? Which users are on it? What is it doing? What is it supposed to be doing? What is its role? Which part of my gigantic network is it running in? Those are all the questions I have to answer about each one of those things. There's something wrong with that. Why don't we have a system that gets us those answers faster? And not only that: if I were lucky enough to find a real incident, if
any one of those checks out to be true, and hopefully they do, because you're paying a lot of money for that threat intel, now you have to kick off a full-blown investigation where not only do you have to investigate things about those IP addresses, but literally every single block in the kill chain, and I understand some of you guys might not like that word, in the history of the attack. From the very beginning: how they got in, who clicked on what, how did they get infected, what got infected, all of the lateral movement, all of the C&C stuff, all of the exfiltration, all of the weaponization. Every single one of them. I am
not jealous. I don't want that job. I know Rafael was talking about you guys, and it's an awesome space to be in, and I agree, but I also don't wish that on anybody. So this is how I imagine life in the SOC. We get alerts, and they're just crap. They're imperfect. They make a lot of assumptions. And as Samuel L. Jackson said in The Long Kiss Goodnight, when you make an assumption, you make an ass out of "u" and "umption." And I think that's part of the problem with a lot of our approaches: we cannot not make assumptions. We simply cannot afford to. Otherwise, we'd be completely dead in the water. It's
very difficult to conduct an investigation using facts alone, because you don't have enough facts. You have to start somewhere. You have to have a tool that will tell you: it's probably this, although it's possible that it's not.
And don't even get me started on machine learning. I would still like to see a successful practical implementation of machine learning outside of anomaly detection. So if you guys, in your experience, have seen one, worked with one, created one, I would love to talk to you. Anomaly detection is neat because it builds a baseline over time, fits things into patterns, and then looks for deviations from those baselines. To me, that's not intelligence. That's not artificial intelligence. To me, that's statistics. Some controversial statements there, for sure. And there's nothing wrong with statistics. Statistics are awesome, but let's not call it AI. So that's one problem. Another problem is that
algorithms are basically built by mathematicians and implemented by developers who understand that math. Most of us are not mathematicians.
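To make the baseline-and-deviation point concrete, here's roughly what that style of anomaly detection boils down to in its simplest form: a minimal sketch in Python, not any vendor's actual algorithm.

```python
import statistics

def anomalies(values, window=30, threshold=3.0):
    """Flag points more than `threshold` standard deviations away
    from the mean of the preceding `window` points."""
    flagged = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mean = statistics.mean(baseline)
        stdev = statistics.stdev(baseline)
        if stdev and abs(values[i] - mean) / stdev > threshold:
            flagged.append((i, values[i]))
    return flagged

# e.g. logins per hour: a quiet baseline, then a spike at the end
counts = [12, 9, 11, 10] * 10 + [95]
print(anomalies(counts))  # only the spike gets flagged
```

That's the whole trick: a rolling baseline and a threshold on deviation. Useful, but it's statistics, not intelligence.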
So when an algorithm finds something for you, unlike a rule or a signature, it's very difficult to go and validate it. Was the algorithm wrong? How do I make it more right? Those things are much more removed from us than good old SIEM rules: if this, then that. So it actually makes our job more difficult, not easier. Now, it will find some things, and maybe in combination with some other things it will build a picture. But it's not a silver bullet by any means. So to me, it all boils down to this. We have complex environments. We do. Cloud, hybrid, on-prem, virtualization, and the containerization wave is coming next. If you guys haven't had to deal with that,
you will very soon. Docker, Kubernetes. It just makes things more advanced and more complicated.
And that's not even talking about the standard information security stack that you need to deal with. You need to understand everything from a memory dump all the way up to the business function of the host you're taking the memory dump from. Does it contain crown jewels? Is it something that attackers would go after? Is it a customer database? Is it the code that your company has developed? Or is it something that somebody spun up in the lab and is going to shut down in 20 minutes anyway, so it's not worth your time? That's an important distinction. And nothing in the logs will tell you that.
So we need something better. We need a better approach. And that's why I'm actually excited to talk to you guys about this, because I think it could really be a game changer. What I'm talking about is the semantic graph, and the particular application of it, semantic reasoning. But before we dive into that, I just wanna do a shout out to data modeling.
In my experience, and again, I love to be proven wrong, but in my experience, data modeling is still an unsolved problem in InfoSec. And the reason for that is you have hundreds of vendors, you have hundreds of thousands of data formats and message formats, and you simply cannot build those mappers fast enough. Now, to the credit of software like Splunk, and this is one of the reasons I actually do like Splunk, I do, they have gotten the data modeling part right. There is support for the CIM standard.
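Data modeling here just means mapping every vendor's field names onto one common schema. A toy sketch of what a CIM-style mapper has to do, with two invented vendor formats:

```python
# Illustrative only: two made-up vendor formats mapped onto common,
# CIM-style field names. Real mappers are just this, times thousands.
MAPPERS = {
    "vendor_a": lambda e: {
        "src_ip": e["SourceAddress"],
        "dest_ip": e["DestAddress"],
        "action": e["Disposition"].lower(),
    },
    "vendor_b": lambda e: {
        "src_ip": e["s_ip"],
        "dest_ip": e["d_ip"],
        "action": {"1": "allowed", "0": "blocked"}[e["permit"]],
    },
}

def normalize(vendor, event):
    """Map a raw vendor event onto the common schema."""
    return MAPPERS[vendor](event)

print(normalize("vendor_b", {"s_ip": "10.0.0.5", "d_ip": "8.8.8.8", "permit": "0"}))
```

The unsolved part isn't writing one of these; it's keeping hundreds of thousands of them current.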
I'm a bit of a groupie of theirs. They're trying to solve some problems around OWL by having an alternative ontology and schema format. They also blog a lot, they're happy to share their findings, and their core is open source, so you can contribute to it and you can take it and run with it as well. If you're looking for a backend, make sure it's basically anything that supports the latest TinkerPop. TinkerPop is kind of like a set of drivers that lets you do really, really cool things without caring what the database is. So the database could be something really scalable like Cassandra or Elastic or HBase, whatever you like. But TinkerPop is the key. Query
languages: Cypher for Neo4j, Gremlin for TinkerPop. Data formats: JSON-LD for data interchange, and Turtle for ontologies.
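For a feel of why the graph approach helps, here's the earlier "what is this IP, and what talked to it" question as a Gremlin traversal using gremlinpython. The endpoint, property names, and edge labels are assumptions for the sketch, not a prescribed schema:

```python
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

# Works against any TinkerPop-enabled backend (e.g. JanusGraph over
# Cassandra or HBase); the endpoint here is a placeholder.
g = traversal().withRemote(
    DriverRemoteConnection("ws://localhost:8182/gremlin", "g"))

# Assumed schema: 'host' vertices with ip/role/owner properties and
# 'connected_to' edges. One hop answers the context questions from earlier.
print(g.V().has("host", "ip", "203.0.113.42")
       .valueMap("role", "owner").toList())
print(g.V().has("host", "ip", "203.0.113.42")
       .out("connected_to").values("ip").toList())
```

The point is that the IP, its role, its owner, and its neighbors live in one queryable structure instead of being joined by hand across logs and spreadsheets.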
That's all I have. This is my Twitter. If you guys have any questions, want to buy me a beer, or want to tell me I'm wrong, I always want to hear it. And thank you for your time.
That's exactly what I was complaining about. Because if you did not build this algorithm, unless the vendor went into great depth to explain the fundamentals behind it to you, you just don't know. It's like, why are you showing this to me?
With a signature or with a rule, you can go look it up. Why is the rule triggering? Oh, okay, because... gotcha. And some of the correlations are like: there's an anomaly on this server, and then an admin created a new account on this server, completely unrelated, and it gives it a score, and you're like, of course. Have you looked at that one? Blugger. Blugger.
Scoring, dude. Anytime I see a score, I'm like... The stuff that's real? The stuff that's real is always listed under a low score. The stuff that's completely irrelevant is always 90 and above, and then the stuff that's real is like 20
and 30. They're heavily invested in that conversation. I work on an HR SaaS tool for managing objectives and performance, so it's my job to secure that data set. It's a fairly large HR data set, and it contains a lot of the more pernicious types of data that people don't want to get out there. So that's what I do. But what I'm here today to talk to you about is cultural debt, and particularly cultural debt in security programs. When we're talking about cultural debt, I'm borrowing from the idea of technical debt. So what we're going to do today is walk through technical debt, build our understanding of that, apply that concept to our organizations, and look at how
we can create cultural debt. The idea is that as the system, the subject matter, the scope grows, the complexity increases and the opportunity for debt to form also increases. That'll be sort of the overarching concept that runs through this. So the system we're going to be looking at is an internet-facing application designed and built internally. I work at a product company that sells software, a SaaS tool, so this is the context that I'm coming at it from. We're a development house, so a lot of my job is AppSec and what you might call DevSecOps, which is kind of a brutal buzzword going around these days. So that's a lot of what I do. So I'm looking at this from
an engineering and a development perspective. The technical debt that you might be used to thinking about is something along the lines of: I write some code, the code gets stale or the code changes, so I need to refactor it. I need to invest some time in rewriting my code. That's the sort of myopic view of technical debt that might come from a software developer or an engineer, but as security people, we have to look at a much larger system. We have to understand this at a higher scope. So this is the first scope: the application itself. You've got your code deployed. It depends on a number of
things like databases. For me it's a cloud-based tool so it depends on AWS, it depends on security groups and VPCs and so on. Whatever security tools I have installed there. And the delivery technology stack might include the operating system and the web server and so on. So the software is packaged together, it's deployed, but it's really just a collection of components, a collection of a lot of different things. And the trend today in development and engineering is to really make those components as small as possible. So microservices, you've heard of these. And developers are incentivized to work in microservices. They want loosely coupled components, as many as they can, and that compartmentalizes the risk of change.
I can make changes to a specific microservice or this thing over here, and it's not going to break everything else. They have a strong incentive to break everything up.
Where that goes is into the serverless world, where everything is now broken up into different hosted components. But for us, as security people, this is not a good thing, in my opinion. Each of those services creates a risk. With each microservice that gets deployed and each additional service that we use in our application, our environment grows and the number of threats increases. So we have to manage that. And for people in operations, and I come from an IT and operations background, that just creates more and more work, and for security as well. It's just more and more risk, more and more work. There's a lot to think about
at this level. This is your basic application level. All of that is ideally hidden from the world. The using world only sees the application at the feature level. It's delivered in a web browser, they work with a feature, and that's all they know. We don't want them to know everything else. In fact, it's dangerous if any of that information about what's going on in the backend leaks, so we try to protect it as much as possible. But this is not nearly the whole picture. This is where things have moved to. This is the sort of evolved DevOps landscape of an application. The actual deployed application is over there, and there's all this stuff that goes into the process of getting
that application deployed. It starts with the developer writing code, and then it moves through testing, it moves through deployment, and so on. We actually interact with this whole process down here, this whole system down here. We're interacting with it through these tools up at the top. So those tools are essentially our instruments, they're the components that we use to represent the system to ourselves. That's how we observe the system, is through our tools. And it's mediated through that. So essentially we're peering into the application through our tools. What that means is that each of us has to use those tools to form our understanding about the system. Nobody has a complete knowledge of the system. We're all working very hard and
everybody involved in this deployment process is working very hard to maintain a mental model of the system: what it is, how it works. But the problem is that we can't, because we can't possibly know what's happening all the way across. Say I work in security, or I work in operations; I only get access to these tools. I may not even have access to what's going on at the beginning and the end of the pipeline. And the developer certainly doesn't have access, or historically hasn't had access, to what's going on. So we get this problem of many worlds and many possible outcomes with each change that gets deployed at every
stage in the system. And it increases the complexity. So again, nobody has a complete understanding. Everybody's trying to generate their understanding, and everybody's putting in effort to generate that understanding on a continuous basis. That's part of our job, or it is our job, essentially, to know and understand the system, at least as far as we can from what's given to us. And you have this balance as well between intention and performance. So the developer's intending for his code to work in a certain way and for changes to go down in a certain way, but they're not always working out that way. So over here, on this side of the pipeline, you've got a very dynamic system that doesn't necessarily respond well to change. And as we talked about, the
way they've dealt with that, the way we have dealt with that, is by compartmentalizing as much as possible. So what the complexity is adding is boundaries. And like I said, the goal on the left side is actually to increase the number of boundaries, to reduce the scope of change, to reduce the risk of change. Whereas the issues on our side have to do with those boundaries and the complexity and the security issues, the risk, that they introduce. Now, a lot of the ways we can deal with that involve reducing the mental model, but there is a part of that mental model that we all have to share. Ways we might
reduce the mental model is through automation, for example. So we might try to automate as much as possible. The developers are automating their tests. They don't have to know about what's happening in the testing anymore. They just automate it, and then they essentially relieve themselves of having to cope with that. And security's no different. So we want to automate as much as possible, too. Automate ourselves out of a job. But that can be quite difficult. But this isn't the extent of the model either. We're going to go one step further and we're going to look at, this is everything coming together. So you've got your deployed application, your using world, your deployment pipeline, your organization that's building this, and the mental models and the questions
that they're asking themselves to try to discover and learn about this application. Which of course is very difficult. As I said, the certainty is from the point of view of the observer. Nothing actually assures that certainty other than what's going on and what we can see in our tools. Essentially, most of the system to most people working in the system is dark. We can't see into it. Even if we could see into it, it would be a lot of effort and a lot of energy to try to maintain a mental model of that system that allows us to have that certainty. The disconnect across the system can be enormous. For example, the using world, the mental model that they're using to interact with this application, again, is
through features. A lot of times, your developers will be so disconnected from this mental model of how the features are being used that they're building things that don't actually help or provide value to the using world, which is, I think, a very interesting situation to be in. But it also happens, and this especially occurs in larger organizations, that just between the people sitting right beside each other, from development to operations, there's a very tricky relationship. They don't share that same mental model. And that's where that complexity arises from, and where this technical debt in the system arises from. So I'm deploying changes to the system at any place, and I have an incomplete mental model. It's dangerous, and
I can break the system. Because these changes are dark and their effects are going to occur somewhere else in the system, we don't actually know the problems we've introduced until we're seeing them, until we're experiencing them. And we'll talk about it in a second. I'm going really deep into the technical side of things; we're going to talk in a second about culture and how this plays out in culture as well. And it is very much true there too. The changes that we introduce in our organization that impact culture don't appear until maybe six months, a year, two years later. And then we
start seeing really toxic environments, really bad behavior, and so on.
It's very interesting that security has the very difficult task of trying to see the whole system. We have to know. We can't afford to have dark spots in our system. We have to be able to shine that light and see what's going on. That makes our job, I think, one of the most difficult roles in this system. What am I at here? 10 minutes in. Okay, we're doing well. How's everybody doing? Good, okay.
Technical debt, just in summary: like I said, a lot of times we're introducing these changes, they're dark, we don't know what's going to happen, or we're not even aware that they're going to impact other systems. But a lot of the time, technical debt is also purposeful and done deliberately. That's one of the advantages of knowing and understanding technical debt. Managing technical debt has become a core competency in engineering. It's managed at the local level, at the team level, and teams will usually spend a few cycles refactoring and doing the work they've already identified as a technical debt issue. So they mark code as being debt, and then they come back and
deal with it later. They spend those cycles. They have that awareness about what's happening. But that's not the case necessarily with our organizations when we introduce debt into our organizations. And there's lots of ways that we can do that. For example, on the technical side, your engineers might start using a new technology.
So I can almost read that, and I'm in the front row. You don't really need to read this. It's basically talking about everything that you're going to see in the next hour or so. If you didn't notice on the first slide here, I'm Jeremy Coho. I have a link to the slides on here. So if you can't see them and you're one of those people who has a laptop open with internet, just go grab the slides right there.
I have a clicker so I can advance the slides, but it's not gonna work because we're not in presentation mode. cisco.box.com/v/bsides2018.
It's really difficult looking up at the screen like this. Okay, so, oh yeah, the purpose. Why are you here? Are you in the right room? You've had lots of time to determine if you need to go somewhere else and jump ship; there are a couple other talks going on right now. This is the virtualized network monitoring for fun and profit talk. I'm gonna be talking about VMware, VLANs, switching, routing, wireless, open source tools, the Elk Stack, SaltStack, Security Onion, Bro, and a couple other technologies. Talking about threat hunting. That's about it, no big deal. So if you read about this session, you'll see this exact text. I wrote it like 10 minutes before the CFP was due. I was like, okay, better write something in here for my
talk. They gave me the opportunity to change that a couple weeks ago, and I actually looked at it. And I thought, you know, this is actually pretty good. It hits all the main points that I want to hit. We're going to be talking about hardware and software, network monitoring. One thing I didn't mention, this is all in my home lab in my apartment. But a lot of the technology, the solutions, and the stacks all scale to the enterprise. Talking about threat hunting on the LAN, network visibility, troubleshooting, but really most importantly, I want you to walk out of this saying, hey, I saw an idea that was really cool. I saw an idea that I can take back to work and use. I can really solve a problem
for myself or my team based off of something that I saw in the presentation or an idea that I had because I went to this one hour session.
Okay, that's what I just said. Sharing all the tools. And the third one, oh yeah, remember how I set things up. You'll see in the next couple slides I have a network diagram of some of the software stacks that I'm working with, and I just can't keep track of it all. It's too much. Too many ports, server names, passwords, all that stuff. So I started documenting it. I used a tool called OmniGraffle. It's like Visio for Mac.
Pretty hard to tell everything that's on here, even for me. So I've actually broken it down into three or four different groups or categories here. Where we have the core infrastructure in the center,
It's too bright. How's that? Just as bright. Off? All off? Is this good? Awesome. See, now I want to put my badge back on.
All right, how's that? Is everyone good? Awesome. Oh yeah, so core infrastructure: a bunch of VMs, a bunch of services on those VMs, and then there's a whole wireless infrastructure. I'm not going to be talking too much today about the wireless infrastructure. I'm a Wi-Fi guy. I've been doing Wi-Fi for over 10 years. I'm employed professionally by Cisco Systems, working in the wireless networking business, and one of the reasons I built this lab out was because I need to work with the technologies that I support at work. Okay, so as I mentioned, I have seven demos that I was planning on doing. Hopefully we can get to some of them. I am VPNed in already, so I have access to all these resources. Who's familiar with
all of these options up here? So like VMware, PFSense, Grafana? Half, okay, awesome. Probably not too many are gonna know about MotionEye. That's an IP camera software solution that works with Raspberry Pi; it's very cheap. And then I have a honeypot and Security Onion, so all basic. You should know what these concepts are. Okay, so as far as the core infrastructure here: this is my primary data center right here. It's a 14 by 26 by three inch deep little hole in the wall. In it I have a Ubiquiti ToughSwitch, so that provides one gig connectivity throughout my 800 square foot apartment to all eight ports. I have an Intel NUC on the right, which is my i7
compute. I have a quad core processor in there with one terabyte SSD. And then the box on the left is the Cisco wireless device.
Yeah, that's a good question. So the NUC is probably one of the newest components in this, and it was $1,500. And that's for... I'm totally playing with fire by moving this here. Oh no. It'll come back, I promise. So the NUC was $1,500 Canadian. The Synology NAS, which you haven't seen yet, was another $1,200 or so, plus the disks for $800, so $2,000. Oh man, totally should not have touched that. So, you know, the point is that the investment is not that much to get these capabilities in your home network. To have VMs, VLANs, services, all this stuff, you can do it on a budget for sure.
So, all that time when you're sitting here thinking things are going to work great? It's true. Okay, and the Synology NAS. I have an 8-bay Synology NAS. This is my second data center, or half of my second data center here. The nice thing about the Synology is it has a really nice UI. You don't need to manage it too much.
Anyway. The Synology solved a lot of problems for me, because it's a turnkey solution where I can just download a package and it'll just have what I need.
send all that traffic into your whatever VLAN has that virtual network assigned to it.
This is when I would go log into VMware and you'd see the VMs running and the virtual networks. Skip that. Let me talk really briefly about switching. It's important to have a VLAN-capable switch; you really need to segregate your virtual networks. I was using a Ubiquiti ToughSwitch for the longest time. This thing was great. It served me really well. It was like 200 bucks: gig ports, PoE, and a management interface that worked pretty well and was GUI based, so I didn't have to learn a whole CLI. Highly, highly recommend that. There's another TP-Link that I mentioned up there, the SG108E; someone else said they were using it and having a lot of success, and they're really happy with it. Again, that's a cheap, lightweight, consumer grade VLAN capable
switch. That's gonna have all the capabilities you need. Say again?
It's blasphemy for me to be using something that's not Cisco. No comment on that one. This is my home lab, right? And I've had this going for a long time. I actually didn't put in updated photos that have pictures of Cisco switching in some of the screens, because I'm not here to sell a Cisco solution, and I'm definitely not a rah, rah, rah Cisco guy. It just so happens that I'm now employed there. But we're using all open source technology stacks here, all open source solutions. So yeah, I'm definitely trying to keep this a vendor-agnostic topic. Did I mention everything? Yep. Since we are talking about Cisco, just really briefly, I have two of these 3560 switches, one in my primary data center and one in my
alternate data center, also known as the closet and the office.
There's a 10 gig Ethernet link between the two switches. And I was expecting someone to ask a question like, hey, you're running VMs on your NAS, and your compute is in the other data center, so how are you going to manage that when you have a lot of Ethernet traffic? And the solution is right here. It's a 10 gig link. Yep, NFS. So there's an NFS link between VMware on the Intel NUC and the Synology NAS, and I have a couple VMs running on the NUC and a couple on the Synology. Who's familiar with PFSense? How about OPNsense? Nice. OK, so the PFSense firewall is just a virtual firewall. It's a virtual machine that runs on the NUC. I've been running
this PFSense instance for probably a couple years. It's been upgraded in line the whole time, and I'm very, very happy with what PFSense has done for me. OPNsense is a fork of PFSense that has some of that configuration done for you, and it has some of the analytics already created. For example, NetFlow, some system monitoring, and some bandwidth monitoring are built into OPNsense, whereas with PFSense you need to add that via third-party packages.
This is my router. This Intel NUC is my router. These are just some screenshots of PFSense so you can see some of its capabilities. This is actually just the login page, so it's not too interesting. I was gonna ask a question here. You can see on the left and the right it's two very different screens: PFSense 2, I believe it is, on the left, and PFSense 1 on the right. Is anyone still running PFSense 1 other than me? Awesome, couple guys, okay, cool. Yeah, this one has like an 800 day uptime or something like that, so I'm kind of afraid to touch it. So what does PFSense actually do for me? It provides my DNS and DHCP services. Everything that joins my network
is on the .lan.nopics.ca domain, so it's really easy to find a host that comes on. I can just ping PFSense or ping NAS or ping Synology, whatever. Everything's in DNS and DHCP. We also have a captive portal, if there's extra time, which there won't be. I was going to tell a story about how I'm using the captive portal on a special VLAN with some special traffic filtering for some special users. We can talk about that one after. These are the screenshots from PFSense. So maybe I'll zoom up here and you can see this a little bit better. So, the static DHCP leases. Here are all the manually entered names for the static DHCP assignments.
And then here are the dynamically assigned. So this is taking the host name from the actual device. Sometimes that's good enough when I leave it as the regular host name, and sometimes I want to give it a custom host name, and then I'll add a static assignment for it.
Cool. Here's where we would go into PFSense and poke around and look at the charts and use the interface, but I think a lot of people have had hands-on time with PFSense. This is just the stock screenshot of OPNsense; when you go to the OPNsense site, you'll see this. I was running OPNsense on that special VLAN for a couple of months when it was in deployment, and I was very happy with it. OPNsense was totally a turnkey solution where I just spooled up the VM one night, put it on the VLAN, and was off to the races. Everything worked right out of the box, no issues. So let's talk about wireless a little bit
here. Just have a, actually, does anyone care about wireless? Okay, so, darn, one guy. Okay, so I have four wireless access points in my house for different purposes and use cases, ranging from high speed, high throughput with 802.11ac Wave 2, to some tunneled access points that connect back to headquarters in San Jose and broadcast the corporate SSID or let me onto our engineering VLANs over wireless. That's Wi-Fi. OK, storage: we talked about the Synology a little bit. It has a very rich ecosystem of packages. I'm using Photo Station for managing all my photos. Everything, just dump it in the directory there and it does everything for you. What else does this say up here?
Yeah, monitor it with SNMP. You'll see a dashboard coming up where we see the health status of the Synology, the RAID status, the temperatures, and all that stuff, which is this. So this is the TICK stack. I'm going to talk about the TICK stack in detail in a couple slides. This is the front end of the TICK stack: Grafana. If you're familiar with the Elk stack, which has been around a little bit longer, this would be similar to the K in the Elk stack, Kibana. This is the visualization front end.
And this is when we jump to the demo. You guys want to roll the dice and see this? All right.
Are you with me? Where's my mouse? First, we find the mouse.
You see this? Oh, not really. Center.
definitely not going to move the laptop this time.
OK, awesome. Yay. So we're looking at the last 90 days here. Let's spool this down to the last, like, just the one hour.
So what I'm tracking here is just the load, like how much CPU usage is going on, the temperature, and then the throughput. I have one Ethernet interface connected to it right now. And then just the disk status: what's the temperature, high and low, and what's the status of the individual physical disks? Whereas at the top, I have the status of the RAID arrays, so that's going to be the logical block. I have one more dashboard that, since we're here, let's take a look at. This is the temperature and humidity sensor dashboard. So I have a Raspberry Pi that serves two purposes. One, it's connected to the temperature and humidity sensor, so it reads the values from that sensor and sends them over the network
into the TICK stack, where I'm visualizing it here. The other use case for that Raspberry Pi is that it has one of the Raspberry Pi camera modules attached to it, and it's taking snapshots and streaming that video over the network. So probably I'll move over to that demo as well so we can see what's happening with the security camera system that's getting the same data from this sensor.
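For what it's worth, the sensor-to-dashboard plumbing is only a few lines. A sketch of the kind of loop involved, assuming a DHT22-style sensor, the Adafruit_DHT library, and the InfluxDB 1.x Python client; the GPIO pin, host name, and database name are placeholders:

```python
import time

import Adafruit_DHT                  # DHT sensor library
from influxdb import InfluxDBClient  # InfluxDB 1.x client

# Placeholders: GPIO pin, InfluxDB host, and database name are assumptions.
SENSOR, PIN = Adafruit_DHT.DHT22, 4
client = InfluxDBClient(host="nas.lan", port=8086, database="sensors")

while True:
    humidity, temperature = Adafruit_DHT.read_retry(SENSOR, PIN)
    if humidity is not None:
        client.write_points([{
            "measurement": "environment",
            "fields": {"temperature_c": temperature, "humidity": humidity},
        }])
    time.sleep(60)  # one reading a minute is plenty for a dashboard
```

Once the points are in InfluxDB, Grafana just reads the measurement and draws the charts.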
Oh, perfect time. Raspberry Pi camera time. OK, so this is what it looks like. Everyone's familiar with Raspberry Pi? So on the network diagram, we're just going to focus in on the right side right now. This is the infrastructure that's set up just to support the camera infrastructure. And you might be asking, why? Why are there, like, three Raspberry Pis taking snapshots and sending video, stress testing the network? That's really the only reason. They take snapshots of nature, they look out the window, and they send data over the network so that I have simulated traffic on my network. When I want to spool things up, I change the snapshot to either once every five seconds or a continual 1080p stream.
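The snapshot side is just as small. A sketch using the picamera library, assuming the NAS share is already mounted locally; the path, resolution, and interval are made up:

```python
import time
from datetime import datetime

from picamera import PiCamera

camera = PiCamera()
camera.resolution = (1920, 1080)

# Assumes the NAS share is mounted at /mnt/nas (e.g. over SMB).
while True:
    name = datetime.now().strftime("/mnt/nas/snapshots/%Y%m%d-%H%M%S.jpg")
    camera.capture(name)  # one frame; stitch these later into a time lapse
    time.sleep(60)
```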
Lots of possibilities. So this is logically what happens with the Raspberry Pis. They're in the middle here, and they stream their data all the time to a VM, an Ubuntu VM that's also running the same software. And then they also take a snapshot periodically, every 60 seconds, and transfer that over, I think via SMB, to the NAS. So let's try and access the demo.
Yes, okay, so this is pointing out over top of BC Place. You can see the display's updating pretty frequently here, so when we log in, we can see this in real time; it's very nice. If I can scroll down here, we have a view facing north, and again, you can see the cars going by. I have one other demo where I actually stitch all the 60 second snapshots together to make a time lapse video. The long-term purpose of this is just to have some nice time lapses of nature and what changes in the city. So let's take a look at what that time lapse is, if we can, with the mouse. Is this interesting? Do you guys care about this stuff?
Yes. It would be extremely easy.
Oh, that's my password.
going on here, just the temperature and humidity sensor. And we already saw the demo. So this is what it actually looks like in my apartment. I took this photo last night. Here I have an access point with the hardware developer kit on it. This is Cisco specific. But what's nice with this kit is that it has the temperature and humidity sensor, so that's my source for the sensor. In here I have a Raspberry Pi. It's pretty hard to tell; I had it outlined in a different slide. But here's the Raspberry Pi. It just sits right on this development board, and it gets power and Ethernet through the board, through the access point. And then this is the camera module. So this is the ribbon that just plugs
right into the motherboard or the main board on the Raspberry Pi. It's a native module that comes from Adafruit, I believe it is. And that's the $30 one.
So this Cisco development board plugs into this access point. This is a 3800 series Cisco access point. So yeah, you need the AP to get the power and pass the Ethernet through. You totally don't need this. If you want to just go buy a $2 one-wire sensor, a humidity sensor from Texas Instruments, it's a lot cheaper than this $4,000 box. But it just so happens it was laying around, and I thought it'd be a fun project. All part of the day job. OK, so the Elk Stack. Are we mostly familiar with the Elk Stack here? No? Yes? Yeah, what's that?
The Elk Stack's been around a while. I started working with it probably four or five years ago, and did a talk here at B-Sides in 2000-and-something about it.
It's very similar to the TICK stack: all the same components, very similar capabilities, but some key differences. Elastic, the company that owns Elasticsearch, has a lot of third-party options now, as well as a lot of paid subscriptions. So if you want alerting, for example, in your Elk stack, you basically have to subscribe to their Gold service, and then you get the alerting capabilities, which is one of the reasons why I started looking at other stacks, including the TICK stack. The TICK stack does not have a product comparable to Logstash. Logstash is your Swiss army knife to transform data. It can take basically any text data that's coming in, any log
format that you get, any data that's text or strings; it can parse it, you can apply tags, you can slice and dice it, you can send it to different data sources, duplicate it, deduplicate it, normalize it, GeoIP it, all sorts of stuff. Logstash is extremely, extremely powerful, and it's the differentiator for the Elk stack. Logstash is pretty old, though, and they have a newer solution for getting data into your Elasticsearch cluster called Beats. So most people who are familiar with Elk are probably going to be familiar with Beats. It really simplifies getting data into your Elasticsearch cluster. There are Beats for packets, for files, and a couple others. Someone can help me out. No, no, no, no one using Beats?
Windows event logs, that's what I'm talking about. So for getting your Windows event logs, Logstash will parse them with help from Beats.
No. When you use Beats, you don't need to use Logstash because Beats will send the data right into your data store. However, you can still push it through Logstash to do some additional processing or parsing or slice and dicing if you need to. Exactly.
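To give a feel for what that parse-and-ship step does, here's a toy version in Python rather than an actual Logstash config: take a raw auth-style log line, pull fields out with a regex (roughly what a grok filter does), and index the result. The log format and index name are invented, not Cowrie's or Logstash's real ones:

```python
import re

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Made-up log format, roughly what a grok pattern would match.
PATTERN = re.compile(
    r"login attempt \[(?P<user>[^/]+)/(?P<password>[^\]]*)\] from (?P<src_ip>\S+)")

def ship(line):
    """Parse one raw log line and index the extracted fields."""
    m = PATTERN.search(line)
    if m:
        es.index(index="honeypot", document=m.groupdict())

ship("login attempt [root/123456] from 198.51.100.7")
```

Logstash does the same parse-tag-enrich-route job declaratively, at scale, for any format you throw at it.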
There you go. OK, so anyway, moving on. With the Elk Stack, you've got to build all your dashboards yourself for the most part. There's no ecosystem for sharing dashboards, unlike with Grafana. Grafana has a website you can go to; everyone can post their dashboards on there with a dashboard ID, which is what I did for the Synology dashboard. I just typed in the Synology dashboard ID, pulled it down, and it was good to go. That doesn't exist with Elasticsearch. What's happened is the community has posted their dashboards on GitHub and other locations. And that's what this guy's done with his dashboards for NetFlow. So if you have a NetFlow
exporter, a PFSense firewall, for example, or a Ubiquiti switch, those can both export NetFlow, pipe that data through your Elk stack, and now you have visibility into what's going on on your network. Just two screenshots from the actual product here. These are stock screenshots from their website, so nothing too fancy. The point is that if you want to get NetFlow going: set up the collector, get your Elk Stack going, grab these dashboards. It's basically three high level steps and you're good to go.
So I mentioned I'm also running an SSH honeypot, because why not?
It's on some public IP space that I found somewhere. It's been running for a couple years. And really, the only thing I'm doing is harvesting the usernames and passwords that bad actors are trying on my network. I'm harvesting their IPs and their country codes, so I know where they're coming from and how often. And do I have it on here? It's just for research. I want to know what the top 100 passwords are that were used this week; I can go pull that from my own data store, no problem. This is what it's going to look like when we jump over. Hopefully we can access this. So on the left I just have, it's
pretty hard to see, but the top usernames that have been trying to log in over the past, whatever this is, couple hours. So,
root, yeah. Password 123456 is like the number one password, and the second one is no password.
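Pulling the week's top passwords out of Elasticsearch is one terms aggregation. A sketch, reusing the same assumed index and field names as the shipping sketch earlier, plus an assumed @timestamp field:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Index and field names are the same assumptions as the earlier sketch.
resp = es.search(
    index="honeypot",
    size=0,
    query={"range": {"@timestamp": {"gte": "now-7d"}}},
    aggs={"top_passwords": {"terms": {"field": "password.keyword", "size": 100}}},
)
for bucket in resp["aggregations"]["top_passwords"]["buckets"]:
    print(bucket["doc_count"], bucket["key"])
```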
I don't know if they're actually just trying no password. It doesn't work, right? Ever set no password? You still need a password. So yeah, it's interesting. Let's go over to the demo and try to look at the data here real quick. Where's the mouse?
So we're looking at the last six hours. It looks like there's no data around here, so I don't know if that's an actual network outage or if the attackers just stopped for a couple minutes or took a break. There was a big spike after that, so probably it was a network issue. Yeah, so pretty much all root, but a whole bunch of other usernames being tried here: admin, pi, ubnt. So they're trying Ubuntu, they're trying Raspberry Pis, all the products that I'm talking about today; they're trying the default credentials. And then I have the IPs. One thing that's really nice to look at here is
the maps and the ISP info. So we can see the ASN, or the AS number, the routing number that the attackers are coming from. Using an IP address for a reputation list or a blacklist or a filter is kind of all right, but we were doing that a few years ago. We need to move one layer up now to get a little bit more data. Roo gets a candy here for getting his phone ringing. Here we go.
So I'm actually looking at the AS number here, right? So instead of looking at the subnet, a slash 24 or whatever size the subnet is, we're looking at the ISP that's providing the connectivity or the routing. So if I want to use this information somewhere else in my network, to say have a block, I'm gonna block the whole ChinaNet, or I'm gonna block the whole ISP that's bad. Maybe not the whole ChinaNet, that might have some impacts for your network, but looking at just the IPs is not good enough anymore. It's also broken down by country code; I'm using the GeoIP database that has all this information in it. And then we can
actually filter to say, oh yeah, I wanna see all attacks coming in from Australia. Okay, so let's see what they're trying. They're only trying root; they tried 21 times, and they tried all their passwords, a bunch of them twice for some reason, which is an interesting pattern I've noticed. And these guys, these guys are looking for UC Linux, which is probably some exploit or something that's come out recently.
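The IP-to-AS pivot he's describing is a single lookup if you have MaxMind's free GeoLite2 ASN database downloaded. A sketch with the geoip2 library; the database path and the address are placeholders:

```python
import geoip2.database

# Path to a downloaded GeoLite2-ASN.mmdb file is a placeholder.
with geoip2.database.Reader("GeoLite2-ASN.mmdb") as reader:
    r = reader.asn("1.0.0.1")  # example address
    print(r.autonomous_system_number, r.autonomous_system_organization)
    # Block or score on the AS number rather than the single /32.
```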
So the architecture for this is: the SSH honeypot runs, it's Cowrie or Kippo, and it logs to a flat file. Then I'm either shipping or processing that log file from Kippo or Cowrie right on the box, on the honeypot box, and sending it into Elasticsearch from there. Is that it? Yeah, so Kippo and Cowrie have a lot of capabilities; I'm only using them for harvesting usernames, passwords, and IPs. What I would really like to do is open this up so that you can actually log in with the username and password that you provide, and you can download your malware, and you can attempt to execute it in my sandbox. That's a little too risky for me right now. I'm not doing security professionally right
now. I would really like to open that up and analyze the malware and look at the attackers' methodologies and really understand what they're doing, but I just don't have time. But all the infrastructure and all the framework is in place so that I can just reconfigure Cowrie or clone the VM or whatever it is and have those capabilities, yeah, for sure. Yes?
Yeah, for sure. I haven't done enough analysis. This collector is the longest running collector that I've had, and it's on that public IP space. I really do want to put some more around and study the different behaviors of the attackers against different IP blocks: in corporate, in education, in residential, for example. I want to study those three or four different verticals or use cases. We can also take a look at filtering by one of these: WebLogic. And we can see that he's just trying WebLogic, all these passwords, and it's just one attacker using WebLogic in the past six hours. But let's pump this up to four days here. Oh, there's one other guy here. So now we can start to see
that in Vietnam and Singapore and Malaysia, they're all looking for this one exploit. That's the demo for the SSH Honeypot.
Okay, so threat hunting and some of the software solutions that we have for this. Security Onion, who's familiar with Security Onion? Yes.
This is the off the shelf problem.
and it's all stored in PCAPs. Moloch also has some pretty nice hunting and pivoting capabilities built into it as well. But the Hunting ELK, I think there's a lot of capabilities there. This is the dashboard with Security Onion. So maybe I'll move over to the Security Onion demo now, and we'll just hunt around in Security Onion and see what we can see. It's running on my home network. I love the new dashboard. I'm running a beta version of Security Onion; it's probably production now. I've been running it for a few months, but it's based off of Kibana 6, the latest and greatest Elk stack, and it comes preconfigured with all the dashboards that you're gonna see. I haven't done anything other than
turn this VM on and let it sit there.
I'm getting really good at finding the mouse. It's really hard to drive a demo when you're sideways. Okay, so, can we see this all right? Somewhat. So on the left we have all the pre-built dashboards that come with Security Onion. If you're familiar with Bro, you'll recognize a lot of these terms, like the connection log. That's probably one of the biggest gold mines of data that we have. But all these others under hunting, like FTP, HTTP, DNS, are pre-built dashboards that we can just go into to understand, hey, what's happening with DNS on my network? So let me unzoom this. We're looking at the past 24 hours. There have been this many DNS queries. Most of them are on these ports, but there's
some on 5353, so what's up with that? We can easily go figure that out. What kind of records they are, the source, the destination, and then we actually have the full log breakdown, and then we have the highest registered domain. This is doing some analysis for us: it's stripping off the first part of the DNS query and just giving us the domain name, so that if we're seeing a lot of traffic to one, we can pinpoint it.
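That highest-registered-domain rollup is easy to reproduce yourself; the tldextract library handles the public-suffix logic. A small sketch:

```python
import tldextract

# Collapse full query names down to the registered domain,
# the same rollup the Security Onion dashboard is doing.
for qname in ["cdn.assets.example.co.uk", "telemetry.example.co.uk"]:
    print(tldextract.extract(qname).registered_domain)  # example.co.uk both times
```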
Let's take a look at just the connection logs, so we'll see the source and destination IPs, where everything's going to and from. This is all my traffic, but yeah, so here's my main host, 10.10.10.35. It's sending a lot of traffic out to somewhere. Let's figure that out. So if we click on this, it's gonna redraw the dashboard with just this one host. We can see there's a big spike around eight or whatever last night, probably when I was working on building this presentation.
So mostly TCP, some UDP connections. And it's all SSL traffic, and it's all going to 10.10.10.37, and some of it's going some other places. So this is benign traffic on my network between two internal hosts. One thing that I don't like about Security Onion in this setup is that we're not looking at any DNS for the individual IPs, so I actually need to go do a DNS lookup to understand what that IP's name is. There's probably some integration I could do in Security Onion to get that going, but out of the box, it doesn't come set up like that. So that's Security Onion. Any questions on that?
I mean, I hope I don't have any Modbus in the network. No, so these are all pre-built, right? So if we move this back to, like, the past 24 hours: is there any RADIUS in my network? Probably not. Is there any FTP? I hope not. Yep. Nice. Lots. Right, so some device is going to get some sitestats.xml or whatever, right? So that's pretty cool. It's anonymous too, right? It's great, you can go download some stuff. I found some really weird stuff in here, like searching... yeah, it's pretty awesome. So this is all coming from Bro in Security Onion, visualized within the Elk stack. Wanna look at any of these other ones here? Some nice
ones for, like, SSL, where you're looking at the providers of the SSL certificates and how long ago they were generated, for example.
Oh, yeah, no, this is all on a private network. Yeah, so my tap is actually tapping the WAN and the LAN. So if there's any other traffic on the outside interface of my firewall, I'm picking it up. Does that expose me internally? How would that expose me internally? It could expose me to sensitive traffic that's on the outside of my WAN, but really, that's my ISP's zone. They should be locking that down. I've seen some weird TFTP stuff that my neighbor's devices are doing on that same LAN. But yeah, my ISP has some work to do, for sure. Just lots of really good information. Okay, so I've got about 10
minutes left and like 82 slides, I think. So it should be no problem.
Okay, I'm just kidding. This is the last one. Okay, now what? Was this useful? Did you guys get anything out of this? Was there some cool stuff? Okay, great. Hopefully, again, like I said, you can take something away from this and be like, hey, Joe, set up the Elk Stack, or hey, let's get our Bro sensor network going, or let's investigate open source tools, or let's start using virtualization technologies in a way that we haven't before, if that's where you need to be. I'm on the Mars Slack. I attend VancitySec, I think it's the second Thursday of the month, and I encourage everyone to. It's at the Central City Brew Pub. It's
like informal, just come talk about how much you love technology and whatever else. InfoSec, I think, too. Please stay in touch. I'm on LinkedIn, whatever. That's this, and that's my talk. Thank you.
So we still have a couple minutes, so if there is Q&A, definitely let's get interactive.
For sure. So if you deploy a TP-Link router for 100 bucks, and you buy a bare bones PC that supports virtualization, that's gonna be like, what, 500 bucks? Okay, and then you need... no, you've got the VLAN-capable switch and you've got the micro PC; that's all you need, right? Get VMware on there, build up some VMs. Yeah, so the NUC that I have is the $1,500 version. It's the latest and greatest. There's a new one coming out in a few weeks that's like twice as much power and more CPU and all that. But there are older NUCs that are a little bit smaller, like this size physically. Less
resources and they're like two, three hundred bucks, I wanna say. So definitely there's lots of different deployment methods here. Hey. Yeah, man. Did I put it in the end? No, only in the front. It's going to take like 30 clicks to get here. What else?
Yes.
Yep. Yeah, exactly. So you don't even need to go buy an Intel NUC. You can use your old beater laptop, any machine really. Before I had the NUC, I actually built a mini-ITX PC. The motherboard's like this and the box is like that. And that thing was like $1,000, and it served all my needs.
It's all through VMware. All the traffic routes through my virtual firewall. And there's a span port on that virtual firewall.
Right, yeah, you can tap two hosts, no problem.
No man.
Have you? Okay, has anyone else? One, couple, okay. Yeah, I mean, I have so much R&D in here already and so much on my to-do list that, like, hey, let's open a new protocol and find all sorts of new bugs and issues... that would just hurt my head. I'll do that later. Hey, man. Are you familiar with things like Pi-hole? For DNS, yeah. Yeah, so Pi-hole for DNS is like your on-prem DNS service that does blocking for ads and malware and stuff. Yeah, I actually did deploy that one night, and then it wasn't production ready or it crashed or something, and I never looked at it again. But yeah, definitely, looking at your DNS queries is a very nice source
of information.
Yeah.
Right. Yeah. Cool.
No. Can you give me the elevator pitch for pfBlockerNG? OK.
So ultimately I'd like to take alerts from Bro and Security Onion in the Elk stack and alert on those. And then not only alert but also create rules, block things, send emails or whatever. Yeah, nice.
Yeah, cool man.
That's actually one of the main use cases for operating this whole infrastructure: because I work with some technologies in the day job where I want to understand what's happening with a device. What TFTP server is it trying to hit? What data is it trying to send out on the network? How much data is it sending out over time? So that when people ask me what happens when they run this on their enterprise network, I can say, hey, on my network it took this much, and on yours it's probably going to take this much. Yeah. Yeah, I actually did a talk a few years ago on the RF side. I have some.
Okay, normally I'd be napping right now, too. I think the back row's going to be napping. They're all my guys.
Yeah, who books a conference on National Napping Day?
I'm Derek. Nice to meet you, man.
I don't know. I know. Then I got more pressure.
Is that security? Is obscurity security? It's got security in it.
That's true. While it's obscure, it works; it stops working when it's no longer obscure.
It's really funny: when you run your own honeypot for a while, you start seeing people log in who know it's a honeypot, and they start having a lot of fun with you. I remember going through a few logs, and I'm going, okay, they've really been messing with me. They're doing all sorts of typing stuff. Then they started typing messages to me as the sessions go on.
It was not the CIA. Saw one of those down here. "Don't trust free Wi-Fi," saw that one.
Yeah. When I was down in the States, we used to use the NSA one all the time. Actually, we put up a "definitely not NSA surveillance" one.
I actually have one that's actually for malware, so I named it malware. You'd be amazed how many people try to connect to that.
That's why I usually leave USB sticks in my checked bag, just in case they want to do that.
Go ahead and check my phone.
I'm Derek. I am not the speaker you're looking for.
Oh, you're not allowed in here. Oh, you're not allowed in here. Come on.
That's why I said, oh, my guy's back there.
I do work for a major telecom, so, yeah.
I gave this presentation to my work audience about two weeks ago. Trust me, you can be much worse. Where do you work? I actually work for Telus. But they're not part of the presentation.
Are you based out of Vancouver? No, actually I'm based out of Edmonton, but we've got guys all across the country; we're all a remote team. And so, yeah, we've got several guys actually here from the Vancouver office, and a couple from Edmonton. Andrew's out in Ontario. It makes meetings even more fun. Yeah, it's like: okay, what time? Okay, I can do that. No, no, 8 o'clock Eastern Time. No, can't do that. Not waking up at 6.
You're lucky I'm on camera.
and go to work for Shopify.
Do you want a chance at vendor cookies? They put some chemical in them to make you want to buy stuff.
We've got another seven minutes, actually. We haven't used up our seven minutes yet.
Depends if you want me to start. I can go early. I don't know. No, you're right. You're right. Okay, I was going by the clock. Oh, analog. Going analog, are you? I like analog. Analog has definite advantages. I have no idea how many analog-to-digital converters I have for different things. One I measure direction on, but I need to measure down to 0.1 degree, so it's going to be a 32-bit converter. We used to tell
them that until there were no more analogs. Well, they used to say, don't do...
Video was the original thing. Don't do video. Do pictures, but don't do video. And I was always like, well, isn't video just a bunch of pictures? Because (ISC)² and a bunch of the other groups, they still say that for forensics: don't do videotaping. And I'm like...
different levels of people. And, you know, we have guys, developers, that haven't done very much at all in security, and I've got a guy that's a network architect correcting me on stuff. Man, you just have to keep on your toes. You just have to keep balancing. I can remember one course where I had, you know, a few normal, you know, first-year firewall...
OK, I think we're going to get going. Good afternoon, you guys. I just want to make a quick announcement before I turn things over to Derek. Our after-party is this evening from 6 o'clock on. We will have a shuttle bus leaving from Seymour that holds about 45 people at a time, and it'll be rotating: it'll come back after it drops people off. So that's from 6 o'clock to 10 o'clock. And the other thing is, stop by the TriNimbus booth for the actual address. And if you want to walk over, it's about a 15 to 20 minute walk on Granville Street. Thanks for coming out. Derek, take it away. But no idea where it is until we talk to the
vendor. I can tell you, but I'm not. That's wonderful. Thanks, guys.
Is that what I'm doing?
What do you say, folks? Derek is going to give us a fascinating presentation, which, you will notice, is at the right price. I'm not giving money back anyway. Thank you guys for all coming out. My name is Derek Armstrong. I'm actually a senior incident handler with a major telecom slash ISP here in Canada. I do a lot of the incident response and a little bit of forensics with them. And this is a talk that I've wanted to do for a little bit of time. So, about this talk: why this presentation? This presentation came about because I had a lot of conversations with managers, with executives, asking me what box they can buy to fix IR.
How can we do incident response fast, timely, triage stuff, and how much money is it going to cost? Obviously, if you hear this talk, my idea is that the cost should be zero. It's not a box; it's a people and process problem, not necessarily a technology problem. Who am I again? I'm Derek Armstrong. I'm actually out of Edmonton, so if anybody's out in Edmonton, let me know. We
have quite a vibrant community now, which we didn't even have two years ago. It was started by a couple of gentlemen and has been getting a lot of traction. So I'm going to start off with what Live IR is. Live IR is incident response on a system that is being used. It is actively being used, either by a user or by an attacker. A lot of times we do find attackers on systems, malware, people logged on, or we find evidence that they were. But even if they're not, the user usually will be. So all the tools, all the examples that I'm going to provide, are done without the user's knowledge. That's for two reasons: number one, to not inconvenience the user, but
also in case the user is implicated in something. Where does it fit? It really fits in the triage section. So it's where, you know, you have an IOC hit, and you have 100 systems, maybe 1,000 systems, all implicated, all going to a website, downloading a file, doing something suspicious with that IOC. How do you determine which ones should be investigated further? Many times it will lead to identification of a false positive. So they went to a Russian site, a .ru site or something; okay, maybe it wasn't malicious. Or it may lead to further in-depth forensic analysis. A lot of times when we do this, the ones that turn out positive, we'll end up doing maybe a RAM image, doing an
investigation that way. And in other cases, we'll do the full-disk, dead-system kind of forensics. However, it is not suitable for anything that would involve law enforcement or court involvement. And the reason is simply that all the tools you see here do touch the systems in some way. They tamper with them. They change evidence: logins, files, processes. So you have to be cautious with doing any of these kinds of techniques. We have had times when things have led to a law enforcement engagement after something like this. So in those cases, what we do is make sure we take careful notes of what system we touched, what tools we ran, and when we ran them. So what kind of
evidence do we normally collect during Live IR? We grab system information: host name, IP address, that kind of information. We grab process information: process lists, parent process information, memory per process, all sorts of things. Networking information: network connections that are open, network ports being listened on, ARP caches, DNS caches, all that goodness. Historical information, which is usually what I call log information: Windows event logs, for instance, application logs, stuff like that. And configuration information: how is the system currently configured to connect to the internet, how is it currently configured to boot, what kind of automatically running programs are there. How about: this shit's expensive, isn't it? Usually it is. I mean, I'll be honest. You talk to the vendors out
there, they can all sell you boxes for a lot of money. Even my favorite vendors out there can easily sell you hundreds of thousands of dollars of software to do this kind of stuff. But it's not necessary. There are options out there, open source or free; there are some commercially free ones as well. They just take some scripting. Again, this is a people and process problem. It is not a technology problem. So, options for Live IR. Now, these are just some examples; there are a lot more. These are ones that I have personally played with and taken a look at. In a few cases, we've actually used them and actively use them today.
PowerShell remoting: there are several options for PowerShell remoting, with cmdlets available that you can download, run, and use to pull in information from endpoints. PowerShell WMI is another example. I separate them out because they really are two different things: they both use PowerShell, but the network commands they use on the back end are different. GRR, which is Google Rapid Response, is another one. MIG from Mozilla is the Mozilla InvestiGator; I always say it wrong, but that is right. It's their internal tool. Redline, from Mandiant, slash, FireEye now. And osquery is another one that I just added to the list. So, PowerShell remoting. Kansa is one...
I can go back if you want, too. That's why we've got the remotes. This will be online. Actually, I will be posting the PowerPoint presentation, and all the code that I have will be online as well. None of this stuff is copyrighted; this is all freely available. One example for PowerShell remoting is Kansa. Kansa is a nice little option. It's well supported. There are a lot of modules, and when I say a lot of modules, I mean in the hundreds, for doing different things. It does take PowerShell remoting, which is
usually turned off in most organizations I've come across. So you either need to turn it on, or maybe it's not an option. And then there are native PowerShell commands, which is the write-your-own-script kind of option. If you know PowerShell, you can do it. There's Get-Process and a whole bunch of cmdlets built right into PowerShell to do a lot of this stuff. So, PowerShell WMI: there's Posh-R2, which is another pretty well supported platform, not quite as well as Kansa. It's not quite as powerful, but it does run on a wide variety of systems. It only needs version two of PowerShell. That's it. If you have that enabled and
WMI enabled, it will run. It works, like I said, where PS remoting is disabled; our organization, for instance. And it's designed to work across a large number of systems. When you run it, you choose a system, or a list of systems, or a domain controller to pull all systems in. So you can literally do this across an organization; hopefully one smaller than ours.
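For a rough sense of what that WMI-based collection looks like, here is a minimal sketch of the same idea. To be clear, this is not Posh-R2 itself: it is Python using Tim Golden's third-party wmi package, and the host names are hypothetical.

import wmi  # third-party package: pip install wmi (Windows only, DCOM underneath)

# Hypothetical target list; Posh-R2 similarly accepts a host, a list of
# hosts, or a domain controller to enumerate systems from.
targets = ["WKSTN-01", "WKSTN-02"]

for host in targets:
    conn = wmi.WMI(computer=host)  # PS remoting not required, only WMI/DCOM
    for proc in conn.Win32_Process():
        print(host, proc.ProcessId, proc.Name, proc.CommandLine)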
GRR, Google Rapid Response: it was really good, and Google actually deployed it across just about all their systems, with the exception of their cloud. Their cloud customer systems are not touched by it, but all their internal stuff has the GRR agent on it. It's client-server based; there are some advantages and disadvantages to that. It scales really well. Google scale; there's not much more to say. It does have some live memory forensics capabilities, but that's still coming along. Now the downsides. It can be very complex to set up: I've set it up three times, and I don't think I've ever gotten a consistent setup yet. It does require a pre-installed agent. They are very good at supporting the agents, though, so the agents are easily
kept up to date. They're just EXEs to install; they run as a service, pretty much standard. You must subscribe to the Google way of thinking. What this means is that you don't just do a hunt, and you don't do an investigation in the normal sense; you have to do things in the order of a query, then a hunt, then an investigation, and then, I believe, a report at the end of that. So you've got to adopt their way of thinking. Very powerful if you're willing to; if not, it can mess with your mind.
MIG, which is Mozilla's InvestiGator. Another good one. I've actually had good luck with this one, although I am not a Go programmer, so I have to take their word on a lot of their code; but it is an IR framework built in Go. They have been developing it for about the last, don't quote me, maybe five years now. It is client-server based again.
With each release of it, they update the agents, so you're responsible for updating the agents yourself. But it is real-time query. This is not a phone-home system where the agents phone home, put data into a database, and go on. This is real-time query: the server goes out to all the clients that are online, sends out the command using message queuing, and sends the request out live, in real time. So if you want to know if there's an IOC on a box, you can send the command and you'll get back, in real time, all the boxes that have that IOC. It's a bit easier to understand. It's command-line based; very long command lines, but it's just command-line based, so you can specify targets and options for
process, disk, memory, and then some queries to run against them. And it will literally go out and send 10,000 queries out there. It does have the ability to query RAM, which I really like. I'm a RAM forensics guy, so anything where I can say: I'm looking for an IOC, go look in memory, see if this Yara rule hits. Great; I can find a lot of stuff that way. But again, it does require a pre-installed agent, and some environments are very picky about agents, ours included. You don't want to be running too many agents: you already have AV, you already have endpoint protection, you've got endpoint visibility, you may have a data protection agent on there. You know, what's one more agent? And
unfortunately, not all platforms are supported equally: the Windows platform is the least supported, Mac is better supported, and Linux is, of course, the best. And that follows from Mozilla's environment.
Everybody can still hear me, though, right? Okay, good. So yeah: because of their environment, they're mostly Linux, then they have a fair bit of Mac, and then they have very few Windows systems. That's the reason for that. They are very willing to have people help them on that, though. I was actually contacted by the author, who asked me a few questions on some stuff, but like I said, I am not a Go programmer, so that would not be my cup of tea, at least not yet. Sysinternals. This was actually one of the first toolsets I started using, and one of the ones we use right now. Developed by Mark Russinovich. He had an independent
company, then Microsoft essentially bought him and the technology, so they now distribute it. It's got a benefit, though: Microsoft's official seal is behind it, so all those PsTools, all the Sysinternals tools, are now officially supported by Microsoft. They may not support them in every configuration, but they do have a bit of a stamp behind them; if you're on the enterprise support agreements, they will help you out. There are many useful tools. PsTools, which are command-line tools for doing anything from disk to process to whatever on a system. Autoruns: if anybody's never seen Autoruns, download it; you'll be amazed at how much crap is running on a standard Windows box, or how much starts automatically. It's even
more scary when you see how much stuff is set to run and will not run, because it's either not there, or it's been uninstalled, broken, or deleted. You'll usually find about 25% of the list is broken already. Process Explorer: basically your Task Manager on steroids. Well, on steroids, on steroids. Process Monitor: if you ever do want an in-depth look at what an application is doing, what disk writes and reads it does, what network activity it does, what processes get started and stopped, what threads get started and stopped. A very, very useful set of tools. The tools are easy to script, as you'll see. And again, I am not a programmer. I am a hacker in
the very traditional sense of the word: I will take code that everybody else writes and try to hack it together into something that actually works. Many command-line versions are available. Sorry to any GUI programmers, but I like the command line just because it's a lot easier to script. And they're easy to modify. My next goal is to get prefetch files, log files, and registry files and pull them into a version that's easy to access. The reason we haven't done it yet is simply that those items take the longest. We're looking at getting in and out of a system in, hopefully, less than five minutes, and pulling in log files can take hours in some cases, depending on your network. So that's one of my next goals. And I like all
this for less than the price of a coffee. This is an example of a script. Very simple: just running commands and sending the output out to an output directory. That's it. This is actually what we do for a lot of IR: pulling in information from the system. So, what's the patch level? What's the OS? Stuff like that. Who's logged on? Who's logged onto the system? Process listings. Running the DNS
cache. I don't have the ARP cache on here for a reason: at TELUS we actually do that a little differently. Netstat, getting your listening connections and your communicating connections out. What services are available. And then, of course, the Autoruns. Autoruns actually pulls all the information, dumps it into a CSV, and you can see everything that runs. In fact, there are options to even have Autoruns check VirusTotal: every item that's set to automatically run gets its hash sent to VirusTotal, you can pull in the results, and you can sort by them. Oh, I got 5 out of 64 hits on this guy; better check him out. Yep. Is that for the community version, or do you have to have the paid one? All
community. All community. You have to say that you've accepted the end-user license, but looking up hashes does not require any commercial or non-free version. That's what I liked about it the most; I don't even have VirusTotal commercial. And it's probably the one item that finds the most malware. Because malware may not be visible on the system; it may not be in your process listing. But it will be auto-running, unless it's ransomware, and you'll see it there. It'll usually stick out like a sore thumb. Or a PowerShell command: you'll see a PowerShell command, and literally the output will be 20 lines long. That's something to go investigate. Redline, for any of the GUI guys out there.
Redline is developed by Mandiant, which is now taken over by FireEye. It is a combination GUI and command line tool. The collector agents are command line. They run command line. But the actual tool to create the agent and to actually analyze the data are GUI. Now up until the latest version, Mandiant had their risk score database included with it. They just took it out of the latest version, which is actually a real hit to the community because it was one of the nice things. You could say, oh, well, we know that this is a bad process based on these kind of scores, and you could see that in the end results. It was really easy to triage. Now
that they've taken that out, it's much more up to the individual investigator.
Collectors can be configured for as little or as much evidence as you want. I've literally done this where the collection takes two minutes; and then I've reconfigured it, and I don't think it ever finished after over a week, the last time. You can actually have it go into every process, pull out the memory, dump every string out of that memory, and record it to a file if you want. Again, why would you? But the ability would be there. The analysis of the collected information does take time. So the collection agent may run for two to ten minutes, but you may spend another hour importing it into the analyst tool to actually do
it. But once it's there, it's really nice. You get a process listing, it's all GUI based, and you can see the process tree, what creates what. Like I said, in previous versions you could actually see the risk score on all of the items. Then the latest one, and this is one I actually just started playing with in my sandbox at home. Actually, if anyone was in the last talk, I really appreciated the way he did his sandbox, his home lab; it's a similar idea. If you ever have a chance to get a system, put a bunch of RAM in it, and put VMware on it, by all means do it. It's so useful. osquery
was developed by Facebook. Yeah, that Facebook. Anything developed by Facebook I was actually really hesitant about; I didn't do anything with it for quite a while. Then somebody at one of my sessions actually said, hey, you know, I do this and this and this. I'm like, okay, I've got to try it now. If you have any database skills, if you're an old DBA: well, I say it's SQL-like. It is SQL, actually. You will literally say select star from processes where something equals something.
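To make that concrete, here is a hedged sketch of one-shot osquery use: shelling out to osqueryi, osquery's interactive binary, from Python and parsing its JSON output. It assumes osquery is installed and on the PATH; processes is a real osquery table with pid, name, and path columns.

import json
import subprocess

query = "SELECT pid, name, path FROM processes WHERE name LIKE '%sh%';"
result = subprocess.run(
    ["osqueryi", "--json", query],  # --json makes osqueryi emit rows as a JSON array
    capture_output=True, text=True, check=True,
)
for row in json.loads(result.stdout):
    print(row["pid"], row["name"], row["path"])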
It actually does live queries, kind of like MIG, in real time; plus it also has historical information. It's getting a lot bigger. It's getting huge. Multi-platform: Mac, Windows, Linux. There are agents for many of the server platforms.
I know there are ways. There's a Docker setup for it, so you can run it under Docker and have that going if you don't want to run it on your own bare-metal system. Lots of options. And again, it requires a pre-installed agent, but it is one of the smallest agents I've ever seen. I think the initial agent was 73 kilobytes or something like that, which is amazing. And, demo time. Now, I recorded the demo, because I did not trust my internet.
I know that. So I videoed it.
Now I need to change the window first.
It's not going to work on me.
OK. That should do it. There we go. Hopefully you can see it. It's a little hard. Yes.
Actually, the whole thing's a video. That's one of the problems. And it's unfortunately a little bit on the small side. So what this is doing, this is the sysinternal script I literally showed you earlier. First one was, yes?
Oh yes, absolutely. Woohoo, I like dark. It always makes it easier when it's white on black, right? Literally, this is real time. This was running in real time, with the exception of the last command, the Autoruns, which takes about five minutes; I just cut that down. This is all against a remote system. This is all running with no pre-installed agent. The script is running from an admin PC, with admin credentials, connecting to the victim, or host, system.
running the various commands. Again, I wish this was a little clearer. You can see PsService just ran; it goes and runs for a few minutes, and then we'll continue on. The scripts for this I'm actually going to be making available on GitHub. There's nothing copyrighted in them. They're very useful. Use it, hack it, delete it, hate it, I don't care; I just hope you guys find something useful in it. PsService runs for a little bit, and then the Autoruns. In a couple of these cases, files are copied to the end system; that's why I do say be careful with respect to law enforcement. For instance, the PsExec service: what it's doing is actually copying Autoruns to the end system, running it there, and then pulling the
information back. That's actually how PsExec works. Among PsExec's other abilities, you can even run a remote command line: you can run psexec with cmd and have a full command line into a remote system, assuming you have credentials. So I snipped that one so you can see some of the files. It just creates a bunch of output files; they're all text, CSVs, TXTs. Hopefully this will view a little bit better. Anybody that's ever run any netstat commands should recognize this: that's literally all it is, just the output of a netstat listing, with the sorting, so you can see what starts what. And there's a whole bunch more in that one too. There's a lot of information.
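The speaker's own scripts will be on his GitHub; in the meantime, here is a minimal sketch in Python of the same run-tools-and-save-output pattern against a remote host. The host name is hypothetical, and the tool flags are from memory, so check each tool's -? help before relying on them.

import pathlib
import subprocess

target = "VICTIM-PC"  # hypothetical victim/host system
outdir = pathlib.Path(f"ir_{target}")
outdir.mkdir(exist_ok=True)

commands = {
    "loggedon.txt": ["psloggedon", f"\\\\{target}"],          # who is logged on
    "processes.txt": ["pslist", f"\\\\{target}"],             # process listing
    "services.txt": ["psservice", f"\\\\{target}", "query"],  # service states
    # psexec -c copies autorunsc (console Autoruns) to the target and runs
    # it there, the behavior described above; the second -c asks for CSV.
    "autoruns.csv": ["psexec", f"\\\\{target}", "-c",
                     "autorunsc.exe", "-accepteula", "-c"],
}

for fname, cmd in commands.items():
    result = subprocess.run(cmd, capture_output=True, text=True)
    (outdir / fname).write_text(result.stdout)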
And then the Autoruns. I'm just going to show you the Autoruns here. You can see this will literally be date, where it's running from, who's running it, process, command lines, hashes. So here are the VirusTotal hits, for instance; you can actually sort on that. It'll get all the hashes for all of them, so if you have a suspicious file, you can look it up. Again, one of my most useful features. So that was it; that was the Sysinternals tools. Should be just a second here. Obviously I didn't stitch it together. And then I'm going to run Posh-R2. This is the PowerShell WMI set of tools. It's literally just running in a PowerShell window. You can't necessarily see it,
but the first one is a domain controller, the second one is a list of IPs or systems, and the third one is just a single IP; as tactical or as strategic as you like. In this case, I only had one system to test against. You'll see it pull down a bunch of information. Again, this is live. This does not take very long, which is why I like using it for really quick triage: I'll get a hit, I'm not sure if it's any good, let's just run this on there and see what comes back. Usually from this we have a good idea of what we want to do with the system. Sometimes we'll send it for scanning, sometimes we'll send it for re-imaging, sometimes we'll send it for forensics. Again, you can't
see all the information, but there are logged-in users, services that get started, environment variables, groups. There's a users one in there, drivers, a whole bunch of information. All of these CSVs get generated. The interesting thing with Posh-R2, though, is that if you run it against multiple systems, each one of these files will have multiple systems inside of it, so you actually have to go through and specify which system you want. That's good and bad. It's not so good in a tactical situation; but if you have 1,000 systems and you want to see which one of them has this process running on it, you've got one file to look through. You don't have to look through everything. But that's it. It's just text files.
This will be networking information, system information, IP address.
All useful. All useful information. I think that would be it. Sure.
It depends on what they're looking for. Again, I always start an IR based on what triggered it first. If what triggered you first was a network IOC, an indicator of compromise, I'd start from the network, literally. If you don't know what you're looking for, I usually go system, process, network, and then go down from there. So it depends on what you're looking for. For a lot of these, we'll use the Sysinternals tools, and we'll look at the output. I'll look at the process listing first. Malware: you almost never see anything there. Almost all malware today copies itself into another process, and you'll never see it. The only place you ever see it, half the time, is Autoruns. That's a lot
of times. I'll look at the process listing. Nope. Okay, network. Nope. Okay, Autoruns; I'll go through that. The nice thing is, because it's CSV, you just put it in Excel or whatever spreadsheet you like, sort it, filter it. I don't have an example here, but Autoruns will actually go through and check the cryptographic signatures on all of them. So one of the things I'll do is take out all of the ones that are signed: give me all the ones that are blank, or unsigned, or invalid, and go through those. Let me make sure I'm good on time. Okay. I'm pretty much out of time,
so.
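To make that signed/unsigned Autoruns triage step concrete, here is a small sketch that loads the autorunsc CSV and keeps only entries whose signer is not verified. The column names and file encoding vary between Autoruns versions, so treat Signer, Entry, and Image Path as assumptions and adjust to whatever headers your CSV actually has.

import pandas as pd

# Autoruns CSV output is often UTF-16; adjust the encoding if yours differs.
df = pd.read_csv("autoruns.csv", encoding="utf-16", on_bad_lines="skip")

# Verified entries look like "(Verified) Microsoft Windows"; keep the rest.
suspect = df[~df["Signer"].astype(str).str.startswith("(Verified)")]
print(suspect[["Entry", "Image Path", "Signer"]].to_string())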
Linux, there is some stuff. It's a little bit more scripting: you have to SSH in, and you have to run the commands. The nice thing with Linux is that almost everything has SSH. Mac, where SSH exists and is turned on, you can use it; but no, it's not as easy on Mac.
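The same collect-and-save idea over SSH for Linux, sketched with the stock ssh client via subprocess; key-based authentication is assumed and the host name is hypothetical.

import pathlib
import subprocess

target = "webserver01"
cmds = {
    "who.txt": "w",                               # logged-in users
    "processes.txt": "ps auxww",                  # full process listing
    "network.txt": "ss -tunap",                   # sockets (netstat -tunap on older boxes)
    "cron.txt": "crontab -l; ls -la /etc/cron*",  # scheduled persistence
}

outdir = pathlib.Path(f"ir_{target}")
outdir.mkdir(exist_ok=True)
for fname, cmd in cmds.items():
    r = subprocess.run(["ssh", target, cmd], capture_output=True, text=True)
    (outdir / fname).write_text(r.stdout)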
No. Well, none of this is neat and easy. It does take time, and it takes some knowledge. But I'll go quickly through what's next: advanced evidence collection; endpoint visibility; Sysmon; proactive IR, pulling all the information into a central repository before anything actually happens, so you have a baseline of before and after; and then the Windows Subsystem for Linux. You now have Kali running on Windows; not complete, but I still think it's a majorly cool advancement. And then: you guys can do this too. I'm not a programmer, I'm definitely not; I am probably the poorest excuse for a programmer, and the only limitation is creativity and time, and maybe need. Necessity is the mother of invention, and that's where all this stuff came
from. I will be around afterwards because I know we have another talk coming. It's only half an hour so I want to be respectful of time. My contact information is up there. I'm going to be around. Don't be afraid to come up and say hi. And thank you guys very much.
Yes? What's your GitHub repo? Actually, I haven't put it up in the repo yet; I was working on it last night. Do you know what the repository will be? It'll be under my username, so you can look for dsplice. If you look for dsplice, I'm the only dsplice I know of on the whole internet. I've had that handle for 27 years now. So I hope it's you. I'm
going to pull that out. I'll grab my power bar after
you're probably going. Yeah, just pull it up. Either way, I don't care. Might get you going.
Thank you.
We have the break afterwards. I'm going to try to keep it to 30 minutes. Yeah, but still, I want to be very mindful of your time. So thank you. And I think the previous speaker left his phone. Or is that yours? It's not mine. He'll come back for it. Correct.
There will be a prize at the end of this. There's a prize? There's also a quiz. No, there's no quiz. Okay: thank you for being here. Thank you for having me. Thank you to the volunteers. Thank you very much to the audio-video guys; they don't get any love. This is a new venue for them, and as I've figured out from reverse engineering the equipment here, it's pretty ancient, but it looks like we're all set. So, today's talk is building a predictive pipeline to rapidly detect phishing domains. This is something that I did around the November to December time frame, and I'm going to be releasing the slide deck. I've got a command-line tool, which I'll show you. We'll get into it, but you can see it's
running right there. It's a little bit small, but you can see some of the bottom ones: paypal.com.somerandomdomain.site. And I'll get into all the details on the plumbing and the configuration, but that's running in the background, and we'll revisit it. I'll be open sourcing the tool. I've got a Jupyter notebook, like an IPython notebook, that explains from start to finish exactly how to go from the initial idea through the machine learning lifecycle: build a predictive model, deploy it, evaluate it, do model corrections, and so on and so forth. One of the things that I've seen as an attendee is that a lot of times we have very technical, very sophisticated, very good speakers talk about these really novel ideas. And as the attendee, I'm sitting
there, I'm like, okay, well, where do I get started? And that's kind of the premise of the Jupyter notebook, which I'll show you, so that you can do this for yourself as well. My background: I work for a security startup in San Jose; I'm the security analytics lead at PatternEx. Before that, I got my start out of college at Northrop Grumman. I spent five years doing security research; I was a security guru myself. And I was on a team that was doing machine learning for intrusion analysis, and that's how I got my segue into machine learning. I wanted to thank my employer, number one, for covering my travel expenses, letting me be here and hang out with you guys, and letting me
open source the work that I did for this presentation. We have about 30 or 40 predictive models that are far more sophisticated, covering spatial and temporal features across different data sets, coupled with graph analysis to streamline root cause analysis and breach scoping. So I said, hey, I think this is something that doesn't touch anything we're doing, that people can use and learn from. And they said, absolutely. So, thank you to PatternEx. Why B-Sides? I like the smaller venue: if someone has a question, we can have a brief dialogue. At Black Hat and RSA, it's someone talking at a whole bunch of people; I don't really like that. And I've also found that a lot of the audience members are very good security practitioners
and curious minds. So I like B-Sides. A lot of the security researchers I follow on Twitter routinely present at B-Sides. So that's why I'm here. And why machine learning? I mean, I'm still pretty early in my career, but I've never seen something so overhyped and yet so underrated and misunderstood. And that's what I'm here to show you. We live in this data-centric world, and when you look at security operations, you've got arguably the most dynamic data sets you can find anywhere. And there's a lot of talk and jargon and hype, and then some actual operational feasibility, with machine learning. So my objective is also to show, from a very high
level how predictive models work and how you can solve a security problem with it.
A little bit of me understanding the audience: by show of hands, is anyone familiar with the certificate transparency log network? One hand, two, three, four. You might be a little confused by the Twitter check mark; Twitter has nothing to do with HTTPS. But the premise behind verifying accounts on Twitter lends itself well to explaining how this works, without getting into the nitty-gritty details of SSL and HTTPS. For example, if I made a fake Twitter account, @realMarkWahlberg, and told everyone I was Mark Wahlberg, there are probably another 50 accounts that claim to be Mark Wahlberg as well. But the one that has the check mark is the one that people kind of,
you know, trust, the reason being that it's actually him. When I came into the country, I had a passport. It was a US government issued document that had my name and all of that; it was kind of like a trusted authority that verified who I am. Conversely, if I had just written that on a napkin and handed it to the guy, I wouldn't be here today. And so, again, not getting into the details, but that's what SSL certificates are used for. A site is secure in that the certificate does encrypt your communications between your client workstation and that server, but it doesn't validate that the server is actually Google. It could be some other fake website that looks a lot like Google. So that's kind of the idea. The problem
here is that, with HTTPS, you've got lots of certificate authorities, and when they issue these certificates verifying who people are, we haven't had a way to monitor those SSL certificates, and the domains for which they are issued, until they built this. And I think this was spearheaded by
Google. The other piece: you can get free domains on certain top-level domains, like .ml and .tk, and it's usually $10 at most for other TLDs like .com or .net. You can get an SSL certificate now for free, thanks to free SSL certificate services like Let's Encrypt. And then you can also get an IP and hosting from a provider like AWS or some others, Google or Azure, where at a minimum you can get a credit and a free server in the cloud. So with that recipe, you can get the infrastructure for phishing for the cost of absolutely nothing; on the topic of the previous talk, a budget of zero dollars. So it's very cheap, it's very easy. And again, there's this notion that when you have an SSL certificate installed on a web server, you get
this green lock. And a lot of people, if you talk to your aunt or uncle or parents or non-technical people, they're like, well, green lock, it's secure, right? And that link is encrypted. Secure? It depends on your definition. And I did want to highlight this. On the right-hand side, this is from PhishLabs, or PhishMe, which is now, I think, Cofense, and they're showing this trend. I mean, that's not linear; that sure looks exponential to me. And it's the trend of phishing attacks that use SSL over HTTPS. That's kind of blistering. For me: I was on Twitter, again, in the middle of November, and I saw this very popular prototype by a threat researcher who goes by x0rz. It's the first one I saw. I've
seen probably 10 more since, and it's the best that I've seen. He's essentially using keywords. If you can see the small font on the left-hand side and in the middle right here, he's saying: from this fire hose from CertStream, if you see this keyword, give it 60 points; if you see this one, give it 50; 40 for YouTube, Facebook, Twitter. And on the right-hand side is some of the logic for the scoring: if the score is over 100, put it in big red font; if it's between 65 and 80, say it's potential. And when I first saw this, I was thinking: okay, if I'm going to use this, you can do a git clone, you can run it. But if the scores were zero, or just question marks,
or I had to fill them in, how would I know what to do? And that's what got me thinking about machine learning to build a similar prototype. When it comes to machine learning, I immediately thought: is this the most optimal scoring framework? If you run it, it works great, but I don't know if it's the best, and I don't know how it holds up as the trends for phishing change over time. For example, we weren't really talking about blockchain or Bitcoin even a year and a half ago, let alone three or four years ago. How do you know that's the most optimal scoring framework? Do you have the best keywords, or are some unnecessary? And then looking
at Levenshtein distances, which is a fancy reference to similarity. So if you have paypal and pavpal, the Levenshtein distance between those is one: there's one character substitution. So if you take these targeted brands, a lot of times you'll see typosquatting, and when I'm building a predictive model, a really good feature is to look for words that are similar to these top 30 or so brands. I'll get into that in just a moment. As far as machine learning goes, especially with predictive modeling, you need historical data. Do we have training data? Do we have examples of what phishing domains look like? Yes. Can we characterize what I look for in these domains as features? The
answer is also yes. So with that, I thought I could build a predictive model, and that's what this is. By a show of hands, has anyone here built a predictive model before? A couple. So, there's no real strict definition of the life cycle; I've loosely broken it into four categories. First, I have the idea for what I want to build. Second, I have to wrangle the data: data normalization is really important, since I might be ingesting URLs, or just domains, or hosts, or fully qualified domain names, and I have to normalize them before sending them into my pipeline. Third is feature exploration: looking at keywords, looking at brands, looking at Levenshtein distances, and I'll get into that as well. And fourth is
evaluation: doing the deployment, seeing if I need to do any correction of the model. It's an iterative cycle. As far as supervised learning goes, we'll start in the top left, just as a very brief primer. Let's say I have 5,000 malicious phishing domains, and I have 5,000 benign ones; that's in the top left. And then I have labels for those: zero means benign, and one means phishing. And then the feature vector, which you'll see in the top right, is a really fancy word for characteristics. If you were to describe what I'm wearing: I'm wearing a jacket, I've got brown shoes, I'm wearing a belt, I have a watch, I have this thing in the corner of my eye. That feature vector is just kind of like a
sparse matrix for each one of those, I guess it would be people; for me, it's each one of those domains. How do you describe them? You've got categorical features, like present or not present, which is usually zero or one, and you have continuous features, like the number of dashes: it could be one, it could be six, and you can compare those. And essentially all you're doing, when you're training your model, is taking that feature vector with a label: these features right here are all malicious, these are benign. And then that magic, the unicorn that a lot of the vendors will talk about at RSA in, I guess, a month or two: that's all done by the algorithm. It can determine the optimal
weights and distinguish between those two classes. When it's done training, you have a predictive model. And for me, in this use case, what I'm doing is plugging in that CertStream feed; for each domain, or each host, I'm extracting the features that I defined during training and passing them to the predictive model, and it says phishing or not phishing. Any questions on that? Yes.
I took the prototype that I showed earlier, by that security researcher x0rz. I looked at a very popular phishing regex that SwiftOnSecurity has on GitHub. And I just said, loosely, to cast my net: run these against that fire hose. I don't know what's good or what's bad, but I need something to look at. So that's what I did. There's another resource called PhishTank, which has a bunch of different URLs, and I looked at the top Alexa domains to know what benign looks like. That gave me a place to start. Again, I still didn't know ground truth, which I had to dig through myself for a
lot of these domains. The thing that's nice is it's not like malware where I have to go take it and maybe put it in a sandbox and wait five minutes. I can just look at it. And you'll see that from this as well.
As far as features: I had, again, a rough idea of which keywords to look at. What I ended up doing was going through that massive list of domains, splitting out the ones that I thought were malicious, and then taking each domain and doing a regex split on special characters. That gave me this laundry list, and then I counted the most common tokens. You'll see I didn't take out the TLDs, but when I was doing this myself, I just skipped those. You can see account, Apple, Apple ID, PayPal. I ended up with about 150 that kind of encapsulate what I would look for when I'm
looking for phishing domains. Some popular brands: again, I took some from that prototype, and I added a bunch more that I didn't know to look for, from PhishTank. What's great is you can search PhishTank by the targeted brand. So GitHub, and lots of banks that I hadn't really thought about, or that maybe aren't popular in the US, that are still prime phishing targets. And I added a few banks and email services myself that were missing. And I searched through, again, the initial net of uncharacterized data to seed my training sets. For Levenshtein distances, I identified probably 20 or 30 different words. Again, this is just a measure of similarity between two different strings. So I've got two examples.
I was doing Levenshtein distances on paypal, and you can see there's playpal, there's parpal, there's onepaypal. And the same with appleid. Another big problem if I'm using signatures to try to catch all of these: I ended up finding over 700 different variations of paypal within a Levenshtein distance of two. I didn't know what all of those were, but I know that if one is right in front of me, I'd say, well, obviously that's not PayPal, it's phishing, especially if it's coupled with some of the other keywords. Any questions on that? Okay.
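A sketch of the two tricks just described: the classic dynamic-programming Levenshtein distance, plus splitting domains on special characters to count common keywords. The sample domains are made up for illustration.

import re
from collections import Counter

def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

print(levenshtein("paypal", "pavpal"))   # 1: one substitution
print(levenshtein("paypal", "playpal"))  # 1: one insertion

# Keyword counting: split candidate domains on special characters.
domains = ["appleid-verify.account.example.tk",
           "secure-paypal-login.example.ml"]
tokens = Counter()
for d in domains:
    tokens.update(t for t in re.split(r"[\W_]+", d) if t)
print(tokens.most_common(5))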
A few other features. This is a really important one for me: looking for these targeted brands, and their presence in the domain or in the subdomain. On the left-hand side I've got a bunch of benign ones, and I mean, on the surface I don't really know what these are. That sp-authoring...support...services.microsoft.com one was flagged as definitely phishing in my first few iterations; but maybe it's a misused service by Microsoft, it's not phishing, because microsoft.com really is the second-level domain plus the top-level domain there. Conversely, if that microsoft.com is present as a host on a different domain, that's definitely phishing. So on the right-hand side, you've got
apple.com and then verify-account-information; that's bogus. That's not real, that's phishing. And the thing that's great is that I can characterize what I'm looking for using Python code. At 1,000 to 5,000 domains per second, I can't eyeball them all; but if I get the program to look for it, it can do it a lot better and a lot faster than I can. A few other features. I looked at the TLDs, the top-level domains: .com, .net, .co.uk, and I was creating categorical features. What's nice about machine learning is you can use one-hot encoding, which, based on my training data, will create all the categories for me. So I don't have to hard-code all the different top-level domains; I can say, here's the data
that I think is representative, and it will split the unique values from those categories into their own individual columns. I looked at number of dashes, number of periods, and domain entropy. I kept seeing this appleid-verify-support-whatever-whatever dot twelve-random-numbers dot tk, so, looking at the domain entropy, I figured the phishing examples would have high entropy and the normal ones probably would not. One of the things I also found along the way is examples of URL padding. So the top two domains, or rather fully qualified domain names: if you were to look at those on your computer, you'd say, that's obviously phishing. But what's tricky is that if you look at it on a
mobile device, it kind of looks like it might be legit. And especially when you're talking about users who aren't tech savvy, it's not unreasonable to think that they would actually log in, or that it does look legit. As far as the algorithm, the secret sauce, the thing that, if you've never built a model, people might call the key ingredient, immediately pointing to the algorithm: for me, I used logistic regression. It's fast to train, it's good for sparse data sets, and I'm doing binary classification, so linear algorithms work well. But honestly, that's not the key ingredient. It's really knowing what features to look for, and then having the data. When I swapped this out
with random forests, or when I tried a neural net, just because that's what everyone talks about: it was completely unnecessary for my use case. I found that this made the most sense. And the reason is that for each feature, it's assigning a weight, and one of the things I'm looking at here, when you're distinguishing the positive class from the negative class, is the features that have the highest weights. And you can see: if paypal at a Levenshtein distance of one is there, it's very indicative of phishing. If you have, a security, I can't really read that. If the TLD is .ml; if the login keyword is there. On the other side: if the TLD is
com.au, or if there's an Alexa match, right there, those kind of distinguish those classes. So when I go back to that static, regimented prototype: this is great, because I don't have to determine the feature weights. The algorithm does it for me, based on how I tell it to look at these two different classes. So again, you really need the features, and you really need the training data. And a lot of vendors, especially ones doing supervised learning or predictive modeling, will claim they can detect the APT attacks of seven years from now, while it requires having historical data. Fortunately, for this use case, I have a lot of it. For performance, there's a ton of different metrics. The
only one I'll hit on here is the confusion matrix, which makes the most sense, even though it's named the confusion matrix. When the true label is not phishing, my classifier is predicting not phishing as well, and that's the overwhelming majority of those. There's one instance from my training set where it's actually not phishing, but my classifier says it's phishing; that's a false positive, right there. Here we have false negatives: it's actually phishing, I would like to see an alert, but my classifier says it doesn't look like phishing. And then, conversely, here's the other side, where it says it's phishing and it actually was phishing. There are a bunch of different metrics which, especially with the 30-minute slot we
have, I can't go through all of them, but I've added the code, and I've added very specific text on how to interpret them. And if you're looking at working with machine learning vendors, I would encourage you to ask some of those questions; you would throw off a lot of marketing and sales guys. As far as the demo: this has been running since this morning. I'll show you the bottom part. This is the configuration, if you want colors.
I thought the colors were cool. If you think they're stupid, you can turn them off. If you want to look at the issuing CA, I have that enabled. If you want the log source, there's certain logs that Google has for collecting and feeding in these domain names and certificates. You can turn that on. The root CA, you can enable that, the timestamp. There's a lot of classifier management. So if you want to tweak and toggle the features and train multiple classifiers, I'm storing the classifier, the feature vector, and the feature extraction mechanism in a Mongo database. So you can swap between them as you want. If you want logging, you can store them in a
folder. And I've defined three different thresholds: if it's over 0.9, I want it red; over 0.75, I want it yellow; and over 0.6, I want it cyan. So when you run that against the CertStream feed, this is what it looks like. You can see the red ones actually look really good: apple dot something dot com.tw. Some of the yellow ones, and especially the cyan ones, maybe not. This one that scored 67.3, that one looks sketchy, even though it's got a low score. There are quite a few. And I would say, when I go through this, maybe 20% to 30% of the time they are malicious on VirusTotal; the other 70% to 80% of the time, they're all marked clean. And then when I look at
it, either with a honeyclient like Thug, to see if it's serving anything, or maybe I'll look at it in a sandbox, and it'll show me that it's serving this fake login page. I'll look at the cert, and it was issued that day. This is what the command line looks like. Again, I'm going to open source this; I should have it published on GitHub this week. You can deploy it against CertStream, option two, or you can just run it in command-line mode. So if I put in appleid.suspicious.account.activity
.some-random-domain.co.uk, it should have a high score. That's 1.0; that's as high as it gets. What's really neat, though: maybe Apple is a really strong keyword for me, so I'll put in apple.com, but it says not phishing. That's because I've got labeled examples of domains from apple.com in my classifier: if it's present in the second-level domain, it's probably not phishing. Same with PayPal. And now I'll pivot over to the Jupyter notebook. Again, this is what I'll be releasing as well. It's got a few different snippets from my slide deck. I'll write a blog post and put that on Medium, I'll be sharing my slide deck, and I'll be throwing everything on
GitHub. And again, this walks through everything from step one. I'm including the training data that I'm using, and I'm showing how I'm loading it, and you can do this right now. Having loaded that data, how I define and compute these features is done in this method right here. There's quite a bit of code there, but it's all there, commented to the best of my abilities. My core responsibility is not development, but I used to work in QA, and I know code comments are helpful. Python's great because, relative to other languages, I think it's very easy to understand. Here's a little bit of a write-up of how I computed my features, and then how I go about doing that. It'll probably take 20 seconds while it's
running. I'll slide down a little bit. Training the classifier, this is the code that I used. Again, I used logistic regression.
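Since the notebook itself isn't reproduced here, the following is a minimal end-to-end sketch of the approach described in this talk: hand-rolled features (dashes, dots, entropy, keyword flags) plus scikit-learn logistic regression. The feature set and the tiny inline training set are illustrative stand-ins, not the speaker's actual code or data.

import math
from collections import Counter

import numpy as np
from sklearn.linear_model import LogisticRegression

KEYWORDS = ["login", "verify", "account", "secure", "appleid", "paypal"]

def entropy(s: str) -> float:
    # Shannon entropy of the character distribution (the feature above).
    counts = Counter(s)
    return -sum(c / len(s) * math.log2(c / len(s)) for c in counts.values())

def features(domain: str) -> list:
    return ([domain.count("-"), domain.count("."), entropy(domain)]
            + [float(k in domain) for k in KEYWORDS])

# Toy labels: 1 = phishing, 0 = benign.
train = [
    ("appleid-verify.account-login.example.tk", 1),
    ("secure.paypal.com.account-update.example.ml", 1),
    ("login-secure-verify.example.win", 1),
    ("paypal.com", 0),
    ("apple.com", 0),
    ("news.ycombinator.com", 0),
]
X = np.array([features(d) for d, _ in train])
y = np.array([label for _, label in train])

clf = LogisticRegression().fit(X, y)
for d in ["appleid.suspicious-account.example.co.uk", "stackoverflow.com"]:
    print(d, round(clf.predict_proba([features(d)])[0, 1], 3))  # P(phishing)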
The documentation explains what these facilities are and how to use them; I found it to be really helpful. We can go ahead and train the model right now, and we can explore some of these metrics. So, for my training data, I've got really good performance. Here's that diagram of the feature weights: you can see the account keyword is really strong, the TLD being .win, the Netflix keyword; a big one is Bank of America in the subdomain. I mean, that's pretty obvious. But again, the thing that's great is that I didn't build this diagram by hand; I just asked the algorithm: what do you see as the most important features, on both ends? Here's how to look at the data, here's the actual data, and then that's that. Precision and recall; I've got
a few more: true positive rate, false positive rate, if you want to tune the threshold based on your risk appetite. And then the classification report and the confusion matrix. And down here, I've just added: you can write your own domains and see how they get scored. You can see some of these look like phishing, but even this one, api-whats-good-actually-not-phishing.paypal.com: it's got a lot of dots and dashes, but my classifier scored it at 9%, and the malicious threshold is 50%. And again, this is a 30-minute talk, and I'm trying to be mindful of everyone's time. I do still have my slide deck.
I have a couple examples of things I found along the way.
Again, this is a lot of what I see of what gets flagged. On VirusTotal, there's nothing there. I tried to hit it from the command line, and they were probably filtering on the user agent: if the user agent is curl or Python requests, it throws a 403 Forbidden. But when I visited the domain in the sandbox, it showed, behind this dialog, the PayPal login screen. And there's the certificate; I did that on December 28th, and it was issued that day. By the way, I've also collected probably 100 or more phishing kits, where I'm probably hitting that host before it's operational, or they're using a resource on that host to serve their login page. But what they've left there is a few different folders,
and I went into PayPal, and there's this TGNTV5.zip, and I just sent that hash to VirusTotal, and it was flagged as a web shell by one of the vendors. And I've collected, again, probably 100. IP address resolution: you could go crazy with this; I haven't really explored it. A lot of them are sinkholed, 45 of the ones that I had. You could do a whole bunch of metrics to see if it's shared hosting or dedicated hosting. You could go crazy with passive DNS. You could look at, I mean, tons of things: the hosting provider and things like that. It's not a unicorn. It's not magic. It's not a silver bullet. There are some things where I haven't finished exploring how to go about them. For
example, Punycode encoding. This is where different language sets have different types of encoding, where a character might look like a Latin little L, and you see apple.com, but it's not really a little L. It looks different in Chrome and Firefox. It's got xn-- and then most of that domain. But I haven't really explored whether I can support that or whether I should ignore it. Also, if the suspicious attributes, as far as the input to my prototype goes, are only present in the URI that's being fetched, not in the domain or the host name, I'm not going to see it. And then another one that I am eagerly awaiting is Let's Encrypt
wildcard support. This was due on February 27th, but they got bogged down and they're about two weeks behind. What I'm wondering is, if all of these threat actors, or whoever's creating all these phishing domains, all of a sudden start requesting wildcard certs, I don't know if my classifier will catch the attributes it's looking for. With that said, I have spot-checked and eyeballed it, and for probably 60% to 70% of what I'm flagging, the attributes are present in the domain. But putting my attacker hat on, that's exactly how I'd beat this. With that said, I do have a trained model. And if you're going to operationally deploy something
like this, and if you're really serious about phishing: threat feeds are great, but you can do this analysis in milliseconds, and for me the only data that I have is the domain. If you were to deploy this at, say, a DNS server, then when those DNS responses are going out, that's when you would see the full host. That would be cool. And then maybe you could block it, or send an alert, whatever your workflows are; that'd be interesting. And then, candidly, the recall rate is unknown: from this fire hose of domains, I don't know how many I'm missing, because I'm not looking at ground truth
for 1,000 to 5,000 domains per second. If you have a way to do that, please come see me after; that'd be great. And that's it for me. It's about 30 minutes. I'll be around afterwards; we have the break now. I'll be publishing this on GitHub. And if you have any questions, just let me know. If you think it sucks, if you like it. Thank you very much.
Besides the fully qualified domain name, why not analyze some other aspects, like the certificates themselves, or maybe even the HTML, for keywords? Yeah, you could absolutely. I think, for this side project, with the benefit of time, that's what I would like to do. Honestly, that's what I'm looking at for a Black Hat submission.
I don't have the obligatory intro slide, so just really quickly: I'm from Calgary, so, oil and gas. Worked in the industry for 15 years or so. Started as, you know, a security tech lead, then led some teams, and then kind of got sick of it. So I joined a company called ION. We are a network security value-added reseller. Along the way, I started teaching for SANS as well. I teach in the pentest curriculum, incident handling, industrial control systems, and I'm starting to teach some of the blue team courses. I'll be here in Vancouver in late August, I think, teaching SEC511, which is Continuous Monitoring and Security Operations. And there's also a SANS conference coming to Vancouver in June, I believe, which
has a few more tracks as well.
A lot of what I've been doing over the last, I don't know, four or five years is pen testing, and I'm sorry to say that I haven't had to raise my game too high yet, especially in Western Canada. We don't have a lot of pharmaceutical, healthcare, insurance, finance, people who really care about their information. So there,
it has been less challenging than I'd like, and one of the big reasons for that is Active Directory. And now my clicker has stopped working. There we go. I'd actually previously given another presentation about all the things you can do without buying products, and these are some of those things, so I'm not going to spend a lot of time talking about them. But if you don't know your environment or your business, then you are not securing anything. My pet peeve is people who buy tools and plug them in, and the blinky green lights come on, and they walk away. How about we spend more time configuring things and using them appropriately, and less time buying more. Egress filtering, again: if I can talk outbound to the internet
to my heart's content, then you really have no visibility into what I'm doing. Multi-factor auth is probably the single most important basic security measure you can employ. A buddy of mine works in threat intel for Mandiant, and he's convinced me that it is the one reasonable product slash defensive measure that is going to help in every scenario. It's no silver bullet, but it's probably the first place to start. Management access, privileged access, passwords (I'll talk about passwords more), visibility, patching: they're all simple, simple things. They don't require products. We don't do them well. They make my life as a pen tester pretty easy.
So I've used this graphic in a few different contexts. That guy's your Windows admin. That guy's your security lead. Typically, Windows guys are focused on operations: how fast can I get everybody to the cloud, keep the
lights on, keep things moving, keep my tickets to a minimum. And honestly, there's not a lot of security specific training for Windows admins. There really isn't. And on the flip side, most of the people I run into on the security side, they're network people, right? They've come up through the network in some way, shape, or form. And so security is all about packets and firewalls and that kind of thing. But who's doing the high-risk stuff? Well, I just mentioned it, right? I've been in a large number of organizations going to some form of Azure or Office 365 or what have you. The Windows guys have been architecting this for the last two years and no one's taken a look at all from a security point
of view. So it's not a priority. I'm not saying Windows guys don't care; they just don't know what they don't know, they don't know what they're not doing, and they don't have time. They're busy doing a whole lot of other things. So I want to talk about what we can do better, in the context of acme.com; you'll see Wile E. Coyote and the Roadrunner scattered throughout here. The bottom line is Active Directory controls everything, or most everything. In, I don't know, more than 50% of organizations, I can log into routers, switches, and firewalls with Active Directory credentials in some way, shape, or form. Databases, HR systems, finance systems: if I have domain admin, I have some way into all of
those areas. So if I control Active Directory as an attacker, I control everything. We need to be doing a lot more to protect it, and it is a massive attack surface. It's a source of truth, whether for identity, for authentication, or for things like group membership. It's used by everything in your environment. Now, this isn't really about the MITRE ATT&CK matrix, but if you haven't looked at it, do take a look at the MITRE ATT&CK matrix. I put it up here really to say that I'm not here to talk about specific attacks or specific exploits or specific tools. This is really about categories of attack, or attack patterns: things that attackers are gonna do, and things that we can do to protect
against them. So MITRE ATT&CK is a list of probably over 100 different attack techniques, like DLL injection, for example, just picking one out of the air, that you may want to consider defending against, as opposed to something more specific like WannaCry. So I've got seven areas of security as it relates to Active Directory that I'm gonna talk about, and how we may want to protect against the attacks in those areas. The next thing is: have a plan. I was talking to a few folks earlier today. Most of our Active Directory environments, unless you're brand new, are five, 10, 15, 20, 30 years old. Well, Active Directory is 2000, so I guess I can't go quite back that far. But
the Windows environments have some inertia. They've been around for a while. And everybody knows what every single account, every single computer, every single application is there to do, right? So it's easy to make changes. Yeah? No. So we're not gonna fix this all at once either. It took a long time to build these environments, so it's gonna take a little while to repair or unravel them. And you know what? We're gonna break things. I'm all for change management, but we also have to consider the risk of not changing things, and we have to have people who are capable of fixing things when they do break them. To me, that's part of the solution to get where we
need to be. And you know what? There's work involved. There are some great products out there, but you can't go out and buy the magic bullet to fix Active Directory, right? You have to do some work, and that's gonna take time. I've talked to a lot of people about, for example, data loss prevention. They say: well, people can take data out of my company by taking pictures, and they can plug in USB and DVD, and they can upload to the cloud, and they can print, so I'm just not going to worry about it. I give up. Is that really the approach we want to take? Because it's the same thing with Active Directory. There's a million different ways you can own Active Directory.
So let's just give up and not look at any of them. How about that? Is that going to get us anywhere? We've got to start somewhere. Let's start thinking about where we want to be in, well, let's see: we're just getting rid of 2003 in 2018, so 2016 plus 15 years, let's figure out where we want to be in 2031. How about that? No, seriously, let's think about how we can get to where we want to be over the next couple of years. It's not going to happen overnight. And I might mention the odd product, but really this isn't about products. This is about what you can do with one or two motivated people who have some knowledge and some ability to effect
change, and who have a mandate. Because that's all it really takes, and that investment pays off.
For companies that have your typical password policy, you know, eight characters, upper, lower, special, change every 90 days, about 2% of those users will have a password of Winter2018 right now. It's over 100 organizations I'm drawing this data from, so it's fairly empirical. About 2%. If you take the words on their website and mix in a little leet speak and add one to four digits, that's another 10% of their passwords. And if you're familiar with Crackstation's password dictionary, that gets you to about 20% of the organization. I mean, there's overlap there, but that's a good 20% of passwords straight off the bat. And no, we're not sitting there guessing a million passwords. The
typical engagement: go get a list of users from LinkedIn, figure out usernames, try Winter2018 against all of them. One guess, and usually something pops. Occasionally I have to go to a second or third round of passwords, but not usually. So we gotta do better with passwords. And service accounts, again: the passwords are never changed, because you don't know what that account controls, right? Who's gonna be responsible for fixing it if it breaks? How about someone who can just do a little bit of basic troubleshooting? Is the problem here or here? Okay, now is the problem here or here? Let's fix some of these things instead of kicking them down the road. LAN Manager: not on
by default anymore, but there's a whole lot of service account passwords that are stored as LAN Manager hashes still, because they don't get removed from your Active Directory until your password history rolls off the back. So suppose you're remembering 24 previous passwords because you don't want people reusing their passwords. Which doesn't really matter, because honestly, if you're using Winter2016 right now, it might take me a little longer to figure that out than if you were using Winter2018. But every time you change a password, it rolls your history one password, so if you've got 24 passwords in your history, those service account passwords will be in LAN Manager form for a very, very long time. And of course we have
the greatest password synchronization tool of all, which is the human being. They do a great job of synchronizing passwords. So what do we do about this? First of all, 15 characters or more. It's not that hard.
As you would guess, I've had computers for a long time. From the time my kids were four, they had passwords that were 15 characters or longer. It's really easy: tell me two things you like and one thing you don't like. Okay, well, I like the dog where we dog-sit, actually, so the name of the dog; I like to ride my bike; and I don't like my brother. There's your password, right? Combine those three words. It's easy to type, it's easy to remember, and you're not going to guess it very easily. So in this case, what does Wile E. Coyote like? He likes rockets, he likes the Roadrunner, and he doesn't really like cliffs. That's
a pretty long password if you jam those words together. Easy to type, easy to remember. And those of you who still think it's not easy to type can go ahead and show me your text message history and then try to defend that position, because there are very few people who can't type these days anyway. So, 15 characters or more. And you know what? Don't change it. If your users choose a bad password today, guess what they're gonna choose in 90 days? Another crappy password. Yes, you have to change occasionally, especially if you suspect a breach, and so on and so forth, but 90 days is ludicrous. That dates back to 1984 when, I think it was NIST, came up with their green book, which said
it would take six months to try every possible eight-character upper, lower, special password over whatever a dial-up modem was at that time, when you had to actually put the handset in the modem. Anyone remember those days? That's where our password policies came from. Some auditor turned that into a requirement, and here we are today. Who cares? If you've got a good 15-character or longer password, you don't need to change it unless you think it's been compromised in some way. So you do it occasionally for hygiene. I'll talk about managed service accounts a little later, and other kinds of password tools that help us with the non-human passwords. Honestly, complexity at 15 characters is less important. I would rather encourage than mandate, because
if I'm mandating it, we're gonna end up with Winter2018 with maybe an exclamation mark at the end, because that's 11 characters if you put the exclamation mark there. So it should be encouraged. And I'm now on the side that single sign-on or SAML is better than not: yes, if I get control of your session or your account, I get control of an awful lot of things, but it's one of the few carrots to actually get people to choose decent passwords, because they don't have to type them all the time.
Kind of related to that is local admin access, the number two way you're gonna lose. We still find group policy preferences containing passwords. For those that don't know, GPP is how Microsoft originally recommended you set local admin passwords on endpoints. The way it does that is it puts the password in a group policy preference file on your domain controller, which is world-readable, and they published the encryption key for how that password is encrypted. Microsoft doesn't let you set those anymore, but that doesn't mean you don't have them kicking around in your SYSVOL.
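Checking for leftovers is a one-liner kind of job. A minimal sketch, hunting your own SYSVOL for the cpassword attribute; the domain name is illustrative:

```powershell
# Hunt SYSVOL for leftover GPP files that still carry a cpassword value
Get-ChildItem -Path '\\acme.com\SYSVOL' -Recurse -ErrorAction SilentlyContinue `
    -Include Groups.xml,Services.xml,ScheduledTasks.xml,DataSources.xml,Drives.xml |
    Select-String -Pattern 'cpassword="[^"]+"' |
    Select-Object -ExpandProperty Path -Unique
```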
Local admin passwords that are the same across all systems: again, if I get one of them, then I'm logged into everything, especially when you use the same one on servers as on workstations, or you use the same one for domain admin as you do for workstation admin. So, sorry, don't use the same local admin password everywhere. And users with admin privileges: we still see these a lot. The migrations to Windows 10 are starting to take care of some of them, but users don't need local admin except in very controlled circumstances. And even then, there are lots of ways to give them admin access for a time period, or for the task that they need it for, rather than carte blanche. Because guess what they do when they have it: well, I'm going on vacation, so I've got to add Sally as an admin, because I'm not going to be here, and now she
needs the same thing, and on down the road, right? We get admins adding admins adding admins. And then privilege escalation is still a thing, right? If you look at the Patch Tuesday bulletins, privilege escalation is almost always rated important, not critical. But you know what? People click links. We know this, right? It is not a big step to get local access on a system, so privilege escalation is still a critical challenge. The biggest thing here is LAPS. If you don't know what LAPS is, Google LAPS: Local Administrator Password Solution. Really, really simple. And this is my favorite, because there are no arguments against it that are really very good. It does not impact the business one bit to turn on LAPS. It's pretty easy
to implement, and it just changes all your admin passwords. Basically, every endpoint picks its own admin password on a frequency that you choose and stores it in Active Directory. You say: it stores a password in Active Directory, isn't that a problem? Well, you have to be a domain admin, or you have to delegate that (and don't do that), to read those passwords. So if I'm a domain admin and I can read your local admin password: wait a minute, I don't care, I'm a domain admin. So just turn on LAPS. It will set a completely random, long password on every local admin account, a different one on every account, every day, every hour if you want, every 30 days, who cares. But then if I steal
hashes or steal credentials on one machine, it doesn't matter: I can't move around the environment like that.
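The moving parts are small. A minimal sketch with the AdmPwd.PS module that ships with LAPS; the OU and computer names are illustrative:

```powershell
# Basic LAPS setup with the AdmPwd.PS module (run with appropriate rights)
Import-Module AdmPwd.PS
Update-AdmPwdADSchema                          # one-time schema extension
Set-AdmPwdComputerSelfPermission -OrgUnit 'OU=Workstations,DC=acme,DC=com'
# Later, an authorized admin reads one machine's current password:
Get-AdmPwdPassword -ComputerName 'WKS-042'
```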
More on local admin access, especially with virtualization and out-of-band management type stuff: why do you ever need to log in as a local administrator account, other than when you're building a system? Almost never. Another way of putting that is: when is your system not connected to the domain in some way, shape, or form? And if it's not, well then, get a console in some way, shape, or form. You don't need to use local accounts to connect to something over a network. So turn it off, right? This is kind of goofy: you actually have to type 'Local account and member of Administrators group', or just 'Local account', into the user rights assignments for deny access to this computer from the network and deny log on through Remote Desktop. Again, why should you ever log in as a local account over the network? Then, even if those accounts are compromised, you can't
use those credentials over the network. You can get a VMware console, or if it's somebody's laptop, you can walk up to it and type in a password. Then there's protected users.
You can designate protected users, I think as of 2012 Windows domains, and protected users do not cache passwords. So you don't get SSO: if you RDP into something, you can't map a drive without typing your password. Oh well. You can't use weak encryption mechanisms for Kerberos, you can't use NTLM; there's a bunch of things that protected users can't do. So at least for your administrative users, make them protected users. It's just a group: put them in the group, and there's a lot less that I can do as an attacker to get access to those accounts or reuse those passwords.
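And it really is just a group membership. A minimal sketch, assuming the RSAT Active Directory module; the account name is illustrative:

```powershell
# Put an admin account into the Protected Users group (2012+ domain)
Import-Module ActiveDirectory
Add-ADGroupMember -Identity 'Protected Users' -Members 'wcoyote.adm'
Get-ADGroupMember -Identity 'Protected Users' | Select-Object Name
```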
Use restricted groups for the local admin accounts. A restricted group basically means that anytime someone makes a change to that group, it changes back, so you can't have admins adding admins. I dictate who is a member of this group, the local Administrators group on my endpoints, and that's it. Usually what we do is create an Active Directory group, a domain group, for every single computer, put the admins into that domain group, put the domain group into the local Administrators group, and then lock the local group: you can't change it, you have to change it at the domain level. And there's a million tools; everything's PowerShell these days. But check some of these things: with permission, run some of these things in your environment and see what comes back. Get-GPPPassword: find the passwords in your SYSVOL. Get-System.
Are there unquoted service paths? Are there other things that will allow me to get SYSTEM or administrator-level access on a given endpoint? Or check everything: Invoke-AllChecks. These are in PowerSploit and PowerShell Empire; if you Google those names, you'll find them. These are table stakes, as I say. These are really simple things that nobody ever does, and if you're not doing them, your threat intel feed, even your firewall, is useless, because you don't need to be looking at threat intel; you need to be stopping these kinds of attacks.
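If it helps, this is roughly what running the checks named above looks like; lab use only, with permission, and the paths are illustrative:

```powershell
# Run the PowerSploit checks named above (in a lab, with permission)
. .\PowerSploit\Exfiltration\Get-GPPPassword.ps1
Get-GPPPassword                  # GPP passwords still sitting in SYSVOL
. .\PowerSploit\Privesc\PowerUp.ps1
Invoke-AllChecks                 # unquoted service paths, weak service ACLs, and more
```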
So my favorite is inertia. I was watching a football game a few months ago (I think it was actually the Lions, to be honest), and the commentator said: BC is really getting a lot of inertia here. I'm thinking, you know, I don't think that's what you really mean to say, because inertia means we're not doing anything. So there's still XP, 7, 2003, 2008,
less than two years away from dead, right? 2008's pretty close. We've got old versions of Reader and Flash, and old accounts hanging
around, because, well, I don't know who uses that account, but it has the name of a high-profile application in it, so I don't want to risk breaking that. LAN Manager, NTLM version 1, SMB version 1: for every one of those, even Windows XP can use something newer (maybe not SMBv2, sorry, that's not true), but even so, these are old, old protocols, and they're abused in so many ways, primarily because they use some form of weak encryption, whether to encrypt data at rest or to send passwords over the network, or one of those things. And these organizations have been around. I come from Calgary, right?
Upgrading my Windows XP isn't gonna get more oil out of the ground, so why would I do that? Because you might not be in business. I had a guy who worked for me when Shamoon hit Saudi Aramco; that was almost five years ago now. They recruited him using Hotmail, because they had no working systems. I think they went to Windows 8, so they must have been running XP at the time, and they ended up doing a very large migration of 38,000 users to Windows 8 on the spur of the moment. But they had nothing, because there hadn't been any impetus to upgrade prior to that. So what can we do?
Stop testing already. I'm all for validating patches, validating updates, validating new configurations, to a point. But if I have to spend hours or days testing something that takes me about two seconds to implement and two seconds to back out, what am I doing? Do your basic smoke testing, absolutely, but let's just change stuff, fix stuff, upgrade stuff. We've got VMware, for crying out loud: take a snapshot and revert. We don't need to test every single eventuality. Incidentally, all of this is a rant, and there are exceptions to every rule, including some of these comments. There are people who are trapped running Windows NT4 for whatever reason. But then do something about
that. Segregate it, or remove it from the domain. Have a different domain for things that are broken; only allow it to trust your real domain, don't trust the other way. Segregate at the network level. Crank up your security settings. Okay, you've still got to run that XP or 2003 machine, and you're doing it to run this one application, and that's all it's going to run: application whitelisting. Does anyone need to connect to it over the network? Can we disallow all inbound connectivity? There are all kinds of things you can do, and virtualizing it makes a lot of them easier. But my favorite, honestly, is just frickin' upgrade and see what breaks. Upgrade in a lab, see what breaks. And if nothing breaks,
upgrade for real. If something breaks then, roll it back; at least you know what you're dealing with. Because if your ops guys can't fix problems, then you need new ops guys. So if something breaks, figure out how to fix it. This is a big one: schema and functional level.
This is where, especially for the directory itself, all the security features go: they come in with the new Active Directory schema. With 2008, and 2012 in particular, there's a big jump in terms of capabilities for security, and 2016 adds a bit more. And yes, there are boundaries: I think 2003 servers can run in a 2012 domain, but they will not work in a 2016 domain. So there are some limits to this, but you can run a Windows 2003 server in a Windows 2012 domain; there's no reason you can't. All you gotta do is get one domain controller to 2012 and you're good, so that you can raise the functional level and turn on some of these things. Yeah, you got a question? Well,
you don't have to have two, but yes. I mean, get more than one up to 2012, but as long as you have at least one, then you can run a 2012 domain. Yep, it may not have all of the capabilities, but it will authenticate users, for example, when the primary's down. I would do this over the course of a weekend and get several up to 2012; I'm not saying run with just one. But you just need to cross that boundary of having one domain controller, and you can upgrade your domain. With 2008, some of the highlights: fine-grained password policies. You can specify different password policies for different sets of users. It's kind of cryptic, you have to do it from the command line, and it's ugly, but you could say: all my admin accounts have a different password policy. If you can't get your users to use long passwords, at least force it on your admins.
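For reference, this is roughly what that looks like today with the AD PowerShell module; the names and values are illustrative:

```powershell
# A stricter fine-grained password policy, applied to the admins
New-ADFineGrainedPasswordPolicy -Name 'AdminPwdPolicy' -Precedence 10 `
    -MinPasswordLength 15 -MaxPasswordAge '365.00:00:00' -ComplexityEnabled $true
Add-ADFineGrainedPasswordPolicySubject -Identity 'AdminPwdPolicy' -Subjects 'Domain Admins'
```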
Managed service accounts: with 2008 it was per server, so the account would only work on one server, but Windows would manage the password and continually change the password for that account. It only works for certain things like IIS and SQL Server and Exchange, but it's something. With 2012 you get group managed service accounts, so it can work across multiple servers, and you get a UI to choose your password policies.
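A minimal group managed service account sketch, assuming a 2012+ domain where the KDS root key exists; the account, host, and group names are illustrative:

```powershell
# Create a gMSA whose password Windows rotates automatically
# Add-KdsRootKey -EffectiveImmediately   # one-time prerequisite per forest
New-ADServiceAccount -Name 'svc-web' -DNSHostName 'svc-web.acme.com' `
    -PrincipalsAllowedToRetrieveManagedPassword 'WebServers'
Install-ADServiceAccount -Identity 'svc-web'   # run on the member server that uses it
```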
And protected users, as I mentioned, come in with 2012. You can put users in the Protected Users group, and their passwords do not follow them around: they don't get stored in memory, they can't use NTLM, they can't be delegated under Kerberos, and they can't use weak ciphers with Kerberos. Restricted admin mode is kind of like protected users lite: if you RDP into something with restricted admin mode, again, your credentials don't follow you onto that server. LSA protection: the LSASS, the Local Security Authority Subsystem Service, is where all of your passwords are stored, in as good as plain text in most cases, because, again, how else does SSO work? Windows knows your password. With 2012, you can start protecting that
a lot better, so that other processes can't get at those passwords that are in memory (and this gets better with 2016, and in Windows 10). It's backed by the TPM, if you have a TPM module, which by the time you hit 2012-era hardware you should. There are authentication policies, which, I forget what they are now; I looked it up this morning and promptly forgot again, but it's around controlling who can log in where. And then with 2016, we have Microsoft PAM, Privileged Access Management, which gives you just-in-time and just-enough access. So now we're talking about only granting access for periods of time, or only the amount that's required. So if
you do one thing, well, I can't say that, but: get your domain functional level at least up to 2012, so you can start using some of these features. Related to that, yeah.
Or you can go to the point where it's per task: you get the privilege to execute this task, and that's it. Then it's gone again.
I talked about LAN Manager. I got a domain admin password the other day because I cracked a LAN Manager hash out of the guy's password history, and it was a pattern that showed me what his current password was. So: the only way to truly get rid of LAN Manager at the domain level (this isn't even at the workstation level) is to turn off your password history, force a password change, and then turn it back on again. It's the only way to get rid of it. There may be some tools to reach in and pull bits out, but I haven't really looked. XP and 2003 support NTLMv2, so what do we need NTLMv1 for? It's typically those security appliances that authenticate against the domain; those are often the culprits. We made this
work back in 2000, we haven't updated our code since, and you can't run your security appliance unless you support NTLMv1. Do you have a question? Yeah, well, you can't use special characters then. What about all our password policies?
I know you can't read this, but Ned Pyle is the Microsoft product manager for SMB; he owns the whole SMB product line, so his Twitter handle is NerdPyle, of course. These are two tweets from him. Day 700 without SMB1 installed: nothing happened, just like the last 699 days, because anyone requiring SMB1 is not allowed on my network. (There are some special characters in there.) And: as an owner of SMB at Microsoft, I cannot emphasize enough how much I want everyone to stop using SMBv1. Those tweets, incidentally, that day 700, was in September of 2016. 700 days without SMBv1. So we needed to have disabled this a long, long time ago. For those that don't know, this is what
WannaCry and NotPetya were able to leverage, and EternalBlue and all those things, to compromise systems back in last spring. If you have a niche requirement for SMBv1, it's probably on a single server: leave it on that one, at least in your server infrastructure, and turn it off on everything else. Better would be to isolate it, join it to a different domain, find alternatives, have a jump box, all those kinds of things. And if your vendor says it's gonna break their product, then fix your procurement process.
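Checking and turning it off is quick; a minimal sketch using the SMB cmdlets available on 2012 R2 and later:

```powershell
# Check for and disable the SMBv1 server on a box
Get-SmbServerConfiguration | Select-Object EnableSMB1Protocol
Set-SmbServerConfiguration -EnableSMB1Protocol $false -Force
# On clients, the whole feature can be removed outright:
# Disable-WindowsOptionalFeature -Online -FeatureName SMB1Protocol
```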
Because that's one of the other things I've learned over the years: we do a lot of RFPs, especially in public service, and you do yourself a huge favor from a security point of view if your RFPs and other procurement agreements explicitly state security requirements, like: must support proper password hashing (bcrypt or, I forgot the other one, it's slipping my mind; I shouldn't have even tried), must support TLS 1.2, must support LDAPS, must support SMBv3, whatever those things are. Put them in your procurement contracts, because guess what? When it's in an RFP, a vendor who wants to win that RFP is going to try to make it happen. Over time, maybe, probably not a lot of time. So you have a lot of power with those
procurement contracts. I won't go into all of these in detail, but other things to disable: NetSession enumeration, proxy auto-detection, link-local multicast name resolution; all of those can be abused. The Windows browser protocol, NetBIOS, the scripting host, WDigest, untrusted fonts, macros, and OLE (I know, that one's nasty; we'll talk about that). There are links in the slides, and on the last couple of screens, to what these are and how they all work. The bottom line is that all of them can be subverted to get you to give me credentials or to give me privilege escalation, one or the other. Then there are things you can enable, and some of these are harder than others. Device Guard is like AppLocker on steroids: I can control
exactly which programs can execute.
And it helps when people walk away with your laptops anyway. As of fall 2017, this is just an example of some of the things Exploit Guard brings to the table. You can literally create a rule, a policy basically, that says executable content cannot be run from an email client or a webmail client, or that Office applications can't create child processes or executable content, or can't inject into other processes. So for all these attack patterns that the bad guys are using, you can just say: no, those aren't allowed. That's relatively new, but these are the kinds of things that have huge, broad power.
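As one hedged example, two of the attack surface reduction rules described here can be switched on with Defender's PowerShell cmdlets; the rule GUIDs below are the documented ones for those two behaviors, but verify them against Microsoft's current ASR documentation before deploying:

```powershell
# Enable two Exploit Guard ASR rules (Windows 10 1709+ with Defender)
# BE9BA2D9... = block executable content from email/webmail clients
# D4F940AB... = block Office apps from creating child processes
$rules = 'BE9BA2D9-53EA-4CDC-84E5-9B1EEEE46550', 'D4F940AB-401B-4EFC-AADC-AD5F3C50688A'
Set-MpPreference -AttackSurfaceReductionRules_Ids $rules `
    -AttackSurfaceReductionRules_Actions Enabled, Enabled
```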
Active Directory forests: real simple thing, I just had to put this up there. A domain is not a security boundary; a forest is a security boundary. Microsoft says: in this way, the forest is the security boundary for the information that is contained in that instance of Active Directory. If I am domain admin in any domain in your forest, I am domain admin for the whole forest, and there are a number of ways to do that. It's not a security boundary. Use separate forests if you want to have truly separate domains. Admin privileges for the domain: I did a pen test for a company with 300 users; they had 85 domain admins.
Their physical security guy was a domain admin, and his password was Gunner.
So, too many admins. I mean, this is where we get down to the service account problem again. Back to your procurement language: I will not buy your product unless you can tell me exactly what privileges that service account needs, and exactly how to configure that account without giving it domain admin. Regular accounts as admins: I've got a screenshot of BloodHound on the next slide. It's amazing what one little account over here, that's an admin on this server, where this guy's logged in, who's an admin on this server, where this guy's logged in, can do: it can get me all the way to domain admin. Service principal names: long story short, if you use Kerberos and you run a service like SQL or a web
server, then the account that runs that service registers it with Active Directory, in a form that stores something that can be, not reversed, but cracked. Sorry: it's a representation of the password of the service account, and that password is used to create the Kerberos ticket that the domain controller will give you. It's slow to crack: even on my cheap five-year-old system, I'm cracking about 20 billion Windows hashes per second, and when it's a Kerberos ticket, it's maybe a thousand guesses per second. But I can still crack it. Again, when you have an old, old service account with a crappy password, I usually get it from a password dictionary if I get it at all.
So monitor those, check those. I'll talk about that a bit in a couple slides.
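Enumerating your own exposure is one query; a minimal sketch, assuming the RSAT Active Directory module:

```powershell
# List user accounts with SPNs set (the Kerberoast targets) and their password age
Get-ADUser -Filter 'ServicePrincipalName -like "*"' `
    -Properties ServicePrincipalName, PasswordLastSet |
    Select-Object Name, PasswordLastSet,
        @{ n = 'SPNs'; e = { $_.ServicePrincipalName -join '; ' } }
```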
The best accounts for me as an attacker are, in some cases, not your domain admins. Well, I mean, those are great, but do you have administrative control over the virtual environment, over VMware? How many virtual domain controllers do you have? It's just as easy for me to copy your whole DC as anything else. Or storage, same reason, right? You back up to an open NFS share: oh, there are all your VMs, there's everything. So those administrative accounts are equally, equally valuable. Delegated permissions: you can delegate administration over an OU to, say, helpdesk. Now helpdesk has the ability to add to the group, reset passwords, whatever. Sometimes those permissions get a little bit lost in the shuffle, and people end up with way too
many permissions. Directory Services Restore Mode: that password can be used to log in, and you probably set it once, when you created your domain. It's effectively a domain admin account; it's the account you use to do an offline recovery of your domain. Change it. And then users with just too many privileges: that's a perennial problem, and I don't have an easy solution to it, but let's start by running something like BloodHound. This is just a screenshot I stole from the internet; they're usually a lot more detailed. The short version is: green is a user, who is a member of yellow, which is a group. In this case, this user is a member of this group, which has the GenericAll permission; in other words, they've got delegated
access to this user, so I could change this user's password, who's a member of this group, which has GenericAll over here, who's a member of this group, and so on. This slide doesn't have any servers on it, but there's also a red node that goes in there, which is a server. So if I did it that way: this user is a member of this group, which has GenericAll on this user, who is, sorry, who's an administrator on this computer, where this user's logged in, and I can steal his password with Mimikatz or what have you, a keylogger, and on down the line. Usually I run this and I see hundreds of accounts. Even companies that say, well, we have separate accounts
for administrative purposes: I'll find hundreds of accounts in these paths that are not their special, you know, dollar-sign accounts or whatever their administrator accounts are. So run this occasionally. Honestly, in pen testing and remediation, this is the biggest eye-opener for most organizations: just run BloodHound. They're like: you mean that person could get to domain admin? They see accounts up there and they go, oh my God. And usually I get the phone call afterwards, like: oh, I didn't realize that all of my geologists are in a group with that kind of access.
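Collection is the easy half; a sketch of gathering the data that graph is drawn from, using BloodHound's SharpHound PowerShell collector (lab use, with permission; the path is illustrative):

```powershell
# Gather the session/group/ACL data that BloodHound graphs
. .\SharpHound.ps1
Invoke-BloodHound -CollectionMethod All
```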
More auditing and validation. There's a tool called Kerberoast that basically pulls those service principal names out of Active Directory and requests tickets for them. It has a very distinct signature, and there's a write-up on adsecurity.org on how to test for that, or check for it in your logs.
And I'm gonna talk about this in multiple contexts: I don't like the idea of using your corporate Active Directory for authentication to network switches, firewalls, routers, VMware, storage, all of that stuff. Spin up two domain controllers over here and call it your service domain, or your operations domain, and use that to authenticate to your network infrastructure. Because I'm gonna compromise your primary Active Directory; it's just gonna happen eventually. So use a separate domain for the administrative interface to all these things, because otherwise I'm just gonna log into VMware and steal everything. Same for out-of-band management, your DRACs, your iLOs, all that stuff. Protected accounts mitigate some of the Kerberos delegation issues; there's a presentation out there specifically about
Kerberoasting and the Kerberos delegation pieces. Talking about tiers or separate domains: in theory, we want to have at least three tiers. These are Microsoft slides. Tier zero is your domain admins, so they can control down, and you cannot control up. Any CISSPs remember Bell-LaPadula? No? One hand.
Yeah, I don't remember anything more than the name either. It had something to do with going up and down, but yeah. Anyways.
So yeah, you want to be able to control down tiers, but not up. Tier zero would be your domain admins, for example; tier one would be your server admins; tier two would be your workstation admins. Honestly, if you can get to even the second of the tiers, you're doing well; let's start with your domain admins. Same thing going the other way: I can authenticate up, to a more trusted domain, but I can't authenticate down, to a less trusted domain. So your domain admins should all be in a separate, trusted domain. Everything can authenticate against it, so you can log into all your corporate stuff over here to do your domain admin activities
in a very granular fashion, and nobody can log in the other way. So even when Sally has a password of Winter2018 that lets me get remote access to your network, and then, I don't know, you're still running Windows 7, so Mimikatz pulls a domain admin credential from memory: it's not there, because it's part of this other domain. I can't get at it that way. So Microsoft has this concept of a privileged access workstation, which is a hardened and locked-down workstation designed to provide high security assurances for sensitive accounts and tasks. In other words, this is the thing you perform all your admin activities from. It is dedicated; you have no software installed other than what you need to do your sysadmin work. If
you have a browser, it's a locked down browser, no plugins, no JavaScript, ActiveX, whatever.
And in an extreme case, it's a physical box that is in a room, with an endpoint firewall on it that allows no inbound connections. It gets its updates from WSUS or Windows Update by pulling. And it's even powered off, right? You have to walk into that room, or you have to have some process to get there. That's the extreme case. In a lot of cases, what we see is people have a host system, where their desktop is their privileged access workstation, and then they virtualize their day-to-day workstation. So your day-to-day activities, your email, your web surfing, happen in a virtual workstation, and you leave that and work in the outer workstation to perform your administrative tasks.
But this is where you say: no inbound connections, no one can log in, don't allow domain admins to log in anywhere except these workstations, those kinds of things. The next step is the enhanced security administrative environment, which is where we get to the different domains and different forests. Dedicated administrative forests allow organizations to host admin accounts, workstations, and groups, and these privileged access workstations go into the other domain, which has stronger security controls than the production environment. So it's almost like: you know what, we give up on our production AD, let's just start a new one over here and do it right, and put all our admin stuff in there. You trust it from your regular domain, but not the other way around. Or you could have
more than one. You have your admin forest, which is your true admin forest for the highest-level stuff like your domain admins, and you could have a separate forest for your individual privileged accounts. So these guys can only log in at tier zero, these guys can only log in at tiers one and two, and then you have your users and your regular user domain down here. Like I said, it's a journey; this stuff doesn't happen overnight. But most organizations are going through some form of Windows 2016 migration right now, so start thinking about these things and position yourself to get to these points in the future, or when Windows whatever-it's-gonna-be, 2018, 2019, comes out. I
guess it wouldn't be 2018.
There's also privileged access management for directory services, for Active Directory, with Windows 2016. Privileged access management accomplishes two goals. It allows us to re-establish control over a compromised Active Directory by having a separate bastion environment that is known to be unaffected by the malicious attack; that's our goal with having that isolation. And you put your privileged accounts in there to reduce the risk of those credentials being stolen, which leverages just-in-time and just-enough administration. I haven't dug deeply enough into this, but it's built into Windows 2016, so you can have those containers within your Windows 2016 domain.
Number six is endpoints. We do have to harden our endpoints; that's where most of the bad stuff happens. We've got PowerShell, we've got phishing, we've got drive-by downloads, unwanted software as some people call it, attackers who can pivot from machine to machine, we've got macros, we've got all kinds of executable content. It's like playing with TNT. So what can we do about all of that? Well, first of all, I guess PowerShell 6 is out now. Get to at least version 5 of PowerShell, and remove PowerShell 2.0, because if it's there, I'm just gonna run everything in 2.0, which has zero logging. Or not zero, but not very good logging.
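Both of those are nearly one-liners; a minimal sketch (the registry path is the one the script block logging GPO sets):

```powershell
# Remove the PowerShell 2.0 engine and turn on script block logging
Disable-WindowsOptionalFeature -Online -FeatureName MicrosoftWindowsPowerShellV2Root
$key = 'HKLM:\SOFTWARE\Policies\Microsoft\Windows\PowerShell\ScriptBlockLogging'
New-Item -Path $key -Force | Out-Null
Set-ItemProperty -Path $key -Name EnableScriptBlockLogging -Value 1
```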
There's also constrained language mode, so you can only make certain Windows API calls from PowerShell; that's part of Device Guard. Bug your vendors about the Antimalware Scan Interface, AMSI. This is a Microsoft API that will feed every bit of PowerShell that's being run into the antivirus product. Very, very few of them support it; they're slowly adopting it, but not really. And log everything with PowerShell. One technique I heard used: log every PowerShell module that's called, basically every function name. Log them all, take a look at them, then whitelist those as the things that are typically run from PowerShell. Then, anytime someone calls a PowerShell function that's not in your whitelist, raise some form of alert and validate it.
Because most software does not use the PowerShell APIs to inject DLLs into memory, for example, except for your antivirus software, which you're gonna have to whitelist.
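A rough sketch of that whitelist idea, mining script block events (ID 4104) for cmdlet names you haven't baselined; ps-baseline.txt is a hypothetical file of names you consider normal:

```powershell
# Surface cmdlet names in script block logs that are not in your baseline
$baseline = Get-Content .\ps-baseline.txt
Get-WinEvent -FilterHashtable @{
        LogName = 'Microsoft-Windows-PowerShell/Operational'; Id = 4104
    } -MaxEvents 5000 |
    ForEach-Object { [regex]::Matches($_.Message, '\b[A-Z][a-z]+-[A-Z]\w+\b') } |
    ForEach-Object { $_.Value } |
    Sort-Object -Unique |
    Where-Object { $_ -notin $baseline }
```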
And your antivirus software had better be good, because if the attackers take over your antivirus tool, it's just an attacker sitting on your head. Sysmon: awesome tool, and there are lots of resources out there now for Sysmon configurations. It will log basically every command line, every socket, every process and parent process, all kinds of information about everything that happens, which you can then mine, at the very least, for anomalies. Some very basic detections can have a lot of power. Is that on this slide? No, it's on another one; sorry, it's coming. Network segregation.
So what endpoint needs to talk to another endpoint? What Windows 10 system really needs to talk to another Windows 10 system? Maybe you'll come up with some chat client or some VoIP or video client that needs to do that; well then, allow that port, whatever it is. But we've got GPOs, we've got the Windows firewall: just don't allow inbound connectivity on your endpoints, so that when I land on one of them, I can't just go scanning everyone to build my BloodHound map, or hop from one to the other looking for privileged credentials or what have you. There's no reason for endpoints to talk to each other anymore. And yes, there are exceptions: put an exception in for your administrators' subnet, or for your SCCM subnet, or whatever that is.
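In firewall terms, that's about two rules; a minimal sketch (the management subnet is illustrative, and in practice you'd push this by GPO rather than per host):

```powershell
# Default-deny inbound on endpoints, with a management-subnet exception
Set-NetFirewallProfile -Profile Domain,Private,Public -DefaultInboundAction Block
New-NetFirewallRule -DisplayName 'Allow inbound from mgmt subnet' `
    -Direction Inbound -RemoteAddress '10.10.9.0/24' -Action Allow
```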
But you don't need endpoint-to-endpoint communication by default, and just doing that puts such a dent in your attack surface; it's massive. Private VLANs will help within a subnet as well, if you want to isolate endpoints, and you can use ACLs too. I like the endpoint firewall personally, because you can control it by group policy, push it out to everything, and then there's no inbound communication allowed except from my management subnet. Turn off macros. (Because that always goes over well.) But who really needs macros? And this is where some of the Exploit Guard protections come in as well. Yeah, I get
it. Finance runs on spreadsheets. Okay, production accounting runs on spreadsheets. Those are two groups. But really, put it this way: should your HR department be allowed to run macros? No. Who opens more documents of unknown or dangerous formats than HR? But they have no reason to run macros, none. Or provide them a place where they can do that in a controlled manner, but they don't need to run macros day to day. Turn them off, and then turn them on selectively for the people that need them; you can target that. With Office 2016, you can disable macros for downloaded content, so you can allow macros for stuff that came off a file share, but not for stuff that came from the internet.
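That last one is a group policy setting; as a sketch, this is the per-user registry value the Word 2016 version of that policy sets (you'd normally deploy the GPO itself rather than write the value directly):

```powershell
# The registry value behind 'block macros from the internet' for Word 2016
$key = 'HKCU:\Software\Policies\Microsoft\Office\16.0\Word\Security'
New-Item -Path $key -Force | Out-Null
Set-ItemProperty -Path $key -Name 'BlockContentExecutionFromInternet' -Value 1
```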
File associations: this is a short list, and there's a link in the notes to a long list of all the file extensions that can be used to do malicious things. It's a very long list. But again, for your average user, if they double-click on one of those in an email (hopefully they don't land in their email, but if they do), just open it in Notepad. Your average user doesn't need any of this stuff. AppLocker, application whitelisting: start slow, start by logging. Get a baseline. Then maybe I can block execution from C:\Users; nothing should execute from there. Thank you, Chrome, we'll exclude that one from that policy.
The Chrome updater runs from your user profile folder. Block execution from the temp folders, or you can take the reverse approach: permit C:\Windows, Program Files, and Program Files (x86), and blacklist the rest. You will have to fix a few things along the way, but then do that. And if you have a SOC, and they're a high-functioning SOC, or even a medium-functioning SOC, collect all these logs. Because guess what? Just flag everything that gets blocked.
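A sketch of that start-slow approach: generate rules from what's already installed and test a file against them before you ever enforce (the paths are illustrative):

```powershell
# Build AppLocker rules from Program Files and test before enforcing
$info = Get-AppLockerFileInformation -Directory 'C:\Program Files' -Recurse -FileType Exe
$policy = New-AppLockerPolicy -FileInformation $info -RuleType Publisher,Hash -User Everyone
Test-AppLockerPolicy -PolicyObject $policy -Path 'C:\Users\wcoyote\Downloads\setup.exe'
# Set-AppLockerPolicy -PolicyObject $policy   # apply later; start with audit-only enforcement
```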