Finding Haystacks in Your Needles: Threat Hunting Problems in Real World Data - Miller

Name: Finding Haystacks in Your Needles: Threat Hunting Problems in Real World Data - Miller
Uploaded: 2017-05-16
Duration: 29 min 35 s
Description: Resources such as SANS's "Know Normal, Find Evil" and MITRE's ATT&CK framework are a great starting point when looking for malicious activity on a host ... but what happens when you actually start diving into the data? Is finding malware really as easy as just looking for network connections from No

BSides Boston | 29:35 | 798 views | Published 2017-05

Mentioned in this talk

Tools used

Carbon Black, SentinelOne

Frameworks

MITRE ATT&CK Framework

Vendors

Palo Alto Networks

About this talk

Resources such as SANS's "Know Normal, Find Evil" and MITRE's ATT&CK framework are a great starting point when looking for malicious activity on a host ... but what happens when you actually start diving into the data? Is finding malware really as easy as just looking for network connections from Notepad? (Spoilers: It isn't.) This talk goes through a number of real scenarios where legitimate applications behave just like malware, and how to improve behavioral detection.

Transcript [en]

feed, so sorry for that. We're going to actually kick off here for Sarah. So, a little bit of Sarah's background: Sarah is a current employee of Carbon Black, another fine Boston company. She's a threat analyst, or threat intel analyst, excuse me. Previously she worked as a security operations team manager, or just an analyst. She holds an MSIA from Northeastern University. With that, and sorry for the delay on my part: Sarah Miller.

Okay, so, last talk of the day. Thank you for coming. I know we're all excited to get on to the closing ceremonies and the networking event, but this talk is called Haystacks in Your Needles: Threat Hunting in Real World Data. You probably want to know who's giving the talk, so that's my Twitter picture, it's a disgruntled hedgehog, and I'm a threat intel analyst for Carbon Black. I was an ESL teacher originally, and then I thought computer security sounded really interesting, so I went to Northeastern, I got a master's, and then I started in the Carbon Black SOC three years ago. I was on the blue team for two years and then got moved over to the

product side. So my current job is analyzing malware, looking at attack campaigns, and trying to find behavioral signatures that our customers can use to detect attacks and malware in their environments. So this is a talk about false positives. I just said my job is to look for behaviors that you can use to identify attackers or malware, and it turns out that this is a really hard problem. It's a really hard problem for two reasons. The first reason, for which I have this picture (there's actually an octopus in that picture pretending to be a piece of coral), is that malware is trying really, really hard to look legitimate, and

actors are trying really, really hard to look legitimate. The closer that they can get to looking legitimate, the harder it is for us to detect, and that's why they're trying really, really hard to do that. The second problem is that you have a bunch of really smart threat researchers who are all looking at malware and looking at attack data in very isolated, clean environments, and they're not sitting in a SIEM all day dealing with all of the weird false positives that come out when you take these beautiful signatures that detect malware in a lab and put them in a real environment. It turns out that real environments have all kinds of weird

things that your users are doing, your IT admins are doing, the people who wrote your software are doing, and your security software is doing, and you have no idea what noise you're going to get until you actually start testing stuff out in the real world. So that's what this talk is about: why it's so difficult to come up with signatures that actually work. Because this talk is basically a bunch of unconnected examples, I needed a little bit of a framework, and everybody loves the kill chain, so I went there. If you don't know what the kill chain is, you probably haven't been to a lot of talks today, but for my mom, who is

here and knows nothing about technology

the kill chain is basically a series of steps that attackers use to get into a targeted network. I show a screenshot of the MITRE kill chain here because I work with endpoint data, and they focus on a lot of the different tactics and techniques that happen once you're on an endpoint trying to achieve your objectives and stay hidden. Originally when I gave this talk it was 50 minutes long. I've cut it down to 25, which means I didn't get to do an example for every single step, but I do have a few examples, and they are organized in the order of the kill chain, so we're going to be

going through those. Okay, so the first thing that we're going to talk about is a reconnaissance false positive. In reconnaissance you're outside the network, you want to learn about the target, and you want to figure out how you can get in. A really common technique in reconnaissance is scanning. There are two types of scans that you're looking for. One is scanning every port on a particular server, a particular IP: they want to profile it, they want to see what services are open, and they want to see if any of those services are vulnerable, but it's all ports on one IP. The other type of scan that you see a lot, and we

generally treat it as a nuisance scan, is when you have an IP that's going across all of your servers and it's checking port 22 on everything, or it's checking port 3389. It's looking for just an open service that it thinks is easy prey, a nice easy target, but it's one port across all servers. So this is a really distinctive behavior. Our SIEMs know about it, our firewalls know about it, your tier one analysts know about it, and they have an expectation of what this is. So we were in the SOC, in our SIEM, and we had this

alert pop up, and this was kind of alarming. It was an internal IP, so someone's workstation or someone's server, and it was reaching out to hundreds and hundreds of IPs on the public Internet and trying to hit port 137. For those of you who don't know, port 137 is something that you're only supposed to see in your local internal network. It's used by a protocol called NetBIOS, which other computers use to discover hostnames locally, but it should never be going out to the Internet, and the fact that it's going out to so many hosts looks exactly like one of those nuisance scans we were talking about, where you're

looking for one service across a wide range of hosts. So if you're working in a SOC and you see this, your first thought is: someone got into my network, they compromised this one computer, and now they're using that computer to go bother other people on the Internet. That was our first thought, and we were like, what is this? We did a bunch of investigation and figured out that what happened was that a developer was running Process Monitor and web browsing at the same time, and this actually causes this behavior. What's happening is Process Monitor is looking at everything that's running on your computer, and it has a setting

that allows you to resolve any network addresses that it makes connections to. It does this by using DNS, but it turns out that if the DNS lookup fails, it then tries to NetBIOS the host. So if you're web browsing, you're connecting to hundreds of IPs out on the Internet, it tries to do DNS lookups for all of them, it fails, and then it tries to NetBIOS them. The first time we saw this we spent a good bit of time trying to figure out exactly what the hell was going on, and then after that we'd see it and we'd message the developer and say, are you

running Procmon and web browsing? They'd say yes, and we'd say stop it. So it turned out to be not a big deal, but it looked exactly like a scan to the SIEM, and the SIEM had no way of knowing that this was a behavior of Procmon, and how would it even know that Procmon was on that endpoint? So, kind of an interesting one: no one could have predicted that this was going to go wrong. Okay, so reconnaissance: you've learned about your network. Next, you want to get inside the network, so you need to deliver an exploit in some way and get a foothold. A really common way of doing this is phishing emails, and so we spend

a lot of time doing user education about phishing emails, telling them: if it's from someone you don't know, be a little cautious; if it's got an attachment, be a little cautious; if it's asking for your password, throw up the big red flags and contact someone in IT, since they should never be asking for your password in email; be careful of links. We have all these different examples of phishing emails where they're trying to get passwords out of people. So we had a user submit this to us and say, hey, I think this is a phishing email, what should I do? It's a secure message from this company. The user had never heard of this

company before and has never interacted with them, and it says: you've received a secure message, there's an attachment, an HTML attachment called securedoc; to open the attachment and read your secure message, type in your password. If you look at the attachment and just look at what it presents to you, the HTML in a browser, you see this very nice little envelope-looking thing, and it has from and to, and again, you've got a secure message from this company, and then a little box for your password. If you go a step further and look at what is actually making up the HTML, you see a bunch of this JavaScript, and this

is really obfuscated JavaScript. It's really hard to figure out what it's doing. If you've ever looked at JavaScript malware, it's all super obfuscated all the time, and even normal JavaScript can be kind of tough to read, but generally when you see stuff like this you start thinking, oh, it doesn't want me to know what it's doing, so it has all of this gibberish in there and all of these weird variable names. But it turns out this is not a phishing email. This is a real product that Cisco puts out called the Registered Envelope Service. If you don't have S/MIME or PGP configured, you don't have email security working, this was sort of

an older solution for that. Now we have all these secure file transfer things that we might use instead, but at the time this was a good way to send secure messages. Our user wasn't used to this, so she reported it to us. We investigated and figured out that it wasn't a phish because we saw where the HTML was connecting out to, and it was all legitimate Cisco domains. But it was really interesting, because you could imagine, as a red teamer, if you found out that an organization was using this service, and then you sent them something that looked exactly like this and just changed the IPs that it was beaconing

out to, you could probably get their passwords. So I thought that was kind of interesting, and if anyone here is a red teamer who ends up using that on a client, I'd be interested to know how successful it is. Okay, so we've done our recon, we've done our delivery, let's talk about exploits. It was really interesting being in the keynote this morning because he had a couple of examples that I talk about also, and the first one of those is encoded command PowerShell. Palo Alto did a really great article at the beginning of March where they took about four thousand samples of PowerShell malware and looked at the different arguments that it

uses to run. Since they were specifically looking at encoded command, 100 percent of their samples had that command-line argument. They had different variations of it, because you don't always have to write out EncodedCommand in full, but all of them used that argument in one form or another. They also had some other arguments that you might see showing up in these different PowerShell malware samples, and most of these are either trying to hide themselves from the user or trying to get around some type of security policy or profile that you might have configured. So WindowStyle Hidden, NonInteractive, NoExit, and NoLogo are all trying to hide themselves, to make themselves less obvious, and NoExit is

actually, NoExit means that PowerShell continues running after you execute the commands, so a little bit of persistence there. And then ExecutionPolicy Bypass or Unrestricted and NoProfile are both used to get around security settings: if you had security settings configured for PowerShell, these will get rid of them. So they looked at malicious PowerShell and they said, this is the frequency with which these particular arguments show up in our malicious PowerShell. I looked at that and thought, that's really interesting. I'm very cynical and very dubious about these things because I spend a lot of time QA testing signatures, so I said, I wonder how many legitimate environments see this, and what kind of legitimate

encoded PowerShell is running out in the wild. So I took a small sample of our customers, between 50 and 100, and I looked in their environments for these different things, and I found that four percent of them had legitimate encoded command PowerShell. These max numbers are, in one customer environment, the highest number of results I got when I looked for this particular argument. So there was one customer environment I looked at that had 1,700 different encoded command PowerShell processes running. You can imagine, if you're working in a SOC, you put in an alert for encoded command PowerShell and then you get 1,800 hits back, you get 1,800 alerts in your SIEM,

you're going to go a little bit nuts. So clearly in a lot of environments this works really well, and in some environments it's going to drive you nuts with false positives. Same with some of these other arguments: Hidden and ExecutionPolicy Bypass were the two most common in the samples that I was looking at, and a few others were much less common. Again, this is a pretty small sample, so it may not be representative of organizations as a whole, and I've also been discovering that there's a lot of variation between organizations, so what's very normal in one organization is not normal at all in a different one, which is why

it's really all about what's going on in your environment, your baseline, your sense of normal. So you say, okay, well, what is running legitimately and wants to encode things? We found a lot of VDI tools that were doing it, and so here's a couple of examples [unclear], and we have their encoded command examples, and then we have what they decode as. So nothing super malicious going on here, but for whatever reason they're using the EncodedCommand argument. The Palo Alto blog post does go on to say there are certain combinations of arguments

that are more frequent in malware and/or less frequent in malware, but as Dave was saying this morning, it's really easy for an attacker to change those once they know you're looking for a particular pattern. So it can help you detect things, but it can also give you a false sense of confidence. Okay, so we're on the host, we've exploited it, and we don't want to get kicked off too quickly, so one of the things that we're going to do while we're on a host is probably some type of evasion. There's a whole bunch of different things that malware might do for evasion, but one of the really interesting ones that I like is checking for AV products. This

can be both checking on the host to see if there's AV running, and also a lot of times it shows up as checking to see if they're in a sandbox or in a researcher's VM, and if they are, they're going to shut down and not do anything interesting, so you can't analyze them. So a few different examples here. Furtim was a piece of malware that SentinelOne wrote about last summer. It's really cool, it's really complicated. They looked at the DLLs that were getting hooked in, and they had a list of which AVs use which hooking DLLs. They also looked for kernel drivers, and they knew which AVs use which kernel drivers. And I think

Furtim is actually the one that would see what AV you had running and then change its behavior based on which AV you had running, so it wasn't even as simple as, I'm going to shut down now; it was, I have specific things that I'm going to do or not do based on the presence of AV in your environment. Second one: Dridex macros, which were a little bit more about trying to detect sandboxes and researchers. It would beacon out, and then it would check the IP or the domain that the traffic was coming from. It had a list of different IPs or domains that belong to different vendors, and so it would shut down if it was beaconing out

from a vendor. And the final one was actually just a sample that a colleague of mine was looking at when I was putting together this talk, a sample called EyePyramid, which checks your Program Files directory to see what you have installed, checks your Run key, and checks [unclear] entries, and here's a screenshot of the various antivirus programs that it was actually looking for. So you look at this and say, okay, I understand why malware would do this, or why a malicious actor would do this, but I can't really think of any reasons that someone would legitimately check for 20 different AV companies. That seems weird. So I think to

myself, okay, I'm pretty confident that if I see this, this is malware, this is a bad actor. This should be high confidence in theory. So we got this submitted from a customer on December 8th, which I'm mentioning specifically because if you look at the compilation timestamp down here, it's November 30th, so this was pretty new. It doesn't have a lot of metadata describing what it is, it has kind of an innocuous name, it doesn't have a description, it is not signed by the publisher, and it gets 17 out of 61 on VirusTotal. When you look at it, it's a very small DLL, so it's meant to be loaded by

and run by some other process, and all it does is it has a list of AVs, it checks to see if they're there, it checks what version they are, and then it writes them out to an XML file. That's all it does. So we look at this and we go, okay, we know that evasion is something that malware does, maybe this is a piece of malware. Because it's a DLL, we know we don't have the whole story, so we go back and forth with the customer and get more data about where it is. I think it was in a temp directory, which made it even worse. And we finally are able to

confirm that this is actually part of their VPN software. With a lot of companies, on your VPN, they won't let you connect until they've checked that your software is fully patched, that you have the AV your IT department put on and it's completely updated, and that your GPO is updated, and that looks like this, and it's completely legitimate. So I scanned this again on April 1st and it still had that pretty high score on VirusTotal. I'm not saying that to talk about how silly AV vendors are, because I think this is actually a really hard choice. If all you have is a static this is bad or this is

not bad, what are you supposed to do with this particular DLL? It's doing something that we know malware does, but it's not malware. So I don't think the 17 AV vendors who marked this as malicious are necessarily wrong, and I don't think the other forty-something who marked it as non-malicious necessarily have the right answer. This could just as easily have been malicious, and based on this binary isolated by itself, without knowing anything about the process that ran it, there's just no way to know for sure. Okay, so that's one type of evasion. The other type of evasion, which again is overlapping with the keynote this morning, is that if you have a legitimate

process, you're using PowerShell, you're using cmd, something like that, people are monitoring for it, and your security software is often monitoring for it, looking for malicious things based on the process name. So if you change the process name, they're no longer monitoring, and therefore you can run whatever you want undetected. The top example here is a rundll32 process, but it's been renamed to svchost.exe. Both are legitimate Windows binaries, but because rundll32 can be abused, and we know it can be abused, they've renamed it. This was a ransomware sample called CryptXXX that was briefly popular over the summer before something else took

precedence over it. The bottom example again is cmd, the Windows command processor, and it's just been renamed to johnson.exe. I'm not sure who Johnson was, but he was used to evade someone's security software at some point. So okay, this is kind of cool, this is kind of interesting, and you start thinking, how do I catch this? There's two options that come to mind. One is you get a list of all of the hashes for PowerShell and you compare them against the name of the running process, and if it has the PowerShell hash but it's not named PowerShell, then you do something about it. This is kind of

it's hard for it to be human readable, and there's a lot of opportunities for error: you might miss a version of PowerShell, and you have to keep it updated all the time, so maybe not great. The other option, which is kind of interesting and which we were playing around with, is to compare the binary metadata, the internal description or the internal name, against the path or the process name. There's a whole bunch of different examples of that. It works really well to detect sticky keys, it works really well to detect renamed cmd and renamed PowerShell, but it does have false positives. The false positives happen when your company goes international, because at that point your French computers or

your Chinese computers are going to have their binary metadata in the local language, so that can be kind of an interesting problem. I like that example because if you have a threat research team that's all American, or American and UK and Canadian, and they're working and thinking about things from an English-speaking perspective, there might be things that are normal in other environments that they don't know about. So it's a good reminder that even if we're based in one place, a lot of times we're trying to do security all over the world. Okay, I think I might actually be on time, this is exciting. Okay, so

those are a few examples, and that leaves us with the question: there's the problem, what's the solution? If I can't trust what my vendors are telling me is malicious or not malicious, and I get stuff from the FireEye blog post, I run it in my environment, and it has hundreds and hundreds of false positives, what do I do about this? First step is not to panic. Sometimes when you start threat hunting, or you start investigating a false positive, you realize there are thousands and thousands of processes that are coming up in your search, coming up in your hunt. You are going to be able to figure out an answer, and you are going to be able

to figure out what's normal in your environment. You just need to put a little bit of time and a little bit of thought into it. So the first thing is really understanding what it's trying to alert on, what the bit of advice that you took from the FireEye blog post is actually trying to find, right? So why does an attacker want to use the EncodedCommand switch? The more you understand about what the alert is actually based on, the easier it is to tell if it's a false positive or not. The other part is that the more you understand about the intent of

it, the easier it is to ask questions: well, okay, if I detect encoded command and I detect PowerShell, is there anything the attacker can do to change those and still get to their objective? So it allows you to start thinking really creatively and not get too tied down to a specific set of indicators. The other part is looking for patterns unique to your environment. I don't know what you've got running in your environment, and I don't know if you've got a VDI tool that's running encoded PowerShell, but you do, and it's going to be really obvious as soon as you start looking. So now you know, okay, encoded PowerShell in my environment

should look exactly like this, and if it doesn't look exactly like this, then I want to see an alert fire. Another really good one that I think more people should be profiling is PowerShell that makes network connections outside your environment. There are some environments where it will: a lot of IT admins use PowerShell to talk back to Microsoft, get updates, get patches, stuff like that, but it's not going to be all organizations, and you can still profile that. You can whitelist the Microsoft IPs, and if it connects out to anything that's not Microsoft, you want to know about that. So you're not going to be able to

get rid of everything. You're not going to be able to get rid of all the false positives in your SIEM, and you're not going to be able to make all threat hunts work perfectly. So what do you do? You tune out the stuff that you can. If your tools aren't letting you tune things, and they aren't being really granular, push back on the vendor and say, I really, really want this, please let me keep this. And the stuff that you can't tune out, document it. You're going to get new SOC analysts, you're going to get new blue teamers, you're going to get IR consultants when you have a breach, who don't know your environment, and if they

say, hey, why is this printer making this weird network connection, you can say, oh yeah, we investigated that two years ago, here's what it's doing, here's the steps we used to find it. Having this really, really rich wiki is going to make everything a lot faster. So, takeaways, and we're just about done. There's a lot of really smart people working for the vendors. I work with these brilliant people every day, but that's only half the battle, and their knowledge and their experience only solves half of the problem. The other half of the problem is what's normal in your environment, and honestly the blue teamers are the experts there. So don't

feel bad about working in a SOC, or feel like you're less important than a malware researcher because you work in a SOC, because you are actually an expert in your own environment, and your knowledge is just as valuable or more valuable than this abstract knowledge that's out here. Your security tools are making assumptions. They have to make assumptions, because a lot of times they just don't have enough information to know for sure, so they have to say this is malware or they have to say this is not malware, and you want to understand those assumptions as far as you can, and you want to be willing to question them and challenge them. I'm a Harry Potter

person, so I always think of the quote in the second Harry Potter book where they say, don't trust anything if you can't see where it keeps its brain. If my SIEM says, hey, this is [unclear], I want to understand why it thinks it's [unclear]. I'm not going to trust it right away. False positives are not always a waste of your time. This is very controversial. But if you're getting thousands and thousands of alerts that are all the same, yes, that is a waste of your time. If you've gotten a false positive and it's the first time you've ever seen it in your environment, that's interesting, that's worth taking some time to try to

understand: what is this false positive about, why did my tool think that this was malicious, what is actually going on here, is it new, or did the tool get an update? Because all of those things are either going to teach you more about security and more about attackers, or they're going to teach you more about your environment. And finally, configure your tools. Don't take the out-of-the-box settings. Make them specific to your environment, make them about the things that you care about. Automate what you can; this can be hard. And then document everything that is weird in your environment, that is not intuitive, and anything that you've spent 15 minutes investigating

to figure out what the hell it is. Don't just walk away from that; spend another three minutes writing down what you found. So that's the end of my slide show, the end of my presentation. I'm happy to take questions now, or if people are feeling shy they can come up to me later, but thank you. I think I have a question. [Applause]
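
[Editor's example] The encoded command discussion in the talk can be made concrete with a minimal sketch. It assumes, as PowerShell's own documentation describes, that the value passed to EncodedCommand is Base64 over UTF-16LE text, and that the switch may be abbreviated to any prefix of its name (plus the short form "ec"). The function names, the prefix set, and the sample command line are illustrative assumptions, not something shown in the talk. Decoding a hit like this is one way to check whether it matches the VDI-style commands that are normal in a given environment.

```python
import base64

# Assumption: any prefix of "encodedcommand", plus the alias "ec", selects the switch.
_FULL = "encodedcommand"
ENCODED_SWITCHES = {_FULL[:i] for i in range(1, len(_FULL) + 1)} | {"ec"}


def find_encoded_payload(argv):
    """Return the argument that follows an EncodedCommand-style switch, or None."""
    for i, arg in enumerate(argv[:-1]):
        if arg.startswith(("-", "/")) and arg[1:].lower() in ENCODED_SWITCHES:
            return argv[i + 1]
    return None


def decode_payload(payload):
    """Decode a Base64 payload as UTF-16LE text."""
    return base64.b64decode(payload).decode("utf-16-le")


# Hypothetical command line, built here so the example is self-contained.
sample = base64.b64encode("Get-Date".encode("utf-16-le")).decode("ascii")
argv = ["powershell.exe", "-NoProfile", "-enc", sample]
print(decode_payload(find_encoded_payload(argv)))
```

Matching on the switch alone is exactly the signal the talk warns about: it fires on legitimate tools too, so the decoded text, not the switch, is what an analyst would compare against the environment's baseline.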

thank you [Applause]
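
[Editor's example] The second detection option from the talk, comparing a binary's internal metadata against the name it is running under, might be sketched as follows. This assumes endpoint data is available as simple records; the field names `path` and `original_filename`, the watchlist, and the sample events are all hypothetical and do not reflect any vendor's schema.

```python
import ntpath

# Hypothetical watchlist of binaries that attackers are known to rename.
WATCHED = {"powershell.exe", "cmd.exe", "rundll32.exe"}


def renamed_binaries(events):
    """Return events whose embedded original filename is on the watchlist
    but differs from the file name the process is actually running as."""
    hits = []
    for ev in events:
        original = ev["original_filename"].casefold()
        running_as = ntpath.basename(ev["path"]).casefold()
        if original in WATCHED and running_as != original:
            hits.append(ev)
    return hits


# Illustrative events modeled on the two examples in the talk, plus one benign case.
events = [
    {"path": r"C:\Users\a\AppData\Local\Temp\svchost.exe", "original_filename": "RUNDLL32.EXE"},
    {"path": r"C:\Windows\System32\cmd.exe", "original_filename": "Cmd.Exe"},
    {"path": r"C:\Temp\johnson.exe", "original_filename": "Cmd.Exe"},
]
for hit in renamed_binaries(events):
    print(hit["path"], "is really", hit["original_filename"])
```

As the talk points out, metadata can be localized on non-English systems, so a comparison like this would still need tuning against what is normal in each environment.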
