Threat Hunting: The Difference Between Safe And Sorry

BSides Bristol · 2019 · 24:33 · 59 views · Published 2019-07

Speakers

Dan Pitman

Tags

Category: Technical

Style: Talk

Transcript [en]

Yeah, my name is Dan Pitman. I work for Alert Logic as an architect doing various things, cloudy stuff and security stuff obviously. So I'm going to have a little talk about threat hunting and some of the practical stuff that we do within our research team and our SOC teams. Obviously, starting off with a sort of definition, or at least talking about what the difference is: there's only one difference as far as we're concerned between detection and hunting, and that's known versus unknown threats. With detection, you know what you're looking for, you can write a good signature, it's nicely tuned and so on and so forth. Whereas with hunting, you may not know what

you're looking for. You're trying to find stuff that's unknown, but you still need to know which bits you might want to look for, something to search for, and I'm going to talk about that a bit today. Obviously, if anyone disagrees, you can not ask any questions later, maybe post it on YouTube. So let's set the scene. The way we see security is that you've got some really highly trained people in the middle. You've been training them for ages and they're really highly specialized, but you normally haven't got very many of them. And then you've got a bucketload of attacks happening, because a lot of

most attacks these days that we see in our SOCs are automated. We run a SOC that has about 4,000 customers, we have a lot of data, we see a lot of attack attempts and some attack successes, and the automation that drives those attacks is the biggest challenge really. So I'm going to talk a little bit about why automation is critical for threat hunting, even though obviously people are still critical. So where do those things go? We've got our SOC team, we've got some attacks that we were able to thwart, but we've still got these inbound threats. Generally speaking, you're getting overwhelmed a lot of the time, especially

if you run a small SOC and you've got a lot of data, or you've got large systems to manage. It can be very difficult. And just to switch Star Wars programmes, what we want to build is our nice analytic defense. You can see the good guys here, and it gets a bit confusing thanks to George, but building that analytic defense, building your clone army, is going to be important, because you can know some of what you're going to look for, and ultimately digging through all that data can be the biggest challenge. So what does that analytic defense automation look like? You want to match the behavior and the techniques of the attackers; you're necessarily

not looking for a specific thing in a signature, which can be relatively trivial to do at times if you've got a known attack, and I'm going to talk about that. But there are different ways to apply that automation; that's, like I say, the core difference. The industrialization of attacking is a very real thing now. We've got countries around the world where unemployment is very high, and they have teams of people, probably more organized than some of the companies I've been in to see, running these attacks, passing things around the world and attacking our customers. So we really need to make sure that we can find these things. So we talk about a

known attack, and this is the Shoplift bug from 2015. It was relatively easy to find. We knew the content we were trying to look for: we knew to look for this admin HTML and forwarded= within the URI. There was a community rule out that was a bit more tuned than this, obviously. So if you see these things, then that's the attack being attempted, and you need to go and dig into the data. That can be quite an easy thing for someone to do in their SOC: you take the community rule, maybe you tune

it a little bit, you build it into your IDS and you escalate it to someone who knows what they're doing. But threat hunting is different. It's about finding what you don't know. Like I say, it's not the same as zero-day hunting; a zero-day is something there isn't a patch for. What we're trying to do here is find stuff before it hits the public domain. One of our big success stories in that case was WannaCry. We downloaded the Shadow Brokers package and we were looking at it about a month and a half in advance, able to understand what was going on, and our customers were protected from that. I'm not going to go on about WannaCry

today; obviously everyone's talked about WannaCry forever. One of the key things, though, is that automation is pervasive within all of it. I don't want to labor the differences, but you definitely need your own clone army of things to try and find things out, and you can use automation in both of these cases. We're going to focus on threat hunting today, obviously. So what do you need to do to prepare? You obviously need data. So if that was Star Wars, now we've got Lord of the Rings. I'm not going to do Harry Potter, don't worry. Your automation is only as good as the data you've got, I would say. So we have

a lot of data. One of the cheesy stats we throw around is that we've got more data than Netflix, and that's the quantity of data you get when you've got a large SOC. With 4,000 customers, as I said, the amount of data that you collect, even just our event data, let alone log data, is massive. So sifting through that and getting into it is a really big deal for us, and that's why we need more and smarter tools, we need very, very clever people working in our research teams and our SOC, and that automation is critical. So what are we looking for in that data? There are some obvious things: source IP, destination IP, etc. We've

got a sort of generic payload on the right, on your right, and we're going to focus on the payload body and the URI today. For that known attack, we know what we're looking for: we know we're looking for things in the URI, and we can find them very easily. But when we're doing threat hunting we don't necessarily have that luxury. We might know some little bits here and there. The admin HTML being here might give us a clue, but obviously that's going to throw up a ton of false positives, because that's hit when people are doing administration against that system. So you want to collect a lot of data. But why not all the data,

right? There are a few reasons for that. I love this tweet from ages ago by James, and it kind of highlights why big data isn't this sort of panacea to everything. One of the big reasons for not collecting all the data, or at least for trying to manage the data properly, is cost. There are some vendors whose entire strategy is based on [inaudible], selling to people who run SIEM tools and want to keep data. Financial organizations especially will want to keep their data forever, or at least seven years, and that means a huge amount of data being stored. You don't necessarily want to compress it, because obviously that takes longer if you want to go back and do post-incident

analysis. If you do collect a lot of data, then you need to do some good stuff with it. You need to normalize it: you need to know that the timestamps are normalized between different systems, which may be in different parts of the world or may have just drifted a little bit. And obviously you need to categorize. If you want to search that data, and this is a big challenge for us, then just searching carte blanche throughout your entire data set will take a long, long time. So you want to be able to categorize it and look for specific things, and that's the kind of thing you can do

with increasing volumes and diversity of data: making sure you categorize and normalize appropriately. So what strings can we pull out? There are some obvious ones, boring stuff like source IP and destination IP. Source IP for us is one of those interesting things, because we all know that keeping IP reputation has a diminishing return on investment, since attackers shift IPs. It's much better to try and look for the IPs that are in their controlling infrastructure. If you saw my talk yesterday, I talked about the dropper systems: the infrastructure they're managing that will host the web shell, or host a piece of malware that they pull down

when they're doing a remote code execution. That's unlikely to shift; it's going to have a more static IP. Not completely static, obviously, and they might use a domain or something like that, but we can keep track of those things rather than looking for the specific attack, because that's just going to change all the time. And there are other bits and pieces, like the headers. Apache Struts is quite a common one to look for in headers, and so on. User agent: obviously they can spoof this, but if they're not particularly bright you can often tell whether they're automating things. We're going to focus on payload body for some

of the examples I've got. We find it generally more useful when you're looking for attacks, when you're doing threat hunting, because it contains the richest information. And this is where that expertise comes in. This is where some of the guys who are here today from our SOC really add value for us, because they're able to dig into this data, understand it and start to work out where they need to look, and we can start to return that information back into our research and content teams to generate what we would call telemetry signatures, or more wide-band signatures, to pull in more data to do some great

threat hunting. So I'm going to talk about these two, which are relatively [inaudible]. Obviously they're still out there; we all know that vulnerabilities don't disappear within six months. I think the Struts one was the one that was really most damaging in that year. We've seen successful attacks for both of these, and they're still being exploited, like I say. There's obviously an Exploit-DB record. For those of you who don't know, Exploit-DB is a great resource for attackers and defenders alike. You can go on there and you'll see proof-of-concept code sometimes, so if you're looking for a known attack it can be very useful. Sometimes you get proof of

concept code, like I say, but often you'll get some kind of payload information like this. We're not going to go into the specifics of how this works, but we can see that there are certain potential content matches within it, different parts that we could use to build a signature, do some threat hunting on, or specifically find this attack. One of the interesting ones in here is ProcessBuilder. This is the way that the Java execution is run remotely, or within the payload: you can build up a process and run a command. And this is the one that was useful to us in these examples. So now we're

starting to talk about how we're going to hunt. If you start to build a signature using ProcessBuilder, there are kind of two outcomes. If you've got low data volumes, if you're a relatively small SOC with a relatively small set of systems, maybe it only fires for known attacks, so it's not really adding value on top of your existing signature set. It might not fire at all. It might fire on some benign use; maybe you're actually using ProcessBuilder internally to do some stuff between your applications. So you just turn it off, because you feel like you could use your time better: go do something else, have a coffee,

whatever. If you've got high data volumes, obviously you can just get swamped, so you might feel the same, that you're not really getting any value out of it. Or you might just sample bits of data, start to see benign use and discard the process. Again: no value, go do something else. You might be able to use that data later for forensics, and obviously that has its uses, but again we've got that data storage problem. So this is why automation is really important. Back to this view of our attackers: whether or not we've got enough manpower is important, so we want to build that automation out. So what do we need? We

need dedicated resources investigating the data, and we need them supported by dedicated tooling. This is something we can obviously do at Alert Logic, because we build and develop our own tooling, so we can build tools for our SOC and for our researchers, whether that's automation to go and grab the bits of malware that we've seen used within attack attempts, or whether it's within the SOC itself, searching through data using different tool sets, or being ready when this WebLogic thing drops to the world. It actually dropped on a Saturday. You probably can't see it in that little bit of Chinese, but it was released on a Saturday on this Chinese blog. So that's how we found

it. We dropped in, we were able to find this Chinese blog, obviously translate it, speak to people internally and work out what was going on. It wasn't just on the blog, it was on Twitter as well. So how many people are monitoring Twitter for new attacks, or emerging threats as we call them? It wasn't all bad, but ultimately it was on the 23rd of December, so not fantastic. But our guys were there and working, trying to work this through and understand what was going on within the attack. And again we saw this ProcessBuilder thing, so we started to realize that this was

obviously a good way of detecting attacks, not just these specific ones but other attacks that might be happening. [Music] This was months later, obviously, after the WebLogic stuff, after the Struts stuff. So yeah, [inaudible], we knew that ProcessBuilder was involved again, and we could look again and see what we could do to use that for some threat hunting. So let's look at that WebLogic stuff. Let's pretend we didn't know about it. We didn't know about WebLogic, but we were still collecting data for hunting purposes. We'd used the Struts attacks to inform us that we needed to look for ProcessBuilder, and we wanted to have a look at

a generic Java signature to detect attacks against Java from the outside. And so we have this payload. We'd never seen this AsyncResponseService before, but we knew ProcessBuilder was in there, and there was this nasty PowerShell command coming in through the string within the payload. So not fantastic, but we knew that we wanted to look for ProcessBuilder. So we really need to pay attention to these three things, and then we're onto a threat and we should start to do some analysis. If we're doing a blind analysis, we maybe won't be sure about what was going on within this payload. We wouldn't necessarily know the

ProcessBuilder was a nice way of finding an attack. What do I do? Do I try and run it myself? Does the URI look anything like what we have internally? And it was a 500 response, which is kind of important. A lot of the time people assume 500s are errors, not successes, but in this case a 500 was an indicator of success. Because we'd done this research beforehand, we knew that, and we knew it was important. And it's that classic needle in a haystack, or needle in a needle stack, as one of my colleagues sometimes calls it. At that point in time, at the end of

December 2017, the amount of traffic was very, very low, so we wouldn't have been getting many hits from a more tuned signature. But the point is that you want to be able to detect these things early on. If you wait until it's in the public domain, if you wait until, like we say, six months later, when the traffic was higher and you would have been seeing a much higher hit rate from that signature, and maybe it would have drawn attention in your very tuned system, it's too late, basically. They've already built out the attack, they've already got over the hump of things not working or their automation not really being fantastic,

and systems are being attacked. So having that early warning system, generated from automation and from your threat hunting, is really where we can get value for our customers. The point is you need to optimize for this new stuff. Build automated systems looking for generic exploit code, something that you know is used externally to run commands or is part of an RCE: not that specific URI, but the part of the payload that lets you see what was happening and lets you give an early warning to your SOC and to someone who can go and look at it. That security expertise is really critical there, because they're the ones who are able to correlate out

and understand what that means, look at the rest of the payload and understand it. So it's imperative that you have high-grade security expertise, within the SOC or within a third party. More recently we've had the Drupal one, which was also quite interesting and very damaging, another RCE. Drupal put an advisory out, and we could see obviously that this was going to be significant again. This was in the core code of Drupal, something I touched on yesterday, not a plug-in: another way of running commands within Drupal, which was exploitable externally. So again we've got an Exploit-DB record. The majority of vulnerabilities end up with one at some point, and normally this is a

good way of telling whether or not it can be exploited at all, and whether something's going to get better or worse. So it meets the same bar as the WebLogic one, basically. Again we start to see some of this going on. The core vulnerability was them trying to send this GET request in this format, and we could see a curl request pulling down file.sh. Obviously it wasn't called file.sh; we've changed that. There's a difference between the payload and the actual vulnerability here. From the Exploit-DB record we could write an IDS signature to catch the vulnerability itself, but the payload was something that wouldn't have been appearing on that Exploit-DB

record. So again we can start to look at this data and work out ways of catching other similar attacks, using Perl or something else, and you can get that from looking at the actual exploitation traffic, not just from the vulnerability data that was released by the company or the Exploit-DB record. So what are the next steps for the hunt? They're using curl; what about wget? Maybe they're using that, and you might want to add some data capture mechanisms to look for it. What else has this IP address hit recently? What's that payload attempting to do? Is it running another payload, is

it downloading something, is it trying to deploy some kind of crypto miner or something else, and can you use that to generate more threat hunting data collection methods? And importantly, what does the command and control look like? Can we write detection content for that, and can we find out what happens in the stages after that initial curl attempt? None of these steps really requires the attack to have been successful. This could have been an early warning. Your threat detection is generating leads for your threat hunting: it hasn't necessarily detected the success of the attack, but your people are then able to go and look for other attacks that come along later. And that's what's important. Yeah, we can get a

signature from a community store, or we can get one from the vendor that supplies our IDS, and we're going to understand the output of that and maybe tune it for our own purposes. But hunting is about finding things that we don't know about, finding things that might be suspicious, which a human being can then go and investigate without trawling through huge amounts of unstructured data. And obviously it's useful for forensics: if you see that file.sh on any of your boxes, or any other vulnerability or any other traffic, then you can start to understand that maybe that's a system

that's been attacked [inaudible]. So we generate what we'd call a virtuous circle. You understand one threat, you start to look at that threat, whether it's in the URI or somewhere else. You extract the payload, you start to dig into the payload, maybe you try to execute it somewhere, build an environment. There are different aspects of the attack. Analyze the command and control, come back around, have another look at the threat, extract the URI, analyze the URI, and you end up with this nice virtuous circle of trying to find attacks that you don't know about, that aren't necessarily in the public domain, which is always too late really. Analyze the C&C and perhaps release

more content for that, and build up a nice virtuous circle within your process. Some people in more operational roles might call this continuous improvement. We've largely automated a lot of this and operationalized it as a hunting capability within your dataset. So we end up with our clones fighting clones. An attacker uses some new vulnerability and drops the same payload: we've caught it, because the payload was similar. They might use a new payload in the same campaign that uses a similar part of the payload, that ProcessBuilder or something: we've caught it again. And that's the point. You end up with ever-increasing amounts of

detection capability, basically, via your threat hunting mechanism. Another way of looking at it is this threat tree. You can imagine that this carries on off the bottom of the slide. You build up a nice idea of a certain threat, you can understand the new payload that was in that threat and start to understand a new threat from that, and you'll get a new payload out of that one, and it carries on going. Maybe not forever, because there's very little new under the sun, which we should realize, and attackers are inherently lazy most of the time, so they're going to reuse stuff. That's often what we see, and

so you soon reach a plateau where there's very little new to be discovered, and you're able to automate that for hunting. So, a summary of what I've talked about so far. Threat hunting can give you advance notice of attacks going on that you didn't know about: something new, something weird, or a different approach to an existing exploit that's out there. Obviously it's really easy to throw these up in bullet points, but that capability is directly tied to the amount of data you have and, most importantly, how you manage that data. It's all very well having huge reams of data, but what matters is the automation you invest in and the people that drive that automation,

whether it's traditional or machine learning. Ultimately you can't necessarily be ahead of the attacker, but you can walk directly in their shadow, so hopefully you're seeing the attacks when they're attempts, not successes, which is really our goal at Alert Logic. If you think about having 4,000 different companies within our common platform, and all the data that comes in, and think about the IP address space of our customers, it's very long, I suppose you could put it that way. Hopefully we're going to see something very early on, build detection content, and then detect it for the rest of our customer base before they get to

some vulnerable system that's down here. In a normal organization with a much smaller IP address space, you have a much shorter amount of time. I was once a customer of Alert Logic, and that's kind of what I bought into: I had a much better chance of being protected from attack, as I'd most likely be down here [inaudible]. This is a bit of a summary of our process really, and it's the last slide. There are no sales slides in this one. So we take data from a lot of different sources, and that's the point. It's not just about the network. Obviously we look at the cloud now; everyone's moving into clouds and most

people have got some kind of cloud platform, so taking the audit logs, the access and change logs, from the cloud is very important, as we have traditionally done on-premise. And pulling in data from containers: containers are obviously much more widespread than they were even a year ago, so monitoring them is really important now, and we've built a native container solution that you can take out of the repo and drop into your hosts, and it monitors inter- and intra-container traffic. Then obviously there are different connected devices, firewalls, etc.; we take logs from those and feed them into our back-end platform, through some of the processes that we've talked about today, to make sure that we can analyze that

data and escalate incidents into our SOC. We raise a significant number of incidents through our SOC, and I talked about some of those numbers yesterday. But the point is that, using those different mechanisms, whether they're signatures and rules and traditional methods, or more modern human or system anomaly detection, or some machine learning stuff for false positive reduction, we can understand whether something's medium or low or informational, where ultimately you don't want to be woken up at 3 o'clock in the morning, you want an email or something like that, or whether it's critical or high and needs to be looked at by a person as quickly as possible.
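The severity split described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Severity` levels mirror the ones named in the talk, but the `route_incident` function and the channel names are hypothetical and do not describe the speaker's actual platform.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels as named in the talk, ordered from least to most urgent."""
    INFO = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def route_incident(severity: Severity) -> str:
    """Return a hypothetical notification channel for an incident."""
    if severity >= Severity.HIGH:
        # Critical and high: a person should look at it as quickly as possible.
        return "analyst_review"
    # Medium, low and informational: an email is enough, nobody is woken at 3 a.m.
    return "email"
```

Using an ordered enum keeps the routing rule to a single comparison, so adding a level later does not require touching the routing logic.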

So we've got this 15-minute SLA we give them, which is pretty tight, to determine whether or not we should wake our customers up if it's out of hours, and to get that to them in a way that describes their next actions. Ultimately our customers understand their systems and the business, and we understand security, and those two things work well together. That's me. I'm a little bit ahead of time, so we'll see if anyone's got any questions I can take. Hopefully that was some use. [Audience] Two questions, if I may. One is how you deal with HTTPS: are you just on the one side of the TLS offloading? And my second question would be: how do you know your customers'

network? One of the issues we face, having recently merged with another company, is that we see stuff like WordPress attacks and then have to question whether we're even using WordPress, whether this matters, that sort of thing. [Speaker] OK. So, TLS or SSL: we're the other side of your offload, generally speaking. It depends what we've deployed. If we've deployed our WAF, we can obviously man-in-the-middle that and have the certificate on the WAF, or we're going to decrypt it on our agent on the host. So we have different mechanisms: we'll have an agent, or plug into a SPAN. Either way, if we're going to decrypt that, we need the

certificate. As for customers' networks, that's obviously a big challenge. It's easier in the cloud, because you've got a lot of APIs you can hook into. We're going to use our scan data: we deploy our vulnerability scanners, and they'll hopefully run regularly, so we can use that scan data to build an asset model. If you've seen our system, you'll see we build up a nice topology map of the environment, and we can understand it. But that's where security is always a partnership between engineering, operations and security, to make sure that the language you use and the way you communicate is in context. But that scan data would tell us if WordPress is in the environment.
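As a rough sketch of that idea, scan results can be inverted into a product-to-hosts map that answers the "are we even running WordPress?" question. The data layout, function names and sample hosts below are all assumptions made for illustration; they are not the vendor's asset model or scanner output format.

```python
from collections import defaultdict


def build_asset_model(scan_results):
    """Map each detected product name (lowercased) to the set of hosts running it.

    `scan_results` is a hypothetical dict of host -> list of product names.
    """
    model = defaultdict(set)
    for host, products in scan_results.items():
        for product in products:
            model[product.lower()].add(host)
    return model


def attack_is_relevant(model, targeted_product):
    """True if any scanned host runs the product the attack targets."""
    return bool(model.get(targeted_product.lower()))


# Made-up scan data for two hosts.
scan_results = {
    "10.0.0.5": ["nginx", "WordPress"],
    "10.0.0.9": ["Apache Tomcat"],
}
model = build_asset_model(scan_results)
```

With this sample data, a WordPress attack attempt would be flagged as relevant because one scanned host runs WordPress, while a Drupal attempt would not.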

One of the things I talked about yesterday was our emerging threat process, and that's where that scan data is really useful. If we see something new and hot emerging, like that Drupal one was, then we can use that scan data to proactively reach out and say: hey, you're vulnerable, you want to patch this, this is going to get bigger and it's going to get worse. So there's a mixture of approaches. Any more questions? No? Sweet. Thank you very much, thanks. [Applause]
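As a recap of the hunting approach from the talk, a wide-band match on generic exploit building blocks in the payload body, rather than on one specific URI, might be sketched like this. Everything here is an assumption for illustration: the marker patterns, the event field names and the sample events are hypothetical, and a real telemetry signature would need tuning against benign use of the same strings.

```python
import re

# Hypothetical markers for generic command execution, based on the strings
# mentioned in the talk (ProcessBuilder, curl/wget, PowerShell).
GENERIC_EXEC_MARKERS = [
    re.compile(r"java\.lang\.ProcessBuilder", re.IGNORECASE),
    re.compile(r"\b(curl|wget)\b", re.IGNORECASE),
    re.compile(r"\bpowershell\b", re.IGNORECASE),
]


def hunt_leads(events):
    """Yield (event, matched_patterns) for events whose body hits any marker."""
    for event in events:
        body = event.get("body", "")
        hits = [m.pattern for m in GENERIC_EXEC_MARKERS if m.search(body)]
        if hits:
            yield event, hits


# Made-up events; the first imitates a request carrying generic exploit code.
events = [
    {"src_ip": "203.0.113.7", "uri": "/some/service", "status": 500,
     "body": '<object class="java.lang.ProcessBuilder">...powershell ...</object>'},
    {"src_ip": "198.51.100.2", "uri": "/index.html", "status": 200,
     "body": "hello"},
]
leads = list(hunt_leads(events))
```

The output is a list of leads for a person to investigate, not a verdict: the match says nothing about whether the attempt succeeded, which is why the status code and the rest of the payload stay attached to each lead.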
