
Hello everyone, thank you for joining us at BSides Tampa 2022. I am excited to introduce our next speaker, Christopher Peacock. He will be giving a presentation on the process of detection engineering. Here are just a few of his many experiences: cyber threat intelligence analyst, cyber threat hunter, tier three SOC analyst, and incident responder. He also has experience in multiple industries, including energy, finance, healthcare, technology, and defense. Please welcome Christopher Peacock.

Thank you for that introduction, really appreciate it. Thank y'all for being here. Let's see if we got slides up. Okay, there we go. Thank you all again for getting that set up; really doing a great job today, guys.

So, the process of detection engineering. First off, who am I? I'm Christopher Peacock. I started my journey at alpha 2, where I did help desk, network administration, system administration, that whole jazz. I worked up and went to General Dynamics, where I started working as a SOC analyst and conducting purple team. Back in that time, PowerShell was just coming out as something that was really used by pen testers and also adversaries, but we didn't really have EDRs yet, so it was about how to go out and enable the PowerShell logging around that. From there I went to Raytheon, where I did intelligence, I did incident response, SOC 3, and led the threat hunt team, which is now more kind of what has become detection engineering, right? With threat hunts, at the end of the day you end up with a query that you can actually deploy as an alert, so it's kind of like automated threat hunting or detection engineering, however you want to label it. Multiple ways to call it, but that's what it is. And now I'm at SCYTHE, where I do adversary emulation and detection engineering. That's actually going out, finding what bad guys are doing, and extracting those procedures (we're getting down to the procedure level in the TTP phrase), and then we work on doing detection engineering, so we map those
to Sigma so that you can deploy them wherever you need to, or we create Sigma rules and contribute them to the community.

So one of the goals of this is to find suspicious activity, and one of the areas that we like to focus on is post-delivery. Reconnaissance, that's where someone's scanning, things like that; you're not going to get really good return on investment there. For delivery and exploitation, we're really relying on our vendors: we're relying on our firewall, our WAF, the email gateway, things like that. And then once we get to installation, command and control, and actions on objectives, that's where we want to catch them, before impact. Ransomware isn't click and spread everywhere. That was WannaCry; that happened once. Most of the time they're actually going through a whole attack chain in the environment, and you can catch them pretty early, before they get to impact. So that's what we're about: catching those adversaries early, before impact.

What are the strategic drivers of this? When we look across the strategic drivers, we have two that make up the foundation. We have the operational capability, or capacity, which is how the analysts can work with the data, and then we have the data collection. Those two things are the foundation, but when we add threat understanding to it, that's what evolves us into being a threat-informed threat hunter or detection engineer.

So with data collection, we need to understand what data we are collecting, and from there we can figure out what we can do analysis on. If we don't collect named pipes, then we're not going to be able to do detections around that. We have to understand our gaps and what we're collecting. And then finally, where is it collected? Oftentimes with an EDR you have the data in the EDR, and it might only send alerts to the SIEM, so the SIEM only gets the alerts. So you can't
do your detection engineering in the SIEM; you have to go to the EDR. So we have to understand that. And then finally we need to know how we prioritize our data. One way we can prioritize our data is by understanding which data sources are applicable to the most techniques and sub-techniques. So here we see command execution, process creation, and file modification; those are all very important data sources to cover the ATT&CK matrix.

And then the other one is the operational capacity, and this is just understanding that we need good tools and good analysts working together. That's really what it's about. If the data is spread across multiple tools, it's going to slow your analysts down. The other thing to take into consideration is the time factor: when we have inefficient tools, or it takes too long for us to run queries, it's going to slow us down as detection engineers or threat hunters.

And then finally we have threat understanding, and this is where that CTI bit comes in: knowing what's going on in the threat landscape and what actors are doing against our org. If I'm an elementary school, I don't really care what North Korea is doing to South Korea. That's not my threat profile. I need to go out and study what ransomware actors are doing. And then we want to focus on the procedure. We don't want to focus on the technique level, because if someone tells me that they dumped LSASS, I don't know how they dumped LSASS. I can go out and grab an Atomic Red Team test, which is great, but I don't know if that actually aligns with what my adversaries are doing at the time. And we also don't want to focus on IOCs or threat feeds, because these change very often.

So the process looks something like this, where we get direction from the CTI; we know what's happening in our space. We do our collection after our direction. Also, direction hopefully includes some sort of purple team exercise, so that you actually have that data
in your production environment. You want to see what it would look like in your environment so that you can also test: do we have alerts around it or do we not? And then we do our collection, then our processing, and finally we disseminate it back out to the SOC and our stakeholders.

So, cyber threat intelligence. This is a big area. I've heard multiple times people say, "I pipe my threat intelligence right into my SIEM." I'm like, really? Because threat intelligence is very hard. One thing to think about is that IOCs are not threat intelligence. What we're talking about with threat intelligence is actually understanding who's going to attack us, who's in our threat landscape, and then what those threat actors are doing at a procedure level. And why we do this is because adversaries have habits, they have training, they have tools, and they also have guides. If you haven't seen the Conti playbook, it literally tells them step by step what to run, and every time there's a Conti incident they're running nltest, net, all this different stuff, so you can catch them early. You can check out the guide; it's awesome just from an intel standpoint to actually see it lining up with things like The DFIR Report, which is a great resource to have.

And I think I went out of... there we go. So, direction. We have that CTI, we understand what our adversaries are doing at a procedure level, and then we ask: do I have detections already? We need to actually emulate those procedures so we can see if we have detections or not, and from there we need to figure out how to catch it. But first we'll go ahead and run it. So in this example we go ahead and run the procedure and see if we catch it or not, and that's how we get direction: by actually running those procedures. And this is what it looks like at the end for the direction side, where we've put together a plan of what the adversary is doing at the procedure level, we've mapped it to alerts, and then we see where
we have alerting gaps as well, so that we can work on doing engineering around those.

Next we have collection. With collection we want to start seeing what we have in our events; we want to see what's actually in our data sources. For this we can leverage DeTT&CT to map out our data sources and see where the detections might be. We also want to identify and catalog any visibility gaps. Like I said before, lots of EDRs won't actually log named pipe creation, so you can't do certain detections around that. And as we look and start understanding the data, we can start hypothesizing.

So here is one way that we can map this, if you're new to it. You go to the MITRE ATT&CK page for PowerShell. They call these detections, but they're really log sources. We see that there's a data source of Process and a data component of Process Creation. So you can go to any of the technique pages: once you have your procedure, you map that procedure to the technique, and you can figure out the data sources needed. The problem is that ATT&CK ends here, and you have to figure out what your actual event logs are in your environment. The way you do that is you leverage DeTT&CT. With DeTT&CT you can add your data sources in. So here we have process creation, and I can see the products where that actually resides, and then I can quantify my data quality in there. And this is what it looks like. CISOs love this type of overview. It shows where we actually have visibility into certain data collections. This isn't to be construed as actual coverage from an alerting standpoint or a response standpoint, but it is a good way to show where we have not detection gaps but logging gaps.

So, hypothesizing. This is where we've started analyzing our data, and we need to look at ways of doing the actual alert engineering. We start out by casting a wide net, because we want to
get as much malicious activity as we can, and then we start tuning and narrow it down. We want limited false positives. If we deploy a lot of false positives, we're going to be in trouble; our SOC's not going to be able to actually respond to that alert. But we also want to embrace that we're going to have a few false positives, because a lot of what we're looking at is living-off-the-land techniques, which use things built into Windows. We want to find suspicious activity, and a system administrator or someone like that might get flagged.

So developing a hypothesis kind of looks like this. We have the report "Microsoft discovers threat actor targeting SolarWinds with a zero-day exploit". In it we see the mshta application making a call out to a public IP address. So we can start asking: do we know what mshta does in our environment? Does it normally connect outbound or not? We can start thinking about whether that's a detection opportunity. We also see whoami execution, so we can start thinking: do we need to catch that? We also see a couple of things that we'll go over as we ask more questions to start developing our hypothesis, or our search query.

So one of the things we want to consider when we're looking at a procedure is: what is the procedure doing, and what makes it malicious? How is it used maliciously? What's the threat actor really doing here? What that looks like here is we see that command prompt is launching whoami. They're enumerating what they're running as, and then they're piping it out to a text file. So now we know what the adversary is doing, and we can ask: what do our system administrators do, or what does our help desk do? Does our help desk actually pipe whoami out to a text file? And we can start thinking about ways of hypothesizing around that. The other thing, when we start hypothesizing, is we look at
how often this happens in normal operations. In my environment, how often does cmd launch whoami? Or, like we said before, how often is the output actually piped out to a file? That redirect is not very common for whoami in most environments. If we're at a small company, a few hundred workstations, no one may even use whoami. So that's another thing to consider as well. These are just things that we're starting to try to baseline in our environment.

Then we look at: are there parent processes? When we look at the cmd process chain, we can look at what actually started this process. Once we understand what starts this process in our environment, we can potentially tune it out, or we can look at how often cmd launches whoami. Maybe whoami launches in my environment, but it only launches from a different process, so I just tune that out and flag on any other whoami execution, because that's one of the first things attackers do when they get on a host: they want to know who they're operating as.

We can also look at: are there common child processes? What we have here is a common parent-child relationship, well, not a common one but a commonly malicious one, where Winword is spawning command prompt. So that's one that we can tune into. We can look for rare processes coming off of Winword that attackers are using, such as command prompt, PowerShell, or rundll32, things like this. We also see Excel spawning rundll32. So these are things we can start looking at as we do threat hunting and detection engineering: what are the child processes that I can tune into or out of? If you want to catch an Apache zero-day, probably nine times out of ten the Apache exe process is going to have some sort of scripting interpreter, such as PowerShell, spawning off of it, because the attacker needs to run commands.

The other thing we need to look at is command line parameters. Often when we're
trying to do tuning, we're looking at a certain process that's used across the network because it's one of the living-off-the-land binaries and scripts. If you're unfamiliar with that, those are just common applications that are built into Windows, or other Windows executables that attackers can bring in. So we need to look at what the suspicious command line parameters around those are, and then we can potentially tune into those, like in this case. Or, if I'm looking at one application and one process or multiple processes keep having the same command line parameter, I can tune that out as well.

The other thing we want to look into is the users. A lot of times certain processes run as SYSTEM, so if those processes are running as something that's not SYSTEM, that's interesting. Or if they're running as a typical user and they're not usually supposed to run as a typical user, that's interesting. So we can look at that. The other area of suspicious users is we can say: all right, whoami is typical for help desk to execute, but should someone in HR be executing whoami? Let alone, should someone in HR be launching a command prompt? That's probably a little suspicious, because I don't know too many people outside of IT who launch command prompts.

And then finally we also look at: does the process make network connections? This is a huge one. For almost anything inside the C:\Windows folder path, if it's going outbound, you probably need to start baselining that and looking for suspicious activity. In this example we see PowerShell going outbound. So what we want to do is baseline: does this process call locally to localhost? Is that common? Sometimes you'll have certain processes that don't call localhost, and you need to start flagging if they do. Or we look at private IP spaces and say this process should only
talk to private IP spaces in our LAN, but it shouldn't call outbound. So that's another area we can look at. And then finally, external IPs. We need to understand what this process communicates with on a common basis so that we can flag when it's suspicious.

What this looks like when we're tuning is we cast the wide net first. We cast the net and we usually get a bunch of benign events, and that's where we start. That's what casting a wide net looks like as a visual reference. Our goal is to get down to this, where we have a few benign events and we get most of the malicious events. The way we do this is through testing. We start seeing a bunch of different ways that attackers are using those tools, rundll32 and things like this, and we start to get a bunch of data generated around that. That's why we want to do emulation in our production environment: to actually verify our detections and see if we need to change them.

One thing we like to talk about too is: don't go too small or too precise. If you go too precise, you end up with something like this, where you're actually missing a lot of the malicious events. That's not good, because you're missing the bulk of the events that the rule could be covering. Either you miss alerts that you should have had, or you have to create a bunch of different layered alerts, and then you just have a lot of alerts, and that slows down the systems.

One thing we also want to mention is that, with the nature of finding suspicious activity, sometimes you think, "Yeah, I'm gonna catch this one guy," and then when you actually go out and look at it in your environment you realize that a bunch of benign events come along with it. So I just want to say: not every procedure that an adversary uses is going to be a good
detection opportunity. Sometimes you just don't have a good detection opportunity, and you have to accept that. But at the end of the day we want enough lasers in our vault, scanning everywhere, so that when a hacker comes in they trip one of those lasers.

A quick example of this is the tuning of WMIC. In our environment we went out and looked for WMIC. It's a common living-off-the-land binary built into Windows. What we have here is we looked for the parent processes that were spawning it, and as you see we have an Amazon agent that's spawning it a bunch, but then I have my SCYTHE emulation campaigns, and a PowerShell that came off of one of them; those were also spawning it. So this is a quick win in an environment. I can just go out and look for WMIC: should it be spawned in my environment? Okay, it's spawned by this. We tune that out, and now we're already finding suspicious activity. So that's a perfect example right there.

And then we get to dissemination. With dissemination we're delivering to the stakeholders. This could be delivering to the SOC, obviously, in the form of an alert. If we did good threat hunting, we have a tuned alert at the end of it that we can deliver to the SOC, and now we can go threat hunt on something else instead of having to threat hunt on the same thing month after month. One of the other things we may want to do is give it to management: we can take our developed alerts, map them to MITRE, and throw it up on a graph for them. So that's one thing to think about as well. We can also give it to the CTI team, and we can document log sources so we know what tools are giving us the most value.

And then finally, this is always a circle that's ever continuing, because adversaries are always updating their procedures. Also, if you have a red team, you can tell the red team that, hey, we have this
rule now or