
I'm Mike Perowski from the network threat detection team at Meta, and my co-presenter here is Sam Manzer on the purple team. Over the last few months we've been fusing our defensive and offensive skill sets to build an end-to-end automated offensive attack framework that enables us to validate and tune our detections, so we're here today to talk to you about that process, some of the challenges along the way, and the successes we're seeing as a result. To do a little bookkeeping first: I'm going to introduce you to my team's hypothesis, our reason for existing, and what we're working on, and then I'm going to hand it off to Sam to defend that thesis.

Then we're going to dive into the process of detection engineering on our team, how Sam's attacker-on-demand infrastructure helps us achieve those ends, and walk you through what it actually looks like to use the system and how it enables us to build highly effective, high-fidelity proactive detections.

So first, the thesis, and I do anticipate some eye rolls here. These days it seems most industry-standard detection and response strategies rely very heavily on application logging and host-level logging as a first line of defense, only then pivoting to network telemetry to gather additional context and evidence. My team's hypothesis is that this paradigm is backwards: that we
can actually rely on network telemetry first, as it is inherently reliable, comprehensive, and generalizable. We're going to dive into more about what that means, but I just want you to keep it in mind as we talk through it and be the judge of it yourselves. Cool. Sam?

Hi there, y'all. So let's talk a little bit about these network-level detections and how they relate to EDR, that table-stakes endpoint, host-based protection you've got in your organization, whether it's from Microsoft Defender or CrowdStrike or whatever y'all like to use. EDR is pretty good these days: it will generally make your life a lot more difficult if you're a red teamer or offensive practitioner of some sort. You can get around it, but it gets harder and harder every year, and it's pretty smart at catching the obvious stuff. Nonetheless, I want to talk about how network-level detections can provide a lot of additional value and really complement the organization's security strategy.

The challenge with EDR is that there are a whole bunch of ways to trick users into disabling it. This is my favorite one: if you go on GitHub to install whatever cool new JavaScript framework or tool, a lot of them will tell you to install the product this way, and in fact some of the security products out there will tell you to do it this way too. This will basically bypass a lot of EDRs, because the payload never hits the disk and you've already elevated to sudo. So if you want to try something fun, you can just put a little shell script in there. They wouldn't let us put the exact one-liner in the talk, but you can ask me after; a couple of lines like that get a lot of interesting EDRs disabled, and a commodity payload that any EDR should catch won't get caught. I think this little tidbit is an example of the general class of issues that relying solely on EDR can give you, which is that it's fragile: it's on the endpoint, the employee has access to the endpoint, and if they have sudo and they're really busy, there's a whole bunch of stuff they can do that will remove that layer of protection for you. The other challenge is that in a big organization you probably have a whole lot of different kinds of endpoints, and they may not all have EDR; even if they do, EDR may not work to the same level of quality and protection on one type of endpoint as it does on the others. So because EDR exists in
this hostile environment, where users are always just one right-click away from "run as administrator" or otherwise somehow bypassing your protections, it can perform well in certain scenarios but will struggle in others. So let's talk about how well EDR can work in different scenarios within the organization.

I like to think of Bob as the role that EDR is really best designed to protect. If y'all have worked any kind of IT support role in the past, you think: the Bobs are the source of a lot of my problems. Bob's always trying to upgrade Adobe Flash Player and it causes me all kinds of headaches. And that's true; no one can completely protect the Bobs out there, but EDR does a good job of catching a lot of the stuff Bob is going to click to get your endpoint owned. The challenge is that there are a whole lot of other kinds of endpoints and workflows out there. Your devs, for example, are probably using Macs in a lot of shops. That's okay; your EDR vendor says they support Mac, but the actual performance of a lot of the detections may be drastically different from what you'll see on your Windows fleet. In addition, the devs are probably the folks most likely to curl to bash with sudo, so they have a unique set of threats that come from those workflows. We've also got folks doing really high-end R&D work, and the machines they're running will probably look a whole lot different; maybe they don't even have EDR at all because someone forgot to set that up, or maybe the EDR doesn't work well because they're weird BSD machines or something like that. And it doesn't stop there: your Docker container is also an endpoint. If you get supply-chain attacked and some evil npm package drops an implant inside that Docker container, well, it's contained in a certain sense, but it can probably scan your whole internal network, and there's probably a whole bunch of other nasty stuff it can do too. If you're running EDR on a MacBook and you're running Docker for Mac, which is a Linux VM inside, I almost guarantee that's not getting scanned at all. So there are some challenges there. And then of course there's every single other kind of device running behind your perimeter: it could be a switch, it could be the CEO's Apple
Watch, all kinds of stuff. This wide variety of endpoint types, and the challenge of getting uniform EDR coverage across them, leads to a situation where network threat detection can especially shine as a complement that gives you that breadth of coverage. Mike is going to tell you how we get a lot of value out of that.

Thanks, Sam, for the good info and for justifying my employment; I appreciate both of those. As you can see in the diagram here, what we're really proud of is the fact that as long as you intelligently select where you're tapping the traffic, and you ensure you're utilizing the natural choke points that we expect everything to flow through, you will see it all on the network, independent of device, operating system, newness to the network, or what have you. We leverage this advantage to detect three TTPs mainly: we're mostly targeting command-and-control traffic, anomalous data exfiltration patterns, and network service discovery or mass scanning inside and around our network. We do this using mostly Zeek logs. For those of you who aren't familiar, Zeek, formerly known as Bro, is an open-source, passive network analyzer. I like to think of it as a connection-oriented Wireshark: rather than inspecting individual packets, you're seeing the full session between any given pair of endpoints. What we really like about this is that it formats our logs in a highly digestible way, so right out of the box we can ingest them into our internal data systems, query them intelligently, and build our detections around that. Additionally, it's very extensible: while it comes with a set of base packages that we rely on pretty heavily, we can add things over time as new protocols are introduced or as we have new detection methods we'd like to try out in our network.

To use the software, we own and operate a fleet of sensors positioned all along the perimeter of our enterprise network. What this achieves for us is that natural choke point I mentioned before: we see all traffic egressing and ingressing our corporate environment, into and out of prod and into and out of the public internet, and we also catch inter-DC hops within the enterprise network. So again, we have these natural choke points, locations around our network where we're virtually guaranteed to see all traffic going in and out, independent of where it's coming from and the platform it originates from, and so we can use
this advantage to build really high-fidelity detections on top of, assuming we can overcome some of the challenges we're going to discuss. Before I get there, I wanted to give you an example before we get too deep into the weeds. This is what your typical Zeek log looks like. This is the dns.log, which we use pretty heavily for DNS C2 detections. As you can see, you get the basic fields you would in any sort of NetFlow, with the source IP, dest IP, and the ports, but you also get all of the very specific DNS fields completely parsed out in a queryable way: the actual query name, the class, the qtype, and so on and so forth. These are the fields from which we can intelligently pick the ones relevant to attacker behavior and model the behavioral detection around.

The reason I showed DNS first is because this was actually the first success my team had inside our network. We were monitoring for anomalous spikes in DNS TXT queries, which is sort of the canonical example of network telemetry as a detection source. In case you're not familiar, DNS TXT queries enable arbitrary blob data to be communicated back and forth, and they're difficult to block because you would essentially have to disable DNS across the network, which breaks a lot of things, or everything. So we took this concept, began modeling what DNS TXT queries look like in our environment, set some thresholds because we wanted to see when there were really large spikes in that behavior, and alerted when we saw them. This was the first big win that came out of that. As you can see here, we had a task generated with all of these really fun-looking domains that at first glance look scary, like there's a ton of important encoded data being transmitted, but also with these very fun dictionary-mashup words like "beneficial skullcap" and "life in the DMZ," which seems a bit conspicuous for your standard attacker. So I went ahead and tapped the shoulders of our offensive security group at Meta, and sure enough, this was a covert red team op: they were exfiltrating some data and using DNS TXT records for command and control. After this victory we started to feel quite confident that, okay, we can use this network telemetry and develop pretty high-fidelity detections on top of it.

Now, I do just want to nip this one in the bud, because we get this all the time and I feel like I'm constantly
addressing it. Right out of the gate: yes, network detection is hard because a lot of things are encrypted. However, this is not a reason, this should not be a reason, to avoid using network telemetry and building a detection system on top of that data. It often gets cited: if everything's encrypted, how much value can we really get out of it? I'm here to tell you a lot. There's a lot you can do with this data, because there are a bunch of remaining unencrypted protocols, some of which will surprise you and which are a signal on their own, and then you can also use the metadata, even in encrypted traffic, to get some pretty interesting results. For example, time-series heuristics, which we're going to dive into next, are what we're using for some pretty cool C2 beaconing detectors.

Before I explain that, I just want to show you the ssl.log. That's a legacy name, but it encapsulates modern TLS traffic as well; it's the default Zeek log for SSL/TLS traffic. Same thing: you get the standard NetFlow fields plus some of the algorithms used, the server name, cert chains, and so on and so forth. My job in this specific detector was to identify which fields we could use even in the case of encrypted C2 traffic, and really all we needed was the flow traffic, the timestamps, and some of the server name and endpoint information, all of which we can get pretty easily here regardless of the encryption used.

To go forward, I'm going to lay out the scenario we're working with. Sam Manzer, a very honest and hard-working employee at Meta, saw this email, felt the absolute need to click it, and as a result now has a little implant installed on his machine that's communicating out to attacker infrastructure using SSL as its C2 transport lane. A common tactic with these C2 frameworks is to include a periodic beaconing function, essentially a "still alive" message, to communicate out and say "we still have this foothold," giving the attackers the peace of mind that they can continue their operation at their leisure. There are some nuances to it, and you can introduce jitter in either direction, but essentially what you're going to be looking at is traffic that looks something like this: very periodic spikes in traffic to some location. It's not going to be a lot of data being sent, but it's going to be very patterned, and in theory this deviates pretty significantly from the standard traffic going on inside the corporate environment. So our logic is: okay, let's try to encapsulate this behavior in a little SQL query, target our ssl.logs, and see what we can find. That's exactly what we did, and we uncovered a few pretty significant challenges. The first one is that large corporate networks, particularly the one I've been working on, are very loud. There is a ton of noise in every direction, and for every detection idea, however rock-solid it might be, you're going to have dozens of corner cases and edge cases, with every engineer you could possibly think of saying that they need this tunnel exiting the network and egressing all of this data, which drives me crazy, but we have to acquiesce sometimes. So that's the first challenge we run into.
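To make the beaconing heuristic concrete, here is a minimal, hedged sketch of the periodicity idea in Python. This is not the production SQL; the field names, record shape, and coefficient-of-variation threshold are illustrative assumptions. But the core signal is the same one the query targets: unusually regular inter-arrival times between one source and one destination.

```python
# Sketch of the beaconing heuristic: flag (src, dst) pairs whose connection
# inter-arrival times are suspiciously regular. Names are illustrative.
from statistics import mean, stdev

def beacon_score(timestamps):
    """Coefficient of variation of inter-arrival times.
    Near 0 => highly periodic (beacon-like); large => irregular human traffic."""
    if len(timestamps) < 3:
        return None  # not enough samples to judge periodicity
    ts = sorted(timestamps)
    deltas = [b - a for a, b in zip(ts, ts[1:])]
    mu = mean(deltas)
    if mu == 0:
        return None
    return stdev(deltas) / mu

def find_beacons(conn_log, cv_threshold=0.1):
    """conn_log: iterable of (src_ip, dst_host, ts). Returns flagged pairs."""
    by_pair = {}
    for src, dst, ts in conn_log:
        by_pair.setdefault((src, dst), []).append(ts)
    flagged = []
    for pair, times in by_pair.items():
        score = beacon_score(times)
        if score is not None and score < cv_threshold:
            flagged.append(pair)
    return flagged

# A 60-second beacon with a little jitter stands out against bursty browsing.
beacon = [("10.0.0.5", "evil.example.com", 60.0 * i + (i % 3)) for i in range(20)]
browsing = [("10.0.0.7", "news.example.com", t) for t in (1, 4, 5, 30, 31, 33, 300, 900)]
print(find_beacons(beacon + browsing))  # -> [('10.0.0.5', 'evil.example.com')]
```

Even with jitter, the beacon's inter-arrival times cluster tightly around the period, so its coefficient of variation stays small while normal traffic's does not.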
The second one is that you can put all your effort into developing these really intense, logically sound queries, and you end up with something that looks like this (albeit obfuscated) SQL code. The problem is that you can dedicate all these resources and still have no idea whether it's actually correctly targeting what you want. You could be getting 100 results per day, but how do you know that the actual beaconing traffic you're looking for is in those results every day? That's a problem we're trying to address. And the last challenge I want to discuss should be familiar to pretty much everyone in the room as security professionals, although I do feel that in detection engineering it is particularly front of mind: we are all constantly trying to prove a negative. If our detections don't fire, it means there was no attack. Or was there? We're always waiting to justify our existence by saying, hey, look what we most likely stopped; here's the proof that this thing happened out in the wild and we have defense there. But if we don't get that signal, then we don't really know if things are working, if we're actually building the moat that we assume we are. So these are all the challenges that we're really hoping to address
with the system that we're working on. I'm going to hand it back over to Sam to discuss that.

Thanks, Mike. So let's talk about testing detections. Before I got into security to have all this fun hacking things, I put in a lot of years as a straight-up software developer, and as the devs in the audience are all too familiar with, testing, whether it's unit, end-to-end, or manual QA, is in many cases a very well-standardized part of the software life cycle. We want to bring that same level of rigor to testing detections. We want to know that those detections are actually going to perform when we need them to, and that at any given time we always have the level of protection they should be providing. The highest-fidelity way to do that, the way that can really give us that confidence, is to simulate real attacker activity and see how the detections do.

The real issue, and why this is so important, is that detections at scale are themselves very large software products and data pipelines, with a whole bunch of infrastructure dependencies and a whole bunch of things that can go wrong. When you try landing really complex detections that depend on complex infrastructure, there are so many different mistakes that can be made, and the thing that makes it uniquely challenging is that, unlike when you make a mistake writing some endpoint as a web dev, the failures are almost always going to be silent, and you might not know anything is wrong. There's not going to be something throwing 500 errors out to the world, making some user angry enough to tell you. So we really have to take an extra degree of care to make sure all this stuff is always working well.

Let's talk about how we do that purple-teaming-type exercise for C2 detections. The easiest way is to just throw up a real C2 server of some kind, generate an implant the way an actual attacker would, pull that implant down and detonate it on some device within your corporate perimeter such that its traffic will traverse the sensors a real attacker's traffic would traverse, and then generate some more signal: drive some post-exploitation TTPs through that C2 channel the way an attacker would. Then see what comes up, see what logs you get, see what detections fire or don't fire, and act on that basis to continue to refine the model and harden those detections.

I want to shout out the good folks at Bishop Fox for making Sliver; I believe they're also sponsoring this conference. We love Sliver. It's really good for this type of use case because it's written in Go, so it can compile implants very easily for any platform, and we can drop them in all different places within our environment and test the sensors along all those different paths. It also has a nice API with a Go client that one can just use, so you can write Go to automate very complex attack scenarios. It's fully open source, so you can spin up as many of these as you want with no licensing headaches, and it's real smooth to get set up and use. So shout out to them.
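As a rough illustration of what that kind of automation can look like, here is a hedged Python outline of one ephemeral test run: stand up the C2 resources with Terraform, detonate, check whether the detection fired, and always tear down. The Terraform CLI flags shown are real, but `run_c2_test`, `detonate`, and `alerts_fired` are hypothetical names standing in for whatever tooling you have; this is a sketch, not our actual harness.

```python
# Hypothetical sketch of an ephemeral C2 test loop. Assumes a Terraform config
# that stands up the C2 server, domain, and certs; detonate() and
# alerts_fired() are placeholders for your own implant and alerting tooling.
import subprocess

def run_c2_test(tf_dir, detonate, alerts_fired, runner=subprocess.run):
    """Spin up ephemeral C2 infra, drive the TTPs, check detections, tear down."""
    commands = [
        ["terraform", f"-chdir={tf_dir}", "init", "-input=false"],
        ["terraform", f"-chdir={tf_dir}", "apply", "-auto-approve"],
    ]
    teardown = ["terraform", f"-chdir={tf_dir}", "destroy", "-auto-approve"]
    try:
        for cmd in commands:
            runner(cmd, check=True)
        detonate()              # e.g. drop and run an implant on a test host
        return alerts_fired()   # did the detection pipeline actually fire?
    finally:
        runner(teardown, check=True)  # no long-lived C2 footprint left behind

# Dry run with a fake runner, so nothing is actually provisioned:
log = []
result = run_c2_test(
    "c2-infra",
    detonate=lambda: log.append("detonated"),
    alerts_fired=lambda: True,
    runner=lambda cmd, check: log.append(cmd[2]),  # record just the subcommand
)
```

The `finally` block is the important design choice: teardown runs even when the test fails, which is what keeps the infrastructure genuinely ephemeral.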
We really like this, and we use it for a whole bunch of stuff. What this looks like in practice is usually just a couple of cloud resources: you spin up whatever you need for that particular type of C2 transport, so maybe just a domain, some TLS certs, and then the C2 server itself. All of this can be automated really efficiently with cloud-native infrastructure-as-code tooling: Terraform, and we like Terragrunt, which is a little extra on top of Terraform; it's really cool and y'all should check it out. We've been building up a whole bunch of code around this to basically turn it into a single command: you run one command, all this stuff comes up, and you do the test.

I want to call out specifically why we're using the word "ephemeral" here. The nice thing about using this type of cloud-native automation to drive the tests for these detections is that the machine you spin up comes from an AMI, a virtual machine image in the cloud, that is constantly rebuilt and fully patched; it does what it's supposed to do and gets spun down. There's no lingering footprint of a C2 server connected somehow to your internal network that you then have the sometimes strenuous responsibility of keeping patched and legit. It comes up, does what it's supposed to do, and comes down: low operational overhead for the team, so you can focus on the goal of securing the company.

We've been building this out for quite some time, and the vision is really to get everything automated end to end: the C2 tests get spun up in the cloud, you drive the TTPs as the attacker would, you get that signal, and then you check that the alert actually came up. That's what gives you the confidence to say my detections are reliable, nobody broke them by pushing some bad code that dropped some field and destroyed the whole pipeline, we are consistently protected from a wide variety of threats, and we can speak to how that was rigorously validated. We're basically involved in building out all these different arrows right now, and this is what's going to give us that confidence. We're in the process of getting this stuff open sourced, hopefully, so please stay tuned; we do want to talk more about this in the near future. Now I'm going to hand it back over to Mike, who's going to tell you, on the detection side, all the cool things we've
learned by repeatedly going through this process of rigorous testing.

Cool, yeah, let's look at some data and see what this actually looks like. I want to remind you what our very obfuscated and not-real query looks like; the idea is that when we're targeting this SSL beaconing, it's probably somewhere between a 100- and 200-line SQL query to get all the numbers right where we need them. I want to show you the first bug we encountered when running the system, so keep this in the front of your mind, and let me show you the block of code that was broken. These four lines, which just specify which perimeter we're looking at and what the hostname is, are what broke on our very first test. I spent about an hour digging through the logs trying to figure out why the hell this thing didn't fire. It turns out that our attribution pipeline, which maps IP addresses to hostnames, failed in this very specific instance for whatever reason, very silently, just as Sam said it would. So in this case I completely missed the traffic, and all I needed to do was remove this "hostname is not null" line to see it. This is an example of an extremely small thing that you might not ever think to test, and yet it can have a massive impact on your ability to detect these sorts of things.

The second one goes back to loud corporate networks: lots of noise for every detector. Once I did remove that parameter, we saw 68 results, and it was cool to see Sam Manzer's laptop in there, proof that the logic itself actually works. But in addition to writing the detections, I also hand alerts off to the on-call on my team before we escalate further, and if I hand that person 68 alerts every day, he's going to think I'm a huge pain in the ass and it's not going to be a good time for us. So it's really our goal to wean this number down as
small as we possibly can, and to do that we need to come up with clever patterns to rule out the obvious. The first one I identified after digging through this data was: holy cow, look at how many events we saw going to workwithincreasedintensity.com from Sam's laptop over that week. Zero, zero, zero, then all of a sudden a lot, and then zero again. So I'm thinking, okay, that's probably a reproducible pattern: if somebody's laptop gets popped inside our corporate environment, it will quite suddenly begin making relatively high volumes of queries to some domain we're unfamiliar with, so we can probably incorporate that logic and combine it with the beaconing logic to see if it gets us anywhere. Looking at the spikes alone, just the number of events going out to SSL domains, was also quite loud. To give you a little example, this list gets much longer, but as you can see, workwithincreasedintensity was near the top, not quite at the top, but still. We're thinking that combining this with the beaconing logic itself should whittle our results down to as few as possible; it's very unlikely that drivesafemadison.com, while very new and random that day, is also exhibiting the repetitive, periodic beaconing traffic we're looking for. So that's exactly what I did: I took that big long query and added this little clause to say, okay, we want everything that's actually beaconing out, and we want things that are not high prevalence, meaning no more than, in this case, 10 clients connecting to it. It's likely going to be much less, but that gives us some breathing room. So this is a third parameter we can add: now we have the beaconing logic itself, the sudden spike in traffic outgoing to an external endpoint, and low prevalence, very few people connecting to it.
Putting together all three of those pieces into the same query got us down to eight results, which is awesome: all of a sudden we dropped from 68, I think, or 67, to eight. That's still not quite perfect; ideally I only want Sam Manzer's result in there, so I can hand off a single true positive, but that's not the ideal world we live in, and we continue to iterate. The point I really want to get across is that the more effort we put into this repetitive tuning and validation process using live emulated attacker data, the closer we can get to these super-high-fidelity detectors relying only on network data.

Something particularly cool about using this approach to catch attacker behavior is that it's completely generalizable. What I mean by that is that periodic beaconing looks the same whether it's Sliver or Mythic or, I don't know, Sam would have to give me the long list of commodity malware they have at their disposal, but it all looks the same to me: periodic beaconing over the same protocol to the same random, new endpoints. By putting all of this logic together we end up with these high-fidelity detectors that work across the whole fleet, independent of the device that's popped and independent of the malware being used.

And here's what really excites me about this system; it's sort of what I just explained, but as we were working through this process and iterating on the detection development, I essentially had a big table with all of the parameters we're using to tune the detector and a list of trials, one, two, three, four, five, and so on, with the parameter values used in each. You can actually calculate: given these parameters, how many true positives were generated, i.e., did we catch Sam's traffic, and how many false positives were generated, all the rest of it, all the noise. What you're left with is a fairly simple optimization problem: how can we minimize the number of output results given the parameters we have? With this fully automated system, this end-to-end pipeline of emulated attacker traffic into my defensive detection framework, our detectors can essentially become self-tuning, with base logic and a well-defined set of tunable parameters. The entire process can fade into the background, letting us detection engineers focus on the cool stuff: developing new detection ideas, working on new queries, and adding them to this test queue. So we're really excited to continue iterating on this and getting there.

That's most of the meat of the talk. I
just want to leave you with a couple of bits of food for thought. The first one I really want to drive home, as the network threat detection guy: network telemetry is awesome. It adds a lot of value to proactive detections; don't write it off just because things are encrypted or it's a little hard to get the passive tap set up. We really have seen a lot of success, and oftentimes we beat the application- and host-level logging folks to the punch. Second, and I think this is fairly obvious at this point: these detectors must be continuously validated, over and over again. Network behaviors shift and the entire environment is incredibly dynamic, so continuously testing these detectors is absolutely essential, so you can keep tuning parameters and have the tightest possible detection. Third: because all of this can be automated, it should be, again to allow all of us security engineers to do the cool, exciting, and important work of advancing the state of what we can actually detect, working on new query logic, new pipeline systems, and so on and so forth. That's about it for our talk. I really appreciate you all coming out, and I guess we can open up the room for questions.
So when you say it should be automated, is some kind of machine learning approach required to figure out how the constantly changing parameters should be set?

Yeah, I think what I realized going through this process, on that last slide where I was defining the table and working through the optimization problem, was this moment of: I think I'm kind of just doing machine learning by hand right now. The term itself gets a lot of hype, introducing AI into these systems, but it really is just statistical minimization and optimization. So yes, when this thing is fully automated we could probably call it machine learning to some degree, but it really is just solving that same problem: minimize the number of alerts given the parameters we're providing.
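What that answer describes, tuning by hand, is essentially a small grid search: among the parameter combinations that still catch the known emulated-attacker traffic, pick the one producing the fewest alerts. A toy Python sketch, where the detector and its numbers are made up purely for illustration:

```python
# Grid-search sketch of the "machine learning by hand" tuning loop:
# minimize alert volume subject to still catching the emulated attack.
from itertools import product

def tune(run_detector, param_grid):
    """run_detector(params) -> (caught_emulated_attack, n_alerts).
    Return (params, n_alerts) with the fewest alerts among catching combos."""
    best = None
    for combo in product(*param_grid.values()):
        params = dict(zip(param_grid.keys(), combo))
        caught, n_alerts = run_detector(params)
        if caught and (best is None or n_alerts < best[1]):
            best = (params, n_alerts)
    return best

# Toy detector: stricter thresholds cut noise, but too strict misses the implant.
def toy_detector(p):
    noise = max(0, 68 - 10 * p["min_hits"] - 20 * p["max_cv"] * 10)
    caught = p["min_hits"] <= 5 and p["max_cv"] >= 0.05
    return caught, int(noise) + (1 if caught else 0)

grid = {"min_hits": [1, 3, 5, 7], "max_cv": [0.05, 0.1, 0.2]}
best_params, alerts = tune(toy_detector, grid)
print(best_params, alerts)  # -> {'min_hits': 3, 'max_cv': 0.2} 1
```

With real detections, `run_detector` would be one full emulated-attack test run per combination, which is exactly why the end-to-end automation matters: the grid search is only feasible when each trial is cheap and repeatable.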
Did you… I did not; I inherited the tech stack that we have, but we'll check that out, sounds interesting.
I wanted to shout out one thing real quick before we hit the next questions. We emphasize automation here a lot, and there's all kinds of cool stuff one can do with that; I'd love to talk about it with anybody who's interested. But I did want to add one thing: if y'all don't have a lot of folks in your org who sling Terraform or do these other things, just spin up a cloud instance and give this a try manually. You get a lot of value, and that first test will tell you a lot.
Yeah, so we have, I don't know the exact number, but quite a few dozen bare-metal hosts that run both Zeek and Suricata. The proactive detections I'm talking about exclusively query the Zeek data, but we also compile in the Suricata data for more signature-based stuff. Ideally we put all the pieces of the puzzle together and query it all together, but we've had pretty good success just ingesting the Zeek data alone. As for being in the middle: it's actually just an optical tap, so the light going through the wire gets split and we see everything as it should be. Yep.
We were hoping, well, everyone else was hoping, that only the sims were genuine; I'm always hoping for a little bit more fun. With all the queries in that specific case, everything else seemed benign, though.
Yeah, they will if we have anything to say about it. I think you're absolutely right: a lot of the attack simulation stuff out there is very Windows-endpoint focused, and it's really good at that; that's really great work those folks have done. Right now it's kind of an exciting time for this area, because with these purple-team attack simulation efforts that we're doing, and that a lot of good folks at other companies are doing too, we're starting to push into these other realms, and network is definitely a big focus for us that we want to continue.
Cool, all right, that looks like all of them. Thank you, folks.