
Will Schroeder & Max Harley - Meet Your Nemesis: Fighting Data With Data

BSides Augusta · 2023 · 57:51 · 239 views · Published 2023-10 · Watch on YouTube ↗
Speakers: Will Schroeder, Max Harley
Tags: Category: Technical · Difficulty: Advanced · Team: Red · Style: Talk
About this talk
Nemesis is an offensive data analysis and enrichment platform designed to unify post-exploitation data across multiple C2 frameworks and tools. The system automates operator tasks like vulnerability analysis and encrypted credential cracking, performs complex offline analysis including privilege escalation detection, and enables data-driven research by ingesting raw structured data from engagements rather than compressed tool output. This talk covers the red team challenges that inspired Nemesis, its architecture, collection methodology, and demonstrations of automated analysis pipelines that link related data sources.
Original YouTube description
The offensive industry is about exploring what’s possible. Part of this is observing and taking lessons from other disciplines that have already solved a myriad of related challenges, from proper software engineering practices to using graph theory for offensive problems. But despite various leaps forward over the last several years, the offensive post-exploitation community has yet to fully embrace data analysis and enrichment pipelines beyond basic log aggregation and searching. If offensive tools were structured for automated processing instead of solely human consumption, we could unify post-ex data to exploit the known (and unknown) relationships within the data our offensive tools emit. Imagine a system that could ingest data from any C2 framework or post-ex tool, and could not just automate common operator tasks like binary analysis for known vulnerabilities and hash extraction and cracking of encrypted documents, but could perform complex offline analysis like host privilege escalation. If we could unify all post-exploitation data from offensive engagements we could improve operator workflows, provide tradecraft assistance, facilitate automation of onerous tasks, and uncover new data-driven research opportunities. A year ago, our team embarked on the development of just such a system, and we are excited to introduce the result of our effort: Nemesis. This presentation will start by detailing the various red team challenges regarding data, leading into how this influenced Nemesis’ architectural decisions and design. Along the way we’ll cover various time-saving automations Nemesis can perform along with offensive data enrichments and analytics the engine can produce. This is the start of a true universal operator assistance platform, with operator guidance contextualized by data as it comes into command and control platforms. 
Beyond this, Nemesis will enable the emerging discipline of offensive data analysis, which we hope will unlock possibilities we can’t even imagine.
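The description above centers on ingesting raw structured data from C2 frameworks into a single pipeline. As a purely illustrative sketch of what a minimal connector submission could look like, the endpoint path, field names, and metadata schema below are invented for illustration and are not Nemesis's real API; consult the project's documentation for the actual interface:

```python
import json
import urllib.request

# Hypothetical sketch of a Nemesis-style connector submission. The talk
# describes integration as one or two POST requests (structured data,
# files); everything here (endpoint, schema, field names) is invented.
def build_data_request(api_base, records):
    payload = {
        "metadata": {               # provenance tags described in the talk
            "agent_id": "agent-1337",
            "project": "ASSESS-2023",
            "timestamp": "2023-10-01T12:00:00Z",
            "expiration": "2024-01-01T00:00:00Z",
        },
        "data": records,            # e.g. process listings, services, file info
    }
    return urllib.request.Request(
        f"{api_base}/data",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_data_request(
    "http://nemesis.local/api",
    [{"type": "process", "name": "lsass.exe", "pid": 708}],
)
print(req.full_url, req.get_method())  # → http://nemesis.local/api/data POST
```

Only the request object is built here; a real connector would send it with `urllib.request.urlopen` and would upload raw files through a second, file-oriented endpoint.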
Transcript [en]

We've got a couple of guys here; the title of their talk is "Meet Your Nemesis: Fighting Data With Data." We've got Will Schroeder and Maxwell Harley; please welcome them and enjoy the talk. All right, how's everyone doing? Having a good con, hopefully. We're really excited to talk about Nemesis. It's a project we've been working on for close to a year now; a lot of blood, sweat, and tears have gone into it. We think it's pretty cool. We'll cover the agenda here in a second, but the TL;DR with Nemesis: it's an offensive data analysis and enrichment platform. It does a whole bunch of really cool things that we're

going to get into, and we'll go ahead and kick everything off.

Cool. So for the agenda today, we're going to start off with some introductions and talk about some existing red team challenges we currently face. We work at SpecterOps; we have a fairly large red team, so we're going to talk about the challenges we have, and about the goals and vision for the Nemesis platform, trying to solve some of the issues we currently see in red teaming. We're going to talk about Nemesis and how it's architected, and we're going to talk about collection,

data collection and data analysis, do some demos, talk about why it matters, and then end with some questions. And we do have some cool giveaways at the end for the most interesting couple of questions people might have. So, who are we? I'm Will Schroeder; I go by harmj0y on GitHub, Twitter/X, and all the other social media. I'm a researcher at SpecterOps and I've been with the company since the beginning. I've talked at a lot of conferences over the years and trained in a lot of different places. I'm also a pretty avid open source developer, for offensive tools specifically: I was the co-developer of PowerShell Empire and

PowerSploit, which I worked on for a good period of time, specifically PowerView and PowerUp, if anyone has used those. I was one of the co-founders and initial developers of the BloodHound project, and I built a lot of the tools in GhostPack, which is a collection of C# offensive-oriented tools. One of the more recent things I did with workmate Lee Christensen, whose bio is up there (he's not here today, but he is one of the other core developers on Nemesis), was the Active Directory Certificate Services "Certified Pre-Owned" research from a couple of years ago, with all the fun ADCS pwnage type stuff. And I'm Max Harley; my CV is not

nearly as extensive as Will's, but I do offensive security consulting, I wrote some red team infrastructure automation for SpecterOps, and I do a lot of software engineering. I've developed a few projects: Satellite, a JA3 transport, GoPhish Notifier, and Clemson Grass. So go Tigers; we're playing Wake Forest today, I think, so hopefully we win. And for people not familiar with SpecterOps: we're a consultancy (we also have a commercial tool with BloodHound Enterprise and such), and we've been doing red team and blue team engagements for a number of years. We've probably done hundreds and hundreds of engagements now, across both pentesting and red teaming, so a lot

of the challenges that we've seen over the years on some of those engagements is really what inspired us to start building the Nemesis platform. Yeah, so here are some of the challenges we face. The first, especially with a large team, is tradecraft: trying to teach new people tradecraft they might not be familiar with. Tradecraft is hard to scale. Every operator has their own skill set; we have a lot of people on the team who are really good at things like SCCM abuse, whereas other people are not as good. So how do we take that knowledge? It's a

difficult problem: how do you take that knowledge and then enable the other operators on your team to perform that tradecraft? And when new tradecraft comes out, maybe not even things that you've researched, how do you scale that? How do you teach people that information? Another problem we see is offensive data not being unified. If you look at the C2 platforms on GitHub, the list is like 500; there are a million C2s out there, and there's a whole C2 Matrix that has all the different

ones. The problem is those C2s silo their data: their logs are all stored on one platform. But we want a unified view if we're using multiple C2s or multiple tools, or maybe proxying tooling in; we want a unified view of what our current operating environment looks like, and we don't have one singular view for that. For example, on most engagements we do, we almost always use multiple agents or platforms, whether it's Mythic, Cobalt Strike, or something custom in-house. Ideally, if I'm listing a directory on one platform, I don't want

to have to list that on every other C2 platform as well; if you already have that data, it should be unified in that one view. But nothing currently does this; everything's siloed, whether it's Metasploit, Cobalt Strike, Brute Ratel, Stage1, whatever else. Everything is its own little silo of info. Another challenge, an extremely difficult one, is file and tool output triage. Just doing file share triage on a host takes the most time: you finally get an agent and you're looking through all these individual files on the system. It's like a 70, 80, 90% success

rate of finding important information on a host on that network, or maybe passwords in a .docx or PDF or something like that. And it's just annoying, tedious, and inconsistent, especially for certain hosts: if you log on to an IT infrastructure engineer's machine, you can look at all of the hosts on the system, but maybe that's not actually useful for you until three weeks later. You might see something about a host that's super important, but you don't realize it's important until three weeks later. And if you're triaging files how a lot of teams

currently do, you're downloading everything, syncing it to your host, and then individual operators are just going through and reading documents, maybe making notes in some kind of unified note-sharing app. But this is definitely inefficient; a lot of data ends up either being lost or overlooked, because one person manually can't go through the amount of data that usually gets downloaded. So it's definitely a big problem; we view it as one of the really big meta-problems for red teaming that we've run across. And when it's super successful, we want to be able to

do it at scale; we have to do it every time, but it's very tedious, and there hasn't been a way to automate or augment some of that as of this point. And we're red teamers: we like building tools that help make our lives easier, so that's one of the things Nemesis aims to do. Yeah, so one of our long-term goals for Nemesis is to empower analytic capabilities. We want to model things offline, in an off-network model, having these analytics and this data in some sort of data warehouse of our own, our own data

area. We want to model things that are beyond Active Directory. BloodHound does a really great job at modeling Active Directory; what if we were able to do that for the entire system, the whole host, the whole network? We also want workflow-specific triage functionality and operator assistance. We have all these workflows; think of something like Active Directory Certificate Services (ADCS): there's a very specific workflow where we analyze the templates, see if we have enrollment rights, check for a few parameters, and maybe we can enroll in a certificate.
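As a toy illustration of the kind of template check such a workflow could automate, here is a hedged sketch of an ESC1-style test (enrollee-supplied subject, no manager approval, low-privileged enrollment rights, and an EKU usable for client authentication). The record field names are invented for illustration and are not the schema of Nemesis or any real collector:

```python
# Hypothetical, simplified ADCS template triage: flags an ESC1-style
# misconfiguration. Field names are invented; real collectors such as
# Certify/Certipy expose far richer template data.
CLIENT_AUTH_EKUS = {
    "1.3.6.1.5.5.7.3.2",       # Client Authentication
    "1.3.6.1.5.2.3.4",         # PKINIT Client Authentication
    "1.3.6.1.4.1.311.20.2.2",  # Smart Card Logon
}

def esc1_candidate(template: dict) -> bool:
    ekus = set(template.get("ekus", []))
    # an empty EKU list means the certificate is valid for any purpose
    client_auth = not ekus or bool(ekus & CLIENT_AUTH_EKUS)
    return bool(
        template.get("enrollee_supplies_subject", False)
        and not template.get("manager_approval", False)
        and template.get("low_priv_can_enroll", False)
        and client_auth
    )

print(esc1_candidate({
    "ekus": ["1.3.6.1.5.5.7.3.2"],
    "enrollee_supplies_subject": True,
    "manager_approval": False,
    "low_priv_can_enroll": True,
}))  # → True
```

The point isn't the specific check; it's that once the template data is structured, this kind of level-one analysis can run automatically on every template that comes in.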

There's a very specific workflow there that we might be able to automate, or provide operator assistance for. And this is something where we think there's going to start being an emerging discipline of offensive data analysis. We've been doing this on the defensive side forever: people get structured data and do their analyses; you have your antivirus tools, your EDRs, your Elasticsearch, all those types of things. We want to take some of those lessons from the defensive side and start applying them to offensive data analytics as well. So, workflows: that's not necessarily the sexiest thing when you just describe it like that, but it's

trying to think about how operators actually digest information and get to the next step in an engagement, and what things we can do to help facilitate that. Another one of our long-term goals is to enhance knowledge sharing, coordination, and reporting. This will be something like suggesting actions or some sort of analytic technique for operators. You can think of it like an evil Clippy that talks to you and says, "Hey, I see you're trying to escalate privileges. Have you tried gathering this type of data? Have you tried analyzing the registry this way?" So that's kind of how we think

about it: an evil Clippy. One canonical example we like giving: say you download some kind of AWS or Azure token from a host. We would ideally want a system that detects what type of file that is, whether the token is valid, what service it's for, and whether it's still usable, and then ideally actually links you to our playbook for additional tools that can abuse this type of thing. This idea of contextualizing tradecraft with data from the op itself is something we're trying to move towards. We're not quite

there yet with Nemesis, but we're building the platform that will allow us to do this. Contextualizing all of your runbooks and all the knowledge your team builds up with data from an engagement is a way you can actually scale tradecraft in a pretty automated way. Additionally, you can track the progression of an engagement, and all that kind of reporting: not just "these are the hosts we touched," and not just for deconfliction, but, you know,

think about code coverage, but for an offensive engagement. Did you check all these particular types of things? Instead of just having a checklist people click through, you'd have an automated system that can track the progress of an engagement over time. Ideally, we want to use this data to gain insights into operators' decision-making processes, and then in the future use that as a feedback loop to improve some of these workflows. And one of the other big long-term goals is that we want to enable research. We've tested

hundreds and hundreds and hundreds of different networks, but all this data and these lessons eventually get lost, because most of the output tends to be unstructured, text-based output from different tools as things come in. If we had all of that structured in a place where we could actually run analytics and find trends in the data, we could derive a lot of ideas for future research, whether it's vulnerability research on different drivers or binaries that get pulled down, or potential new attack paths that

might emerge on a host, either for escalation or even Active Directory attack paths and things like that. We just don't currently have the ability to get the data into a form where we can do that type of analysis. So this is part of the reason we wanted to build the structured data enrichment engine we're going to go into in a lot of detail: so we could start building offensive data sets for future research, statistical modeling, and any opportunities for machine learning on this type of stuff. So, the vision for Nemesis, the elevator pitch. I've been talking to a

lot of people in the hallway about this, but one way to frame it is that it's a centralized data processing platform that ingests, enriches, and performs analytics on offensive security assessment data. Another way to look at it: in one way it's almost like VirusTotal for offensive engagements. You submit a binary to VirusTotal and it runs all these different analyses with all the different engines, breaks stuff down, runs Yara signatures and all that. Imagine that same set of repeatable, automated enrichment actions, but with an offensive focus. Someone else I was talking to said, "Oh, this is almost like

an intelligence platform for red team operations," which is kind of what we're building. And one other way to think about it: like I mentioned before, we've been collecting this type of structured data on the blue team side for years, so we can understand adversaries in environments. We want to take that same idea of collecting all the data from a host during a red team engagement and use it to help guide operators going forward. So that's a couple of different ways of looking at it. I know some people might still be asking, "Okay, what the heck actually is this thing?" So we have a nice diagram, and before anyone

groans: yes, it is Kubernetes. There is a reason we use Kubernetes, and I already hear some people snickering. The main reason we chose Kubernetes as the core infrastructure is that we wanted something that could run locally on an individual host, but also something that could scale to lots of engagements. You could run one instance of Nemesis for each engagement you're doing, or you could run one instance of Nemesis and have every C2 from every engagement feeding into the platform. Starting on the left, we have connectors for different C2 platforms, which Max is going to talk about in a few slides. We have things for OST Stage1,

and we have a connector for Cobalt Strike, which is one of the main C2 platforms we tend to use. We have a connector for Mythic, which is another C2 platform; it's a C2 front end, for people who aren't familiar with it, that different agents can utilize. We develop it in-house, and it's all open source as well. We have a connector for Metasploit, all that kind of fun stuff. As data comes in from our different C2 agents, the connectors take that data and POST it to a web API endpoint, the Nemesis web API. This is how the data gets stuffed in and starts in the enrichment pipeline. For example, as

you download a file through Cobalt Strike, you don't have to do anything; it'll just automatically shove the file into Nemesis for enrichment, and we'll show this in some of the demos. For file processing itself, data is stored either in S3 (encrypted with KMS, all that kind of fun stuff) or in MinIO, which is S3-compatible storage running locally in the Kubernetes cluster itself. Then the enrichment pipeline starts to kick off. We use RabbitMQ as the queuing system on the back end, so as data comes in you have different message types, and we have little microservices and containers that will do enrichments, which we'll go into in detail here in a

second. Some example enrichments: for anything that plain text can be extracted from, so for any Office document type, we use Apache Tika to extract all the plain text. This eventually gets indexed and analyzed, so we basically have a text-based search engine, and also a semantic search engine, for every document you download on an engagement; that's just one type of thing it does. We use Gotenberg to convert any Office document to a PDF that can be rendered right in the browser, so you don't have to download all these files, with all your open offices and these

different programs, and open them up separately; you can just browse the file details and visualize the PDF, whatever converted format it is. We also have things like canary detection: we convert everything to PDF in a non-networked container with no access to the internet, so canaries aren't going to call out, at least for Office documents. We have some basic ML stuff with TensorFlow that tries to detect whether any of the text looks like a password. We do different types of regex checking for known text-based patterns using something called Nosey Parker, which is

like a regex engine with an offensive focus. We have natural language processing stuff for semantic search, which I'll talk about in a bit, and a whole container for .NET processing. Lots of different little things: anything you would want to repeatedly do to a type of file that comes in, you can build a microservice for, and that will happen every single time you ever download that type of file. After these enrichments happen, the data is stored in a semi-structured form in Elastic, with a Kibana instance so you can do any of your flexible searching, and it's also stored in a highly structured form in PostgreSQL. I'll get into some details

here in a second, but data comes in, goes through a bunch of enrichments, and gets put into different forms to consume at the end. And we tried to do our best to build this properly: we have Fluentd for logging, Prometheus for metrics, and Grafana, and we also have Slack alerting, so if certain interesting things happen in files or data, you actually get alerts in Slack. Oh, I forgot to mention we have a password cracking container: if hashes are extracted, it'll automatically try around 20,000 common passwords and then alert you if anything gets cracked. All these different fun things. This is the core infrastructure: lots of different moving pieces, but as an

operator you don't have to know how these pieces work. You just have your connector hooked up, you have a dashboard that we're going to show, data gets stuffed in, all the magic happens, and you see the rendered information in the dashboard at the end. Yeah, and one thing to note is that this password cracker is a pretty simple service. The whole point is that we have a ton of different microservices, and we're trying to enable you to write your own. So you can imagine a more powerful password cracking container here, where you have your own custom password cracking infrastructure.
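To make that concrete, here is a toy sketch of the core of such a cracking service. The wordlist and MD5-only handling are simplifications made up for this example; the real container tries roughly 20,000 common passwords across real extracted hash formats:

```python
import hashlib

# Toy stand-in for a password-cracking microservice: given an extracted
# hash, try each candidate from a small common-password list. A real
# service would handle many hash formats (NTLM, document hashes, etc.)
# and fire a Slack alert on success; this sketch handles bare MD5 only.
COMMON_PASSWORDS = ["123456", "password", "Summer2023!", "letmein"]

def crack_md5(target_hash: str, wordlist=COMMON_PASSWORDS):
    for candidate in wordlist:
        if hashlib.md5(candidate.encode()).hexdigest() == target_hash:
            return candidate  # would trigger an alert in the real pipeline
    return None

# the hash of "Summer2023!" is recovered from the toy wordlist
print(crack_md5(hashlib.md5(b"Summer2023!").hexdigest()))  # → Summer2023!
```

Because the service just consumes hash messages off the queue, swapping this loop for a call out to dedicated cracking infrastructure doesn't change anything upstream.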

For any hash that comes in, you can write your own microservice to send it to your own custom infrastructure. This is all plug-and-play: if you have workflows you want to add, hopefully this microservice structure gives you the ability to do that. Yeah, and we have lots of examples and a good amount of documentation right now. Nemesis is in what we're calling an alpha state; we're finishing up some of the back-end modeling, and we're going to have a 1.0 release before the end of the year. But it's definitely a lot of moving pieces, so it's been interesting to

actually build, and part of the reason we have all the Fluentd logging and that kind of stuff is that debugging something of this scale and complexity isn't always the easiest thing. We've been developing this for a while, so to ensure that we didn't go crazy, we tried to build in a lot of things that help you troubleshoot as much as you can. So, an example enrichment flow: let's say you download a file that ends up being a .NET assembly. Nemesis will first detect that it is a .NET assembly, then it'll go through a number of .NET-specific enrichments. For example, it's going to pull any standard metadata out of the

file: version info, imports, TypeRefs, signatures, any of that kind of stuff. Again, this gets put into Elastic, so you can search across all binaries or all assemblies you've downloaded for the entire engagement. And one of the really cool things: a common class of vulnerabilities that we tend to find on engagements is deserialization attacks against .NET assemblies, for anyone that's into BinaryFormatter and all those different types of things. It's a quick, easy win, specifically for a lot of in-house-developed .NET apps; they tend to have these code execution vulnerabilities that you can use for

privesc or lateral movement or whatever else. So we have code that will check every single .NET assembly for any of these common deserialization bugs, and that will be alerted to you in Slack, it'll show up in the dashboard, and it's searchable across everything. We also have the ability to check for command injection. That's not every single vuln that could possibly show up in a .NET assembly, but it's the level-one stuff we tend to do every single time we check a .NET file. We also decompile all .NET assemblies to source code and zip it up, and we'll actually show this in one of the demos. So again, this isn't magic; this is

stuff that we all know how to do, but having it automated, so that within 10 seconds of downloading a .NET assembly all of this is done and alerted to you, is definitely going to save you a lot of time. Then, after all those specific enrichments happen, everything is stored in PostgreSQL and Elastic; you can search the Elastic data through Kibana, and you can interact with the structured data through the dashboard. Some other example enrichments we have, besides just the file processing: we have a huge amount of stuff surrounding DPAPI, the Data Protection API for Windows. We'll show some forwards and backwards decryption with that here

in one of the demos. We have things like a process and service categorizer. There are plugins for things like Cobalt Strike and Metasploit that will highlight "oh, this is AV" or "this is a remote access tool" or whatever else; this does it for process data coming in from any C2. It's universal across everything, because Nemesis isn't aware of any specific C2 platform: all the data is just shoved into the front-end API, and it doesn't necessarily care exactly where it came from. Also, one thing I forgot to mention, for anyone that actually cares about data provenance type stuff:

every single piece of data that's ingested is tagged with the time it was ingested, the operator who was running the C2, an identifier for the agent itself, and an expiration date. We actually have a back-end service that runs several times a day and checks whether any of the data is past its expiration date; if so, it gets expunged. For us, being professionals, the way we structure a lot of our contracts, we have certain retention windows for different types of data on engagements, so you can set that and not have to worry about data retention. We also have a Yara API that Cobalt Strike or Mythic or

whatever can plug into, to run a common set of Yara rules across any file before it's uploaded to, or actually run on, the host. We have a whole bunch of stuff around Chromium cookies and logins; we'll show that in some of the demos. So, lots of fun stuff. And this is just an example of some of the enriched file data. The text is a little small, but you have your assembly info, company names, file descriptions, file versions, and all that, again for every single assembly that's ever run through this. You have all your standard hashes for everything, you have magic types already extracted, and you have whether it's plain text, whether it's

binary, or whether it's Office; it'll have links to PDFs if it gets converted to PDF, and all that kind of fun stuff. And here, all the way on the right, are some example deserialization results: if there are actually vulnerable gadget calls within the .NET assembly itself, that gets flagged. This is what we mean by structured data in the end; this is essentially a JSON dump of what shows up in Elasticsearch. All right: collection. Yeah, so we've talked about Nemesis itself; we're now going to talk about the data collection, before it hits Nemesis, and then the analytics Nemesis is

actually performing, in just a second. Here are some of the challenges we face with data collection. Number one is batched versus incremental ingestion. We're pulling data from a lot of different sources. First, to explain, there's this idea of batch collection. Think about how BloodHound works: technically it can do incremental ingestion, but normally when you use it, you collect all of the data up front. It gives you the zip file, and then you plug that zip file into BloodHound and it parses all the data. It's one batch;

it's a single batch of data that we process and analyze. In Nemesis it's a bit different: we do this thing called incremental ingestion. We want to perform these analytics in real time, while we're operating, so we can collect data from the C2 right as it gets emitted and perform analysis on it. That presents a really significant issue, because we never know when our data set is complete. Abstractions are built from multiple sources: Windows services, if you're not aware, are derived from registry keys, under HKLM\SYSTEM\CurrentControlSet\Services. Maybe the operator just performed a query for all services that

start with the letter W; that's not representative of all services running on the system, just some subset of sources. So we don't know whether that data is complete or not. We can also get data from different sources: maybe we perform a registry query for those services, or maybe we use the WinSvc Win32 APIs to grab the services from the Service Control Manager itself, or maybe we just grab the registry hive itself and upload that. So there are a number of ways we can collect a

single view, like services, and we have to be able to incrementally ingest that data without knowing it's complete; that provides a bit of a challenge. We also have, in the collection methodology we currently typically use, the problem of information compression. Historically, offensive tools do a lot of data compression. Think about the tool PowerUp: PowerUp is extremely powerful; it gives you the ability to find privilege escalations on your system. Under the hood, what it's doing for something like service abuse is navigating through those registry keys,

those service registry keys, and then only selecting and returning the services that are vulnerable. There's information compression happening: these services are being queried but never returned to the user. They just see a small subset, the services that are defined as vulnerable. But we may want to go back and see what services are actually running; that tool performed the analysis but didn't return all of the results it had for services. And this definitely comes into play if you discover a new attack path, whether it's on a host for privesc or whether

it's in a network, or whatever else: if your tool did all the analysis on the host itself instead of returning the raw data, you don't have the ability to retroactively run that new analysis over the same raw data. The tool compresses everything down and only returns the results it deemed interesting, usually in plain text rather than a structured format, and we can't retroactively do this type of analysis. Exactly — and we do want this data for research, and we may be interested in it during the assessment. It's just a different way to think about data collection on an

assessment: we want to return raw data and then process it in the back end, instead of processing it on the host. So how do you get data into Nemesis? Our goal with collection — with integrating tools — is to obtain information from all tooling in the assessment, because we eventually want to provide one unified view of the data. It's super simple to integrate with Nemesis: there are two POST requests you have to make. One POST request is for submitting structured data, the other is for submitting files. So if you have a tool and you want to

integrate it into Nemesis, you have to write one, maybe two, POST requests. Ingested data is well structured: we have types for things like file listings, file data, process listings, services, all these kinds of things. If you have a tool that collects data like this, you just have to map your data to our custom format, make that POST request, and you're integrated — your tooling is compatible with Nemesis. We have connectors built for various capabilities, starting with Cobalt Strike.
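Integration really is just those two POST requests. As a rough illustration, here's what a minimal structured-data submission could look like in Python — note that the endpoint path, envelope field names, and message shape here are assumptions for illustration, not Nemesis's actual API schema:

```python
import json
import urllib.request

def build_ingest_message(agent_id, data_type, records):
    """Wrap collected records in a structured envelope.

    NOTE: the field names here are illustrative assumptions, not
    Nemesis's real schema -- consult the project's API docs for the
    actual required metadata."""
    return {
        "metadata": {"agent_id": agent_id, "data_type": data_type},
        "data": records,
    }

def post_json(url, message):
    """The 'structured data' POST -- the first of the two request types.
    (The second, file submission, works the same way with file bytes.)"""
    req = urllib.request.Request(
        url,
        data=json.dumps(message).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # would hit a live instance
        return resp.status

# A tool that collects process listings only needs to shape its output:
msg = build_ingest_message(
    agent_id="agent-1337",
    data_type="process",
    records=[{"name": "lsass.exe", "pid": 712}],
)
# post_json("https://nemesis.example/api/data", msg)  # hypothetical endpoint
```

The point is that the connector side stays tiny: collect, shape, POST.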

We also have Mythic, Metasploit, OST Stage1, and Sliver connectors built out already for you. But of course, if you have your own tooling you'd like to integrate, please feel free to make merge requests to our project — we would love that. We also have a Chrome plugin that we're going to talk about in one of the demos. Okay, and now we go from collection to analysis. Our goal here, after we perform our data collection, is to analyze it automatically. Our first goal is to automate away the level-one analysis, the boring stuff like we talked about. We're starting with file share triage —

that's extremely boring and extremely tedious, and we want to automate away as much of it as possible. This approach permits analyzing relationships between different data sources: like I mentioned, maybe we can query services through either the Service Control Manager or registry keys. We have a number of different data sources that we're unifying into one centralized location to accomplish things we used to have to analyze manually. One of the goals we're trying to reach by the 1.0 release is recreating all the analysis that PowerUp does, but doing it offline on raw data. So instead of having to run

your offensive tool and your offensive code on the host itself, you just need to pull that raw data down, and ideally you should be able to do any of that type of privilege-escalation analysis offline. You don't have to run offensive PowerShell or C# or anything else on the host itself. And we want to be able to provide feedback to operators in the form of suggestions. I talked about an "evil Clippy" — you can think of part of this project like that: giving operators suggestions, or being able to retroactively analyze an

operator's tradecraft and say something like, this operator did really well here and poorly there. We've collected this data, and we can see that they were very stealthy, or they used interesting new tradecraft, or they did something that maybe they shouldn't have. Once we collect this data and analyze it, it gives us a lot of opportunities. Now for some of the actual cool, on-the-ground things it does from the enrichment standpoint. Here are some of the automations we currently have — this is

not an exhaustive list — these aren't the only things it can perform, but they're some of the ones we have written currently, and we'll show demos for a lot of them. We've talked already about .NET assembly decompilation and the automated vulnerability scanning that finds known vulnerabilities. We have things like files being scanned for DPAPI blobs, which get carved and then get that forward-and-backward decryption. Some things we don't show yet: archives are extracted, and all of their contents reprocessed. So if you download a zip file, you don't have to

manually unzip it and submit each individual file to Nemesis for analysis — it'll automatically unarchive it, submit the contents back into that RabbitMQ queue, and reprocess the data. These are just a few of the automations, and we have demos for most of the ones listed here. So, how do people actually interact with Nemesis? Most of the time it's just going to be through the dashboard we're going to show. This is still version one of the dashboard; it's built on a framework called Streamlit, and it's not the most performant thing, but we wanted to start building out some of these workflow-assist type things that

we keep talking about. This is a place to view, triage, and perform analysis of the data in the target environment. This is just showing the number of different processes, files, documents, cookies, logins, and all that — we'll go through these bit by bit. And this is an example of some of the details that get surfaced to a user: when a file like an Office document is downloaded, it gets auto-converted to PDF and displayed at the bottom, and we have the path it came from, the machine it came from, the timestamp, the SHA1, all that kind of fun stuff. Also, see those little

thumbs-up / thumbs-down / question-mark triage buttons in the top right: they take feedback from people on whether a file was interesting, not interesting, or undetermined, and save it. So we're actually able to start building labeled data sets to eventually do statistical or machine-learning types of processing — we're getting our operators to actually label the data. And if multiple people are working in the interface, all of this is synced for everybody, same with comments on files, so someone can note a file wasn't interesting, or had this credential, or whatever else. Again, nothing super crazy, but we've been using

this on a handful of engagements so far, and it's greatly helped speed up the triage of files on hosts. Now for the demos — we've got five, each about one to two minutes. This first one shows the .NET assembly handling we keep talking about. This is Cobalt Strike with the connector already hooked up; we're going to download our sim.exe and — pause that — okay, the file was automatically uploaded to Nemesis. I know the text is a little small, but we actually get feedback from Nemesis saying this has deserialization, here's a link to the file in the dashboard. This gives us

basic information — the SHA1, the file magic type for what the file actually is — and then a handful of things get alerted on, whether it's DPAPI, deserialization, things like that. Then we swap over to the Nemesis interface — this all happens within a couple of seconds; the file has already been processed through the entire pipeline. We go to the file triage view and see that assembly with that nice little deserialization tag. With these little icons we can download the file, view the raw information, or download the extracted source, which we're going to

show here. Again, this is stuff you could do yourself — it's just done for you every single time. What we see here is a toy example of an assembly with a vulnerable deserialization issue: it has a BinaryFormatter over the FileStream that gets read in, so that got alerted on. You get the source immediately — everything gets pulled down without having to manually throw it into dnSpy or something like that. Then there's some more detailed file information — it shows some of the same stuff as the cards — but this lets us actually do that triage work. And this is the raw Elasticsearch dump,

where we see all that metadata, the imports, and all that type of stuff you can start searching across all files in the engagement. Also, anything that gets tagged — say, deserialization — optionally gets shot back out to a Slack webhook if you want. We have a different webhook or channel set up for every engagement we're on, and we start getting alerts coming in as interesting things pop up. That's the first demo; now the Chrome extension. You can see it here — basically, what it does is take downloaded

files and automatically submit them to Nemesis. This is especially useful if you're proxying in a browser: you have an agent on a system, you proxy a browser through it, and anything you download — maybe from an internal Google Drive or SharePoint or any sort of web-based file server — gets automatically grabbed and submitted to Nemesis. This is the configuration: the agent ID for tracking purposes, and then you pretty much just set it up with the Nemesis URL, username, and password (we have password authentication for this). Submit it, and the configuration is saved. Now we can go and download a .doc

file. You can see this plugin will upload it to Nemesis for you, where it's immediately processed by the Nemesis engine. We can see we now have one processed file; when we look at it we see the doc. We're able to redownload the file for other operators if they want to look at it, we can view the file raw — more useful when it's not a doc file — or we can look at it as a PDF. This is extremely useful: instead of having to open Office every single time, if you just

want to see a file — maybe it's from a watering-hole attack or whatever — it'll automatically process it into a PDF that you can view extremely

easily. We can also check the detailed view of this: you can see it'll extract the plain text and give you the detailed file view. There's also a document search built in. Because we extract the plain text out of document files, we can do arbitrary searches across all the files we've downloaded. Here we search for "password" and see that this document has a password string in it. This is definitely useful if you're going after, say, Project X or information about system Y — as you're downloading these huge swaths of things, you basically get a Google-style search engine for

every single Office document you've downloaded. Again, any file that plain text can be extracted from gets indexed, and we have a separate view for just source code. For the semantic search, we actually use a small language model to do sentence embeddings: we chunk everything up and do a semantic, vectorized type of search — a little bit of natural-language-processing stuff as well. Yeah — it's paginated; the question was whether there can be multiple files or multiple hits: yes, it'll return an arbitrary number and you can go page by page. This is just straight text search, but then,

if you click the path, it'll link you to that detailed file view showing the host it came from, when it got downloaded, and all that kind of fun stuff. Okay, the next one is Nosey Parker and hash cracking. We're back in our Cobalt Strike interface, downloading a couple of files: one is an encrypted zip, one is a sample config file with some known token patterns in it. Give it a couple of seconds, everything gets processed, and back in the file triage view we see that this Google config has Nosey Parker results, which means it likely has some kind of password or token or whatever

else contained within it (this is all sample data from the Nosey Parker site). If it's text, we have a Monaco, Visual Studio-style editor at the bottom, so anything like source code gets syntax highlighting and all that fun stuff. We see we have a tag — in the file search you can filter on all the tags and search any which way — and in the Elasticsearch info we see some of these structured Nosey Parker results. We actually have a Nosey Parker view that shows up if there are

results, and it shows: this is the rule, Google OAuth access token; that's the matching text; and here's the context in the document. Nosey Parker is an awesome tool — it has tons of different offensive-focused regex rules built for any number of things; there's an example OpenAI key, for instance. This again gets alerted in Slack as well, so the moment any interesting passwords, tokens, or whatever else get discovered in any file you download, it automatically alerts you, and you can hop back in here. This is the encrypted zip example: it tells you it's encrypted. We support about 20 different encrypted file types, and we're expanding that —

every couple of weeks we add some more. If it's an encrypted doc, encrypted zip, encrypted whatever else, we use the hash-extraction tools from John the Ripper, like office2john, to extract out a hash for that file that can be cracked. That hash gets auto-ingested back into Nemesis and we try the 20,000 most common passwords against it, and — like Max mentioned — you could build a different service that takes that hash and shoves it into a password-cracking queue in the cloud or something like that. So we see our hash: it was

extracted and auto-cracked, and we would have gotten an alert in Slack saying the password was actually cracked: Password123! Again, nothing super crazy you couldn't do yourself, but it just happens every single time without you having to do anything manually. All right, now one of my favorites: the Chrome DPAPI stuff. For people who aren't familiar with DPAPI, the Data Protection API is the supported way to store data encrypted at rest on Windows systems. I don't have time to explain the entire system, but the TL;DR is that you'll have information protected by keys called master keys; these master keys are protected by a user's password and also by a backup key

that exists on the domain controller. In this example we download the "Login Data" file for a Chrome instance — this is where your saved Chrome passwords are stored. Then we go up one level and grab the "Local State" file: the login data is protected with a key stored in the Local State file, that key is protected with master keys, and the master keys are protected with domain DPAPI backup keys. So if we download the Local State, the login file, and all the user master keys, we still can't decrypt those Chrome logins, because the master

keys are still locked up. But we'll show in the interface that everything was detected and everything is linked together. If we open the Chromium view and go to the logins, they're currently all encrypted, so we click the little slider to show the encrypted logins and see that they're not decrypted yet. I have DA in this example, so I run SharpDPAPI to extract the domain DPAPI backup key from the domain controller. That output, coming through the Cobalt Strike plugin, is recognized as a domain DPAPI backup key and is auto-ingested into

Nemesis. That backup key will retroactively decrypt any master keys, those master keys will retroactively decrypt any Local State files, and that will retroactively decrypt any logins. So we have it forward and backward: if we download new Chrome login or cookie files from the same user, those will automatically be decrypted, and because all these pieces are linked, the moment you have all the puzzle pieces, everything just gets ripped apart. Last bit — to show I'm not lying, there's lots of stuff going on in the background, but it all gets retroactively decrypted and we see our decrypted Chrome logins. This works for cookie values as well. We can also show in the

DPAPI view for master keys that we have all these particular master key files, that they're decrypted, and that this one has a linked Local State file — same with any other DPAPI blobs that get auto-carved from files. All right, the last demo. Here is the registry collector: if you're familiar with Beacon Object Files, this is written as a BOF. It collects registry keys from the system and submits them to Nemesis for analysis. It's pretty simple to use — we run the collection BOF, specifying the hive and the target path we want to collect from. We're collecting from CurrentControl

Set\Services — I've mentioned this path before; it's the services database on Windows. Again, we're just collecting the raw registry values here. It uploads this file, and it gets uploaded to Nemesis — and now we're showing off Postgres. We talked about how this stuff gets added to Elasticsearch; it also gets put into a database in a more structured format beyond raw Elasticsearch. So what we can do here is select all the keys from the registry

table, and when this query runs you can see that all of the registry keys are there. Now we can perform SQL queries against all of the data we've collected. What's also cool is that once these registry keys are added, they're reprocessed by the Nemesis engine — resubmitted into that queue for further analysis. One of the abstractions we have here is the services abstraction: from the key path — HKLM, then CurrentControlSet\Services — we know these are services, so we can insert them into the services table, and when we

perform this query we can see they get added as services, highly structured: the exact description, display name, and command-line parameters for how the service gets started. All of this data was in the registry, and we can query it either by registry keys or in this more abstract form of a service. The next step, like we said, which we're currently working on, is the analysis of whether a service is vulnerable or not for the purposes of privilege escalation. So all of the unquoted-service-path checks — everything that PowerUp implements — we're going to implement in Nemesis as well. I think that's it — yep, all right.
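To give a sense of what "PowerUp-style analysis offline on raw data" means, here's a minimal sketch of one such check — unquoted service paths — run purely against collected registry values. This is illustrative code under assumed input shapes, not Nemesis's implementation:

```python
def unquoted_service_paths(services):
    """Given raw service registry data (service name -> ImagePath value),
    flag services whose binary path is unquoted and contains a space
    before the executable name -- the classic unquoted-service-path
    privesc that PowerUp checks for on-host. Here it runs entirely
    offline against already-collected raw data."""
    findings = []
    for name, image_path in services.items():
        path = image_path.strip()
        if path.startswith('"'):
            continue  # quoted paths aren't subject to this issue
        # Only the executable portion matters; ignore arguments after .exe
        exe_end = path.lower().find(".exe")
        exe = path[: exe_end + 4] if exe_end != -1 else path
        if " " in exe:
            findings.append(name)
    return findings

# Raw ImagePath values as they might come back from the registry collector:
raw = {
    "GoodSvc": r'"C:\Program Files\Vendor\svc.exe" -run',
    "BadSvc": r"C:\Program Files\Vendor App\service.exe",
    "PlainSvc": r"C:\Windows\system32\svchost.exe -k netsvcs",
}
print(unquoted_service_paths(raw))  # -> ['BadSvc']
```

Because the check runs over stored raw data, adding a new analytic like this later means every previously collected service gets re-evaluated too.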

So that's it for the demos. We have just a couple more slides, and then we'll take a few questions. Why does this matter — or why do we think it matters? Enrichment and analytic pipelines are not a new idea; people have been doing this on the defensive side for a long time. We'll say no one has publicly done this for offensive security. We're big fans of the phrase "if you can imagine it, it's probably already been done" — it would surprise me greatly if these types of large-scale analytics pipelines weren't already built by private groups or governments. But for us, it's not just what Nemesis does with

file processing and triage — that's neat — it's what it's going to allow us to do with things like operator guidance and building these labeled data sets. We're really excited for the future work over the next year or so that having this cleanly labeled, structured store of offensive data is going to enable. This is also a bit of a potential paradigm shift for red teams: instead of running your offensive code on host, you can just pull raw data and process it offline in an engine like this, which offers a number of advantages. We can centrally update operator analysis workflows, so we have one workflow for every piece of data

that comes in, no matter who's actually downloading it. Also, enrichments and analytics that are added exist in perpetuity for all operators and all engagements. When we were beta testing this several months ago, we just had the deserialization check for .NET assemblies. On an engagement we found a command-execution / JScript-execution type bug manually while analyzing an assembly, so we modified our analysis code to check for that same type of thing — and now every single assembly we ever download, forever, gets that same check. It's definitely a force multiplier. Offline processing also allows the retroactive analysis we talked about: if we come up

with a new check we can derive from that raw data, we can reprocess all of the existing data and get all those new analytics. Again, this minimizes the footprint of offensive tooling on an endpoint — you can have a very dumb agent that just has the ability to download files or pull raw data, and then throw everything into this. And like we mentioned before, that collected structured and unstructured data will hopefully help guide future research. So thank you — I think we're just about on time, at about 51 minutes, so if there are any questions... We have two giveaways: a lockpicking kit and

a practical deep learning book as well. Okay — a question? Yeah.

Yep — the question being: we have data that can be ingested, but have we thought about pushing data out? That is absolutely something we're interested in. We just had a bunch of conversations last week with some of our internal teams about doing that, and we definitely want to build it in. Even as a basic example, as you're triaging files — this is interesting, this is not — we want that data fed back into the file view in Mythic or something like that. We also have ideas, if we're facilitating statistical modeling or file-share mining and all this, of having that operator assistance fed back into

C2 tools. We just don't have it built out yet, but it's absolutely going to be on the roadmap. So, would you like lock picks or deep learning? Locks? All right — I would have picked the deep learning book, but I'm a nerd. All right, any other questions? Yeah, we've got one up here.

Oh yeah — so the question is about purging. The way purging currently works is that you don't have to put an expiration time in; it's not going to remove data unless you tell it to. When data is ingested you can give it a number of days until the data is purged, and you can just set that to negative one, or a million, or whatever else. The reason we did it that way is that, contractually, we have to purge things after a certain period of time. This isn't an issue for, say, government teams or

internal teams — different people have different data-retention policies and constraints. As far as anonymization of the data, we've had lots of meetings and talks about this, and it's harder than you might think to do completely correctly. Even something as simple as a file path could contain a username, or the domain name with the username in the domain context; it could have a project name, or something sensitive in the file itself. It's difficult to come up with a way to guarantee things are anonymized, but it's something we're definitely exploring — each piece of structured data is going to be a bit different in

how you'd actually approach that anonymization. You could use Nemesis for every engagement you have and keep it all in one central store, and if you don't have to purge your data you could do really cool analytics across all your different engagements. That would be my pipe dream, but contractually and ethically we can't do that for our particular engagements. It's something we're hoping to explore in the next few months with some other teams we've talked to: what's a good way to start anonymizing and scrubbing some of this stuff. So, would you like a book? All right.
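The per-record expiration model described above can be sketched with a toy example — here using SQLite for a self-contained demo (Nemesis uses Postgres, and the table and column names here are made up for illustration):

```python
import sqlite3
from datetime import datetime, timedelta

# Every row carries its own expiration timestamp, so retention is
# enforced per record rather than per database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_data (path TEXT, originated TEXT, expiration TEXT)"
)

now = datetime(2023, 10, 1)
rows = [
    # Already past its expiration -> should be purged.
    ("C:/share/old.docx", "2023-01-01", (now - timedelta(days=1)).isoformat()),
    # Expires 90 days out -> should survive.
    ("C:/share/new.docx", "2023-09-20", (now + timedelta(days=90)).isoformat()),
]
conn.executemany("INSERT INTO file_data VALUES (?, ?, ?)", rows)

# The purge job simply deletes anything whose expiration has passed.
conn.execute("DELETE FROM file_data WHERE expiration < ?", (now.isoformat(),))
remaining = [r[0] for r in conn.execute("SELECT path FROM file_data")]
print(remaining)  # -> ['C:/share/new.docx']
```

Setting the expiration far in the future (or "never") is then just a per-row value, which matches the "negative one, or a million days" framing above.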

Any other questions? Yeah, go

ahead. Oh yeah — for files, I believe the timestamp is imprinted on derived data. If you have derived data, the timestamp gets carried down: as we build a service abstraction based on registry data, it carries through. The way we have our schema set up in Postgres, you are required to have that expiration time and original timestamp for every single row of every single table — it's built fundamentally into the schema, because it was a core problem we thought about from the beginning. So, a couple of things — one of

the challenges...

Yeah — it's a good point. Currently, the way we use Nemesis is that we stand up one instance per engagement and tear it down with the rest of our infrastructure, to guarantee we purge everything from Postgres. We were working on purging things from Elasticsearch — I don't know if that's completely there yet or not, because I know it's immutable, but I think you can still remove individual documents after they get stuffed in. I don't think we've completely solved that part yet. Cool — any other questions? What's

up. That is something Lee — who is not here — is working on. The hardest part of that problem is the representation of a host when you're pulling in C2 data from lots of different sources: some tools report a domain or a host short name or a NetBIOS name, some a fully qualified name, some an IP. So we spent a lot of time figuring out how to model that across lots of different C2s. We have to solve the host abstraction problem first, which we're pretty sure we have a solution for — it's being coded right now.

And with Lee we've also started to address the temporal issue — a process listing is the great canonical example. We intend to support that; the code is not there yet, but it should be there for the 1.0

release. We have an alert message type, so you could build an additional microservice — it literally just submits a RabbitMQ message of type "alert", and that filters up to whatever your alerting mechanism is. We're also working on getting Jupyter notebooks in — long-running Jupyter notebooks with a bunch of supporting packages — so you could run something like "any time plain text comes in, check for this term." It's something we absolutely intend to support. We've got three minutes — any other questions?
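The alert message type mentioned above can be sketched as follows — the message shape (`{"type": ..., "body": ...}`) is an assumption for illustration, and a real microservice would consume these off the RabbitMQ queue (e.g. with pika) rather than from strings:

```python
import json

def handle_message(raw, notify):
    """Consume one queue message and forward 'alert'-type payloads to a
    notification callback (Slack webhook, C2 event log, etc.).

    NOTE: the message fields here are illustrative, not Nemesis's actual
    queue schema."""
    msg = json.loads(raw)
    if msg.get("type") != "alert":
        return False  # not an alert; some other service handles it
    notify(f"[nemesis] {msg['body']}")
    return True

sent = []
handle_message(
    json.dumps({"type": "alert", "body": "deserialization gadget in sim.exe"}),
    sent.append,
)
handle_message(json.dumps({"type": "file_data", "body": "..."}), sent.append)
print(sent)  # -> ['[nemesis] deserialization gadget in sim.exe']
```

The design point is that alerting is just another consumer on the queue, so swapping Slack for anything else means swapping the `notify` callback, not the pipeline.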

Yeah, absolutely — we have documentation for it. That's why we tried to build five or six different plugins as examples: literally, all your code needs to do is take the info from the C2 — whether it's a file download, a process listing, or whatever — and make a POST request with that data to the Nemesis endpoint. That's it. It can be completely self-contained; the connector is a small Docker container that runs completely separate from Nemesis and just posts into the API. It's 100% a thing we want to support for everybody — we have documentation on building your own connector, and

five or six examples. Cool, awesome. If you have any other questions, please find us around the con — we'll be here; we have a SpecterOps booth out there that we'll be hanging out at if you want to come ask more questions. But thank you, everybody — thanks for listening to us talk about data for a good hour.