
Sharing Open Datasets with the World to Develop Detections from Home | Roberto Rodriguez

BSides Delhi 2020 · 47:35 · 34 views · Published 2020-11 · Watch on YouTube ↗
About this talk
Rodriguez explores how defensive security practitioners can share datasets generated from adversary simulations to accelerate detection development across the community. The talk introduces Mordor, an open-source project for centralizing and distributing security telemetry, and demonstrates practical strategies for collecting endpoint and network data using local VMs, cloud environments, and tools like Sysmon, PowerShell, and Azure infrastructure.
Original YouTube description
BSides Delhi 2020: Sharing Open Datasets with the World to Develop Detections from Home

As a defensive security practitioner, researching a new technique used by real threat actors to compromise an environment is not as simple as copying, pasting, and running a query. Besides learning about the internals of a technique and the ways it can be executed, eventually one needs to simulate it. As you may already know, the simulation process takes time and preparation, and usually the time spent trying to generate data is higher than the time spent actually analyzing data. What if we could share the data we generate after simulating an adversary? How many teams could we help out there that might be struggling to simulate adversaries, or that might not have much time to do it? What if we could centralize the efforts to collect and share data with the community? Enter Mordor!

Roberto Rodriguez is a Threat Researcher and Security Engineer on the Microsoft Threat Intelligence Center (MSTIC) R&D team. He is also the author of several open-source projects, such as the Threat Hunter Playbook, Mordor, and HELK, built to aid the community's development of techniques and tooling for threat research. Blog at https://medium.com/@Cyb3rWard0g. Follow Roberto Rodriguez on Twitter (@cyb3rward0g).
Transcript [en]

Welcome back, everybody, and hello, Roberto, how are you today? Good, good, how are you? Can you hear me well? Yes, absolutely. Very good. So, where have you flown in from? I'm from Virginia, in the US. And you're still in Virginia? Yeah, Virginia, in the US. And it's kind of early for you, I guess? For me a little bit; I usually work late and wake up late, a typical infosec hacker, right? But no, thank you for the opportunity, everybody; I'm looking forward to talking about these

projects as well. Absolutely, absolutely. So please start to bring up your slides. Your talk is on comprehensive cross-domain enterprise threat exposure... no, that was the last one: Sharing Open Datasets with the World to Develop Detections from Home. There you go, you are up and running, so Roberto, thank you, and take it away please. All right, well, thank you very much. First of all, thank you everybody for joining my talk. This is going to be a couple of things: I'll talk about a few projects, but in general I'll focus on one, which is something that has been helping me a lot to start doing research.

This first banner is pretty much the community behind all these projects, and it's something I'm going to talk about in a little bit. The talk today, of course: welcome to Mordor. I'm a big fan of The Lord of the Rings, so that's where the name of the project comes from, and this is about how we can start sharing datasets with the community after emulating or simulating an adversary. Once again, my name is Roberto Rodriguez. I work for Microsoft, on the Microsoft Threat Intelligence Center team, the MSTIC R&D team. I'm also the founder of the OTR community, and I love open source, so there are

a lot of projects that I have developed over the past couple of years. If you are interested in some of them, several have their own Twitter handles, so you can just follow them and we can start a conversation about those. My Twitter handle is @Cyb3rWard0g, in case you want to continue the conversation on Twitter as well. The agenda for today: first, the basic process of research and how that translates into a need for data; then a little bit about the things you can do in order to generate that data; and at the same time we'll go over how

this project started, the different things you can do with it, all the way to a couple of things you will be able to do in the future, based on what's on the roadmap. So let's start with the first part: what triggers detection research? This happens a lot, especially from home, for those doing research outside of work, because work is a little different: when you do research at work, your organization will often define some of that research. The specific research goal can be related, for example, to a policy that

just came out for your organization, or a specific environment that you need to focus on the most. But in reality, for a lot of people, research starts from Twitter. There is a lot going on right now: in the past two months alone we've seen everything from awesome lateral movement techniques released by the MDSec Labs team, all the way to proof-of-concepts dropping for a couple of CVEs. This is where I get a lot of my inspiration as well, so if you're not on

Twitter yet, I would highly recommend joining the community, taking in all this research, understanding it, and applying your own detection approach to it. So what can we do? First we consume that open intelligence, and the first thing we do is start googling whether there are any detections out there: has anybody written an article, or perhaps shared a detection rule? And then, if there is not much out there, you start thinking: can I actually do it myself? Can I start replicating this? That's when you

start googling whether there is a PoC available, and of course the next question is: where am I going to deploy this? On my local computer? Do I have to set up my own lab? Do I have to spin up more than one computer? All those questions are valid, and if you look closely, most of them are about what happens before you even start executing your research; you're not even getting to the detection piece yet. Unconsciously, indirectly, you start creating this methodology on your own. So I started

doing the same, and I put together a couple of main steps, starting, for example, from my research goal: this specific tweet that I liked, that was very interesting, and that has no detections yet; and then I go through this whole cycle. Of course, things do not just go clockwise all the time; there are steps where you have to go back, run a feedback loop, and improve earlier steps. This is very high level, and it's only one approach;

there are a lot of approaches out there, and people choose what works best for them. For me, I added a couple of extra steps to this basic methodology. I was working with my brother on this, because we had a couple of ideas about what to do next, what we could add as a step that would improve the step after it, and then once again run that feedback loop across all the different steps. When we talk about trying to understand the adversary, besides doing a lot of the necessary

static analysis, or basic and advanced dynamic analysis, you eventually get to the part where you need to simulate the technique, and that's the side of the methodology we're going to focus on today. That part is where a lot of people get stuck, especially on the detection side; not everybody has the expertise to go through all of this, create their own tools and PoCs, and execute a CVE just like that. So it's very important to understand where all these steps fit in your methodology, so you can start collaborating with other people who

might have the skills to help you in those aspects. This takes us to the simulation part. As a basic strategy, you need to start exploring the tradecraft: understand the techniques behind it, and the procedures and tools probably being used out there. You need to create your simulation plan; sometimes it's as easy as running one command, sometimes it's a couple of commands, a new script, or a whole emulation plan. And then you run that feedback loop in order to understand what you can do to

enhance the environment you're working with. It's not just about enabling telemetry; it's about creating more context around that telemetry, probably enabling a couple more configurations or adding more data sources to your flow. Then, once you are close to a detection, or feeling comfortable with some of them, that needs to be documented somewhere: you either document it for your organization or you document it to share with the community. That's up to you, and something I've been doing a lot is sharing as much as I can with the community.

But there are challenges, and the main ones I have identified are time and technology constraints. Once again, some of these simulations might be very easy, and some might take real time. From my experience: I don't do a lot of offensive research myself, but I try to learn what I need in order to be more proficient in that piece and to start building my own tools as well, and it takes time. For me it sometimes takes a week or two to get to a really good

representation, or a good proof of concept, of a technique that was released recently, and that goes hand in hand with expertise. Then you start thinking about what you need to collect. Once again, it's not as easy as just running something and expecting data to come out; there need to be a lot of configurations around it, especially if you're dealing with a use case where you need to understand which specific adversary actions would trigger, or could potentially trigger, some of those events. So, to give you a few examples of how some of my days go:

I can go from the basic creation of a local user account all the way to, for example, trying to replicate a specific DLL hijack that was released in the community. For local account creation, there is this project called Atomic Red Team, which has a lot of scripts that allow you to simulate that specific procedure with a specific tool, so you can get an understanding of what data those actions actually generate. That's very easy: usually you only need one computer and a single command prompt console.
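To make that atomic case concrete, here is a minimal PowerShell sketch of the kind of single-machine procedure Atomic Red Team automates for local account creation (ATT&CK T1136.001); the account name and password are placeholders, not values from the talk:

```powershell
# Minimal sketch of an atomic local-account-creation procedure (ATT&CK T1136.001).
# The account name and password below are placeholders; run only in a lab.
$user = "atomic_test_user"
$pass = "T3mpP4ss!2020"

# Create the local user the way many atomic tests do: one command, one console.
net user $user $pass /add

# Cleanup, so the test machine is left the way it was found.
net user $user /delete
```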

You just run the specific commands and then you can move on. But then you get into the type of research where someone says: hey, I found a new lateral movement technique which involves, for example, copying a DLL over SMB, then interacting with the Service Control Manager, querying the service, starting it, stopping it, and going through that whole process. That is not so easy; well, for some people it might sound very easy, but for a lot of people it's not that straightforward.

Just to give you an example of how this works: you have two computers, and the adversary, or threat actor, performs research before executing anything, of course, and identifies that a vulnerable service exists on, for example, any Windows 10 computer, so there is no need to create a new service in this case. Then they start interacting with the other computer over RPC: they get a handle to the SCM, query the specific service they want to perform the DLL hijack against, and then stop the service, which you can do

with the ControlService RPC method. Then you use SMB write requests over the session to copy the DLL that the service is going to look for and execute, and at the end you start the service. This can be done through a few commands; it could be done through some scripting languages, or you can do pure RPC directly through some other command-line tooling as well. But then you start thinking: okay, now that I understand a little of what's going on, I have a lot of questions about what I need for each specific step happening here.
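As an illustration of that sequence, here is a hedged PowerShell sketch, assuming the IKEEXT/wlbsctrl.dll case referenced later in the talk; the target hostname and DLL path are placeholders, and this should only ever be run in a lab you own:

```powershell
# Hedged sketch of the DLL-hijack lateral movement flow described above,
# assuming the classic IKEEXT / wlbsctrl.dll case. TARGET is a placeholder.
$target = "WORKSTATION6"

sc.exe \\$target query IKEEXT   # query the vulnerable service over RPC
sc.exe \\$target stop  IKEEXT   # ControlService RPC: stop it if it is running

# Copy the payload DLL over SMB (admin share) to a path the service searches.
Copy-Item .\wlbsctrl.dll "\\$target\C$\Windows\System32\wlbsctrl.dll"

sc.exe \\$target start IKEEXT   # StartService: the hijacked DLL gets loaded
```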

This is, of course, something you can do prior to simulating all of it, because by understanding the tradecraft you can start building the mapping between an adversary action and a specific event ID, for example. Then you start asking: what can I do from a service perspective? What audit policies can I actually use to start monitoring my services? This is where you start doing some research, for example on System Access Control Lists (SACLs) on the specific service, or you can also create a SACL on the

SCM itself. Then you might ask yourself: why do I need to do that, why do I even care, and what makes you think you need one or the other? Here is exactly why: in this case, the blog post shared by the researcher, Dwight, who discovered this specific lateral movement technique through DLL hijacking, showed that instead of querying the service directly, he was using a different RPC method which interrogates the SCM and asks about the status of all the services, so it never touches the service directly. So if you create a rule on that specific vulnerable service so that whenever anybody touches it

an event is generated, well, if the attacker uses that different method, your event is never going to trigger. You need to account for these variations of the technique, and that happens all the time: there are a lot of things to consider if you want to build a detection around one specific procedure or technique. That was very interesting; there is a lot happening here, and after all of this you are still in just one section of the whole research methodology: you're still trying to figure out how this is going to play out, how you're

going to simulate it, what you actually need in order to simulate it, and how to get the data out of that environment. This happens everywhere: everybody in the community, in different countries, is doing it. I talk to people and they say: yes, we do research the moment something happens on Twitter; we get together and try to figure out how to replicate it and how to get data out of it. Of course, some people are successful; some teams, communities, and organizations can just do it right away. But the problem is that a lot of people are not, myself included: there are a lot of

different techniques that get shared almost every day, and I cannot replicate many of them, so I collaborate with other researchers, great researchers out there, who help me with the PoCs. For example: how can we build a specific DLL that will spawn a thread in a process, or create another process, inject code, and then create a thread to execute that code? How can we build all those things? That, to me, was very interesting, because then you start thinking: why don't we just start sharing some of this data that we're generating?

If it took me a week to figure out how to do, for example, this lateral movement technique, why can't I just collect data after doing my research, understanding what I need (I talked a bit about those audit rules and services), and then share that with somebody else? Then this person, or whoever in the community I'm sharing it with, might be more proficient at the data analysis piece. We have to understand and identify our strengths, what we're actually good at, and that's fine, because sometimes you might think that because you're not good at, let's say, the simulation step,

you might just get stuck there and think you're not doing much, and feel uncomfortable, like this is something you don't enjoy; but in reality you're really good at the other steps. So to me the question is: how can we empower other people in the community who have skills in other parts of the methodology, just by sharing the data? That is pretty much where the Mordor project started. The Mordor project is simply a repository of datasets that are shared after simulating an adversary, a specific technique or procedure, and everything is in JSON format. The

reason I'm sharing everything in JSON is that I believe it's a more portable, easier-to-use type of file: I can use it, for example, with Python, I can use it with PowerShell, or I can integrate it with SIEMs; I can throw a JSON file at a data pipeline directly and it can make it all the way to my SIEM. That was the idea. I also started categorizing everything by the MITRE ATT&CK framework, by specific tactics and techniques.
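To show how portable the JSON format is in practice, here is a small PowerShell sketch that loads a Mordor dataset and profiles it; the file name is a placeholder, and the Channel/EventID field names are assumptions based on the usual Mordor schema:

```powershell
# Hedged sketch: loading a Mordor dataset locally. The datasets typically hold
# one JSON event per line; the file name here is a placeholder.
$events = Get-Content .\lateral_movement_dll_hijack.json |
    ForEach-Object { $_ | ConvertFrom-Json }

# Quick profile of the capture: which channels and event IDs were collected?
$events | Group-Object Channel, EventID |
    Sort-Object Count -Descending |
    Select-Object Count, Name -First 10
```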

Something that also differentiates this project from others is that Mordor generates atomic samples, specific datasets for specific actions just like the ones I showed you a couple of minutes ago, and we also cover large datasets: datasets created after simulating a whole campaign, not just one atomic behavior but multiple atomic behaviors, and I'm going to explain what that actually means and how it's done. At the end of the day, the Mordor datasets came out of working with my brother; he's down there at the bottom, his name is Jose, and his handle is @Cyb3rPandaH, in case you want to follow him. The awesome thing about this project is

that it started with us helping each other. It started with me saying: hey, I have this dataset, and you're actually good at analyzing data, because his background is more on the data analysis and data science side of infosec. For him it was very easy to say: just send me the dataset, and I can start building some detection rules, or import it into a Jupyter notebook and start analyzing the data, just exploring the events that were collected after simulating a specific technique. That was very helpful, and it was also the inspiration to help others in the community. On the Mordor

dataset website, you'll find, on the left, all the specific datasets we have per platform, and I also have this ATT&CK Navigator view where you can see some of the things the datasets cover. I focus a lot on specific techniques, so you will see a lot of procedures under one technique; don't be discouraged if you only see six, seven, or let's say ten things highlighted, because there is a lot underneath those techniques as well. Here I'm showing the example I talked about a couple of minutes ago, the DLL hijacking one.

There is a link at the bottom, and you can see exactly how it was done; that is also part of the project. If you contribute a specific dataset, I would love to know how you actually generated it, so you can add more context for someone using and consuming it. For me that's very important: having metadata about datasets, beyond just event logs, is crucial for somebody to get familiarized with what's going on and to understand a bit more of the tradecraft without executing it. At the end I also provide a little table of all the events that were

generated in that period of time. Something you might have noticed already: you might be asking yourself, why not share only the events that are actually related to the technique? Why share a whole time window? To me, it's because there are a lot of rules out there where, if you only give them one event, let's say event ID 1, and you say "look for this specific command line", that command line is also executed by other processes. You need that extra noise on top if you want to have some real validation going on as well.

Not every single dataset will contain potential false positives, of course, but I've seen datasets where, even though I say "this query is very good for this specific technique", when I apply it to my dataset I get more than 20 or 30 hits. That lets me see what context I can add to my query in order to improve it. Having a dataset with more context lets me create better queries, enhance them, and gives me the opportunity to join a couple of data sources at the same time.

Now, about the environments this project comes with and what exactly you can do with them: there is a strategy for Mordor, and there are two approaches we have identified that work. I think it depends on how much time you want to spend creating a dataset and how quickly you want to share something with the community. The first one is having your own local computer; it could be a virtual machine, or one machine in the cloud, but I assume a lot of people have labs on their own computers with

programs like VirtualBox or VMware. In that scenario, I believe you can get the data out directly, and I put together a PowerShell script, which I'll show in a little bit, where you can easily point to a couple of event providers, like Sysmon, Security, Application, and System, pack up all the data generated right after you execute a technique, and then collect and share that data with the community. The other strategy is using a bigger environment, and when I say bigger I don't mean hundreds of computers; I mean probably

two or three computers. Some techniques require more than one computer, like lateral movement, and if you're in a domain environment you also need a domain controller if you want to capture some of that data. Talking about local computers, this is pretty much what I would do: download the script (you'll have the slides later, but if you go to the OTRF GitHub org, as you can see in the link at the bottom, go to the mordor repo and then to scripts, you'll see the script there), then on your local computer

import that module, or that script, and first clear the event providers that you have. So, before you execute something, it's simple: clear your event providers, run your specific technique, one command or a couple of commands, and then run the last step, which grabs all those event providers, uses PowerShell to pull the data, parses the XML out of it, creates an array of JSON objects, merges all the event providers into one array, and copies that to a JSON file.
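Roughly, that flow looks like the following sketch. This is not the actual Mordor collection script (that lives under the scripts folder of the repo he mentions); it is just an approximation of the clear/execute/export steps he describes:

```powershell
# Approximation of the local collection flow: clear providers, run the
# technique, then export everything that landed in those channels to JSON.
$providers = @('Security','Microsoft-Windows-Sysmon/Operational',
               'Application','System')

# 1) Clear the event providers right before the simulation.
$providers | ForEach-Object { wevtutil.exe cl $_ }

# 2) ...run your technique here (one command or a few)...

# 3) Grab the events, parse the XML, and flatten them into JSON objects.
$all = foreach ($p in $providers) {
    Get-WinEvent -LogName $p -ErrorAction SilentlyContinue | ForEach-Object {
        $xml = [xml]$_.ToXml()                       # raw event XML
        $obj = @{ Channel = $p; EventID = $_.Id }
        foreach ($d in $xml.Event.EventData.Data) {  # promote EventData fields
            if ($d.Name) { $obj[$d.Name] = $d.'#text' }
        }
        [pscustomobject]$obj
    }
}
$all | ConvertTo-Json -Depth 4 | Out-File dataset.json
```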

I find that very easy to do and pretty straightforward, and once again, I don't want to collect only one event provider; I want to collect more than one. At minimum, Security plus Sysmon is pretty good, because there are events not created by Sysmon that might be good context for your detections. I'll show an example in a little bit, so don't think that's all there is. The other case is environments as code, and here is where it starts getting complex, well, complex for me, because I have to figure out how all these different

pieces fit together so I can share everything with you. There are a couple of things going on. Blacksmith is a project that is also part of the community I belong to, the OTR community, the Open Threat Research community, and Blacksmith only holds configurations and scripts that can be reused anywhere else. For example, if I have a Sysmon config and I want to create five or six environments with the same config, why would I copy the config six times? If I then wanted to update it, I'd have to update it in six different places. Blacksmith, for me, is a centralized repository for a lot of scripts and configurations

that I can reuse elsewhere. SimuLand is another project; what it does is create environments through templates that have a specific purpose. Think about it this way: every time we talk about an emulation plan, say an emulation plan for APT29, the plan is usually commands, a couple of commands and payloads you can use. Think of SimuLand as the emulation plan from an infrastructure perspective: everything you need to deploy in order to execute that specific plan. And at the end, Mordor is pretty much just waiting for the data generated by environments like SimuLand, which then

gets shared to the Mordor project. Talking about SimuLand a little more: once again, it's just a combination of cloud templates and scripts grabbed from the Blacksmith project, put together so they make sense and so they replicate, a little bit, what a potential APT did, because a lot of the emulation plans are actually based on open threat intelligence. We're working with open threat intelligence, which does not mean that is 100% what the actor did, but we're working with community resources as well. So, the next part:

among the environments in SimuLand, one I'll mention is the ATT&CK Evaluations environment, which pretty much follows the ATT&CK Evaluations program, where they assess a lot of EDR solutions and share their emulation plan. What I do is take that emulation plan, create the environment, execute it, and collect data. The other one we have is Cloud Breach S3; this is a basic scenario which lets me cover potential techniques where S3 buckets are involved. The difference between the environments I create and other cloud environments out there is that mine have a data pipeline attached,

which means I'm not just attacking the environment. Usually, when you see an environment released by somebody for practicing adversary techniques, it exists only to be attacked; there is no concept of "I'm attacking the environment, so what data is being generated?" This environment actually enables CloudTrail logs, for example, and they're pushed to Logstash and then on to a Kafka server, which is just a server that allows me to centralize my data; from there I can take it out. Then we also have The Shire, and The Shire is just a Windows

environment which replicates, I would say, a very, very small network. The good thing about The Shire is that it's very modular, so you can deploy only one computer and one domain controller, plus a couple more pieces that allow you to get the data out and push it somewhere. Talking about this environment a little: all of it is deployed through Azure Resource Manager services. That lets me share a template with the community; the community can just click on it, it gets sent to Azure, gets deployed there, and then you

can use it directly. I was thinking of doing some Terraform, but to be honest, I'm very busy with a lot of things going on across different projects, so I'd rather focus on one thing and understand its ins and outs, instead of chasing new APIs and features that come out almost every month. I'd rather not wait for other features to be enabled and just start using this directly.
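For anyone who prefers the command line over the portal button, here is a hedged sketch of deploying one of those ARM templates with Az PowerShell; the resource group, location, and template URI are placeholders, not exact Blacksmith paths:

```powershell
# Hedged sketch: deploying an ARM template with Az PowerShell instead of the
# "Deploy to Azure" button. The template URI below is a placeholder.
Connect-AzAccount
New-AzResourceGroup -Name "mordor-lab" -Location "eastus"
New-AzResourceGroupDeployment -ResourceGroupName "mordor-lab" `
    -TemplateUri "https://raw.githubusercontent.com/OTRF/Blacksmith/master/templates/azure/<template>.json"
```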

Data collection from The Shire is pretty straightforward, with just a couple of things that might differ a little from what you would normally do. In this case we're using Windows Event Forwarding from every single computer in the environment, pushing to a Windows Event Collector. All the policies, all the WEF configs, are in Blacksmith, so you can grab them from there and see exactly what we're collecting. All the data gets sent to the Windows Event Collector, then NXLog Community Edition sends it to Logstash, and Logstash automatically forwards it to an Azure Event Hub; from there I

can pull it out directly and push it to my repository. So, as you can see, there are two different scenarios: a local one, and this larger infrastructure. It depends on how complex you want your scenario to be, and I assume a lot of people will start with the basic PowerShell script on their own VMs, but just so you know, there are also ready-made environments you can use. Once again, that's the flow; that's pretty much how things go when I create something in SimuLand. As I mentioned before, all the audit

policies, and things like PowerShell logging and how to enable them, are in Blacksmith, so you understand what data is actually being created and how. If I were you, and you want to contribute something from your own VM, I would definitely take a look at those configurations and see what we consider good settings for enabling telemetry, because you don't want to just start a fresh Windows 10 machine, run your attack, and expect something to happen. Those configs will give you more telemetry. Then we get to the SACLs, and this

is also something that gets applied to the Mordor datasets. I have a project with a lot of different SACLs, and they get applied to every single computer in the Mordor environments. Once again, if I were you and I were instrumenting my own VM, I would definitely take a look at those scripts. That one in particular contains rules that let you track when somebody queries certain values, certain registry keys; for those not familiar with them, those are the keys that allow you to calculate the SysKey, the

boot key needed to access the SAM database, the SAM file: you need that key to access it, decrypt its contents, and extract information like hashes. Those registry keys are not audited by default, but if you audit them, it becomes an interesting exercise to learn which processes are constantly querying them. As you can see, there is research behind the telemetry decisions, and that's something you could definitely also use on your own VM.
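A hedged sketch of what auditing those keys can look like; the exact SACLs Mordor applies live in the project's configurations, so treat the key list, rights, and principal below as approximations:

```powershell
# Hedged sketch: add a success-audit (SACL) entry on the registry keys used to
# derive the boot key, so query attempts surface as object-access audit events.
# Requires an elevated session plus the object access / registry audit policy.
$keys = 'JD','Skew1','GBG','Data' | ForEach-Object {
    "HKLM:\SYSTEM\CurrentControlSet\Control\Lsa\$_"
}
$rule = New-Object System.Security.AccessControl.RegistryAuditRule(
    'Everyone', 'QueryValues', 'None', 'None', 'Success')

foreach ($k in $keys) {
    $acl = Get-Acl -Path $k -Audit     # read the existing DACL and SACL
    $acl.AddAuditRule($rule)           # append our audit rule
    Set-Acl -Path $k -AclObject $acl   # write the SACL back
}
```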

As I mentioned before, there are also SACLs on services, and this is very interesting; if you're not doing this in your environment, I would highly recommend it. For example, in the DLL hijack example, the vulnerable service Dwight was talking about was IKEEXT. What I did was enable this audit policy and, as you can see in the commands there, grab that service and add an entry to its access control list. I also do it for the SC Manager, once again because I want to track who is doing what. Now you might

be asking yourself: okay, but I don't understand what these things mean. Believe me, it took me a while to figure out how to work with these settings too, so I also document some of these strings in the Mordor configurations. As you can see, there are a couple of entries related to services: I want to track when a service starts, when a service stops, when it's queried, and so on. The interesting thing about this SACL for a service, or for the SCM, is that at the bottom you can see this

"WD" and "NU" business. If you set the accounts you want to track to NU, that audits accounts coming in over the network. So if somebody is querying your SC Manager over the network, remotely from another computer, that's potential lateral movement exploration, and that's a very interesting use case, because it would be nice to know who is querying or touching the SC Manager remotely rather than locally, since locally it happens all the time. If you set it to NU, you'll be able to capture that telemetry.
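To make the SDDL discussion concrete, here is a hedged sketch of reading and extending the SC Manager security descriptor with sc.exe. The exact strings Mordor uses are in the Blacksmith configs; the descriptor below is illustrative only, and a wrong descriptor can break things, so always dump and save the original first:

```powershell
# Dump the current SDDL for the Service Control Manager and save it first.
sc.exe sdshow scmanager | Out-File scmanager_sddl_backup.txt

# Illustrative only: re-set the FULL descriptor with a success-audit ACE
# appended to the S: section. 'AU;SA' marks a success-audit ACE and 'NU' is
# the SDDL alias for network-logon users, so remote SCM access gets audited.
# Do NOT paste this blindly; rebuild it from your own sdshow output.
sc.exe sdset scmanager "D:(A;;CC;;;AU)(A;;CCLCRPRC;;;IU)(A;;CCLCRPRC;;;SU)(A;;CCLCRPWPRC;;;SY)(A;;KA;;;BA)S:(AU;FA;KA;;;WD)(AU;SA;KA;;;NU)"
```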

That's another very interesting data point, and a lot of organizations have talked about how they can leverage SC Manager telemetry. That's just one example of what gets applied to the Mordor environments, the ones deployed as code, and once again, the configurations are there so you can take a look; that's what they look like. At this point we haven't sent anything out yet: we've collected the data and configured everything, but we still have to send the data out, and once again the destination is an Azure Event Hub.

You can see all those configurations in the environment as well; of course, if we walked through every single configuration, it would take the whole presentation. At the end, this image is interesting because you're seeing the amount of data that can flow in real time, data you can start collecting. At the end of the day, I use a tool called kafkacat to collect that data; what it does is connect to the hub, grab the data, and push it to a JSON file. Nothing complex: if you actually go

and read about it, it's pretty straightforward; it's just a client you can use to gather data. One quick example is this one; there you go, oh, let me click here, sorry, I was clicking on the other screen. As you can see, on the left we already have a compromised computer, but on the right we can run the specific command and watch the data flowing. This is real time: data being collected from the pipeline, data I can start pushing somewhere. Once again, if you do it locally, it's pretty straightforward too.
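A hedged sketch of that collection step, assuming kafkacat is on the PATH and the Event Hub exposes its Kafka endpoint; the namespace, hub name, and connection string are placeholders:

```powershell
# Hedged sketch: consume events from an Azure Event Hub Kafka endpoint with
# kafkacat and write them to a JSON file. All names below are placeholders.
$ns   = "mordorhub"      # Event Hub namespace (placeholder)
$hub  = "winevents"      # Event Hub name, i.e. the Kafka topic (placeholder)
$conn = "<event-hub-connection-string>"

kafkacat -C -b "$ns.servicebus.windows.net:9093" -t $hub -o beginning -e `
    -X security.protocol=SASL_SSL `
    -X sasl.mechanisms=PLAIN `
    -X sasl.username='$ConnectionString' `
    -X sasl.password="$conn" | Out-File dataset.json
```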

But if you want to capture it in pipeline mode, that's the way I would do it, for sure. Let me move to this one here. The other one was just another example; it's pretty much the same, but pushing to a file, while keeping an eye on the timing as well. Now, in this one, for example, a lot of people ask me: what about network data, what about pcaps? I actually use netsh, the network shell, on every single computer I want to collect data from. This scenario is for when you want to do it

locally on your own VM, for example; if you want to contribute pcaps to Mordor, this is something you can use to get the data. The moment you do something that flows through your NIC, for example a lateral movement technique, it would be nice to capture that network traffic in a pcap, so you can use that command if you want to contribute a pcap to the Mordor project. At the end of the day, you use another tool called etl2pcapng, which

points to the ETL file that netsh creates, and that gives you a pcap you can open in Wireshark, or contribute directly to the Mordor project. So if I were you, and you're trying to contribute to the project, and you have a VM where it's easy to execute something that makes a network connection and you want to share that data, just run those two commands and you'll be able to share it. There's another option, too, through Azure, something you can read more about in the link below.
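Concretely, the two commands he refers to look roughly like this; the paths are placeholders, and etl2pcapng is Microsoft's open-source ETL-to-pcapng converter:

```powershell
# Start a host-local packet capture with netsh (no extra drivers needed).
netsh trace start capture=yes tracefile=C:\mordor\host.etl

# ...execute the technique that generates the network traffic...

netsh trace stop

# Convert the ETL trace into a pcapng you can open in Wireshark or contribute.
.\etl2pcapng.exe C:\mordor\host.etl C:\mordor\host.pcapng
```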

It's something called the Network Watcher agent. The difference from the other things you might have on your endpoint is that this is an extension that gets installed on your Azure VMs, and you can collect the data as if there were another device sitting in the middle, between your computer and the rest of the network. You can capture data through those commands, and once you have the slides you can read more at the bottom; I wrote a whole blog post about that specific setup. At the end, what it does is push

that pcap to your Network Watcher; at the top you can see it running, and at the bottom you'll see it land in an actual storage container, where you can easily download and use it. The advantage of this is that if you have an environment with, say, 10 or 20 computers, you can run this command, point it at several machines, and get the pcaps from all of them; that, of course, is if you're not using any other tooling to capture network data.
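Via Az PowerShell, starting such a capture looks roughly like the sketch below; the resource names are placeholders, and the target VM needs the Network Watcher agent extension installed:

```powershell
# Hedged sketch: start a Network Watcher packet capture on an Azure VM.
# The resulting pcap lands in a container in the given storage account.
$nw = Get-AzNetworkWatcher -ResourceGroupName "NetworkWatcherRG" `
                           -Name "NetworkWatcher_eastus"
$vm = Get-AzVM -ResourceGroupName "mordor-lab" -Name "WORKSTATION6"
$sa = Get-AzStorageAccount -ResourceGroupName "mordor-lab" -Name "mordorlogs"

New-AzNetworkWatcherPacketCapture -NetworkWatcher $nw `
    -TargetVirtualMachineId $vm.Id `
    -PacketCaptureName "mordor-capture" `
    -StorageAccountId $sa.Id `
    -TimeLimitInSeconds 300
```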

So this is just an option, another thing I use as well. All right, that's pretty much how the Mordor project is maintained, how it's shared from an infrastructure perspective, and, from an organizational perspective, how the datasets are structured. But what can you do with it? A lot: everything from trainings to job interviews, for example. You can use it to interview somebody, because a candidate could say "I'm very familiar with lateral movement techniques"; great, there are 10 to 20 different

techniques where we have data, and we can push that data to a location and you can show me what you would do with it. And of course you can validate queries and contribute to other projects, which is pretty much what I wanted to share in this presentation. What we're doing now is skipping a lot of those initial steps in the whole methodology and going directly to the analysis and sharing of the data. That's the goal: how can I share data so somebody else can use it, benefit, and learn; and at the same

time, if there is no rule created yet, for example in the Sigma project, you can use the data to create a rule, and also to validate it. For example, going back to this specific example: I ran through this, and now I have a dataset in the Mordor project, so I can grab that dataset and start running queries on top of it. I use different tools to do that; one of them is Jupyter Notebooks, and there is a lot of information out there on how to deploy it and all that. I also have a specific website that

you can go to; I can show it, since I can see that you can see my browser tabs: the Infosec Jupyter Book, another project I put together so you can get familiarized with all of this. So, for example, with this rule I was able to query for any indicators that the SC Manager was being queried, and in this case, as I said before, you can track whether there is a network logon; but if you're just tracking any interactions with your SC Manager, you can add context from another event, like 4624 from

Windows Security Auditing: you can filter on logon type 3, join those events, and get to the actual users that are interacting with and querying the SC Manager over the network. That's something that can be validated right away with the Mordor dataset; you don't have to go through the whole process of emulating or simulating it again. And for those curious about what those specific access masks, or access rights, mean: you can see that in this dataset we actually capture the behavior we were talking about before, just by monitoring the SC Manager.
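A hedged PowerShell sketch of that join over a Mordor JSON dataset: network logons (4624, logon type 3) matched to SC Manager handle requests on the logon ID the two events share. The field names follow the usual Security/Mordor schema and are assumptions here:

```powershell
# Join 4624 (network logon) to 4656 (handle requested, SCM object) on the
# logon ID the two events share. Field names are schema assumptions.
$events = Get-Content .\dataset.json | ForEach-Object { $_ | ConvertFrom-Json }

$netLogons = $events | Where-Object { $_.EventID -eq 4624 -and $_.LogonType -eq 3 }
$scmAccess = $events | Where-Object { $_.EventID -eq 4656 -and
                                      $_.ObjectName -eq 'ServicesActive' }

foreach ($h in $scmAccess) {
    $logon = $netLogons | Where-Object { $_.TargetLogonId -eq $h.SubjectLogonId }
    if ($logon) {
        [pscustomobject]@{
            User       = $h.SubjectUserName
            SourceIp   = $logon.IpAddress
            AccessMask = $h.AccessMask
        }
    }
}
```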

Connecting to the SCM, querying it, starting the service, stopping the service: that's pretty much the whole chain we had in the methodology for this specific technique. That was pretty cool, because we can just get the data and start validating analytics. Here is another example, where I collaborated with the team from MDSec Labs. In this one they were using code execution via DCOM, Excel RegisterXLL, and it was very interesting because you can actually see the interaction: you can see RPC being used, the request to that specific class, and at the bottom

you can see that the value is calc.xll, for example. That was something I started playing with, and I was able to create rules where it tells me that if a process, Excel in this case, is launched with the automation arguments, and the parent is running DcomLaunch as part of its arguments, that could be an indicator of Excel being driven through automation, and also an indicator that it's being done over the network. Once again, this data is available, so we can download it and start creating rules; this was one of my very basic contributions to the Sigma rule project.
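The detection logic he describes can be sketched over a Mordor dataset like this; the Sysmon process-creation fields are the standard ones, but the exact strings to match are assumptions:

```powershell
# Hedged sketch: flag Excel started with the automation switch while its
# parent carries DcomLaunch, per the RegisterXLL DCOM behavior described.
$events = Get-Content .\dataset.json | ForEach-Object { $_ | ConvertFrom-Json }

$events | Where-Object {
    $_.Channel -eq 'Microsoft-Windows-Sysmon/Operational' -and
    $_.EventID -eq 1 -and                        # Sysmon process creation
    $_.Image -like '*\excel.exe' -and
    $_.CommandLine -match '/automation' -and
    $_.ParentCommandLine -match 'DcomLaunch'
} | Select-Object UtcTime, User, CommandLine, ParentCommandLine
```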

I grabbed this telemetry, went through the whole blog post, and was able to share that with the community. Here is another one: WMI, also via DLL hijacking, using a specific DLL called wbemcomn.dll. It has behavior similar to the previous one, but with a little twist: if we execute WMI and this DLL is just a plain DLL waiting to be executed, it would be hard to mimic what the real WMI execution would do, so you have to proxy some of those exports in that specific DLL. All of that is done as well

through Mordor. We have another rule and... we have no time? Oh, sorry. All right, 24 minutes. Yeah, that's when the next talk starts. Oh, okay, sorry. You were scheduled to finish eight minutes ago, but we enjoy the enthusiasm, so if you want to bring up one last slide. That was pretty much the end; the end is how to contribute, and you can just go to that link at the bottom. The slides are going to be shared, but the only thing that's left is that you can join the Discord server, where we can talk more about specific

Mordor datasets and contribute to the community. That was it; that was the last thing. Please, please, please hang on and jump into our Slack channel as well; people can ask you questions there too. I do have one question for you, very briefly, before we move on: what do you think of the Amazon remake of The Lord of the Rings? Well, I haven't watched it, having seen what's going on with that thing, but I'm a big fan, so I will give my feedback once I see what they've done there; I haven't seen it yet. Okay, replied like a true consultant. So, Roberto, thank you very much indeed,

absolutely fascinating, very deep knowledge there. Folks, everybody go to the website, contribute, take part, and use the tool. Unfortunately, when it comes to the end of the talk, you shall not pass another two minutes. So, Roberto, thank you so much, and you get my virtual round of applause. Thank you, thank you, sir. All right, that's it, take care, bye everybody. Bye.