
Hey everybody, welcome to Proving Ground. A special thanks to Verse Bri Productivity, Tenable, Amazon, and Source of Knowledge for helping us be here. Ladies and gentlemen, hackers and mundanes. Okay, that's better, right? Who here does not appreciate modern plumbing? Nobody? Well, if you don't, I never want to go near your hotel room, because you've never flushed. We're going to get started today listening to the incredibly intelligent Ashley. I signed up to mentor this talk thinking, okay, good, I work in some related stuff, I'll be able to help this person. No, no: it ended up being me going, crap, I've got to write this down and try this stuff myself. So without anything further, Ashley Holtz talks about automation plumbing, because everybody needs to flush their data.
Thank you, I will in fact do a speech. I do automation stuff, primarily, in my job, so I'm going to give a talk about my job, which will be fun for all of you. I'm talking about plumbing. I'm not talking about IR tools and which ones are good and which ones are bad; there are a lot of people who know a lot more than I do about specific artifacts. I don't want to write my own thing most of the time. Sometimes I do, out of necessity, but usually the job is actually to connect the things. So I'm doing a connect-the-things talk, and some ways that we can do that a little bit better.
There are a lot of great tools available already; they just don't connect well out of the box, and I eventually want to see people waste less time running stuff over and over again. So here is my ideal workflow for a SOC. Think about this in terms of your own company, not in terms of me being a consultant going out to 50 different companies. Ideally, you find out that there's something going on on some endpoint, and you figure that out through some user report, through some SIEM, through one of the 500 things that you already purchased in your environment. Who owns 500 things? I certainly do. So you put it in a queue. We have some metadata that's common to all of these things: if there's something going wrong, you probably know the IP address of where it's going wrong, or the hostname, I would hope, because otherwise how are you going to remediate it? You should know some information about the event: the user, what type of alert it was. And it should go in a queue. From the queue, your level-one analysts are usually going to say "this is BS" or "this isn't BS." If it is a false positive, you want to be able to adjust your detection rules, and I have "adjust detection rules" in a couple of different places, because once we learn more, we always want backward propagation, and that doesn't happen in a lot of security settings. Instead, we wait for somebody to give us a new signature or something. Does that work well for anyone, ever? Not really. So if we find new things, we want to learn more about those things. Once we've gotten to the point where this isn't BS, we want to investigate further. Ideally I would have an analyst who can hit a button and say "run the commands that I would probably run." If somebody clicked on a link from a phishing email, I want to say, okay, let's see what outgoing connections are happening, let's see what's running on the box right now, something like netstat, and I want to be able to do that remotely. That's what I mean by triage: metadata plus command output. Commands are the things I would type if I were sitting at that person's box, and metadata is the stuff we can easily query from a built-in API. Linux is very friendly to us; we can usually get most of it from the command line. Windows doesn't always work that way; sometimes we have to drop into C and hook some kind of Windows API. Sometimes, from there, you need
to pivot and collect a little bit more. I want to connect a range of events from an MFT: instead of just knowing that this thing is on the box, I want to know what else hit the box at the same time. I want to see the journal; I want to see a bunch of other things. That's artifacts, forensic artifacts, for the people who do IR in here. Sometimes we want targeted network capture. I put those on the same level because it's either pivoting into more information about network comms in general, like, okay, we know this bad IP address is being called out to from this box, let's see if anybody else is calling out there, or it's getting specific files from this specific host based on some bad thing that I found. And from there you might want even more targeted files: let me get the actual piece of malware. There aren't a lot of endpoint agents that let you just willy-nilly pull any file you feel like pulling, and they shouldn't, because that's not very safe, and you can debate that with me over a beer later. Once you have all that information, the last piece, and the most important piece, at least for me, is standardizing and aggregating: being able to put it all into a central location. Because I don't want to say "here are the outputs from 50 commands, you need to learn how to look at these 50 different things." No. It should all be in one place, it should be really easy to query, it should be so simple that you hit a button and you find evil. Laugh at my joke there. No, there's no find-evil button, come on. Nothing can replace a good analyst, and that's another reason why I call this the plumbing piece of the automation: this is not analysis automation, because otherwise I wouldn't have a job, if we could automate the analysis away. Sometimes you just want to parse something. Linux web servers give me nice logs; there is a nice way to parse those, and many established open source and commercial products can do it. Sometimes we need to process things: something that's stored as binary, something that's stored as a decently well-documented artifact. We know a lot about prefetch files, but if you just open one up, it's "what the heck is this?" And then there are specific files. Do we want to store malware executables on a plain file server? No. The malware analyst says yes, but I usually want to obfuscate those and encode them somehow, store them as blobs, who knows. But I want to have a place to store that stuff.
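As one sketch of that "obfuscate and encode before storing" idea, here is a toy scheme of my own, not the speaker's released code: XOR the bytes with an arbitrary key, base64 the result, and keep a JSON sidecar with the original hash so the sample is findable later. The key and file-naming convention are assumptions purely for illustration.

```python
import base64
import hashlib
import json
from pathlib import Path

XOR_KEY = 0x55  # arbitrary single-byte "defang" key; any non-zero byte works


def store_sample(raw: bytes, dest_dir: Path) -> Path:
    """Store a suspected-malware sample as a defanged, base64-encoded blob
    plus a JSON sidecar with the original hash, so AV on the file server
    will not eat it and nobody double-clicks it by accident."""
    sha256 = hashlib.sha256(raw).hexdigest()
    defanged = bytes(b ^ XOR_KEY for b in raw)
    dest_dir.mkdir(parents=True, exist_ok=True)
    blob_path = dest_dir / f"{sha256}.blob"
    blob_path.write_bytes(base64.b64encode(defanged))
    (dest_dir / f"{sha256}.json").write_text(json.dumps(
        {"sha256": sha256, "size": len(raw), "encoding": "xor55+base64"}))
    return blob_path


def load_sample(blob_path: Path) -> bytes:
    """Reverse the encoding to recover the original bytes for analysis."""
    defanged = base64.b64decode(blob_path.read_bytes())
    return bytes(b ^ XOR_KEY for b in defanged)
```

In practice many shops use password-protected archives instead; the point is only that the blob on disk is no longer directly executable and carries its metadata with it.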
So: I figured out there's something going on on this box, it's weird, I get more information, and the types of information I get are aggregated in some sane way. Then, if you want to write your own tool as an analyst, you can say, okay, I can hook into this, I understand the output, I can do more with it. I can write rules to look for evil things that I know about, or I can just download the results from specific tools. There are a lot of tools that do this, or you can make your own. So I'm giving you a make-or-buy decision here, for the one business major in the room. "Buy" doesn't mean you're shelling out cash for this; buy means somebody else made this thing and you're just using it, a well-documented open source project that's been community vetted. Why don't I say "just find some code on the internet"? No. I'm saying a well-documented open source project; that's why I call it something you can "buy": it's already been vetted, it's relatively well used. The other piece is that you can make these. If you have an in-house programmer like me, you can say, "Ashley, go code the things." But not all of us have that, and not all of us have somebody who has time to do all of that.
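To sketch what "I can hook into this and write rules to look for evil things" might look like against standardized JSON-lines output, here is a minimal example of mine; the field names (`md5`, `remote_ip`) and the indicator values are hypothetical, not from any specific tool:

```python
import json
from typing import Iterable, Iterator

# Hypothetical indicators an analyst already knows are bad.
BAD_HASHES = {"d41d8cd98f00b204e9800998ecf8427e"}
BAD_IPS = {"203.0.113.66"}


def find_evil(jsonl_records: Iterable[str]) -> Iterator[dict]:
    """Yield every record whose hash or remote IP matches a known-bad
    indicator. This works against any tool's output, as long as that
    output is sane, parsable JSON with agreed-on field names."""
    for line in jsonl_records:
        rec = json.loads(line)
        if rec.get("md5") in BAD_HASHES or rec.get("remote_ip") in BAD_IPS:
            yield rec
```

The rule itself is trivial; the point is that standardizing the output is what makes a ten-line rule possible at all.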
There are ways to do a lot of these things, but for that last piece, artifact analysis automation and some of the processing around it, there isn't one thing that will run everybody else's tools, there isn't one thing that processes all the things, and we don't necessarily want one. One tool to rule them all is not the right way to do this, because one creator is not going to know the best way to process everything. So, for processing automation, after we've gotten artifacts out and gotten all the metadata: the most automatic, on-demand, fast way would be containerized, clusterized processing. Lots of machines working in parallel, super simple, maybe all running the same process, with a queue that feeds them. That would be so nice to have, and so usable, but we don't usually have it. A lot of companies are actually way down at the other end: they've got a local VM, you're on your MacBook, you spin up your SANS SIFT, and you run all the tools that you spent a week learning how to use, manually. That's great if you want to bill lots of hours, but if you want to provide a product that's useful, wouldn't it be great if you could get the results out nearly instantaneously and spend those six hours doing analysis instead of running tools? I think that's a better way to do it. Some of the for-profit distributions, I'll call them for-profit, X-Ways, EnCase, have processing for some artifacts built in, and you can extend those as well. I don't consider that heavy automation, and I don't put it very high on the scale, because again it's one creator and you still have to plug things into it. So I'm going to show you a couple of plumbing solutions that I use, and I'm very open to healthy debate, but I do have a lot of material to get through, so
I'll ask you to save questions, comments, and concerns until the end. CrowdStrike has CrowdResponse, a freeware tool. I'm not trying to sell you anything; I don't care if you use it or not, but I like it. We also have Python scripts that we use internally, and APIs that we use to insert stuff into databases. I'm going to show you which step of this workflow each of these things lives on, and I'm also going to release these slides along with that GitHub repo from the first slide, so you can mess around with this as well. Some of this is super hacky. This is super, super hacky: many collectors, including CrowdResponse, have a place for a configuration. You can embed a configuration in CrowdResponse, and the way to automate that is to use Resource Hacker and replace the resource where that config lives with your own configuration. I got a cringe, yeah. Super hacky. Before we deploy CrowdResponse: it runs a lot of things that you can get through the Windows API, and it outputs things that you would get from running commands on that machine, just a bunch of them all in one distribution, and it's a fairly small binary, which is why I like it. You can write a text configuration and embed it there. So this would be the metadata-triage step, the first thing that I would deploy. It's not a get-all-the-things, it's not a full triage image; that would take forever, be giant, and have to move over the network, and we don't want to deal with that quite yet. This is just a little easy tip. You do have to be very careful to read the EULA of the binary that you're modifying. Do not just open up stuff in Resource Hacker, replace things, and deploy it in your environment. Read the EULA. None of you are going to read the EULA; that's okay. The next piece of this is image processing. If you have a hard disk
image, which we occasionally get, or if you're writing something that's going to run on the machine and actually pull out forensic artifacts, as opposed to stuff where you'd run a command to get the information: I wrote a quick, dirty extractor to demo how to do this. There are a lot of things that can extract artifacts; I'm not saying this is the best way to do it, it's something to demonstrate a point. And my point is not that you can get the metadata, get artifacts, and get individual files. My point is that I want sane, parsable JSON output from every piece of the stack. How many of you have encountered a really great forensics tool that does not give you sane, parsable output, so you have to write something else to process the thing? Thank you for writing the tool, that's really solid of you, and giving back to the community is great, but let me use it. XML is fine too, but I really like JSON. A lot of our automation is in Python, and JSON converts easily into a Python dictionary; that's why I choose JSON. Really, sane, parsable output that everything can process from start to finish is so nice, and it makes my life so easy. A lot of the automation I end up coding is "how do I make this thing able to be inserted into another thing," and I spend a lot of time doing that, and I don't want you to have to. So the little collector that I wrote basically puts all the files in a single folder and then writes this JSON manifest. This is probably quite small print, but you can see this rule field: evtx, a new-style event log. The rule that triggered a file's collection tells me the kind of thing it is; I don't have to read the first couple lines of the file to see what it is, which is nice. We have a hash of it, an A-time, a C-time, so MAC times, sizes, and which data stream it was, because sometimes you have alternate data streams, and if I'm going to write a parser to go against this, I want to parse the right data stream. It makes my life a lot easier to have that. So if you don't have sane, parsable output, please code sane, parsable output into your stuff and save somebody like me a lot of time. The other piece of this is processing automation, in terms of using other people's stuff to insert your things into some big-data solution. I like the ELK stack; a lot of people use the ELK stack, and Logstash is fairly friendly to most data types if you can write a good Logstash configuration for it. But the main thing here is: if other people's stuff can't pick up your output easily, you're doing something wrong with your output. This is part of the standardizing-and-aggregating and the processing-and-parsing pieces. Usually our tools for parsing pieces of our images, or parsing specific artifacts, come in two flavors. There's "here's an executable, it's freeware like CrowdResponse, we're not giving you the source code, but you can run it," or there's a Python library. So how do I make everybody else's Python library runnable in a standardized way? Well, here's the answer, at least the one that I came up with:
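A minimal reconstruction of that runner pattern, as I understand it from the talk; the class and method names below follow the speaker's description (an `__init__` taking an input and an output, a `run`, and a `quit`), while the toy `WordCountRunner` subclass is my own stand-in for wrapping somebody else's parsing library:

```python
import json
from pathlib import Path


class Runner:
    """Interface-style base class. Python has no straight-up interface
    keyword, so every tool wrapper simply inherits this and fills in run()."""

    def __init__(self, input_path: str, output_path: str):
        self.input_path = Path(input_path)
        self.output_path = Path(output_path)

    def run(self) -> None:
        raise NotImplementedError  # each wrapper calls its library here

    def quit(self) -> None:
        pass  # hook for cleanup: temp files, handles, child processes


class WordCountRunner(Runner):
    """Stand-in for a real wrapper: reads the input and writes sane,
    parsable JSON to the output, like any other runner would."""

    def run(self) -> None:
        text = self.input_path.read_text()
        result = {"input": str(self.input_path), "words": len(text.split())}
        self.output_path.write_text(json.dumps(result))
```

Every tool then runs the same way, `r = SomeRunner(in_path, out_path); r.run(); r.quit()`, which is the whole point: the orchestration code never has to know which library is underneath.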
just a basic interface type of object. Python doesn't necessarily have a straight-up interface class, so I create a runner class, and you can inherit that runner class. I have a very simple __init__: you take in an input and an output, and then run() calls the actual Python library. This is more of an automation piece, running other people's code. The reason behind it is that I don't want 50 different ways of running things. Take event logs: there are about 20 different tools that parse those. Do you think they're all "somebody_elses_tool.run()" to process? No, they're all completely different. But if you can standardize it this way, or modify the source code and contribute it back upstream to fit this paradigm, it's going to make it a lot easier for other people to run it and integrate it. Just a simple, simple class: I have a run method and I have a quit method. Elasticsearch and Splunk plumbing: how many of you use Elasticsearch? How many of you use Splunk? And how many of you don't want to talk about this? So, I have a couple of treats, just for BSides and everybody else who's going to watch the video, I suppose. I did create a Logstash configuration for CrowdResponse, for the output after you run the conversion tool; it
takes in the CSV. I also released a configuration for CrowdResponse: remember that Resource Hacker bit where you can embed a configuration? Here is the configuration that you embed. Of course, we're going to test this first before we implement it in a production environment, yes? We all look like trustworthy people. None of you do. The latest CrowdResponse is on CrowdStrike's community tools page; freeware, have fun. For Logstash: don't try to use the Elasticsearch API to convert your data and insert it yourself. I made this mistake fairly early when I started doing automation engineering. Don't try to use an API that just sends events through a socket; that's going to be much slower than using tools that are created for listening and throwing events in, like Logstash, which can have multiple workers going at a time, instead of everything going over one socket that might just randomly close. Don't try to reinvent the wheel: if there's a thing that can ingest data, twist your data to fit the thing that will ingest it. I know this sounds completely logical; I spent a long time essentially remaking Logstash. Also, with Logstash, if you specify column names and mutations per data type, I usually do it based off of
the file name, it's going to make your life a lot easier, and it actually processes a lot quicker in the experiments that I've done. And now for something really hacky. This is my gift to you. My mentor Kyle asked for a graph; here's a graph. Man, bar chart. There is a Splunk SDK for Python, it's streaming sockets, it works pretty well, and it's well documented. But I found a much faster way to download results from Splunk. One of the automation steps I get asked to do a lot is "download stuff from Splunk based off of this query and then do other processing on top of it," processing that would be slower and inefficient inside Splunk. So my preference is this direct-download approach. The code is on my GitHub; I'm not going to walk through it, because I don't think you want to sit here for 20 minutes and watch me walk through code, right? When I tested it, it's much faster. If you want to test it and give me your test results, I would appreciate seeing those. I also included the benchmarking Python file, so you can run the benchmark yourself and say, "Ashley, you're full of crap," or, "yeah, that's true." So please have fun with it. The last thing that I really want to say is about the automation plumbing: a
lot of people don't think about this when they go into incident response or forensics. They say, "we're going to have a bunch of tools and we're going to train our people to run those tools really well and interpret the output of these 50 different things." That's great for helping people understand what the artifacts are and how they work, but after you've been doing it for a couple of years, hitting go on 50 different tools is very irritating. So the combination of writing Python wrappers around other people's Python libraries, which I have an example of in the repo, and creating output that's usable in other places, like that JSON output, will really make your life a lot easier in terms of automating the analysis steps that do lend themselves to automation, like statistics. The other thing, too: if you're putting everything in a standardized place and you're doing hunting, that back-propagation works. If I find a new evil thing today, I also have a bunch of data from previous runs, so I can go back into the old data and see whether it existed elsewhere. Maybe we have more to remediate than we thought we did. And you can see, oh, that's actually been here for two years, we're just seeing it
now, what the f. Standardizing in general will speed up processing, but really, for me as the programmer in a room full of consultants most of the time at work, I want things to be maintainable. These guys love to create scripts, and I like it when people create scripts, I like it when people contribute things to the open source community, that's very important to me, but make it so that somebody else can reuse it or plug it into their project too. Sane output should connect all of the processing steps, and there are a lot of community-vetted tools and workflows: there's Autopsy, there's Plaso (log2timeline), there are so many meta-projects, I'll call them, that can do a lot of forensic things. To make use of them most effectively, along with those little one-off tools that are really good individually, you have to write a few plumbing steps. So hopefully that answers what automation plumbing is, and hopefully now you have a better understanding and appreciation of modern plumbing. This is where you laugh at my joke again. Yeah, it's a good joke. And hopefully some of the code that I've released will be somewhat useful to you. I'm happy to talk about this all day, this is my job, so if you've messed around with any of this stuff and you'd like to tell me better ways to do some of it, I'd absolutely love that. This is my Twitter handle, I'm the code monkey, and I have my GitHub repo. And I'd like to say thanks to Kyle Maxwell. He was my mentor for this, and I would not have had the confidence to come up and talk to you today without the mentor program and Kyle. So if you see this guy, give him a high five or buy him a beer.
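For reference, the direct-download idea can be sketched against Splunk's streaming REST export endpoint (/services/search/jobs/export), which streams results back as they are produced instead of paging them out of a finished search job. This is my own sketch, not the code from the speaker's GitHub; the host, credentials, and query below are placeholders.

```python
import base64
import urllib.parse
import urllib.request


def build_export_request(base_url, query, username, password, output_mode="csv"):
    """Build a POST request against Splunk's streaming export endpoint.
    Results stream back in the chosen output_mode (csv, json, xml)."""
    if not query.lstrip().startswith(("search", "|")):
        query = "search " + query  # the endpoint expects a full SPL string
    body = urllib.parse.urlencode({"search": query, "output_mode": output_mode})
    req = urllib.request.Request(
        base_url.rstrip("/") + "/services/search/jobs/export",
        data=body.encode(),
        method="POST",
    )
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    return req


def download_results(req, chunk_size=1 << 16):
    """Stream the response a chunk at a time (this hits the live server)."""
    with urllib.request.urlopen(req) as resp:
        while chunk := resp.read(chunk_size):
            yield chunk
```

Usage would look like `build_export_request("https://splunk.example.com:8089", "index=main error", "admin", "changeme")`, with the management-port URL and credentials substituted for your environment.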
Yes, questions? [Audience member] Yeah, I know Intel announced an initiative where they wanted to get together with other security vendors and have people start writing to common APIs, to orchestrate better. Do you know about that, and can you comment on where that progress is going, or what your take on it is? [Ashley] I'm not involved with it, but I like that people are talking about it. Let's go all the way back through the slides, to where you have all the things: this alert aggregation. I would really like somebody to write a nice, open source, out-of-the-box, ready-to-use aggregator and queuer or assigner; that would be really nice to see, if I were to pick where that effort should go. And then these little glue pieces between different forensic tools would be really cool to see. [Audience member] Most of what it's going to do is take existing vendors and try to make them work together, and not necessarily go open source. [Ashley] Yeah, and that probably wouldn't be my favorite approach. I like to give things away open source style; I like people to criticize what I put out. And I do think forensics in general lends itself to open source examination, because just because I've observed something 20 times, maybe you've observed it one more time, and it looks different and acts differently from how I've interpreted it. So I don't like saying that it should be one closed-source opinion. But yeah, that's super cool. [Audience member] In your workflow steps, I don't think that the more-data step, or the feedback loop, has to mean that folks like us lose our jobs, but I'm interested to know: have you experimented with automating this feedback loop to improve the tool, based on whatever data you're getting back about "this is real, this is fake"? [Ashley] Yes. What we actually have is usually just feedback from an analyst: they'll say, "yeah, this makes sense, this actually was bad," and that gets fed back into a hash table or something like that. If I had my own SOC, and I'm in consulting, so I can't really go and collect random things from a bunch of client environments for fun, but if I had my own SOC where I was writing all of this individually, I'd love to build automation around it: once the analyst has checked, on this workflow step, that this is bad, and this particular line of output is where we found the bad thing, because it's standardized, perhaps we could take a path pattern, or all the information associated with it, and put it into the "bad" area. But I haven't been able to experiment with that en masse, just informally, a little bit. [Host] Do we still have any questions? Anybody? Three, two... no? Okay, well, thank you, Ashley. Thank you, guys.