Using Math to Speed Up Security Assessments of Windows Executables

Name: Using Math to Speed Up Security Assessments of Windows Executables
Uploaded: 2018-04-25
Duration: 31 min 14 s

BSidesSF · 2018 · 31:14 · 133 views · Published 2018-04

Speakers

Cole Thompson

Tags

Category: Technical

Style: Demo Talk

Mentioned in this talk

Tools used

IDA Pro, Immunity Debugger, strings

About this talk

Cole Thompson - Using Ancient Math to Speed Up Security Assessments of Windows Executables This is about greatly speeding up risk reduction when evaluating Windows programs. Reverse engineering binary programs tends to go one of two ways: either a quick static analysis using utilities like "strings", or a time consuming dive down the rabbit hole monitoring API calls or debugging with tools like IDA Pro. The payoff from reverse engineering can be great, *if* one picks the right targets. Over several years performing assessments in a highly regulated environment, often under pressure, it became imperative to quickly triage Windows programs to decide which are worth the effort. I found no tool to do this triage, so I gradually developed one. Eventually I settled on applying the math of Euclidean Distance and Bayes Theorem to static metadata taken from Windows executables. This can identify within seconds which executable (out of dozens or hundreds) to focus on. That triage used to take hours or days. I will demonstrate the tool, give a couple of success stories (anonymized by necessity) and explain the learnings from its evolution. The underlying approach can be applied by individuals with slim resources to many areas of security analysis.


Okay, great. Well, thanks very much for coming. Although the title here says "speeding up Windows security assessments with ancient math," the reason I'm here is that the algorithm I developed, and the tool, I think can be useful for a lot of different things. Typically when I go to a security conference I hope for two things: one is a good idea I can use back at work, and the second is maybe a tool or two that I can also use back at work. I'm pretty sure I'll be able to give you both of those here today. So what I had in mind for the next 30 minutes or so was to step through

about four things. First is the problem, or what got me going down this path. You know, we say in open source circles, "what's the itch you're trying to scratch?" That's really what I want to address briefly. Then I wanted to show some intuitive examples of the solution at work. Rather than dwelling on the math, because that could be a little tedious, I wanted to show more intuitive examples of the sort of results this will get you, and you can decide for yourself if it's something you could use. Then of course a quick demo, and follow up with future plans. Then one quick disclaimer regarding my employer. So as it says here,

I work for Kaiser Permanente in a cybersecurity capacity, but for the purposes of this talk that's about as far as that connection goes. Kaiser just wanted me to make clear that what I will be talking about is not at all an official Kaiser algorithm or tool or anything like that. It's kind of the opposite, really, so just keep that in mind. With that out of the way, let me talk about the problem that got me working on this and brought me here today. In the security work that my colleagues and I do, we do a lot of things; we wear a lot of different hats. One of the more satisfying things we do,

for me anyway, is when we're dealing with compiled binaries, especially Windows programs. There are a couple of reasons why I find that particularly satisfying. One is that it's difficult, and anytime something is very difficult and you're occasionally successful at it, that's always very rewarding. The other reason is that it has a huge potential impact. In a place the size of Kaiser, for example, we have about a hundred eighty thousand employees, so when you're dealing with things like client programs that are being pushed out there, if there's some security concern in one of these things, the potential impact can be pretty big. And honestly it's sort of

gratifying to demo, in that it tends to be one of those jaw-dropping things when you say, "look at this, this is what can go wrong." So working with compiled binaries is great, one of my favorite things. But, and this is a huge "but" I'm putting in here, doing this sort of work is immensely time-consuming. That's not just my own perspective; people who are also good at this will tell you that when you're doing things like reverse engineering, or, heaven help you, searching for an exploit of some kind, we're talking really something in the realm of days, even weeks, as you're getting deep into a program trying to find what it is

you're trying to discover, or get it to pop, or something like that. So it is a huge, huge time-suck. Just, with a slight nod to the whole steampunk thing here, Lewis Carroll. We're not a security research shop, and so, as the guy is saying here, "I ain't got time for this." We're typically under a lot of time pressure, and I would say our workflow, honestly, is as if you have a fire hose that they turn off and on every once in a while. We do get some lulls, but by and large the stuff is coming at us

pretty fast, so we have to be pretty smart about the work that we take on, and that includes reverse engineering and exploit hunting. And, the last cheesy slide, I promise: honestly, for me, doing this sort of work I often felt a little bit like our pensive Alice here, where I'd be wondering, okay, I know how long this is probably going to take, and is it really worth firing up IDA Pro or Immunity or something and going down the rabbit hole yet again to see if I can get something that's worth my time, some sort of security finding, maybe an exploit, something like that. And the more that I've been doing this work over time, the

more I have this growing conviction that there's a problem, and that problem is there's a tools gap. Typically, when we're dealing with compiled programs, no source code, black box, the tools available to us fall into one of two extremes. We have what I would call "strings and friends" on the left. It's very quick; in seconds, minutes at the most, you can get some information. If you're really lucky, perhaps you'll see something like a connect string, or maybe something that stands out as a likely credential, and then you're done, you've won. But I will assert that most of

the time these quick static analysis tools will miss what you can get out of a compiled program. So when you're done with that, what's left? Well, then generally there's this huge leap where we're getting into the world of serious debugging, and that's where we're firing up IDA Pro, Immunity, what have you. Instead of things taking seconds or minutes, now the work ahead of you is going to demand hours and days and even weeks. I've heard of some guys taking months to get to the bottom of an exploit. You miss nothing, but there's this huge time gap, and over time I became convinced of, or really was craving,

something to fill this tools gap. To me it felt like what was needed here was what I would call a triage tool. What this tool would do, if I could find it, would ultimately be to predict the behavior of a Windows program, which is a tall order, but really we could boil it down to two things. Predicting the capabilities of a program, that would be great. And also, if we could say, in a sort of word cloud sort of way, "this program you're looking at looks awfully similar to these other programs you may already know about," or perhaps at least we have a catalogue of

programs and we can say "here's where it belongs." Because if you can predict capabilities and also say "here are the peer group binaries that this thing seems to belong to," that's pretty much predicting behavior, and that would be great. If we could predict behavior, then we'd have the triage, where I could say, okay, this thing is worth it, it's worth climbing the mountain, or going down the rabbit hole if you want to use that analogy. Or, you know, based on our predicted behavior it's probably not worth it, so we'll skip it. So I knew what I was looking for, looked around, couldn't find it. There are some nice intriguing things, such as

Mandiant's imphash; the Japanese CERT center also has some interesting tools that kind of came close but didn't quite work for me. I had a couple of catalysts in my thinking. The first one was becoming familiar with radare2, an open source debugger tool. For those who may not have worked with radare2, it's a nice little tool. It's a rewrite, and unlike most debuggers it embraces the command line interface. I would say the thing that really sold me on radare2 is that it embraces the UNIX philosophy of standard in, standard out, standard error. Once you get used to working with that, it's really addictive, because then,

in a sort of looping or shell type way, you can pipe things to make it do whatever you want, as opposed to most code analysis tools, which tend to be monolithic: here it is, learn it or go home, kind of thing. So I liked radare2, and that made me think that if I were to do this myself, I would want it to have a radare2 type of feel to it. Now the second catalyst in my thinking, and what really got me working on this: I was reading a fascinating story from the Cold War about how a KGB agent, Yuri Totrov, was able to crack the

identities of CIA agents on a massive scale. What Totrov did was determine there were about 26 metrics, or bits of publicly available information, about all the US employees overseas in foreign embassies. Looking at these 26 metrics, and using a little deductive reasoning on Totrov's part, he was able to figure out quite accurately who were actually State Department people and who were probably actually CIA. It's an interesting story because of how well it worked. Once Totrov tagged a guy as being, okay, this is actually a CIA person masquerading as State, then the KGB could follow him for years. And this

became a huge problem for the CIA. They basically tore themselves apart trying to find what they thought was some highly placed mole feeding authoritative lists of all their agents back to Moscow. It wasn't until after the collapse of the Soviet Union that the truth came out, which was that there had been no high-level mole, not for this anyway. It was just that Totrov's technique worked so well it was as if he had access to an authoritative list of all the agents overseas. It was an impressive story, and it got me thinking: if you can use metadata in this way to actually suss out who's a secret agent, even though they're trying

to hide, maybe something like this could be done with Windows programs to predict their behavior. I thought, let's see if we can do this. I wasn't sure, but I thought it was worth a try. At this point I decided to start hacking away on things just to meet my own needs. The idea I had was twofold, sort of in the spirit of Totrov. I said, let's try to push static analysis of Windows programs to the limit, and by that I mean squeezing all the metadata that's at all useful out of a program that we

can, just slicing it forwards and backwards, and then push math to the limit to say what can we do with this metadata. Can we then predict capabilities, and also say "here are the programs that it seems to belong with, it's going to behave like these other things you know about"? Then we can say we're predicting behavior without even having to run the program. So I started work. Getting the metadata wasn't too hard. I tried a lot of things and figured out what was useful and what was not; it's about 20 things that were most useful. I had an interesting time applying my own algorithms, which was kind of fun in a geeky sort of way, and they worked

decently well, but I ultimately had to put my ego aside a little bit when I realized that there was a much better way to do this. That better way is a concept called Euclidean distance. Everyone's probably heard of the Pythagorean theorem; Euclidean distance is basically the superset of all that, extended to any number of measurements, and the Pythagorean theorem is just one example. What really made this click, and my own twist to all this, was figuring out that in this particular case adding logarithms to the mix made things click into place. That's what I'm most proud of, in a way. And I'll just

quickly mention here, if anyone is wondering why Euclidean distance is so elegant, or what the attraction is, I'll just refer you to this website: betterexplained.com does a beautiful job explaining why it's so elegant and a sort of universal-truth kind of way of doing things. Let's have a couple of examples here so you can see what Euclidean distance does, again emphasizing just an intuitive understanding, so you can see if it's the kind of thing that might be useful to you. Here's the classic approach, 2,000-year-old math, and we're just going to look at a group of countries, because

everyone can kind of relate to that. Here we're going to ask which countries are most similar to Italy, and the classic approach says, well, Spain, and that seems about right. We notice that India and China are extreme outliers; they're way different, apparently. Okay, here's another example, again with the 2,000-year-old formula. We're going to ask which countries are the most similar to the Netherlands. When I say most similar, what's going on here is we're using demographics, culture, climate, a whole bunch of different things, just a very fuzzy sort of match. Now here is where I had to stop and say I don't think the classic approach is quite right, because

according to the classic math it's saying, well, Mali in West Africa is the most similar to the Netherlands, and to me that doesn't seem right at all. Again we see India and China are just extremely different. One more quick one here. Same thing, classic approach, and you might notice that the index here is enormous, one and a half billion. Again we're seeing that India and China are extreme outliers. So what's going on here? Well, unfortunately, although the classic Euclidean approach is excellent in theory and in a classroom setting, when you're measuring things like countries or secret agents or compiled Windows programs, each element of that thing is measured very differently. So you have things like a

percentage number, and then you have a population number, and it turns out with countries, as you may have guessed, population numbers just tend to be big numbers, and that blows everything up. That had me a little bummed out, because the proof for Euclidean distance is very neat and I wanted to preserve that sort of universal truth in whatever I was doing. So I tried a lot of things, and to make a long story short, rather than going over all the things I tried and didn't like, it turned out, as is so often the case, that one of the simplest things worked by far the best, and that was just to say: take

the logarithm of each raw measurement and then feed it into Euclidean distance. As people may know, with logarithms everything tends to crunch down to smaller numbers; everything basically winds up between 0 and 10 for most purposes, and that keeps it all at human scale while still preserving good accuracy. It worked great. I tried it on a lot of things: I tried it on some example programs, I tried it comparing cars, I tried it comparing countries. It has this sort of universal "it works" quality to it, and I thought, this is great. I was surprised, when I was googling around, that there didn't seem to be much of an

exact match for this. In fact I only found one clear example of where other people seem to be using this: there were a couple of French guys in 2006 who had a math paper, and they were using what looked to be the same approach, but that was about it that I could find, anyway. That's another reason that brought me here today. I thought, I've got to share this; this works so well on so many things. Again, I don't want to dwell on the calculations; there are links at the end that people can follow to their heart's content. But for people who want to see the exact formula, here you go. I'm going to call it BSidesSF distance,

because I came up with it in a clean room setting, so I can call it whatever I want, I guess. It's basically the Euclidean distance formula, except that for each thing you feed into it, x1 through x-whatever, you just take the log first. That's all it is, and it works great. To give just some quick, again intuitive, examples of why I like this so much, let's revisit our country comparison. This is again the fuzzy, all-aspects comparison of different countries. We're going to ask again, with our updated, what I call BSidesSF, distance, which countries are the most similar to Italy. Ask that question again: well, here, Spain again, that's right. Now

what I like here, and there are many examples but this is just a couple of them, is you can see that the developed countries kind of form one ledge, and then Norway starts getting a little weird and cold and everything, and then you get off to the less developed countries, and then Mali is apparently quite different. I would say intuitively that's about right. Let's look at the Netherlands again, this time with the updated formula. This really was reassuring to me, one of many dozens of examples like it: Mali now, instead of being the most similar to the Netherlands, is considered the most different. And so yes, that's, you

know, what we want to see. We see that Germany is considered the most similar, which feels about right, followed closely by Norway and so on. Then we have a group of the developed countries, and there's a step change as we get away from those into less developed countries. Anyway, you get the idea: it's doing what you would expect if a person was telling you how these countries compare. And just for what it's worth, here's Egypt, and again things have settled down a lot. China and India are spread out much more evenly, based on what they really are, not just their populations. So with that I had

enough to start making my own life easier, and I started working on this tool, which has a GitHub repo I'll show in a minute. I started in Python, because Python is always a good place to start. I switched to Go after a while, because Go has really nice abilities for parallel processing, and I realized that this thing was pretty computation intensive. I wanted to be able ultimately to spread the workload out over all the cores available. To be honest, I haven't tapped that yet, but that's coming, and Go is fast enough that it actually has let me postpone that a bit. But Go was great; I highly recommend it. It's been

good too because the work that I get to do on this comes in fits and starts; most of the time there's something else I have to do, but Go works well for when you have to come back to your code base. You also need a place to store metadata; SQLite is awesome for that. The tool is designed to be drop-dead simple to use; again, we're talking about a workflow where we're in a constant time crush. Basically, the more you use it, the better the metadata becomes. And here's what I would say is a rough histogram of what the functionality of the actual tool does for me today, myself

and my colleagues. I call it wru, not because I have any passion about that, but I was thinking in terms of typing at the command line "what are you, file?" and having it come back and tell me. Most of what I like about it is that it tells you, in a word cloud sort of way, what this given program most seems to resemble and how close that resemblance is. Then we can also see what functionality this given program seems to have, or what we can be sure it does. It also gives a SWAG assessment, which is good if you're farming out work to junior staff; you can say, well, just look

at the ones that have a big attack surface. Finally, the other thing it does, which I put in a while back, is use Bayes' theorem, because there are times when a given executable just isn't yielding a lot of good metadata. In those cases Bayes' theorem is great, because it says: given the metadata we have on this file, and the metadata we have on all the other files we have ever seen, we can predict, at least with some level of confidence, what this new program is going to do. I'll try to show an example of where that saved my bacon on one piece of work. So with that, let me try and switch to a

quick demo of the tool, now that we've gone over the underlying algorithm. See if we can flip over. Okay, good. Can people see that okay? I hope good enough. Let's see if we can make it bigger. Yeah, let's make this bigger.
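Editor's sketch: the log-scaled distance described above can be written as d(x, y) = sqrt(sum over i of (log x_i - log y_i)^2). The Python below is a minimal illustration under stated assumptions, not the speaker's Go implementation. The base-10 logarithm, the "1 +" offset (so that zero counts do not break the log), the feature layout, and every file name and number are hypothetical choices made only for this example.

```python
import math

def raw_distance(a, b):
    """Classic Euclidean distance on the raw measurements."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def log_distance(a, b):
    """Euclidean distance after taking a log of each measurement.

    Assumption: log10(1 + value) is used so that a count of 0 is allowed.
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(
        sum((math.log10(1 + x) - math.log10(1 + y)) ** 2 for x, y in zip(a, b))
    )

def nearest_peers(target, catalog, k=3, dist=log_distance):
    """Return the k catalog entries closest to target, nearest first."""
    ranked = sorted(catalog.items(), key=lambda item: dist(target, item[1]))
    return [(name, round(dist(target, vec), 3)) for name, vec in ranked[:k]]

# Hypothetical metadata per file:
# (size in bytes, imported functions, networking imports, sections)
catalog = {
    "compiler_a.exe": (900_000, 140, 0, 9),
    "compiler_b.exe": (1_100_000, 150, 0, 10),
    "net_client.exe": (1_000_000, 60, 30, 5),
    "tiny_util.exe": (40_000, 20, 0, 4),
}
unknown = (1_000_500, 145, 0, 9)

# Raw distance is dominated by file size; log distance weighs all features.
print("raw:", nearest_peers(unknown, catalog, k=2, dist=raw_distance))
print("log:", nearest_peers(unknown, catalog, k=2, dist=log_distance))
```

With these made-up numbers, the raw distance ranks the networking client first simply because its file size is nearly identical, while the log-scaled distance ranks the two compilers first. This is the same effect as population swamping every other measurement in the country comparison.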

Let's... come on. Still nothing. There we go. Yep, all right, that should work. There's always something with the demo. Let me put down the mic for one second. Okay, sorry about that. So here, for a lot of reasons involving NDAs, and I don't want to get in trouble, we have four files from what we'll call Acme Medical. These are real files from a vendor, and not Acme Medical, obviously. What I typically like to do is loop over things just to quickly get some feedback and then be able to decide which of these things is worth pursuing and which is not. So in this case let's

suppose that I've decided we really only have time to dig into maybe one file, and we're particularly concerned about networking. So we'll just run it, and pretty quickly we'll have some feedback, so instead of having to go into every single file I can say, okay, which one am I going to pick. Here we're getting our Bayesian estimates, 30% on one, and so on. Okay, and there we are. So now, instead of having to spend a lot of time digging into each file, I can say, okay, I'm going to go with 403, because it has the highest Bayesian probability of doing some networking, and we know that it absolutely does: thirty networking functions. The others we're not sure about.

So, you know, I've saved tons of hours right there, or at least quite a few hours. Now let me do one thing that I think will show the word cloud type functioning, which may be a good way of seeing what kind of work it does. We have the Cygwin utilities, about 380 of them; that's basically UNIX utilities running on Windows. I'll show you that really quick. Just to give you a sense of the kind of matching it does, let's suppose that we had no idea what GCC does. Now, we know that that's the GNU compiler, but for now we'll just say we have

no idea what GCC is. So it runs, and it will tell us that, okay, this GCC thing, whatever it is, is very similar to a C++ compiler; the distance is very close. It's also pretty similar to some code coverage tools, and that's exactly right. So without knowing anything about what's really going on in the assembly code within GCC, we can say it looks like it's a compiler, or it certainly has something to do with code generation. If we wanted to get a little bit more, we could bump up the number of peers, and we get even more, and it also confirms that sort of code stuff. Oh, one other thing too: if you run the

tool without any options, it gives you tons of feedback on how to use it. It's painful looking right now when you do that, but anyway, it will tell you how to use it. Let me back up one; I wanted to also show the Bayes thing, because this is actually quite handy in some situations. Let me clear this. Okay, so here we have a mysterious file; we'll say it's mysterious Acme viewer, and again this is an actual vendor file, just renamed because I don't want to get in trouble. Now, initially what this tool will tell us is, well, it doesn't appear to have any networking. Let's suppose we're

concerned about networking in this context. However, back up, and if we add Bayes to the mix (sorry, typing one-handed here), it'll take just a little longer, because the Bayes stuff is a little math. But here we go. So now it's saying, okay, we didn't initially see any networking going on in this file, but according to Bayes and all the other files we've ever seen, the probability of this thing doing networking is about 38%, and that's actually a lot higher than most files. And it turns out that indeed, yeah, there is something to it. I'm going to cheat a little bit here, because this is a real vendor file and I know what's

going on; it just happens to have some stuff in it that will reveal... yeah, and there we go. So although it wasn't immediately apparent, Bayes did tip me off that this thing probably was doing some networking, and indeed, buried deep within it, there was some stuff. The strings just happened to work out that way, but certainly, first looking at the assembly and everything, it wasn't quite that obvious. So Bayes is great for that: when you really need to be sure you're not missing anything, it'll alert you to stuff that might be going on. So let's see, I think that sort of works as a quick demo. Let me see if I can slip back here so we can get our

workspace back.
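Editor's sketch: the talk does not say exactly how Bayes' theorem is applied to the stored metadata, so the following is only one plausible reading, a naive Bayes estimate with add-one smoothing over the files already catalogued. The function name, the trait names, the "networking" label, and the tiny catalog are all hypothetical and exist only to make the idea concrete.

```python
def bayes_probability(observed, seen_files, label):
    """Estimate P(label | observed traits) with naive Bayes.

    seen_files is a list of (traits, labels) pairs, each a set of strings.
    Only traits present in `observed` are used as evidence, and add-one
    smoothing keeps unseen combinations from forcing a probability of 0 or 1.
    """
    total = len(seen_files)
    with_label = [traits for traits, labels in seen_files if label in labels]
    without_label = [traits for traits, labels in seen_files if label not in labels]

    def weight(group):
        # Smoothed prior for this group of files.
        p = (len(group) + 1) / (total + 2)
        # Multiply in the smoothed likelihood of each observed trait.
        for trait in observed:
            hits = sum(1 for traits in group if trait in traits)
            p *= (hits + 1) / (len(group) + 2)
        return p

    yes = weight(with_label)
    no = weight(without_label)
    return yes / (yes + no)

# Hypothetical catalog of previously analysed files.
seen_files = [
    ({"imports_crypt32", "has_gui"}, {"networking"}),
    ({"imports_crypt32", "has_gui"}, {"networking"}),
    ({"imports_crypt32"}, {"networking"}),
    ({"has_gui"}, set()),
    ({"has_gui"}, set()),
    ({"console_only"}, set()),
]

# A new file with no networking imports of its own, but with traits that
# co-occurred with networking in files seen before.
p = bayes_probability({"imports_crypt32", "has_gui"}, seen_files, "networking")
print(f"estimated probability of networking: {p:.0%}")
```

The design point matches the demo: a file that shows no direct networking evidence can still receive a noticeably elevated probability because its other traits resemble files that were known to do networking, which is the cue to look more closely.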

Entirely... okay, there we go. So let me then just wrap it up a bit here and comment briefly on how well it works. For me it's worked very well; otherwise I honestly wouldn't want to share it. It's gone from being a hobby to something very handy. I'll just share that not too long ago this thing really made me look like a hero at work. There was a sort of hurry-up assessment; at about 11 in the morning some VIPs dropped some stuff in my lap and said, "we need this as soon as possible." It was one of those situations where things had

sort of an unpleasant vibe to them; I'm sure you know what I'm talking about. To make a long story short, in about two and a half hours I was able to sort through the problem space that had been given to me and come back and say, "here's your answer: these three vendors over here all have problems, and here's what the problems are; this other vendor here looks clean, and they're the ones you want to go with." That was something that might have taken at least a week, probably more, and I was able to get it done in about two and a half hours, and it was really thanks to this

tool. I got a really nice email back from some senior VIPs, and that was very gratifying. So it's proven its worth at least once in a huge way. I can share, too, that the tool helps me sleep better, just in terms of: did I pick the right targets, am I picking the right things to work on? The other thing it's good for is placating auditors. If they say, "well, why did you do this and not that?", the numbers speak for themselves. So it's good that way. And

finally, just touching upon future plans. I will be the first to admit that, because this tool was developed in response to my own needs, in fits and starts as I've needed it to do things and figured out what works and what does not, it's grown sort of like the Winchester Mystery House, and it badly needs refactoring. One of the things I really want to get going in it is goroutines. The stuff it's doing now takes seconds, but it should be almost instant, so I want to get that worked in with goroutines. One other thing, and I'm calling

it the big thing here: something that's always a problem with reverse engineering, I think, is .NET assemblies, simply because they're so unusual. If you pull a .NET file into, say, IDA or another debugger, it's sort of frustrating; you don't get the nice import information you typically do. However, .NET assemblies are very rich in human-readable strings. They're a little chaotic, but one of my big think items for the future is to revisit that and say, what can we do with all that to do a better job of predicting the behavior of .NET files? Then I think we'll really have something very useful. And for anyone who wishes to try it out,

there's the GitHub repo, and the README file will tell you everything you need to do to get it up and running on a fresh Ubuntu 16.04 instance. Let's see, we're probably a little out of time, which is fine, but as for the underlying algorithm, if anyone wants to see another example of how you can use it, if you say "I just don't deal with debugging or anything like that," there's a spreadsheet here that shows how you can use it to predict phishing victims. And I would just say, if this algorithm is at all useful to you, if it does some good work for you and you want to use it, please feel free to run with it. It's

yours. I just hope it's useful to others and saves you as many hours as it has saved me. So I think that's pretty much it. Thanks.
