David Moore - The Aftermath of a Fuzz Run - BSides San Diego 2017

Name: David Moore - The Aftermath of a Fuzz Run - BSides San Diego 2017
Uploaded: 2017-01-17
Duration: 53 min 50 s

Mentioned in this talk

Tools used

American Fuzzy Lop, ASAN, Clang, GCC, GDB, Valgrind

Concepts

MD5


All right, thanks, everybody. Good afternoon. My name's David Moore, and I'm very happy to be here. First I want to say thank you to the volunteers, the staff, the participants, and everybody. When I submitted my proposal for this talk I was unaware that it was the first-ever BSides San Diego, so I'm especially appreciative to be a part of this, and I'm looking forward to many more. Real quickly, a few details about my background. I've been a professional software engineer since 1994 and had the opportunity to work for some pretty cool companies. I worked in engineering, sales, consulting, and business development roles. One of the highlights in my

career came pretty early: I got a chance to work at NeXT before it was bought out by Apple. I'd been a fanboy of Steve Jobs since I was 14 or so, so that was a pretty extraordinary experience. For a few months I was one manager away from Steve. My manager was the director of WebObjects consulting (WebObjects was the web framework they had at the time), and he reported to the VP of Professional Services. My direct manager left, so for a few months I was one manager away from Steve, but that went okay. I was a field consultant and I was making a lot of money, so no

problems there. I got to work with some other pretty cool companies as well. We got bought by Apple, and that was one of the most shocking emails I've ever gotten: "You're now an Apple employee." I didn't see that coming. Ironically, I was done with Objective-C, though. I was coding Objective-C, the language we were using for WebObjects, and I thought, Objective-C is going nowhere, I have to go to Java for my career's sake. Ironically, Objective-C is now of course in every device, so that's cool. I worked for some other really great companies too, but then I kind of

decided to move away from full-time employment and went into consulting. That's me consulting in Indonesia. Nice. I was fortunate enough to do that, and then I had the opportunity to move into opera. I trained in opera for a couple of years and was able to have a short career as a semi-professional Mozart tenor. I took that as far as it would go, though; I actually got kicked out of opera for not being loud enough. At the same time I wanted to get back into tech too. I was really interested in it, and I saw some pretty amazing breaches and hacks that were

going on earlier in this decade, even starting back with Stuxnet, which completely blew me away, and then there have been some very high-profile breaches. So I decided to go back into tech and focus 100% on security. To do that I decided to train myself: I worked with vulnerable web apps, lots of reading, lots of trying stuff. Then I went into bug bounty programs and found these very useful. Companies like Bugcrowd, Cobalt, and HackerOne are intermediaries between researchers and companies, so that you can report a vuln and not be sued or interrogated; instead you might get some

money, and you definitely get public recognition, which is really nice. So I got a few of those. Full disclosure: the Google one was not Google themselves, which is very hard; I actually got a couple in a couple of different Google acquisitions. I enjoyed that, but I really wanted to go into something that wasn't so impacted. There are so many people doing web attacks that often you report a vuln and it gets marked as a duplicate because somebody else already reported it, and you get nothing for that. I was always interested in fuzzing and C memory corruption, and I've been working in C for a very long time, so I pivoted into that and started fuzzing with AFL, American Fuzzy

Lop. Then I got so into fuzzing that I decided to start a company called Fuzz Stati0n, and we do fuzz testing at scale in the cloud. Okay, so today we're going to talk about how you go from having a bunch of crashes to a point where you have a good chance of determining exploitability, being in a position to report the crash, or, if you have to fix the crash, understanding where its root cause is. We're going to quickly go over memory corruption bugs (not everybody's working in C every day, so I'm going to review that), go through the workflow that I've developed during my research to

approach these things, and then finally wrap it up with a couple of examples that I found in my research.

So first of all, we're just doing a quick review of the kinds of memory corruption bugs we see. What do we mean by memory corruption? Fundamentally, it's invalid reads and writes: one way or the other, a program is coerced into reading or writing outside the bounds where it's supposed to be. These are also called out-of-bounds reads and writes, or OOB reads and writes, which you'll see often as well. There are lots of different kinds, but one way or the other it's reading or writing outside the bounds where you're supposed to. There's a variety of different causes. Off-by-one errors are

extremely common; I see off-by-one errors as the fundamental cause of these bugs in probably about two-thirds of the vulnerabilities, actually. Unvalidated input as well, and stack overflows too. There are two places in a process that store memory, the stack and the heap, so two areas where memory corruption can take place. If you just declare a local variable within a function, that's going to go on the stack. Any memory that's allocated goes into the heap, such as a buffer or an object: if you're calling malloc to create an object, that's going to live in the heap. Those must be explicitly freed by the

programmer as well: any memory that you malloc must also be freed by hand, and we'll look at a little more of that in a second. Heap and stack buffer overflows are common, and we do still see quite a few of these. One quick note: you'll sometimes hear people just talk about a stack overflow. The words aren't always used particularly carefully, but a stack overflow is when you have out-of-control recursion: if you keep recursing and blow out the stack, that's a stack overflow. Today we're talking about stack buffer overflows, where you're overwriting memory. Here's a quick example of a stack buffer overflow. If you have

a program that takes input from the command line, allocate a character buffer, and then use a known-bad function, strcpy, which is definitely known to be unsafe (strncpy is what we should be using now), then if you copy the argument into this buffer you can easily overflow. Here we're sending 12 capital A's as an argument on the command line, and that's going to overflow the stack buffer by four bytes. Another very common and now sought-after vulnerability is a use-after-free. This is a heap vulnerability, and it's like it sounds: it's when a program continues to use a freed

pointer. As we said, when you malloc memory you get a pointer, and you need to free it. If you free it and then use it again, that's a UAF, just like it sounds. These still have a very good likelihood of exploitability; there aren't many mitigations against them, and there are quite a few out there, so UAFs are probably the most sought-after heap vulnerability at this point in time. These often show up during error handling. You can get confused if there's an error: you're handling an exception, you're doing stuff there, maybe you're freeing it and it's already been freed, or maybe you free it there and it gets freed again

later. Or just in general, where it's unclear what part of the program is responsible for freeing memory and it's not planned out very carefully, it's very easy to use memory after it's freed. Here's a quick example in pseudocode. We have a pointer; four bytes get malloc'd to this pointer x. We do some stuff with it, a bunch of code happens, then we free it explicitly like we're supposed to. But then maybe some more stuff happens and we wind up printing it. That's a UAF. Perhaps it's during error handling, or maybe we want to log that, and we forgot we already freed

it. So that's a classic trivial example of a UAF program.
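The slides for the strcpy overflow and the use-after-free are not captured in the transcript. As a rough stand-in, here is a toy Python model of both bugs, written in Python rather than C so that it runs safely. Everything in it is an assumption made for illustration: the 8-byte buffer size (chosen so that 12 A's overflow by four bytes), the four neighboring bytes, and the `ToyHeap` allocator, which is hypothetical and far simpler than a real malloc. It also ignores the terminating NUL byte that a real strcpy would write.

```python
BUF_SIZE = 8  # assumption: 12 bytes of input then overflow the buffer by 4


def unchecked_copy(frame, data):
    """Mimic strcpy(): copy every input byte, never checking the buffer size."""
    for offset, byte in enumerate(data):
        frame[offset] = byte


# Toy stack frame: an 8-byte buffer followed by 4 bytes that belong to
# something else (another local, a saved register, and so on).
frame = bytearray(BUF_SIZE) + b"\xde\xad\xbe\xef"
unchecked_copy(frame, b"A" * 12)
neighbor = bytes(frame[BUF_SIZE:])
print(neighbor.hex())  # the neighboring bytes are now all 0x41 ('A')


class ToyHeap:
    """Hypothetical allocator: the most recently freed slot is reused first."""

    def __init__(self):
        self.slots = []      # slot index -> stored value
        self.free_list = []  # indices of freed slots

    def malloc(self, value):
        if self.free_list:
            index = self.free_list.pop()
            self.slots[index] = value
        else:
            self.slots.append(value)
            index = len(self.slots) - 1
        return index  # the "pointer" is just a slot index

    def free(self, index):
        # The slot is released, but any copy of the index kept elsewhere
        # in the program still refers to it.
        self.free_list.append(index)

    def read(self, index):
        return self.slots[index]


heap = ToyHeap()
x = heap.malloc("original data")  # x = malloc(...)
heap.free(x)                      # free(x), like we are supposed to
y = heap.malloc("attacker data")  # a later allocation reuses the same slot
leaked = heap.read(x)             # use after free through the stale pointer
print(leaked)
```

In the toy heap the stale pointer `x` and the new pointer `y` refer to the same slot, which is why a use-after-free can expose or corrupt whatever was allocated next.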

There are a few other kinds of memory bugs to talk about. A double free or an invalid free is maybe the opposite of, or at least related to, a UAF: here you're freeing things twice, freeing something that's already been freed. That's hard to exploit, but it can be, especially under race conditions in a multi-threaded system. Another very common memory bug is when a conditional depends on uninitialized memory. If you say int x but don't initialize it, and then you say if (x) do something, that's at least undefined behavior; it's hard to exploit but could potentially be exploited as well. One other memory bug that we're all definitely familiar with is

memory leaks. This is when you don't call free at all: you just keep allocating and it never gets freed. This is not really exploitable in the overflow sense, or to get control of a program, but a bad guy could use it deliberately to cause a DoS situation, by continually and repeatedly doing something that mallocs space to the point where memory is exhausted and the process crawls to a

halt. Okay, so we've been talking about exploitability a little bit. I've been using the word; what do we mean, what really is exploitability? One way to think about it is that it's reprogramming an application with input data and not code. If we can trick a program into executing attacker-controlled input data as if it were code, that's a code injection exploit. Another way to think about it is that input streams become instruction streams. That's an idea from Halvar Flake, a pretty famous programmer whose real name is Thomas Dullien; he just started at Google Project Zero, their own white hat organization, in November. Very interesting idea. In this kind of attack

it's about controlling the instruction pointer. If a bad guy can get control of the instruction pointer, and by that I mean EIP on 32-bit systems and RIP on 64-bit, they can jump to places and execute code. In this case we're talking about a code injection attack, where they can supply their own code, typically in the form of shellcode. That's getting harder and harder to do for a lot of reasons, so now we see more and more reprogramming of the application with existing code that's already in the process. Because we can't get our own attacker-controlled code in there, we instead leverage code that already exists. It could be code in

the application itself; more commonly it's going to be code in a library that's been loaded. Any given application loads quite a few libraries, so there's a lot of code out there to use. This is called return-oriented programming, or ROP, and the idea is that you chain together lots of little bits of code to get what you want, typically popping a shell. This also has an interesting name: once things get to this point and you're able to reprogram a system, it's called weird machine programming. That's technically a term; call it weird. Okay, so that's what exploitability is from my perspective. Does it matter? Like, why

do we care if something is exploitable or not? There are a few reasons. First of all, I would say in a way, no: you have to fix memory corruption bugs. They should be fixed. You don't want anything crashing, even under the most unlikely input, and you don't want it running out of memory or anything else. Any memory corruption bug should be fixed. However, there are a couple of cases where it does matter. Certainly if you are reporting the bug to vendors or to the maintainers of an open-source software project, you need to motivate them and make it clear why it's an important bug and why it needs to be fixed. They have a

backlog, so you want to be able to tell them, hey, this is highly likely to be exploitable, and here's why. Almost even more importantly, from the view of the researcher, you're walking a fine line. If you have a bug, typically you either file it with an open-source project explicitly as a security bug, which is private, or you just file it in their standard bug tracker. You have to be careful: if you call something exploitable, a lot of times you get a knee-jerk reaction that it's not, and sometimes they're right. You don't want to be seen as overpromising

or reporting stuff as security when it's not, you know, the boy who cried wolf kind of thing. You want to avoid that. But the other side of the coin is that if it is exploitable and you report it as a normal bug, and it goes right into the bug tracking system, which is public in a lot of cases, you've potentially just dropped a zero-day. You really don't want to do that either. So you walk a fine line, and that's why it's really important to understand how exploitable what you're reporting potentially is. Then finally, for developers in general, if you're breaking your own stuff, the developers

themselves might have a big backlog of bugs, so they would like to know what's exploitable: we need to fix that

first. The next question is: exploitable by whom? There's a broad range of exploit development skill. I mentioned Project Zero a second ago. This is Google's internal white hat group. Their purview is to look for bugs in any software that a Google customer might use, and they have some very top people. I think they generally have about 12 to 14 people in Google Project Zero, and they have quite a few resources behind them, being part of Google. Then you have this group; they also have quite a few exploit devs that are very good, and really extraordinary resources behind

them. And then there's whoever else is out there: nation states, other highly motivated groups. Even individuals can come up with pretty amazing exploits in some cases. So, just to reiterate what we probably already know: security is never 100%. We're trying to raise the cost of the attack to the attacker, to raise it higher than the value of the data or whatever motivation they have for making it. Another point is that with most modern exploits, especially the ones you're seeing in browsers and things like that nowadays, it's rarely just one bug. It's getting rarer and rarer that one bug will lead you to a full exploit, a full remote command execution, and so

most of these exploits are a few bugs, a couple, three, or four, chained together very carefully in a way that gets you to that goal of fully exploiting the system. So it becomes somewhat moot to argue about whether a single bug is exploitable when in many cases it could play a critical role in a bug chain even though it is not exploitable by itself. And then we have some bugs that are pretty surprisingly exploitable, so it pays to keep an open mind. A specific example was recently disclosed in Chrome OS. There is a library called c-ares, a DNS resolution library, and they had a one-byte

overflow, probably found with a fuzzer. It was a heap-based vulnerability, and you got a one-byte write overflow, so you could write one byte past the buffer. However, that byte was always just the digit one; the attacker could not control it and could not put their own data in. So it does seem unlikely that this would be exploitable, or at least it would be very hard. At first it was considered by the Red Hat security team to be a moderate security impact, which is pretty reasonable, I think. But someone, and they are anonymous, found a way to get a full RCE, a full root-control exploit, via JavaScript on Chrome OS

with this vulnerability. The way this kind of thing happens is called heap grooming: you have to make lots and lots of calls ahead of time with the goal of putting the heap in an exploitable state. This bug report was 37 pages; I've not read it. But the idea is that it's really hard to tell what is ultimately going to wind up being exploitable. And this is actually what triggered the vulnerability: if you simply had a trailing escaped dot after the domain name, that would trigger it. So this is definitely the kind of thing

where, when you see it, it's pretty likely a fuzzer found it.
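To make it less surprising that a single out-of-bounds byte with a fixed value can matter, here is a purely hypothetical toy model in Python. It is not the c-ares code and not the actual Chrome OS exploit; the heap layout, the names `make_heap`, `read_neighbor_length`, and `buggy_store`, and the big-endian length field are all assumptions made for illustration. The point it sketches is that the attacker does not choose the byte, but heap grooming lets them choose what sits next to the buffer.

```python
A_START, A_SIZE = 0, 8      # chunk A: an 8-byte name buffer
B_START = A_START + A_SIZE  # chunk B sits directly after chunk A


def make_heap(length):
    """Groomed toy heap: chunk B begins with a 2-byte big-endian length field."""
    heap = bytearray(16)
    heap[B_START:B_START + 2] = length.to_bytes(2, "big")
    return heap


def read_neighbor_length(heap):
    return int.from_bytes(heap[B_START:B_START + 2], "big")


def buggy_store(heap, name):
    """Copy name into chunk A, then append a fixed trailing byte.

    Off-by-one: the check forgets to leave room for the trailing byte,
    so an 8-byte name pushes that byte one position past chunk A.
    """
    if len(name) > A_SIZE:  # should be: len(name) + 1 > A_SIZE
        raise ValueError("name too long")
    heap[A_START:A_START + len(name)] = name
    heap[A_START + len(name)] = 0x01  # fixed value, not attacker controlled


heap = make_heap(16)
before = read_neighbor_length(heap)
buggy_store(heap, b"AAAAAAAA")
after = read_neighbor_length(heap)
print(before, "->", after)
```

In this sketch the stray 0x01 lands on the high byte of the neighboring length field, so a 16-byte object now claims to be 272 bytes long, and any later code that trusts that length would read or write well past the object.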

Okay, so just to round out talking about memory corruption in general, let's talk about some of the mitigations that make exploitation harder. Most of these have come out in the last 10 years or so. Stack canaries are the first one to talk about; these are also sometimes called stack cookies. A canary is a random integer that's pushed in between stack frames, and it's just like a canary in a coal mine: if something's wrong with the stack canary, something's wrong with the program, and the operating system knows that and can stop the execution. So here you have the canaries in between the stack frames. To illustrate it a little bit

better, here you have random integers. The attacker does not have access to these integers, so if they have a buffer overflow from this frame up into this frame, they're going to overwrite that number, the operating system is going to figure that out, and it will exit the program. That's how stack canaries work. Then I want to talk about data execution prevention. This is another mitigation that's come out in the last 10 years or so. It marks some regions of memory as non-executable. A part of memory like the heap is, in most cases, never going to have anything in it that you should execute, so this simply marks it as

non-executable. No matter what's in there, any time the instruction pointer is pointed into that region, again the operating system will realize that, throw an exception, and exit. This is supported at the hardware level by the NX bit, which is in every modern CPU, and all modern compilers put this in by default. The combination of stack canaries and data execution prevention makes exploiting the stack very hard nowadays, so most of the action now is in the heap. It's still possible, and there are still rare instances where you can jump over the canary, things like that, but most of the action is in the heap. And that brings us to ASLR, address space layout randomization,

another really powerful mitigation. The idea here is that it shuffles the memory. When you buy a deck of cards they're in order by suit and by number; then you shuffle them and it scrambles them up. To put it in a slightly more concrete manner: just as we map virtual memory to physical memory, ASLR is another layer of mapping on top of virtual memory. Much like the OS keeps track of the stack canaries and knows when they have changed, the OS keeps track of this mapping. An attacker can't get to the mapping, so even if they can control the instruction pointer, EIP, it's very hard for them to know where to jump

it to in order to do what they want. Now, ASLR is not as effective on 32-bit systems; it's much more effective on 64-bit systems. The reason is that 32-bit systems have only four gigabytes of address space, so there's really not that much memory to randomize over. The attack that you can run on ASLR is a brute-force attack, also called heap spraying, where you try again and again: you just try jumping to where you want to go, and hopefully you'll land where you're trying to get to. That's one way to defeat ASLR, but in general it's pretty powerful. Okay, so that wraps up the

review of memory corruption bugs. Now we're going to talk about how I go from having those bugs to either determining whether they're exploitable or starting down the road of root cause analysis. The steps are: you minimize the crash corpus, run memory corruption tools over it to get information about the crashes, and then finally determine the exploitability or find the root cause. Typically, especially if something has never been fuzzed before, you can get a lot of crashes; dozens is pretty common. So how do you deal with that? What if you have 40 or 50 crashes? You need to minimize the crash corpus, and by that

I just mean the set of crashes that came out. The first step is to minimize the corpus itself, the number of crashes. There are tools for this; AFL's is called afl-cmin. It takes the corpus of crashes, runs the target again with those crashes, and determines whether those crashes are actually the same bug, because often they are. AFL and other fuzzers do their best to not have duplicate crashes, but for various reasons it's hard to do that completely during the fuzz run, and that's why you need to run a tool like this. This will

take you from maybe a dozen or two dozen crashes, get rid of some of the ones that are actually the same, and hopefully reduce it down to a minimum number of crashes. The next step is that each of those crashing cases must itself be minimized. The fuzzer is just throwing lots of stuff at the target, and there are almost always extraneous bytes in the crash case. There's lots of stuff in there, and only some part of it is actually responsible for the crash. So there are crash minimization tools, and AFL's is called afl-tmin. They will take

a crashing case and run it over and over again, taking out bytes and seeing if the crash still happens. If the crash still happens, that byte is gone. So you can take a file that could be pretty big, a kilobyte or whatever, and reduce it down to just those bytes that create the crash. Again, that makes it a lot easier to zero in on where the crash happened; there aren't extraneous bytes doing other things in the program that have nothing to do with the crash itself. There's one more pretty cool tool, called fdupes, which is available on

GitHub. What fdupes does is MD5-hash every file in a directory and compare those hashes. If any two files are byte-for-byte identical, it will take one of the two and delete it, or it can just flag it, but typically I just get rid of it. Even the crash minimization tools aren't perfect, and I've definitely seen cases where I run the crash minimization and the corpus minimization and still have crash cases that are literally identical byte for byte. So I just like to run fdupes to get rid of those. So now that we've minimized the crash corpus, the next step

is to dig in and see what went on, and there are a few tools that are really nice for that. The first thing to say, though, is that all bets are off when things go bad in a C program. When memory gets corrupted, anything can happen, so the tools themselves can have problems. It pays to be a little skeptical about what any tool tells you went wrong. Most of them are pretty good, but you don't want to believe everything that comes out of a computer. The first one is called AddressSanitizer. This is part of GCC and Clang. It operates at both compile time and runtime: at compile time it puts in

extra instructions to instrument the program, and at runtime it replaces the malloc library with its own runtime allocation library, so again it can keep track of what's happening at runtime. Those two items in conjunction make ASan very powerful. It has a lot of information: it had the opportunity to instrument the code at compile time, and then it can watch it very closely while it's running, so it has a high degree of accuracy. I've never seen it report anything erroneous. In some cases it doesn't tell the whole story, but it is very accurate. It's available in both GCC and Clang, as I mentioned, and that's the flag you want to send to it, -fsanitize=address. It finds invalid reads and

writes in both the heap and the stack. It's very good at finding UAFs, and it will find double frees as well, and other more minor memory corruption bugs. This is an example of ASan's output. You can see it found a UAF. It gives you data about various pointers, as well as where the bad read happened and where the crash happened. This is a stack trace; in this case we don't have function names because of how it was compiled, but it gives you quite a bit of data about exactly what went on. That's where the bad write happened, and it tells you where the crash happened as well. So ASan's a very

powerful tool. Another similar one is called Valgrind. Valgrind is actually a family of tools, and people use the name interchangeably; the actual name of this tool is Memcheck. Unlike ASan, Memcheck doesn't require a recompile and can operate on uninstrumented binaries. It does add lots of overhead and slows down execution quite a bit, but that's hopefully not a problem when you're triaging crashes. If you have lots and lots of crashes that can be a problem, but it might just take a little while to get through them. It does have a lot of output, which needs to be interpreted; we'll see an example of that in a second. What I like to do is use it

with ASan, because they give slightly different views of the same crash, and using the two of them you can really understand what went on. However, one quick note: you can't use them together, because they don't play well together. If you run Valgrind on an ASan-instrumented binary, it will complain. It's also pretty accurate, but again it doesn't always tell the whole story: what it tells you is true, but sometimes there can be more going on as well. Here's a quick example of Memcheck output. You see here we have an invalid write of size eight and, a little later, an invalid read of size four. Here we are

getting the function names in the stack trace, because I compiled this with the proper flags to get that. This is the same crash as the UAF we saw with ASan, and it is pretty likely to be exploitable: you have a read and a write, and they're close together. If you can get data you control in with a bad write, and then later there's a bad read, you can raise quite a bit of havoc. Then one more tool that's important to run is just called exploitable. It was developed by CERT, and it's since been open sourced, so it's

on GitHub; here's the GitHub page. It's actually an extension to the GDB debugger. It can be run from within GDB, but it doesn't have to be; I usually run it as a script. They supply a script, so you can simply run the script, give it the crashes, and it'll tell you what's going on. This tool is written in Python, as you can kind of see from the blue. Here's an example of the output of exploitable. This is the same bug that we saw with the other two, and what's nice about this is that it gives you a really clear explanation of why it's considered exploitable. First of all, it categorizes

them: it puts any crash into one of four categories. Exploitable; probably exploitable; probably not exploitable, which are typically null pointer dereferences; and unknown, where it doesn't have enough information. It can detect that something went wrong but can't tell exactly what the problem was, so it'll say unknown, and in that case you've got to dig into it yourself a little more. These three tools will give you a great deal of information and put you in a really good position to do the next

step. So we've reached the point where we've minimized the corpus and we have a lot of information about it. Now we need to dig in and find out exactly what went wrong, and whether it really is exploitable and how. Before we do that, it's easier if you disable ASLR. So after talking about what a great mitigation it is, now I'm saying disable it. As a white hat you have to be careful. Often you run this as root too, so run as root and disable ASLR, great. Obviously, do this behind a firewall or a NAT, or in some kind of safe way. This is the command on Ubuntu to do this. randomize_va_space

is how you control ASLR, and it's under /proc. Since it's under /proc, you cannot use vi or any other editor; you have to echo zero into it to disable it. If you disable it this way it won't survive a reboot. And, yeah, I want to reiterate: do it carefully. Once you've done that, I think it pays to write down exactly what the critical memory locations were for that crash. There can be two kinds of memory locations that you care about, code locations (instructions) and data locations, and once you dig into these things it can get confusing, so I have to be

pretty careful about keeping track of whether a location is a data location or a code location. Some examples: certainly where the crash happened. That's a code location, all these tools give you that, and it's the place to start. Where the invalid read or write occurred is a data location. Where the memory was allocated and where it was freed are code locations. Also, if the data has been reassigned during the course of the program, if the variable has been reassigned to another one or copied, things like that, you've got to track the data through the flow from

the input to where it actually crashed, and that could be a code or a data location. Once you do that, you just have to dig into it. The GDB debugger is where I spend most of my time, and typically the best approach is to set a breakpoint where the crash happened, then work back from there and figure out how it reached that point. Now, you saw how I used lots of capital A's on some earlier slides. I like to do that; it's a very common thing when you're doing crash triage to send it a bunch of capital A's, and

then those are very easy to see when you're dumping memory in GDB. The idea is that you can keep track of the memory: you see a bunch of capital A's, or rather, in memory what you're going to be seeing is the hex number 41, because that's the ASCII code of a capital A. So the game, and it's not easy, is that you set the breakpoint, send it different values, larger or smaller, and figure out what went wrong, keeping in mind those other important code points where things were allocated or freed. And there's a lot of just dumping

the memory. We'll see an example a little later of how that works in GDB, but you have to spend a lot of time digging through and just looking at memory to figure out what's going on. It's not easy. It can be fun; I like it, though not everybody does. One thing that can make it much easier is a very cool tool called rr. rr is itself a GDB plugin, like exploitable was. It's a project by the Mozilla Foundation, and it's an open-source tool. The way you use it is you run the target executable with the crashing case and record the execution. rr takes care of

that for you so while it does that it gathers a lot of information about it to the point where you can then debug in reverse so in a normal debugger we set a breakpoint and we do step into step next step over step out of but it's always going downstream time is always moving along the time arrow from when you started execution of the application to where it crashed RR is very powerful because it lets you do the opposite you can set the breakpoint at the crash and step backwards next backwards all that kind of thing so it makes it far easier in many cases to find the root cause

because you're just going backwards and you're seeing what is happening in reverse up to the point where maybe that free happened or whatever the problem was so without this with GDB I find myself just running things many many times and just setting breakpoints sooner and sooner and sooner in the control flow of the application but that can be very time consuming so RR is a great tool that can really help things out with that one quick caveat about RR is that you cannot alter variables like you can in GDB like you can assign a variable a different

value or in GDB you can make a call from the GDB command line you just call some function and if it's in memory it'll run if you do either of those things it's going to crash RR because you've changed reality RR records what happened and once you start messing with that RR doesn't do well typically both RR and GDB will crash if you try that so those are some very cool tools to dig into this stuff but it's not easy this can take a lot of time a lot of practice and it can be frustrating so persistence is key and you know really just do your best and then one final thing to talk about in

terms of fuzz runs and managing this part of things many crashes actually can mask another crash further down the control flow so the program exits at a given crash great you're working on that crash the idea is you fix that crash and then you must fuzz it again or you fix any crashes you have you definitely need to re-fuzz the app one more time probably several more times to make sure that the crashes you found are not masking more crashes or other memory corruption errors further down the execution of the

program okay so I'd like to wrap this talk up today and show you a couple of real world examples from my own research the first one is pretty trivial it was in PHP it's a low invalid read here's my report of it I was fuzzing a php.ini file and I was doing that from the command line which is a little strange because php.ini is typically used when you're running PHP in the context of a web server and it's also not necessarily a great way to find exploits because the php.ini file is controlled by the admin and so it's hard to see a bad guy getting a php.ini file with crafted malicious data in it to run and so

unless the admin has been compromised or the admin is malicious it's not a great approach although it can be we'll see another one in a second this one was not we have here a low read here's the crafted ini file this is the line that the fuzzer came up with and I was using a dictionary so the fuzzer was just using these known values and the fuzzer threw a one in here and it thought it was running in a web server and it wasn't it was running from the command line and it shook out this crash but it's completely not exploitable here's the memcheck output it's a bad read of size four but it's

very very low right and so this is a null pointer dereference we're at address hex 10 which is 16 in decimal and Valgrind said this address is not stack'd malloc'd or free'd and so it crashed on this but it's a very low bad read there's never anything down there in memory there's never anything of use in a process at this memory address
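The "very low address" reasoning above can be sketched in a few lines of Python. This is only an illustration: the helper names and the 4 KiB cut-off are assumptions made for the example, not values taken from the talk or from Valgrind.

```python
# Hypothetical triage helper. The limit below is an assumption chosen
# for illustration, not a value reported by any tool.
NEAR_NULL_LIMIT = 0x1000  # treat the first 4 KiB as "near null"


def parse_address(text: str) -> int:
    """Parse a hex address such as '0x10' as printed in a memcheck report."""
    return int(text, 16)


def looks_like_null_dereference(address: int, limit: int = NEAR_NULL_LIMIT) -> bool:
    """True when the faulting address is a small offset from a null pointer."""
    return 0 <= address < limit


if __name__ == "__main__":
    low = parse_address("0x10")
    print(low)                               # decimal value of hex 10
    print(looks_like_null_dereference(low))  # small offset from null
    # A made-up heap-style address for contrast.
    print(looks_like_null_dereference(parse_address("0x602000000010")))
```

A check like this only helps sort crashes into "probably a null dereference" and "needs a closer look"; it does not by itself say anything about exploitability.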

and here's the fix that the PHP maintainers put in you can see they just added a line to check if it's running as a server or not and then do the appropriate thing so then I turned my attention to Ruby and decided to fuzz the regular expression mechanism in Ruby now this is the compilation of the regular expressions that programmers write and that's distinct from when a compiled regular expression actually handles input data and applies itself to that that I didn't fuzz this is just the regular expression itself so I fuzzed Ruby with lots of weird regular expressions and I was able to shake out a heap buffer

overflow and here's the ASan output from it and so ASan very clearly called it a heap buffer overflow a bad read of size four I ran Valgrind on it and Valgrind also found a bad write of size four as well and you can see these memory locations are again fairly close together and so this looks pretty exploitable and it probably was there's one caveat to that which is that it's rare that an attacker can supply a regular expression usually attackers supply data which has an existing regular expression applied to it there may be some cases where some app somewhere allows you to upload arbitrary regular expressions but they're rare so

while this is highly likely to be exploitable the attack surface arguably is very small I mean it's really hard to see how somebody's going to exploit this in the wild so I reported it and they did a nice fix here it turned out that if you have a character class in a regular expression you open it and then you don't close it and then you also have an octal number in it that weird corner case is what triggers this
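The distinction drawn above, between compiling a programmer-written pattern and applying an already compiled pattern to input data, can be sketched with Python's `re` module as a stand-in for Ruby's regex engine. The patterns below are made up for illustration (an unclosed character class containing an octal escape); they are not the crashing input from the report, and Python simply rejects the malformed one with an error.

```python
import re


def compiles(pattern: str) -> bool:
    """Stage one: does the regex compiler accept this pattern?"""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Hypothetical patterns, chosen only to resemble the corner case described.
WELL_FORMED = r"[a\101]"  # closed class with the octal escape \101 ('A')
MALFORMED = r"[a\101"     # the same class left unclosed

if __name__ == "__main__":
    # Fuzzing the compile stage means mutating the pattern itself.
    print(compiles(WELL_FORMED))
    print(compiles(MALFORMED))

    # Fuzzing the match stage would keep the pattern fixed and mutate the data.
    compiled = re.compile(WELL_FORMED)
    print(compiled.fullmatch("A") is not None)
    print(compiled.fullmatch("zz") is not None)
```

In a memory-safe implementation the malformed pattern is just a reported error; the bug described in the talk is the case where a C-level parser reads or writes past its buffer instead.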

and then one more the last one is in an open source system called Netflix Dynomite Netflix Dynomite is a tool that runs between the open internet and Redis and memcached key value storage systems it's an open source tool that Netflix wrote it's on GitHub and here's Netflix saying that they run this in production I believe they run it to store metadata associated with people watching movies they're obviously running at scale and other people are running it too it's pretty popular and so again I fuzzed in this case the YAML file so it has dynomite.yaml that's how you configure Netflix Dynomite and so I

decided to fuzz that again it's an oblique attack and I figured it's probably been fuzzed head on before people have probably fuzzed it with what comes over the internet maybe not I'm not sure but I try to attack obliquely at first on things that I think have already been heavily audited so I did that and I got a crash and I also thought well okay it's probably going to be in the YAML library because it parses the YAML and they were using some other library for that and so maybe I'll get a bug in that or maybe not so I went for it anyway and I got a pretty nice

crash and so here's what the input looked like and you can see we have some A's in there this has actually been reduced and carefully crafted and this here is the output of glibc so when things go bad glibc will also give you information about this this comes out on standard error and this is within the context of a GDB session so I started this in GDB ran the program the -t just tells it to read the YAML file I think and we got a crash and I dumped the data from GDB x means examine this x means show it as hex and this is

the memory location I was interested in and you can see we got a set of 41s here so my payload did make it through into memory I was able to successfully write that so I could have written anything I wanted this is also a six byte overwrite we've got two more bytes here and it is contiguous this is Intel so it's little endian so you need to reverse these memory locations so it actually was six contiguous bytes there and so I reported this and I was like okay whatever I wasn't sure where the problem was and it was obviously a serious bug and you know

time to report worth reporting so I did and then this is their report here and it turns out that this was down in the string function so I was pretty shocked when I saw this the actual bug that they fixed was down in their string dup code so rather than this just being a bug that a bad admin could leverage through a crafted YAML file this looks like a far more serious bug now I haven't tried this I haven't tried to exploit it I haven't fuzzed it head on I reported this it's been fixed for about seven months and so I was really surprised though I was like wow actually

pleasantly surprised because I'm like wow this is a pretty bad bug and so I believe this could have been pretty bad for Netflix if you could leverage this you essentially have man in the middle between the internet and their key value stores and so this is their fix and this is the diff of the fix and they have a string library but they weren't fully using it they were kind of doing their own coding over it and that's where they got into trouble you can see again this is a classic off by one style error another fix down here as well so that was a really good bug and I did report that I did get in

the Netflix Hall of Fame for that they don't pay but it's a highly sought after place to be and I've got some good company there and then finally I want to talk about a few really awesome references that helped me learn all this stuff Rensselaer Polytechnic Institute has a course called modern binary exploitation now they've very generously put all the course materials online there's a GitHub there's open source vulnerable apps to practice against and also all the PDFs and slides and everything else are also part of what they offer so I went through the whole course and I learned a lot and it's not just exploitation they cover reverse engineering quite a bit too it's

a great course next one hacking the art of exploitation this is pretty much the book the bible of this kind of work I bought a copy in 2005 the second edition has come out since it's a great book highly recommend that and then a couple of these blog posts very interesting as well highly recommend those and that's going to do it really thank you guys very much thank you so much for your attention really appreciate being here any questions about this stuff question yeah so one thing that you touched on earlier at a high level was just how many libraries go into applications nowadays and so that's the

challenge a lot of my customers face they say hey you know we're shipping a product but underneath the hood we have like 50 open source libraries so at a high level are there any products or solutions or recommendations you would give me to say hey this is something that you might want to check your code against both static or dynamic analysis yeah yeah absolutely so the question was if you're using lots of other libraries we all are we all ship code with other libraries in it and how do we know what's going on in those libraries and the question's sort of follow up is like is there dynamic analysis and there

are companies that are doing that in some cases handling that and you're talking about another really important use case right which is third party code and that's another thing that's very worthy to fuzz if you have time one way or another I mean it's important to fuzz that stuff or run static analysis tools over it as well and so absolutely there are companies that are doing that there are some that are doing static analysis on JavaScript libraries like in Node because there's so many of those and there's obviously pentest companies they'll fuzz stuff for

you and there are a couple of companies now that are doing dynamic analysis fuzzing in the cloud as well where you do it but those companies will supply the infrastructure right and so you don't have to have your own computers and all that kind of stuff you just send them a binary they'll do the fuzz run and then come back some of them actually give you all this data that I showed you and so is that your question yeah just like you know application security is tough yeah and a lot of my customers don't do it well they ship buggy products basically right because the SSL version that they're using in open source is outdated

right so I was just curious if there's like a solution or a vendor that you know of that can do that kind of thing yeah there's no one single thing unfortunately there's not at this point you still have to kind of do it by hand although I mean it's an interesting field I'm sure that might be a good start okay yes how long does it typically take say one of your examples do you spend half a day a day two days a week it varies quite a bit but I would say probably at least half a

day on any given crash and that's kind of in the good case but the question was how long does it take to manage a crash once you have it does it take me half a day or a day or whatever ideally it doesn't take that long especially since so many of these off by one errors are easy to track down like it's easy to figure out an off by one error if it's not that it could be you know who knows what and I'm getting better at it too and then if you're including writing the report you obviously add a couple more hours for that and so it probably takes about

a half a day on average to a day for any given crash to really fully investigate it to get to the point where you'll be able to tell if it's exploitable or not or have a good shot at finding the root cause anything else yes sir do you ever fuzz like .NET or Java applications that run on a managed runtime is there much fruit in fuzzing those that's a great question so the question was have I ever fuzzed Java or .NET managed apps right so we're talking about here memory corruption in unmanaged C and C++ that's what my whole talk has been about the question is what about Java what about .NET or any other managed system right and I'm

actually really interested in doing that and I think there's a lot of study and research that can be done there and I'm very interested in finding that I sort of call that fuzzing for correctness or fuzzing for conditions other than crashes so a fuzzer generally throws random data at a program and it monitors it for some condition that matters almost always these days that's a crash and crashes are easy to see right but you could certainly fuzz for other conditions and so you'd have to predefine those conditions somehow and that might take a lot of

work I'd like to think there's some automated way to think of those conditions or maybe base them on the test suite and one other similar thing that I'm looking at is called differential testing in that case you take three programs that are different but perform the same function like there's multiple Ruby interpreters so you could fuzz three of them at the same time sending all three of them the same crafted or mutated case and then you compare the output of them and certainly if two of them have one answer and the other has a different answer you've potentially uncovered a bug in one of them or two of them and so that

that's another thing that I'm really interested in as well and so yeah I mean I think that there's a lot of room for fuzzers to find more bugs in more kinds of code and there's other research it's pretty active research in academia as well yeah thank you anything else all right thank you guys so much [Applause]
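The differential testing idea described in the answer above can be sketched as follows. This is a minimal illustration under stated assumptions: the three `digit_sum_*` functions are hypothetical stand-ins for three independent implementations of the same behaviour (the talk's example was multiple Ruby interpreters), one of them is deliberately buggy, and the random inputs stand in for a fuzzer's mutated cases.

```python
import random
from collections import Counter

DIGITS = "0123456789"


def digit_sum_a(text):
    """Reference-style implementation: add up every ASCII digit."""
    return sum(int(ch) for ch in text if ch in DIGITS)


def digit_sum_b(text):
    """Independent implementation of the same behaviour."""
    total = 0
    for ch in text:
        if "0" <= ch <= "9":
            total += ord(ch) - ord("0")
    return total


def digit_sum_c(text):
    """Deliberately buggy stand-in: silently skips the digit 9."""
    return sum(int(ch) for ch in text if ch in "012345678")


def run_all(implementations, case):
    """Send the same case to every implementation."""
    return [impl(case) for impl in implementations]


def minority(outputs):
    """Indexes whose output differs from the most common output.

    If every output is different there is no real majority, and the
    first output is treated as the baseline.
    """
    most_common, _ = Counter(outputs).most_common(1)[0]
    return [i for i, out in enumerate(outputs) if out != most_common]


def differential_test(implementations, cases):
    """Collect every case on which the implementations disagree."""
    findings = []
    for case in cases:
        outputs = run_all(implementations, case)
        if len(set(outputs)) > 1:
            findings.append((case, outputs, minority(outputs)))
    return findings


def random_cases(count, seed=0):
    """Stand-in for a fuzzer: short random strings of digits and letters."""
    rng = random.Random(seed)
    alphabet = DIGITS + "ab"
    return [
        "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 8)))
        for _ in range(count)
    ]


if __name__ == "__main__":
    impls = [digit_sum_a, digit_sum_b, digit_sum_c]
    for case, outputs, suspects in differential_test(impls, random_cases(200))[:5]:
        print(repr(case), outputs, suspects)
```

The condition being monitored here is disagreement rather than a crash, which is why the same approach could apply to managed runtimes where memory corruption is not the main concern.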
