
This Chip Does Not Exist: Pre-Silicon Fuzzing

BSides PDX · 2023 · 43:05 · 203 views · Published 2023-10
Category: Technical
Style: Talk
About this talk
Rowan Hart (@novafacing@haunted.computer on Mastodon)

Fuzzing is a critical step in the security process and has uncovered bugs in software throughout the stack. Support for fuzzing user-space applications is nearly mainstream, but fuzzing below Ring 0 has remained the realm of domain-expert security researchers. A common, virtually unsupported use case is fuzzing software and firmware designed to interface with pre-silicon hardware. To address this use case, we present TSFFS: Target Software Fuzzer For SIMICS, an open-source, snapshotting, coverage-guided fuzzer built with LibAFL and capable of fuzzing most software that runs in the SIMICS full-system pre-silicon simulator, along with a survey of its use cases.

Rowan is an engineer at Intel working in system software fuzzing. He graduated from Purdue University in 2022 and is interested in fuzzing, program analysis, and security tool usability.

---

BSides Portland is a tax-exempt charitable 501(c)(3) organization founded with the mission to cultivate the Pacific Northwest information security and hacking community by creating local inclusive opportunities for learning, networking, collaboration, and teaching. bsidespdx.org
Transcript [en]

My name is Rowan Hart. I'm a security researcher at Intel. I've been there about a year; I graduated from Purdue in December from the master's program, so I'm pretty fresh. The team that I'm on is focused on system software security, which basically means kernel and below. We'll talk about that stack, but that's a little bit of my background. I'm very focused on fuzzing, symbolic execution, program analysis, that type of thing. My main weakness is that I never use any of the tools I make; I just make the tools and then hope somebody will pick them up.

First of all, don't sue me, please, and more importantly, don't sue the company. This slide says, as far as I'm aware, you're not allowed to, so we're all good.

There's a full talk about fuzzing, so I'm not going to go super in depth, but just by show of hands: who knows what fuzzing is? Okay, almost everybody, excellent. That's kind of what I figured. And who has used AFL++ in the last two years? Okay, pretty good.

So, OG fuzzing: cat /dev/urandom into your binary, and a shocking amount of stuff crashes. I was a teaching assistant in school, and this was actually one of my hidden test cases on every assignment. It wasn't worth many points, but not a whole lot of points were gotten on that one.

Nowadays we do it a lot smarter. We use mutators, we use feedback, we observe the program state, we do symbolic tracing of paths as we're executing, we use hypervisors, all of this stuff. I'm going to talk about how we've done a little bit of smarter fuzzing, but one thing that I really want to focus on is how fuzzing can be used not just for finding 0-days. That's a great use case for it, and it gets you a lot of clout on Twitter, X, whatever it is, but we can use fuzzing for a lot of more normal development practices as well. Differential fuzz testing allows us to, if we're developing a new version of a product,
test against the old version of the product and make sure that things function the same, so users won't see discrepancies in how endpoints are behaving, that sort of thing. We can use it for property testing: instead of building a unit test with 50 cases, we just say this value is never allowed to be more than five, fuzz it, make sure it's never more than five after 10 minutes, and you're pretty much good to go.

Things like that are really good applications of fuzzing that we want to see adopted more as we do what my team is focused on, which is shifting fuzzing to the left, as we say: making developers do it instead of having the red team come behind and clean up after everybody. It just doesn't scale when you're trying to have a team of eight people fuzz 100 projects a year; as a numbers game, it doesn't work.

So fuzzing is super effective, but it's not a silver bullet. Of course we still need those red teams to come behind, and we want them to focus on things like, if you were watching last week, the libwebp bug, where fuzzing probably isn't going to find that one. As a summary, basically you had to have six trees, beta trees, constructed exactly correctly, and then you had to mess up the last one so that it would be corrupted. The fuzzer is going to have some trouble reaching that state. It might get there eventually, in a thousand years, 2,000 years, something like that, but we need manual analysis as well. That's a very important thing to keep in mind when building fuzzers.

At Intel we use a lot of fuzzers because we have a lot of software. As you're probably aware, we make CPUs, but we also make everything above that in the tech stack. We do the CPUs, we do the firmware that runs on them, we do the microcode below that firmware, we do the operating system on top of it, we build the driver support for all of the hardware that we produce, we build user-space applications for the operating system that we're helping to support, and we build web
applications that you can reach over the Internet running on those operating systems and user-space applications. So we need fuzzers for every layer of that stack, because we want to test for those types of assumptions that we covered, as well as, most importantly, for security vulnerabilities.

This is not an exhaustive list, but it's an overview of some examples of fuzzers that we use at various levels. We have the developer of boofuzz here; go talk to him at some point. It's great for network fuzzing, and there's a workshop later. We also use RESTler for REST APIs and web API fuzzing.

Obviously user space is one of the most fuzzed spaces, and we have really good support for enabling developers to do user-space fuzzing with libFuzzer: you add a compile flag, you write four lines of C, and you're basically done. You can't really ask for much better than that, and that's the kind of development practice we want to bring down the stack, because as you go down it gets more difficult to harness things and more difficult to fuzz them.

In the kernel we move towards hypervisor-based fuzzers, because you need to boot a machine over and over and you need to run specific device test paths. For example, if you need to fuzz virtual I/O drivers, you need to boot your devices, and you need a hypervisor for that. Of course, most hypervisors are based on QEMU. For our most recent server CPU release, the Sapphire Rapids CPUs were released around January to March, and QEMU support was added in April. That's pretty good, that's really fast on the part of the QEMU developers, and they're amazing, but we need a little bit better than that if we want to test our firmware before the CPUs come out. kAFL and KF/x are hypervisor-based fuzzers, and they both require QEMU support because they're using it under the hood.

Below that, when we're looking at system management interfaces, we have an old tool called Excite, which I'll talk about; this
project is actually a successor to that one. Even below that, for some very custom deep features like the Intel Management Engine, teams are generally using custom forks of fuzzers, fuzzing things that are built in very specific ways, so we get pretty hardcore there. And even below that, at the hardware description language layer, where we're looking at VHDL models of CPUs that are not produced, not even modeled, not emulated, nothing, we can fuzz them at the RTL or VHDL layer with tools like PreSiFuzz, which I think is actually on GitHub now, so you can look that one up as well. So, moral of the story: there are a lot of fuzzers.

Let's rewind and look at the use case that my team is trying to tackle. We have CPUs, we want to make the CPUs, and we need to ship them with firmware. Developing firmware takes a long time, almost as long as designing the actual chip itself, so you have to do the firmware design while you are doing the chip design. The problem with that is: how do you run firmware for a chip that you don't have? You need to virtualize, emulate, et cetera.

The use case most of you may have seen is noticing something sketchy going on with your router. I was in this situation a while ago: I was looking at my router's JavaScript, which you should never do, and noticing some very interesting things that CenturyLink had put in there. So what I did was hit firmware update, get into Wireshark, and grab the firmware download. It basically just gives you an entire root file system, so you unpack that. Then it turns out it actually downloaded an old version, because it has a bug, so now you need to fix the vendor's code and download the real version. Yes, fun fact: my router only updates itself to the same version it's already on. So you need to download the real version by guessing their URL scheme. Then you run QEMU in the unpacked root
file system. If you're really lucky and your router is an x86 machine, it'll probably run, but more likely it's running MIPS, so you're going to need to Google that, and somebody's blog is going to tell you that libnvram.so from this GitHub repo, which you need to hit translate on, is going to work. So you download it, you compile it on your machine, you don't read the code, and then you put it in the root file system, and it does indeed work, because some people are really smart. So you run QEMU. Fun fact: by far my favorite way to emulate routers is to just run QEMU user mode on BusyBox in the unpacked root file system. That's all you need to do. You don't need QEMU system mode, and you don't need devices in most cases; just use QEMU user mode with the unpacked file system. And if you do need devices, good luck.

For us at Intel it looks a little bit different. We're not working on the Zyxel router in your wall that's been there for 15 years. We're working on new stuff, and we don't have the chips for it, and we don't have QEMU support. So we have a few options for how to run the firmware that we want to fuzz. We can run it on real hardware, but we need the hardware, so that one's out. We can virtualize the hardware, which is totally doable and will be done, but usually it takes a long time, which is kind of the issue, and it's not done early in the chip development process. Or we can stub out the hardware. This means you go through your firmware C code, and anywhere it makes a call down to some feature in hardware, you replace it with a C function that returns the right value. If you're reading a model-specific register in your CPU, it's probably not going to work if you don't have the CPU, but if you replace that function with something that returns zero every time, you're probably going to be okay.

So we don't have the CPUs, we don't have
virtualization, and most importantly we do not have any time. As many of you are probably aware, it's a really hard sell to go to a development team that has a bunch of sprints and a bunch of upcoming deadlines and say, "Hey, we need you to stub out all of your hardware functionality so that we can fuzz your stuff." It's just not going to happen. Developers don't have time, QA doesn't have time, validation doesn't have time, and the red team has some time, but there are only five of them.

So what we need to do is make it possible for developers to fuzz stuff without doing any work. They need to be able to install it, and they need to be able to copy and paste all of the instructions to install it. Again, this is not a knock on any developers; it's our fault as fuzzer authors for making our install processes so difficult. If you have more than five options, it's too many. You need to be able to install with a couple of steps. You need to be able to add your fuzzer into whatever development process is already happening: if your development team is using Makefile-driven development on Linux and it's just pure bespoke Makefiles, you need to work with that. If they're using Visual Studio and they have a bunch of solutions everywhere and they're using all of Microsoft's macro systems and whatever they do (I'm not a Windows person), you need to work with that as well. And finally, it needs to be a good fuzzer. Despite all of the ease-of-use things that we need, which are in my opinion the most important, you can't just pipe /dev/urandom into your program, because that's not going to cut it; it's 2023.

So we made that fuzzer, in theory. It's called Target Software Fuzzer For SIMICS, which is a very nice, legal-approved name and not the original name we picked, but luckily it kind of sounds like Sisyphus if you say it badly enough, so I drew some art for that, and away we go. TSFFS is based on a simulation platform that we use at Intel
that's also publicly available, which is why I'm here telling you about it. It's called SIMICS, which as far as I'm aware doesn't actually stand for anything, kind of like QEMU doesn't stand for anything except that it used to stand for Quick Emulator. What SIMICS is, is full-system, cycle-accurate device modeling software. That device can be, and usually is, a CPU, but it can also be a graphics card, a network interface, or a TPM; any of these things can be modeled in SIMICS. Basically, you download SIMICS, you create a project, and you add a bunch of existing models to it, because they are tough to make. It's not as hard as actually designing the CPU, but as you can imagine if you've ever tried to write an emulator, modeling a modern CPU is quite difficult. So you plug in your server CPU, you plug in a TPM, you plug in an Arc graphics card or whatever it may be, you go get those models, and then you boot it, and it runs a cycle-accurate simulation, including power, thermals, all of that, on your Linux system in user space, which is the key.

TSFFS is of course very AFL++ style. It's feedback directed. We have comparison logging with RedQueen. It's snapshot based, which is free for us because we're using SIMICS, which has snapshots. You can launch as many processes as you want, and they'll all synchronize together. We'll get into some SIMICS-specific stuff in a minute, but that's very useful for our specific use case. And basically we threw the kitchen sink of all current system software fuzzing research into it: it has MOpt, it has auto-tokenization, it has all kinds of mutations. There's a short list of things it doesn't have that we'll get to at the end.

So why SIMICS? The number one reason is that everybody at Intel is using it for all of their firmware development, which means we can just ask them to plug one thing into it, and now it'll fuzz
instead of just running your software. Not bad from the developer perspective. From the fuzzer design perspective, it's pretty amazing to work with, having come from a "let's fork QEMU again" background. Instead of adding functionality to QEMU to add snapshots or add introspection (which, fun fact, is still not allowed in QEMU because of GPL violation rules, unless you have a fork, of which there are several), if you want to read a register, you call a function. If you want to read memory, you call a function. If you want to figure out what the thermal change over the execution of your new hardware instruction was, you call a function. If you want to get the hash of all of the model-specific registers for the CPU that you're running, you call a function. So there's a lot of really powerful stuff you can do with SIMICS, and we're only scratching the surface at the moment. It's enabled because SIMICS is not designed as a scalable, cloud-hyperscaler, let's-give-people-virtual-machines emulator; it's designed as a test bed, for debugging and all of that.

The second key feature is that I had to do about one hour of work to get snapshots working with my fuzzer, because, once again, if you want to take a snapshot you call a function, and if you want to restore to it you call another one. SIMICS basically provides all of the hard stuff for us to build our fuzzing research on top of.

So that's great for us; we have models for all of our upcoming software. But what about all of you who maybe do not work at Intel? SIMICS is publicly available. If you go on Google and search for "download Intel SIMICS", there's a web page where you can download it. I actually found this out last week when I was checking things to make the slides, but it has all of our currently available CPUs in it. So if you want to go home and run a Sapphire Rapids CPU, you can do
that, actually. If you want to do some research on those, have at it; they're available to you. 12th-generation client and Sapphire Rapids server CPUs are available in public SIMICS, so it's at least on par with, and actually slightly ahead of, QEMU's release schedule for being able to emulate or simulate modern CPU designs. We also have publicly available RISC-V models, which the fuzzer doesn't support yet but definitely will, because there are a lot of people who are very interested in doing RISC-V research and fuzzing firmware for RISC-V platforms, like all of the Pine and Orange Pi type boards, as well as mobile phone devices.

Finally, for developer experience, we get free time-travel debugging and bug triage: read all of the registers, read all of the memory, step forward and backward, inspect everything, and figure out why your firmware is crashing. That's really important for the developers, (a) when they're categorizing their bugs and figuring out what to work on first, and (b) because once they start to work on a bug, it's very quick to get to the root cause of the issue when you have basically every possible piece of data available to you.

The second thing we do to get some free stuff is that we don't fork AFL. This is not a talk about not forking AFL, but I've done it three times, and I'm sure some of you in the audience have done it, probably more than that. There's a better way now, and it's called LibAFL. LibAFL is a library written in Rust, and essentially what it does is implement every part of AFL++ as a library that you can import from to create your own fuzzer using all of those components. From the binary up, you have a back end, which may be QEMU, Nyx QEMU, KVM QEMU, whatever it may be. We have Frida, which, if you're an iOS or Android reverse engineer or security researcher, you probably know
Frida and are a big fan of, and TinyInst for really good Windows speed and instrumentation. And then of course you can run native binaries, you can run forking binaries, you can harness your library, whatever you may want to do with it. The modular approach is super helpful because you pick a back end and you pick all of your fuzzer capabilities. If you need concolic tracing, LibAFL has it and you can add that in. Not every target needs it, so you get to choose whether you want to enable various stages. You want to enable RedQueen? Your program is parsing text, so you want to enable Grimoire, which is a grammar mutator for mostly English text? All of those are options with LibAFL.

If you are a researcher in fuzzing and you're trying to implement new techniques, for example a new power schedule or a new mutator, LibAFL is kind of the perfect place to do that, because you just write it, you test it out locally, and when you're done you can upstream it, and then everybody else can use it. You can go to the conference and talk about it, but it'll probably actually get used in production if you upstream it to LibAFL, instead of, like me, having three or four AFL forks sitting on your GitHub for years with 16 to 25 stars. Pretty respectable, but not a whole lot of active use in industry or research. So a big shout-out to the LibAFL team for putting together such a great tool. And I actually did license this meme from the original creator, so you can't sue me for that either.

All right, so that's what we get for free, and this is what we are giving other people for free. As I mentioned, pretty much everybody working on firmware and BIOS at Intel uses SIMICS every day. They make some changes to the BIOS, they boot it in SIMICS, they see how it works, they make some test cases for it; it's an iterative development process. What we've given them is a three-step install
process which is not one but is a lit