
Fuzzing Malware for Fun & Profit

BSidesSF · 2019 · 27:59 · 362 views · Published 2019-03
About this talk
Malware authors rarely follow secure development practices, introducing exploitable bugs that defenders can weaponize. This talk demonstrates coverage-guided fuzzing techniques to automatically discover vulnerabilities in malicious binaries, then leverages those bugs to disrupt botnet operations, crash C&C communications, and even achieve remote code execution. The speaker presents netAFL, a new cross-platform fuzzing tool, and reveals 0-day vulnerabilities discovered in notorious malware including Mirai and banking trojans.
Original YouTube description
Practice shows that even the most secure software written by the best engineers contains bugs. Malware is no exception. In most cases malware authors do not follow secure software development practices, which opens up an interesting attack scenario that can be used to stop or slow down the spread of malware, defend against DDoS attacks, and take control of C&Cs and botnets. Several previous studies by the security community have demonstrated that such bugs exist and can be easily exploited. To find those bugs it is reasonable to use coverage-guided fuzzing. Numerous studies have shown that this is the most effective technique for automatically finding bugs in closed-source software. This talk aims to answer the following two questions: Can we defend against malware by exploiting bugs in it? How can we use fuzzing to find those bugs automatically? The speaker will show how we can apply coverage-guided fuzzing to automatically find bugs in sophisticated malicious samples such as the Mirai botnet, which was used to conduct one of the most destructive DDoS attacks in history, and various banking trojans. A new cross-platform tool implemented on top of WinAFL (called netAFL) will be released, and a set of 0day vulnerabilities will be presented along with several exploitation demos. Do you want to see how a small addition to an HTTP response can stop a large-scale DDoS attack, or how smart bit flipping can cause RCE in a sophisticated banking trojan? If the answer is yes, this is definitely your talk.
Transcript [en]

So, hi everyone. My name is Max. I presented this talk at DEF CON last year, and today I'm going to present an updated version of this research. First of all, I'd like to thank the BSides organizers for giving me the opportunity to speak here; it's really great. I'm a vulnerability researcher, mostly focusing on bug hunting in memory-unsafe languages, and I try to write exploits for those bugs and then use them in all kinds of security operations. This presentation is logically divided into three parts. In the first part I'm going to explain why, where, and how we can search for bugs in malware, and why coverage-guided fuzzing is the best technique to find

those bugs, and what kinds of problems we have to address when our target is malware. And of course I'm planning to show several interesting demos on how to fuzz malware and how we can exploit bugs in it in order to get some profit. Before I actually started fuzzing binaries, I decided to find and look over some leaked malware source code, just to understand whether this idea is feasible or not. Does it make sense at all? And guess what: right in one of the first source files I found this comment in Russian, which can be translated in the following way. I was really laughing for a couple of minutes. I said, okay, it

looks like this idea makes sense and I'm going to find a lot of bugs. When they write malware, they have to do a lot of complex things, like initial infection, payload delivery, and, most importantly, communication with the C2. There are a lot of things that can potentially go wrong here. So an ideal place for us to find bugs would be a parser of incoming commands from the C2, or some really complex file format parser. While some samples leverage very trivial algorithms to communicate with the C2, there are a lot of samples that support complex communication protocols implemented from scratch. Despite this complexity, black-hat malware writers are rarely interested in writing secure code, for many

reasons, such as lack of time, experience, or expertise. So in most cases we will not see ASLR, DEP, or any other anti-exploitation techniques applied to malicious binaries. Sometimes the code is so badly written that it doesn't work if the environment changes slightly. And of course, searching for bugs is always fun; I think you'd say the same. Hacking back in general is a pretty well-known research topic, and I can safely guess this idea has lived in the hacker community for a decade. There were a bunch of great talks in the past; I just listed a couple of them. But what about actually fuzzing malware? Well, there are far fewer publications in this field. Actually,

there is no systematic research at all. I found several research papers published by academia, but the main goal of those researchers was to find and trigger new code paths hidden in malware samples using fuzzers, which can be really useful for malware analysts. In this talk I'm going to focus on bug hunting and how we can use these bugs to defend against malware, which is a bit the opposite. So let's imagine that we found some memory corruption bug that crashes some sample that is spreading around the planet. Such a bug might be quite useful. I guess many of you remember the famous kill switch found in the WannaCry sample, which significantly helped to slow down the spreading of this sample and

in this way reduce its malicious impact. So if you place one file with a special name in one specific folder, WannaCry will not infect that machine. Of course, they left this kill switch on purpose, but if we can find such switches automatically, or semi-automatically, we don't even need such gifts from them. It's especially cool if you can do that against a botnet which is trying to perform a DDoS attack against us. For example, if the bots have some vulnerabilities in the victim's response parser, we just need to send our exploit back to the bot and it will cause a crash. Later, in the demo, I'll show that it's more than possible. Well, it would be really great if we

can trigger remote code execution. We can take control over or shut down the entire botnet, or we can even try to track down botnet owners, and a lot of other things. And of course our sweet dream bug is RCE in the C2. In this case we have god mode and can do whatever we want. But in my opinion, nowadays it's less likely, because the C2 is usually written in memory-safe languages like PHP, Go, Python, whatever. Actually, I don't see any reason to write it in C or C++; maybe performance. Okay, how can we find those bugs? Today fuzzing is the most efficient technique to search for bugs in memory-unsafe languages. Actually,

fuzzing is very important for software security overall. Top tech companies and huge open source projects that integrated fuzzers into their software development lifecycle all report that security has increased after they applied fuzzing. Linus Torvalds recently said that coverage-guided fuzzing is improving Linux kernel security, which is very cool. Okay, what's fuzzing? Fuzzing is actually a very simple technique: you provide potentially invalid or malformed input to your software and monitor your program for crashes. So nothing hard: you start your fuzzer, the fuzzer generates input and sends this input into the program. All you need to do is sit and pray that it will find something. Usually, when my fuzzer reports one unique crash,

already I'm really happy. So what's coverage-guided fuzzing? Many years ago, when fuzzers were dumb and blind, the fuzzer considered the program a black box into which we sent our test cases. It usually worked pretty well for trivial bugs that are located not deep in the code. But people wanted to find more complex problems deeper in the program, so they decided to instrument the program under test and provide information about coverage back to the fuzzer. The best example of such a fuzzer is the famous American Fuzzy Lop, or AFL. During coverage-guided fuzzing, if we manage to find a test case that triggers a new code path in our program, the fuzzer saves this new test case and then

performs subsequent mutations on top of this new finding, and then for the next code path, and for the next one. This way we can touch more code deeper in our program. In theory, of course, a blind fuzzer can also find this code path, but it's much less likely. Let's consider, for example, this piece of code. In the case of AFL, it's going to take minutes to find this null pointer dereference, and it might take thousands of years for a dumb fuzzer. So you can see the problem, yeah, and why it's a very efficient technique to search for bugs. Today the state-of-the-art coverage-guided fuzzers are AFL and libFuzzer. There are a lot of AFL forks designed for special

purposes, like the kernel-mode fuzzer kAFL and, what is more important for us, a port of AFL for Windows, WinAFL. AFL basically injects instrumentation routines during the compilation step, so the resulting binary will have this afl_maybe_log routine injected in each basic block. However, in the case of malware we have one tiny problem, and I guess many of you know it: you don't have source code. I guess that's no surprise, right? Actually, we have even more problems. Malware usually unpacks and executes the most important part of its code dynamically at runtime, so in this case source code instrumentation, even if we had source code, is useless. We have to find some way to provide

our fuzzer with coverage of such dynamically unpacked and executed code paths. We can try some tools to automatically unpack samples; sometimes it works, sometimes it doesn't, but in general I think it's not a scalable approach. Besides that, if we want to search for bugs in the C2 communication protocol, we have to encrypt our test cases the same way the malware does. So there are a lot of requirements for our fuzzer in this case. Thank God there is WinAFL for Windows binaries, which doesn't use source code instrumentation; it's implemented on top of the DynamoRIO dynamic binary instrumentation framework. I don't want to explain dynamic binary instrumentation

in detail, but basically it's a technique for analyzing the behavior of a binary application at runtime through the injection of instrumentation code. I want to give just the basic idea. Let's say you have a DBI engine launcher and a binary you want to instrument. At step one, you launch your binary suspended and inject your instrumentation library into the process memory. Then you hook the entry point to redirect control flow into your injected library, and at this step the magic actually starts. DynamoRIO takes the first basic block, copies it into a special place called the code cache, then transforms this basic block dynamically to inject instrumentation instructions, and then executes it. The most challenging

part is to make this execution transparent to the instrumented binary, and DynamoRIO knows how to achieve this transparency. It's a very sophisticated tool, and I highly recommend it if you want to analyze your binary dynamically. Then it takes the next basic block, instruments it, executes it, and so on. We had three challenges: lack of source code, obfuscation, and encryption. WinAFL plus DynamoRIO solves the first problem and actually creates a new one: WinAFL supports only file-based fuzzing. To address this problem, I decided to implement a patch for WinAFL. Suppose we have our fuzzer and our malware instrumented by DynamoRIO. Let's assume our sample sends a request to the C2.

Instead of actually sending it to the C2, we redirect this request to our fuzzer. Our fuzzer generates a new test response, encrypts it if necessary, and then sends this response back to our sample. Then we update the coverage map, provide coverage back to our fuzzer, restart our target routine or the entire sample, and so on until we find a crash. So all you need is to specify the address, port, and seed file; that's all, and the fuzzer will do the rest for you. And if you need to encrypt your test cases before sending them back, you can provide a path to your custom test case encryption library. WinAFL will load this library and will use the exported

encryption function from your library to encrypt each test case. Okay, let's see how it actually works. I've prepared a small video.
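The redirect, mutate, encrypt, reply loop described above can be sketched in a few lines of Python. This is only a toy model under loud assumptions, not netAFL or WinAFL code: `ToySample` is a made-up stand-in for the instrumented malware and reports coverage as a set of branch names, `xor_encrypt` is a hypothetical stand-in for the exported function of a custom encryption library, and everything runs in one process instead of over a socket.

```python
import random

KEY = 0x5A  # hypothetical one-byte key shared by the toy sample and the hook


def xor_encrypt(data: bytes) -> bytes:
    """Hypothetical encryption hook; XOR is its own inverse, so it also decrypts."""
    return bytes(b ^ KEY for b in data)


class ToySample:
    """Stand-in for the instrumented sample: decrypts a 'C2 response' and parses it."""

    def __init__(self) -> None:
        self.coverage = set()

    def handle_response(self, wire: bytes) -> None:
        self.coverage = {"entry"}
        data = xor_encrypt(wire)
        if data[:1] == b"$":
            self.coverage.add("command")
            if data[1:2] == b"!":
                self.coverage.add("command-bang")
                if data[2:3] == b"#":
                    raise RuntimeError("simulated crash in the command parser")


def mutate(rng: random.Random, data: bytes) -> bytes:
    """Apply one random byte overwrite, byte insertion, or bit flip."""
    buf = bytearray(data) if data else bytearray(b"\x00")
    choice = rng.randrange(3)
    if choice == 0:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    elif choice == 1:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    else:
        buf[rng.randrange(len(buf))] ^= 1 << rng.randrange(8)
    return bytes(buf)


def fuzz(seed: bytes, encrypt=xor_encrypt, max_iters: int = 500_000, rng_seed: int = 1):
    """Return (crashing_plaintext, iterations), or (None, max_iters) if nothing crashed."""
    rng = random.Random(rng_seed)
    corpus = [seed]
    seen = set()
    for iteration in range(1, max_iters + 1):
        case = mutate(rng, rng.choice(corpus))
        sample = ToySample()  # "restart" the target for every test case
        try:
            sample.handle_response(encrypt(case))  # reply to the redirected request
        except RuntimeError:
            return case, iteration
        if not sample.coverage <= seen:  # new branch reached: keep this test case
            seen |= sample.coverage
            corpus.append(case)
    return None, max_iters


if __name__ == "__main__":
    crash, iterations = fuzz(b"AAAA")
    print(f"crashing input {crash!r} found after {iterations} iterations")
```

Removing the coverage check, so that the corpus never grows, turns this into the blind fuzzer from earlier in the talk: with one mutation per test case it can never assemble the three-byte prefix from this seed.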

maybe

So in this video we have a release version of WinAFL, and we have Dexter version 2, malware designed to steal credit and debit card information from point-of-sale terminals. It sends and receives requests over the HTTP protocol and uses get-cookie functions so that commands can be sent to the sample. When the malware sends a request, the response sets the browser's cookies, and then the malware decrypts these cookies to execute commands on the infected machine. If a command starts with a dollar sign, it will execute the command on the machine. What we need is to implement our init function. In this function we have to listen on our port to be able to receive this request,

and we have to set up these cookies in our browser. So we call accept, then we call receive, and we send a response to our malware saying that everything is okay, we received your request, and we are ready to set the browser's cookies. The cookies are set using an API call, and that's it. All we need now is to compile this function and we are good to go. This command looks a bit long, but I'm going to explain it; it's easy. In the first parameter we specify our custom encryption library, then the port to listen on. Then we have to specify the standard in and out directories for WinAFL, and then the DynamoRIO

release, the timeout, the target module where we want to search for bugs, the target method we want to fuzz, the coverage module from which we want to feed coverage back to the fuzzer, the number of iterations after which WinAFL will restart our target, and the path to our binary. That's it, we are ready to start. As you can see, everything successfully started and loaded, we set up our out directories, and we are ready to launch our test cases. So WinAFL started and it sends our test cases. The statistics look pretty healthy: we have already discovered six new paths in our binary, execution speed is pretty good, and path geometry and stability look good,

and it's a bit slow, but we are running in a virtual machine with no parallelization, so 23 executions per second is pretty normal. If we leave this fuzzer running for three or four hours, it will find the crashes that we want in this sample; this is a screenshot. Okay, let's see what I managed to find in malware. The first case study is Mirai. Mirai is malware that targets IoT devices and uses them as part of a botnet in large-scale DDoS attacks. This malware was used in some of the largest and most disruptive DDoS attacks in history, which caused major

Internet platforms and services to be unavailable to a large number of users in different regions of the world. In 2017 the source code of Mirai was leaked, and different Mirai-like botnets adapted it and are still operating. The fun fact about Mirai is that its authors actually used some security practices and applied the Electric Fence tool to search for heap overflows and use-after-free bugs in the code, which is quite unusual for malware. Mirai has DDoS capabilities based on HTTP flood and several low-level network attacks. The most interesting part for us in terms of exploitation is the HTTP response parser. Mirai actually needs to parse HTTP responses coming from the victim to be able to perform an HTTP flood

attack, so it needs to understand what is actually running on the target. This parser has around 800 lines of code and hundreds of potentially dangerous operations with memory, pointers, and strings, which makes it a really great target for fuzzing. As a seed file I decided to use a very basic HTTP response. After fuzzing this sample for 24 hours, I managed to find 42 unique crashes, which were caused by a single bug in the relative URL handler. Execution speed was around one or two thousand executions per second, which is pretty good, and the fuzzer managed to find approximately 430 unique paths. What was the root of this bug? If our HTTP response contains a relative URL, this code

branch is triggered. In the case of an incorrect relative URL, the variable ii is always equal to a negative value, which causes a memory violation and a crash. This is a logical error: the author used this variable for some other purpose earlier in the code and forgot to set it to zero in the relative URL case. Just a logical error. Okay, I just want to show how it works.
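The bug class described here, an index variable reused from an earlier step and left negative on the relative-URL path, can be modeled as follows. This is a hypothetical Python sketch, not Mirai's actual C code: the function names, the parsing logic, and the `MemoryViolation` exception are all assumptions made for illustration. Because Python would silently wrap a negative index, `read_from` checks bounds explicitly to stand in for the out-of-bounds access.

```python
class MemoryViolation(Exception):
    """Stands in for the segmentation fault a C program would hit."""


def read_from(buf: bytes, start: int) -> bytes:
    """Model a C-style read of buf[start:], faulting on an out-of-bounds start."""
    if start < 0 or start > len(buf):
        raise MemoryViolation(f"read at offset {start} outside buffer of {len(buf)} bytes")
    return buf[start:]


def redirect_path(location: bytes, reset_index: bool) -> bytes:
    """Extract the path from a Location header value (toy logic, not Mirai's)."""
    scheme = b"http://"
    ii = location.find(scheme)  # earlier use of ii: -1 when no scheme is present
    if ii != -1:
        # Absolute URL: skip scheme and host, keep everything from the first '/'.
        slash = location.find(b"/", ii + len(scheme))
        return read_from(location, slash) if slash != -1 else b"/"
    # Relative URL branch.
    if reset_index:
        ii = 0  # the fix: reset the stale value before reusing it
    return read_from(location, ii)  # the buggy variant still holds -1 here


if __name__ == "__main__":
    print(redirect_path(b"/login", reset_index=True))
    try:
        redirect_path(b"/login", reset_index=False)
    except MemoryViolation as exc:
        print("crash:", exc)
```

The `reset_index` flag exists only to show the buggy and fixed behavior side by side; in the real sample the equivalent of the buggy path is what a crafted response would reach.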

So in this virtual machine we have a debug build of Mirai in the left terminal, and on the right we have our Python server that will answer with this malformed packet. Now we are ready to start Mirai. We start it; it says it's connected to the C2, and it's sending requests to our server. After several hundred packets we can answer with our packet containing the malformed URL, and here we have our segmentation violation. Unfortunately it's just a denial of service, but this way, if malware like Mirai is trying to attack your environment, you can just answer with this simple HTTP response and the bot will crash, and this way you can protect yourself. Okay, next

example. I already presented this sample when I was showing WinAFL. The first version of Dexter was one of the first known botnets to target point-of-sale terminals. Dexter communicates with the C2 over HTTP via POST requests and receives commands in the response cookies. In the case of Dexter, it turns out that we don't even need a fuzzer: the author doesn't expect anything malicious coming back from the C2. To trigger this bug we can simply send a long command, over 255 bytes, without a trailing pound sign, which causes a stack buffer overflow in multiple places in this function, for example in the copy-till function. So this bug is definitely exploitable for RCE. It's really

old-school: you just need to send a long enough command without a trailing pound sign. TinyNuke is a Zeus-style banking trojan designed to perform man-in-the-browser attacks using web injects. It has this web-inject loading routine and a huge JSON parser. You can just use this JSON response as a seed file and then run the fuzzer. After 24 hours I found four unique crashes. The root cause of these crashes is infinite recursion in this function: if you provide a long enough response that contains only opening brackets, in my case seven thousand, the sample will stop execution. I also found this tool very useful when you want to find a

target routine for fuzzing. I implemented this tool for malware analysis on top of DynamoRIO. It's basically ltrace for Windows, but transparent to the instrumented binary. It will trace all API calls in the malware and print this information to a file, so it's less detectable than standard API call tracers. You can give it a try; it's open source under a BSD license. So, bugs in malware might be useful, and you can really find them using fuzzing. Of course, WinAFL can and should be used for searching for bugs in network-based applications. It's a general-purpose fuzzer; you can use it to find bugs in benign software, of course. Last year I found one CVE, a heap overflow in a network-based application

for Windows, using WinAFL. Thank you for your attention. You can use this link to find more information about me. [Applause]
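The TinyNuke finding from the case studies above, a response made only of opening brackets that drives the JSON parser into unbounded recursion, is easy to reproduce in miniature. The parser below is a hypothetical toy, not TinyNuke's code, and its grammar is an assumption chosen to keep the sketch short. CPython's recursion limit, pinned to 1000 for the duration of the call, plays the role of the finite native stack.

```python
import sys


def parse_array(text: str, pos: int = 0) -> int:
    """Parse nested '[...]' arrays recursively and return the position after them.

    Toy grammar: an array is '[' followed by zero or more arrays and an optional ']'.
    There is no depth limit, which is the flaw being illustrated.
    """
    if pos < len(text) and text[pos] == "[":
        pos += 1
        while pos < len(text) and text[pos] == "[":
            pos = parse_array(text, pos)  # one stack frame per nesting level
        if pos < len(text) and text[pos] == "]":
            pos += 1
    return pos


def survives(text: str) -> bool:
    """Return False if parsing exhausts the simulated stack."""
    old_limit = sys.getrecursionlimit()
    sys.setrecursionlimit(1000)  # stand-in for a finite native stack
    try:
        parse_array(text)
        return True
    except RecursionError:
        return False
    finally:
        sys.setrecursionlimit(old_limit)


if __name__ == "__main__":
    print(survives("[[[]]]"))
    print(survives("[" * 7000))
```

In native code there is no exception to catch: the same pattern runs off the end of the stack and the process dies, which matches the behavior reported for the sample.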

So now it's time for Q&A. We have two questions online, Max. The first one is: where are we relative to fuzzing stateful protocols beyond the first message? I didn't get this question. Neither do I. The next one is more of an ethical one: remote code execution on a bot, what is the ethical consideration for running code on remote random systems, in contrast to "just" crashing them without remote code execution? This is a great question, actually. Yes, there are a lot of ethical concerns when you do hacking back, and there were a lot of talks at DEF CON in 2017.

Patrick Wardle presented ethical concerns about hacking back. Let's say we want to cause remote code execution in our malware, but what if this malware is actually running in one of our customers' networks, or some other network? In this case we are actually hacking back into some other benign environment, which is a really serious question. Yeah, in the case of remote code execution I can say that there are a lot of legal concerns. I had this slide in my DEF CON presentation, but I removed it from this presentation just because we have 30 minutes. But if you want more details, you can take a look at Patrick Wardle's presentation at DEF CON

2017 or my talk at DEF CON last year. Hey, thank you, it was a great talk. I have a very basic question. We know that by default AFL and WinAFL are not able to detect child processes. Do you know how to tackle those kinds of problems? How do you detect bugs in child processes? Yeah, so the question is: if we have our target binary that runs some child process, how can we search for bugs in this child process, am I right? This is a good question. Actually, I don't know; I have to think about this. To be honest, I don't know. I know that in the case

of DynamoRIO, for example, you can inject your client DLL into all processes with a specific name, so probably this way you can address this problem. And DynamoRIO by default follows all child processes. So yeah, it probably requires some patch for WinAFL, but in terms of binary instrumentation, in terms of providing coverage back to the fuzzer, there are no fundamental limitations. One last question for you [inaudible]: where did you source your ATMs for research from? Very interested in doing my own experiments on ATMs. Yeah, so basically I focused on Dexter, malware that targets Windows. So this part of Dexter is actually,

yes, Dexter, just a second. So yes, it's point-of-sale malware, but it targets Microsoft Windows terminals, so in this case I don't need my own ATM or any other hardware. This malware is like general-purpose Windows malware: we can just run it on a Windows machine and then fuzz it with WinAFL. Thank you, Max, it's been a pleasure. Thank you.