
The Power and Perils of Binary Emulation for Malware Analysis - Anuj Soni

BSides Philly · 54:28 · 231 views · Published 2024-01
About this talk
Binary emulators, which simulate the execution of instructions or an entire program, provide a compelling solution for automating the deobfuscation of code and data during malware analysis. When faced with malware that implements custom encryption or obfuscation algorithms, publicly available libraries rarely help. One solution involves writing custom deobfuscation code, but this can prove to be tedious and time-consuming. An alternative is leveraging emulation to perform activities like API hooking, unpacking, string decoding, and config data extraction. Emulators facilitate automated analysis, but there are caveats to consider. First, because emulators do not have access to all components of an operating system, any emulator tool or framework is limited in its ability to simulate operating system objects, APIs and other resources. Addressing these limitations often involves implementing the required API or OS resource. Second, although emulators can save time for analysts, their performance generally lags behind running the code on the intended operating system or other approaches to automation, such as Dynamic Binary Instrumentation frameworks. This talk will delve into various binary emulation frameworks and libraries such as Speakeasy, Qiling, and dumpulator, all of which rely on the powerful Unicorn engine, a CPU emulator. By exploring the available options for binary emulation, discussing their advantages and limitations, and providing practical guidance in the context of malware analysis, this session aims to empower reverse engineers to expedite their analysis and automate their workflow. BSides Philly 2023
Show transcript [en]


I hope you all had a wonderful lunch break. My name is Angus Chen, and I'm your track one host. Our next speaker is Anuj Soni. His topic is The Power and Perils of Binary Emulation for Malware Analysis. Let's give him a round of applause.

You would not have any chance of hearing me without the headphones. All right, there we go. Thank you. All right, everyone, welcome back from lunch, and welcome to your upcoming food coma. My name is Anuj Soni. I'm a principal threat researcher at BlackBerry. I'm really excited to be here: this is my first BSides Philly conference and only my second BSides ever. I'm actually local to this area; I live

in Bucks County, just a bit north of the city, so I'm happy to come home. A couple of things I want to get out of the way before I get into the actual material. First, yes, BlackBerry still exists. I can't tell you how many times I'm asked that question after I tell someone I work there, as if I might be making it up. But BlackBerry does still exist. No, I do not use a BlackBerry phone, nor has anyone ever tried to force me to use one, although I recently did acquire this beauty right here for my office: a disassembled BlackBerry Curve 9300. After years of working at BlackBerry and getting tired of people asking me the question, I figured

if I don't use one, I might as well still have one, so I got this off Etsy, which is pretty nice. I'm a malware reverse engineer at BlackBerry, and formerly I was at Cylance. Anyone ever heard of Cylance before? They're an AV company that got acquired by BlackBerry a few years ago, hence I now work for BlackBerry. I spend my days analyzing malware, which of course is the focus of the topic here today. I help understand detections based on our products and also support incident response. I'm also a SANS instructor and author: I co-author the SANS reverse-engineering malware course and authored the advanced code analysis class that came out a couple of years ago. And I've recently

started creating YouTube videos. I've only got like five or six up there; they are a lot more work to create and publish than I thought. But if you want to check them out, they of course focus on malware analysis topics. Or at the very least, just search for my name and you can see silly photos of me pointing to random things, because that's what people do in YouTube thumbnails. And finally, even though I'm from this area, I live in Maryland right now with my wife and three young daughters. As much as my wife and I tried to escape pink and unicorns, they did find us, and unicorns actually come up quite a bit in this

presentation, though in a slightly different context. So let's now talk briefly about what I'm going to get into here, because maybe you're trying to decide: do you stay here, or do you walk over there? By the way, I looked at that abstract and that title, and I was very impressed even based on the first few sentences. Anyone here used to watch It's Always Sunny in Philadelphia? That talk somehow managed to work in that reference both in the title and the description, which is pretty cool. But getting back to this talk, what am I going to talk about here? Well, we're talking about binary emulation, which is basically the simulation of execution. So whether you're

executing individual instructions that are generally executed by the CPU, executing functions, or executing an entire program, binary emulation allows you to simulate it. When I say simulate, I mean that the instructions aren't being executed directly by the CPU; they're being run by software. That provides a lot of flexibility and interesting opportunities for analysis, as you'll see over this presentation. Now, why are we doing this? Well, as a malware reverse engineer, I spend the vast majority of my time focused on deobfuscation. These days, the hard part about malware analysis, once you've been doing it for a little while, is not necessarily just understanding functionality; it's getting to the point where you can even read

what that functionality is. Most strings embedded in malware these days, for example, are obfuscated or encrypted. Configuration data is generally going to be encrypted. Next-stage executable content, whether you're talking about shellcode or additional executables that might be pulled down from the internet or embedded into the actual program, is generally going to be encoded or encrypted as well. So anything that speeds up that process, allows us to overcome the complexity of understanding algorithms, and most importantly allows us to scale is something that, as a malware analyst, you want to dig into, which is why I chose this research topic. Now, how are we going to do this? Well, we're going to look at

it in the context of several frameworks that have been available for quite a while and that are very helpful for getting started with emulation for malware analysis. Unicorn is one of the most popular frameworks; it is really the foundation for all the other frameworks you see listed here, such as Mandiant's Speakeasy, the Qiling framework, and dumpulator. I'm going to touch on all of these briefly, but we're going to focus specifically on Unicorn, because it is the foundation for all the other frameworks, and on the Qiling framework. Now, just some malware analysis warnings here, some trigger warnings for what is to come: there will be some code here. So there's going to

be some Python, and there's going to be some assembly as well, x86 and x64. My disassembler of choice when I reverse engineer malware is Ghidra. Anyone here use Ghidra, or ever used Ghidra? Okay, we've got a few takers. I just want to make clear that, if you don't have a background in malware analysis, I don't expect you to understand everything that's on my screen. My hope is that by the end of the presentation, at a bare minimum, it piques your interest. I also want to say that if you're interested in any of the code or any of the samples I discuss, just shoot me an email. I'll give

you my email address at the end of the presentation. Just take a picture of it, shoot me an email, put BSides in the subject, and I'm happy to send you a zip file full of malware. Well, there are going to be scripts in there as well. The bottom line is that I can't show a hundred lines of code on a slide, so I'm going to show you excerpts, but all the material you see referenced here will be in the file I send you if you're interested. By the way, has anyone here actually done malware analysis before? I know some people raised their hands for Ghidra, so maybe other people have touched it.

Okay, so I'll do my best as we go through here to touch on some of the basics, though of course there's not enough time to do all of that. Just understand that even if you don't get all the lines of code on the screen, I want you to try to hook on to the overall flow and the mindset. Hopefully that piques your interest enough to take a look at some samples, run some of the code that I can provide you with, and see what some of the benefits are. So let me give you a preview of where we're going with this. What I have on

the screen is disassembled content from a piece of malware. This is an ELF binary, so it targets Linux. Essentially, I loaded this up into Ghidra, and I'm showing you just an excerpt of it. Now, even if you haven't really seen assembly before, I can tell you that probably the most important instruction you'll focus on is the call instruction. Let me bring up my pointer here: these are the call instructions you see right here. A call instruction basically calls a function. There's going to be an operand, which comes right after it, that's going to be the function

or the address of the function that will be called. During execution, as it goes from top to bottom, when it hits the call instruction it'll jump to a location in memory where there is a function, and it will execute that function. During reverse engineering, understanding functions is very important because, after all, the goal of reverse engineering is to understand the underlying functionality, and of course functions contribute to that. Now, you'll see this function actually has a name: dec_conf, for decode configuration. While you can, and should, rename functions along the way during the reverse engineering process as you learn about them, I actually did not rename

this one; it came like this. That's an indication that symbols were not stripped, so a lot of the useful information a developer would actually use, such as function names, variable names, and argument names, was still in this ELF binary. This is an example of an XorDDoS trojan, and as the name implies, the purpose here was first to gather some information from the host but, more importantly, to help facilitate DDoS attacks against a specified target. So again, we've got some assembly here, and we've got the call to a function called dec_conf, for decode configuration. That's a pretty good hint that that's probably what the function does. Now, as we look at a

function during nearly any RE effort, what you'll typically do once you identify a function is also look above it. By above it, I mean these instructions right here: usually the instructions immediately preceding a call instruction specify the arguments being passed to that function. One of the arguments I want to focus on here is referenced with this label: DAT_ followed by a hexadecimal value, which is essentially a virtual address. That's a pointer, and a pointer just points to some interesting stuff in memory. It turns out that what it points to, if I continue here with my slides, is this data you see on the right-hand side. It's just a sequence of bytes.

It's not meaningful. It looks like it's got some ASCII in there, but we certainly can't make sense of it just by reading it. If you were to look at all references to dec_conf, which is what I'm showing on the right-hand side (in Ghidra, or in any disassembler, you can see the references to a function), you'll see that there are 13 references. If you were to actually dig into dec_conf, you would find many mathematical operations like add, XOR, shift right, shift left, and so on. I'm not going to show it here on the screen because it's way too long. So all this information I'm presenting you here with,

the fact that there are multiple references (13 of them), the fact that there are mathematical operations in this function, and the fact that there's an argument being passed to the function that points to what appears to be gibberish, all of that seems to indicate that this function might in fact perform some sort of decoding. As a reverse engineer, one of my jobs is to extract indicators of compromise as quickly as possible. You might be familiar with that term: we're talking about IP addresses, file names, domain names, and so on, anything that might help identify this activity on a host or across the network. So my goal is to, as quickly and as efficiently as possible,

extract those values. I also want to keep in the back of my mind, and this is the case in many incidents, that you don't generally find just one piece of malware. Oftentimes there are multiple, maybe even tens, less likely hundreds, so I don't want to have to do this manually over and over again. Ideally, if you have time, you create something automated that allows you to scale your analysis not only to this particular case but to other ones like it that you might encounter. So what are some options we have for trying to deobfuscate those random bytes, those strings, in order to make them something more sensible? One option is to,

given that this is an ELF binary, go ahead and execute it in Linux and just take a look at memory. That's probably the quickest approach, but arguably not very precise. When you execute it, first off, I've got to hope it doesn't terminate immediately, because if it terminates, the strings in memory aren't going to be there anymore, or at least not in a way that I can easily probe. But even if they are in memory, it's hard to know which strings in memory correlate with those obfuscated strings. So there are some challenges there. I could debug it with GDB. Again, I'd have to go to

a Linux machine and actually use GDB to debug it and set the appropriate breakpoints. One challenge there is that when you execute any program, you're basically exploring one path of execution. Maybe that path of execution is the interesting one, maybe it's not, but most programs have multiple decision points throughout their execution, and who's to know if you're hitting all of those decision points correctly? Maybe the malware has detected that it is within a VM and is going to operate differently. So again, not very precise. I also want to say up front that 99% of my job involves looking at Windows malware, and so I do most of my

malware analysis in Windows. So when someone gives me an ELF binary, I get a little nervous. When someone gives me a Mach-O binary, I'm likely to just have a sudden urgent call from my wife, and you'll probably never see me again. I like to stay within my Windows VM as much as possible, and emulation does allow me to do that, as you'll see here in a moment. Now, another possibility is to write a Python script. Maybe you're really good at writing Python; you can do that here and it would work, but there are certain things you need to investigate that might take some time. First of all, you've got to figure out where

all of those obfuscated strings are in the binary and find them programmatically, so you're going to have to create some sort of approach to doing that. You're also possibly going to have to implement an algorithm. I didn't get into the actual algorithm used by the function that does the decoding, but it could be pretty complicated. Maybe you get lucky and it's a known algorithm; maybe it's just Base64 decoding, in which case you can just find a Python module. But in many cases it's not. It might be something more complex, something completely custom, or a modified version of a known algorithm. All of that makes this task a little

more challenging, and that, of course, is where emulation comes in. The beauty of emulation, although I will say it has some downsides that I'll get into, is that it doesn't matter what platform you're on: I can be on a Windows machine and use emulation to essentially run Linux code. And it doesn't matter what the algorithm is, because if the program is just going to run its algorithm at runtime in order to deobfuscate the content, I can use the power of emulation to run that exact code, so I don't even have to look at the algorithm, and it'll do the work for me. Now, the end result

would be the creation of a Python script. This is the script I've got, this is what it's called, and I basically run it against this file, this ELF binary, and I'm able to output all of the decoded strings. Now, I know what you're thinking: cool, man, you're showing me some script output here, but what's in the script? I'm getting there, but what I want to show you is just a glimpse of the power of emulation. I ran this on a Windows machine. I emulated this ELF code on a Windows machine, and I didn't even look at the actual decoding algorithm. I didn't have

to touch any of those mathematical operations; I just used the code built into the program to perform the deobfuscation. So what are the caveats with emulation? There are a lot of them. You are in an emulated environment, which means it's not the real environment of the operating system, and so there are some limitations. When you emulate code, you don't necessarily have access to all of the file system objects or operating system resources. You'll see how that manifests when I show you some more examples. Performance is definitely slower than running it natively on the operating system. Also, prior manual analysis is required.

Although you can create a sandbox with one of these emulator frameworks, with the approach I'm going to show you here, which probes specific instructions and functions, you can't just run a program and expect it to work. You're going to have to do some manual analysis first. That's why emulation is a good approach to automation: the manual work has to come first, and then you think about automation. It is best suited for targeting small groups of interesting functions. As a reverse engineer, you can identify what the interesting functions are. For example, this one has a bunch of mathematical operations; I don't exactly know what it does, but I can see that it's doing something of note. Lots

of tools out there, like Mandiant's capa, will help you identify interesting pieces of code. You do have to have that background and that knowledge, and then you can apply emulation to those targeted scenarios. So this is definitely more on the advanced side of the house when considering automation; don't confuse this with just running something in a sandbox. All right, let's talk a little bit about Unicorn, which is the foundation for just about every emulation framework. Unicorn is basically a CPU emulator. That means it is really good at executing individual instructions, like the ones I showed you on my screen earlier, but that's it. It doesn't know anything about Windows.
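Editor's note: to make the idea of a CPU emulator concrete, here is a minimal sketch of instructions being carried out by software instead of by the processor. It is a toy: the three-instruction "machine" and the `emulate` function are invented for illustration and are not Unicorn's API or real x86 decoding. Like the emulator described in the talk, it knows registers and instructions and nothing about files or operating systems.

```python
# Toy illustration only: a made-up three-instruction register machine,
# not Unicorn and not real x86 decoding.
MASK32 = 0xFFFFFFFF  # keep register values within 32 bits


def emulate(program, registers=None):
    """Interpret (opcode, dest, source) tuples entirely in software."""
    regs = dict(registers or {})
    for opcode, dest, source in program:
        # A source operand is either a register name or an immediate value.
        value = regs.get(source, 0) if isinstance(source, str) else source
        if opcode == "mov":
            regs[dest] = value & MASK32
        elif opcode == "add":
            regs[dest] = (regs.get(dest, 0) + value) & MASK32
        elif opcode == "xor":
            regs[dest] = (regs.get(dest, 0) ^ value) & MASK32
        else:
            raise ValueError(f"unsupported opcode: {opcode}")
    return regs


program = [
    ("mov", "eax", 0x41),   # load an immediate into eax
    ("xor", "eax", 0x20),   # flip one bit
    ("add", "eax", 1),      # increment
    ("mov", "ebx", "eax"),  # copy register to register
]
final = emulate(program)
print({name: hex(value) for name, value in final.items()})
```

Because every step is ordinary Python, the analyst can inspect or change any register between instructions, which is the flexibility the talk attributes to emulation.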

It doesn't know anything about any other operating system, macOS or any Linux operating system. It doesn't know anything about file types: it doesn't know what an ELF binary is, or a Portable Executable, a Windows executable file. It just knows instructions, and it's really good at that, which is why all these other frameworks that try to add file-type awareness or operating-system awareness use Unicorn as a foundation. But let me discuss an example where you could just use Unicorn; this is a good way to understand the basics and the challenges of emulation. You might be familiar with the story from October: the FBI took down Qakbot. I'm using air quotes

there because those types of activities are important, but they're usually temporary. There's already some reporting that the infrastructure is quite alive and well. If you read up on the story, you might know that the FBI actually deployed code to machines infected with Qakbot, and not only did they deploy the code, they provided it to the public, which was pretty cool. So it's not really malware, I guess; because it's us, I guess that makes it okay. But it does use a lot of techniques you see in malware analysis. So I took a look at this shellcode. I actually did a deep dive into it in one of my YouTube

videos, which you can check out if you like. I'm just going to talk about it here from the perspective of emulating some of that code, but there's more information in that YouTube video. What I'm showing you here is a screenshot of some of that code. By the way, they deployed shellcode, so it's not a Windows executable. It did target Windows systems, but it doesn't have a nice header; it doesn't have all that nice information that allows you to just double-click on it. Upon loading the shellcode into Ghidra, one excerpt of code, which I'm showing you here, simply shows a bunch of mov instructions, and

hopefully you can see in the back that it's just a ton of mov instructions. As a malware analyst, when you see a lot of mov instructions like this, that is usually interesting. If you look on the right-hand side, where you see the operands, the values the mov instruction is working with, you'll see a lot of these hexadecimal values. You might think, okay, maybe those are strings or something, and maybe they are. You could start reinterpreting some of those values, and you would find that they actually represent characters, ASCII characters. This is a pretty basic obfuscation technique called stack strings. For most malware analysts, even the most junior of them, one of the first things

they'll do is run the strings utility against a sample to see what sort of strings are embedded, and a lot of times, even today, you get some helpful insight. You might be lucky enough to see IP addresses, domain names, registry keys, and so on. One way malware authors, or in this case the FBI, obfuscate that information is via stack strings, which means using these mov instructions to dynamically create strings at runtime. As each of these instructions is executed, it starts putting together a string that will be used later on in execution. So one approach you might think of to figure out what this stack string is

you could manually extract all these hexadecimal values. Obviously, that is very, very manual. If this were a Windows executable, you could quite easily load it into a debugger, get to this point in the code, and literally step over each instruction. This is shellcode, however, so it takes a little bit more work. You've got to use a utility like shellcode2exe to convert it to an .exe, or there are other tools out there like runsc, which will basically inject shellcode into memory, allowing you to analyze it. But I'm obviously going to show you how to solve this problem using emulation. What I have here are

basically some cells from a Jupyter notebook. Anyone here use Jupyter Notebook? A few people; pretty awesome. I like to use Jupyter Notebook as a way not only to explore a file but also to iteratively build on and improve my code. Again, I don't expect you to understand each and every line here in just a minute or two, but let me highlight some of the important parts of this code. We begin here by doing some imports at the top, which have to do with Unicorn. Then I basically have to insert my shellcode. So these are the opcodes.
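Editor's note: the notebook itself is not reproduced in this transcript. As a rough sketch of the kind of cells described in this part of the talk, assuming Unicorn's Python bindings (`Uc`, `mem_map`, `mem_write`, `reg_write`, `emu_start`, `mem_read`), something like the following maps a stack and a code region, runs a few stack-string instructions, and pulls printable strings off the stack. Everything specific here is an assumption for illustration: the `SC` bytes are hand-assembled and are not the bytes from the FBI code, and the addresses, sizes, and function names are hypothetical.

```python
import re

# Hypothetical stand-in for the SC variable: hand-assembled 32-bit x86 that
# builds "VirtualAlloc" on the stack. NOT the bytes from the FBI shellcode.
SC = bytes.fromhex(
    "c744240056697274"  # mov dword ptr [esp+0x0], 0x74726956  ; "Virt"
    "c744240475616c41"  # mov dword ptr [esp+0x4], 0x416c6175  ; "ualA"
    "c74424086c6c6f63"  # mov dword ptr [esp+0x8], 0x636f6c6c  ; "lloc"
    "c644240c00"        # mov byte ptr  [esp+0xc], 0x0         ; terminator
)

# Arbitrary, page-aligned addresses and sizes chosen for this sketch.
CODE_ADDR, CODE_SIZE = 0x1000000, 0x1000
STACK_ADDR, STACK_SIZE = 0x2000000, 0x10000


def emulate_and_read_stack(shellcode, read_size=0x100):
    """Run the shellcode under Unicorn and return bytes read from the stack."""
    # Imported lazily so extract_strings below works without Unicorn installed.
    from unicorn import Uc, UC_ARCH_X86, UC_MODE_32
    from unicorn.x86_const import UC_X86_REG_ESP

    mu = Uc(UC_ARCH_X86, UC_MODE_32)           # 32-bit x86 emulator
    mu.mem_map(STACK_ADDR, STACK_SIZE)         # allocate space for the stack
    esp = STACK_ADDR + STACK_SIZE // 2         # point ESP into the middle of it
    mu.reg_write(UC_X86_REG_ESP, esp)
    mu.mem_map(CODE_ADDR, CODE_SIZE)           # allocate space for the code
    mu.mem_write(CODE_ADDR, shellcode)         # copy the shellcode in
    mu.emu_start(CODE_ADDR, CODE_ADDR + len(shellcode))  # start, end
    return bytes(mu.mem_read(esp, read_size))  # read back what was built


def extract_strings(data, min_len=4):
    """Return (ascii_strings, utf16le_strings) of at least min_len characters."""
    ascii_hits = re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)
    wide_hits = re.findall(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len, data)
    return (
        [s.decode("ascii") for s in ascii_hits],
        [s.decode("utf-16-le") for s in wide_hits],
    )
```

With Unicorn installed, the pieces would be combined as `extract_strings(emulate_and_read_stack(SC))`, which for these illustrative bytes should recover the string VirtualAlloc. A real sample may build its strings relative to EBP or below ESP, so the region read back from the stack would need to be adjusted accordingly.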

And if I go back here, these are the opcodes that you see on the left-hand side, here in the middle. So these are all opcodes, and on the right-hand side is the representation of those opcodes as instructions. What I basically did is right-click and copy in Ghidra: I took all of the bytes that have to do with these operations that I think are building a string, and I placed them right here into this SC variable. I then initialized the emulator framework, Unicorn. You have to specify the platform and the architecture; that's all this does right here, and this is a 32-bit sample. And then, and this is not

the end, there are a couple of slides like this, I have to start creating the right environment to execute this code. What you're seeing here is, thankfully, what a lot of these frameworks take care of for you, but I'm starting off at the foundation, which is Unicorn, and with Unicorn you have to be very explicit about what you're executing, what you're doing, and where in memory all this stuff is. What you see here on the bottom left is me allocating space for the stack. The stack is a structure in memory where you'll typically find arguments and variables during execution. And as for this basic obfuscation approach of stack strings, well, it's got the word

"stack" in it: it uses the stack, so I need to allocate space for the stack. The most important method here is mu.mem_map, which does the allocation. I specify an address and a size, and that allocates space for the stack. Now, when you're reverse engineering, you also have to be concerned with registers. Registers are kind of like variables; technically they're on-chip memory locations that store various important things, like addresses. One of the important registers is called ESP, and it has to point to the stack. So all I'm doing here is populating the ESP register by providing it the address of the stack. That's just something that needs to be

in place for any code to execute successfully. First I mapped memory, allocating memory for the stack; now I've got to allocate memory for the code I want to execute. That's all this block up here does: you'll see another reference to mem_map, just allocating space for that shellcode, and then I use mem_write to actually write the shellcode to the allocated space. And that's the setup. It's not too bad; once you've done it a few times, you get used to it. You've got to import the appropriate module, set up the stack, set up a location in memory to store

the code, and then you're just going to run it. This next block actually emulates the code. Whenever you emulate code, you have to specify a start address and an end address: start emulating here and end here. That's all this does. It points to the location in memory that I allocated for the code, and then I use len here to get the length of the shellcode, which helps me calculate the end. So I have the start and the end, I run the code using emu_start, and now I need to look at the stack, because the stack is where, hopefully, there will be some sort of

deobfuscated string, the result of the stack string. So here I do a mem_read to read from the stack, and now I just need to look through whatever I read in to see if there are any interesting strings. What I have here is just a function to extract ASCII and Unicode strings. Don't worry about the details; I'm sure you could come up with something better. This is just what I came up with on the fly to extract that information. And if I then run all of this code, which sets

up the stack, sets up memory for the code, writes the code to that location, runs it, reads whatever is on the stack, and then looks for ASCII and Unicode strings, what you get is this output here on the bottom. We see some strings like VirtualFree and VirtualAlloc. Some of them are kind of squished together; that's just how they appeared on the stack. But as a reverse engineer, some of these APIs are particularly notable. VirtualAlloc is one I always focus on because it's used to allocate memory, and whenever a suspicious executable is allocating memory, always ask yourself why. Is it for some next-stage executable

content? Is it going to unpack or deobfuscate something at runtime? Is it going to place shellcode there? Is it about to download something? If it's allocating memory, there's going to be a good reason, so that's something I usually focus on. There's also VirtualProtect, which will come up later in this presentation. VirtualAlloc is used to allocate memory; VirtualProtect is used to update the permissions on a region of memory. When permissions are updated during the execution of something suspicious, oftentimes that's to make the memory executable, and anything that's executable is something a reverse engineer wants to dig into. So, bottom

line: this is one approach to using just Unicorn, which again is just a CPU emulator. It can only execute CPU instructions: no OS awareness, no file type awareness. I extracted the shellcode from the FBI's Qakbot takedown code and put it into my Jupyter notebook (and of course you could replace that in the notebook with any code you like), and it just ran it, took a look at the stack, and extracted Unicode and ASCII strings. That's how we got these deobfuscated stack
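The Unicorn workflow described above (map memory, write the code, run from a start to an end address, read the stack, pull out strings) might be sketched roughly like this. It is not the notebook from the talk: the memory layout constants, the 32-bit mode, and the function names `extract_strings` and `emulate_and_read_stack` are assumptions made for illustration, and it presumes the bytes are a straight-line snippet that only builds strings on the stack.

```python
import re


def extract_strings(buf: bytes, min_len: int = 4) -> list:
    """Return printable ASCII and UTF-16LE strings of at least min_len chars."""
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_re.finditer(buf)]
    found += [m.group().decode("utf-16-le") for m in wide_re.finditer(buf)]
    return found


def emulate_and_read_stack(shellcode: bytes) -> bytes:
    """Run shellcode under Unicorn and return the raw stack contents."""
    # Imported lazily so extract_strings works without Unicorn installed.
    from unicorn import Uc, UC_ARCH_X86, UC_MODE_32
    from unicorn.x86_const import UC_X86_REG_EBP, UC_X86_REG_ESP

    # Hypothetical layout; any non-overlapping, page-aligned regions would do.
    code_base, code_size = 0x100000, 0x10000
    stack_base, stack_size = 0x200000, 0x10000

    mu = Uc(UC_ARCH_X86, UC_MODE_32)  # assumption: 32-bit x86 code
    mu.mem_map(code_base, code_size)
    mu.mem_map(stack_base, stack_size)
    mu.mem_write(code_base, shellcode)

    # Point ESP/EBP at the middle of the stack region so pushes and
    # frame-relative writes both stay inside mapped memory.
    stack_top = stack_base + stack_size // 2
    mu.reg_write(UC_X86_REG_ESP, stack_top)
    mu.reg_write(UC_X86_REG_EBP, stack_top)

    # Start at the first byte; stop when execution reaches the end address.
    mu.emu_start(code_base, code_base + len(shellcode))
    return bytes(mu.mem_read(stack_base, stack_size))
```

With Unicorn installed, `extract_strings(emulate_and_read_stack(code))` would chain the two steps. Code that calls APIs or touches unmapped memory will raise an error in Unicorn, which is one reason the OS-aware frameworks exist.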

strings. So that's the foundation for all these other frameworks. I'm going to touch on Qiling in particular, but let me briefly discuss the other frameworks that are out there. There are a few others beyond these, but these are the most popular. Mandiant's Speakeasy is a pretty well-known framework. Anyone used this before? This is a really nice one because not only is it a Python library, but it comes with a standalone tool. So let's say, as in the FBI scenario, you just have some shellcode. With the command-line tool that Speakeasy comes with, you can just provide it on the command line with a pointer to

your shellcode, and it'll do all the heavy lifting of emulating the execution and provide you with a trace of any APIs, any functions that have been called. That is pretty nice, but it only targets Windows, and for this presentation and this research I wanted to use something a bit more versatile. Again, a very good tool, check it out, but that's not what I wanted to go with for this research. dumpulator is another one. dumpulator was created by the author of x64dbg, who definitely knows what he's doing. It is Windows only, so that's why I didn't pursue it for this particular research. One benefit and drawback of dumpulator is that

instead of emulating APIs, like Windows APIs for example, it emulates syscalls. Syscalls are basically function calls that are closer to the kernel, and the reality is that multiple Windows API calls might lead to the same syscall. So by focusing on emulating syscalls instead of Windows API calls, the author hoped to decrease the number of functions that had to be implemented. That is a benefit. The drawback is that syscalls are a little bit harder to implement and there's very little documentation on them. Microsoft doesn't really want you learning about syscalls, so a lot of it's

third-party documentation. Not ideal. Another drawback is that in order to use dumpulator, you have to create a minidump for any executable that you want to emulate. A minidump is a subset of a crash dump file. You can create these minidumps with x64dbg, but it's a drawback, and I think it also prevents some level of scalability, because now for every file you want to analyze you're going to have to manually create a minidump. So I decided not to go with this one, but understand that this is out there and is a pretty cool capability as well. We're going to focus on Qiling. It's cross-platform, cross

architecture, is operating system aware, is file type aware, and focuses on API emulation. As an introduction to using one of these frameworks, I like it because it can do pretty much whatever the other ones can do, and it can also target other operating systems. So let's get our feet wet with Qiling here and look at a separate example: Emotet. I'm sure many of you have heard of Emotet before. There are many versions of Emotet, and it continues to be a very popular downloader. What I have here is an Emotet sample loaded into Ghidra, and similar to that first example I showed you, I'm focusing on a call instruction. So we've got a call

instruction to as_decode. Now, my name is Anuj Soni, those are my initials, and it's not a call-out to me; I renamed that function. In this case I wasn't lucky enough to get a binary that had not been stripped. Most malicious Windows executables are stripped, so I renamed this one to decode just to make it clear that I suspect this is a decoding function. As I did before, I look above it to take a look at the arguments, and I see yet another reference to DAT_, which is a pointer to some location in memory. Now if I actually dive into this function, you'll see code that looks like

this. This is just an excerpt of the code, but when I say that a function has a bunch of mathematical operations, this is really what I'm referring to. You'll see, for example, that there are XORs here, there's a shift right over here, there are more shift rights, and most importantly (I know it's probably hard to see on the projector) you'll see that there is a dotted line that extends all the way down here, starts down here, and points upwards. That's a visual cue that you're looking at a loop. So we've got all these mathematical instructions in a loop, which means they're going to happen over and over again. This is yet another

indication that you're looking at a function that probably does some encoding or decoding, because it's taking some content and repeatedly running the same operations over and over again. This function also had many references, just like that one in the ELF binary. This time it's 44 locations (just an excerpt here, of course), but these are all indications that you are probably looking at a function that does something valuable from a malware analysis perspective. My next step is generally to confirm that my suspicions are correct. Now this is Windows, and I like working with Windows, so it's pretty easy to pop open this executable, this DLL, in a debugger. My debugger of choice is x64dbg. Most

people who do Windows malware analysis will use x64dbg, and one way to confirm my suspicions about this function is to load it up in this debugger and set a breakpoint. A breakpoint tells the CPU to pause at a specific instruction during execution, which allows a malware analyst to look around in memory and understand what the heck is going on. That's how you build a case for what malware might do. What I did here is set a breakpoint on the very last instruction of this decoding function. The last instruction of most functions is going to be ret, for return,

right, go back to where you came from. So I set that breakpoint and then ran the program until it arrived at this breakpoint. RIP is another register; IP stands for instruction pointer, and this is just telling me, hey, we are paused right here. This gives me an opportunity to figure out the result of running this function. When you look at 64-bit code, which is what this is, and you want to understand the result or the return value of executing a function, you'll often look at a very specific register called RAX. On the bottom here I'm showing you the contents of RAX, and what you'll see is that the contents are this hexadecimal

value, which turns out to be an address, and that address points to this string right here, which is advapi32.dll. That's a DLL on Windows that contains functionality to interact with the file system and interact with the registry, so this is one that malware often uses. If I keep executing this program, I will keep arriving at this breakpoint because it's hit over and over again, and if I were to show you each value of RAX as I hit each one, you'll see that it keeps deobfuscating DLL names, then it deobfuscates RNG, then it deobfuscates what looks like some sort of format string, and there are many, many other strings. What this tells me is that

yes, this function does in fact do some sort of decoding, and I now want to get access to all of those decoded strings. Again, a lot of a malware analyst's time is spent decoding stuff, because that's oftentimes where the IOCs are, and the functionality that the author is trying to hide is generally going to be obfuscated. So my goal now is to figure out how to deobfuscate all the strings. I could continue running it like this and manually copy and paste each string, but remember, when you're running code you may not hit all potential function references, so I might just get a subset. If I want to comprehensively get all the calls to

this function and decode the strings, emulation provides the way forward. Now, before I create a script to emulate all possible strings, I'm going to emulate the deobfuscation of just one string. So I have to dig a little deeper here. If you take a look at this as_decode function that I renamed, you'll find that it takes just two arguments; you'll see two arguments referenced right here. And if I look at the decompiler output, which Ghidra has, I similarly see there are just two arguments, param_1 and param_2. This becomes important because when I emulate code I need to

make sure that I'm emulating very specifically the instructions necessary to successfully execute the function. So if I go back to a particular call to this as_decode function, I need to figure out which of these instructions I need to execute. I obviously need to execute this call instruction, but I told you earlier that arguments passed to a function are usually going to be set up somewhere higher up. This is 64-bit code and it takes two arguments, which are going to be passed via ECX, right here, and RDX. I'm just giving you that background; obviously we can't talk about all the foundations of RE here, but for 64-bit code, looking at this

particular sequence of instructions, ECX and RDX are going to contain the two arguments that are being passed to this function. What that means is that, since I need to choose my start point and my end point for emulation, I'm going to have to start at the earliest point I require, which is right here, and I need to end after this function is executed, so I'm going to end right here. So I've got my start right there and my end right there. The instruction at the address specified as the end won't actually be executed; once emulation gets to this address, it'll just stop. So this

should be the minimum group of instructions required to successfully decode a string located at this address. This is a pointer. This is the Python code required to perform that deobfuscation, and again I'm not going to get into each and every line here, because if you're not already falling asleep, that'll definitely put you to sleep. But the point I want to make is that we just have to initialize Qiling. This is standard boilerplate code that you can use and that I can provide to you to get Qiling going. Basically, you need to specify a path to your actual DLL, which is provided right here, and you

need to specify what's called the rootfs, the root file system. As part of installing Qiling, you need to specify a directory for Windows emulation, and you have to copy into there all the system DLLs and the registry that it requires. That's where all of that is located. But then you're just going to use ql.run to actually emulate the code, and this is the start address I chose and the end address I chose. Once the emulation is done, I just need to check RAX, since that's where the return value is often stored. So I'm using this method right here to read the contents of RAX, and I just extract the strings. If I were to do that for this

code right here and run it, what I get is what you see here on the bottom left. This is just a test to emulate the deobfuscation of one string, and you see I have the string RNG (maybe that's hard to see in the back), which was one of those strings that we saw in the debugger on my previous slide. So that means we've successfully emulated the decoding of one string. Now the idea is to apply this to all strings, and there are a couple of challenges to consider when doing that. One, you're going to need a disassembler API. Whether it's Binary Ninja or IDA or Ghidra, you're going to have to use its
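The single-string test described above (initialize Qiling with the DLL and a rootfs, run from a start to an end address, then read the string that RAX points to) could look something like the sketch below. The helper names `read_string` and `decode_one` are made up for illustration, and the register accessor `ql.arch.regs.rax` is what recent Qiling releases expose as far as I know; older releases used a different attribute, so treat that line as an assumption to check against your installed version.

```python
def read_string(read_mem, addr: int, max_len: int = 512) -> str:
    """Read a NUL-terminated ASCII or UTF-16LE string via read_mem(addr, size)."""
    raw = bytes(read_mem(addr, max_len))
    # Heuristic: a wide string has a zero byte right after its first character.
    # (A one-character ASCII string would be misclassified; fine for a sketch.)
    if len(raw) >= 2 and raw[0] != 0 and raw[1] == 0:
        end = 0
        while end + 1 < len(raw) and raw[end:end + 2] != b"\x00\x00":
            end += 2
        return raw[:end].decode("utf-16-le", errors="replace")
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")


def decode_one(dll_path: str, rootfs: str, start: int, end: int) -> str:
    """Emulate [start, end) inside the DLL and return the string RAX points to."""
    # Imported lazily so read_string can be used without Qiling installed.
    from qiling import Qiling

    ql = Qiling([dll_path], rootfs)   # rootfs must hold the Windows DLLs/registry
    ql.run(begin=start, end=end)      # the instruction at `end` is not executed
    # Assumption: reading max_len bytes from the returned pointer stays inside
    # mapped memory; a careful script would read in smaller chunks.
    return read_string(ql.mem.read, ql.arch.regs.rax)
```

Handling both narrow and wide strings in one helper reflects the Unicode issue that comes up when this is scaled to every call site.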

Python API to find references to the function of interest and make sure that, for each call to that function, you step backwards in addresses to get all of the instructions that you need. Remember, you need RCX and RDX, so you'd have to program that in. Another challenge here is that not all references to this function are actually code. There's a screenshot here showing you that one of the references to as_decode ends up being in the .pdata section, which is exception handling stuff. Bottom line, it's not code, so you need to consider some of these cases. Also, when I tried to

start emulating other strings, in some cases, like this one right here, which is a reference to a registry location often used for persistence, it didn't extract the string properly. This has to do with handling Unicode strings, so that's something else I had to incorporate into a larger script. Another issue I had to address: there was one case where, rather than passing that DAT_ location, which is a pointer to a string to deobfuscate, the reference was instead to RCX. The point here is that the function that was calling the decoding function had itself been passed, as an argument, the

pointer to the obfuscated string. It just made it a little bit more complicated, and I had to go one step further backwards in references to collect all the instructions necessary to execute this successfully. Now, if you're asking yourself right now, wow, this seems like a lot of work, why would I do this? This is certainly not something you would do for every sample you analyze. As I mentioned earlier, this helps tackle situations where the algorithm is sufficiently complex and custom that it would take less time to do this than to actually reimplement the algorithm, which can sometimes take quite a bit of resources. So I eventually did go on to create a script. This is

just an excerpt of that script. Like I've mentioned a few times, I can provide you with the full script. I use the Ghidra API for this, and I'm not going to walk through all the code here, but basically I hardcoded the address of the decoding function and then found references to that function. In every case, I noticed that whenever there was a reference to the function, if I stepped backwards about four instructions, that would be a good starting point, and then I simply had to execute all instructions up to and including the call to the decoding function. That worked just fine. Now, if you're also saying, hey, you hardcoded the address of the decoding function, that's pretty weak and
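The "step back about four instructions from each reference" logic can be sketched independently of any disassembler. This is not the Ghidra script from the talk: `emulation_windows`, its parameters, and the `is_call` callback are hypothetical names. In a Ghidra script the inputs would come from the program listing and the reference lookup; here they are plain Python values so the windowing logic is easy to follow.

```python
from bisect import bisect_left


def emulation_windows(instr_addrs, references, is_call, back=4):
    """Yield (start, end) emulation ranges, one per real call site.

    instr_addrs: addresses of all disassembled instructions
    references:  addresses that reference the decoding function
    is_call:     callable returning True if the address holds a call instruction
    back:        how many instructions to step backwards for argument setup
    """
    instr_addrs = sorted(instr_addrs)
    for ref in references:
        i = bisect_left(instr_addrs, ref)
        # Skip references that are not instructions at all (for example a
        # pointer stored in .pdata) or that are not call instructions.
        if i == len(instr_addrs) or instr_addrs[i] != ref or not is_call(ref):
            continue
        if i + 1 >= len(instr_addrs):
            continue  # no following instruction to use as the stop address
        start = instr_addrs[max(0, i - back)]
        end = instr_addrs[i + 1]  # emulation stops before executing this one
        yield start, end
```

A fixed step-back count is a heuristic. It ignores basic-block boundaries, and it will miss cases like the one mentioned earlier where the pointer arrives through the caller's own argument, so those need an extra hop back through references.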

unscalable, because even a similar piece of malware that has a similar function might have it located at a different address: well, this is where YARA can be helpful. While I'm not going to talk about it in this presentation, you can create a YARA rule for the decoding function of interest, use yara-python to identify that function, and then incorporate that into your emulation script so that it is more scalable to variants of this malware. What you see up here is a command line where I used Ghidra's Headless Analyzer, which allows me to run scripts against a file loaded into Ghidra without actually using the GUI. I'm not going to

talk about it, because maybe you use Ghidra, maybe you don't, and I don't want to focus so much on Ghidra in particular. But upon running the Python script called emotet decode strings here against this DLL, I was able to decode all of the strings, of which this is an excerpt, and you'll see references to some of the strings that we saw decoded earlier as well. So this is a good option when, again, the algorithm is sufficiently complicated and you're trying to scale this across multiple samples. A couple of other challenges I want to mention; I know we're only five minutes away from our end time here. I talked about XorDDoS before, and you already saw this output I created as a

result of using emulation to deobfuscate configuration information. Well, there were some challenges I hit along the way there as well. This is some code I used to initially test the emulation and deobfuscation of one string, and when I tried to run it, Unicorn gave me an error that said invalid memory write. I wasn't quite sure why this happened, but remember that when you're using any of these frameworks, it depends on an appropriate environment being set up in memory. Generally things are taken care of for you, and you don't have to worry about Unicorn when you're using a framework like Qiling, but this was an indication that something went wrong with regard to what was happening in memory. And what

I figured out, by doing some debugging in a Python interactive shell, was that EBP, which is another very important register during the execution of any function, was for some reason pre-populated with a value of zero, which wasn't going to work. This register should normally point to the stack, just like ESP. So when I finally discovered what the issue was, I incorporated into my code right here some code to populate the value of EBP, and I simply populated it with the value of ESP, which is common when reverse engineering an x86, 32-bit program in Windows. Okay, so just an example of an issue you might encounter, and then you
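The EBP workaround described above amounts to a one-line fix applied before emulation starts. A minimal sketch follows; `seed_frame_pointer` is a hypothetical helper name, and it accepts any object exposing `ebp` and `esp` attributes, which is how I understand Qiling's register manager to behave on 32-bit x86 (an assumption to verify on your version).

```python
def seed_frame_pointer(regs) -> int:
    """If EBP is zero, point it at the current stack pointer; return EBP.

    Code that starts mid-function often writes to [ebp - N]. With EBP left
    at zero those writes land in unmapped memory and the emulator reports an
    invalid memory write.
    """
    if regs.ebp == 0:
        regs.ebp = regs.esp
    return regs.ebp
```

In a Qiling script this would be called on the emulator's register object after the emulator is constructed and before `ql.run(...)` is invoked with the chosen start and end addresses.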

have to overcome it. Also keep in mind that once you hit one of these issues, or you implement a script that uses Qiling, you can generally reuse that code over and over again with small tweaks for other samples. So it seems like a heavy lift, and it certainly is the first time, but as you do it over and over again you'll see that it becomes a lot easier. And finally, I want to breeze through these last slides. I didn't necessarily anticipate getting through all of these, but I wanted to touch on them so that you can see how you can start to think about tackling other common reverse

engineering tasks. Another task besides deobfuscating strings is to deobfuscate next-stage executable payloads. I have a file here called bacon.exe, which is a stageless Cobalt Strike loader. You might have heard of Cobalt Strike before; it's a popular red team tool. What stageless means is that upon execution it's going to unravel in memory a second-stage executable binary that's called the Beacon. That's the ultimate payload of Cobalt Strike, which gives an attacker command and control over the victim machine, and that's essentially what this was. Now, if you look at the imports, or the dependencies, of this executable, you'll see a reference to VirtualProtect. I mentioned that earlier: VirtualProtect changes the

permissions of a section in memory, and if you read about VirtualProtect on microsoft.com, it takes four arguments. One of the most important arguments is the first one, which points to the region in memory whose permissions are about to be changed. As a reverse engineer, whenever I see VirtualProtect being called, and specifically if permissions are being updated to be executable, I'm going to want to pay attention. If you were to debug bacon.exe in a debugger and set a breakpoint on VirtualProtect (a breakpoint, again, tells the CPU to pause), you would eventually arrive there, and if you took a look at the first argument, which specifies the address in

memory where it's changing permissions, you would go to this RCX value right here, and in a debugger you can just dump this address, in other words, show me what is located at that location in memory. You would find there a 4D 5A. Anyone know what 4D 5A is? MZ, a Windows executable, right. So we've now found the underlying Windows executable in a debugger. If I want to automate this with Qiling, I can do so, because Qiling provides the ability to hook, or intercept, functions, and it does so using this method right here. So we are pretty much hitting the end of the talk here,
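The hooking idea can be sketched as follows: intercept VirtualProtect on entry, read the memory its first argument points to, and keep it if it starts with the MZ magic. The function names `looks_like_pe`, `make_vp_hook`, and `unpack` are invented for this sketch and are not the speaker's script. The parameter keys `lpAddress` and `dwSize` follow the Microsoft documentation for VirtualProtect, and I believe Qiling passes them to on-enter hooks under those names, but treat that as an assumption.

```python
def looks_like_pe(data: bytes) -> bool:
    """True if the buffer starts with the 'MZ' DOS header magic (4D 5A)."""
    return data[:2] == b"MZ"


def make_vp_hook(dumped: list):
    """Build an on-enter hook that records PE-looking memory passed to VirtualProtect."""
    def hook_vp(ql, address, params):
        base, size = params["lpAddress"], params["dwSize"]
        data = bytes(ql.mem.read(base, size))
        if looks_like_pe(data):
            dumped.append((base, data))
    return hook_vp


def unpack(exe_path: str, rootfs: str) -> list:
    """Emulate the loader and return (address, bytes) pairs captured by the hook."""
    # Imported lazily so the helpers above can be used without Qiling installed.
    from qiling import Qiling
    from qiling.const import QL_INTERCEPT

    dumped = []
    ql = Qiling([exe_path], rootfs)
    # Run our handler when VirtualProtect is entered, before the emulated API.
    ql.os.set_api("VirtualProtect", make_vp_hook(dumped), QL_INTERCEPT.ENTER)
    ql.run()
    return dumped
```

Each captured buffer could then be written to disk for further analysis. Note that the size passed to VirtualProtect may cover only part of the payload, so a more careful unpacker might read the whole containing allocation instead.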

so I'm just going to mention that these remaining slides talk about how you can use this set_api method to say, hey, whenever VirtualProtect is called, go ahead and execute a function that I define called hook VP. What I eventually do here is implement this hook VP function, which takes that address, the first argument to VirtualProtect, and writes out the contents of memory located at that location. This allows me to create an automatic unpacker for Cobalt Strike stageless binaries without worrying at all about the actual algorithm. I'm just focused on the fact that VirtualProtect points to the deobfuscated

next-stage payload in memory. I can use emulation combined with its hooking capabilities to dump that, and if I jump forward (there's some more content here), here at the end I run the entire script and I'm able to dump that Cobalt Strike Beacon to disk. All right, so that's basically it. Closing thoughts: emulation is a powerful option, again, when you are encountering complexity in an algorithm or you want to facilitate scalability. It works, but as you can see, it's definitely not easy. There are lots of frameworks to choose from. Qiling, I think, is very promising, especially given its multi-platform, multi-architecture nature. And just keep

in mind that although this is a heavy lift if you haven't done it before, you can usually reuse portions of each emulation script in future efforts. And finally, clearly unicorns aren't just for kids. All right, that is my contact info. Feel free to take a snapshot if you want to reach out to get all the materials, the scripts and the samples; happy to give those to you, and of course I'll give you the slides as well. Thanks for joining me here for these 50 minutes. If you have any questions, I'll be off on the side; feel free to chat. Thank

you. All right, thank you so much. Real quick announcement on merch: if you want T-shirts, badge components, stickers, or other miscellaneous sundry, we have a table set up in the corner of the chill-out area in the other room. Stop by there and pick up some good stuff. Thank you.
