Basic Malware Analysis for Incident Responders by Jared Graff

Name: Basic Malware Analysis for Incident Responders by Jared Graff
Uploaded: 2024-05-24
Duration: 32 min 33 s
Description: My presentation goes over some basic malware analysis techniques to help incident responders in their investigations. It covers some of the following topics: PEStudio, trid, pdf-parser, pdfid, strings, FLOSS

BSides Tampa | 32:33 | 119 views | Published 2024-05

Speakers

Jared Graff

Tags

Category: Technical

Difficulty: Intro

Style: Talk

Mentioned in this talk

Tools used

CyberChef, FLOSS, Ghidra, IDA Pro, pdf-parser, pdfid, PEStudio, Radare2, Snort, Splunk, strings, trid, UPX, YARA

Platforms

FLARE VM, REMnux

Vendors

CrowdStrike, Trellix

About this talk

My presentation goes over some basic malware analysis techniques to help incident responders in their investigations. It covers some of the following topics: PEStudio, trid, pdf-parser, pdfid, strings, FLOSS, the difference between strings in binaries and stack strings, and finding network indicators in binaries using Python. I will go over these tools and provide real-life examples of how to apply them to an investigation. My goal is to give everyone a basic understanding of how to safely use malware analysis for incident investigations.

Transcript [en]

[Music] Thanks for the introduction, Chris. My name is Jared, and I'll be giving you a presentation on basic malware analysis for incident responders. As I'm explaining things I'm going to have a lot of screenshots, so feel free to just raise your hand if you have any questions. I'll be standing back here pointing out some key points on the slide with the laser pointer. So here's the agenda for today. The goal of this presentation is to go over static malware analysis techniques, and then in response to that I'm going to go over some evasion techniques that malware authors will use, and then kind of the back half of it is

the response that you could use to some of those evasion techniques. We can skip this; Chris just gave a great introduction, so thanks, Chris. I did want to give a quick shout-out, though, because they did fly in from DC: my co-workers that I work with at Nightwing, Dave and Carot, back there. Carot is my supervisor and Dave is my co-worker. So the goals of this presentation are to give a basic understanding of malware analysis and to find some indicators of compromise, and then I'm also going to show some evasion techniques that malware authors could use to avoid some of these types of detection. And really, this isn't meant to be a detailed class

on reverse engineering; it's just kind of an intro into malware analysis. So I'm going to start off by going over a couple of virtual appliances, which are REMnux and FLARE. When you're performing malware analysis you need to have an environment in which you're doing it, and these are, I would say, the two most popular. They kind of serve the same purpose. The main difference between the two is that REMnux sits on a Linux distribution, where FLARE is a collection of tools that will sit on a Windows 10 or Windows 11 operating system. As for which one to choose, you're going to find that most malware analysts are going to use both, but in this

presentation you're going to see a lot of screenshots from my REMnux VM. I just wanted to touch on the installation of the two because they're slightly different. Just to point out, they are both free and open source, so with those links right there you can go access them and download them yourself in your home lab. The main difference is that REMnux comes prepackaged as an OVA file, where FLARE is a little bit different: it's a collection of scripts that was developed by Mandiant, and you need to download the Windows VM first and then install the FLARE Git repository on top of that VM. So before I get into the specific

techniques of malware analysis I just want to do a high-level overview of kind of what it is. When I think of malware analysis I think of three different segments of it: static analysis, dynamic analysis, and then code analysis. Static analysis is where you're going to analyze malware without executing the code itself, where dynamic is kind of the opposite: you're going to execute the code and then look at the various behaviors. This could be processes spawned by the malware, network connections it makes, really just a variety of things. And then the last one there is code analysis, so this is where you kind of get into reverse engineering a little bit more;

you're going to start looking at the low-level code, like assembly, and you're going to determine the specific functionality of the malware. Today is mostly focused on static analysis; however, I will have a couple of screenshots of a disassembler, Ghidra, so there will be a little bit of code analysis sprinkled in. So the goal, when we're talking about doing malware analysis for incident responders, is that we want to find some indicators of compromise. This could be a variety of things. One is suspicious strings. This could be anything as obvious as the word "evil" or the word "malware"; it could also be completely different things, like Windows API calls.

Another thing you could look for is domain names, so this could include evil.com or malware.trace. Then you could look for suspicious files, so this could be executables, DLLs, ELF files, things like that. Then the last one there is IP addresses, which are a main indicator of compromise. So now I'm going to get into the more technical part of the presentation. We're going to look at identifying the file. This is kind of your first step when you're doing malware analysis. It's really a very important step that sometimes people will forget, because you need to identify the file type in order to determine which tool you want to use, because certain tools will only work on certain file

types. In malware analysis there's really no one-tool-fits-all. You can see on the presentation there that one of the most common utilities to use is the file command in Linux. I've got a screenshot there running file against evil, which is the name of the executable that I'm analyzing in that screenshot. The output is very straightforward: it gives you the name of the file, which is evil, and then the file type. This one is an ELF 64-bit, which is basically just saying it's a Linux file. As you dive deeper into your reverse engineering journey you want to start paying attention to the
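As a rough illustration of what the file-identification step is doing, the sketch below checks the leading "magic" bytes of a file and, for ELF files, reads the class and machine fields from the header. It is not how the file command is implemented; the function name identify and the small lookup tables are assumptions made for this example.

```python
import struct

# Leading "magic" bytes for a few formats mentioned in the talk.
MAGIC = {
    b"\x7fELF": "ELF",
    b"MZ": "PE/DOS (Windows executable or DLL)",
    b"%PDF": "PDF",
}

# A few e_machine values from the ELF specification.
ELF_MACHINES = {0x03: "x86", 0x3E: "x86-64", 0x28: "ARM", 0xB7: "AArch64"}


def identify(data: bytes) -> str:
    """Return a rough description based on the first bytes of a file."""
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            break
    else:
        return "unknown"
    if name != "ELF" or len(data) < 20:
        return name
    # e_ident[4] is the class (1 = 32-bit, 2 = 64-bit);
    # e_ident[5] is the byte order (1 = little-endian, 2 = big-endian).
    bits = {1: "32-bit", 2: "64-bit"}.get(data[4], "unknown class")
    order = "<" if data[5] == 1 else ">"
    # e_machine is a 2-byte field at offset 18 of the ELF header.
    (machine,) = struct.unpack_from(order + "H", data, 18)
    return f"ELF {bits}, {ELF_MACHINES.get(machine, hex(machine))}"
```

Reading the first 64 bytes of a sample in binary mode and passing them to identify would be enough for these checks.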

architecture type, because this will matter once you start looking at it in the assembly code. The architecture type right here is x86, and this is one of the most common that you'll see. So a big thing I want to go over is the difference between strings and FLOSS. These tools kind of serve the same purpose but they have slightly different functionality. They're both used to extract character sequences such as IP addresses and domain names; I mean, you name it, you can find it in strings. FLOSS does the exact same thing, except it has a little bit more of an advanced capability: it can identify some evasion techniques like stack strings, which

I'm going to go over later on in this presentation. Another key difference is that FLOSS can only be used with Windows executables, where strings can be used with a variety of file types, whether this is Linux or Windows; it can actually even be used against PDFs. Just note there that both tools come by default on REMnux and FLARE. In the screenshot there I'm just using the which command in Linux and specifying the binary that I'm looking for, so you can see both of them exist in the /usr/bin directory. So I briefly mentioned stack strings, and now I'm going to go over them in a little more detail. This is a

technique that is used by malware authors to avoid detection, and they're using this technique to avoid detection by incident responders as well. The strings utility will not detect this, but FLOSS will; this is kind of one of the key differences between FLOSS and strings. I'm going to go to the next slide for the example. So here's how stack strings would be written in C code. I'm going to do my best to walk through this code just to kind of explain it. At the top here we're just importing the necessary C libraries, so we're doing an include of the standard input/output library and then the string library. The real main part of this code we need to look

at is the "modify the string" section. The name really speaks for itself: it's stacking the string's characters on top of each other, so you can see http://evil.com. Then at the end here, for those familiar with C, this is just calling the printf function with %s, which is just specifying a string, and then it's printing http://evil.com. This sample here is not actually malicious; it's just shown for the example. So moving on to a couple of screenshots of detecting stack strings: for what I just went over in the previous slide, these are the tools used against it. You can see I'm

running strings against evil.exe here, and then I'm using the grep command. For those that aren't familiar with grep, it works like a find function, where you're just looking for a string in a file. I'm using the -i option, which is just case insensitivity, so if evil.com was capitalized it would still detect it. You can see there are no results; it was able to evade the strings utility. If we look at figure two over here, this is the FLOSS example. I'm running FLOSS against the exe and running the grep command, the same thing I did in figure one, except FLOSS was actually
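To make the stack-string idea concrete, here is a toy simulation rather than a real binary. It assumes the compiler emitted one x86 "mov byte ptr [rbp+disp8], imm8" instruction per character (encoded as C6 45, a displacement byte, then the character), which is one common shape for this kind of code. The printable_runs function is a rough stand-in for strings, and naive_stack_string is a hypothetical recovery helper written for this example; FLOSS itself takes a more general approach based on analysing and emulating code.

```python
import re

secret = b"http://evil.com"

# Simulate compiled stack-string code: one instruction per character,
# encoded as C6 45 <disp8> <imm8>, so the characters are never adjacent.
blob = b"".join(bytes([0xC6, 0x45, 0xF0 + i, ch]) for i, ch in enumerate(secret))


def printable_runs(data: bytes, min_len: int = 4) -> list[bytes]:
    """Rough stand-in for strings: runs of printable ASCII of at least min_len."""
    return re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)


def naive_stack_string(data: bytes) -> bytes:
    """Toy recovery: collect the immediate byte of each matching mov."""
    return b"".join(re.findall(rb"\xc6\x45.(.)", data, flags=re.DOTALL))


print(printable_runs(blob))       # no run is long enough, so nothing is reported
print(naive_stack_string(blob))   # the characters, reassembled in order
```

Because every character is separated by instruction bytes, a scan for contiguous printable text never sees the URL, which is the behaviour shown in the strings screenshot.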

able to detect it. You can see it extracted the stack string and found http://evil.com. It did pick up a random character "I" here; this is common, and it really just depends on how the code is compiled, but the important part is that it was able to detect that malicious domain. So next I'm going to move into using regular expressions to find IOCs. A regular expression is an expression used to match various character patterns such as IP addresses and domains; I mean, you name it, you could use a regular expression to find it. This can be used with the grep command, which I went over a little bit on the previous slide,

or it can be automated into a script. I would say most people are going to choose to automate this just to save time. An example right there of a regular expression for an IP address is used with grep, and it's important to note that you do need this -oE. If you don't include that capital E right there, grep falls back to basic regular expressions and will read parts of this string literally instead of as the extended regular expression it's meant to be. That's just an example; no one's going to memorize these things, which is why on the next slide I included some resources. One I like is RegExr, and this is just a tool to help you create the regular expressions you would

need. Then another one, which is actually a free training that you can go to online, is regexone.com. So everything that I just went over, using strings, FLOSS, and then regular expressions, we're going to try to automate to save time for future investigations. I already mentioned why we would automate it: it saves time. You can use any scripting language of your choice, so you can use Python, Bash, or PowerShell. I'm most comfortable with Python, so that's the example I'm going to go over in the next slide. So here's a script to show everything that we previously went over, all automated into one script here.
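The slide itself is not reproduced in this transcript, so the following is only a sketch of a script along the lines the speaker goes on to describe. The names extract_ips_and_domains_from_file, file_path, ip_pattern and domain_pattern come from the walkthrough; the regular expressions, the find_indicators helper, and the use of subprocess.check_output are assumptions made for this example.

```python
import re
import subprocess

# Illustrative patterns (assumptions, not the ones shown on the slide).
ip_pattern = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
domain_pattern = re.compile(
    r"\b(?:[a-z0-9-]+\.)+(?:com|net|org|info|io|ru|cn)\b", re.IGNORECASE
)


def find_indicators(text):
    """Return (ips, domains) found in a block of text, de-duplicated and sorted."""
    ips = sorted(set(ip_pattern.findall(text)))
    domains = sorted({d.lower() for d in domain_pattern.findall(text)})
    return ips, domains


def extract_ips_and_domains_from_file(file_path):
    """Run strings against file_path and search its output for indicators."""
    try:
        output = subprocess.check_output(
            ["strings", file_path], text=True, errors="replace"
        )
    except (OSError, subprocess.CalledProcessError) as exc:
        # Error handling: strings is missing or could not read the file.
        print(f"Could not run strings on {file_path}: {exc}")
        return None
    return find_indicators(output)
```

In the spirit of the talk, file_path could be collected with input() and the two lists printed with an f-string. Keeping the pattern matching in its own function also makes it possible to feed it FLOSS output instead of strings output.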

I'm going to do my best to walk through this. This isn't meant to be anything super detailed on Python, but I'll walk through the main components of it. You can see at the top we're importing our libraries up here, and then we're defining our function, extract_ips_and_domains_from_file, and specifying our argument here, which is file_path. You can see later on in the script I do define that down here with the input function in Python, and all this is saying is that the user can type in what they want to use for the file path. You can do this differently; you can put in the exact path. This is just how I chose

to do it for this example. Moving back up here to this output = subprocess.check... line, this is just Python's way of saying that we're going to be running the strings command against the file_path variable. This except and return is just error handling built into the script. Then moving down to the ip_pattern and domain_pattern here, these are the regular expressions that I just went over, just put in the script and labeled as variables. And then at the end here we're just running the IP pattern regular expression and the domain regular expression against our output, which we saw up here was defined as the strings

output. Then at the end we're using print with an f-string to print them to the screen. So that's all those steps I went over at the beginning of the presentation, all automated into one script. So next I'm going to move into using YARA for signature... oh, yep? So this one specifically is written in Python, and it is sitting on the REMnux virtual machine. This script isn't built into REMnux; I had to make it myself, but you can use it on Linux or Windows. So next I'm going to move into using YARA for signature detection. YARA is a pretty cool tool that you can use to identify specific types of

malware. The key component is that it does rely on a rules file. To use YARA you can make the rules yourself, but most people are going to choose to download them from their favorite GitHub repository. For those that are familiar with Snort, it works similar to that, except Snort is run against packet captures where YARA is run against files. So I'm going to break down the basic components of YARA rules. There are two components: strings and conditions. Strings are used to identify what we want to define in the rule. Looking at the example here, under strings right here we're labeling this $a string as

"evil". This is a very simple rule; they can get really complex, but I just wanted to use something simple as an example. The next component of the rule is the condition, and this is the requirement that must be met to trigger the rule. You can see this rule is very simple: we're just looking for the $a string, which we defined earlier as the string "evil". Something I forgot to mention: at the top, after the rule keyword, is evil_detection. This is just where we're naming our rule. So next I'll move into running YARA. Once you have it installed, you will need to install a YARA rules file on

your REMnux VM; it does not come by default, but the YARA utility does. It's really simple to run: you're just going to use the yara command and then specify your rules file, so my rules file on the screen there is rule.yar, and then you're going to run it against the executable that you're analyzing. We're still looking at the evil executable here. Once we run it you're going to get an output that's also very straightforward to read: if something does trigger, you're going to get the rule name, evil_detection, and then the file that you ran the rule against. So now that I've talked about
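The rule and the command line described in this part of the talk can be sketched as follows. The rule name evil_detection, the "evil" string, the rule.yar file name and the yara invocation come from the talk; the $a identifier is my reading of the spoken "a string", and the scan_with_yara wrapper and its temporary directory are assumptions added so the example is self-contained.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# The rule as described in the talk: a name, one string, one condition.
RULE = '''rule evil_detection
{
    strings:
        $a = "evil"
    condition:
        $a
}
'''


def scan_with_yara(target: Path):
    """Write RULE to rule.yar and run `yara rule.yar <target>`.

    Returns yara's standard output, or None when the yara binary is not installed.
    """
    if shutil.which("yara") is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        rules_file = Path(tmp) / "rule.yar"
        rules_file.write_text(RULE)
        result = subprocess.run(
            ["yara", str(rules_file), str(target)],
            capture_output=True, text=True, check=False,
        )
    return result.stdout
```

When the rule triggers, each line of output is expected to contain the rule name followed by the path of the matching file, as the speaker describes.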

some common static analysis techniques, I kind of want to move into the portion on what malware authors would use to evade some of these techniques. The first one I'm going to go over is shellcode. Shellcode is used to spawn a shell on the victim's system, and it can contain IPs and domains; most of the time it's going to contain things like the C2s that it's reaching back out to. Figure one is just an example of what shellcode looks like. This is going to look like garbage to most people, and it looks like garbage to strings and FLOSS as well, because they

cannot extract any IPs or domains out of this. Another technique that malware authors will use is hardcoding IPs. Malware authors can choose a variety of ways to do this; one way is by putting the IP in hexadecimal format. The easiest way I've found to detect this is by putting the executable you're analyzing in a disassembler. I chose Ghidra for this presentation; you can use IDA, Cutter, or whatever your preference is. So I wanted to touch real quick on what hardcoding IPs will look like in code. I'll walk through this here: we're including the standard input/output library, similar to what we did before, and then we're introducing the

main function, and this is just specifying its return type as an integer. Then we're declaring an unsigned char array of four elements and assigning it four values here, and these are actually in hexadecimal, so this 0xFF is 255 and these 0x02s are the 2.2.2. At the end here we're using the printf function to print the IP address to the screen with %d, so this will display it in decimal format. Another note is that this file is not actually malicious; this is just to show what a malware author would use. So here is the output of that file when you run it, which I named

hardcoded. It's going to output the decimal IP address to the screen, 255.2.2.2. Then at the bottom here, in figure two, I'm again running strings against the hardcoded executable. I'm doing the same grep command that I did on previous slides and looking for the 255 address right here, and you can see in the screenshot it was not able to find it. That's because it is hardcoded there in hexadecimal format. I don't have a screenshot, but I did run this against FLOSS as well, and FLOSS is not able to detect this either. But there is a way to detect it, and that is using a disassembler, and the one I used here is

Ghidra. This is a tool used to view executables in assembly language, and it also has a decompile option to view the code in what's called pseudo-C, which is how Ghidra interprets the code itself. This screenshot is of the decompile view of Ghidra. You can see, highlighted in the box here, that it's able to identify the IP address, with really the exact same code that was used when it was originally written. But it does identify it in hex format, so the analyst will then need to go back and convert that manually. Still, that is a pretty good indicator right there. Sometimes the decompile option

doesn't work in Ghidra. If that happens, you'll have to look at this in the disassembled view. So here's a screenshot of that hardcoded executable and how it picks it up here. It does find it, which is highlighted in this red box right here. It looks a little different, though, and the reason is that it's in little-endian format, which is least significant byte first. Without getting into too much detail, just think of that as the reverse of the normal order that you're used to looking at, so it's going to have the twos up front and then the Fs on the back end here. This isn't really an
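The manual conversion mentioned for the decompiled and disassembled views can be sketched in a few lines. The byte values come from the talk's example (0xFF, 0x02, 0x02, 0x02); the constant 0x020202FF is an assumption about how those four bytes would appear if the disassembler shows them as a single little-endian 32-bit value, and both function names are made up for this example.

```python
import socket
import struct


def bytes_to_ip(octets: bytes) -> str:
    """Four raw bytes, in the order they sit in memory, to dotted decimal."""
    return socket.inet_ntoa(octets)


def dword_to_ip(value: int) -> str:
    """A 32-bit constant as a disassembler might show it, when the four
    bytes were stored together as one little-endian dword."""
    return socket.inet_ntoa(struct.pack("<I", value))


# The array from the talk: {0xFF, 0x02, 0x02, 0x02}
print(bytes_to_ip(bytes([0xFF, 0x02, 0x02, 0x02])))  # 255.2.2.2
# The same bytes read as one little-endian dword look reversed: 0x020202FF
print(dword_to_ip(0x020202FF))  # 255.2.2.2
```

Packing with "<I" writes the least significant byte first, which is why the twos appear up front and the FF at the back in the constant.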

ideal way to find an IOC when you want to do something quick for incident response, but sometimes that's the only way you have. Another way to evade string detection is that a malware author can choose to pack the malware. When I say pack malware, this is basically saying the malware author is going to compress, encrypt, or modify the original code of the program. A common packer there is the UPX packer, which is free and open source and comes by default on many Linux distributions. There are also a lot of custom packers that malware authors will use, and you could do a whole presentation, really a whole college

class on packers, because they can get really difficult to unpack. Before we unpack the malware we first need to identify that it is packed, so how are we going to do this? One of my favorite tools is Detect It Easy. These are both screenshots here of the exact same file: one is of the packed malware and one is of it not packed. On the left, in figure one, is the not-packed version, and you can see real quickly that it's able to identify that it's not packed. Another thing you can look for is this score here, this 5.9. It's not a written rule, but generally if the entropy is above

seven, and this number is the entropy, it will display it as packed. If you move to figure two you can see the packed version, and it quickly identifies that it's packed and gives it a score of just above seven here. Another thing to highlight on the slide, in this rectangle here, is that it's actually able to identify the packer that was used, so it was able to identify UPX. Detect It Easy cannot identify every packer, though. If you go to Detect It Easy's Git repository it has a list, I think it's like 10 or 12 packers, that it can identify, but if a malware
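The entropy score discussed here can be approximated with a short calculation. This sketch computes Shannon entropy in bits per byte, which is assumed to be comparable to the figure Detect It Easy displays; the threshold of 7 is the speaker's rule of thumb, and the function names are made up for this example.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte.

    0.0 means a single repeated byte; 8.0 means every byte value is
    equally likely, which is what compressed or encrypted data approaches.
    """
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def looks_packed(data: bytes, threshold: float = 7.0) -> bool:
    """Rule of thumb from the talk: entropy above about 7 suggests packing."""
    return shannon_entropy(data) > threshold
```

A high score only says the bytes look compressed or encrypted, so a legitimately compressed file can cross the threshold as well.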

author chooses to use a custom packer it will likely miss it: it will identify that it's packed, but it won't display a packer here. So now that we've identified that it is packed with UPX, how do we unpack it? When it's UPX it's very simple. REMnux has a built-in utility, the upx command, and you just need to run it with the -d option here, so upx -d and then the .exe, and that will unpack the file. If this doesn't work, it will spit an error to the screen, basically saying that the correct packer wasn't used. Just one thing I want to point out is that if you

don't use this -d option, then instead of unpacking it, it's basically packing it again, so you do need that -d option. So now that we have some techniques to find various types of malware and we've discovered some IOCs, how do we use them as incident responders? It really is pretty straightforward. Once you identify some of these IOCs, you want to query them against the security tool set that you already have. Whether you're using a SIEM like Splunk or Kibana, since every place has different tools, you want to query against that tool. If you're using a network intrusion detection system like Snort, you would want to start implementing rules or running Snort

again with those IOCs. And if you have EDRs like CrowdStrike or Trellix, query them in that tool set as well. After you run your queries for indicators of compromise, you then need to verify and react accordingly. What I mean by this is that if IOCs are found in your environment, you need to confirm whether they're true positives or false positives. If they're actually malicious, you need to follow the necessary incident response procedures that you would normally follow in your environment, and if they're false positives, you want to tune them out and then document them so an analyst doesn't waste time on them later. So this is just a little summary

of what I went over today: basic static analysis techniques, then some evasion techniques that malware authors will use in response to that, and then I got into some techniques to respond to those evasion techniques. And that is it. I'll leave it open for questions and comments. Yep, go

ahead. So if you're going to find Base64-encoded stuff, you could still run strings against it, because it will find the encoded text. You'd then want to pull out those encoded strings and use a tool like CyberChef to decode them. Hey, we've got some great questions that you guys are going to ask, so please, let's keep it down. If you're going to exit, go ahead, but do it quietly, please. Thanks, Chris. CyberChef: so once you pull out that Base64 you could then use a tool like CyberChef. There are also tools built into REMnux that you can use to decode

Base64. Okay, yeah, no problem. Yeah, so I'm coming at this from a developer perspective. You mentioned incident responders and malware analysts; are those two separate groups, and what's the difference? Yeah, I would consider them two separate groups. Malware analysts are generally going to focus on just the malicious files, where incident responders can be focused on a whole environment, so looking at logs, PCAPs, packet captures, things like that. So incident response is a little more broad and malware analysis is a little more focused. When they're asking a question, can you repeat it? Oh, his question was: what is the difference between malware analysts and incident responders? And the first question he asked was: if you find an

encoding like Base64, what would you do if you encountered something like that? Got it.
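The answer to the Base64 question (pull the encoded strings out of the strings output, then decode them) can be sketched in Python as an alternative to pasting candidates into CyberChef. The minimum run length of 16 characters and the "keep only printable ASCII" filter are assumptions chosen to cut down on noise, and the function name is made up for this example.

```python
import base64
import binascii
import re

# Runs of at least 16 Base64 alphabet characters, with optional padding.
B64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def decode_base64_candidates(text: str) -> list[tuple[str, str]]:
    """Return (encoded, decoded) pairs for runs that decode to printable ASCII."""
    results = []
    for candidate in B64_RUN.findall(text):
        try:
            raw = base64.b64decode(candidate, validate=True)
        except (binascii.Error, ValueError):
            continue  # wrong length or padding: not valid Base64
        if raw and all(32 <= b < 127 for b in raw):
            results.append((candidate, raw.decode("ascii")))
    return results
```

The text argument would typically be the output of strings for the sample. Anything that decodes to binary rather than readable text is dropped here, so payloads that are compressed or encrypted before encoding would still need a tool like CyberChef.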

Yep. So generally when I first encounter a file I will, one, figure out what type of file it is with that file command, and then usually the first thing I run against it is strings, just because strings will catch everything. When you see a packed executable with strings it will look like a bunch of garbage, and the more you encounter it the more comfortable you'll get and know what to pivot to next, if that makes sense. Any more questions? Yep. I actually haven't encountered that. personal story on how it I could answer that question for you so you know the resume is going to get you that

interview the networking on LinkedIn as he just went over he going to get you that interview the interview is what's going to get you your offer so what you know we kind of did was made something that would kind of be representative of a few different scenarios uh because what Coca-Cola is looking for is some giant mega company is different than a startup environment right startup environment might be you might be wearing multiple hats yeah so to cave on person and SEC expert a lot of these files that you're going to see with stack strings they will have already gotten caught by like an EDR TOA responsibilities so the questions are going to change depending on the

environment you're interviewing in and as you're doing the preparation uh the young lady that was in the middle had mentioned you know preparation goes a mile uh look at the company but kind of take the cues when you get your interview invite if it has uh hey here's some tips on interview so the question was asked about pack files what's it basing off it's really just analyzing kind of the compression the was for example it will also identify that asack so that would be a false positive y

Yep. I just want to understand your question correctly; can you repeat

it? [Audience question, partly inaudible: how do you ... not a false ...] Oh, I see what you're asking. So you can use a variety of tools. If a stack string is identified as, like, an IP address, you can fall back on open source information, like VirusTotal. You may have other intel sources depending on the type of environment you're working in. You also want to look at the events surrounding that IOC, so if you're seeing something that looks suspicious within a two- or three-minute time frame of that indicator, you want to start digging down the rabbit hole a little bit further. Yep? Yep, so I was looking, again coming

from theop and reading the output of it looked like it was emulating the function that emulating functions that guess interpret

of that is the potential toate aun that but then also way that when emulates it it breaks. That's a really good question. So the question was: I showed you a basic stack string; if I'm understanding you correctly, is there a way that a type of stack string can break FLOSS? Right, or just, I mean, obviously you're doing this in a virtual machine, where you run FLOSS on the function and it actually breaks out of FLOSS, breaks into whatever [Music] But yeah, the short answer to your question is yes. There are stack strings that can also evade FLOSS, so the

more complicated it gets, it can be missed by FLOSS as well, and specifically malware that can... Yep, good question, thank you. Any more questions? Not seeing any, so thanks, everyone, for your time. I appreciate it. [Music]
