A Unique Approach to Threat Analysis Mapping

Name: A Unique Approach to Threat Analysis Mapping
Uploaded: 2016-06-12
Duration: 40 min 2 s
Description: A Unique Approach to Threat Analysis Mapping: A Malware-Centric Methodology to Better Understand the Adversary Landscape Kyle O'Meara, Deana Shick Abstract: Malware family analysis is a constant process of identifying exemplars of malicious software, recognizing changes in the code, and producing

BSides Peru · 40:02 · 935 views · Published 2016-06

About this talk

A Unique Approach to Threat Analysis Mapping: A Malware-Centric Methodology to Better Understand the Adversary Landscape

Kyle O'Meara, Deana Shick

Abstract: Malware family analysis is a constant process of identifying exemplars of malicious software, recognizing changes in the code, and producing groups of “families” used by incident responders, network operators, and cyber threat analysts. With adversaries constantly changing network infrastructure, it is easy to lose sight of the tools consistently being used and updated by these various actors. Beginning with malware family analysis, this methodology seeks to map vulnerabilities, exploits, additional malware, network infrastructure, and adversaries using Open Source Intelligence (OSINT) and public data feeds for the network defense and intelligence communities. The results provide an expanded picture of an adversary's profile rather than an incomplete story. The goal of this document is to shift the mindset of many researchers to begin with the tools used by adversaries rather than with network or incident data alone, for an “outside-in” approach to threat analysis instead of an “inside-out” method. We chose three malware families to use as case studies: Smallcase, Derusbi, and Sakula. The results of each case study (any additional network indicators, malware, exploits, vulnerabilities, and overall understanding of an intrusion) tied to the malware families should be utilized by network defenders and intelligence circles to aid in decision making and analysis.

Bio: Kyle O'Meara is a Senior Member of the Technical Staff at the Software Engineering Institute's CERT Coordination Center (CERT/CC). Kyle works on the Applied Threat Analysis team at the CERT/CC, where he researches and analyzes current and emerging threats to national security with a focus on exploits and malware. Most recently Kyle was with FireEye, where he was the lead senior threat analyst for the active cyber defensive program called SHARKSEER. Prior to FireEye, he was with the National Security Agency (NSA) for roughly five (5) years. At NSA he held several positions: cyber-cryptanalyst, media exploitation analyst during a six (6) month deployment to Iraq, and communication signal analyst. Kyle received his MS from Carnegie Mellon University in Information Security Policy and Management. Kyle has also presented at major information security conferences, including DEF CON and FIRST Technical Colloquium.

Deana Shick is a Member of the Technical Staff at the Software Engineering Institute's CERT Coordination Center (CERT/CC). Deana works on the Applied Threat Analysis team at the CERT/CC, where she researches and analyzes current and emerging threats to national security. Prior to working at CERT/CC, Deana was an International Trade Specialist focusing on EAR and ITAR regulatory processes. She received her B.A. from Duquesne University in International Relations with a Security Studies concentration. In 2014, she completed her M.S. in Information Security Policy and Management from Carnegie Mellon University. Deana has presented at major information security conferences, including FloCon and FIRST Technical Colloquium.

Transcript [en]

All right, ladies and gentlemen, we are going to get started again. This is talk number two, A Unique Approach to Threat Analysis Mapping, and our presenters are on the threat intel team with CERT, right up over the hill here. I introduce to you Deana Shick and Kyle O'Meara. They're going to do a full introduction of themselves. Hello. Oh, I will make sure to avoid that. How's everybody doing? Enjoying BSides so far?

Excellent. Okay, all right. So we are here today because we are presenting on a paper that Kyle and I wrote and published about a month ago. Everything we are going to talk about today is available to you, because our goal in this was to create something that was actionable, so other people can take it with them and use it. So I just want to ask a question to everybody in the audience: how many of you guys reverse engineer malware? Can I get a show of hands? Okay. How many of you guys do incident response or are part of a SOC? Okay. How many of you guys do both?

Anybody? One person, excellent. Two people. All right, and I'll get back to that in a minute. So my name is Deana Shick and I work at the Software Engineering Institute's CERT Coordination Center. I am on the threat intelligence team with Kyle O'Meara, and I've been working at CERT now for about three years. Prior to this I did export compliance and ITAR regulations stuff, before I got my master's at Carnegie Mellon, and I've been there ever since. Like Deana said, I'm Kyle. I've been at CERT now for a little bit over a year, and before that I was with FireEye. If there are any govies in here, I worked in the SHARKSEER program, basically ran

that, and then before that I was at NSA full-time. But it's exciting to be here at CERT. I'm back in Pittsburgh; I went to school in this area, so it's good to be back. So we are on the threat intelligence team, and our overall directorate consists of some malware reverse engineers, some network engineers, and then the threat intelligence team that puts it all together. We recently merged with our vulnerability team, so Dan, who just spoke, and a couple of our vulnerability researchers are out there in the crowd. So now we have really interesting access to different types of data that we didn't necessarily have before, and

that's kind of how this all started. We do produce a whole lot of tools. Our malware team produces reverse engineering tools and other static analysis tools, and these are available on our GitHub site. You guys can use them, and you can always ask us any questions. We might not be able to answer them, but we can certainly field them to the right people. That's up there because some of the tools we use to do our analysis are available out there. Okay, so I asked you guys a question about what you do. We have malware people that do malware analysis, people that do incident response, and then you

have people that are doing network analysis, but rarely do people put them all together. In the industry right now you can go out and read CrowdStrike reports or FireEye reports or what have you, and typically these take an inside-out approach to understanding the adversary. What that means is: I have a network, that network has been hosed by a bad guy, and I start my analysis with that data. Our team does not do incident response, but we have a really good idea of what type of malware is out there, so we wanted to flip this on its head a little bit and start with the tools that

adversaries are using, instead of more fragile indicators of compromise like domain names, IP addresses, and so forth. We do have a malware artifact catalog that has about 200 million pieces of malware in it, and we've been looking at trends across that now for the last 10 to 12 years, minimum. We also have what's called our known process: our reverse engineers will get an exemplar of malware, they will reverse engineer it, and they will find all of the related pieces of malware, and we call that a family. So our analysis is really going to start with how we understand a malware family, and what we have is a configuration dumper and a bunch of YARA

signatures. The configuration dumper will then give us all of the other indicators of compromise, so we can put together a more complete picture of an intrusion instead of just one piece of it. The other challenge we have in the industry is that different security vendors like to name things differently, so you have people calling different groups A, B, C, D, E, F, and there's nobody putting them together, nor is anybody really confirming the findings. There's a lot of circular reporting, and frankly we just got kind of annoyed with that and decided that we were going to tackle this, at least in terms of malware. And all of this

spurred this paper, and the goal of it is to put all of the different types of indicators together to tell a more complete picture, using an outside-in methodology instead of an inside-out methodology.
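As a rough illustration of starting from the malware family, the sketch below folds the output of a configuration dumper for several samples into one family-level set of indicators. The record layout (`md5`, `domains`, `ips`), the class name `FamilyProfile`, and all sample values are assumptions made for this example; the talk does not describe the dumper's actual output format.

```python
from dataclasses import dataclass, field


@dataclass
class FamilyProfile:
    """Indicators aggregated for one malware family (hypothetical schema)."""
    name: str
    md5s: set = field(default_factory=set)
    domains: set = field(default_factory=set)
    ips: set = field(default_factory=set)

    def add_config(self, config):
        """Fold one sample's dumped configuration into the family profile."""
        self.md5s.add(config["md5"].lower())
        # Normalize case and a trailing dot so the same domain is counted once.
        self.domains.update(d.lower().rstrip(".") for d in config.get("domains", []))
        self.ips.update(config.get("ips", []))


def build_profile(name, configs):
    """Build a family profile from an iterable of per-sample config dicts."""
    profile = FamilyProfile(name)
    for config in configs:
        profile.add_config(config)
    return profile


# Made-up dumper output for two samples; every value is a placeholder.
SAMPLE_CONFIGS = [
    {"md5": "0" * 32, "domains": ["C2.example.com."], "ips": ["192.0.2.10"]},
    {"md5": "1" * 32, "domains": ["c2.example.com", "update.example.net"], "ips": []},
]

profile = build_profile("ExampleFamily", SAMPLE_CONFIGS)
print(len(profile.md5s), sorted(profile.domains), sorted(profile.ips))
```

Keeping the indicators as sets keyed on the family, rather than on an incident, is what lets later steps pivot from the same starting point into other data sets.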

So I guess, kind of talking about what that really means: our goal was to utilize the data that we have in house, so our vulnerability stuff, an exploit catalog, a malware catalog, and a whole bunch of network stuff, to help net defenders and those in the intelligence community. Everything that we've produced in this report is public to you, but we also want to teach you how to fish to some degree, so you can take anything that you might find and reapply it back into your network. Let's see, I think I kind of went over a lot of this. We do utilize indicator expansion, and I just want to explain that really quickly. That is

the process of going from IP address to domain name, or domain name back to IP address, or anything along those lines. Any of those steps that you take, we call indicator expansion. So, the data that we used, or tried to use: we certainly scoured the internet for all of the case studies that we were going to talk about. We're going to talk about one. First we used our malware family analysis, of course, and then the second biggest piece of data that we used was Farsight's passive DNS database. Has anybody used that here? Anybody use passive DNS? Okay, Evan. Excellent, there's one. And we attempted to use

blacklists. Again, there are a lot of security vendors that produce a whole bunch of blacklists, and we intersected all of our IOCs for the malware families that we looked at, and we found that there was little to no overlap with any of the blacklists. That says to me that unless you have that blacklist at a particular time, you might have some coverage if you're implementing them, but you probably won't when it comes to adversary tooling, and that was just not helpful at all. We also use the MITRE CVE database to identify those, and CVE Details as well, Exploit-DB obviously, and then other open source information such as

Twitter, blogs, and vendor articles. And if you guys haven't checked out the CIRCL.lu MISP, you know, Malware Information Sharing Platform, it's pretty useful. Lots of information there. You can get an account if you're a CERT or a vendor, and you don't really have to contribute, but you can contribute, and there's a lot of good information in there. Some of the tools we use: obviously we use YARA. One of the tools that was developed in house by our developers is function YARA; I'll talk a little bit more about that later when I talk about the code comparison section. SiLK, which, does anybody not know what SiLK is? It was developed

in-house years ago. I'll let Deana talk; she knows more about it. Does anybody here not know what SiLK is? Okay, so SiLK is a network monitoring tool for very, very large networks, and I think George could probably explain it better than I can. Yeah, you can find it online. It's downloadable; just search for SiLK and you can download it and use it. So instead of keeping a whole bunch of pcap, you would use SiLK to analyze your network flow. We use SiLK in a little bit of a different way, because you can make what's called prefix maps, so that would map an IP address to an AS, for example, like an ASN.
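The idea of a prefix map, looking up which labeled network block an address falls in, can be sketched in plain Python. This is only a stand-in for the concept and not SiLK's prefix map format or tooling; the CIDR blocks are documentation ranges, the AS numbers come from the range reserved for documentation, and the names `PREFIX_MAP`, `build_prefix_map`, and `lookup_asn` are hypothetical.

```python
import ipaddress

# Hypothetical prefix map: CIDR block -> autonomous system number.
# Documentation address ranges and documentation ASNs, not real routing data.
PREFIX_MAP = {
    "192.0.2.0/24": 64496,
    "198.51.100.0/24": 64497,
    "198.51.100.128/25": 64498,
}


def build_prefix_map(raw):
    """Parse the CIDR strings once so lookups only do containment checks."""
    return [(ipaddress.ip_network(cidr), asn) for cidr, asn in raw.items()]


def lookup_asn(address, prefix_map):
    """Return the ASN of the most specific prefix containing address, or None."""
    ip = ipaddress.ip_address(address)
    best = None
    for network, asn in prefix_map:
        if ip in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, asn)
    return best[1] if best else None


prefix_map = build_prefix_map(PREFIX_MAP)
for candidate in ["198.51.100.200", "198.51.100.5", "203.0.113.9"]:
    print(candidate, lookup_asn(candidate, prefix_map))
```

The linear scan is fine for a handful of prefixes; a map covering full routing tables would normally use a trie or a similar longest-prefix structure instead.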

And you can do really, really fast analysis. It also builds IP sets, which is helpful if a client or sponsor gives you a bunch of net blocks and you have to build them yourself, and then just some scripts for automation purposes. Yeah. Okay, so we're going to go over our methodology a little bit. In our paper we went through three different case studies of actor tools: the first was the Smallcase malware, the second was Derusbi, and the third was Sakula. Like I said, a lot of reporting that you'll read in OSINT will start with the incident data, and you'll work your way out to try and understand how an intrusion works. So we

are starting with our known, and in particular with the data that comes from our configuration dumpers. This gives us an MD5 of all of the files, IP addresses, domain names, any strings, ports, remote files, et cetera. After starting here, you then map out to all of the other data sets. You can find additional C2 infrastructure by using Farsight's passive DNS database. You can search OSINT for incident data, which is helpful at least so it can tell you that a piece of malware has been involved in an incident, and then you can pivot to the vulnerability that was used in the exploit. These are in no

particular order. The only thing that we chose to start with was our known, but everything else is more circular, so you can use one of the data sets to inform the others. The one circle that we left out of here was going from exploit to exploit kit. We did not look at it for these case studies, but it's certainly a future work area. So does that make sense to everybody? Are we all on board here? Excellent. So what were some of our results? Again, this is an example of Derusbi, starting with the configuration dumper,

using Farsight's passive DNS database to find additional C2 infrastructure, so IP addresses and domain names. We then used the MD5 hashes of the Derusbi malware to find incident data, which there was a lot of, thankfully. Well, maybe not thankfully, but at least for my purposes. We then found certain CVEs, where similar vulnerabilities were used, and then of course we found the exploit that was taking advantage of the vulnerability. What's kind of the highlight here is that we're not just saying, oh, this is the exploit used. I actually dug around until I found the exploit itself, or, you know, if it was already on Exploit-DB as a proof

of concept or something like that. So we actually have the exploits themselves, or supposedly, right. But it's also interesting because the exploit piece is actually the hardest piece, and I think that this is a community gap area that we can work on. I'm not sure many people are really collecting and looking at exploits, but we are certainly trying to. Okay, so what is Derusbi? This is an attacker tool used by certain APT actors that compromised OPM, Anthem Health, Forbes.com. What was the name that Symantec and everybody used? I can't remember; it's on a different slide. We assert that the attacker tool was created in 2006 and there was a

major rewrite in the code. I'm not sure when that happened, but in a couple of slides we'll show you that there was a pretty big spike in the collections of Derusbi malware. What's really important is that in OSINT some people have claimed that Derusbi is another malware family called Codoso, and beyond that, some have asserted that Derusbi is also Briba. What has happened is that once one person says these things, other people pick up on it and they're like, yeah, definitely, this is definitely Codoso. I think Palo Alto went as far as to call the actor group using Briba the Codoso group, and that was only because

there was a string in the malware that said Codoso. It was not the same malware, but we wanted to look into that to verify some of those claims, because we weren't finding people that were actually doing that. So for the code comparison, I'm going to talk a little bit more about our function YARA tool and how that works, and that's what I used to do a lot of this. The tool that's available out there on the CERT GitHub, under the Pharos tools, is function YARA. How it works is that it creates a YARA rule

for every function in an executable. Now, you might say, well, that's going to give you good functions and bad functions, right? Then you bang it against a known clean directory. So I would bang it against a known clean System32 directory to remove any hits that occur, and then, after I do some frequency analysis, I have a subset of rules where I can say that all these rules, from this specific malware that we know to be this malware, are malicious rules. So what I did then was, I wanted to compare Derusbi to Briba, and Derusbi to Codoso, and by doing that

determine how much these malware families have in common, which open source is supposedly saying are all equal, all the same, whatever. Well, this is based on doing this analysis and looking at the collection numbers we have in house, right? So, starting with Codoso and doing the frequency analysis, I came up with a rule set, and when I compared those rules and how many actually hit on Derusbi, I found only one function in common with four files, and another function in common with 11. So that's showing you that only two functions out of this huge rule list I had were found to be common in both

malware sets. Similarly, with Derusbi and Briba, they only shared seven functions, right? So you can see, based on doing further analysis and looking at the code itself, that these files are not the same; this malware is not the same. And then one of the other things is that you don't see a comparison of Briba to Codoso here. That's in the paper itself, but that's a different set that we looked at, because, as we talked about, we're only going to give you one of the three use cases that we did, and this one is about Derusbi. Does that kind of

make sense? So, you know, function YARA is out there and you can use it. It's good for SOCs or incident responders when they're trying to get quick hits on things, but like I said, you have to take it and pare down your set before you can have good, high confidence that the set you have is a good YARA set that will hit and only fire on malicious code. So what were some of our findings? We're going to go through each piece and then put it together to drive some conclusions. We had 112 files of Derusbi as of January 2016. We have not collected any more

Derusbi since this date. As of yesterday, I reread the report that we put out. Within the configuration dump there were 58 unique domain names used for C2 and only five IP addresses. The malware is communicating over some pretty common ports, so that's not really too interesting; I guess it's just an extra data point. And it did have some remote files and some interesting strings, including cia.exe, Mame rout print p. jpeg. None of this is super interesting, but this is the kind of information that you can find if you are looking at a piece of malware and you are writing a configuration dumper to better

understand the malware, which is cool, and I'm not sure most incident responders, well, network defenders, are taking this and implementing it. We saw a huge spike in the compile times in 2012, and as a lot of us know, these things can be easily changed. Again, this is just a data point; take these types of things with a grain of salt. I'm not sure what accounts for the spike from 2011 to 2012, so if anybody has any ideas, please see me later and we can talk about it. After doing some indicator expansion, we found 60 more domain names associated with the malware,

50 IP addresses, but the interesting thing is that we found that all of the infrastructure only used 12 different name servers. We also looked into different records, like the SOA record, which could tell you more about the person that was registering the domain, among other details. We didn't find any information in the SOA records, and we found that the only MX records were variants of Yahoo JP, which is different; I was not necessarily expecting that. So who is responsible for these IP addresses? These are the organizations, and the way that we got to this data was we took the IP addresses, found the autonomous system numbers, and

then found the organizations that have the autonomous systems. China Unicom had the most IP addresses, and you see some others sprinkled in here, including Confluence, which comes up quite a few times, and Amazon, which is not terribly surprising because it's pretty big. But the interesting thing is that Joe's Datacenter came up, and one of our colleagues at CERT is presenting at FIRST, I think next week or whenever FIRST is, about identifying public sinkholes, so we know that they are running sinkholes there for certain types of malware. So this is what the network looks like. The other data point we're talking about is incident data. So think

of that as public reporting in itself. You have your who's who doing the reporting: CrowdStrike, Symantec, Cisco, and so forth. Derusbi was tied to these different names: it was tied to Deep Panda, it was tied to Kung Fu Kittens, and it was supposedly tied to the breach of OPM. Derusbi is only one part of this RAT itself; it's actually the kernel-level tool. CrowdStrike also ties it to defense industrial base attacks as well, and what's interesting is that it also aligned with one of the other use

cases we had, ironically, but they're actually not similar in themselves, even though sometimes they get tied together the same way. That's why we say our approach was to start from the malware itself and not really trust open source reporting as the main key area to start from. The other data point that we mentioned in our nonlinear sort of circle thing was vulnerabilities, right? Derusbi was linked to these three different CVEs, and what was important for me was to actually just find these CVEs,

or rather, find the exploits tied to these CVEs. Luckily, with time, and since some of these were older, it's easier to find the exploits out there, but it still took a lot of digging. And you have to kind of trust what you're reading, and that's actually going to lead into my next slide: you have to sort of trust what you're reading and assume that the exploit you found is actually the exploit that's tied to that specific CVE. Sometimes that's the case and sometimes it's not. I think the other thing to note is that just because the CVE is old doesn't mean that it's not relevant, and I think

the industry as a whole likes to chase the shiny penny, so you're looking at, oh, this zero day, that zero day. But the truth is that the older CVEs are being exploited constantly by exploit kits, and that kind of plays into this analysis too. So just because they're old doesn't mean that they're not relevant, which ties into the whole patching thing, updating your systems, and so forth. So this next slide: given the time that's elapsed since our report was completed, you won't find this information there. However, we're working on putting together a different technical report looking at Flash exploits themselves. But, so,

here are some of the tools I used to do this. So why did I dig into this? Because it was something of interest. I was working down a different path, but I was looking at the same CVEs that were found in our report, and I was decompressing them and running a couple of these tools. I'm visual, so I kept seeing the same hash appear here for an underlying exploit, and I'm like, well, that was tied to this CVE, why is it showing up in this one? So that's sort of where it led. So it's kind of

like, why is this important, right? We start trusting what vendors or open source are saying about different exploits, but when you're looking at the actual code, I can 100% tell you, based on this example I had, that's not the case. The exploit that came out for CVE-2015-8651, which came out at the end of December, was a zero day when it came out. It was targeting Flash, obviously. But what was interesting is that that exact same hash was listed as being used in the attacks, and that specific hash, once you broke it down and

looked at the actual code, was the exact same exploit that was used a year earlier, and it was tied to the 2014 CVE, right? So the fact that we trusted that it was a zero day wasn't the case, because the other exploit hash that I had for that one didn't match at all either. So the conclusion is: what do you trust, who do you trust, what do you believe when it comes to these exploits and the CVEs? Like I said, we're doing further analysis on it. I've only hit the tip of the iceberg here when

looking at this stuff, but it was an interesting finding that I had. So the TL;DR here is that Derusbi was used by APT actors named Deep Panda, Shell Crew, what have you. There are thousands of names, and that drives me nuts. Derusbi is not Codoso or Briba, which is important for a couple of reasons, because if it was Codoso or Briba, then that would have added back into our analysis set and we would have had more malware to look at. But it is not, so please do not believe everything that you read. The actors used a very small network compared to some of the other networks that I've looked at. They

also, and I think I forgot to bring this up, one of the ASes, one of the organizations, was actually the Taiwanese academic network that was used during Ceno, which I thought was kind of interesting. So we can say that, at least to some degree, the actors were probably targeting that organization in some way. The infrastructure uses only 12 name servers, and in at least two cases the group exploited zero-day vulnerabilities. What I want for those of you listening is that if you guys are network defenders, please do this analysis, take the data that you find, so the extra domain names, IP addresses, what have you, and implement

them to protect yourself. With the malware, we can say with certainty what the actors are using in terms of C2 infrastructure, and you don't typically see this type of analysis out there. The CVEs: like I said, there were three CVEs that were used, and for the exploits themselves, you can find the actual hashes for the exploits in our paper. And it's important to focus on what you actually know, and that you actually trust, before going forward. On the future work: this is just one methodology, right? So this is just a methodology that we used.
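The indicator expansion step that produces those extra domain names and IP addresses can be sketched as a bounded pivot over passive-DNS-style observations. The records here are an in-memory list of made-up `(domain, ip)` pairs; this does not use or imitate any passive DNS provider's API, and the function names and hop limit are assumptions for illustration.

```python
from collections import defaultdict, deque


def build_indexes(records):
    """Index (domain, ip) observations in both directions."""
    domain_to_ips = defaultdict(set)
    ip_to_domains = defaultdict(set)
    for domain, ip in records:
        domain_to_ips[domain].add(ip)
        ip_to_domains[ip].add(domain)
    return domain_to_ips, ip_to_domains


def expand(seeds, records, max_hops=2):
    """Breadth-first pivot from seed indicators, at most max_hops steps away."""
    domain_to_ips, ip_to_domains = build_indexes(records)
    seen = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        indicator, hops = queue.popleft()
        if hops == max_hops:
            continue
        # A domain pivots to its IPs; an IP pivots to the domains seen on it.
        neighbours = domain_to_ips.get(indicator, set()) | ip_to_domains.get(indicator, set())
        for nxt in neighbours - seen:
            seen.add(nxt)
            queue.append((nxt, hops + 1))
    return seen - set(seeds)


# Made-up observations using documentation domains and address ranges.
RECORDS = [
    ("c2.example.com", "192.0.2.10"),
    ("update.example.net", "192.0.2.10"),
    ("update.example.net", "198.51.100.7"),
    ("unrelated.example.org", "203.0.113.5"),
]

print(sorted(expand({"c2.example.com"}, RECORDS, max_hops=2)))
print(sorted(expand({"c2.example.com"}, RECORDS, max_hops=3)))
```

The hop limit matters in practice: an address on shared hosting can connect a seed to many unrelated domains, so each additional hop should be reviewed before the results are deployed as blocks.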

It's important that you can also pivot in different ways from there. We're just saying: start with that malware that you know, that malware you understand, and spin out from there. A couple of the community gap areas are in exploits, right? It's very retroactive. Unless you're developing the exploit, you don't know about an exploit till it actually happens. If that's not the case, I'd love to talk to you more, because how do you find it out? We actually developed an exploit catalog ourselves, so we're cataloging exploits, the actual files or hashes or proofs of concept tied to CVEs, and like I said,

we're doing further analysis on Flash exploits, which can expand into other application exploits, and we're happy to talk about any of that offline. Okay, so the takeaways for you are that this is an outside-in approach instead of an inside-out approach, which allows you to see more than just one aspect of an intrusion. We talked about vulnerabilities, exploits, malware, C2 infrastructure, and a little bit about the actors. This is important for those of us in the threat intelligence communities too, because we're not necessarily focusing on just one piece. We really wanted to put, I guess, a more complete story

together for everybody. What this means for network defenders is that if you go through this methodology and you pivot from one data set to the next, you can then deploy whatever IOCs you find to protect your networks. This also helps us create an adversary profile, so if we understand that an adversary is using a particular tool, then we can take a look at that and see if there are other tools that they might be using, if there is similar code, and so forth. So our goal was to really start with the tools instead of with incident data, and separately to confirm or deny what you are finding in OSINT. Please don't

believe everything you read, and think a little bit harder about how we as a community want to approach the adversary. So we're open for questions; you can find a link to our paper there. So, if I'm a smart adversary and I know that you're tracking the hashes of my functions, can't I just foil your whole methodology by changing that, and changing my C2 from time to time, so now you can't track me? In other words, you're just tracking the people who aren't smart enough to avoid it. So the thing about tracking malware itself is that, unfortunately, people in the world are really lazy. I guess you could have an adversary that changes the

code every time it deploys something, but what we found is that even the really advanced actors only change a little bit of the code, which is why we can group malware families, because ultimately people are really lazy. Now, you're right, though, that we probably aren't tracking the 1% of the APTs that are doing some really, really bad things, but I think that's a really hard problem. I'm not sure that anybody really has a really good solution, but like I said, this is just one methodology, not the methodology to use. Yeah? The samples you're getting, is that from a live infection, or that you guys... So we have a collection

of different feeds that we've been analyzing now for like 10 to 12 years, so we have a bunch of files that we've looked at just from collection. So we didn't have the incident; it wasn't like we collected it because we were hosed. That makes sense. Yeah, we just kind of have a malware set. It could have come in through an RFI to say, hey, we found this, can you look at this for us? And then that's where we develop. We start with five, and now we have, you know, 100, because our reverse engineers, based on doing their analysis and how

they develop it, are able to find more samples in our huge bucket of 250 million samples. The first question is basically, how much time do you have in this? And the second actually refers to the first question: doesn't this need to be automated in some way? Because yes, you can identify indicators from a particular actor, but at large, I mean, look at this group, we're not all going to be able to do this work. And if we do, the OSINT problem is we don't know whether or not we can trust each other, because we might accidentally associate or whatever else. So isn't this really an

automation issue?

Automating what? I mean, I don't know if you can really automate it. You can only automate parts of it. You have to have reverse engineers, because you can't automate that.

Right, but on the flip side, in other words, obviously the process of narrowing down the functions to associate the different things, I mean those parts. Because to identify an entire grouping, I mean, there are a lot of threat actors and we have a shortage in the field. So I guess, I mean, I'm not knocking anything. I understand what you mean by automation. Yeah, I'm just saying that, at large, if you can't trust the other OSINT, then we're pretty much

left to, we need a solution to automate.

I mean, in a sense we're saying be careful with trust, right? Because you have to sort of trust, since you're paying for these reports from whoever develops them, and you're paying to get this feed, and it has MD5s and IPs, so you have to sort of trust it. But it's more like, if you have your in-house shop and you're developing it out, this is the sort of approach that I would take if I was developing my shop inside out, which is sort of how we do it

right? But we're not giving, like...

Doing cleaning of the CVE database for the identical... anybody, sorry, is anyone doing cleaning of the CVE database for duplicate, repetitive hashes?

Not that I'm aware of, no. I think it's a lot of trusting. And if I understand how most AVs work, they look at a hash, because once I ran these files back through, say, VirusTotal, I uploaded them back up and they hit on what they said they were supposed to hit on, and then it's like, that's not the case. Especially if it's compressed: you throw the original file out, it says it's

this CVE, then once you decompress it, it's saying, no, it's this CVE, and you're like, well, then you're just going on a hash alone. I don't know if anybody's scraping it.

So one comment, one question. I think on the notion of furthering the automation piece, developing a tool that is a function-to-YARA converter, by having that tool out there, we can share YARA signatures, so that's certainly a step in the automation process. So there is some contribution there. The actual question I had was, when you were doing your comparison, sort of convincing me that Derusbi is not these

other families, the kind of missing convincing piece that I would have liked to see, and I'm curious if you've done it, is this: for the overlap, those few functions, five out of 40 or whatever, that overlapped between Derusbi and those other families, were those overlapping functions found in malware families all across the board? That would convince me more. Does that make sense?

Yeah, yeah. So no, we did not do that. We did it for these sets we have. That's something we could build upon. I mean, you kind of have a little more insight into how we work, but

to explain to everybody else how we started with these sets and how we were confident in these sets: we have reverse engineers that actually get sample sets, reverse them, create functions, or sorry, create YARA signatures, then identify other things, and they also create config dumpers to identify other configs for those same malware families. So that's where we start. But looking at that overlap of other functions that hit everywhere else, no. That's something that should be done, though, right? Because how many common, nonassociated malware families are using similar functions? Code reuse is very common. Why

reinvent the wheel?

So the other thing, just to address your question, Evan, is that we did take a look. We worked with Will Casey to do his suffix tree analysis too, which confirmed it. So one of our data scientists will take the binaries and compare the binaries between different malware families, and we did that. So we did it at the function level, and then we did it at a lower level, if you will, and it confirmed our findings that they were very, very different. So that's in the paper; we just didn't present on it. It's kind of tricky; it's not necessarily the easiest thing to discuss.

Okay. So yeah, maybe the

context of what other families were looked at, because I think when you're doing a suffix tree comparison, it is sort of looking at what overlapping chunks of the code are common. But part of the question with the suffix tree is (thank you) how... basically what I'm saying is, with the suffix tree analysis you may have looked at a few other families, and I guess I'm suggesting that by broadening the scope, for example to look at the whole P, you can help the argument more.

I think that there are a lot of auxiliary things that we can do, and I

think you bring up a good point. So this was just kind of the beginning, not necessarily the end.

Right, and that's kind of how we got down this path as well. The other use cases aren't like this; they're actually more simple. It's just that when we were researching, we were finding that open source was calling Derusbi and Koso and Bria the same thing. That's sort of why we chose to explain this one and go down that path. But the other two use cases in our paper are a little more simplistic than this one is.

I know you had mentioned that people are kind of lazy, so they tend to

reuse code, but isn't it simple enough for people just to throw some opcode randomization stuff into their build? Do you see that in the wild at all, or come across anything like that?

I mean, not for these specific examples, but that's not something that's uncommon, right? But once you reverse it and understand that, you kind of know what to look for. And I think anything that's polymorphic is going to cause harder analysis.

So, any questions, anybody? Thanks. All right, cool,

thanks.

Yes, once more, thank you very much, Deana and Kyle. We'll give them speaker gifts; they're wonderful. If you would like a speaker gift yourself, you have to be a speaker. There we go, thanks. A round of applause, please. Thank you.
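
One question in the Q&A asked whether the few functions shared between Derusbi and the other families also show up in malware families across the board. The speakers said that check was not done. The following is only a minimal sketch of how such a check might look, assuming each family has already been reduced to a set of function hashes. The family labels, hash strings, and the `max_families` cutoff are all hypothetical and are not taken from the talk or the paper.

```python
from collections import Counter


def shared_functions(family_a, family_b):
    """Return the function hashes present in both families."""
    return family_a & family_b


def family_prevalence(corpus):
    """Count, for each function hash, how many families contain it."""
    counts = Counter()
    for hashes in corpus.values():
        # Each family contributes at most one count per hash (sets hold no duplicates).
        counts.update(hashes)
    return counts


def distinctive_overlap(corpus, name_a, name_b, max_families=2):
    """Shared functions that appear in at most `max_families` families overall.

    Functions seen in many unrelated families are treated as common code
    and dropped from the overlap.
    """
    prevalence = family_prevalence(corpus)
    shared = shared_functions(corpus[name_a], corpus[name_b])
    return {h for h in shared if prevalence[h] <= max_families}


# Hypothetical corpus: family labels and hash values are made up for illustration.
corpus = {
    "family_a": {"f01", "f02", "f03", "f04"},
    "family_b": {"f03", "f04", "f05"},
    "family_c": {"f04", "f06"},
    "family_d": {"f04", "f07"},
}

shared = shared_functions(corpus["family_a"], corpus["family_b"])
distinctive = distinctive_overlap(corpus, "family_a", "family_b")
print(sorted(shared))
print(sorted(distinctive))
```

In this toy data, two functions overlap between the first two families, but one of them appears in every family, so only the other remains as a candidate link. A real cutoff would need to be chosen against a much larger set of families.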
