
Good afternoon. I hope you had a good lunch, so you can sleep well now. My name is Ladislav, and my talk is about something that has been almost forgotten but still has to be reckoned with when you maintain a system for automated sample processing.

I am a senior researcher at Gen, and I have been part of the AV industry for 18 years already. While my CV includes five companies, it has still been the same chair: we've been renamed, acquired, and renamed again. I did reverse engineering, I did malware analysis, I built internal tools and cleaning tools for file infectors, and I did a Windows emulation engine, ransomware decryptors, and
many YARA rules to detect those file infectors and ransomware.

What is this talk going to be about? I'm going to talk about file infectors, which are the true computer viruses, and I'm going to present the prevalent strains that we see in our user base. I will show you their brief descriptions and delve into the quirks of detecting file infectors with YARA.

The general public and some journalists still tend to use the term "computer virus" for every malicious program out there. Although not correct, it comes from a term that started to be popular some 30 years ago. So this is not a virus, this is
malware. The true viruses mimic the behavior of biological viruses, and they are defined as programs that, when executed, replicate by inserting their own copies into other, healthy programs. Even though blog posts don't mention them too often, they still exist, and I can divide them into two categories.

The first are the so-called prependers. Those replace the whole EXE with their own body, and the original EXE is appended to the end; sometimes it is inserted into resources, or exceptionally it can even be in a separate file. Those are very easy to write (you can make one as a Python script), they are very easy to detect, and they are very
easy to clean. Some of the strains we saw lately include Black Moon, Plastic Fantastic, Lamer (which shows how sophisticated this virus is), and Memory. The last one seems like a typo; well, it actually is, but not on my side: we named it after a string in the virus body.

The second group is the so-called appenders. Those are more sophisticated. The virus body is incorporated into the EXE, usually at the end, and the code execution is altered, so most often the virus code runs first and then execution is transferred to the original code of the program. Very often those are polymorphic, so every copy is different from the previous
copy. They are often encrypted, and they are quite hard to detect with static detections. As examples I could present Sality, Virut, or Expiro.

How do they achieve this altering of code execution? There are several ways to do that. The first is to modify the field called address of entry point in the PE header. This is the typical structure of a PE file, and in the header there is a field called address of entry point. In the infected copy, which is usually slightly bigger because there is virus code at the end, all you need to do is modify the address of entry point so it
goes to the end first and then, if necessary, transfers back to the original code.

The second way to achieve the same functionality is to modify the code at the entry point itself. In the healthy, original copy of the program there is again this address of entry point pointing to some code. In the infected copy it has the same value, but if you look at the code, it is different.

The third way is to modify some random instruction in the code section of the program. Again, in the original file you see the entry point looking like this, and in the infected one you see
the same code, but if you diff those sections you will see that in the healthy file there is some instruction and in the infected file there is a different one.

So what is spreading? What do we see in our user base? Here are the six most prevalent strains of file infectors that we see in our user base, including their aliases, among them Flux, Sality, Virut, Expiro, and Runouce. The last one again looks like a typo, and it is, but not on my side, because it creates a file in the system32 directory called runouce.exe. There are quite a lot of them that we see. These are the numbers of protected
users per month in the fourth quarter of 2023, so it's not something 15 years old; it's still current. If you look at the countries, it's mostly Africa or the Near East. The scale shows the so-called risk ratio, which is the number of attacked users divided by the number of active users in that country.

So let me show you a few of those strains, how they operate and how they infect files. Flux is the appender kind of file infector. The code at the entry point is rewritten, so you can see that it's at the same address but the code is different. There is a small loader in the
code section, and most of the virus body is placed in the PE overlay.

The second one, Sality, works by modifying the address of entry point in the header, so the original code and the entry point of the virus have different addresses, and the code obviously is different too. It is very polymorphic, it's encrypted, and the virus body is appended to the last section, which is enlarged in order to hold it all.

One of the most problematic, at least from my personal point of view, is Virut, which has a highly polymorphic stub. The first thing it does is call GetTickCount repeatedly, and it tries to find virtual machines
by speed, that is, by how fast it can call this API. It injects into processes. If you look at a machine that is infected by Virut, some functions of interest are changed, they are hooked, for example NtOpenFile. For those who ever saw an API like this, it should rather look like this; however, there is a call that should not be there. It also modifies the hosts file, so for localhost there is some Chinese domain, and it also modifies HTML files, so there is an iframe that points to the same domain.

Expiro is interesting because it is capable of infecting both 32-bit and 64-bit EXEs. It is quite large; it is
more than 240 kilobytes. The entry point code is rewritten by the virus stub, and it is typical in having multiple push instructions at the beginning.

Black Moon is one of the more recent ones; it's from around 2018, and it is a combination of a file infector and a Monero miner. If you execute a sample, it will infect all files on the hard drive, then it will also modify HTML files on the hard drive, and finally it will drop and execute the Monero miner, 32-bit or 64-bit depending on the architecture. The file infector part is not really sophisticated, and it doesn't really care about reinfection, so if you list the file overlay
you can find multiple PE files in it, and all of those are like steps of the infection. It also doesn't really care about reinfection of those HTML files, so it is very common to see several executions of the Monero miner in one HTML file.

Now, why am I even talking about this? I said that it's quite old stuff and so on, so why is this still important? Well, if you get an infected file, then everything important is at the end of the file. The static features that you use to detect, cluster, or feed to machine learning don't matter anymore. So
all of this, like strings, resources, PDB path, icon hash, whatever is in the EXE, doesn't matter anymore. If you have a machine learning or clustering system, it could be a group of LockBit samples, a group of notepad.exe files, or a group of samples that create an interesting mutex: all of them may be tainted by a file infector. So you need a pre-processing step that tells you whether there is a file infector or not. If not, we go back to the original clustering; if yes, we need to process that file infector.

Also, by their nature, file infectors produce large amounts of samples. It's not a problem to get 1,000
samples from one single PC or sandbox, so if your sandbox resubmits dropped files, it goes haywire pretty fast. So at Gen we use YARA pre-processing to check whether samples are infected by a file infector, and I'm going to tell you how we do that.

As you probably know, YARA is sometimes called the pattern matching Swiss knife, and it can detect static features, for example with this simple rule. Even if you don't know YARA, you can pretty much see how it works: if you run the YARA scanner on an HTML file from this conference, you will get a detection. But it can also detect a behavioral
feature. So again you see a simple behavioral YARA rule that detects overwriting of EXE files, and if you run it with a behavioral report, you get a detection.

In general, YARA has quite a tradition at Gen, formerly Avast. We are even present as one of the authors: if you list the appropriate document, you will see Avast there as one of the authors. We also have very high standards for our internal YARA rules. We have very low tolerance for false positives, we want our YARA rules to be perfect or almost perfect, and we are also trying to minimize false negatives, and this is not possible to do
with the official upstream YARA, so we have our own. To be able to deal with file infectors, we made many improvements and also pushed them into the official upstream YARA, like this one. We enhanced the PE module, which is very important when detecting anything related to file infectors: we added quite a lot of supported fields, and we also added a lot of constants.

That's not the only thing we enhanced. We also enhanced the behavior module, the cuckoo module, because originally, for files for example, there was just a function that checks for file access, and that wasn't enough for us. We obviously
kept the old function as it is, but we also added functions detecting file read, file write, and file delete. We did the same with the registry functions: originally there was only key access, but now we can detect key read, key write, and key delete.

The functions themselves were also enhanced. Instead of simply returning a Boolean, where upstream YARA tells you whether something happened or not, we changed them to return an integer that says how many occurrences of that action happened in the sample run in the
sandbox.

Because this is about the community, we also want to give back to the community, and we always try to push our changes to the official YARA. They don't get accepted every time, so if you would like to see those changes in the official YARA, join us, because I think the more people want a feature, the more likely it is to get through.

So how do we do that? How do we detect file infectors with YARA? It could be easy, or it could be very challenging. Like I said, prependers replace the body of the file with their own body, so if you want to detect a prepender,
it is very easy. All copies are the same, so you can even make a hash of the first, say, 400 kilobytes and it will work. On the other hand, an appender may be easy if the file infector is not polymorphic, like Runouce or Neshta, but it's very hard with viruses that are polymorphic, like Sality or Virut.

The good news is that most of the file infectors are no longer developed. Why is that a good thing? Well, once you cover them, you're good; you can forget everything and it will just work. We did some cleaning tools at AVG and Avast; they are more than 10 years old,
but they still work today. So again, once you cover them, you're good.

It is also very easy to write generic rules that detect not a particular infector but all of them. This is an example: you have something that writes to a large number of EXEs. Maybe the zero here could be larger, like 100 or at least 50, but you get the idea; it's not that hard.

So, prependers. All samples have the same primary EXE, and the original one is in the overlay or in the resources. Those are usually very lamely written. Sometimes they are funny, like this one: apparently someone forgot to check if the sample is already
infected, so it keeps reinfecting on every run and the samples grow bigger and bigger.

Static detection is very simple. If you have a non-packed sample, any string will do. This is an example from the Neshta infector; no comment about that, but writing the YARA rule is obvious, and you don't really need to be a rocket engineer to figure it out. However, it is always good not to rely just on the string but also to add some detection of the file format. Otherwise that YARA rule would actually detect itself as well, which you don't want. It would also detect any text file with that string that is small enough, like
smaller than 10 kilobytes. So you want to put something in there that says "I want it to be a PE file", not just the string.

If the sample is packed, which is also sometimes the case (the infectors are packed with UPX or the MPRESS packer), then you can use any byte sequence: put it into the YARA rule and it will work. Remember, they are no longer developed, so it's not like in the next 14 years you will get a new sample with new sequences. So it's okay.

A few tips that you can use when you write a YARA rule for file infectors. Don't check the file size; it is much better to use the PE module
of YARA and check the size of image. This is because the file may be infected multiple times, and very often is, or the file could have a large overlay, so file size does not really work. You can also check the size of the overlay using a YARA rule like this.

Also don't forget that the infected EXE may be in the middle of a healthy EXE, as a resource, an installer package, or a part of the overlay. You probably want to detect only the primary EXE having the infector, not anything that is packed in a resource, because that's the job of the unpacker or the installer itself. It is also common to have
multiple copies of the same infector in one file, be it an appender or a prepender, and it is also pretty common to have more than one infector in one file.

For appenders, static detection is a problem, because most of the traditional methods, where you check for strings or code sequences, just don't work: the virus body at the end is often encrypted. But there are still a few things that you can use. Most often the entry point goes to the last section. This is not typical when you build an EXE; a typical EXE has one code section and the entry point goes there. The last section is often
marked as read-write-execute; again, this is not typical. Also, because they are often encrypted, the last section, the virus body, has very high entropy. What would such a YARA rule look like? This might be a good example of how to detect that. And because nothing in the world is easy, this behavior is also typical for a packed sample, so you may want to exclude the packers that you see most often.

There is again some good news that you can use when checking for a file infector: some of them check for their own presence in the file. How do they do that? You may compare the original file
with the infected one, focus on the header, and you can see some differences. Most of the time they use fields that are otherwise unused. In the DOS header that is basically all of them except the first and the last one. In the NT headers there is the time stamp; this one can be abused because basically it's not important for anything. The pointer to symbol table and the number of symbols sometimes get modified too, and in the section header also the pointer to line numbers. I will show you an example later.

So if you compare the original file with the infected one, you can see that there is a change in the time stamp. So, because we added a timestamp
field to YARA, you can write a very simple YARA rule checking for the Virut file infector. Again, it would be good to add some more checks here to confirm. You don't need to check for the format, because using the PE module automatically implies that it is a PE file, but I would like to see more checks here to confirm that it really is Virut; for example, the entropy of the last section would look good there.

Another sample: if you look here at the image section header, you can see something that is really, really suspicious. What is that? That is just not the time stamp. So again you can enhance the
previous rule with the new value, so you will also catch the new variant.

However, with all due respect to static detection, it does not always work well, so you should really focus on behavioral detection. That one skips any polymorphism and any entry point obfuscation, and it generally works much better than static detection. For example, like I said, file infectors often modify multiple EXE files, so there is nothing simpler than to write a YARA rule for that. If you tune your value and exclude some known installers that do that, you can use a very simple rule that will detect file infector behavior. Again, some installers do that as well, so you need to exclude them, put
them into the whitelist.

File infectors, like any other malware, may create special files. So if you see that something is creating linkinfo.dll inside the Windows directory, it is a pretty good candidate for a behavior-based YARA rule; this one in particular is the Alman file infector. It also applies to registry entries, so if you see something like this, then it's yet another file infector.

And the most important thing: like most other malware, file infectors also need to check for their presence on the machine, that is, whether they are already running. They often do that by creating a named object, which could be a mutex, semaphore, event, atom, whatever, or a named section.
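
The named-object idea can be sketched outside YARA as a small lookup over a sandbox report. This is only an illustration under stated assumptions: the `BehaviorReport` structure, the marker names in `KNOWN_MARKERS`, and the `excluded_hashes` parameter are all hypothetical and are not taken from any real infector or from the rules discussed in this talk.

```python
from dataclasses import dataclass, field

# Hypothetical marker names, for illustration only; real infectors use
# their own specific names, which are not listed here.
KNOWN_MARKERS = {
    "HypotheticalInfectorMutex": "hypothetical_infector_a",
    "HypotheticalInfectorSection": "hypothetical_infector_b",
}


@dataclass
class BehaviorReport:
    """Hypothetical, simplified sandbox report for one sample."""
    sha256: str
    mutexes: list = field(default_factory=list)   # names of created mutexes
    sections: list = field(default_factory=list)  # names of created named sections


def match_named_objects(report, excluded_hashes=frozenset()):
    """Return sorted infector labels whose marker objects the sample created."""
    # Samples known to create the same objects for benign reasons are skipped.
    if report.sha256 in excluded_hashes:
        return []
    created = set(report.mutexes) | set(report.sections)
    return sorted({label for name, label in KNOWN_MARKERS.items() if name in created})
```

A report that lists one of the marker names yields the corresponding label, and the same report yields nothing once its hash is on the exclusion set.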
For example, the Virut file infector creates very specific mutexes or named sections, so if you see that, again you can make a behavior-based YARA rule and it will detect Virut quite reliably.

And again, because nothing is as easy as it could be, there are some exceptions. As for the markers of infection that I mentioned for static detection: if the file has been cleaned by a cleaning tool, very often the marker stays there. This one is a marker of the Sality file infector. We see a lot of those; we actually had to make a special YARA rule just for this kind of sample. They are not malware, and they work even though they
are not the same as the original, so we had to do special filtering of samples modified like this.

Legitimate installers may also modify and drop many EXE files. An example from my recent practice was Foxit PDF Reader, whatever that is: it just installs itself without user consent, and all our YARA rules that detect modification of EXE files were screaming that this is an infector. So you need to put that on the exclusion list as well.

There are more things to watch for. Some legitimate applications may create the same named objects as file infectors, and there may be a legitimate reason. For example, there are immunizers: programs that run in the background and
just create the mutex or named section to fool a real infector and prevent it from running. Some AV solutions do that as well. The funniest are just random test programs. This is a sample that I processed about 14 days ago; it does nothing else than this. Let me make it a little better looking. Yeah, this is all that it does, nothing else. But because it creates the same named object as the infector, our systems were screaming. This is a false alarm, well, it is, and I had to include that sample in an exclusion list.

So basically that is it. So what
is the conclusion of this talk? File infectors are mostly a blast from the past; however, they will mess up any kind of automated sample processing even in the present. So if you have an automated system, you need to process samples differently when you find out that they are infected by a file infector, and you need to maintain an up-to-date exclusion list and check for false alarms periodically.

Thank you very much. That's it from me, and if you have any questions, please ask them.
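
As a recap of the static appender heuristic from the talk (entry point in the last section, last section marked read-write-execute, high entropy), here is a minimal Python sketch. It is not the YARA rule shown on the slides: the `Section` structure is a hypothetical stand-in for parsed PE section data, and the entropy threshold of 7.0 is an assumed value that would need tuning. Packed samples show the same traits, so a real pipeline would also need to exclude known packers.

```python
import math
from collections import Counter
from dataclasses import dataclass


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


@dataclass
class Section:
    """Hypothetical, simplified view of one parsed PE section."""
    virtual_address: int
    virtual_size: int
    readable: bool
    writable: bool
    executable: bool
    data: bytes


def looks_like_appender(entry_point_rva, sections, entropy_threshold=7.0):
    """Heuristic: entry point in the last section, which is RWX and high-entropy.

    The threshold is an assumption for illustration, not a value from the talk.
    """
    if not sections:
        return False
    last = sections[-1]
    end = last.virtual_address + last.virtual_size
    ep_in_last = last.virtual_address <= entry_point_rva < end
    rwx = last.readable and last.writable and last.executable
    return ep_in_last and rwx and shannon_entropy(last.data) >= entropy_threshold
```

The check only fires when all three traits are present at once, which keeps ordinary files with an entry point in a normal code section out of the result.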