
Okay, thank you very much. Good morning, everyone. Once again, thanks to Warren and the organizers for having me here. And to be honest, this is the first time for me to speak at BSides, so I may be nervous; I apologize for that, so bear with me, please. My name is Costin Raiu. I am the director of Kaspersky's Global Research and Analysis Team, also known as GReAT, and I've been doing this kind of threat research for some time now. Probably you guys are familiar with some of our publications or some of our big discoveries. Duqu 2.0, for instance, was one of the things that we discovered in 2015 in our own network. Sometimes it's funny when you're looking for sophisticated stuff
all around the world and you catch things, and then one day you discover something even more sophisticated in your own company's network, which is the irony. Now, to be honest, going back and thinking about some of these things, maybe it wasn't the smartest idea in the world to announce them. Probably the Equation thing was not very smart to announce, and probably you have seen the fallout for our company from that.

I do remember another case. In 2010 I was invited to present at a conference in Vancouver, in Canada. One of my colleagues was invited to speak about Stuxnet. (There's the Twitter. Bless you.) How many people know about Stuxnet or have heard the name Stuxnet? Okay. One of
my colleagues, Alex Gostev, was invited to give a talk about Stuxnet there and, well, he couldn't get the visa in time, so he asked me, "Can you go instead?" And I said, "Sure, no problem, but what is this Stuxnet thingy?" To be honest, I wasn't paying very much attention to these APT things back then. Okay, I read all the documentation; we had actually found one of the zero-days and reported it to Microsoft. So I went there and, you know, as soon as I started speaking about Stuxnet I realized there was, like, a tension in the air. Not just that, but I looked toward the back of the room and, just at the far corner, there were
three guys. Everybody was sitting, and there were these three guys standing there. They looked a bit Middle Eastern-ish, and they looked not very pleased with the kind of things that I was saying. Later I asked the organizers, "Who are those guys?" and they said, "We don't really know. They just came at the last minute, they paid in cash and, you know, next to their organization name they wrote GeoEye, so we don't really know what that means." So I said, all right, obviously there's something going on here. I went back home, and about two weeks later I found a gift in my living room: there was a rubber cube with a
message, which was "take a break". So obviously this kind of research we were doing was not pleasing everybody, and some people wanted us to take a break. Which, to be honest, for a while I did: when you find such a gift in your home, you take a break. Well, not for long. I think it was just a couple of months, and then we continued. I don't know if that was good or bad, but these are just some of the things we published about, all the way up to 2017. We have other things, but they just don't fit on the screen anymore.

All right. So I think a good subject
nowadays is always fake news and attribution, and if we go back, it kind of started around 2016 with the US presidential elections. I was in Barcelona with one of my good friends, right in the rain, and we were talking about the possible outcome of the elections. He said, "You know, there's this guy, he writes for the New York Times, and he has, like, a mystical method which predicts who will be the winner of these elections, and he's got it right for the past 15 years or so. He's always got the winner right because his method is impeccable." And I said, "So who is the winner according to this guy?" "That for sure will be Clinton." I
thought, "Why? Are you absolutely confident?" "Yes." All right, no problem. Well, election day comes and I wake up in the morning just to check the results. I open my laptop, I go to Google, and what do I search for? I search "elections", and the first hit in there is just a photo of a lady who looks very sad, like this. So what's going on? Obviously we know what's going on. It was really quite difficult to find a picture of these two guys looking nice and friendly; in most of the other photos you find on the internet they're upset or yelling at each other or something. But not so
many people remember that, actually, before the US presidential election there was this guy called Guccifer. Guccifer was actually a Romanian taxi driver by the name of Marcel Lazăr Lehel, and he was one of the first guys who claimed to have hacked Hillary Clinton's email server, the very famous email server. He not only said, "Yeah, I know she has a mail server, I hacked it," but also that there was a bunch of other people who hacked the server as well, so they were kind of partying with each other on the server. You know, everybody knows each other from the mail server days. So what was this guy doing? He was a hacker and a taxi driver. He
was mostly self-taught, and he explained the nickname Guccifer by saying it combines the style of Gucci with the light of Lucifer, whatever that means. To be honest, he had no skills, he had no knowledge of hacking; everything he knew was from the internet. However, with all this, and I'm not saying he was a hacker, he was more of a script kiddie with a fashion sense, he was able to hack Colin Powell, Rockefeller, FBI and US Secret Service agents, and Corina Crețu. Probably nobody knows who she is, but I'll tell you: she is a Romanian politician. And he also managed to hack
George Maior. Now, George Maior was the head of the Romanian Intelligence Service, and this guy managed to hack his email account. To be honest, nobody in Romania cared when this guy was hacking all these Rockefeller and Colin Powell guys, but when Guccifer hacked the head of the Romanian Intelligence Service, well, that's when things basically changed. Not only that, but he started, you know, blackmailing the head of the Romanian Intelligence Service; he called him "the skull", and he was asking him for money not to release his private emails. It's kind of funny to know how he actually succeeded in hacking Corina Crețu. She had a Yahoo account, and what he did was pretty simple: he
tried to reset her password, and there were two questions. One was the name of the city where she grew up, and he knew that because she was born in a city in Romania called Brăila. The other question was the name of the street where she grew up, and he didn't know that. So what did he do? He went to the map of Brăila, he tried all the street names, and eventually he got the right one. That's how he hacked her account, and probably others. Now, as I was saying, everything was kind of quiet as long as he was hacking, you know, the other side. However, when he hacked the top intelligence man in Romania, George
Maior, it took about two weeks to find him and to arrest him. So he got arrested, and eventually he got extradited to the United States.

Now, why is this interesting? Because when the DNC announced they got hacked, they actually hired CrowdStrike for that, and I think this was a kind of turning point in the elections. They claimed that the hack was done by two APT groups, one of them called Fancy Bear, also known as Sofacy and APT28, the other one called Cozy Bear. And nobody disputed these claims until, let's say, a few hours later, when a blog appeared from a guy calling himself Guccifer 2.0, saying, like, you know, all
these fancy things, APTs, they don't exist, it's all fake; it's actually just me, I hacked the DNC alone, by myself, and there's no story here, and to prove it, here are some documents from the DNC. Now, why is this interesting? Because maybe a few people believed him, but not everybody. Lorenzo Franceschi-Bicchierai, a writer from Motherboard, actually took the time to do an interview with Guccifer 2.0, and he asked, "Where are you from?" He said, "I'm from Romania." Lorenzo is Italian, but he also used Google Translate to try to speak a bit of Romanian with Guccifer, just to test his language skills. So for
instance, he asked, "Why did you put metadata in the documents, like Russian metadata?" to which Guccifer answered "esta filigrana lamelle", which is another way of saying "it's my watermark". And then he keeps using this word, filigrana, a couple of times. To be honest, it's not just that; he actually makes a bunch of other mistakes. So this is not perfect Romanian; it's a rookie kind of Romanian. One of the things here is that this word, filigrana, is extremely rare in the Romanian language. I probably have friends my age who have never used this word in their life, because nobody uses it. If we want to say
watermark, we say "watermark"; we don't use this fancy word, filigrana. If you're wondering where it was coming from: if you just go to Google Translate and type this filigrana word, you'll get "watermark". So it's just, let's say, another proof that things were kind of fishy. And we did get a confirmation recently, when the Mueller indictment was posted. This Mueller indictment goes through the timeline of the DNC hack, and it says that the conspirators logged into a Moscow-based server and then basically created this Guccifer 2.0 persona in order to undermine the results of the CrowdStrike research. And how do they know? Basically because the conspirators searched, we
assume on Google, for certain keywords, for instance "some hundred sheets", "dcleaks", "illuminati", "worldwide known". And actually, you know, they posted this on WordPress, and they did use all these things they were searching for on Google. For people who don't speak English natively, sometimes you have an idea how to say something, but you search for it on Google just to make sure it's decent English and not broken English. So this is, let's say, the way they operate. Of course, it's a reminder to everybody that everything you search on Google gets logged, probably forever, and if somebody wants to find it or ask for it,
they will probably have this capability, and they will be able to see what you were searching for at any point in your life. By the way, what's missing from this Mueller indictment? I think there's a very, very cool point missing, and I wonder what will happen with that. If you remember, CrowdStrike said this hack was done by two APT groups, one of them Cozy Bear, one of them Fancy Bear. To be honest, all of this Mueller indictment deals with Fancy Bear and another group we call Hades. However, so far we have not seen any kind of indictment or, let's say, political outreach in regard to this other APT group called Cozy Bear. So it makes you
wonder what's going on. Are they next in line? I assume that this is part of maybe another big story. So where are all the Dukes? Nobody knows, but it will probably be an interesting story when it gets published.

Now, I remember last year, it was a Friday in May. You know, usually Fridays tend to be quiet: you go to work, do some things for a couple of hours, and we all go home, right? So on that Friday last year in May I go to work, and the first thing I see is a message from one of my colleagues in Spain, and he says, well, we have a huge outbreak
here in Spain. And I'm like, dude, on a Friday? Come on, nobody wants outbreaks on Fridays, neither malware authors nor security researchers. And I thought, maybe if we forget about it, it'll pass. Well, it didn't. Things got worse and worse, and maybe you remember what happened on May 12th, when displays were looking something like this in Spain. Displays looked like that in hotels; ATMs in Russia; this was the transportation in, I think, Frankfurt; ATMs in Asia. Pretty much all around the world, screens were kind of red with the very well-known message, which actually led people to, well, that's the great thing about people
and, you know, how they make fun of everything, really. Nokia is making a comeback; I hear it is very fashionable to have these new Nokia phones with a battery that lasts for three weeks, and probably we'll get one of those. Microwave ovens, yeah. Google Glass, air conditioning, Fitbit, my favorite. In cars. Even the Matrix probably got owned by WannaCry. Well, everybody was wondering: wow, this is a huge thing out of nowhere, who's behind it? To be honest, it was not a quiet time for any of us. I tell you that that weekend I worked Saturday, I worked on the Sunday, I worked the whole day on Monday, and we were
wondering who is behind it. It's not surprising that somebody took a zero-day developed by an intelligence agency and used it to build a worm, but still, the code base of this threat is kind of new, so who is behind it? It wasn't until, I think, about 11:00 in Romania, or maybe around that time, that Neel Mehta from Google posted his beautiful tweet, which confused a bunch of people, while the other half, let's say, said, "Yes, obviously." So what did he mean by these numbers? Obviously we have two hashes and two offsets, one for each sample over there. What he meant, actually, is that he pointed to one
of the WannaCry samples, an older version, which actually had a subroutine that is identical to one in a sample previously used by the Lazarus group. It was a way of saying that we know this WannaCry thing is a project of the Lazarus group which, if people don't know, is a North Korean APT. Recently, of course, they've been indicted for this as well. But of course the question is how Google managed to do this when everybody else was struggling over the weekend, wondering what this WannaCry thing is about. Now, if you remember, in 2011 Google bought a company called Zynamics, and I had a talk in 2014 with a guy working at Google, and
we were talking about how to find samples. I said, "I have this pattern I want to search for, but if I search for it in our collection it will take me, like, two months," and he said, "Come on, dude, CPU time is cheap. You just spin up ten thousand machines and you do a grep; it's, like, two hours." Well, I go to my boss at the company and say, "Boss, I need ten thousand machines." His answer was not very pleasant. Obviously, this is what allowed them in 2017 to link WannaCry to Lazarus. And to be honest, Google is not the only company with such capabilities. There's another company, which was called binary, which has now been acquired by
CrowdStrike, which can do the same kind of magic, and there's another company in Israel called Intezer, which has a very, very cool technology allowing them to do exactly the same. Of course, it got me thinking: how do you design this kind of code similarity technology without having to buy ten thousand machines or invest huge amounts of money? Actually, it's not that hard. You can, for instance, extract strings, binary strings, out of samples and then just check new files for overlaps. The only issue is that our collection is way too big. At the moment our malware collection is 5 petabytes, so if you want to search 5 petabytes for all these things, obviously
it will be way, way too slow. So one of the things we've done, and that I want to talk about, is APT similarity search using YARA. How many people are familiar with YARA? Wow, that's a lot. For those who don't know, it is a pattern-matching language developed by Victor Manuel Alvarez, who now works for Google, and it's actually one of the most powerful things you can use when you work in security threat research, because it allows you to very easily create rules that match malware samples. I like to think that this is a lot more powerful than IOCs. For instance, let's say I'm investigating some attack and I give you guys a
hash and a domain. What can you do with that? Well, in theory, with the hash you can find a sample, sure, but in some cases each sample is unique, so you'll never find a sample. With the domain, what can you do? Basically you can block it in your firewall, and that's all. If I give you a YARA rule, then you will automatically see the things which are specific to this sample, the unique things about that sample. That's why it's so much more powerful. So what do we do? For instance, we can identify the relevant parts in malware samples and then try to build a YARA rule from that. The only issue is,
of course, that there are a lot of strings in a sample. For instance, a 100K file has about 102,384 16-byte substrings in it, so even if you filter out, let's say, the clean ones, you still have about 30,000 strings in the sample. So how do we know which ones to keep? Obviously we cannot keep 30,000 strings for every single sample. So what do you do? For instance, if we look at these two strings, which are extracted from the same sample, you as humans immediately understand that the first one is more interesting than the second one. The second one is, like, 2000 and a bunch of CC. Anybody knows what CC
means? What was it, SIGTRAP? Well, INT 3, correct. It is INT 3 on the Intel x86 architecture, so a bunch of INT 3s are not going to find us, let's say, unique malware samples. If you know what to search for and what to select, these are the kind of YARA rules that you'll write. For instance, in this one you see there's a bunch of shellcode fragments; they do not appear in any clean samples in our collection. And I like this thing: we have the largest clean files collection in the world, probably about 400 million files, so this can actually help you find a lot of interesting stuff. Of course, this can be further improved. For
instance, you can generate the YARA rule, test it on a set of samples, and then keep only those unique patterns which match across the entire family. This is how you make it more and more efficient, and this is how you find other samples produced by the same actor. Just a few numbers, to brag a bit about our system: we process about a quarter million files per day, we have about 6 billion clean strings in there, and about 10 billion known clean opcode sequences. This actually allows us to do some cool things, which I want to show you next. I call this, you know, attributing APT
malware by common code. One good example here is the ShadowPad APT. I don't know how many people followed this story, but I'll tell you how we found it. Last year we got a phone call from a big customer, a bank, and they said, "We suspect that we have an infection in our network." We said, "How do you know?" They said, "Well, we see DNS requests to some domains which are very shady. They come from a computer which is very sensitive; however, we can't find whatever program or malware is making these connections. There's nothing." We went there, obviously, and we started looking around. We made a
memory dump, we searched the memory dump for the particular DNS request, and we found a fragment of code which was responsible for it. Now, the funny thing is that it was actually inside legitimate software from a company called NetSarang. So we looked at their website, downloaded the installers, and discovered that one of their socket libraries had been trojanized with a very nice encrypted payload, a sophisticated malware system that produced those DNS requests. So who was behind the attack? Well, we were able to find a common fragment of code between plugins from this ShadowPad and plugins observed in a Winnti incident. Maybe some of you have heard about Winnti; it's a
malware which is probably used by several APT groups. Microsoft calls one of them BARIUM, and that one is a very, very interesting APT group which specializes in this kind of supply chain attack: they hack the company, and then that software, which is pretty much being used everywhere, is their entrance vector into the organizations. In case you're wondering what this code is, it's a unique API hashing algorithm which is used only in these two samples. So it's very specific, it's only used by this group, and it makes us believe that they were responsible for this ShadowPad compromise.

Now, another very interesting case is the CCleaner incident, which was published by our friends from Talos as well as by another company
called Morphisec, from Israel. It was, let's say, not pretty obvious who was behind this huge compromise; I think Avast talked about several million victims in this attack. But if you look at the code from this malware, we find a custom Base64 encoding subroutine, which was spotted by our system, and this is something we have seen before in a malware sample called Missl from an APT group known as APT17. When we found this, I posted it on Twitter just for other researchers to check as well, and it was pretty quick, I would say, that Intezer, this is the company, also confirmed the overlap. By the way, the same, well, we
call them genotypes. These fragments of code that we extract from our samples and that are specific to that malware sample, we call them genotypes. So we checked our APT collection with them and we immediately found a bunch of other samples, like this Missl from Aurora Panda, ZoxPNG, Hikit, and Gresim. So there is this overlap of the custom Base64 with all these other samples, which are used pretty much, let's say, by the same guys. And in case you're wondering what Missl is: Kris McConkey, who is here in the audience somewhere, I think, has an amazing presentation about this Missl thingy. It's one of the guys who probably worked as a developer for several APT groups,
and he was able to dox Missl pretty, pretty well, I would say. Another thing is that Novetta published a report about the malware samples used by what they call the Axiom group, and they say that, for instance, tools which have only been used by Axiom include Hikit and ZoxPNG, which share the same code with the malware from the CCleaner incident.

Another case, which is quite interesting in my opinion, is from an APT called Regin. If you're not familiar with it, Symantec was the first to publish about it in 2014, and we followed as well with a big paper on this. It was probably one of the most sophisticated APTs we have seen, and
one of the cool things about them was their attacks on GSM base stations and GSM equipment. Now, the YARA rule that we wrote for Regin or, better said, that the system wrote for Regin, found a file from the Shadow Brokers dump called cnli-1.dll. That was immediately surprising: that one, and pretty much nothing else outside of it. When we looked in there, we noticed there were a lot of functions exported by the cnli DLL, such as CNE file IO, CNE file IO open, and so on. If you're wondering what CNE stands for: yes, it probably means computer network exploitation, which is another word for
cyber espionage, let's call it, the destructive things being called computer network attack. So if you look at the code, what is the overlap? Well, they have wrapper functions for all the, let's say, APIs in the system, so fopen and file open, all of them are wrapped by this library, probably for portability. This way you can link to this library and then run pretty much anywhere, on kind of any operating system. How are we doing on time? All right, two more stories.

Here's another case; we call it the Lamberts. This is another, let's say, kind of sophisticated thing we researched. They've been around for a long time. So
it all started with a sample called the Black Lambert. This Black Lambert was originally discovered by FireEye, who found it together with a zero-day, so they wrote a bit about the zero-day, I think in November 2014, and this was targeting a nuclear research kind of organization in Europe. So we started looking at the sample, trying to find others. For instance, this Black Lambert had a configuration, and inside it had this tool type; it actually says tool type WL, a build and the build number, and a version name. It actually says "version name" in the config. It wasn't long before we found another sample, which again had a tool type,
AA, and we were able to determine that AA stands for Archangels because of a PDB path in the malware. It has a build number which is a bit smaller than the Black Lambert's, so we decided to call this one the White Lambert, just to be politically correct. Then, obviously, we started finding others: there's a Green Lambert, Blue Lambert, Red Lambert. I would say that between October 2014 and October 2017, for a total of three years, we found quite a few of them, each one being significantly different from the others but still sharing some common design principles, for instance using the same C&Cs, the same C&C communication algorithms, the same
persistence mechanism, the same file names, and so on. So this took about three years, and I'm not saying that this was the only thing we were doing for three years; I'm just saying that it took, let's say, about three years to notice all these different ones. Now let's see how this code similarity tech does against the Lamberts. If we compare them, pretty much immediately we see that one of the White Lambert drivers has the same fragment of code as the Black Lambert found with the zero-day, as well as the same fragment as a Brown Lambert, which is probably one of the latest we found. So this takes a couple of
minutes versus three years. Things that we were, let's say, doing in the past over a number of years can now be done in a couple of minutes using this kind of technology.

Well, here's, let's say, a counterexample. I know what you're thinking, this is an amazing technology, but let me show you a case where it fails: the very famous case of Olympic Destroyer. I think Talos was probably the first to write about Olympic Destroyer, and pretty soon different companies started to chip in with their attribution and research. Intezer said, "We found similarities with Chinese APTs," and three of them nonetheless: APT10,
APT3, and APT12. We were kind of skeptical; that looked very strange. Why would the Chinese target the Olympic Games in South Korea? Then Recorded Future came and said, "We found similarities with Lazarus," which maybe makes a bit more sense, because Lazarus is a North Korean APT, the games were in South Korea, and the North Korean APT group Lazarus has previously targeted South Korea with destructive malware. So maybe this makes a bit more sense. If you look at the code, to be honest, we are not disputing it: there are for sure some similarities between this wiper malware used in the Olympic Destroyer incident and an
older wiper malware used by Bluenoroff. So we were thinking, can we do better? We tried our own code similarity system against Olympic Destroyer, and we found pretty much nothing. We didn't find any similarity with the Chinese APTs, but we did find something very odd: the system did actually spot one tiny, very tiny similarity with a sample previously used by the Lazarus APT, in particular this Bluenoroff subset, and this sample was kind of famous because it was used in the Bangladesh bank heist. However, this was just a small fragment of bytes; it wasn't, let's say, an entire subroutine. And even worse, it was not even in the code; it was in the header, in
the PE header. We discovered that the Rich headers of both these files were actually identical. How many people are familiar with Rich headers? All right. Again, for those who don't know, the Rich header is kind of a nice thing that Visual C puts into every executable you compile. Well, the linker actually puts it there, and it's like an encrypted list of all the different libraries and tools that were used to build the sample. So what does it mean? When you build different malware samples on the same system, with the same unique combination of Visual C version and patches, it is quite likely that the Rich header will be the same. And actually we
discovered that in some cases this is like a unique fingerprint for the developer of that malware. So across, let's say, the four billion samples in our collection, to have just two with the same Rich header, both of them suspected of being used by a North Korean APT, this cannot be a coincidence. Well, later we discovered that this was actually just a fake, a copy. The attackers behind Olympic Destroyer, which is an APT group we are now calling Hades, just copied it as a very cool false flag. So they copied the Rich header. I really don't know if they hoped that someone would find it; I don't know if they actually knew
that we would find it, but it was interesting that our technology actually spotted this, and if it wasn't, let's say, for the humans checking the results, we would actually have been fooled as well, and probably led to believe that it was the Lazarus APT group that did the Olympic Destroyer attack.

So, just a few more examples; bear with me. WannaCry: this is a YARA rule that we built for WannaCry from all these opcodes. It catches Bluenoroff, Manuscrypt, and Decafett, which are all malware samples used by Lazarus. It's pretty small, just five strings, that's it. Five strings, and all of them only appear in WannaCry and all these
different malware samples used by Lazarus. Now, I see Paul is taking notes, writing down the strings. Here's another one. This one catches DarkHotel samples based, let's say, on the code that we discovered in an incident we named ScarCruft. Later we discovered that this was pretty much the same APT, except there was another very convoluted false flag operation, in which ScarCruft hacked the same command-and-control server used by DarkHotel and used it, let's say, one week later in an effort to confuse us. So this is something which, you know, happens: APTs hack the same C&Cs just to use the same infrastructure in attacks and confuse the
researchers. So I know that, for the people who wrote YARA rules before, this is probably how they feel when they see these rules with the opcodes. I guess it's pretty much a new level, and it probably simplifies the whole question of doing attribution of sophisticated attacks. So maybe it's something you want to call attribution 2.0: automated attribution by code similarity. What does it mean? Well, I think that tasks which took us years in the past can now be done in a matter of minutes, like I have shown you with the Lamberts. However, I think this technology will become so widespread over the next years that it will not only be partly automated, but
I believe that it will have a very significant effect, and that is more false flags. For instance, the Lazarus guys: after the NSA said that WannaCry is Lazarus, North Korean activities, they immediately started adding Russian keywords to their malware, just because it's fashionable to blame Russia nowadays for anything. But the thing was, their Russian keywords were really crappy in the way they were written. For instance, the word "chainik", which means a teapot, was written in a terrible way, so it didn't fool anyone. But I think the Olympic Destroyer case with the Rich header is a beautiful example of how this code similarity
can actually be fooled, and probably how Intezer and Recorded Future were fooled into giving this mistaken attribution to Chinese APTs and Lazarus. So what will it mean? Probably all these APT groups will start moving away from code and instead just use scripts. I mean, when everybody is using PowerShell, Cobalt Strike, and Metasploit, it's becoming almost impossible to know who it was, right? So I'm actually surprised that we don't see this more and more, because it makes the attribution question so much more difficult.

I want to end by saying a few words about the future of security. I like to think that we're probably doing a pretty good job catching cyber espionage, and I would say we're probably doing a good job
catching cyber sabotage as well, so CNE and CNA. There is one thing which I believe holds a lot of danger, and it's something that we haven't been able to tackle very well so far, and that is mass opinion manipulation. As we have probably seen throughout these elections and over the past two years, people are very easily influenced, and I think that we as a community can do better, and should do better, at trying to spot this mass opinion manipulation and trying to help the users. If we don't do that, the risk is very simple: it will cost us our democracy and our freedom. So with
that in mind, thank you very much, I appreciate it, and happy hunting. Stay foolish, stay great.
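
The string-overlap idea described in the talk (cut each sample into 16-byte substrings, discard the ones that are known clean or trivially uninteresting such as INT 3 padding, then look for substrings shared between samples) can be illustrated with a minimal Python sketch. This is not the speaker's system: the function names, the in-memory `clean` set, and the "dominated by one byte value" heuristic are all assumptions made for illustration, and a real collection of the size mentioned would need an index rather than Python sets.

```python
from collections import Counter

WINDOW = 16  # substring length mentioned in the talk


def substrings(data: bytes, size: int = WINDOW) -> set[bytes]:
    """Return every distinct `size`-byte window of `data`."""
    return {data[i:i + size] for i in range(len(data) - size + 1)}


def is_trivial(chunk: bytes, max_share: float = 0.5) -> bool:
    """Hypothetical heuristic (an assumption, not from the talk): reject
    windows dominated by a single byte value, e.g. runs of 0xCC (INT 3
    padding) or 0x00."""
    if not chunk:
        return True
    top_count = Counter(chunk).most_common(1)[0][1]
    return top_count / len(chunk) > max_share


def candidate_strings(data: bytes, clean: set[bytes]) -> set[bytes]:
    """Keep windows that are neither in the known-clean set nor trivial."""
    return {s for s in substrings(data)
            if s not in clean and not is_trivial(s)}


def shared_strings(sample_a: bytes, sample_b: bytes,
                   clean: set[bytes]) -> set[bytes]:
    """Candidate windows that appear in both samples."""
    return candidate_strings(sample_a, clean) & candidate_strings(sample_b, clean)
```

In use, two samples that embed the same routine share its windows, while padding that both happen to contain is ignored; each surviving window could then be emitted as a hex string in a YARA rule and tested against the rest of the family, as the talk describes.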