Malware and Machine Learning: A Match Made in Hell

Name: Malware and Machine Learning: A Match Made in Hell
Uploaded: 2024-01-12
Duration: 36 min 55 s

BSides Berlin · 2023 · 36:55 · 2.7K views · Published 2024-01

Speakers

Mikko Hypponen

Tags

Category: Policy, Technical

Topic: AI Security, Malware Analysis, Threat Intel

Style: Keynote

About this talk

Mikko Hypponen explores the collision of AI and cybersecurity, examining how generative AI and machine learning are reshaping both attack and defense. The talk covers deepfakes, automated malware campaigns, and the emerging arms race between AI-driven defenders and attackers, while addressing the risks of open-source AI models and the need for regulation.

Original YouTube description

About the talk: Defending against cyber attacks is a never-ending race. Next, we're likely to see fully automated malware campaigns, using machine learning or generative AI. We defenders have been able to automate much of our work, enabling excellent detection, analysis and reaction times. Next up, attackers will do the same. Once the attackers migrate to automated operations, it will be a game of a robot against a robot. And then we will see that the only thing that can stop a bad AI is a good AI. About the speaker: Mikko Hypponen is one of the most recognized cyber security experts world-wide and a best-selling author. He has lectured at the universities of Oxford, Stanford and Cambridge and has spoken 8 times at Black Hat. Mikko works as the Chief Research Officer for WithSecure in Finland. https://twitter.com/mikko

Transcript [en]

[Applause] thank you for the introduction thanks to all of you for being here today BSides are very close to my heart as are all community events around cyber security because cyber security has been my life I've been working in this field for more than 30 years in fact some of you know that I uh always carry something with me to remind myself where I'm coming from one of the first viruses I ever analyzed this is a five and a quarter inch floppy from 1991 containing a virus called Form.A and it's an example of a cyber attack from a different era this is the offline era I was working doing reverse engineering for viruses which were carrying which

were spreading around the world when people were carrying these with them because nobody was online at the time we are living through technology revolutions the biggest revolution during our time so far was the internet revolution and for years I thought that that's how we will be remembered our generations the people who lived in the early 2000s would forever be remembered in history books as the first people in mankind's history who went online mankind walked the planet for 100,000 years offline we were the first ones to go online and mankind will be online forever from this moment on that's what I thought in fact that's what I wrote in my book I had a book come out last year

this year it came out translated into Deutsch and I'm mentioning this because when we get to questions and answers the first four persons will get a copy of the book so that's good to keep in mind and one of the forecasts I mention in the book is what I just told you we will be remembered forever as the first generations who got online and I think I got it wrong I no longer believe that in fact what I now believe is that we will forever be remembered as the first generations who were gaining access to artificial intelligence so let's start with the obvious one which is a deep fake video greetings everyone my name is Mikko and I

hunt hackers thank you all for joining our conference today it's great to see you all greetings everyone my name is Mikko and I hunt hackers thank you all for joining our conference today it's great to see you all that's Laura Kankaala threat intel lead for F-Secure that's Mikko chief research officer for WithSecure Laura is real Mikko is not there's nothing new here you've all seen deep fakes however this is the level of homemade deep fakes today done on one home computer with one Nvidia GeForce card and the end result is good enough that I can't tell that it's not Mikko and I am Mikko now it's not real time yet homemade deep

fakes of this quality are pretty impressive but it takes a while to render them but of course technology only gets better computers only get faster we will see homemade real time deep fakes of this quality in a very short time so generative AI is changing the world for the better and for the worse but why is it happening right now why 2022 2023 in two weeks it's going to be the first anniversary of ChatGPT and I suppose most people were exposed to the latest craze of uh generative AI with ChatGPT I actually had been playing around with uh GPT since GPT-2 some of you might remember GPT-2 which wasn't chat at the

time you would write text like the beginning of a sentence and hit a button and it would complete the sentence for you that's how it used to work before they changed the user interface but why now and this is a great question considering that the first time I read about artificial intelligence was when I read this magazine this is a Finnish popular science magazine called Tekniikan Maailma which is Finnish which means the world of technology yes I live in Helsinki and I read this magazine when I was 13 years old because this magazine is from April 1983 and this magazine describes a future where all information the mankind's knowledge is no longer on paper it's

data and it sees a future where we could use this data in these massively large machine learning algorithms to teach human knowledge to machines this is 1983 it took 40 years but that's exactly what we've done today all new knowledge we create by default is data everything is written on computers on the networks in fact we've gone back and we've digitalized all the old data as well like you can actually read this magazine from 1983 on the magazine's website it's scanned in as a PDF so we've taken all the old data all the old books all the old magazines and newspapers and we take all the new data and it's all bits we can store it all and we can run

it all through these algorithms now I'm assuming that you all have read the seminal four-part trilogy Hitchhiker's Guide to the Galaxy by Douglas Adams and those of you who haven't what's wrong with you and if Douglas Adams would be alive he would say we've done it we've created it we have the guide don't we because in the books he describes a small handheld device which everybody has which has all human knowledge can answer every question if you have a smartphone and I know you do if it's online and it is and if it has let's say the ChatGPT app we're pretty much there aren't we we've created the guide but it's not enough that we had

all the data online we still needed to go through revolutions in algorithms and in computing power and in fact I think the biggest reason why it's happening right now why this revolution is happening right now is the explosion in computing speed some of you are in this room right now with an iPhone 15 iPhone 15 uses 3 nanometer technology 3 nanometer technology in the Apple chip which was built by TSMC in Taiwan using this machine built by ASML in the Netherlands and what three nanometers really means is that we have these tiny tiny drawings where the lines the space between lines in the drawings is three nanometers and then we use these machines to burn those images into

silicon die we use light lasers to burn those images the problem we have is that these drawings are so tiny three nanometers between the lines that light doesn't go through because the wavelength of light is 400 nanometers it won't fit and the way we do this is with magic that's the closest I can really describe how it's done because what they do is that they take tiny droplets of tin suspend that droplet in midair shoot it with lasers turning it into plasma and then they use that plasma which exists for a fraction of a second as a lens to condense the wavelength of light from 400 nanometers to 3 nanometers so it fits through these tiny drawings so they

can actually build these chips what I'm really saying is that this is the hardest thing to do in the world nothing is as hard as this it's easier to put a man on the moon than to build one chip like this it's so hard there's only one company left which can do it and the end result is Nvidia H100 the basic building block of the machine learning frameworks and the data centers that are being run by companies like Microsoft to teach all these algorithms €30,000 a pop if you can buy them and you most likely won't because they are in such a high demand another great example on the computing power explosion is the fact

that the phone you have in your pocket would have been classified as a supercomputer 20 years ago I recently was visiting the third fastest supercomputer in the world it's a joint EU project it's the size of two full-size trucks it runs on its own power generator and they use the excess heat from the supercomputer to heat an entire city that computer cost 260 million euros that's what supercomputers look like what I'm saying is that the phone in your pocket in 2003 would have been roughly the 100th fastest supercomputer on the planet and this doesn't cost hundreds of millions it cost a couple of hundred euros and everybody has one and it doesn't need a power generator it runs on a goddamn

battery like this is what happened in the last 20 years and this is why this revolution is happening right now so what are the downsides the upsides are pretty obvious we get great new resources we can come up with completely new ideas with these large language models we can generate content video text images come up with new ideas I actually know a friend of mine who's working in a creative field he's coming up with story lines and he's nowadays spending his evening jogs with AirPods in his ears and ChatGPT app on his phone and he's sparring with ChatGPT like talking to it hi I'm working on a story it's a murder mystery and there's this idea

that it's going to be like a separate isolated building with like 10 people one of them dies what do you think what could those people be and then ChatGPT speaks to him well one of them could be a baron with a shady history just coming up with ideas which I mean sounds like science fiction he's been doing this for a couple of months already speaking every day with an artificial intelligence to come up with ideas but we are not in the ideas business are we we're in the business of cyber security and that's why we worry about things like these hi my friend got an idea of cryptocurrency exchange and now he's

product entering the world market on June 7th he offers the best conditions on the market and an opportunity to get some crypto for free it's your chance go to bitr x.com and get your bonus do not go to bitr x.com to get your bonus it's a scam yes a deep fake deep fake of Elon Musk which sounds like Elon Musk looks like Elon Musk which is posted on Elon Musk's own service which is pushing a cryptocurrency scam now there's tons of fears around things which can be done with deep fakes and they are very valid fears however we haven't seen that many attacks like this this is pretty blatant pretty obvious this is a scam they're

trying to scam people out of their money using a deep fake of a celebrity but I only have two examples to show you you would think this would be a bigger problem by now but it really isn't that common to find this I have this video the other example I have is this guy you might know him as well the second most popular YouTuber in the world you're watching this video you're one of the 10,000 lucky people who will get an iPhone 15 Pro for just $2 I'm MrBeast and I'm doing the world's largest iPhone 15 giveaway click the link below to claim yours now do not click the link you're not going

to get an iPhone 15 for two bucks but that's it the rest of like real world deep fake examples I have are not about trying to steal money there's been a couple of examples of political framing or scams or political um trying to get your political enemies to look bad or things like that but again you would think this would be a bigger problem but it will be so that's problem number one deep fakes being used for different kind of attacks problem number two deep scams not deep fake videos but just scamming people at scale think auction scams Airbnb scams romance scams right now it's an attacker scamming one victim at a time or two or

three victims at a time in a language he can handle or use translation engines to cope with it you could scale this with large language models so that one scammer could be scamming 10,000 victims in all possible languages at the same time that's problem number two problem number three well malware here's an example of malware using large language models this is LLMorpher we found this in April so it's been around for a couple of months you can actually download it from um from GitHub right now if you feel like it it carries with itself an API key for the OpenAI API and then every time it replicates it's a Python worm it finds Python files

on your system when it finds one it will copy its own functions into your files but it doesn't copy them direct it has an English language description of each function then it calls the OpenAI API asks for this kind of function could you write me a Python function which does this this this and this and then it copies the end result code into your files infecting them which means the malware looks completely different every time it replicates and it would be trivial to modify this to infect any programming language anything that GPT would support but what I actually really worry about isn't this it's full automation of malware campaigns security companies have been using machine learning and automation

for a very long time the company I work for WithSecure our uh our founder the guy who set up the company in 1988 35 years ago he's been a big promoter of machine learning artificial intelligence for a very long time in fact he went back to school he went to Stanford five years ago to study how to program machine learning systems he's the chairman of the board of our company right he was actually the chairman of Nokia at the same time that's leading by example when your top manager goes back to school to learn how to program cutting edge new technologies which he believes is going to change the world it really sets the pace for the rest of the

organization so we've been heavily involved in building AI based systems for security in fact we started our first automation project 18 years ago and all that time we've been waiting for the enemy to start to do the same thing because right now defenders are using automation heavily making them very fast very quick in finding new attacks analyzing them automatically building detection testing detection and deploying detection back to our clients while the attackers are still working the manual way which means when they deploy a new ransomware or whatever defenders will block it fairly quickly but they don't realize it takes hours to figure out that hey our domain is blocked or hey our emails are being filtered as spam or our

command and control server has been filtered out or our binary is being detected by EDRs we have to go and recompile it and then they go and recompile it manually so it's like machines working at machine speed against humans working at human speed and right now it's the defenders which have the edge and this is what's going to change and it hasn't changed yet how do we know because they are still slow it still takes them hours or even days to react to defenders who have reacted with automation and this could change tomorrow there's nothing preventing this from changing tomorrow it could have changed already this is not rocket science this is not very hard to do they

simply haven't done it yet and when they do this could be tomorrow then we will see which one will win good AI or bad AI which one is faster basically we believe we hope our systems would have the edge would be faster would be better we've been building them for years and years but that's all it is it's a wish it's a hope we'll see for real fairly soon and then there's the whole discussion about openness of these technologies now I'm a big fan of open source I'm a Finn I live in Helsinki I went to Helsinki University together with Linus Torvalds the guy who created the Linux kernel I like open source but there's one place where it is

problematic and that is very powerful AI models it's problematic because these models have tons of security and safety precautions built in anybody who downloads the or who has access to source code can simply remove all of that and then they can use these systems to do to do what the hell they want and we've seen this already we have closed source image generators and open source image generators the open source image generators are the ones which are being used to create revenge porn or other unethical content we have closed source large language models and open source large language models the open source ones are the ones which are used to create phishing emails right now or to

run romance scams because they can just remove all the restrictions and it could get worse think about all the dictators in the world famously Vladimir Putin of Russia said already in 2016 the one who controls artificial intelligence controls the world I don't agree with President Putin in anything but he's right and I'd much rather not see a dictator gain access to the next level of artificial intelligence technologies the more of this material that is available simply by downloading the source codes and the weights from the internet the riskier it is that that's what we would be seeing and I guess we've all seen the news about OpenAI for the last 20 hours OpenAI the number one most public

company in the world for this year has a crisis about 20 hours ago the board of OpenAI fired the CEO Sam Altman in protest and in support of Sam Altman Greg Brockman another founder and a board member quit he was the chairman of the board and four or five um senior OpenAI executives have left as well including their director for AI safety what's really going on I don't know but I think it's possible that this company will uh will not survive it's perfectly possible that we will see a mass exodus of people leaving really depends on if there's a backstory about what really went down and why was Sam Altman fired one theory right now

is that it's a fight about AI safety and security but we don't really know I'll end with one note about safety and security that they've been building inside OpenAI because I've been impressed about the resources they've been putting into safety and security the business model of all of OpenAI is highly unusual the biggest investors in OpenAI like Microsoft which has invested 12 billion they have no board seats they have no votes they decide on nothing let me repeat that the biggest investor in this company decides on nothing this is highly unusual they built the company so that there's a nonprofit which has all the decision power and then a subset of that nonprofit is the for-profit company

which is where Microsoft has been investing all the money but they call no shots why because the company believes that there's more money in artificial intelligence especially in AGI than in anything ever and I think they're right if we hit artificial general intelligence and that is their goal or artificial super intelligence then I do agree there's probably more money to be made there than with food or fossil fuels or anything ever and it's quite obvious that when there's so much money at stake then the decision making suffers so they've taken money out of the equation to try to keep it safer and more secure when you look at the GPT system card and just browse through the

list of people working in safety and security you'll see an endless list of red teamers who are trying to break the security of well right now GPT-5 to try to make it safer how well my favorite example is that around a year ago before none of us had seen GPT-4 they did a test the test where they gave GPT access to internet and tasks go and do this go and do set up a system like this so and it tried to do the tasks but it couldn't do everything because it had to like set up virtual machines and buy hosting space and things like that so they gave it money so now we have a large language

model which has access to internet and money and while it was then using the money to spin up some Kubernetes systems on a hosting space it had paid with the money it had it ran into CAPTCHAs and at the time GPT-4 couldn't crack this so what did it do well it hired a human to crack them for itself and that's not the best part the best part is that after doing it for a while the human challenged the machine saying that hey why do you need me to crack these for you are you a machine and GPT answered the human by lying and saying that oh no no of course I'm not a machine I'm visually impaired

I can't see very well could you please crack these for me and then the human cracked the CAPTCHA codes for the machine and this is exactly the kind of things we don't want to see these systems doing we don't want them to lie to us and that's the kind of testing they are doing to figure out these problems before they become something that they would actually ship to real users so we will be seeing deep fakes more of deep fakes than we've ever seen before right now it's still a relatively small problem for example I don't have a single case where a business email compromise or a CEO scam would have successfully been done with a deep fake

voice or video and I know that if you follow this field you've read reports of this there's been reporting that yeah this company was fooled with a Teams call and it was the CFO and it wasn't I've read the same articles I'm just asking for evidence I'm not challenging that it couldn't be done or that we wouldn't have the technology to do this I'd just like to see some evidence because we've so many times seen people who have been fooled not with deep fakes but with just someone who sounds like the CEO or the CFO and they are completely fooled and it's an almost too easy excuse I don't know I'm not stupid

I wasn't fooled by some stupid uh CEO scam this was a completely new kind of attack which would have fooled anyone so it's also possible that this is happening nevertheless it will be a bigger problem in the future then we have the deep scams scamming consumers and co-producers in scale and we will have malware written by large language models and then we will have complete automation of malware campaigns which could start already tomorrow regardless of all of these doomy and gloomy views I'm still kind of hopeful and optimistic about where this technology is taking us if we take these problems seriously enough we can gain access to technologies which will help us humans hopefully solve some of

the toughest problems we have the kind of problems we can't solve by ourselves and that I am an optimist about thank you very

much was wonderful thank you so much Mikko and as already hinted we are taking questions so I have three copies of the German book which came out in Deutsch in July and one copy in English so uh we'll see how this goes but just so you know the title of do iset East and you can actually find it on vernetzt punkt ist which is kind of cool because dot ist is actually the top level domain for Istanbul but I just figured I could use it here so let's have the first question thanks for the talk it was really amazing so my question is what's in your opinion what can be done by governments or big

tech companies to prevent the gloomy scenario that you envisioned one thing which is also highly unusual about the frontier AI companies the startup AI companies especially like Anthropic and OpenAI is that these companies have gone to governments themselves saying that you know this technology we're building can be dangerous you should regulate us come regulate us that's not what normal companies do that's exactly the opposite of what normal companies do and we have seen governments around the world wake up now I'm not a big fan of regulation I think regulation quite often fails and it's quite hard to do I think we have big challenges with the current EU AI Act which in the early drafts looked

really nice with the version we have right now doing the rounds they changed it towards the end with many drastic changes which to me don't look too good regulation is a big challenge but I suppose that's the way we do it and we have to remember that regulation and laws in technology always it takes a while when I started in my career when I analyzed the first viruses in 1991 it wasn't illegal to write viruses in any country anywhere in the world we passed the laws and of course today it is illegal everywhere so regulation and laws will follow but it's going to take a while and you have a book waiting for you let's take a second

question I first saw a hand somewhere around was it you there's one over here or yeah thank you hi thank you for the talk um I was wondering do you think at any point um open like basically AI could claim intellectual property to its product or the company you mentioned your author uh friend or even if you think like okay hey it helped me write an application and that is now wildly popular very valuable and now open uh the company comes back and says hey I have actually authorship to this that's a great question and we have a big discussion to be had about who has the rights for the content generated by technologies which have

been taught with the whole human knowledge including every book ever written I'm really amazed about engines like Claude and also GPT about how well it speaks Finnish which is my home language they speak better Finnish than I do and I am a Finn and no one taught them the language they picked up the language simply because they were given access to all the Finnish content ever written including all the commercial content all the books and all of that and yes I guess it's great for humanity in general that we built this technology but it does pose great questions you could say the same thing about images I've been playing around with for I have a I'm paying 10 a month to

Midjourney so I can upload images taken by professional photographers images with a copyright and then I can teach the system and then it can generate new images which are licensed under Creative Commons 5.1 which means you could print those on a T-shirt and sell them and that would be legal but even if it's legal is it okay I certainly don't have the answers here but I do know that that's the discussion we really should be having let's take another one next one there's one right here so um about AGI artificial general intelligence I was wondering um quantum uh quantum computers versus AGI what will come first and what will have a bigger impact and regarding quantum

computers they still exist but I mean on a larger easier scale like for example breaking or say sure sure quantum computers definitely do exist I've touched one in Helsinki I can tell you it was really cool literally cool um but it only has very limited amount of qubits which means you can't really do anything too practical with it so we are close in fact if you go back to 1980s you can probably find Tekniikan Maailma magazine speaking about quantum computers 40 years ago there's many similarities in the timelines of AI and quantum computers both have been known for decades we've gone through these springs and winters and falls and then suddenly now

we are in the middle of an AI summer when will we hit quantum summer we don't know I'm not going to guess but it could happen soon and the thing I think about a lot is what's going to happen with machine learning when we do have quantum computers with the capability because the big explosion in machine learning algorithms partially it comes from better algorithms like the innovations done at Google which really did most of the great research around Transformers but about the computing capability TSMC ASML switch from CPUs to GPUs and the difference between CPUs and GPUs as you know is parallel computing that's how we were able to teach these much better well that's what quantum does even

better so we might hit completely new level of machine learning systems with quantum computers if they would become reality in the near future but we'll see and we take the last question and then we are ready to go to the bar yeah um for the risk of deep fakes can't we actually use AI for that I mean imagining like we have a Teams call and there is an AI agent on the Teams call that watches all the videos and if he detects like oh this looks faked and then kind of tag that I mean I imagine that generating a deep fake that is really a fake that cannot be detected is much harder than to detect a fake

because of light and shadows and stuff which is hard to get right but easier for an AI to detect to being wrong detecting deep fake content voice still image video image can all be done with algorithms but it very quickly becomes and has already become um a race between cat and mouse all the detections we build will quickly be counteracted by the people who don't want to get detected in fact the very first deep fake videos never blinked and the first detection mechanism simply looked for people who never blinked very simple very effective and of course they changed it very quickly and now the fakes blink um I've actually had meetings with OpenAI about different

mechanisms of uh uh watermarking uh image content and even text content um it is a hard problem to solve and it's probably unsolvable with open source uh systems whatever watermarking you would put in there if it's open source you can just remove that technology as well so yeah I'm sure we will try to detect fake content with different systems and then we'll get better fake content which cannot be detected by current systems um one thing I've started doing myself is that whenever somebody wants to take a selfie with me I actually have a don't have it with me right now unfortunately but I have a rubber finger and then I pose with

people and I have a sixth finger in one hand and then later I can claim that it's a deep fake so that's it those of you who got a book please meet me over here rest of you I'll see you in the bar thank you [Applause] thank you so much Mikko thank you all for coming here joining us staying until the end last but not least before closing the session I would like to say a big thank you to IET for moderating the whole event thank [Applause] you thanks a lot all of you
