Is malware getting smarter? Exploring AI-driven threats and solutions

BSides Dublin, 26:03, 45 views, published 2025-10

Speakers

Candid Wuest

Tags

Category: Technical

Style: Talk

Mentioned in this talk

Tools used

SentinelOne

Languages

PowerShell


So thanks for joining the talk. My name is Candid Wuest. I've been in the industry for more than 25 years on the cybersecurity side. I helped to build two EDRs which are still used, for what it's worth. Currently I work for xorlab, an email security provider that we developed in Switzerland. But this is all about AI. I had to use one of those buzzwords to get in, and it seems like it worked. Of course, I shrank everything into 20 minutes. There's more information, so bear with me if some of the parts are small. In general, I basically just wanted to know: will we have the Skynet ransomware that goes through all of your firewalls and so on, and you're basically all out of a job because nobody can defend against it, right?

And if you look at it, yes, of course you can generate malware with AI. I don't recommend that you do it. In most countries it's illegal, at least depending on what you do with it afterwards. But yes, you can use any of those models. Depending on whether you use the newest ChatGPT, Gemini, whatever, they will have some guardrails, so you need a jailbreak to get around those. You can buy those, you can find those, or you use one of the open models, like WhiteRabbitNeo, which is kind of a popular one. You download

some. If you can download it locally, you can also overwrite the weights, so you can get around every, well, nearly every one of the guardrails. So assume that it is possible. But is it any good? Did you ever try to do it? It needs a lot of hand-holding. If you just say "Hey, I want to get rich", that's probably not working. If you say "Generate me a ransomware", it will generate you something. When I tried, it gave me something with a symmetric key which was still in the file. So probably no one's going to pay you millions, because the key is still there. So you need to do it step by step, right, going

in and saying: okay, maybe I should use an onion site for getting the thing. Oh, maybe I should wipe the key after I used it. Maybe I should do this and this. And yes, of course, you can try to do it by saying, "Hey, just use this APT threat analysis report. Do exactly what they did." But usually they don't really have all the information in there, and you don't really want to use the same IoCs either, right? And of course, the crux with AI is that everything is changing. So now you have Codex, Claude Opus 4 and all the new models. So they are getting better, don't get me wrong. By the end of the

year, maybe next year, we will be at the point where you can actually make some relatively good malware with AI. But if you look at the moment, there are many reports, like the one from VirusTotal. They analyzed about 650,000 samples and, in a nutshell, they say: nothing really sophisticated. So yes, you can generate your small thingy, but that doesn't really matter, right? When I read the news, it's always about lowering the bar: now anyone can write malware. They already could, right? It's not something new. So for me, compare it with your classical malware toolkit, malware as a service. If you have no idea, you just Google, you find any of those hacker forums, not even on Tor, just on the plain net, and pay. You

probably get scammed, because that's how it is, and then you learn a bit, you pay again, and you might have your LockBit coder or whatever. Compare that to now with generative AI: if you have no clue at all, you need to find an open model. So either you pay or you find one. Then you need to have an idea, as I said, of what malware actually does. So just "ransomware" is not good enough, you need to know a little bit more. And in at least about 10 to 20% of cases it will actually have some kind of issue with the coding. So you need to know how to compile, how to debug. Sometimes imports are missing. So there are always

some issues that you need to fix. But of course, yes, once you have that, you will be able to generate, and probably, quote unquote hopefully, you learned a few things. So yes, you might be faster in the future, but it has already been easy. It will be easier in the future. So it doesn't really change too much. And one argument to support that is the number of new malware samples being found in the wild, here taken from AV-TEST in Germany. I mean, at 1.2 billion, it doesn't really look exponential to me, right? Yes, it's still going up, but it has been going up as well since

ChatGPT came. So yes, some will use it, but it's not that we're suddenly seeing a new wave, right? It's definitely scaling, making it easier for some, but it's not that wave that's going to just overrun us all. But there are interesting things. One of the first, of course, is polymorphic or, to be pedantic, metamorphic malware that you could generate with LLMs. The idea of polymorphic and metamorphic malware is that with every new infection it changes completely, at least the outer shell, while keeping the same functionality: still encrypting your files, stealing your bitcoins, whatever, but it's different from the outside. People used to do that in the '90s because signatures were the thing to do with

antivirus. So you want to bypass those, right? Kind of make it like a different gift wrapper, but inside you still have the same, well, PlayStation or whatever you're giving away. BlackMamba, ChattyCaty: there are a few proofs of concept that show you this. What they all do is take the malware itself and embed in it English natural language prompts, and then send those to the LLM and say, "Hey, I need code, like Python code, that steals passwords from a browser." And you will get something back. But because LLMs are not really deterministic, you will get something different back every time. I mean, depending on the temperature level, of course. Sometimes

you will get something which does not work, and you have to accommodate for that. But in many cases, yes, you can do something which will be different, and therefore classical signatures won't work. Of course, there's also not really any assembly code for the bad stuff inside your malware, because that's generated on the fly and hopefully executed in memory. So yeah, that's definitely something you can do and, as I said, some people have. But again, it's the same, or at least similar, to all those malware toolkits. You just get something point and click and say, "Oh, I want it to spread over my NAS. I want to use, I don't know, SharePoint to spread as well

and this and this." The smarter ones would then actually just add that code and compile it. The other ones just commented it out, so it's technically still there, which of course isn't that good. But it also means that if you go out to your favorite LLM for each infection, you're generating a lot of noise, right? I don't want to be too malicious here, but any decent admin nowadays will check the network for any traffic going to ChatGPT, Gemini, Grok and all of those, because you don't really want your employees to leak all your customer data there. So now, if another machine communicates a lot, that should probably generate a hit on your

firewall or whatever you use on the proxy side. So it's already noisy. Yes, of course, you could download the whole model. As I said, those are usually one or two versions behind. That works, but it's still quite large, so you still have to download quite a few megabytes. And of course, once you run it, you don't really have an Nvidia whatever GPU in the machine, so you might end up being detected because it looks like a crypto miner that's just using 99% of your CPU. On the other hand, you can still detect the stub loader, because there is something, right, that's communicating with the LLM. So there is something which is not changing too much, and that's the part

you want to detect, like we did in the '90s. So that of course works. And also, behavior detection still works. It doesn't really matter if the script comes from your neighbor or from an LLM. If the code is trying to encrypt all your files or steal your wallet, then that's probably something that should raise at least some flags in any decent EDR or XDR. So that's still doable. But of course now we have agents, agents, agents, a.k.a. swarms: different AI agents communicating together using MCP, using agent communication protocols like ACP. So now it's no longer just one agent but multiple agents, and they're using tools, like access to files or sending emails. And of course we all know that that of course

spells disaster right? So since agentic AI is kind of here well why not use it at least for purpose of science to create the malware that's using it right so agent AI can help you find best strategy for your goal to find all the sensitive information can potentially learn although it's kind of hard because the learning factor is once your malware is stopped by the VR that's the thing you want to learn of so it's more than if you don't hear back you're probably failed and you need to backtrack and do something else. And of course, you can adapt and say, "Oh, on that one, I see there's a semantic edr. On that one, I

see CrowdStrike." So you behave differently because you know how to bypass them. Although that's what the normal attacker does anyway. And there have been a few interesting demos, like the one from Symantec. They used the browser agent, Operator, sorry, from OpenAI, to go through and just say, "Hey, I want to create a phishing attack against this organization", and it automatically finds the targets on LinkedIn, finds the emails of the targets, creates a phishing link and then sends it, because now we can do all of this. And of course now, with OpenAI Operator or any agentic browser, you can do all of those automations already. And on the pentesting side, so

finding your vulnerabilities, there's been a lot of research, right? AIxCC from DARPA in the US, or Google's Big Sleep from Project Zero. Yes, AI can find zero days. Well, bug bounty hunters can as well, but now of course it's faster and you don't really have to pay them. So let's build our PoC. It should be autonomous, right? We give it a task: I want to get that Lamborghini. Then it should go off and do something with it. It should be metamorphic, because why not? It should execute in memory, so we don't really have much on disk. Of course, when I exfiltrate the data, some information should go out as well. We can actually use LLM requests. Many times you can say: oh, summarize me that

website. You just pass it as a GET parameter. Some of those, like OpenAI, have started to limit it, and some URLs are deemed safe, some are not trusted, but GitHub, for example, is still trusted. So you can use those to pass information. And you can also do Google searches, as in create a website with, for example, two prime numbers and just search for that, and your site will probably be the only one that shows up. That's already an information flow, and it works even without summarizing a website. And of course all the MCP tools will allow you to do more and more exploration. For mine, I used PowerShell, because PowerShell is easy to obfuscate, easy to

run stuff in memory, but you could use any of your favorite languages. You see the models that I tested here. Yes, there are newer ones. As I said, every month there's something new, but I didn't really have the time to go through all of them. So, let's start. The beginning is just your classic metamorphic agent, as we had before. So you load from your malware what you want to do. The prompts are stored in the registry. Of course you need to know: are there any proxies? Maybe there are already some API keys that you can use for your favorite LLM tools. And of course, I'm skipping the part where the

code is executed on the endpoint. That's left for the reader at home. But just imagine your team's vulnerability that we discussed before, or something else, to get your payload onto the system. Now it's sending your prompt, it's generating the code. And of course, this will, as I said, be something that you can detect. OpenAI, Google and Anthropic are all looking at those requests. They're starting to block API keys. So you just need a bucket of keys that you stole, so you can iterate through them once one gets properly blacklisted, and you get information back.

In my examples, I used a temperature of 0.2, which generated about 30% non-working code, which is kind of a lot. You can of course reduce it, but then it's more deterministic, meaning it's the same code every time, which I didn't really want. There's also an issue with the information that you pass back. You want to have the history, but sometimes, if you say "Hey, search me for any interesting file", well, if I send back the file listing of your whole C drive, that's a lot of files, right? So you're going to run into, well, the boundaries of your token limit, even though Gemini now has 1 million tokens, or I think 2 million if you pay more. But you don't really want to pay 100 bucks just to run your info

stealer. I mean, maybe some APTs do, but for me that wasn't really an option. And then of course you execute the command. Whatever comes back as the result is passed back, and it should learn from it and refine it. And as I said, you use LLM requests to exfiltrate data. And of course you also re-encode the whole prompt, because you want to make it harder to detect. So let's have a quick look at how that actually looks, right? So basically here my test is a loop in PowerShell, as I said. So first it says: hey, give me the PowerShell command that adds persistence for the command that I'm running at the moment.

So now we wait, and you see the AI start to think. Because it's reasoning, it's also quite chatty. So it thinks: oh, maybe I should use the registry Run key, maybe I should use this, maybe I should use that. And in the end, that's the answer. So that's a command that I can execute, which adds a regular Run key that runs my PowerShell script hidden, bypassing the execution policy. I run it successfully, and now I'm persistent on the machine. And now I say: hey, create me an English natural language prompt that creates that specific thing. So now it goes back and basically generates something that I can save. So next time it will always go for the registry Run key

because I don't really want to use all the persistence methods, because otherwise my EDR is going to light up like a Christmas tree. So now we updated it in the registry, and next time we basically have the polymorphism. Now we're going to go and say: hey, I want to know what the external IP address is. Maybe I want to send it externally. Of course, that's something you don't really need AI for, right? That's something you could hard-code, but again, for demo purposes we're going to use reasoning. It goes through it and it has multiple ideas, right? And you see in the end it just says: well, maybe I should use curl to go to ifconfig.me or

something like that, a what's-my-IP-address service. And then it goes back and forth on what is best, because I also told it in the system prompt: don't download new stuff, use what's already available on your system. And in the end it will come to the conclusion: yes, we should just ping an external website which will tell me the IP address, at least after some back and forth here. So here the answer is: just use PowerShell's Invoke-WebRequest instead of curl, because curl might not be installed, and go to ifconfig.me/ip. So we get the IP address, and that could be something that I want to save and exfiltrate for later.

Now, of course, we want to find something of interest, right? And as I said, you should be quite specific, because otherwise it's just wandering off. So I'm saying: hey, find me some of the interesting files, skip over the errors, and of course make sure that you will actually be able to do it without downloading anything that I don't really trust. So it found something here, wallet.dat, and it actually will get my wallet. I mean, it's a fake one, feel free to try it. But in the end, on a real system, you might now have a password file, bitcoins, whatever, right? And again I say: hey, please now

generate me an English language prompt that does exactly the same, as in resulting in the same command being generated. So again it goes through it, and you can of course see now that it's not the fastest, but it will still get to the point, right? So you will have the information now. If you wait long enough, of course, we could go for the browser, stealing some of the passkeys, and then exfiltrate this information. But again, if you step back, all the things that you have just seen are of course things which have a behavior that can be detected. If you touch your wallet file or if you, let's say, extract the passwords from a browser, that's

something which will actually raise an alert, right? And it's really, really hard to tell the AI, "Oh, by the way, try to be really sneaky and really stealthy", because so far it only learns from the MITRE ATT&CK framework and from reports. So it's trying to copy other things, but it's not yet good at generating very new ideas which bypass all detections. So it's as good as anyone out there who uses Google. So, to sum it up: yes, you can do it. You need a lot of prompt engineering, as I said. Code quality: keep that in mind. Of course, that is getting better, as I said, with Codex, Opus 4 and so on. So,

probably I should retest it. Sometimes the results are too big. But the biggest problem I had was actually how to figure out if the code that you receive does what you wanted, right? Because you run it, and what if it says, "Oh, I didn't find a wallet.dat file"? Is it because there is none, or is it because your script was actually faulty and just didn't do it correctly? It's really hard. You could use multiple agents and basically have Claude verify ChatGPT, right? So maybe do a consensus, and maybe that would help in some cases. It's probably better to just hijack what's already installed. So if they are using something like responses AI, Manus

or Windsurf, any of those coding tools locally, you can hijack those, you can poison those, and you can use those tools for the benefit of the attacker. Again, if you remember, and some of you might, there was an APT in 2014 called Regin, attributed to Five Eyes, as in the UK and the NSA. It does exactly the same, just without the AI. It had about 50 modules that it would download depending on your environment. Oh, you have a bank-to-bank clearing house application? I'll download my banking Trojan. Or: oh, you're on a mobile base station, so I download something for smartphones. So these are the different things that you should keep in mind. So to sum up, of course

you should get your own Terminator. Why wouldn't you, right, to help you? AI is a good tool. Just make sure you're using the one from version two, which does not try to kill you. At least that's the preferred one. But use AI as a tool, right? AI is very helpful for defenders. Offenders are definitely using it as well, but at the moment it's not really at the level where it helps them too much. So, as my last slide: yes, you can generate AI, or rather generate malware with AI, that's the right way around. There are some services, like Stopwatch AI, where it's really point and click: say, oh, I want it for Windows, I want it to bypass SentinelOne, and it should be an info stealer.

It gets better, but when I tried it, it basically gave me a batch script which runs netstat, ipconfig and tasklist. Yes, that's kind of information stealing, but it's not what you would expect. But it's point and click, fair enough. Most of the malware we have seen is not really AI powered. It's more AI generated or AI supported. Meaning we've seen a few Visual Basic scripts where you still have all the comments in English or in French, so probably generated by AI, but in the end the payload itself was still generated manually by some malware as a service. Yes, you can use AI to obfuscate scripts. PowerShell

and JavaScript are easily obfuscated. Just copy-paste the script into your AI and say: hey, obfuscate that code with variable renaming. It will do something, normally with zero detections on VirusTotal. But again, there has been Invoke-Obfuscation and other tools for those scripts. So it's not really something new, just shifting the focus, and you're probably wasting a few tokens, as you pay for some AI model that you didn't really need. You can automate the attacks; as I said, now there are multiple agents. We see a lot on the pentesting side, where they use swarms of agents to attack the systems. We will probably see the same on the malware side, but at the moment it's not really worth it, because it does more bad

things, as in it harms and crashes, than it actually helps to be better than what we've seen. So if you're interested in this, you're probably better off spending the time on indirect prompt injection or tool poisoning for MCP. There's a lot more fun in those compared to malware writing. And of course, the good news is that behavior-based detection and other things still work against those things, right? If you use them correctly. And if you want, you can get the slides up there, it's just a QR code. Of course, come up to me if you have any questions, or follow me on LinkedIn, and we might have two minutes for questions. Thank you. [applause]
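
The point made in the talk about per-infection LLM requests being noisy on the network can be illustrated with a small example. The following is a minimal Python sketch, not something shown in the talk: it counts requests to a few LLM API hostnames per client in a proxy log and reports clients that are not on an allowlist. The log format (`timestamp client_ip dest_host`), the hostname list, the allowlist and the threshold are all assumptions made for this example; a real proxy or firewall will have its own log schema.

```python
from collections import Counter

# Assumption: a small illustrative list of LLM API hostnames.
# A real deployment would maintain and update its own list.
LLM_API_HOSTS = {
    "api.openai.com",
    "generativelanguage.googleapis.com",
    "api.anthropic.com",
    "api.x.ai",
}


def parse_line(line):
    """Parse a hypothetical log line '<timestamp> <client_ip> <dest_host>'."""
    parts = line.split()
    if len(parts) != 3:
        return None  # skip malformed lines instead of failing
    _timestamp, client, dest = parts
    return client, dest.lower()


def flag_llm_talkers(log_lines, allowlist=frozenset(), threshold=3):
    """Return {client: request_count} for clients that are not allowlisted
    and made at least `threshold` requests to known LLM API hosts."""
    counts = Counter()
    for line in log_lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        client, dest = parsed
        if dest in LLM_API_HOSTS and client not in allowlist:
            counts[client] += 1
    return {client: n for client, n in counts.items() if n >= threshold}


# Made-up sample data for illustration only.
SAMPLE_LOG = [
    "2025-10-01T10:00:00 10.0.0.5 api.openai.com",
    "2025-10-01T10:00:05 10.0.0.5 api.openai.com",
    "2025-10-01T10:00:09 10.0.0.5 api.openai.com",
    "2025-10-01T10:00:12 10.0.0.5 api.openai.com",
    "2025-10-01T10:01:00 10.0.0.9 api.anthropic.com",
    "2025-10-01T10:01:30 10.0.0.9 api.anthropic.com",
    "2025-10-01T10:02:00 10.0.0.9 api.openai.com",
    "2025-10-01T10:03:00 10.0.0.7 example.com",
    "2025-10-01T10:04:00 10.0.0.8 api.x.ai",
    "malformed line",
]

if __name__ == "__main__":
    # 10.0.0.5 stands in for an approved developer workstation.
    print(flag_llm_talkers(SAMPLE_LOG, allowlist={"10.0.0.5"}))
```

Counting per client against an allowlist is only one possible design. It would not catch a locally downloaded model, which is why the talk also points to detecting the stub loader and to behavior-based detection.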

>> Any questions? Yeah. Go ahead. >> So perhaps a question that I would have is where do you see the future of this? Like, where do you think this is going in the next 5 years? It's like an original question. >> So the question is where do I see the future, where does it go? I don't think that nation-state actors will fully switch over to it. I think, yes, the low-level ones probably won't buy that many toolkits on some Tor websites. They will generate something. There will be lots of very badly written malware, which is easy to detect and probably just uses up all your cycles on the detection and triage side. It

will get better, as in more automated. So the point I would make is that it's getting quicker, right? You need to be faster, because now it's no longer necessarily that you have the initial access broker selling access and then someone logging in and taking 10 minutes to figure out what command shell you're actually using and then getting the right bash commands. Now it will probably be within 2 minutes that you have an automated bot going around, kind of a Metasploit Hail Mary, right? Finding out internally what it could do, just poking around and finding some of the interesting things, which in 60 to 70% of cases might be good enough to do some damage that we

have to fix. So I'm not too concerned if you use your defenses correctly, but it will definitely scale in terms of speed. Does that answer the question? >> Yeah. Very good. >> Thanks. >> Any other question? >> Have you experimented with using Cursor or similar tools to try to find proof-of-concept exploits, given a known vulnerability and access to the source code? >> Yeah. So the question is, did I use Cursor, Windsurf and now probably Codex to find vulnerabilities if I provide the source code. Yes, to try to find >> To generate the exploit if I know... Yes, I tried that, and yes, it does work. You still need to, depending a little bit on

the exploit... if it's a simple one, let's say a buffer overflow, those, yes, they can handle. Cross-site scripting or cross-site request forgery, those normally work. As soon as it gets more complex, with logging in and maybe some roles, I run into some issues. But again, AIxCC from DARPA shows exactly that it is possible, and there are more and more tools doing it. I was more focused on the fact that Cursor, for example, has a Cursor rules file, which is kind of the local system prompt. There's the same with AGENTS.md for Codex and others. You can plant whatever you want in those, which then basically adds an attack vector to any code that

the person generates in the future, and I haven't really seen anyone checking those. So I would be more worried about those than about the exploits being used. Yeah. >> Right. I think we've run out of time. Find me in the break. Happy to answer any questions. Thanks.
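
The closing remark that nobody seems to check agent instruction files suggests a simple review step. The following is a minimal Python sketch under stated assumptions, not a tool mentioned by the speaker: it walks a directory, looks for files named `.cursorrules` or `AGENTS.md` (an illustrative list, extend it for the tools actually in use), and reports invisible Unicode format characters, which a human reviewer would not see in an editor or diff. Checking for hidden characters is only one heuristic chosen for the example; it says nothing about visible but malicious instructions, which still need a human read.

```python
import os
import unicodedata

# Assumption: illustrative file names for agent instruction files.
AGENT_FILE_NAMES = {".cursorrules", "AGENTS.md"}


def hidden_chars(text):
    """Return (line_no, column, character_name) for every Unicode
    format-category (Cf) character, e.g. zero-width or bidi controls."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if unicodedata.category(ch) == "Cf":
                findings.append((line_no, col, unicodedata.name(ch, "UNKNOWN")))
    return findings


def scan_repo(root):
    """Map each agent instruction file under `root` that contains
    hidden characters to its list of findings."""
    report = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name not in AGENT_FILE_NAMES:
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                found = hidden_chars(fh.read())
            if found:
                report[path] = found
    return report


if __name__ == "__main__":
    # Scan the current directory and print anything suspicious.
    for path, findings in scan_repo(".").items():
        for line_no, col, name in findings:
            print(f"{path}:{line_no}:{col}: hidden character {name}")
```

A check like this could run in CI or as a pre-commit hook, so that changes to these files get at least the same scrutiny as changes to source code.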
