
So I have the pleasure of introducing our next speaker, Axelle. She is a principal security researcher at Fortinet, and she's well known in the community for reverse engineering and Radare2. Today she's going to talk a little bit about the use of LLMs and how that's kind of evolving reverse engineering, I think would be fair to say. So I won't do any more introductions. I will just hand over to you.
>> Thank you very much. So yeah, the introduction has mostly already been done. It's my first time in Norway, and of course my first time at BSides Kristiansand, but I'm really excited. And what is really cool is, well, if you have any question at the end
of course feel free to come and talk to me, in English, because I've only picked up like three words so far. So it's going to be a little bit difficult otherwise.

My daily job is to work on malware. I reverse them to detect them. There is a lot of malware, so I'm only focusing on malware for mobile phones, Android or iOS, and on IoT malware. And with regards to IoT, I am the founder, in France, of the Ph0wn CTF, which is a capture the flag dedicated to smart devices: smartphones, connected objects and things like that.

So this talk has the buzzword of artificial
intelligence in it. For some of you, it attracts people; for others, it has them run off. So for those of you who were trying to run off, I can just say that I'm also going to talk very much about malware. It will also discuss how to reverse malware and how we can do it faster, with or without artificial intelligence. And the thing I learned yesterday was that in Norwegian, AI translates basically to KI. So if you see AI you can say KI, kunstig intelligens, I guess. So yeah, that's one of the things I learned.
To reverse the malware, we are going to use throughout this presentation this disassembler, which is known as R2. Are there people in the room who know Radare2 or have already used Radare2? Yeah, one, not that many. Ghidra, perhaps? A little bit better, a little bit more. Okay. So actually, in terms of decompilers and disassemblers, Ghidra is a little bit better known, but it is far more recent. The other one is nearly 20 years old. It's really usable professionally because it can support many different architectures, many different binary formats, things like that. The look and feel is a little bit surprising. You'll have the surprise because I have a few demos. It feels
like, well, if you've already used vi, it feels a little bit like vi because you have a lot of cryptic commands, but the good thing is that it's very scriptable. And yeah, the screen went off. There we go. That's no problem. Okay. So I'm going to use R2AI, and as the name goes, it's a plugin for R2, so a plugin for our disassembler, and AI means that we are going to have Radare2 get some assistance from the AI. I'm just highlighting there that it's assistance from the AI. Do not expect the AI to do everything for you. It won't work that way. You've really got to guide the AI if you want some
interesting input in return. It's implemented in a way which is pretty cool, in the sense that it is not tied to a specific LLM. Sorry, it might not be clear: R2AI here is an open-source tool that I am using, but I am not a developer of R2AI. It's not my tool. It's just something that I use, and I'm going to show how to use it to reverse malware. Okay. So, it's not tied to a given LLM. If you want to use it with GPT, you can. If you want to use it with Anthropic Claude, you can. If you want to use it
with Mistral, you can. Whatever. And actually, I presented this about two weeks ago at another conference and I had to update the slide, because in two weeks it's changed. Okay. This one came out on June the 25th, something like that, and it works straight away with R2AI. As long as the API hasn't changed too much, most of the time it works straight away and you can use the model.

So when we have a malware and we want to do something about it and get some assistance from AI, we have two different modes with R2AI: the direct mode and the automatic mode. I'm going to use a case as an example. So
it's a little bit like shellcode. If you were at the talks a little bit before lunch about shellcode, this is a malicious shellcode which is used to compromise Linux hosts, and we want to reverse that one. The good thing about it is that it's very small. Okay, it's a few kilobytes, and in terms of reversing, the smaller it is, the quicker you get into reversing. So normally it should be fine, except that in this particular case, well, this one is a little bit hard, because as you see, if you have a look at all the strings that you have, well, I mean, Norwegian isn't better for me, but this isn't really
helpful. Okay. And then when you load it in Ghidra and ask it to decompile stuff, well, this is supposed to look like normal C code, and I think you'll agree that it's really not very understandable that way. If you have any experience in reverse engineering, you might notice the line, this one. No, I'm not going to play with the mouse, but the swi line, the software interrupt, is probably going to give you a clue about what is happening afterwards.

Okay, so in the direct mode, what happens is that you are just sending a single request to the artificial intelligence and you get some answer back. Like all the time, you send a context, and in that
context, well, you send plenty of things. R2AI prefills some things, and this is supposed to help us reverse this malware, to get something better than what we have with Ghidra and understand better what is happening. And this is where I'm going to try and do a live demo. Now the difficult part of it is to switch. Yeah, the difficult part is that I don't see that on my own stuff. Okay, so we're going to load the sample, and the sample is, yeah, shellcode ELF. So this is R2, and I told you there were kind of cryptic commands. aaa means analyze all and do all your preliminary stuff. Okay, it does that. And then I'm
going to list the functions, and there is only one, fortunately. I told you it was a really small shellcode. And we're going to decompile it, and you get that. It's a little bit like Ghidra. Honestly, it's not that clear, unless you are used to reverse engineering, because there are no calls to functions in there to help you out about what is happening. Okay, so now we're going to ask the artificial intelligence for some help. So minus h to get all the commands that you can have. And now I'm going to say that I am going to use this one. And now I've got to specify the model: model equals, um, codestral-latest. Yeah. And now I'm going to tell it minus d, which is to decompile the current function.
Okay. So the funny part of it is that of course I never get exactly the same answer, so each time I do the live demo I have a few surprises. Okay. The naming here with this particular model isn't that good, because it's still using registers, and in normal C code you wouldn't use variables named EAX or something like that. The good thing is that you see socket calls here, which is pretty good. I'm going to try shortly to improve that a little bit.
No, it's r2ai. r2ai.
So, this is the default prompt. And I'm going to do something a bit better: decompile the function; do not use register names as variables, but something meaningful; I want readable C code, as a good dev would write. Yeah, it's a bit long. Okay, sorry. And we are going to try that again, and hopefully it will be a little bit better.
So you get lots of text. It's always wordy, but it is indeed better. Okay. I think this is starting to be a little bit more readable. We see that, well, obviously there is a socket which is created. It goes here to an IP address. We'll check this a little bit afterwards because it's kind of strange. It goes there, connects there. It's changing the memory, yeah, the permissions for this area, reading data, and, okay, we'll go back to the slides. Okay. Wow.
So it's a little bit like in our first talk, where he said: do not assume anything, do not guess anything. When you're a malware analyst and you get help from the AI, do not assume that what the AI is saying is true. Okay, you've got to check it, at least for whatever is going to matter to you. We see here, for instance, that it is using many socket calls, but we didn't see any calls to socket functions in the disassembly. So this is a bit strange, and we've got to be sure it's not a hallucination. So we see that this is generated by the AI. Actually it comes out from this
assembly code which is there, and those three lines are actually indeed performing a call to socket, because the software interrupt, int 0x80 here, is performing a system call, and which system call it is is put in the register EAX. It's going to be the number 0x66, and number 0x66 is socketcall. And it's a bit more complicated in that case, because socketcall is actually a multiplexer for several different socket-related calls. So here, depending on the first argument, it's going to go either to socket, to bind, to connect, to listen or to accept. Now we're going to see that in detail. So this is the assembly, where we are
doing an XOR between EBX and EBX. This is the classical trick to put zero in EBX. Then we multiply by EBX; anyway, EBX is zero, so we're going to have zero in EAX, which holds the result. Then we push EBX on the stack, so we're pushing zero. We increment EBX, so we have one in there. We push that, so we push one on the stack. We push two on the stack. And then we put 0x66 in EAX. Then we put the stack pointer in ECX and we do the software interrupt. When we have the software interrupt, we see that we have 0x66, so this is going to do the multiplexer, socketcall. The
first argument of this socketcall is in EBX: one. If you remember from before, it means that it's going to call socket, and the arguments for socket are in ECX. So we have AF_INET, SOCK_STREAM, TCP. So in this case the AI was perfectly right; we checked it, and everything is fine.

Now another case. Here, I told you it's a bit strange. In my experience, when we have a malware which is contacting something which is local, this is the localhost address, well, it's possible, but it's suspicious for me. So I checked the assembly. The corresponding assembly is here, and if you map this assembly to the C
structure which corresponds there, well, you're going to have AF_INET, but then you have 23 here, 6B there. Okay, this is going to be the port, so it's not port 80 but port 27427. And same, the IP address is not the localhost address but this remote IP address. So actually the AI was wrong: we are not contacting something which is local, we are contacting a remote host, which makes much more sense, actually.

We'll go a little bit faster on the next ones. Okay. But again, we have to check everything it says. Here, it is setting the protection, and it's saying it's for this address. Well, not really.
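
The two checks just described, decoding the socketcall arguments and mapping the pushed bytes onto a sockaddr_in structure, can be reproduced in a few lines of Python. This is a minimal sketch and not something shown in the talk: the function names are made up for illustration, the constants are the Linux i386 values, and the IP address bytes are a documentation placeholder (192.0.2.1), since the real address is only visible on the slide.

```python
import socket
import struct

# Linux i386 values. Syscall 0x66 (102) is socketcall; EBX picks the sub-call.
SYS_SOCKETCALL = 0x66
SUBCALLS = {1: "socket", 2: "bind", 3: "connect", 4: "listen", 5: "accept"}
FAMILIES = {2: "AF_INET"}
SOCK_TYPES = {1: "SOCK_STREAM", 2: "SOCK_DGRAM"}


def decode_socketcall(eax, ebx, stack_args):
    """Name the sub-call selected by EBX (hypothetical helper)."""
    if eax != SYS_SOCKETCALL:
        raise ValueError("not a socketcall")
    name = SUBCALLS.get(ebx, f"subcall_{ebx}")
    if name == "socket":
        domain, sock_type, protocol = stack_args
        return (name, FAMILIES.get(domain, domain),
                SOCK_TYPES.get(sock_type, sock_type), protocol)
    return (name, *stack_args)


def parse_sockaddr_in(raw):
    """Split 8 raw bytes into (family, port, ip)."""
    family = struct.unpack_from("<H", raw, 0)[0]  # host order, little-endian on x86
    port = struct.unpack_from(">H", raw, 2)[0]    # network order, big-endian
    ip = socket.inet_ntoa(raw[4:8])
    return family, port, ip


# The shellcode pushes 0, then 1, then 2; ECX then points at [2, 1, 0].
pushed = [0, 1, 2]
args_in_memory = list(reversed(pushed))
print(decode_socketcall(0x66, 1, args_in_memory))

# Family 2, port bytes 6B 23, then the placeholder address bytes.
print(parse_sockaddr_in(bytes.fromhex("02006b23" "c0000201")))
```

The port 27427 comes from reading the bytes 6B 23 as a big-endian 16-bit value. Reading them in little-endian order would give 9067 instead, which is the kind of slip worth checking for, whether it comes from a human or from a model.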
If you check the assembly, it's setting those permissions for the stack. And here, when it's reading something, actually it's not reading from zero, it's reading from the socket. So the socket is the input: the remote end is sending it stuff. And then another typical error that we see very often with AI is that it omits things. Sometimes it thinks that things are details and it says, okay, I'm not going to talk about it. But the issue is that what is a detail for it isn't always a detail for you. Okay. And here, what we have is pretty important: we're setting the stack as readable, writable and executable. Then, through the socket, the remote end is writing
a payload onto the stack, and then the AI completely forgot the last step, which is executing what has been sent. If you're not executing it, that's a bit strange, and it would be useless, so that is not a detail, of course. So if we recap: we have an infected host. We create a socket to the remote end. It sends a malicious payload, and the host executes it. So we got this with the help of the AI, but we had to check a couple of things, because not everything was totally correct.

Now, the automatic mode. The automatic mode, in my humble opinion... Wow. Okay. The automatic mode is even more
powerful, because you are telling the AI: hey, I have tools that you can use. It's just as if you were telling the AI: hey, I've got a screwdriver, and if you need a screwdriver, I can lend you one. So here we're going to tell it, for instance: okay, decompile this malware, but if it's useful for you, well, I have Radare2 installed on my host, and you can send me commands for it to help you with the disassembly. The AI will say: okay, well, locate the entry point for me then, and the command is ie. R2AI is going to say to the end
user: are you okay if I run ie on your host? And you decide in the end whether or not to run it. If you decide yes, then ie will return the entry point address, and R2AI will send that back to the AI. And then it continues: the AI may have many other questions, until it is able to answer your question. It can do many other things, actually, not only execute Radare2 commands. It can also execute JavaScript programs, Python programs and so on. The only big issue there is that it's running on your host. So, you've got to check what the AI sends you, to be sure that it's
not going to be harmful one way or another for you. So I'm going to show you another demo, but this one is recorded, which should be a little bit easier to show, on another malware, Ladvix. Okay. It's more complicated, this one; it's bigger and everything. We're not going to do everything about it, but we noticed that there are obfuscated strings like this one. And we also know that the function that deobfuscates them all is this one. So we're going to ask R2AI to do the deobfuscation for us.
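
The approve-before-run loop of the automatic mode described above can be sketched as follows. This is an illustration only, not R2AI's actual implementation: `auto_mode`, `ask_model`, `approve` and `run_tool` are hypothetical names, the model is replaced by a scripted stand-in, and the entry point address is invented.

```python
def auto_mode(ask_model, approve, run_tool, max_steps=10):
    """Let a model request tool commands, gated by user approval."""
    history = []  # (command, result) pairs that are sent back to the model
    for _ in range(max_steps):
        command = ask_model(history)
        if command is None:       # the model has what it needs
            break
        if approve(command):      # the user reviews every command first
            result = run_tool(command)
        else:
            result = "declined by user"
        history.append((command, result))
    return history


def scripted_model(history):
    """Stand-in for the LLM: asks for the entry point, then stops."""
    return "ie" if not history else None


def fake_r2(command):
    """Stand-in for Radare2; the address is made up."""
    return {"ie": "entry0 at 0x08048054"}.get(command, "")


def allow_read_only(command):
    """Example policy: only accept commands from a small allowlist."""
    return command in {"ie", "afl"}


print(auto_mode(scripted_model, allow_read_only, fake_r2))
```

The point of the design is that nothing reaches the tool without passing through `approve`. In a real session that gate is the analyst reading each proposed command or script before it runs on the host.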
Yeah. There we go. I'm going to clear that and
play. Just... there we go. So we're loading R2 on our sample, Ladvix. Same thing: we're going to do the triple-a thing, list functions. There are far more. Okay. But we are going to disassemble. And here, well, it's up there and my font is a bit big, but we have here the deobfuscation function. Okay. And I'm going to locate all places in the main which are using the deobfuscation function. This is this command there. We see it's used in several places. And I'm going to decompile, no, disassemble, sorry, for instance this place and before. So yeah, I did it the other way around. So you see the obfuscated string here, and then after that the code
goes up there and it calls the deobfuscation function. Now we configure R2AI to use our favorite model, and in this particular case I get very good results with Claude. And we are going to trigger the automatic mode with minus a, and then we have our question, and we're basically going to tell the AI: hey, I have some obfuscated strings, can you do my work while I go to sleep, and deobfuscate them for me? I'm joking. Most of the time you can never go to sleep, because you have to hold its hand all the time. Like, you know,
it's nearly done.
So you can send some input to the thing, and then you actually see the AI doing its work. So for instance, it tells us: okay, let me examine first the deobfuscation function. Then here it is trying to locate the obfuscated string, and actually it struggles, which is quite amusing, because it didn't do it the right way. But you see, it's going to improve itself. Eventually it will actually get it,
and this is the correct command, and here, this is the obfuscated string. And now, so we let it work a little bit. So you see, here it is sending R2 commands to us, and then it builds a script, a JavaScript, to deobfuscate the string, and I just have to review that, change it a little bit if I wish, and accept to run that or not. And there you go. Can you see it? This is the deobfuscated string. So, it is actually touching the crontab. And there we go. Back here. Sorry. And I've nearly finished. This is done for the demo. We don't have much time, but basically what is important there is
that if you have packed malware, well, decompilers don't know how to handle packed malware. You always have to provide them unpacked things first. And it's the same with AI. If you send the AI a packed sample, it won't be able to process it. So if you want to get some answer, first unpack it, and then tell the AI to run on that.

And that's basically it. The takeaway: well, it will save you time, but please do check everything the AI says that matters to you, and do remember that the AI is an excellent storyteller. So it will kind of convince you that something is true.
You should really be picky about facts and make sure it is really true if you don't want to make a fool of yourself. Okay. And the other thing also is that if you try the AI for just two minutes and it's not giving you a good output, do not let go immediately. You've got to persist a little bit. Try again, improve your prompt. It's a little bit like having a smart intern in front of you. Your intern is not going to give you sparkling, great results immediately. You'll probably have to guide your intern a little bit to get good results. It will happen, but you need a little bit more time.
And that's all. Thank you very much for your attention, everyone. And I'd say, yeah: the maintainer and creator of R2 is this one, Sergi Alvarez, and Daniel is one of the main contributors to R2AI. So those are the ones who built the tool that I use to reverse the malware.
Thank you very much. Are there any questions, or have you stunned them all into silence? Do you think AI is going to change the face of cyber security?
>> I don't know, but I'm lucky, actually, that I didn't have somebody, or an AI, in the audience asking me a question right now.
>> Okay. A question.
>> Yes. So I saw Radare2, but have you talked to Binary Ninja or Ghidra or any of these? Are they developing their own competitor for this too?
>> They are. So Ghidra has a plug-in which is called GhidraMCP, for Model Context Protocol, which is also working quite fine, but there are a couple of other
issues with it: you cannot control the commands which are sent to your host in the same way. You don't have the fine-grained possibility to see exactly what is going to be run on your host, so this can be dangerous when you're analyzing malware. Okay. And IDA Pro, same thing: you also have some MCP stuff which is being developed. For Binary Ninja, I'm not sure. I don't know.
>> Thank you. Thank you very much.
>> You're welcome.
>> Final round of applause.