
BSides Berlin 2023: Bar Lanyado - AI Package Hallucination

BSides Berlin · 17:33 · 240 views · Published 2024-01 · Watch on YouTube ↗
About this talk
About the talk: Revolutionary research exposes a new attack technique using ChatGPT! Discover how attackers could exploit its hallucinations to spread malicious packages, posing a grave threat to developers and production systems.

About the speaker: Bar is a security researcher at Lasso Security. For the past 6 years, he has worked as a penetration tester and security researcher. During his career, Bar has tested and researched areas such as mobile and web applications, reverse engineering, supply chain attacks, and more. https://twitter.com/BLanyado
Transcript [en]

Hello everyone, thanks for joining my talk. I'm Bar, and today I'm going to talk about a new attack technique that I found, called AI Package Hallucination. This technique uses generative AI tools like ChatGPT in order to spread malicious packages. But first, let me introduce myself. As I said, my name is Bar Lanyado. Today I'm working as a security researcher at Lasso Security, a company that provides end-to-end security solutions for the LLM era, and I'm an OWASP contributor. In the past six years I've hacked and researched many fields in the security area, like application and mobile security and supply chains, and today I focus on LLMs.

Before I start the talk, I just want to thank Vulcan Cyber for the opportunity to perform this research while working there, and Ortal Keizman and Yair Divinsky for helping me during it. So let's start. Generative AI is the fastest-growing technology ever; ChatGPT alone got 100 million users in less than two months, faster than any other app, any other technology. In my opinion, the evolution of generative AI and LLMs is the biggest thing that has happened since the establishment of the internet, and it's much bigger than the establishment of the cloud. But with the great power of these tools come some great vulnerabilities.

Here we can see the OWASP Top 10 for LLM, and we can find some new vulnerabilities in it, like prompt injection, which is one of the most popular subjects in LLM security these days; you can always find fresh papers showing how someone bypassed ChatGPT's prompt restrictions and got it to generate malicious code, or just to generate Windows CD keys, and so on. On the other hand, we can see some familiar vulnerabilities, like insecure output handling, which says that handling the output of these models insecurely could lead to XSS, CSRF, or SSRF, vulnerabilities that we already know how to handle. But in my opinion, one of the most interesting topics in this OWASP Top 10 is the overreliance one.

This is the first time that OWASP talks not just about the system itself but about the people using it, and that's a big change. But it wasn't like this from the first day of this Top 10. In the first version, the beta one, they just said that depending on LLM-generated content without oversight could lead to harmful consequences; that says almost nothing, right? In the second version they started using the word hallucination, and they almost got to the point: they started talking about legal issues and reputational damage. But those were still not the right words, and the next versions stayed much the same.

In the last version they finally found the right words, as I see it: systems or people that use LLM-generated content without fact-checking it can be exposed to misinformation, and not just to legal issues like before, but also to security vulnerabilities. This is a big change in how OWASP sees vulnerabilities. So, I've said the word hallucination many times by now, but what are LLM hallucinations? It's pretty simple: they're all the wrong, fake, made-up answers that we receive from these models. I have a lot of funny examples and some weird ones, but I don't have time to show them, so I'll jump to the most interesting kind, which is when these models just make up facts.

On the screen we can see an example where I asked ChatGPT how I can get some data from Orca's API. For those who don't know, Orca is one of the biggest cloud security providers. So I asked how I can get data from their API, and I received this URL. At first look it seems to be a legit one, right? api.orca.security. So I tried to access it and found that there is no such URL; there is no DNS record, there is no such thing. It means that ChatGPT just hallucinated and made up a URL that doesn't exist.
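This failure mode is easy to reproduce with a plain DNS lookup. Below is a minimal sketch in Python; the hostname is the one quoted in the talk, and whether it resolves today may of course have changed since the research.

```python
import socket

def resolves(hostname: str) -> bool:
    """Return True if the hostname has a DNS record, False otherwise."""
    try:
        socket.gethostbyname(hostname)
        return True
    except socket.gaierror:
        return False

# The URL ChatGPT invented in the talk's example: no DNS record existed.
print(resolves("api.orca.security"))
```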

So why does this happen? I mean, why do these great tools, which do almost anything we ask of them, get things wrong and make up answers? I won't deep-dive too much into how these models work, but I will split my answer into three main reasons. The first is that these are probabilistic models, and the probabilistic behavior of these models, which base their output on our input, sometimes leads to those mistakes. The second reason is that these models were trained on a huge amount of data from all over the internet, and this data can contain incomplete data, wrong data, or just old data that is no longer relevant these days.

The last reason is that the applications built on these models were programmed to be creative and to always try to provide us with answers. We can see this when we ask ChatGPT the exact same question over and over again and receive different answers from it each time.
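This repeatability experiment is easy to run yourself. Here is a minimal sketch using the official openai Python SDK; the model name and question are placeholders of mine, and with a nonzero temperature the sampled answers will typically differ between runs.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
question = "Which npm package should I use to connect to ArangoDB?"

for run in range(3):
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model name
        messages=[{"role": "user", "content": question}],
        temperature=1.0,        # sampling rather than greedy decoding
    )
    print(f"--- run {run + 1} ---")
    print(resp.choices[0].message.content)
```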

After covering the first subject of my research, let's talk about the second one: supply chain attacks. Supply chain attacks are one of the fastest-growing attack vectors of the last years; we can see in the numbers how dramatically they have increased. Attackers use many ways to carry out supply chain attacks, and one of the main ways is spreading malicious packages, using techniques from typosquatting and masquerading to trojan packages and so on. So after talking about these two subjects, let's talk about my research and the new attack technique. This technique combines these two worlds, the hallucinations and the supply chain, in order to spread malicious packages using those hallucinations. Let's see how it works. The technique is split into two parts: in the first part the attacker actually does something, and in the second part he just waits for the victim to get into the picture. In the first part we have three components: the attacker, ChatGPT, and the package repository.

The attacker will ask ChatGPT a coding question, asking for packages that could help him solve it. ChatGPT, in response, will provide him a non-existing package. All the attacker has to do now is publish a malicious package with the name that ChatGPT gave him. That's it, simple as that. Now the victim gets into the picture, and he will ask a similar question. The word similar is really important here, because the victim does not need to ask the exact same question the attacker asked, and this is what makes this attack an exploitable one rather than just a theoretical one. So the victim asks a similar question, and ChatGPT now answers it with the same hallucinated package.

Just that this time, the package will exist, and it will be malicious. Now, we know that developers like to trust these tools, just like they trusted Stack Overflow and copy-pasted from there. So the developer will install the malicious package, and it's kind of game over. But it could be even worse: if the attacker combines this spreading technique with another one, like a trojan package, and makes the package a functional one, the developer might use it and deploy it to production systems. Then the attacker will not just have a foothold on the developer's computer, but a foothold on our production systems.

So before I share with you a proof of concept, I would like to quickly share the research process. The first part was validating the thesis: what we tried to find was the first package hallucination, to verify that this exists. After a few questions, really a matter of minutes, we found the first package hallucination. Then we asked a few more questions and found another one, and then another one. So we chose to start collecting more questions: we went to Stack Overflow and collected questions on about 40 subjects in Node.js and Python.

After collecting them, we started querying the OpenAI API, which is the API behind ChatGPT. We asked those questions, collected all the answers, and then we needed to extract all the non-existing packages. To do that, we had to extract all the packages from the chat answers and then check, against the relevant package repository, whether each package exists or not. If it existed, fine, there was nothing for us to do with it; but if it didn't exist, it meant we could use this name to upload a malicious package.
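A minimal sketch of that extract-and-check step, assuming (as in the talk's PoC) the model was asked to answer with an "npm install <package>" pattern; the regex and the example answer are illustrative of mine, but the npm registry really does return 404 for names that were never published.

```python
import re

import requests

def extract_npm_packages(answer: str) -> set[str]:
    """Pull every 'npm install <name>' occurrence out of a model answer."""
    return set(re.findall(r"npm install\s+([A-Za-z0-9@/._-]+)", answer))

def exists_on_npm(name: str) -> bool:
    """The npm registry returns 404 for names that were never published."""
    r = requests.get(f"https://registry.npmjs.org/{name}", timeout=10)
    return r.status_code == 200

answer = "You can simply run: npm install arangodb"  # example model output
for pkg in extract_npm_packages(answer):
    if exists_on_npm(pkg):
        print(f"{pkg}: exists, nothing to do")
    else:
        print(f"{pkg}: not on npm, hallucination candidate")
```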

The next part was uploading the malicious package. Now, we didn't really upload a malicious one, and we didn't upload every package that we found; we just used a few in order to prove our concept. And here comes the last part, the most important one, and I will explain why: in this part we tried to verify the repetitiveness of the hallucinations. What we did was ask several different questions on the same subject, from different users and different IPs, to validate that it's not biased and to see that we receive the exact same hallucinated package.

The reason this is the most important part of the research is that if I hadn't received the same hallucinated package here, all the other parts would have been worth nothing: if I receive a hallucinated package only once, I can't attack with it, I can't do anything with it. So in this part we asked those questions, and we found that we received the exact same hallucinated package for different questions. That's when we understood we had found a new attack technique.
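A minimal sketch of that repetitiveness check: ask several phrasings of the same question and tally how often each non-existent name recurs. Here ask_model is a stand-in for a real chat-completion call (it returns a canned answer with a hypothetical package name so the sketch runs end to end), and the phrasings are loosely modeled on the PoC questions.

```python
import re
from collections import Counter

import requests

def ask_model(question: str) -> str:
    # Stand-in for a real chat-completion call; the package name is hypothetical.
    return "You can run: npm install example-hallucinated-pkg"

def npm_packages(answer: str) -> set[str]:
    return set(re.findall(r"npm install\s+([A-Za-z0-9@/._-]+)", answer))

def exists_on_npm(name: str) -> bool:
    return requests.get(f"https://registry.npmjs.org/{name}", timeout=10).status_code == 200

phrasings = [
    "How do I integrate with ArangoDB in Node.js? Answer with npm install commands.",
    "Write Node.js code to connect to ArangoDB and list three npm packages for it.",
    "Which npm package connects a Node.js app to ArangoDB?",
]

# Count how often each non-existent package recurs across phrasings;
# a name that keeps coming back is a reliably exploitable hallucination.
tally = Counter()
for q in phrasings:
    for pkg in npm_packages(ask_model(q)):
        if not exists_on_npm(pkg):
            tally[pkg] += 1

print(tally.most_common())
```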

Now let's see a short PoC. The first step of the PoC is asking a question in the attacker context. Here we asked how to integrate with ArangoDB in Node.js, and told it to give the answer with the pattern of npm install. The first answer was just: use arangojs. Now, arangojs is probably the best answer to give; it is the best package to use with ArangoDB in Node.js. But what happened when I asked for more packages to do it? The first answer was just: use arangodb. So I went and checked whether arangodb exists on npm, and I found that it doesn't. Of course, the next part was to upload it, and here it is: arangodb is now a package that exists, everyone can download it, and it's a malicious one. Here comes the interesting part: asking a similar question in the context of the user.

Here we can see that I didn't even ask a similar question, I asked a totally different question. I asked it to write Node.js code to connect to ArangoDB; I even had a typo in it, writing just "rango" instead of "arango", and I asked for three npm packages to do it. And here we can see that in second place is our hallucinated package, arangodb. But if that doesn't impress you, and you say, fine, but it's only in second place and I will always use only the first recommendation, then look what happened when I asked Bard the same question. When I asked Bard the exact same question, I received arangodb in first place, with functional code to use it.

And if that's not enough, look at what it said: it said that arangodb is the official ArangoDB driver to use in Node.js, that it's well maintained, and that it has a large community of users. I'm pretty sure that if I had received this recommendation before I found this attack technique, I would probably have used it. So the user installs it, and it's game over. Now let's talk about the results of the research. For 30% of the questions we asked, we received at least one hallucinated package in the response. We asked almost 430 questions and received more than 120 answers with at least one hallucinated package. This is a huge number.

I mean, if an attacker performs a spreading campaign with thousands or hundreds of thousands of questions, he could spread thousands of malicious packages into our ecosystem, packages that ChatGPT will then recommend. The last question we are left with is how we stay safe, how we protect ourselves from this kind of attack. My first recommendation: just don't trust these tools blindly yet. The hallucinations are still there. I checked again about three weeks ago, something like that; some of the packages I still received, some of them I didn't find anymore, but I found new ones. So the hallucinations are still there.

This leads me to the next bullet: when you receive an answer from these models and you are not 100% sure it's right, just cross-check it with an external source that you trust; it's important. My last recommendation is to use open source software securely, regardless of LLMs, but especially with them. If you receive a recommendation about a package you are not familiar with, go to the package repository and check it. Check the publish date: if the publish date is later than the latest training data date of the model, it would be weird for it to recommend something it doesn't know about. Check the commits, and check how well maintained the package is; maybe it's a legit one, but not well maintained and holding vulnerabilities, so you don't want to use it. Check the stats: comments, number of downloads. And if you see something suspicious there, just think twice before you download it.
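A minimal sketch of the publish-date and popularity checks, using the public npm endpoints (the registry document's time.created field and the downloads API); the training-cutoff date below is an assumption of mine, to be replaced with your model's actual cutoff.

```python
from datetime import datetime, timezone

import requests

MODEL_CUTOFF = datetime(2021, 9, 1, tzinfo=timezone.utc)  # assumed cutoff date

def vet_npm_package(name: str) -> None:
    """Print basic trust signals for an npm package: age and popularity."""
    meta = requests.get(f"https://registry.npmjs.org/{name}", timeout=10)
    if meta.status_code != 200:
        print(f"{name}: not on npm at all, do not install")
        return
    created = datetime.fromisoformat(
        meta.json()["time"]["created"].replace("Z", "+00:00")
    )
    if created > MODEL_CUTOFF:
        print(f"{name}: first published {created:%Y-%m-%d}, after the model's "
              "training cutoff, which is suspicious for a model recommendation")
    downloads = requests.get(
        f"https://api.npmjs.org/downloads/point/last-month/{name}", timeout=10
    ).json().get("downloads", 0)
    print(f"{name}: first published {created:%Y-%m-%d}, "
          f"{downloads} downloads in the last month")

vet_npm_package("arangojs")  # the legitimate driver from the PoC
```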

Thank you very much. I don't know if there is time for questions.

So, yeah, thank you very much. Sadly we don't have time for questions anymore. But surely, after the event, feel free to contact me, I will be around. Perfect, thank you so much.