← All talks

Adversarial AI Attacks In Cyber Security

BSides Lancashire19:2568 viewsPublished 2024-04Watch on YouTube ↗
Speakers
Show transcript [en]

all right so good afternoon everyone um and welcome to uh my talk on AI security adversarial AI attacks in cyber security so my name is Thomas and I am from the University of Sunderland so for my so my primary area my primary research areas amongst other things are in AI security and in OT security and how to amgate between the two so with that in mind uh for this talk I'd like to give a quick overview of the current work surrounding versal AI attacks and how they have been adapted in cyber security and more importantly I also want to discuss uh share my thoughts in terms of the different AI targeted attacks in that has been identified in

existing research so with that in mind so first of all so here is a quick overview of what the topic is going to be U how this top is going to structure so first of all uh we'll be looking a little bit a quick overview rather into security analytics and we'll be looking at um AI the different base AI based attacks and before we conclude so we'll be looking having a quick overview of a new framework that has been created by Mitra the same folks who created the well no Metra um attack framework so they have recently come up with the new Frameworks around um on um addressing AI model based attacks called Atlas which

stands for adversor real threat landscape for AI based systems so when we talk about security analytics so what is security analytics so as everyone in this room will know what quick U will know what security analytics is but still Bears a little bit of a quick overview to make sure that we are in the same um on the same page so when we come to like security Analytics security analytics is all about identifying and previously undetect threats in large scale environments so that typically involves leveraging the different data that we collect from We Gather from different points within the Enterprise Network be it Network logs be it uh user access logs Etc cor uh feature correlation

event correlation and then we automate the process the detection process through U running it past a an AI or machine learning classifier so when it comes in more recent times depend due to the large amounts of data that have been generated by different s Solutions in recent times we have decided to use data big data analytics along with other large scale um machine learning models such as deep learning and reinforcement learning Etc so typically when it goes when it comes to security analytics here is a very generalized version of how it typically works so from given the network logs and the user access logs and and the different bits and Bs of information that we get from different

points within the network infrastructure we typically try to correlate generate a graph or something similar to that effect in order to represent the features in a more coherent manner so in this example what we have done is we have generated we have represented the features that we have gathered from um gathered in the form of a graph so here we got three different features we got the um fairly self-explanatory and here amongst the different fields so we have different um edges so to speak of this graph so we what what we want to do is we want to detect which of the network flow or the features are indicative of a potential malware potential attack so in this case

so we represent all of these features that we gather in the form of a graph and then we pass it through to a machine learning or AI classifier so in the T when when it's when it has done its job typically we will end up creating being able to identify a potential a potential um attack which is highlight which are rather highlighted in Red so this is how typically um this is how typically machine learning models or AI models are leveraged in the context of security analytics but in more recent times precisely because precisely because of how prevalent AI based detection Solutions have been adopted in threat detection in recent times and and not to

mention the immense popularity of llm such as llama and chbt so on and so forth so attackers have tried to have attempted and some cases rather successfully in ter in order to compromise the underlying the underlying AI model or the machine learning model so the question then becomes what are the different um techniques what are the different um attacks that have been attempted against AI models so generally speaking they can fall under one of these four main categories so one is called Invasion second poisoning three inference and last but not least we have prompt injection so when it comes to invasion so what is evasion evasion well we all know what evasion is but in the context

of AI based attacks it's all about trying to by trying to pass a set of features and and engineer them in such a way that when pass when it get passed through to a classifier the the classifier gets is is none the wiser so in other words so given a set of malware features if we pass them as this it will be the machine learning model will immediately flag it up as a indic a an attack no doubt but what if we structure these features in such a way that the machine learning model or the AI model thinks it is quite benign then that means that we could do we could bypass into the um the network the target

Network infrastructure and we can perform all kinds of uh lateral movement and data experation so on and so forth so when it comes to one of the techniques that we tend to find that has been used in AI based attacks is evasion attacks evasion attack is what is called the deployment stage attack so meaning that this evasion attack typically takes place when the model gets deployed in a real world indust real world environment because the model is already has been deployed it's already life and well is running so here is our Target so as I previously mentioned so previously mentioned so it use it features the use of specially crafted features so usually from from a

red teaming perspective it is used conjunction with a typical um attack flow so and ultimately what is the the question then becomes what is the main purpose of evasion attack is two for one is to is to fol it the machine learning model or the AI model in such a way that the we could the AI model deems the features to be benign and thereby allows the attacker to get inside the infrastructure the network infrastructure and do all kinds of weird and wonderful stuff and the second benefit of it is that it targets the accuracy of the AI model so the the the main reasons for the evasion attacks are twofold so with that in mind I'd like to

share with you a very um because um because I also like cast as well so um and this and this paper in my opinion provides a very good um overview of what evasion attack is and also the the cats as well so so when it comes to evasion attack so here in this example we have a image of a cat and then we have a AI system so typically by default by default when we when the image of a cat it gets pipat the AI model will immediately say okay this is a cat if it is not a cat then it's not a cat but when we go about adding adversarial noise and to the image so from the

Native view from the eye the image still the same as far as human eyes are concerned none the wiser still the same but when it passes onto the AI system the same AI system then it gets misclassified as a dog obviously it isn't but this is how it can it can be the AI model could be tricked and more recently um we have PE we have found that this can be done through um machine learning models such as uh generative adversarial networks which allows for the automation of different uh for allows for the addition of noise and Distortion of the features be it image be it video be it Network features Etc so next stop is going to be

poisoning so poisoning attacks in U poisoning attacks against AI models is where we the attacker tries to corrupt the the parameters and parameters of the AI model as well as the features that are used in order to train the set model so you might be wondering okay Thomas so in this case we got the AI model how how do how does the how does the attacker figure out the hyperparameters I.E the details the inner workings of the AI model the qu and that's a very good question so the reason for this could be traced to two um two reasons so one is the fact that most of the AI models that are used the AI or the machine learning

Frameworks that are used in industry are more often than not open source like we're looking at like pie torch tensor flow Etc and more often than not these of the I mean given the nature of it we could identify the source we could identify the parameters and in the end the second means is through simple good old try and error so you if you look at the AI model the solution as a black box what we will do as in typical software engineering Endeavor we will throw away with a bunch of data we see what the results are and then we make some um educated guesses as to what the hyper parameters are going to be so when it

comes to poisoning attacks so the poisoning attacks is a bit more um overarching so rather than just targeting one or the other in terms of the data or the model we could Target both we can poison both the um the data or the model specifically the hyper parameters of the AI models could be targeted and in a more extreme example we could also include what is called backdoor attacks so the back door attacks is when the attacker creates a corrupts the models the AI model or the data used in the training process in such a way that given a specific um class in the following on from the previous example it would be cat for

example so given a cat image of a cat we could craft the AI model to give a response like I don't know maybe a squirel you know so in this case so we could do that as well in the worst case scenario so here it is so when it comes to poisoning attacks so poisoning attacks are typically more prevailing in uh training models that feature distributed uh machine learning techniques I.E Federated learning so in a Federated learning environment each of these each of each participating users has their own set of local data the each of them train a local machine learning model they all pass it onto a global aggregator the aggregator creates a global model that is shared

across it so in order for Federated learning environment Federated learning based AI training to work everyone all the participating all the participating devices have to be um accurate have to be um in other words play by the play by the rules so to speak but more often than not doesn't really seem to be the case so this is where the poisoning attack comes into play and nowadays especially when it comes to like device based security more and more solutions we get to see more modern um detection Solutions using um some variation of Federated learning and model sharing and um threat threat intelligence sharing Etc so in typically for typically gets um exploited in use cases like this and next up is going to

be inference so inference is all about compromising the confidentiality of both the data and um both the data and the model so in this case inference is all about treating the AI model as a blackbox try to and see how we could and try to craft the um the features in such a way that the model ends up giving us um certain private uh privacy preserving information so information so it's typically inference um and it inference attack is all about trying to identify the details trying to get the AI model to disclose the information that is being used in order to in the detection of threats and malware features and I also said in this

slide that it is iterative in nature meaning what when I meant by is that it basically means it is a try and error so we craft the features we pass it onto the model see what comes back we do it we improve upon it again we get the response and we do it over and over again and here as far as here is concern just like with the previous steps we got this could be triggered often used together with gun based um gun or generative adversarial Network spased attack vectors and last but not least we have the good um prom injection so recently I F I saw this um there was a YouTube video by I believe is called

life overflow that provides a quick demonstration of a quick demonstration on how promp injection works so it was the target of the model was C gbt for so what he did was he showcased a series of um prom injection attacks that make that are crafted in such a way that it the ch BT ends up diverging certain um privacy sensitive information so which was quite cool to see and he also demonstrated a a a dojo of a playground excuse me that where people could um play around with promt injection so the promp injection as I alluded to earlier is all about how to so it's targeted against large scale uh large language models and it's all about trying to

craft the prompt in such a way that it bypasses the pre uh bypasses predefined um rules that the Creator or in this case open AI has set up yeah so in this case and so the question then becomes so here so so far we talk about what security analytics is and we also look at how the different types of AI based attacks that have been identify in existing research so the question then Becomes of what the different approaches that people have used in existing research in order to address that in order to detect them in order to get around that and more often than not based on based on what I found so far at least is that most of these

Solutions typically involve the use of clustering based approaches so clustering or adding a layer of additional um machine learning model in order to automate the detection of the process but even then despite the best efforts so current Solutions are quite Limited in terms of two areas one is in the coverage in the overall coverage of the um the AI based attacks and two in terms of the amount of training time and the performance overhead associated with using them so as far given the N no given the recent developments of AI based attacks so the solutions against them are still in the early stages as well but it's kind of good to see um what the future brings and before we

move uh before I end this talks what I like to share is I like to share with you um what the more recent framework that have that Mitra has created and made available so which is called the atlas model which is the adversor threat landscape for AI based system so these are the same so if you look at the different aspects of this framework you will realize that they are eily similar to the Metra attack framework that has been used to understand the behavior of AP attacks so in this case and that's no surprise because these are the same folks that have created the um the Metro attack framework so as you can see um

same steps as before we got the Recon we got initial access Etc but if you look at here but if you look at the different aspects of it more often than not it always comes down to two things so either the experation of data or the compromise of of the parameters and the data that training data rather so it's so it's still um it is quite interesting to see how um how the future what the future brings and the work that still needs to be done all right so as far as um the talks are concerned so here the talks is concerned that's pretty much it so here are the references to some of the papers that in my opinion um have U

are quite useful in order to understand the different types of um AI uh base attacks so thank

you any questions yes

uh yes um that's a good question so and actually I've seen it but I can't remember from the top of my head um precisely which research paper that was but there were there has been attempts that people have done it successfully yeah Round of Applause please thank you