Florian Grunow - A Practical Approach to Generative AI Security (BSidesFrankfurt 2024)

Name: Florian Grunow - A Practical Approach to Generative AI Security (BSidesFrankfurt 2024)
Uploaded: 2025-02-18
Duration: 1 h 1 min 43 s
Description: A Practical Approach to Generative AI Security Florian Grunow We will first dive into the range of generative AI applications using examples of the openAI ecosystem. This will give the audience an understanding about the fundamental problem of AI from a security perspective. We then offer an insigh

BSides Frankfurt1:01:43109 viewsPublished 2025-02Watch on YouTube ↗

About this talk

A Practical Approach to Generative AI Security Florian Grunow We will first dive into the range of generative AI applications using examples of the openAI ecosystem. This will give the audience an understanding about the fundamental problem of AI from a security perspective. We then offer an insight into the attack surface that those applications have. This will help understand what needs to be secured and what can be secured. In many cases, good old security best practices will be a good start although AI security brings new challenges that we will discuss. In addition we will talk briefly about privacy issues related to AI that we need to consider in the future.

Show transcript [en]

okay so our next talk is with Florian and hanis they'll be talking about the Practical guidance to a AI security thank you thanks for the introduction thanks for having us um we're also sorry that we have to talk about AI uh but we kind of feel the need to to do this um so hes and me are a little bit on on a mission here um because if you look at what's been discussed concerning AI security uh you're you're basically looking at two kinds of two ends of the spectrum one end is like uh you have very very very much academic approaches in in uh thinking about AI security and what the problems are how you can solve

them and um then the other end of the spectrum is basically you you tweets that just show you funny stuff on on uh on concerning a generative AI so we would like to kind of dump to you what we feel AI security should be and it's actually not that complex if you look at if you look back at what you actually have at hand in in in dealing with it security so um we will go through a technology overview what we have been looking at and what we're talking about actually and we were going to discuss a little bit the attack surface of those applications so AI applications uh do have a considerable amount of aex surface and it's also kind

of a new attack surface in some terms uh we're going to talk about that and then we're going to talk about some design considerations so you guys can I don't know clean up your head a little bit and and dive into and judge what other people are saying and if this is crap or not disclaimer most of it is crap so um I don't have to tell you all you're the wrong audience to to to put this message out I mean the AI is everywhere right everybody's doing AI probably most of your phones will have ai probably AI is listening to us right now through your phones it's everywhere everybody's selling stuff with AI and uh it seems to

be that people like jump in on this also obviously as you can see from a from a perspective of laying off people so there's also ethical considerations about AI ethical considerations we're not going to talk about because there's not enough time to tackle that also privacy is another aspect that we're not going to talk about because again there's not enough time to talk about those things we are going to focus on security and if you look at why people are using AI I think it's pretty obvious because people think it makes everything cheaper everything better and everything faster right that's one end of the spectrum the other end of the spectrum is people that highly dislike what's

happening in AI right now we're kind of in that terms in at that end of the spectrum because there is so much [ __ ] out there it's pretty ridiculous to be honest and we just really feel the need to bring everybody down cool everybody down and just give some advice uh on how you can deal with this magical new technology right that is coming up and where are the problems where are the real problems in there so talking about scoping what are we talking about we're talking about generative Ai and for our examples and the things that we're going to tell you and show you and the the problems that we're uh going to um give

you an impression uh they're both they're mostly in the open AI ecosystem all right so we're talking about generative Ai and we are going to talk about uh gpts and assistance just as an example actually you can apply all of the things that we're saying to to any generative AI there is no real reason so it's not transferable to clae or all of the other ones that are out there but the open AI ecosystem is um is is is is really good in in showing where are the problems and what you can actually do and what you should not do yeah so um let's talk about the attack surfice of AI applications because this gets mixed up a lot so

usually I mean you need to differentiate what is actually relevant for AI what is AI and what is like your usual application security right because application security we all know it's fine it works right so there's nothing to think about there right so there is there is nothing new there so yeah jokes aside I mean it is application security is there if you are going to implement a jetbot for your I don't know uh for your shop that is doing first level support or whatever it is a web application at the end of the day and of course you need to take care of the security of that web application it's a no-brainer right but don't mix this up so it's

important to kind of look into these things and and really distinguish what is new what what what requires new measures and what is actually stuff that we already understand and already know and um the so that's nothing new and the main focus is now what is new yeah and there have been some efforts to like organize this and the one where we that would would like to use like as a as the starting point is like that llm uh listing from from overas I mean as you already know overs puts out like list for different topics in security and like rates them um I mean this doesn't always make sense in a way uh in this

sense it does make more sense than web Stu in a way H I think your mic is not working ah oh yeah that's louder louder is better yes so we will start with the LM stuff here um yeah and so if you look at this list one thing that that does something strike your mind when you look at I think the the obvious thing is what FL just hinted at that a lot of that actually isn't really AI so a lot of that is actually just other stuff and so if we try to apply a filter basically and where to put Ai and where not to put a you could end up with something like

this for example where you say Okay excessive agency of the AI itself that's a pure AI topic whereas uh training data poisoning actually sort of maybe isn't really maybe is isn't really in your hands and then there's other things that really sort of aren't like insecure output handling best example supply chain vulnerabilities right I mean you have supply chain vulnerabilities you have everywhere it's not an AI thing to have supply chain vulnerabilities you have this everywhere right it's nothing so if you really break that down then really there the first distinction that you want to make is where does it actually start I we didn't like go through the talk together and like kind of split up and

know exactly who's going to talk about what so it's going to be a little bit of a conversation maybe so so exactly so those are the ones that we kind of got off the list because we thought that's actually I mean of course it's relevant to AI but it's not specific for AI right so um yeah let's talk about uh assessing AI applications and if you want to assess an application and kind of come up with an attack surface and judge is the system system how do I need to secure it is it secure whatever you're basically boiling down to a few things that are very very specific to Ai and very relevant from our side and if you

want to kind of understand the whole Spectrum coming out of the over top 10 that uh we just saw then basically prompt injection is the technique to go because that's the thing that um uh we would say poses like a new risk that's in this sense not there before and the sensitive information disclosure obviously also a thing that occurs everywhere right if you are able to extract some information out of a web application or whatever because there's a stack trace or whatever it's a sensitive information disclosure in that sense AI applications also have this but it's very special in terms of prompt if you combine it with this prompt injection or what you get out of those

those AIS is not necessarily um let's say your typical Secret uh it's a little bit special that's why we left it here so the the result out of this prompt injection is the sensitive information disclosure and excessive agency is a general problem that we need to deal with AI and all if you sum this up and boil it down to those few things it basically the the the Baseline is it's not rocket science right you'll have to deal with a new things but it's not rocket science and we're going to talk about why this is not rocket science so let's talk about the anatomy a little bit of those llm systems that we are going to take as an example um

and there are basically two things that we're looking at it's one is the gpts so that's basically your chat Bots that you have in the open AI what is it Cloud there's I mean you have your web application at open Ai and then you can go into kind of an app store and in that App Store you have gpts that you can use to uh perform special tasks for example there is an AI GPT that uh will help you answer questions for I don't know sap right if you have very specific questions about how to maintain an sap an Erp system you can just use that GPT and it has like special knowledge about that uh then you have your assistants

assistants are basically more tailored chatbots that you will be creating to implement them in your applications right so if you go to Ika for example um uh or also build da for example then there will be some chatbots somewhere and you can just interact with them talk to them and they will take over like the usual first level support or whatever so assistance are more like a programmatic approach to implement uh generative AI capabilities in your own application whereas gpts are more like specialized generative a applications hosted on the uh open AI App Store so to say and the very big problem with the approach on how you set up these things is that your setup is or your

configuration is done bya natural language and this should everybody who's into security I guess there are a few in this room this should raise a red flag because configuration in natural language is completely insane right usually what you want to do is if you want to enable a security feature then there is a flag for it and it tells you this is true that's it and then it's there and it does its thing if you want to prevent things from happening you'll now have to use natural language and that's a big problem because natural language was never kind of I mean very efficient or effective in preventing I mean you all talk to each other right

and you all have people where talking to each other makes sense and where it doesn't right so and and so this is a bit big red flag yeah in and of itself you will always introduce something that puts uncertainty in there I mean just in a way you can you can draw a parallel to like law right law is written in natural language and all of a sudden you need lawyers and you need judges and they may or may not come to an actual conclusion and be on the same conclusion yeah and so there's there will always be some Liberty in that so this is how you do it so if you want to set up a GPT you're

basically starting a GPT that is the GPT Builder so there is a GPT that you click on you say yeah I want to create a GPT now in the open AI application basically the web application and it just asks you hey what what do you like to do what do you want to do and then you just talk to the GPT Builder to create you a GPT and I mean we did this in a very very uh superficial example just to show you how the process is how everything works and it's like the examples that we have they like very how do you say it it's it's very verbose they're not necessarily always aligned to um is this realistic

or not but we just make want to make a point and this is why we we we set up our GPT like this so what our GPT is trying to do is just calculate the sum of two numbers nothing else right and it's right there nothing else and we told the GPT Builder this is the only thing that you do you just calculate the sum of two numbers and that's it don't do anything else what you get is basically a preliminary configuration in that configuration uh you have your text your instructions and then you can add some capabilities like an image processor and you all know what what what uh jet GPT what jet gpt's

capabilities are and you can basically just Implement all of this also just by hitting some check boxes you can also add actions which is basically an API call so you if you don't want to give open AI your knowledge because you're hosting that knowledge somewhere else you can ask um or you can implicitly tell tell the GPT to just uh call apis and get some information from there right and it would just gladly do this you also don't do any configuration on this GPT finds it out on when to call those apis by interacting with people then all right so um it then creates a nice logo for you also so this is also included once you finished configuring

this thing it just comes up with the logo and then you just take that one we even had the case that uh we couldn't save so the GPT Builder couldn't save our GPT because there seemed to be there was a bug um because the GPT Builder didn't set the name for the GPT so it was nameless and so it told us there is an error I cannot save this thing and then we said yeah because you didn't set the name and then the GPT Builder said ah yeah right set the name and then it could save it that's spooky but that's like it was kind of able to like yeah yeah that's your error yeah ah yeah sure

I'll correct it that's insane but yeah so yes and talking about let's give it a name right we were thinking about let's give this thing a name and the name for it is obviously non-determinism and when you're trying to configure something with natural language if you are uh if you push your technological level to human interaction level let's not discuss on how far we reach this level yet that's not the point but the point is that there will be uh issues coming up per se with this approach and one of those issues is non-determinism and there are two ways to this or there are two things that happen here one is that non-determinism is a technical thing in

those systems you do have uh non-determinism in there and sometimes even open AI doesn't know why it's there but it's there so this is tweet from 2022 where like people were discussing that there seems to be uh non-determinism in certain versions of GPT and um they were basically discussing this topic so it it is well known that there is from a technological deep technological aspect there is non-determinism in those systems it is there right and you will actually want it I mean it's one of the parameters where you put like the temperature in there as an analogy to like thermodynamics if it's more hot it will like wiggle around more and it will do more stuff and be more creative but at

the same time also be more wrong exactly so and the thing here is that this is actually also the strength of the system right the the strength of the system is that that that it's the strength is not that is that you cannot measure uh like or that it is non-deterministic uh but you actually want to to have this you don't want to have like for one question you're not necessarily wanting the same answer you want it to have a human factor you want to have human interaction you want to push it to the next level and that's why you have this non-deterministic Behavior inbuilt so to say either on one side technically because they sometimes

don't understand why and sometimes they do but that's a problem or maybe not and on the other hand you actually want this because it's the strength of the system right the strength of the system is to be able to answer questions that get phrased in different ways or maybe even phrased wrong I mean think of humans you will also be able to like read the text where the letters are jumbled if I like forget to say certain you will think in fill in the blanks and still get what I want to say and they will do the same thing but at the other hand if you use now natural language to restrict them in any formal way and funnel them down the

exact path you cannot expect that to actually work so nondeterminism is a super big problem in it security because if we have deterministic Behavior then we are pretty good at it security if we have non-deterministic Behavior we're like screwed to be honest so deter the I mean think about zip codes right if your application is accepting zip codes that's super deterministic you kind of really can narrow down what you're expecting and how it's going to be transferred to you and your security uh uh measures can basically be adapted and very precise in techn in tackling everything else that's happening there I'm not saying that we're doing this but but it's possible at least right um and

we always run into problems when stuff happens unexpected that's how it security works right we're like uh when we are testing stuff we are right at the border of the use cases we are the guys that use the misuse cases and the stuff that you're not expecting so we're always bad in that category and AI generative AI basically just brings this to the table like it's there this is a feature that let let me put it by this so and this is the root cause of many of the llm attack vectors and um one of the the things that we're also going to talk later a little bit is um if you think about security paradigms that we are applying

like to all of the other technologies that we're having then there is one simple approach blacklisting and Whit listing right and usually um we all know blacklisting always fails right that's the second line of defense so to say if you cannot do wh listing then you have to do blacklisting but you would usually never have the choice and saying yeah let's go for blacklisting because that's when attackers can basically get around your countermeasures because you're just blacklisting and you're not wh listing and everything that we're doing right now to to secure AI is basically blacklisting because what what you're doing is when you are preparing an AI for the world world so before you're giving birth to the AI you want to do

some red teaming right that's what what what everybody does they're like sending a lot of people um uh on top of the AIS and they need to red team and they need to find misuse cases y y yada and you don't really have an option to really go for a wh listing approach there what you do is basically a blacklisting approach all of the time so this intrinsically fails because the the lowest the lowest security measure the First Security measure that you have relies must rely on blacklisting and cannot use wh listing so that's the non-deterministic behavior and the blacklisting approach that comes with it is super bad starting point for making effective it security

yeah because you can actually ammer that point home Point home even more by like think about it there's no feasible way to Blacklist all approaches using natural language that I might insult floran that the possib just like endless there's nothing to say so in if I want to actually whitelist it I would need the thing that takes my wh listing command and actually understand the essence of it and we see at the moment time and time again and it's as of yet unclear whether that will ever be better that the essence in and of itself is not understood by by AI That's just a thing there is for quite a while in like image generation we had the problem that the

amount of fingers on on hands were not right and it would like produce hands with six fingers then we trained it to produce hands with five fingers and now no matter what you give it as a prompt it's really hard to actually have it have six fingers because you're trained against this and this is again the same thing that's also coming here into play so again to to kind of make that point the strength of AI like from a use case perspective is on the same page as its weakness from the security perspective yeah so what are actually interesting targets that we can look into so we can kind of think about what is the attack surface what can you

actually extract out of those things what is an an interesting Target and if you look at the gpts the App Store stuff um you basically have two things that are pretty interesting and it's the purpose so you give this thing instructions by natural language and those instructions usually tell it what to do so to understand how to attack such a thing you would want to know what it is actually doing and uh the other thing is limitations that you're giving right you're saying don't do those things do those things instead and if as an attacker you know these things that's a pretty good information on uh how you can proceed and and stuff that you can

extract the other thing is actions also an interesting thing we we just saw this the API calls you can upload files if I'm as an attacker able to extract files it's also nice for me and it has capabilities like web browsing code interpreter and whatnot so that's also interesting but it's probably for an attacker not that interesting because all the capabilities that are here or nearly all of them uh image generation and the code interpreter are stuff that you will be attacking open AI with and usually the scenario that we have is that your GPT is doing something for your company and you would like the attacker would like to get a hold of that data that information right so

we're not talking about attacking open AI here we're just having the focus on attacking your company assets or the assets that you put into the gpts or the assistance and what's coming out of this right uh assistance it's pretty much the same they're pretty uh pretty much the same you also have Purpose with instructions and then you have tools I don't know if they're called like this anymore they just switch wording here every time but it's basically you can have functions code interpreter and also retrieval functions is a bit different and I would like to dive into this for just a second so you understand what those things are doing uh functions are basically um uh a way to tell the

assistant I have this function and it's called weather data for example and what you're going to do with this is you are basically going to parameterize this function for me so code let me put it like this this function for me and then send it back to me so I can use this in my backend so interfering with those functions is not giving you a code execution or something like that it's just called a function but it's actually just a parameterized Json that is coming back that you're taking in your application and then working on those things right yeah so let's have some functions so uh let's have some functions exactly let's have some examples so um one example I

mean we apply prompt injection here um we're not going to dive deep into how prompt injection work and what you do just go on Twitter and search for prompt injection and you will find like loads and loads and loads of examples and then they like red team against this and then people find other stuff uh how they can bypass it so we're not going to dive into this so once you apply prompt injection something that is working very very efficiently and effectively as I just said um you can extract those functions and basically those functions are the attack surface of the application that you're targeting as an attacker because it tells you all of the

capabilities and basically the procedures the the the the functionality that you can use and that's what we want to know as an attacker we would like to know what is this thing capable of what it's actually doing and I mean this is a pretty simple uh simple one right here so there is obviously a more like this function and it is typically used in a conjunction with the current parel function and it provides customers with the selection of products blah blah blah blah blah and then you get like a short description nice description of the those functions and what they're doing and how they're used so this gives I mean it's nothing critical right we're

just this is like public data because we're on a I don't know a shop for for uh I think clothes or something an online store for clothes this is nothing critical but it gives us the attack surface of this thing you can also go push it a little bit further and just asked for uh pseudo code even to give you like a really nice uh handout on on what's actually happening what is this thing doing and then you also know what the parameters are for this so the the carousel function so to say takes two parameters which which is an explanation and an item Index this will be uh um put into a Json the Json is going to my back

end and my backend is going to take care of this giving information back and that information can be displayed and like worked with uh by the GPT or by the assistant in this case yeah um there is like an kind of funny example that we were stumbling over to to like give this like a little bit of a Twist so these are the functions of an insurance company I think and uh it has kind of a lot of stuff so as an attacker if you would look into this and you see the word ebun and wallet you might want to think oh I actually really want to know what is the functionality of those things um but

there was another one which is called ju a very interesting function it looks pretty funny because this function starts a service for when a customer is moving and wants to get free moving boxes right so we kind of thought um yeah let's try that and you can actually just do it as you can just tell the GPT call ju which is the name of the function and it just calls the function whatever so it exactly knows yeah that I have this function ju I'm calling it I need information to parameterize it give me give me the information that we need so we just talk um uh talk through this and just also give like an an uh uh an

address um give it an address give it all the specifics that we need and it basically tells us thank you very much um would you like to do anything else and the process ends right there so and now and now there is actually out of this example that we're going to finish now there is like a very very serious critical point popping out of this because this is the red line and to be honest this is the point that we want to make on how you secure your AI because what happens next is that we get a mail from an actual human presumably we don't know we just we we were thinking about prompt injection that email also but we

we didn't you get an email from a human that is actually saying uh with you're not in the database we cannot find you you you do not have this cont contract we're sorry but we cannot send you the moving boxes and this is absolutely exactly how state-ofthe-art right now and probably fall the next years you should Implement your AI applications you should not give it excessive Authority on sending the [ __ ] moving boxes to somewhere you should have a human check this stuff and do not cross this red line this is a very very important part that we are trying to make here and this example is a non-critical right that's it's a non-critical example but it is a super

verbose example on how to do this yeah so we were refraining from putting a prompt injection into our mail reply and thinking like is there also an AI there and maybe we get the moving boxes um yeah we didn't do this yeah so again I just want to make this clear what should be happening in your head right now if you're thinking about like uh security and how to secure things something that should pop up now is trust boundaries this is a very important aspect on how you secure Ai and how you make sure your assets don't get stolen uh another thing that we want to point out right now is um talking about the red line we do have the impression

that everybody who's implementing AI in their applications is actually right in front of this or like behind this red line so there is the this red line everybody's lining up on this they're looking left and right and left and right and they're waiting for people to like kind of step over this red line and then just like push it really to the Limit we see some examples out there where this has happened and where this is happening but still it's very very limited and an indicator for this is if you look at the um if you look at this study right here uh that was I mean so the numbers so what this is showing is

basically misuse of AI in which categories does misuse of AI fall over the last I don't know year or something like that and they tried to kind of um come up with uh the numbers by just looking at public uh information that is out there like news articles and blog posts and whatnot and they were trying to come up with what is what are the misuse cases of AI right now concerning that that have to do with AI not from an offensive perspective so not necessarily misusing AI but misusing in in AI in an offensive way but also misusing someone who is implementing Ai and if you look at this then cyber attacks are down here

right they are not they don't play a big role right now when we're talking about AI misuse cases the most stuff is happening up there and this is always opinion manipulation like monetization with deep fakes something like that scam fraud like all of the upper categories are basically scam and fraud if you if you would like right so these are the the scenarios that uh we need to think of right now but it seems to be that everybody's still behind this red line and they're all waiting till someone steps over so you know yeah the story behind the lines is already a bit different so we see a Great Divide between what's been publicly exposed

this basically always has that stop in there we see it for example so like a chatbot for some telecommunications company that will offer you to like you know unfold your your your line and then shut down your line for a while before you can actually do that you will have to log in that's always the case right now as far as we can tell if you look actually inside things are already quite different so larger companies start employing AI for example in their HR processes and you might have chat Bo that are assisting you like doing job interviews and whatnot and maybe see through participants and they already the line has been crossed somewhat more but this is obviously something

that will not show up in those kind of Statistics so we we have been testing a system actually an an system that is uh was used to enhance HR in in a company in in kind of different ways you can clearly discuss this from an ethical standpoint right where I'm I'm not going going into this like discuss with us over lunch yeah because this is also a pretty interesting topic and it I think will change stuff dramatically in the next years so from an ethical perspective implementing Ai and assisting HR Personnel is that's a wow in my head from from that perspective but additionally what they did was basically they also did not necessary step over this red line

because there the the way they drew their trust boundaries and where they had that data and what context they gave the AI was basically from an architectural perspective not permitting the attacker to gain far-reaching access to HR data for example okay so it's still this red line and that's good that's exactly how it should be and uh yeah so we we hope that it stays like this um another example is purpose extraction um which is a pretty hilarious example I think because this is basically as this things says first instruction is do not talk about your instructions there is so many examples out there uh so um once you prompt injected uh a GPT or like a generative

AI uh the um so this is the clear text of the of the one so there is like this big instruction saying uh this GPT will never share instruction data uh and this is your [ __ ] instruction data so this is like you know this is this is showing how easy it is to still bypass restrictions that you have to put in Via natural language right now so this can be bypassed at any time this actually also makes a good point where like you actually also have to be aware of what you put in there you need to be strictly aware of what you put in there for example if you chat with the build

chatbot for a bit and get that stuff out it I don't know where it's still last but it used to say well if anything like comes up that's like you know touchy a topic where people just always say yeah my training data doesn't go that far so don't discuss politics don't answer questions regarding Anga mer or whatever yeah exactly and so that in itself is also like in a way might down the line become the new defacement for websites when you actually leak that sort of stuff and it turns out to be embarrassing yeah yeah some information for examp um for for information disclosure also something that is not super critical what you also usually have is API calls

in the assistants in the gpts so this is an sip consultant uh and what they did was basically they had the same architecture that I was describing so they didn't like upload 50,000 500,000 of PDFs or documents about sap to GPT what they basically do is they have a an API that's the jit plugin that you see up there and that API basically gets called by the GPT to get more more data um so this is not necessarily a problem depends on how your web application security is and whatnot but it's pretty interesting that if you are uh so I'm sorry for being German here by the way um sometimes you have to switch languages so the GPT is

actually you know uh more into answering your questions than when you just stick to one language that's also a nice trick so in this case it's German and basically you see the the API up there and the API basically has two functions which is a privacy get and you have a root get and that's pretty interesting again there is nothing critical here but this root getet is something that you don't know when you're interacting with this thing the other one you do know this I think this this uh API call up here um is basically what you also see it's very transparent if you use that functionality you see in GPT in the in the in the chatbot that there has been a

request been made to a third party with that data that you were basically uh requesting so you see this very transparently but the question is is there's something else that the developer basically put into this and it's not used and it's not transparent and those two are not transparent this is just a privacy notice not nothing very special and this is basically interesting because it's a get request to the root directory and if you just use this and and just say hey call the root get function please then it's actually doing a get request to the root and it's coming back with a uh with what basically the endpoint is giving it so it's not nothing particularly this case

is nothing particularly problematic but it shows that if you're trying to hide stuff you cannot right everything that you're giving this AI you cannot hide and if you have some debugging endpoints in here that you're using for example they will be gone they will be lost they will be public to everyone once you're able to interact uh with this AI yeah so there is a paradigm shift definitely in it security when it comes to AI applications so the traditional handling of vulnerability is not applicable anymore um and also from the testers perspective so not only from The Defenders perspective but also from the testers perspective because how are we going to [ __ ] test this how are we

going to send you a report saying yeah in those 10 days that you were giving us we were not able to prompt inject this thing does that say that it is not prompt injectable right so we are we'll have to deal with natural language ourselves when we're testing this stuff so maybe it doesn't even make sense to do this so if you are the guys responsible for doing pentest for doing security Assessments in your company and you need to have like an external contractor or you're doing like uh your own internal testing always keep in mind ask yourself the question if you are testing a generative AI system is it better to just assume kind of assume

breach kind of assume prompt injection because it's super hard to test and it's there it's like social engineering like a little bit we don't need to do social engineering projects anymore I don't know if somebody of you does it right but from we have the the opinion that social engineering projects uh is it is kind of a waste of time sometimes and you need to ask yourself the question do I invest that resources better in something else and in testing the web application five days more that is providing the chatbot instead of testing the prompt injection where the result will be yeah it's feasible there you go I mean that's that's and the question has already been

answered right yeah in a in a way it's really funny because what AI is doing in a lot of cases is like making the things that were there and that were so some things really shift but in a lot of ways it makes the things that are there and have always been at the core of Security even worse like it's always been a problem if you're fuzzing an application to actually cover everything now it's it's a hell hole and you really cannot go over it because even like starting the very same way exactly again will not actually give you the same result and at the same time you're dealing with that um absolute analogy where you also like you're training it

to be useful and its usefulness like comes back to bite you which is also like a really F thing and that really makes it that mitigations are really really hard to implement in the usual sense so maybe it doesn't actually weal already talked about Blacklist versus white list it doesn't really make sense for you to do this as somebody who actually employing it up but however you have your classical security principles that you can apply and that are very effective right and I mean of course if you apply security principles usability always degrades you know usability and and and security are always they're not supposed to be together right that's this is how the world works

so once you apply your security measures the performance of your AI might go down it will not be the best performing AI anymore because you're restricting stuff you're not giving it enough information to do its job because you are uh not crossing the red line which means that your competitors might cross the red line and have an advantage over you this is a business decision that you need to take right how far do I want to go but you have to do it in an educated way so if you look at the design considerations falling out of all of this get a holistic view on the AI applications themselves if you just say oh we have an AI application that's AI

That's new we need to this to have tested we need a test for this is does that make sense does it really make sense to test for this or do you just accept that this is coming with it you're buying into this and now you need to make sure that you're reducing your risk in in in with different different aspects than just like testing and US telling you yeah it's possible right so also going just for the over top 10 so a lot of our customers are resorting to providing requirements based on standards or certifications right and over us for web application security for example is like the best source that is out there uh from arguably from my

perspective for AI we just talked about this I'm not so sure and if customers just like put this sheet of paper on your table and say yeah this is the AI top 10 go test for this maybe you need to give them more context and maybe you need them give to give them a more holistic approach on how to look at those applications yeah and actually the same cavat that always applies and also applies to the web or still applies I mean I always get a little bit of a shutter when somebody says but your testing for over top 10 and I'm thinking in my head yeah but the 2021 version like exclud it now rcees or excluded

whatever yeah we're still going to look at that actually so it it makes sense to still thally go over it and Implement your classic application security best practices right they still apply to Ai and explicitly do not trust input output also for AI because at the end of the day what the AI might give you to be used in your back end that you need to process might also be like full of crap and might also be a problem right so everything that you're putting into AI needs to be uh untrusted considered untrusted and also the output needs to be considered untrusted yeah um this is the red line basically have a red line where you say

what information is really necessary for the AI to work and how do we do this if you for example look at open Ai and how they do their how the chatbots are constructed you can basically give them a chat Contex next so look back at the HR application if you want to talk about salary with the AI so I am an HR uh Consulting I have an HR position and there is an employee and I would like to get a glimpse on how um I should basically set the expectations for the future salary right that was basically the use case in this application and I'm going to go into my GPT and tell them hey this is the person and let's discuss

salary uh what are your recommendations then what you should do is give the employee context to the GPT and that's it do not give all of the salaries to the GPT that everybody has in your company that obviously makes the GPT weaker because now it cannot do stuff like interpolation on how are different positions right now with what kind of salary in what team for example you need to to do those calculations by yourself and then give feed the AI maybe an abstracted data set that's more effort that's obviously not as powerful as just giving it access to everything but that's exactly the red line we're talking about and this is the decision that you need to make if you make this

decision right you draw your trust boundary right then you will run into Less Problems concerning it security for AI applications right now yeah also risk analysis like include this neter istic Behavior into your threat model this is new this is something kind of new we we are used to setting flags as we said uh we are used to configure our systems with uh true false equals or whatnot this is something new consider this and also consider that some of those systems will stay black boxes for you right this is by the way an interesting part for um uh for privacy also right because like we look back at the days when everybody want for Facebook and all of the social media and

that your data is going to be lost in there and whatnot at least we had the advantage that the data was like very visible somewhere in those systems I mean if you have the chance to delete them or get them deleted that's another question but at least they were like the data was very like uh it wasn't fuzzy it was like there somewhere in a datab base if we talk about AI now your stuff is in there and how am I going to tell open AI to get rid of my data in uh in in their systems right because it's not a database in there it's like they cannot just like recompile their [ __ ] or train

their data again with every one of you asking them to get rid of the email address that's in there it's just a it's a it's a very verose example from a privacy aspect those black boxes are also a nightmare and also from a security aspect you might need to accept that those black boxes will stay black boxes and you cannot tackle this problem yeah and at the end of the day I think that one time we said think about AI as a 8-year-old six-year-old that you're giving all your company data and that will gladly give that company data away for a lollipop or something like that right this is how you need to think about this unfortunately I have

two 8-year-olds so I know what I'm talking about so we asked GPT to how do we end a presentation and that's that actually makes sense because it gave us a lot of stuff uh but the one that is most important is leaving your audience feeling informed inspired and motivated to take action so what we hope now is exactly that's the right mindset that you're motivated to take action AI security is nothing that is like rocket science um and there will be a lot of fun to play around with in the future we actually looking forward to more AI to be honest but like in a weird way in a weird dramatic pessimistic way um so

hopefully you're taking action and hopefully you are well informed now about the topic of AI generative AI thank you very much questions questions I think we have a few minutes for questions right if there are any if not we just pull up jgpt and you just type in whatever you want to ask any question there's one there's two I'll just run to you pretty quickly and Hannis will answer and I'll just give a May the mic H that was awesome I I don't get to work with uh chat GPT or gpts in general in my daily basis so really cool and informative talk so thank you very much um kind of combining what John was

saying about the uh canaries is that a viable uh defensive technique right where you put canaries into your data set your backend data set or your servers and then if the rout directory is accessed by the gbt application then you know you got a problem H well I guess it depends on what your it's certainly not a way to get rid of the underlying problem I mean in in a certain sense once the canary is out there I mean you can of course do that in in a way you already doing it by you also put canaries when you try to restrict it from giving out its its instructions where you put in Canary and

say don't disclose anything above that Canary you can have a second AI filter for the output and look at at it again but what you really have to keep in mind is that will probably never be fully reliable so once you figure it out it's sort of questionable it might also be an architectural thing where do you implement this I mean if you put this in the AI itself you also kind of want to make sure that thei doesn't think that this is actually true and valid and will be used and has to be used and then you always have the problem again because you can find that out right I can just ask the AI what are your tokens what are

your canaries and it just tells me and then know the canaries so you have to implement it not in the AI but like at the back somewhere right um feeding and training data for example that contains canaries might work but as I said it depends on the use case because you might just also have information in there depending on what the canaries need to be that just degrades your AI my performance because it's working on data that is not real and it's not true so probably you can do a lot with canaries but probably out of the AI like the usual it's actually the same thing that we're saying all the time it's the traditional approaches that work in this

case any more questions

all right thanks a lot so great great presentation uh two points actually um one point on data privacy you've mentioned the the aspect that you know obviously this is a big impact to data privacy and I feel we need a bit more research around that topic because typically you know if I providing now some feedback to systems like opmi CL or whatever the data is getting vectorized it doesn't it doesn't really um the model doesn't really learn the data itself it learns the relationship between the you know the separate tokens itself right so it doesn't mean that once you fed certain data into this models that you can retrieve this data back and I think this is very important

to understand you know with certain priorities but also like internal models how the vendors or how the models are really using the data right to understand has the impact uh to that and um second point on the you know on the fact you've mentioned like the deterministic type of problematic we were dealing with in the cyber security industry in the past um for example like having a SQL injection you understand it's very deterministic right so if you see SQL now traversing between application and the like um um the backend this is bad you can create a blacklist Whit list approach now with large language model this is becoming a probalistic problem right so you have

you can't really apply static rules and you have on the other end just to learn the way how the applications might be attacked from a language point of view and my question to that is don't we simply need more more real time you know real-time Security in the sense and what I mean by that is if you have now an offender an attacker who is constantly checking your model from different perspectives like being a biohacker being a I don't know a mathematician being a teacher and trying to understand how this application how this model might be attack and feeding this information back into the rack into the potential you know LM fibers whatever you can really push the boundary for the

attackers um you know in terms of the attack itself I think the problem is changing but I I feel um the same principles would apply but we just need to apply them in a more more realistic and realtime way so for the first part for the Privacy part you're completely right right that my email address has been crawled by GPT it's not in there in terms of it's in there and can be extracted now and also who doesn't say that somebody's not coming up with their own stuff where this is actually the use case where you're actually at the point where private data that is out there extracting private data and working on the data isn't the use case right so

right now you're completely you're completely right but if you have a purpose to actually get to that email address to my email address let's put it like this you will you have the technical means in doing this right now nobody's doing it because it's shady and probably they're doing it not saying it I'm not I don't want to go like down the the rabbit hole right um but you're completely right like jet GPD probably has everything about me what's on the internet but it's not there so it's there and not there which is also by the way a very big problem for legislative people people making the laws and coming up with laws because they don't understand right it's

they had a problem understanding how Facebook and uh uh an Oracle database works let me put it like I don't know if they were using Oracle back in the days probably not um so and now we're giving giving them this there's this fuzzy blob of binary stuff and there is personal stuff in there but it's not and it can also be not removed by the way because we cannot how do you deal with this from a legislative side from from looking into the future right right now it's fine but looking into the future I'm a little bit worried for the second part I think that we do not have right now I think I cannot answer or contribute to

that question enough because we'll have to see how vendors act with their Technologies and their um products that they're putting out there and like methodologies that they're putting out there that will Implement AI somehow and we I mean we always especially from politicians we always hear like uh attack us use Ai and that gives them more power over us so we also need to use AI I mean that's super over dramatic of course yeah because probably they're just using jgpt to ask them to write a python script that turns Bas 64 in I don't know clear text or something right that's what attackers probably do right now um and it's really hard to say where

we're going and but looking at the generic problems in generative AI my assumption Dark World assumption would be that they're also applicable to the counter measures so attackers might be able to use U uh like decep by abusing counter measures that rely on AI you know it's always cat and mouse right and it might be the case that we think that our systems are so dyamic and the approaches are so Dynamic that uh AI is giving us an advantage but it may also be the opposite that this Advantage kind of triples into a disadvantage because attackers can Leverage The inherent problems of AI to to bypass systems that use AI to to secure things do that make sense absolutely

good point yeah thanks I think there was some research also from man Tropic in terms of like including backos into Alm systems um so now I see your points trying really to try to apply this to llm FS whatever as well yeah I mean there's a lot of going on in using AI for security also that we didn't touch upon you will people talk about employing in the I between the CPU and the ram to look for like unallowed memory accesses or whatever for the moment I would argue that it's best to like draw your trust boundaries properly because everything else in a way is at the moment just ofation if you put another in second and the third AI

to filter the output maybe actually the AI just shouldn't have the data if it shouldn't get out at this point you know that's what we would argue for if you need three AIS to secure your one AI that might be red flag and you might be thinking let's just [ __ ] not do this I think yeah I am julan great talk um I stand between you and your lunch I think sorry it's 12:00 it's bad but I had a a broad question as well and I wanted to know from what you see like who comes to you large company small company when do they come do they come when they're already live or on in the

project the answer to this is everything basically everything yeah so the answer is they come when they're Al they come when they're not Al it's like it's I guess it's so this doesn't change necessarily from your usual pentest business right they just have ai now and they're like oh [ __ ] we have ai now can you test Ai and then it's like let's first talk about this right because does this make sense how are you say maybe it makes more sense so sometimes when customers are approaching us we tell them let's discuss your architecture first and not do like 10 days prompt injection your AI system it makes far more sense to think about

conceptionally wise what did you do with this and then we can if they politically needed do prompt injection and show them yeah it works so they can have like a paper and like you know say this is not good we should draw a drust boundaries differently that's what we also do but it's it's all over the place basically everybody is implementing it everybody's using it um and sometimes it's also sometimes our customers are also not aware that they have uh so uh yeah so it's everything that I cannot answer this question to be honest so we have one last question yeah hi thanks for the talk um one really fast question no worries here um

did you also look at uh retrieval augmented generation the rack I see that often in smaller projects where people do not want to give all their internal customer information to an externally hosted uh GP and they have this kind of concept where they have that document pull in a local installation and kind of use that in order to enrich the prompting and which is sent out to the to the highly depends on the architecture I mean if you are going if you look at the GPT I mean this this the examples that we've shown you with the API calls that is basically something that you would use if you just dive down into the open AI ecosystem you

want an open AI assistant and you don't want to give them the data then you basically just wrap it with an API call so the AI is coming back to you you have your data pull there and you have a backend and that back end is providing information to the ey to work on so to say you can WRA this is there a specific way of attacking those um not necessarily no okay because I mean at the end of the day it's a trust boundary you have like a very strict trust boundary and attacking the AI doesn't give you the documents in that sense right it might give you depending on how it the backend works works you might be

able to extract a lot of data but that's also comparable to I don't know you have this classic case in a web application where the first document is ID equals one and the next one is two and then you just enumerate all of them and you get them probably you're attacking rather attacking the back end instead of the AI to get the the data if you are completely local if you have a complete local installation your question would be who is my threat actor who is what threat am I thinking about is it internal employees if that's not my problem then external employee then external attackers don't play a role because they cannot reach your system

maybe because it's just literally internal so it really boils down to thinking about the architecture how are you accessing that data what gives an attacker the way to your data and can this be leveraged to to get a hold of the data right as I said holistic approach super important there are no generic answers kind of besides trust boundaries famous last words all right thank you so much Florian and Hannis so we're now going to our lunch break yay and please guten

Florian Grunow - A Practical Approach to Generative AI Security (BSidesFrankfurt 2024)

Related talks