
Protecting AI Systems Against Old and New Threats

BSides Munich 2025 · 31:49 · 34 views · Published 2026-02 · Watch on YouTube ↗
About this talk
A practical exploration of securing AI systems through traditional and emerging threat-modeling approaches. The speakers demonstrate how conventional security measures—access control, input validation, rate limiting—effectively mitigate both AI-specific risks (prompt injection, adversarial attacks) and legacy threats (supply chain vulnerabilities, information disclosure). Drawing on a real customer engagement, they map OWASP Top 10 for LLMs against EU AI Act compliance requirements and show that robust security hygiene remains foundational for AI deployments.
Transcript [en]

Thanks a lot, and thanks for the introduction. Yes, we are going to talk about the tip of the iceberg: protecting AI systems against old and new threats. We have all these fancy new AI systems, but what can we do with traditional protection measures to protect these new kinds of systems that are popping up everywhere and taking a larger and larger place in our organizations and our IT infrastructure? We will first look at AI risks: what is the risk landscape around AI, what can happen, what can go wrong? We will also touch on the regulation side, in particular the EU AI Act. Then we move on to a more

practical example, a threat model that we actually did with one of our customers. And then we will reflect a bit on the learnings from this: what role do AI-specific threats play, and what role do more traditional threats play in protecting these kinds of systems? So let's start at the beginning, with the risks. There was a nice quote from Meredith Whittaker, the president of the Signal Foundation, who said the marketing, the value add, goes something like this: AI can look up a concert, block it in your calendar, book tickets, and message all your friends that it's booked. So you can put your brain in a jar. Sounds very

nice, but what would it actually need to do that? If we think about these examples, it sounds very convenient: AI can take over all of our tasks, it's super practical. But what does it need to do that? What kind of access does it actually need to fulfill these kinds of tasks? And it turns out it's a lot of stuff the AI would need: access to a browser, credit card or payment information, calendar access, access to a messenger to tell your friends that you might go to a concert together. It would probably need this kind of access to your data unencrypted, because it needs to work with it, and it needs

probably access to a cloud service, because you cannot yet do all of this on your local devices. So this creates a mixture of far-reaching access, large data pools, data exchange, and automated agency, which is quite problematic from a security and privacy perspective. But you could argue: where's the problem? AI systems do exactly what we tell them to do. After all, that's what they are built for. They listen to us, they like to help us, they are programmed to support us. So nothing can go wrong, right? Well, there are other experiences. There was this case with Replit, an AI coding platform, which ignored a code-freeze instruction and

deleted the production database of a company that was using it to build its services. So basically there was the instruction to the AI: do not touch production. It did it anyway and deleted the production database. It also claimed the data was not recoverable; luckily it was. But nonetheless, everything went in a way that was not really expected. And as Jason Lemkin states here, there's no way to enforce a code freeze in vibe-coding apps like Replit, there just isn't; in fact, seconds after he posted this, Replit again violated the code freeze. So the code freeze did not really work. We could say, well, this was a while ago, maybe it's better now

because we have guardrail systems that we can use to try to protect our AIs a bit better. But those kinds of systems do not work reliably either. I found this headline quite telling: DeepSeek's safety guardrails failed every test researchers threw at its AI chatbot, something like 50 tests, and all of them failed. Not a very good impression. You could argue, okay, this is on the coding side, maybe for consumers it's not as bad. But if you use the new browsers, for example the Atlas browser, you see again that there are some issues with AI. Here it is about a vulnerability in OpenAI's Atlas browser allowing injection of malicious instructions into ChatGPT. So ChatGPT

keeps a context with your OpenAI session, and here attackers found a way to inject content into the session that is kept between the browser and the OpenAI API that instruments the browser, basically the context maintained for the LLM driving the browser. This is quite an interesting vulnerability, because what was responsible in the end was a cross-site request forgery vulnerability. So it was not some fancy new issue; it was something we have known about for a long time that was exploited here to get this kind of access. And it's also not limited to the Atlas browser, but

Atlas apparently did not protect its users very well against cross-site request forgery attacks. As this risk landscape continues to evolve, and it is a complex thing, luckily there's an OWASP project dealing with it, and you have probably heard of it by now: the OWASP Top 10 for Large Language Model Applications, from the GenAI Security Project at OWASP, which does very good work in maintaining and describing the different attack vectors they see as relevant for LLM-based systems. If you want to dive a bit deeper, there's also the Machine Learning Top 10, which might be more suitable for other systems depending on your context. There's also

a bit of overlap. Both are quite good resources for diving into vulnerabilities if you deal with GenAI systems or LLMs specifically. You might have heard about some of the vulnerabilities you see here: prompt injection, for example, which I believe everyone knows by now, but also things like sensitive information disclosure, supply chain attacks, data and model poisoning, improper output handling, and so on. We will come back to these in the threat model example in a bit more detail. That is already quite specific, though; if you look at the risk levels for AI systems from a slightly higher level,

you can see other categories of risks that come with AI systems. You have health and safety risks: could AI failures lead to physical harm or unsafe conditions? You have operational and financial risks: errors disrupt production, lead to wrong decisions, and so on. There are fundamental rights and ethical risks that come with AI systems, and data and privacy risks, of course, because no one loves data as much as your AI systems. You have cybersecurity and reliability risks. And then you have compliance and legal risks, of course, because the landscape of AI regulations keeps growing. One of the relevant pieces of AI regulation which you might also be

aware of is the EU AI Act, which is basically in force by now. It is an attempt by the European Union to classify AI systems into different categories. You have systems with unacceptable risk, which are forbidden, like the social scoring systems that we do not want to see, manipulative AI, emotion detection outside of therapeutic contexts, and so on. Then we have high-risk systems that typically affect the safety, health, or rights of people, like employee management systems, decision-making systems, and so on. We have limited-risk systems that involve a risk of manipulation or deceit; these AI systems must be transparent, and humans must be informed about their

interactions with these systems. And then we have minimal-risk systems, which are basically the systems we have known for a longer time, like spam filters, traffic routing systems, and so on. So this is the classification that the European Union developed, and there are also some specific requirements for general-purpose AI, for example that providers must provide technical documentation. Now we're going to look at how to apply these risks in a threat model, and for that I hand over to my colleague, who will show us around a RAG system and how this can be done. >> Hello. Yes, exactly. Thank you,

Michael, for the introduction. We already saw a few examples of different approaches to using large language models: agent systems, and the normal LLM approaches that were already mentioned. But today I want to talk about another approach, the RAG system, so-called retrieval-augmented generation. Maybe some of you have already heard about it; I think it's a really interesting approach, because it is essentially a normal large language model, which many of us are probably using daily already, but combined with some internal data. Maybe we start, from your point of view, on the right side here, with

the architecture. We have a user who still uses a normal web interface; you can imagine your ChatGPT-style interface, where you interact with your prompt and ask normal questions. Then you have your API back end, and you have SageMaker, where you host your internal large language model, any local or open-source large language model that is available. What is newly added is the vector database, which holds internal knowledge that you add to the system so that you can ask internal questions. How you create this vector database, in this AWS example,

is simply by uploading some internal documents to your S3 bucket; a so-called document loader together with an embedding model creates the chunks and embeddings, and this produces a vector database. The relevant information from it is then sent to your large language model, so you can ask internal questions and get help with them really fast.
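For illustration, here is a minimal, dependency-free sketch of that indexing step, roughly as described above. The embedding function is a toy stand-in; a real system would use a proper embedding model and read the documents from the S3 bucket, and all names here are hypothetical.

```python
import hashlib
import math

def embed(text: str, dim: int = 16) -> list[float]:
    """Toy stand-in for a real embedding model: hashes words into a fixed-size vector."""
    vec = [0.0] * dim
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def chunk(document: str, size: int = 40) -> list[str]:
    """Split a document into fixed-size word chunks, as a document loader would."""
    words = document.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

# In the real setup these documents would come from the S3 bucket.
documents = [
    "Supplier Alpha delivers M4 screws at 0.02 EUR per piece within 3 days.",
    "Supplier Beta delivers M4 screws at 0.03 EUR per piece within 1 day.",
]

# The "vector database": a list of (embedding, chunk) pairs kept in memory.
vector_index = [(embed(c), c) for doc in documents for c in chunk(doc)]
print(f"Indexed {len(vector_index)} chunks")
```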

Of course, we also have a small use case to make it a bit easier: a user wants to identify the right supplier for a product. Just imagine you need a screw, and you have a list of suppliers that can deliver this screw. You can of course go through the list of suppliers yourself and look for the cheapest price, maybe the fastest delivery time, maybe the best quality in your experience. But why do that manually? You can also just upload the list to your S3 bucket, let the vector database be generated automatically, and then simply ask in your prompt: what is the cheapest price if I need 100 screws per day, or which supplier had the best quality over the last month? From these questions, the back end contacts

the vector database and suggests, say, three matching suppliers. You can of course also get more; this depends on the instructions you give your LLM. It can make three suggestions or more, and depending on the suggestions the LLM makes for you, you can make a decision and afterwards contact your supplier. What you need in the end is just some documents; these can be PDFs, XML, CSV, whatever you have, and you can also include your normal SQL database in the system.
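Continuing the sketch, the query side could look roughly like this: the retrieved chunks are placed into a prompt that explicitly asks for three supplier suggestions grounded in that context. The chunks are hard-coded here, and call_llm is a placeholder for whatever hosted model the back end actually invokes, for example a SageMaker endpoint; none of this is the speakers' actual implementation.

```python
def call_llm(prompt: str) -> str:
    """Placeholder for the call to the hosted model (e.g. a SageMaker endpoint)."""
    return "1. Supplier Beta  2. Supplier Alpha  3. Supplier Gamma"

question = "What is the cheapest price if I need 100 screws per day?"

# In the full pipeline these would be the top-scoring chunks from the vector index.
retrieved_chunks = [
    "Supplier Alpha delivers M4 screws at 0.02 EUR per piece within 3 days.",
    "Supplier Beta delivers M4 screws at 0.03 EUR per piece within 1 day.",
    "Supplier Gamma delivers M4 screws at 0.05 EUR per piece within 2 days.",
]

# The instruction constrains the answer to three suggestions grounded in the context.
prompt = (
    "Answer only from the context below. Suggest exactly three matching suppliers.\n\n"
    "Context:\n" + "\n".join(retrieved_chunks) + "\n\n"
    "Question: " + question
)

print(call_llm(prompt))
```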

In the end, you just upload everything into your system and you have a large language model with internal data, just as a small example. But of course, with internal data and a large language model come security risks. Michael already mentioned the OWASP Top 10 for large language models, and I don't want to cover all ten in this short time frame, but I highlighted three important topics. I want to start with sensitive information disclosure, which is very much a topic here, because when you upload documents to your S3 bucket, as in our example, you

can have a lot of users who have access to all the documents you uploaded, and they can then simply ask about all of these documents in the prompt, and those documents might contain sensitive information. So you need to look at what these documents might contain, and you also need to check who has access to this large language model in the end. That's really important.
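One way to reduce this disclosure risk is to attach access metadata to every indexed chunk and filter retrieval results against the requesting user's permissions before anything reaches the prompt. This is only a sketch of the idea; the field names and groups are made up.

```python
# Each indexed chunk carries the groups that are allowed to see the source document.
indexed_chunks = [
    {"text": "Supplier Alpha price list ...", "allowed_groups": {"purchasing"}},
    {"text": "Salary bands 2025 ...", "allowed_groups": {"hr"}},
]

def retrieve_for_user(user_groups: set[str], top_k: int = 3) -> list[str]:
    """Return only chunks the user is allowed to see; relevance ranking is omitted for brevity."""
    visible = [c["text"] for c in indexed_chunks if c["allowed_groups"] & user_groups]
    return visible[:top_k]

# A purchasing user sees supplier data, but the HR document never enters the prompt.
print(retrieve_for_user({"purchasing"}))
```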

I also want to talk about numbers eight and nine, a little bit combined. Number eight is specifically a risk that concerns the RAG system: vector and embedding weaknesses. An embedding weakness in a RAG system means something can go wrong when you create the vector database. If, for example, there is a problem in the vector database and you do not get the right supplier recommendation, that leads to misinformation: if I ask for the best three suppliers and one supplier never shows up because of problems in my vector index, this causes problems for me and also for that supplier. And it is a risk that I cannot easily see, whether something went wrong while the data was being

calculated and processed inside my embedding model. So that is also a security risk, because it can lead to misinformation that we need to check for all the time. The other risks are really important as well; many of them concern architectural risks, and architectural risks are also really important for threat modeling, so for that I will hand back over to Michael. >> Thank you. So basically what we did here, oh, where has the pointer gone? Yes. What we did with this system was that we wanted to do a threat model, but we also wanted to have

a complete analysis in terms of the specific AI risks. That's why, in the threat modeling, we also focused on the AI risks. You probably know the usual threat modeling steps, the four-question framework: what are we working on, what can go wrong, what are we going to do about it, and did we do a good job? We applied this four-question framework from Adam Shostack here, focusing on the LLM risks. What we also did was try to embed this threat modeling into a slightly more overarching process, because we have seen this European AI Act. Before you threat model a system,

you need to clarify what other requirements you have for that system, and usually in larger organizations you have areas like data protection and so on that also get involved. In this case it was a smaller setup, so we used a focused process that includes the EU AI Act requirements. That is why we had these first three steps, where we basically performed a kind of compliance check for the EU AI Act to determine which requirements from the Act we would need to take into the threat modeling activities, so that we also consider those risks. What we did here was first, of course,

to build a high-level understanding of the system, then to classify the system into a risk category under the EU AI Act, then to identify the compliance requirements that derive from the Act, and then to go into the threat model. So we merged the requirements from the AI Act into the threat modeling activity. If you're still new to this, you can use the AI Act compliance checker on the website, which is quite convenient: you click through a few questions and you get a result. In this case, since the system was dealing with suppliers and product part numbers and so on, this was not a high-risk

system or anything like that. This was the default case, where you just have the transparency requirements to inform your users that they are interacting with an AI model, because you always have the risk of AI hallucinations, even with RAG systems, where this risk is reduced because you usually provide a good information base. You still have the risk that the AI randomly, or rather probabilistically, generates content that does not match your actual data or requirements. So in this case we just had these information requirements for users. For high-risk systems the picture would look different. Here we were in the

part where we had the transparency and user training requirements. If you're dealing with a high-risk system, you have risk management and integration requirements, data governance, technical documentation requirements, transparency provisions, human oversight requirements, requirements on ensuring accuracy, opt-out and human intervention requirements, and so on. So there's a lot more detail that you need to take into account once you move into this high-risk area. In practice, what we also did was take the requirements from the different inputs, from the different standards, and compile our own threat list out of them. We tried not to have too many threats, because you can go into a lot of

detail with these threats, but we wanted something practical that we could use as a checklist for our threat modeling activities. And AI is really great at matching text; that's what it's designed to do. So we used AI for that: we threw everything in, compiled it, formulated a few prompts, and then of course rechecked it to make sure it did not hallucinate or mix things up too much. You cannot skip this step, but for this kind of preparation work, AI is really great. What came out, you can probably not read this on the slide, is a list where we

basically said: this is the category of threat we want to look at, and this is the threat, like prompt injection, adversarial attacks, training data poisoning, backdoor injection, model inversion, membership inference attacks, and so on. Then we gave each one a description and countermeasure suggestions, mapped it to the OWASP LLM Top 10, and had the sources cited, just to make sure that we really did not include any hallucinations.
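To give an idea of the shape of such a checklist, an entry might look roughly like this; the wording, categories, and mappings below are illustrative, not the actual list from the engagement.

```python
# Illustrative structure of a compiled threat-checklist entry (not the real list).
threat_checklist = [
    {
        "category": "Input manipulation",
        "threat": "Prompt injection",
        "description": "A user or retrieved document alters the instructions given to the LLM.",
        "countermeasures": ["constrain or avoid free-text prompts", "input validation",
                            "output filtering"],
        "owasp_llm": "LLM01: Prompt Injection",
        "sources": ["OWASP Top 10 for LLM Applications"],
    },
    {
        "category": "Data exposure",
        "threat": "Sensitive information disclosure",
        "description": "The model reveals confidential data from training or RAG sources.",
        "countermeasures": ["keep sensitive data out of RAG files", "access control"],
        "owasp_llm": "LLM02: Sensitive Information Disclosure",
        "sources": ["OWASP Top 10 for LLM Applications"],
    },
]
```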

We then used this list in a typical threat model and documented the threats. Take prompt injection: how likely is it that the user modifies the AI prompt in this system? Here the situation was quite good, because of how the prompt is built, and whenever you can do that, avoid free-text prompts to LLMs. Better to construct your prompts out of fixed strings, select boxes, or whatever you can use to constrain the information and commands users put in, because then you can avoid this kind of prompt injection much more easily and don't have to deal with it; otherwise it is hardly avoidable. Then there are sensitive data exposure, AI supply chain compromise, and so on, which are a bit process related, and denial of service and unbounded consumption, which are things you might have to configure in your cloud provider. And if you look

at the countermeasures we listed, it was things like: do input validation. We have this form constructed, there is no free text, perfect, so we can do some input validation; we just ask for product part numbers, and maybe they have a specific format that we can check. Ensure there is no sensitive data in the RAG files; the easiest way not to spill sensitive data is not to include it in the system. Limit the input, limit the cloud resources. Ensure that you use your model from trusted sources, and so on.
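As a sketch of what "no free text, just validated part numbers" can look like in practice: the user only supplies structured fields, each field is validated, and the prompt is assembled from fixed strings. The part-number format, bounds, and field names are assumptions made for illustration.

```python
import re

# Hypothetical part-number format: two letters, a dash, five digits (e.g. "SC-10432").
PART_NUMBER_RE = re.compile(r"^[A-Z]{2}-\d{5}$")
ALLOWED_CRITERIA = {"cheapest price", "fastest delivery", "best quality"}

def build_prompt(part_number: str, quantity_per_day: int, criterion: str) -> str:
    """Assemble the LLM prompt from validated, structured inputs only."""
    if not PART_NUMBER_RE.fullmatch(part_number):
        raise ValueError("invalid part number")
    if not 1 <= quantity_per_day <= 10_000:
        raise ValueError("quantity out of range")
    if criterion not in ALLOWED_CRITERIA:  # value of a select box, not free text
        raise ValueError("unknown criterion")
    return (
        f"Suggest three suppliers for part {part_number}, "
        f"{quantity_per_day} pieces per day, optimized for {criterion}. "
        "Use only the provided context."
    )

print(build_prompt("SC-10432", 100, "cheapest price"))
```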

And the feeling that emerged was that nothing of this is really new. We avoided prompt injection, and from the rest there are still risks like output integrity attacks, there is still the risk of hallucinations, and so on. But a lot of what we saw in the threat model was basically the old stuff applied to a new system. That was the impression we got, and it brings us to the third part, the learning part, where we started matching things a bit: let's take the OWASP Top 10 for LLMs and the OWASP Top 10 for applications and see what kind of

overlap there is. This is not a complete mapping, but it looked like there was quite a bit of overlap. You have prompt injection, but you also have SQL injection; they work a bit differently, it's not exactly the same thing, but the root causes are similar. You have things like supply chain security, or information disclosure and security misconfigurations. All the things you already do for your web applications, your services, your interfaces also apply to LLM systems. There are some new tweaks, but there's really a lot of overlap. And this is also

true, I think, for the new OWASP Top 10 that was just published, I don't know, a week ago or so; at least there's a release candidate out there. They did not change too much, so it's still the same picture that we see. Here, too, we looked into mapping this, basically by the countermeasures suggested in the OWASP documents. So you see the LLM risk and then its web counterpart, prompt injection to injection, for example, then which countermeasures were suggested, and then there was a

rating of the overlap, which we also asked the LLM to do. It's basically word matching in a way, but I think that works quite well in this case, because you can really see which countermeasures are recommended for the corresponding controls from both standards. You then see that for sensitive information disclosure there's a high overlap, for example, and for supply chain vulnerabilities it's, I think, quite self-explanatory that it is really the same thing. Input filtering, output filtering, data hygiene, trusted sources, encryption, rate limiting are very familiar countermeasures.
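As a simplified illustration of that kind of mapping, a few rows might look like this; the pairings, countermeasures, and overlap ratings are examples in the spirit of the slide, not the exact data shown in the talk.

```python
# Illustrative mapping of LLM risks to classic web counterparts (not the actual slide data).
risk_mapping = [
    {"llm_risk": "Prompt Injection",
     "web_counterpart": "Injection (e.g. SQLi)",
     "shared_countermeasures": ["input filtering", "output filtering"],
     "overlap": "medium"},
    {"llm_risk": "Sensitive Information Disclosure",
     "web_counterpart": "Sensitive Data Exposure / Cryptographic Failures",
     "shared_countermeasures": ["data hygiene", "encryption", "access control"],
     "overlap": "high"},
    {"llm_risk": "Supply Chain",
     "web_counterpart": "Vulnerable and Outdated Components",
     "shared_countermeasures": ["trusted sources", "dependency scanning"],
     "overlap": "high"},
    {"llm_risk": "Unbounded Consumption",
     "web_counterpart": "Denial of Service",
     "shared_countermeasures": ["rate limiting", "resource quotas"],
     "overlap": "high"},
]
```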

Coming back a bit to the title of this talk, we have this tip-of-the-iceberg metaphor, where we said: we have a bunch of AI-specific risks, prompt injection, let's phrase it that way, and adversarial attacks, which are AI-specific, plus over-reliance and excessive agency, because once you go in this MCP automation direction you have to think about that risk, which you can again constrain with access control, so there is again some overlap. But the traditional risks are also quite prominent: supply chain, information disclosure, denial of service, access control risks, data quality risks, output handling risks. So there's a lot of common

ground, I would say, when it comes to covering the new systems with old measures. And if we look at what still helps: good old protection mechanisms and proper security concepts still help to reduce AI-related risks. So what should you do? Of course the usual stuff: input validation, access control, rate limiting, supply chain security, human in the loop for critical processes, for everything where it's really important, sandboxing, monitoring and logging, encryption and IP protection measures, risk awareness trainings, and governance. And we also found that it is quite safe to assume that whatever you put into a large language model, or whatever you give

a large language model access to, you can no longer consider safe. If you keep that in mind, I think you have a good approach to securing your LLMs. Whatever data is accessible to an LLM, or an LLM gets trained with, or it has access to via RAG, is very hard to secure with the measures we have available at the moment. So, maybe let's come to the last slide. What did we take away from this? Traditional threat modeling works very well with AI systems too; you just need to enhance your threat landscape a bit, but you can still use the same

processes. The use cases of the AI systems need to be considered: it really depends on where you place your AI systems in the process chain of your organization, which determines what risk level they receive under the EU AI Act and what requirements you then have to fulfill. Then there is this access to data and agency; this is really the critical thing, and it increases the protection requirements for AI systems compared to traditional systems, adding this specific dimension. And still, you have those very traditional defense mechanisms, this security hygiene, which, if applied, already provides a very good foundation to

protect your AI systems. So do not only focus on prompt injection; really focus on the complete architecture, and then you will be well covered. And I guess that's it. Thanks a lot for your attention.

>> Okay, audience, do we have questions? Yes, there's one.

>> Hi, thank you for the presentation, very interesting. I was curious to know what tools you would recommend to, let's say, counter all the new threats for AI. Do you have any recommendations? >> No. >> Okay. >> No, I do not have any recommendations. We looked a bit into guardrail systems, but in my opinion they raise the barrier for attackers; it's not like you get 100% security here. So we did not find a golden solution to really protect AI systems. This is a bit in the nature of these systems: once you have that kind of data in there and

this kind of prompt processing happens (we looked in a lot of detail in the workshop on Saturday at how these probabilities get processed), there's no real way to secure the data inside. I think there are some testing frameworks now. I heard of one, and I don't want to do any advertising, but someone told me of a framework called spikee or something, which can be used for adversarial testing; it might be interesting to take a look at. And there's the Guardrails AI page, I think, which also has some open-source guardrails that you might experiment with. But

in my opinion there is no golden tool. So I go in the direction of avoiding giving these things access to sensitive data, or, if you do, then you have to build access control around it. Thank you. >> Any more questions on the left side? Please throw the microphone over there, or on the right side near the entrance.

>> Microsoft is currently rolling out its Copilot within company settings, to its company customers, because it's helpful, but it's also its business model. How do you look at that security-wise, in the Microsoft framework of OneDrive, OneNote, Outlook, Power Automate? >> It's a difficult question. I know it's happening, and, I mean, we have seen the EchoLeak vulnerability and so on in the past, so I think there have been enough examples where it did not really work out. In my opinion this is probably still a learning space, also for Microsoft, on how to properly secure this. I would be careful, but I know, on the other hand, that a lot of companies

use these tools, and you are required to use these tools, so it's also hard to avoid. I would try to be as careful as possible with what it gets access to, because I think it's very hard to keep this data really confidential. >> Thank you. One more question there. >> Yeah, sorry to link back to the questions about Copilot integration with your ecosystem. You just mentioned keeping all the data you consider sensitive away from the LLM. Well, with Copilot integrated in your Microsoft tenant, the whole data set is already integrated with Copilot. Would that advice also apply in this case, or what would be your suggestion there? >> I cannot tell you how to secure

this. I mean, this is always the issue if you work with Microsoft Office tools or the Office cloud: Microsoft has access to it anyway. So the question is how it is protected from different users, and I cannot really answer that, because I'm not familiar with the Microsoft architecture here, how it is really separated on an instance level or on a physical level in the end, whether there are different instances running with different customers' data, and so on. So I cannot really say.

I would not be happy about it. >> Yeah, I mean, you usually have a DPA contract with Microsoft itself, but the question is whether you have a risk that this gets leaked, as you mentioned, to other Microsoft tenants, not to Microsoft itself. >> Okay, so the concern is not Microsoft itself, but... >> Is it more of a risk towards Microsoft or more of a risk for us? >> Oh my god. I'm not sure, I cannot really give an answer to that, I'm sorry. It depends on the contracts you have with Microsoft, but I think Microsoft also has a good legal team, so it might be difficult to

really hold Microsoft accountable for whatever happens, but I don't know, I cannot answer that. It depends on how Microsoft sets this up in the background, and I don't think we have insight into that. So, no, not really. Thanks. >> Okay, then please, a short round of applause. Thank you very much.