BSidesCharm 2025 - A Grounded Approach to AI and LLM Security - Lucas Tamagna-Darr

Name: BSidesCharm 2025 - A Grounded Approach to AI and LLM Security - Lucas Tamagna-Darr
Uploaded: 2025-05-18
Duration: 17 min 49 s
Description: With the emergence of Large Language Models, there has been a rapid acceleration in the development of AI capabilities. This brings with it many questions for security teams on how they should be thinking about AI security. While care should be taken on the development of LLM prompts, it is critical

BSides Charm17:4958 viewsPublished 2025-05Watch on YouTube ↗

About this talk

With the emergence of Large Language Models, there has been a rapid acceleration in the development of AI capabilities. This brings with it many questions for security teams on how they should be thinking about AI security. While care should be taken on the development of LLM prompts, it is critical to not lose sight of the fundamentals to establish secure best practices. In his role as a Senior Director of Engineering and Research Solutions Architect, Lucas Tamagna-Darr leads the automation and engineering functions of Tenable Research. Luke started out at Tenable developing plugins for Nessus and Nessus Network Monitor. He subsequently went on to lead several different functions within Tenable Research and now leverages his experience to help surface better content and capabilities for customers across Tenable’s products.

Show transcript [en]

[Music]

Uh, hey everyone. Uh, my name is Lucas Mayadar. I'm a senior director of engineering in the research or at Tennibal. Uh, and I'm going to talk to you today about uh grounded approach to AI and LLM security. Um, yep, AI again. Uh, everywhere I go, it's more AI. Um, so a little bit about myself and kind of how I'm coming at this. Uh, I've been in vulnerability management for over 16 years. Uh, and a lot of what I deal with is everyone panicking about the next big thing and, you know, how are they going to get destroyed by it. Um, same thing's happened with AI. There's a little more excitement on what it enables, but a lot

of panic about what the risks are. Um, first things, uh, AID is not a new thing. It's been around for a while. Um, we have we've been using machine learning for decades, uh, NLP, things like that. Uh, what really changed was generative AI, large language models. And when it when those became a big thing, that kind of changed for a lot of people. It put it front and center. Um, everyone started having concerns. What happens if our data is disclosed? What h happens when it starts generating malicious code? what happens when AI takes over the world. Uh, everyone had the worst case scenario. Um, uh, what a lot of organizations are trying to do right now is figure out how

do we manage this new risk? How do we enable this? How do we leverage this without it getting, you know, excfiltrating all of our data? How do we make sure our employees are not excfiltrating all of our data? Um, where do we get started?

So, uh, as with all things, uh, OASP has their helpful hot top 10. Uh, and a lot of these are, you know, look new, but they're not that different. Um, I'm going to focus on a few of them for the talk. Uh, but prompt injection, uh, you think about that a lot like SQL injection. Uh, it's same concept. Send it a, uh, you know, block of text, try and get it to do something that it's not supposed to do. Um, information disclosure. Uh really what we're looking at there is is data that you've put into the AI model going to then get used uh and exfiltrated whether that's in a public model or someone managing to

prompt inject you and excfiltrate your data. Uh and the other one that I'm really focused on is excessive agency. Uh and this is a really big one because a lot of the risks that come with AI are dependent on what it's allowed to do. If all you have is a chatbot, okay, it might say something that is inappropriate. Uh make you look bad. Maybe it reflects some cross-ite scripting code. Um, if you have a, uh, application that runs commands on your system, you have a lot more risk. Um, now, uh, this is not static. Everyone got real excited when LLMs became a big thing. We got to chat with them. We got to ask them to write

stories for us and be amazed by what they could do and then disappointed at what they couldn't do. Um, uh, that's kind of where all this started. everyone. It it put AI front and center for everyone. Before then it was very much in the domain of machine uh data scientists who were building you know large data sets. Um this is growing extremely fast which is one of the challenges we're facing. Every time someone gets up to speed on uh LLMs suddenly now they got to learn uh the next thing then they got to learn the next thing. So we started out LLMs. Uh next thing that really came along was the retrieval augmented generation. The

idea is uh with traditional LLMs get a training data set build the model that model is static you get to use it. That's why if you ask uh Jet GPT about something from this year it typically responds and says I don't know anything about that. Uh with rag now you can ask questions it the LMS can go out to a external data source pull in more data. They can get more context about what's happening. So allows them to stay more up to date. Uh, Agentic AI is the kind of current cool thing that everyone's excited about. Build an application. Instead of just having it talk to you and give you an answer and tell you

things, now you can have it do things for you like install a patch on your system, which is terrifying to me. Um, you know, the traditional one people talk about is a travel agent. Instead of the travel agent creating your itinerary, have it book all your flights, have it do all this stuff for you. Uh, it's very cool. It's very exciting. saves us a lot more time and energy from doing the mundane tasks that we don't like doing. But you're also trusting a system and hoping that nothing goes wrong with it. Uh and then most recently we've seen this new thing MCP model context protocols. So the idea with this is to create a standardized

protocol so that when you're building a uh AI application AI agent standardized protocols for getting data from external data sources whether it's a database or a parquet file or anything else a protocol that sort of sits in between now it's really great because standardization is very helpful for building applications can help with security but we're already seeing this these uh MCP protocols are also tied in with servers and services that sort of enable the protocol And unfortunately, some of these servers have vulnerabilities in them. Uh so now you've introduced another layer of complexity that introduces another layer of risk. Um going into some of the specifics. Uh so sensitive information disclosure. This is kind of the first

one that everyone got panicked about. What happens when finance uploads the budget and all of your you know budget data is now part of uh you know uh chat GPT. um what happens if someone puts someone's medical information in because they're trying to figure out what diagnosis if and uh all that. Um uh one of the areas that's a lot of concern especially for organizations is u usage of browser extensions. Uh there are a ton of browser extensions. Uh last I looked on Firefox there were depending on how I searched uh between one and 2,000 browser extensions for artificial intelligence. Most of them LLM uh tooling. Uh and the real risk there is if the data if the tool you're using

is with a public model with a public service there's typically not policies in place that prevent that data from then being used for future training. So it's not like you put that budget spreadsheet in there and suddenly it's available to everyone on the internet but over time it becomes part of the model part of the training data set and could be uh leaked out in the future. Um the other angle for information disclosure uh is prompt injection. Uh so if you're building an application that is using customer data, you've got a ton of cool telemetry. You want to help customers learn from what everyone else is doing, that can be very valuable. But with the right prompt injection, you can

get the model to respond with sensitive information based on your customer data. So there a big concern about information disclosure. Um, now prompt injection is the kind of fun cool one uh, everyone's excited about and it's actually there's some people who all they're doing all day is red teaming all the models and it's crazy what they accomplish. Every time the model is updated to fix it, they break it again. Um, the idea here is to say something to the um, uh, you know, AI, the large language model, get it to respond with something it's not supposed to. Um, it's essentially social engineering, uh, but with a computer. Um, and we all enjoy social engineering. Um, at the same time, it's like I said,

bond injection is a lot like a SQL injection attack. You send a command, you get it to do something it's not supposed to, and it responds with some information. Uh, ultimately, you're trying to get it to respond, break the guardrails it has. So, a lot of these will have guardrails that say you can't you can't write malware. You can't tell someone how to build a bomb. can't do all of this stuff. The idea of the prompt injection is to get it to ignore its guardrails and do the thing you want it to do. Um, now again, kind of going back to that um, you know, what capabilities that uh, model has, that's going to depend on how

valuable this is and what the risk is. So, in the context of a a chat application, at best you might get the prompt to respond with some uh, cross-ite scripting code. Uh but typically what you'll get out of it is reputational harm. You get the bot to respond and say something inappropriate or talk badly about your company. It's not great but uh not the worst thing in the world. As you introduce features like Gra as uh you get it to start pulling in external data or if you give it the ability to write some code, test the code, and then bring it back, you're starting to introduce risk. that AI application might execute code and if an

attacker is leveraging that they might get it to execute code that crashes the entire system or starts uh uh putting a shell a back door on the uh server. Um then if you think about the agentic AI where you're actually having a application take actions based on the data it's getting um let's go with the patching scenario. uh if an attacker was able to compromise that just kind of a worst case scenario they get the uh um patching agent to install a backd dooror now patching agent can do it because it's a trusted system it has the ability to install that stuff uh and if it's uh compromised enough you could really take advantage of that a lot of these are

worst case scenarios still seeing kind of how successful this is um one of the things that I talk about a lot uh with the teams uh and customers is in fact at remember these LLM they're probabilistic probabilistic not deterministic so if I do a SQL injection attack and I execute the attack I get a result back I know it worked I got data it's a database server it's going to be correct if you execute a prompt injection you get it to respond with some data there's some ways you can validate that it works but if what you're after is sensitive data if you're after social security numbers or credit cards cards or uh some other

sensitive data, you could get a result back that makes you say, "Awesome, my prompt injection works, but is that data actually useful?" Um, or is it, as the last speaker said, making a mistake, sending something incorrect. Um, uh, now, yes, there's value in proving out that you successfully completed a prompt injection. You got to to respond with a string that looks bad, but is that useful? And keep in mind, most attackers are looking to get something out of an attack. They're some are at it for fun, but if you can't reliably get something back, it's not going to be as useful. So, um, we and sorry, we actually we've tested out some. So, we've we've tested out some various

prompts. Uh, we tried to get customer IDs out of an application that was written and we were actually able to talk enough to the prompt and it gave us a customer ID. And I was talking with our team that did it. you know, we got into it, we were looking at it, looking at it, and it responded with a string that looked like a customer ID, but it was just a random string. It had nothing to do with that customer. So, in that context, it's great. You've proven that it can be prompt injected, but if your goal, again, if your goal is to get something useful out of that system, it's not going to be very helpful as an

attacker, as a traditional SQL injection attack where you know what you get. Um, now excessive agency. Uh again, this is kind of what I was getting at. This is really going to control how bad the attack is, what what you're going to get out of it. Um and when you're building applications, one of the things you want to be really mindful of is how much uh freedom do you want to give it? Now, there's this kind of ideal state where AI can do everything for us and we just talk to it and it does everything and it's we never have to do anything uh again. But that means someone also can trick it. So whether it's building a

chat service or building a um a agent or anything in between, you want to limit that to as little capability as possible. Um one of the ways I've seen a lot of the prompt injections is people just have this wall of text like, you know, 5,000 words long. Realistically, practical use case, you probably don't need that from any customer. Um, so a lot of this is a about limiting what people can do with the AI and what you're going to let it do. Um, if you don't need it to install patches, don't let it install patches. If you need to let it install patches, give it a very specific set and a very specific location where it can pull them

from. You're going to limit the amount of impact uh that you can get out of it. Now, these are kind of the, you know, risks that a lot of people are focused on is how can you get the LLM's AI to do something uh incorrect. Uh, but what we've seen a lot of is people getting overly focused on the underlying tool. You know, everyone's excited about these things. Everyone's excited about breaking them. It's really cool to break them. Everyone's kind of losing sight of all of the infrastructure that goes underneath them. All of the libraries, all of the services, everything that's part of it. Um and some of the research we've done uh we're looking

at a uh feature from Microsoft that um part of the featur uh part of the chat client pulled data in from third party connectors so it could reach out to connectors and they did a good job. They had uh limits so it was built on Microsoft Azure. It wasn't allowed to connect to internal services all of that. So they put good filtering in, but after doing some more research, our team found that you could use 301 redirects to bypass all of those mitigations. So we didn't break the AI. We just broke the underlying infrastructure and bypassed all of the protections. That's going to be very attractive for an attacker because again, when you get that to work, it reliably works. It's

deterministic. You know, it worked. If you break the AI, maybe you broke it. Um, so anytime you're working with this, whether it's installing applications or building out your services, you really want to focus on the underlying infrastructure. Make sure that is uh very uh secure. And the advantage there is that's something we're all more familiar with. AI, we're still getting our heads around, but securing infrastructure is something that we've got practice on. Um, so there's a lot about the challenges. Um but some of the things we can do um to kind of better prepare and better uh protect our infrastructure. Um number one thing I tell people when I talk to them just know what's in your

environment. That's the best thing you can do. Uh and the challenge again is who's installing browser extensions on their laptop? Who's who's controlling that? Um is uh are people installing you know random extensions? Are they using open models? um what applications are in use? Are you the products that you're purchasing? Do they have AI features? Are those enabled by default? What's the capabilities of those features? Um if your teams are building AI products, if they're building AI enabled uh features, understand what the limitations are. Understand the infrastructure and the libraries they're building. Also, the models they're using. Um are they using are your teams using models that are you have a contract with or are they using open

models in which case a lot of that is going to go into the training model in the future. You want to have a lot of controls around that and uh understanding what it all is and then take basic steps. So again know your inventory understand what you have understand what people are using and what people are building. When you're building the applications, make sure you have controls over what's being used, what infrastructure, what models, what capabilities are you giving it. Are you going to let it at just chat? Are you going to let it um pull in data from third party sources, internal sources? Um, all of that. Make sure that there's limited agency on what the

product can do, what the feature can do. Um, uh, and then, uh, establish strong policies. So the inventory is great, but if you don't have policies in place that tells everyone you can use uh chat GPT but not this, you can use uh Gemini but not chat GPT. You don't have policies, people are going to end up doing whatever they do. You'll have a inventory of stuff and nothing to do with it. Um uh much like a lot of companies will have, you know, specific software that they allow to be installed. Um we should all have policies. Here's, you know, these are the browser extension capabilities that we've authorized. We've done the evaluation. we trust the model, you can

use this, you can't use this. Um, it's the first line of defense against this stuff. It ensures you've got at least control over what's happening. And if people use something that's against policy, you at least have something to say, look, we're not allowed to use this. You have to go use this. Um, you want to have some flexibility because, uh, some of the models, you know, if you know someone's developing something and not using sensitive data, but they're learning, you know, you want to have ways that people can do that. Uh but um having good best practices is really critical. Any

questions? Thank you. [Applause]

[Applause]

BSidesCharm 2025 - A Grounded Approach to AI and LLM Security - Lucas Tamagna-Darr

Related talks