Securing Generative AI: Is it all an Illusion?

Name: Securing Generative AI: Is it all an Illusion?
Uploaded: 2024-07-09
Duration: 40 min 3 s
Description: Securing Generative AI: Is it all an Illusion? Rachana Doshi, Michael Samson Dive into GenAI/LLMs where innovation meets vulnerabilities. This talk is on commercially available GenAI/LLMs where you will learn how to assess and secure them while enabling your partners to rapidly deploy. Gain insigh

BSidesSF · 202440:03344 viewsPublished 2024-07Watch on YouTube ↗

Speakers

Rachana Doshi Michael Samson

Tags

StyleTalk

About this talk

Securing Generative AI: Is it all an Illusion? Rachana Doshi, Michael Samson Dive into GenAI/LLMs where innovation meets vulnerabilities. This talk is on commercially available GenAI/LLMs where you will learn how to assess and secure them while enabling your partners to rapidly deploy. Gain insights as we share lessons learned and take away a flexible framework to apply. https://bsidessf2024.sched.com/event/d579ba527debb7a761fb1e95fe3eb195

Show transcript [en]

so what we have here is we've got Michael Samson from uh from Salesforce and rashna Doshi and I'd like for you uh folks to give them your time and attention and uh let's give a round of applause and good luck you [Applause] folks all right hi friends today we're going to talk about how we can secure generative AI is it all an illusion or are we really able to get clarity on the security posture of the w that provide these llm services and not just the vendor security posture but actually the data flow the underlying data architecture and the systems that process all of our Enterprise data so let's find out how we can do

this um my name is rad DOI I'm director of third party security at Salesforce I'm uh Michael Samson I'm also at Salesforce as a security engineer all right so today we'll just talk about the situation as we were facing in very early 20 23 late 2022 we'll perform an interactive threat modeling so feel free to shout out your suggestions when we call for that um and we'll walk you through the security threats posed by these llm vendors last but not least we'll definitely talk about the technical controls you can Implement to secure your Enterprise data related to all these security threats all right um I can't take credit for the graphic but it's a really super

interesting one to me it took the telephone 75 years to get to 100 million users worldwide it took the internet about 16 years to get to um 100 million users worldwide but it took um I'm sorry 7 years to get 100 million users worldwide but it took the mobile phone 16 years to get 100 million users worldwide but it took chat GPT only two months to get to 100 million users worldwide and so artificial intelligence um mathematical machine learning models they're not new technologies but what was new and unique in November 2022 uh was the ability to generate text and content based on a prompt all of a sudden in January 2023 all anyone was

talking about was generative Ai and chat GPT we had CEOs who are announcing on their social media feeds how they're going to integrate generative AI all within their uh business products business teams were coming to us and um innovating and saying how can we use generative AI U securely within our um uh workflows and the speed at which uh The Innovation was happening and the use of um ideas of using generative AI in the business world were coming at us at unprecedented speed typically with the lower rate of AD or slow rate of adoption of tech prior Technologies compared to Jet GPT think s years or 16 years um of some of these Technologies like the internet or

Windows or mobile right um it takes some years for the technology to mature and also to reach um critical mass right and so usually with these new technologies security is an after thought it's not the first point of consideration for any of the um teams um but that was okay right in the past years that was okay because the attack vectors the threats the risk that were coming out of um using this technology um you know we security teams had time time to learn the technology and then figure out what are the threats associated with it um however as quickly as the news of generative AI adoption was coming out of the business World um our security

practitioners we were facing and looking at the bugs and the risks and the issues just as rapidly coming out of use of generative AI um at that time in early 2023 January 2023 we felt like this um cartoon over here right we were laying down the tracks as the train was barreling down at us we were building the planes engine as we were flying the plane and so for us we had to really quickly figure out how do we secure generative AI for our Enterprise like I said typically security practitioners have months years to figure out not only just how a new technology is going to work but what are the threats and attack vectors and then

over a period of time then build controls to protect Enterprise data um for our case and a lot of others that we talked to at the time we had weeks maybe months to figure out how to secure generative AI so some of the things that we were thinking about and questions we were asking ourselves was how do you even assess a generative AI system what are the things to look for what are the risks what are the threats um how do we create guidelines for developers to develop against generative AI um and what are the some of these key risk considerations what are the threats that are coming at us and so for this next

portion of the talk I'll hand it over to Michael thanks all right so we want you to walk away with uh practical information so we're going to have an interactive threat modeling uh exercise where we're going to ask for some uh audience participation uh but before we do that uh we have a quick primer on a couple topics uh the first is on retrieval augmented generation or rag probably already familiar with this I won't spend too much time on it uh but essentially rag allows you to get contextual responses out of an llm without having to retrain or fine-tune the model and you essentially achieve this by adding context uh directly to the prompt uh

before sending on over to the llm for inference so here you just have a quick example on the left where you have a standard user prompt and a prompt template on on the right you have the additional context uh injected in red now to achieve this you need to uh build out a vector database that has all of the content that you'd want to potentially include as context around as well as the uh embeddings uh for that content and you gen you generally uh generate the embeddings using an embedding model um and essentially just a numerical representation vector format of that data then along with all of that you'll store any metadata you want on

that content such as uh where you sourced it from once you have the repository built out as you're processing prompts you'll generate embeddings for those prompts query the vector database for the most similar matches and then you take the top x uh number of matches and send them send them over along with the prompt to the llm now the second item is around llm plugins for tools uh these are essentially just function calls often uh python functions and so the way you uh utilize these is that you basically tell the llm the functions or tools that are available to it along with some instructions on the arguments that are required and examples on how to uh

invoke that function then when you get the out put back you parse the output for any invocation attempts and if there are any you go ahead and uh call the function with the supplied arguments so now we have the threat modeling exercise so I'm going to explain what's uh going on uh in this diagram here and then I am going to open it up to all of you to uh kind of throw out what you see potentially going wrong here so here we have an organization that's built uh an internal web app to provide AI chat assistant functionality uh for their staff the web app runs in AWS uh they've implemented rag with uh various internal resources they have a

uh plug-in environment where they're uh executing the tools that they're making available to the llm in this scenario we could say that the tools that are being made available are a python interpreter and a database query engine uh to query a database that has a table with uh various and you could use that to pull flow log data so for example someone in it can ask their chat assistant how much traffic has XYZ website received today and uh the assistant would be able to respond with that now this organization is not hosting their own uh llm they're utilizing an a third party llm service provider uh that allows you to uh utilize their models uh via API call and

so here we have the LM service environment we'll say that they're also running in AWS using AWS stagemaker to uh build and uh run their models as they receive a prompt they'll run some supporting models against it to identify many such as uh acceptable use policy violations or anything of the sort um if they're going to go ahead and process the prompt they'll go ahead and run inference against their model to generate a completion and uh return that completion back to the calling system they also go ahead and uh store a copy of the prompts that they've received and completions they've generated in a persistent data store and then use that data store to run any uh usage audits as

well as uh use it as a training data source for uh Future model updates and so with that I will pause so feel free to kind of yell out any potential threats or uh risks uh that you see uh yep let's see a hand rais right

there yeah yeah that yeah that's uh definitely one of the threats oh yeah sorry uh the uh the threat was uh basically uh an internal actor could uh achieve uh basic arbitrary code execution uh through prompt injection uh in the plug-in environment and uh yeah uh having unrestricted arbitrary code execution is uh definitely a red flag uh yeah send your

cont yeah uh there's definitely the concern of how uh that service provider is uh utilizing uh the data that you send in and uh also whether or not the uh uh service provider is uh trustworthy um but yeah the context that you send in is definitely concern and how it's being used what the data retention policies of the part yeah what datet policies yeah the data retention policies yeah so that's uh definitely another risk uh as far as you know they could be storing it uh forever and uh yeah the use cases for that data are

questionable yeah uh so the item was a on sure there's a proper standardization and uh SQL statements uh and yeah so uh you that's a definitely one of the the threats um you want to ensure that you're properly standardizing and parameter using parameters for everything you're receiving from the users but you also want to do that for everything you're receiving out from the llm uh so you want to treat that as yet another untrusted data source okay indect resources

yeah yeah so indirect prop projection through uh the internal resources uh yeah so that's yet another uh threat so you do want to ensure that you are sanitizing everything that you're pulling in uh to that data source um and you want to make sure you're you're you're curating uh the data into the internal resource that you're utilizing padlock we are missing padlocks yeah I'll take take one more and

then yeah sorry yeah yeah so authentication uh so that only uh authorized internal users can call the app uh yeah so that's yet another uh threat so you want to make sure that you have a proper uh authentication and authoriz controls at the API Gateway all right so yeah so that was uh great you you guys uh called out a lot of them uh so I'm going to go ahead and uh run through uh a few uh so the first and probably most obvious uh is uh potential compromise of the uh Service provider's uh environment uh this can result in exfiltration of all of the uh persistent data and depending on the uh photo hold of the thread actor that

could also result in extration of prompts as they're being submitted um again based on depending on the uh foothold of the thread doctor it could also uh result in U tampering of uh the completions you receive back either on the fly as they're being returned to the calling system or by uh tampering with the uh Service provider's uh training data [Music] set then you have the risks around how the persistent data is being used so utilizing that persistent data for uh internal audits essentially uh makes your data available to a much broader audience than it typically would be uh compared to customer data um hosted by an Enterprise SAS solution and then utilizing that data for training of

Future model updates uh can result in your proprietary data being uh returned to uh unauthorized external parties uh the form of uh

completions and then there's uh potential over Reliance on uh llm output so LMS are known to uh hallucinate and the responses you receive could be outside of the parameters of what you're expecting uh so you wouldn't want to uh automatically invoke any uh sensitive uh actions uh based on an llm output and then you have the plug-in environment where you're uh executing uh arbitrary code um which can result in compromise of the underlying system access to uh unauthorized resources that you don't want your internal users internal users calling um and then contamination across users for the execution environment and then prompt injection uh pretty obvious one uh if there are any secrets that are included

as part of the prompt template or context uh through prompt objection for example you could have the llm uh reflect that data back to you uh in the form of a completion and then authorization bypass so you'll want to ensure that your rag data source is enforcing the authorization policies of your uh of your data sources so the source system uh same with uh your plugin uh that is executing queries for uh on external databases you want to either perform that in the context of the user uh if you're executing those in an elevated system context or uh service account context uh you could end up returning uh data that you didn't want queried and then in coding failure uh

you want to remember that this is still a web app and the General web app vulnerabilities still apply and the llm output should be treated as untrusted data so you'll want to make sure that you're encoding uh the output before rendering just to ensure you're not uh running into unintended uh code execution in the browser and then there's also uh availability issues so uh if the service writer uh runs into a system outage or if there's resource exhaustion uh this could interrupt any uh critical business processes that you have relying on the uh web application um so this table is just what we already discussed uh it's mostly there for those that don't have the

benefit of a live presentation so I'm going to skip over those and I'll pass it back to RNA all right thank you all for the participation that was great so um as we were threat modeling and building our assessment methodology we kept thinking about what are some of the key considerations uh we should have for for scoping the risk of these generative AI systems or llm vendors right and so um one word of caution um if you will right keep in mind every use case is different um and uh you have to consider your Enterprises uh unique use case your Enterprise context as well as the risk appetite within your Enterprise so the first consideration we want to have is what

what data what of your Enterprise data is being processed store transmitted by this llm provider uh what is the sensitivity of the data that is being shared with the vendor what will be the impact to your business if this data is leaked hacked or published in the New York Times tomorrow um you'll also want to consider uh where is this technology being hosted or who's responsible for the security of both the application itself but also the underlying infrastructure so obviously when you think about something like a supplier hostage system then all of the risks um around web applications come into play but if you're thinking about hybrid then you need to consider the shared responsibility model where the supplier

or the vendor is responsible for the security of the infrastructure but then you are responsible for the secure cons um configuration of the application or the llm itself and then if you're completely hosting this all on Prem then you need to think about not only just the application security but also the underlying infrastructure um also to consider the next two things to consider are around the type of technology that you're procuring so think about um you know what kind of a model are you getting so generative AI models generally speaking um are really good at output based on an input so they're producing you know text image videos audio all kinds of stuff versus large language models are

primarily uh outputting something large text um and then there's other kinds of models that are really good for um uh code generation or classification grouping of data and such um we also look at the types of services that these models are providing to us so like the example says you know there are so many models now that produce text video code generation is uh particularly new and interesting uh use videos audios um and so there's a lot of cons security concerns around just the kind of content that's being generated but also think about the the privacy and the ethical considerations behind some of the output that's being produced um one key concern or one thing key thing to consider is

how these uh models are being integrated within your environment so if you are using the models for automated uh workflows as an example you're going to publish automatically um an article or something based on a prompt or if you're going to generate code that will automatically get executed in your environment those are particularly risky or problem IC from um from our perspective um and then last but not least is the type of relationship you have with your vendor so third party vendors which is typically the vendors you have a contract with where you can enforce your security requirements you can have uh contractual obligations that the vendor has to meet and obviously subject themselves to your security

Audits and reviews um fourth party relationships are where you don't have direct relationship with the vendor themselves what we are seeing is there's a lot of vendors are integrating generative AI within their services and offering it to you and so even if you're not directly procuring a generative a provider you already have most likely a generative AI system within your environment that is processing your Enterprise data and so in that case you need to consider um the kinds of questions you're going to ask your third party vendor how are they securing your data in this fourth party generative AI vendor environment what kind of security Assurance are they doing on their third parties uh um and so you know again the

type of relationship with your vendor um matters now really get let's get into the meat of it right the technical controls or some of the controls you can actually Implement to mitigate the threats or the risk we just talk about one of the first ones is independent verification now not a true traditional technical control um but um we find that this is really a good proxy for telling us how mature the vendor is or their security clure is um so we typically look for independent web application pentests ISO certifications softer type to audits um these Auditors are spending months doing invasive reviews of the vendor it and security practices these are things that we definitely cannot

spend time on with our vendors and so um we look at those to really understand how deep the auditor went but also um you know the vendors maturity within um their security practices encryption again a basic security control but one that we found really important as we are evaluating generative AI systems ourselves making sure that all of your traffic all of your data is encrypted um not just end to endend but at every node in this workflow that you see right throughout the life cycle making sure that your prompts and completions your inferences are only unencrypted for short periods of time while they're being processed and then re-encrypted before it's transferred to the next step

in the workflow um that becomes really critical especially you know with the this emerging technology where you just don't know what some of the threats or de vectors are okay um zero data retention and zero training um very early on we realized that this was a key control that we wanted out of our generative AI systems and vendors so zero data detention what that means is making sure that you have the option with your vendor to um not have them store any of your data whether it's prompt inference completion or even log data so what that means means that is that on your side um of this diagram here you have to implement some key

controls you have to look for things like abuse detection you to look for monitoring of you know how um the prompts and completions if there's an issue the prompts might get dropped and so you have to be prepare to resend the prompts back because again in this work in this scenario the vendor is really not storing any any of your data anywhere it's just in memory for Transit transitory time and then it's going to process it and bring it back so if there's any any issues there again it's going to be more on your side similarly with zero zero training right so making sure that none of your Enterprise proprietary uh intellectual property data is being actually used for training

the model itself like we talked right when you train the model with this data it can invert inadvertently leak um you know to the public and so we want to avoid that data masking um again one of your you know what you got Standard Security controls but again important here here in the context of geni so data masking means you are either offis skating or replacing or doing something with or not even sending in some cases what we would call sensitive data so it could be personally identifiable information or pii it could be your intellectual property it could be Secrets it could be any kind of sensitive data that you don't want to leave your uh boundaries

and so in this case you know before you're actually sending the prompts you're scrubbing the prompts for any kind of sensitive information and then the when the completion comes back you rehydrate that same prompt of the response back with the sensor information and you and you serve it up back to the user in the UI and then for them it's a seamless experience but this way you have really minimize the risk of your data getting exposed Michael will talk over the next all right so uh human validation uh that's uh obvious one uh so if you are going to use LM output for any sensitive workflows uh make sure there is some sort of uh human validation uh in that

workflow um input standardization and parameterization uh as we kind of discussed earlier uh you want to treat the LM output as untrusted data so you want to ensure that you are properly sanitizing uh the input that you're sending over to uh your functions and also for any uh queries uh you utilizing parameters to avoid any sort of SQL ey uh type of uh uh vulnerability and when you have a arbitrary code execution you want to make sure that you are uh utilizing some proper isolation so I have Network filters in place uh ensure that you're spinning up a unique execution environment either through a container either for each invocation attempt or uh for each user session and then if you

are managing the uh underlying host uh you'll want to make sure that there's proper kernel level isolation between the containers so using something like firra firecracker or gvisor and then to mitigate against uh prompt injection jailbreaks things of that nature you'll want to utilize uh models to kind of assess uh that input and output that you're processing now there are commercial uh solutions that have come out to provide kind of like a waft type of solution for for AI Solutions however there are some uh uh open source options available so for example there's uh nvidia's uh Nemo guardrails that essentially acts as a wrapper uh for your uh generation attempts so essentially you can Define

uh guard rails that you want implemented for your input and output and then you essentially call uh NE Neo guard rails for the generation and it'll automatically apply that configuration uh for everything that it's processing uh then there's also uh LM guard so this one acts uh a little differently uh here you're essentially uh invoking a scanner you can have scanners of different types and then you pass your prompts and output or completions uh through those scanners to identify anything that's uh outside of uh your uh the parameters you've deign up parameters you've uh uh created so whether it's like toxicity or ensuring that uh all the usage remains within a particular uh use case um now with LM

guard uh for a lot of these it'll utilize uh open source models uh that'll pull down from hugging face uh Nemo guard rails is more so utilizing the LM service provider that you're actually using so it's just utilizing Uh custom prompts to have your llm uh analyze it to see if it meets whatever your guard rail is looking

for and then authorization controls um so there aren't a lot of great solutions for rag uh Vector databases to enforce uh granular access controls so generally you'll want to try to go for uh utilizing data stores or data sources that your entire uh user base should have access to or uh Slots of like departments or so and you can use like metadata filtering to uh to uh filter for those uh when you're uh utilizing that your vector database um however if you try to apply uh very granular uh controls uh there's uh you run the risk of your data sources and your database authorization controls uh going out of sync um so that's the main risk there and then for your

plugins that are calling external resources if they're calling uh non-public uh resources you'll typically want to utilize your users's context whenever possible so essentially utilizing ooth to ensure that you're still enforcing the authorization controls of that Target system and then this is still a we app so your typical secure software development life cycle practices still apply so was just a reminder of you know don't forgetting don't forget those Basics and then you'll want to ensure that there is some sort of high availability configuration in place if uh your your application is supporting a critical business process um another option is to also have a backup uh service provider you can utilize for uh for AI calls um since it is consumption

based might not add too much cost but you may be calling a different model and there's the added overhead of the additional code base that you have to maintain then these are the items we discussed uh these are the get repos for LM gard and emo guard rails if someone want to take a picture of

those all right so what can be tested um so since it since these AI enabled applic are typically web apps your web app pen testing processes uh still apply uh you'll want to also look for any sort of uh information disclosure so any sort of uh secrets that are included as part of uh system prompt or part of the general prompt template that you can get reflected back to um and to look for any uh sensitive data that may have been included as part of training data that you can get the LM to return to you uh as part of uh the completion if uh you're unfamiliar with with prompt injection uh theur Gandalf is a fun way

to uh kind of get acquainted with uh with prompt injection then you have uh vulnerable Integrations and plugins uh this adds uh pretty interesting attack surface so if you have uh systems that you're uh auditing or uh or testing that have various plugins so you'll want to spend a fair amount of time on these uh especially if there's anything like uh like a Gmail plugin or something like that where you're pulling uh a user's email for processing into the application cuz that because then you're involving yet another untrusted data source into your system and then you'll also want to test for the effectiveness of the various guardrails that you would have

implemented and so there are also various s Solutions out there that are starting to include uh AI uh functions uh within their application and so you'll want to make sure you have agreements in place with them to uh make sure that they're not going to utilize your data to train uh public models that they're going to make available to other customers and you'll want to go through the similar thre modeling exercise with those solutions to identify any potential risks there uh if your third party is uh utilizing a fourth party to you gen provider that makes it a little more tricky to audit the whole thing but uh you also you want to make sure your

third part is going through the same due diligence process you would go through and you'll want to scope the assessment um to uh the risk factors of your particular use case and so just just reinforcing that what we kind of went over still applies whether it's internal application a third party LM service provider or a third party AI enabled application and a fourth-party LM provider um or if it's a third party AI solution provider that's training their own models now it's not all doom and gloom there's also ways that you can utilize llms to uh make your life easier um these are a few projects just call out uh that can help you leverage AI uh the

first one is data con AI essentially is a project that implements a uh local uh rag framework for you uh so and utilizes uh local open source models so essentially you would have a local AI chat assistant with your data never leaving your system Tac AI is an AI enabled uh threat modeling uh project so it'll essentially take your system design as input as the AML and from there it'll generate uh data flow diagrams uh potential mitigations potential threats and potential mitigations for those threats and then Microsoft autogen is uh a framework for creating a AI agents and so you can use this framework to help you uh build out uh agents to uh perform

initial triage on Sim alerts or uh assist with initial triage um security support requests or questions and so for the Lessons Learned uh the big thing is don't forget the basics it's uh very easy to uh get pulled into the new and shiny bits and forget about the uh fundamentals uh that should be in place you'll want to as you're assessing these AI systems run through the data flow diagrams and system architectures uh a lot of the components uh that are in play are components that you're already familiar with and so as you go through that exercise you'll quickly start to identify a potential threats for your use case um and your use case may have

the same threats that we discussed here today fewer threats more threats you'll want to scope it appropriately uh and as with any technical risk uh as you're communicating technical risk to your business partners you want to make sure that you're uh Translating that into uh business risk um and then you'll want to partner with your uh business partners to uh enable them to uh innovate securely uh instead of uh waiting for them to come up with uh solutions that you uh May next on site uh you want to it's typically uh a better relationship if you get in early and uh help them find a secure solution and risk acceptance is always an option however uh going if you're going down

that path you'll just want to make sure you understand the risk and that your business partners understand the risk and is there anything You' want to add there I'll just add I'll just add or reinforce that um for us getting early on just you know starting with basic controls like authorization authentication encryption really helped us get a foothold into even understanding the genni systems um but also really we spend a lot of time a tremendous amount of time just walking through the data flow diagram the data architecture and walking through how our data will be processed by the Gen system at every single node and then looking at the security controls at every single node to say at this point is this data

encrypted and who has access and what's happening with it that was really critical for us to really then un get to the true threats of the Gen models but or the systems but also what kind of controls can we Implement in partnership with the business to enable them um yeah that's about it uh there's some additional resources like we'll publish these slides somewhere um but if you want more info on some of the threats that we discussed you'll be able to find that here uh they these resour these resources also have more threats that we just couldn't get through um just because of limited time and with that thank you uh thank you for attending this session uh so

this particular uh image generation model had some trouble spelling thank you so it worked on another attempt so thank you um and then that will'll open up for questions so the generation model also couldn't spell out question or questions so we want Q&A you actually got a couple questions uh one that came uh from we are Anonymous so definitely a take on Anonymous with vendors uh is exclusion of your company's data from storage training is that typically an admin control setting or does it require Custom Contract language um it's actually both so we with our vendors we made sure it's part of the contract language but before you can get to the contract language we

actually had to work with the vendor and the business to make sure that it's actually Fe feasible right the business use case doesn't require the actual context or the prompt to be stored or responses to be stored to serve up further prompts so that was step one and then the next step was then to figure out for the venders especially early on in January February of 2023 nobody had knew what was Zero data retention and so both our engineering teams and their engineering teams and security teams had to work together to figure out okay what do we do and then how can they process the prompts and return back without any kind of storage um and so there was some

architectural discussion there and then the last piece over there was how do we as the as their customer the vendor's customer ensure that zero data is actually turned on and so typically as far as what we've seen right now it's not a configuration that we can turn on or off it's something that the vendor turns on or off but we have visibility in our admin dashboards to actually see when when it's turned on okay thank you another question and if you folks have any other questions please go ahead and throw them into the system here um have you done work around creating frame works for developers integrating llms into product code and do you monitor or

track usage not specifically our team but we have a partner team in park security that is actually worked on a framework it's not specifically a framework but definitely we have guidance on how our internal teams can develop against generative AI um yeah we have we have a product team that's looked at it it follows some of the similar things you know your stlc best practices secure coding best practices web app best practices and then you throw in some of these um you know uh new ideas but um yeah we haven't done it particularly but our partner teams have okay well thank you we've got a couple of gifts for both of you this one's for you rash and we have something

for you Michael and this is a care of socket security so our friends at socket security have been gracious enough to sponsor these uh these gifts to our presenters uh thank you both very much thank you folks for coming

Securing Generative AI: Is it all an Illusion?

Related talks