
Hello everybody, the time is 1:30, and welcome to my presentation. My name is Jake, I'm a global infosec analyst at Crown Holdings, and my presentation today is Bridging AI Innovation and Data Security: an architectural approach. Mostly what I'm going to be covering is half technical and half strategy on how you should be bridging the gap between security and infrastructure in secure AI deployment.

So of course, let's start off with why this matters. At this point, if you're not already there, that's okay, but for a lot of orgs out there, customized AI models are driving business-critical functions. They're being used to automate and optimize business processes so organizations can become leaner overall: things like fraud detection, customer service, and supply chain management. Weak access controls over these can have some serious consequences. You can expose sensitive data, compromise these systems, and generally erode customer trust in the org, and that's obviously not a good thing for anybody involved. So security is foundational for reliable AI adoption, and you need to ensure that systems are not only safe but also reliable, with little to no downtime, just like pretty much any other system you'd have in the enterprise.

The challenge with all this is that security teams need to ensure they're assisting in innovation. They need to keep AI systems open enough to drive innovation, but not so open that threat actors can come in and abuse those permissions. They need to enforce access controls against these systems and make sure that the AI systems, and the databases they take advantage of, are not completely open to anyone and everything. They also need to foster collaboration and ensure that the business, security, and IT infrastructure are all working together toward a common goal. And of course, systems need to be secure by design: make sure they're secure from the get-go so you don't have to go through remediation later on. And you still need to be vigilant, as always.

So where do we start? I think we should start with what I like to call the three Ps. It's like the PPT stack (people, process, technology), but it goes people, process, and policies. Overall, I believe people is the most important part of this stack, because when you're entrusting individuals with these sorts of permissions and augmented abilities within systems and data they already have access to, you need to ensure that they understand not only what they're doing with that, but also the implications of the permissions they have.

So, some strategies for bridging the divide: you want communication, collaboration, alignment, and shared objectives. You start with alignment: define a unified vision and ensure consistent goals across teams, like I've already mentioned. You want shared objectives: measurable mutual goals, shared successes, and tracking progress in everything you're doing. Establish clear and open communication channels, encourage feedback, and foster transparency among all silos of the organization. At the end of the day, you just need to encourage cross-functional teamwork and use everybody's diverse expertise for developing and implementing these kinds of solutions.

So, some of the key risks. Obviously unauthorized access: stolen credentials or misconfigured permissions can allow people to gain access to systems, and that can lead to data exfiltration, which you obviously don't want. API exploitation is another common one, because a lot of these LLMs are just APIs, just wrappers at the end of the day, and you don't want that to be disrupted. And the ancillary systems are also a very important piece of the puzzle, because there are a lot of databases, cloud services, and network components that go into using LLMs and all the data they have access to in order to perform tasks and different objectives.
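One common mitigation for that API-exploitation risk is rate limiting in front of the model endpoint. As a rough sketch (the class name and the capacity/rate parameters here are my own illustrative choices, not anything specific from the talk), a token-bucket limiter in Python might look like:

```python
import time


class TokenBucket:
    """Minimal token-bucket rate limiter: allows a burst of `capacity`
    requests, then refills at `rate` tokens per second."""

    def __init__(self, capacity: int, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens for the elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(capacity=3, rate=1.0)
# A rapid burst of 5 calls: the first 3 pass, the rest are throttled.
results = [bucket.allow() for _ in range(5)]
```

In a real deployment this check would sit in an API gateway or middleware, keyed per client or per API key, alongside the quotas and RBAC covered later in the talk.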
So you should go with the four pillars of access security. Start with user-based access controls, as per usual: use actual multi-factor authentication. I know Microsoft just enforced it for Azure and everything, but that does mean you should still go and configure conditional access. Use network-based access controls, like firewalls and your usual segmentation strategies, to ensure that only specific systems and specific people have access to these sorts of resources. And you should be rotating your API keys on a regular basis and using some sort of key management system; Azure Key Vault is a good one.
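To make the key-rotation point concrete, here's a minimal sketch of age-based rotation using only Python's standard library. The `ApiKeyStore` name and the 30-day window are my own illustrative assumptions; in practice a managed service such as Azure Key Vault would store the secret and a scheduled job would trigger the rotation:

```python
import secrets
import time

ROTATION_SECONDS = 30 * 24 * 3600  # rotate every 30 days (illustrative policy)


class ApiKeyStore:
    """Tracks an API key and its issue time, rotating it on demand."""

    def __init__(self):
        self.key = secrets.token_urlsafe(32)  # cryptographically strong key
        self.issued_at = time.time()

    def needs_rotation(self, now=None) -> bool:
        now = time.time() if now is None else now
        return now - self.issued_at >= ROTATION_SECONDS

    def rotate(self) -> str:
        old = self.key
        self.key = secrets.token_urlsafe(32)
        self.issued_at = time.time()
        return old  # caller should revoke the old key upstream


store = ApiKeyStore()
fresh = store.needs_rotation()  # just issued, so no rotation needed yet
stale = store.needs_rotation(now=store.issued_at + ROTATION_SECONDS + 1)
old_key = store.rotate()
```

The important part is the last line's return value: rotation is only useful if the old key is actually revoked wherever it was accepted.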
But really, at the end of the day, you don't want those keys leaking, because that's just a bad day for everybody involved. And you want to protect your supporting systems: your authentication servers, your logging and monitoring servers, and everything that underpins the overall access security infrastructure and maintains the integrity of all these systems. Use the principle of least privilege, role-based access control, conditional access policies, just-in-time access (always a good thing to have on for these kinds of things), and privileged identity management.

And really, at the end of the day, you also want to be securing the LLMs themselves. Use input sanitization to prevent prompt injection; you can actually use other LLMs to verify attributes and verbiage of different prompts, so you can protect against prompt injection attacks that way. You want to prevent data leakage: if you're going to train your own model, use differential privacy techniques to protect the data during training and anonymize it overall. You want to harden your output handling: just like with the secure prompt design piece, use LLMs in such a way that you can inspect both the inputs and the outputs and search for specific indicators of attack in each. And over-reliance is also a big piece of the puzzle, because at the end of the day you don't want to be completely, 100% reliant on these things. They're imperfect, that's quite obvious, so you want your workflows to be human-in-the-loop.

Some extra security measures specifically for your models. Validate your third-party data sets if you're going to use those; otherwise, if you're using your own internal ones, you don't really need to worry about that. Employ version control and hashing, just like you would with any other application security project. Regularly retrain with cleaned-up data, and learn from the different outputs you get depending on what data you feed it, what formats you're using, and how much you give it of one thing versus another. Encrypt your files and weights: if you have an internal model that you're training, you don't want any of that extra stuff to end up in a GitHub repo somewhere. Deploy behind APIs with rate limiting, quotas, and RBAC, because you can rotate your API keys, but at the end of the day, if you're giving too many permissions to those APIs, you don't want to deal with the ramifications of that later on. Don't give LLMs access by default. One thing I like to do: if you're going to be feeding an LLM data, you don't want it to have default access to go and search a data set. You want a function of your application to give it the data it needs to perform the task at hand, rather than having it go and search for that data itself. Test for biases and toxicity, because obviously you don't want the outputs to be something negative. And secure the supply chain: the libraries you use, dependencies, any vulnerabilities; do the regular application security stuff that you would normally do against these kinds of things.

Another piece I wanted to touch on is the importance of the prompt itself. You want to frame the task; this defines the context and focus, and you don't want it to be so open and broad that it could give an unexpected response. You want to be definitive and decisive and all that kind of stuff. If you give it instructions in a specific style, that's going to help guide it toward the output that you want. So if you give it decisive instructions, like I said, you'll get a decisive output, but if you give it something wishy-washy that doesn't really have a clear goal in mind, it's going to give you different responses each time.
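That framing advice can be captured in a small prompt-builder. Everything here (the function name, the template wording) is just my own illustration of "decisive, well-scoped instructions," not a quote from any particular framework:

```python
def build_prompt(task: str, persona: str, constraints: list[str]) -> str:
    """Assemble a framed prompt: a persona, a decisive task statement,
    and explicit constraints that narrow the expected output."""
    lines = [
        f"Act as a {persona}.",
        f"Task: {task}",
        "Constraints:",
    ]
    lines += [f"- {c}" for c in constraints]
    lines.append("Answer decisively; do not speculate beyond the data given.")
    return "\n".join(lines)


prompt = build_prompt(
    task="Classify this alert as benign or malicious and justify briefly.",
    persona="cybersecurity analyst",
    constraints=[
        "Use only the fields provided in the alert",
        "Output a one-line verdict first",
    ],
)
```

Templating prompts this way also makes them reviewable and versionable, just like any other application input.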
And usually you want it to be some sort of definitive thing. Give it a persona: literally, if you go and tell a model to act as a cybersecurity analyst, it will go and do that. It will give you way better outputs, way better information, and it will do way better analysis of an alert you throw at it versus if you just kind of shoved it in there and said "take a look at this." Ask for step-by-step logic: you can actually input extra parts, like in an API call, to give it a step-by-step of how it should respond. It's called few-shot prompting: you give it examples and teach it how to respond the way you want, or how to analyze something that you give it. And just keep refining your prompts. At the end of the day, if you're getting one result for one prompt versus another, find out where those differences lie and act on them.

And at the end of the day, a lot of defending AI models has to do with the infrastructure that surrounds them. Keep them in sandboxed environments. Whitelist IPs; don't just have it open by default and then blacklist stuff later. Enforce strict access controls, and implement consistent patching and configuration management for any servers you're deploying with this sort of stuff, you know, just the basics. Do your data validation and integrity checks, just like any other big data application you would use: sanitize inputs, sign and hash your models to detect alterations, and employ regular integrity scans and version tracking for all this kind of stuff. And you want to have the ability to respond just as you would in any other case: if you got an EDR alert, you'd want to have the ability to respond to it immediately, and you want to do the same thing with this sort of stuff. Have the same controls around it as you would any other endpoint, any other workstation or server in your network environment, and really just treat it like any other server at the end of the day.

So at the end of the day, security needs to be a business enabler, and you don't want to be the department of no. You want to ensure that everything you do is promoting some sort of innovation, and instead of saying no, just say "yes, but," because there's always an alternative way to help people out and get what they need done while still being conscious of security. So overall: implement least privilege, and ensure that users, and especially systems, only have the minimum permissions necessary. Use private endpoints and segment your networks, just like anything else. Protect the ancillary systems: databases, storage, everything that's holding up a model doing a business-critical activity, because if one of those goes down, there goes the entire model. So at the end of the day, it requires collaboration between security and IT to get this sort of thing to a truly production-ready state and bring it all together, where you can actually have confidence that it's not going anywhere. But yeah, that's all I've got for you guys, if anybody has any questions.