
Not Another AI Talk - Sarah Connelly

BSides Bristol · 19:39 · 36 views · Published 2024-01 · Watch on YouTube ↗

Hi everyone, thanks very much for coming. So, the reasoning behind this talk, and especially behind its title: we've heard a lot about AI this year. It's become a bit of a hot topic, a bit of a buzzword, and it promises to do everything from saving humanity from climate change to destroying humanity; it will supposedly be the next technology that's in everything. While some of those promises may come true, I would argue that it's not true for all of them. It also puts us in a very difficult position as security people, because we're essentially trying to learn how to manage a brand new technology. But this is more of a call for everyone to relax a little bit: we don't need to worry about it as much as we are, and in fact a lot of the things we're already doing, we can continue to do, even with AI.

A little bit about me. I've been introduced; hi, I'm Sarah. The name isn't a typo: I'm not Sarah Connor, so I can't provide that level of knowledge on AI. I'm currently working as a lead cyber security specialist, so I work primarily with developers on secure design principles, giving best-practice advice, that sort of thing. I also hold the secure software lifecycle professional certification, so anything to do with software and information security is my wheelhouse and my interest. As it says at the bottom of the slide, I'm very introverted but I have ADHD, so the talk might run over; hopefully not. Please come talk to me afterwards. I'm very friendly, but I'm probably not going to approach anybody, and I'm more than happy to chat about any of the things we cover.

OK, rough agenda for today, if it stays on track. First, I'm going to spend a bit of time defining what I mean by AI. The reason I specifically call this out is that there are lots of different definitions for something like AI, and a lot of crossover with machine learning technologies. This is intended to be a principles-based talk: I'm not going to get into the technologies, and I don't have the time to get into the philosophy of when a machine is actually human, or whether AI means true sentience. So I'll give you a framework for what I mean when I say "AI in projects" today. We'll then go back to basics: we'll talk about secure software principles, the things we currently use to guide how we secure software. I'll then talk about how we can apply those to AI, so when we're looking at using AI in projects and businesses, what we can do with what we already have. And then I'll cover some unique AI risks, because there are a few we need to be conscious of, but they're not as big or complex as you might foresee.

On to the definition. I've tried to break it down into three broad areas, and as I said, this isn't from a technical standpoint: I'm broadly looking at the logic the application uses, whether it needs training or not, and what the outcomes of the application will be. I've handily broken them down into three here. First, the traditional application. If you're programming an application, it's going to be very linear: you go through lines of code one by one, and it's static unless you reprogram it or change it in a significant way; it will always follow those same commands one by one, unless you have timing errors or things like that. It doesn't require any training; it's built to do the thing that it's built to do.

And the outcome is defined: you can't really program something with traditional programming languages without knowing what you're after, because you won't be able to tell the computer how to do it. With machine learning, however, what you've got is evolving logic. These are algorithms being refined to do something where we don't necessarily know how they will do it, but the outcome itself should still be defined: we should have a really clear sense of what the algorithm is intending to do, and that's how we tell whether it has completed successfully or not. AI moves more towards the human end of the spectrum, where you've got an ongoing need for training. It has evolving logic, hopefully like a human, but it's unpredictable. So for tasks where we don't necessarily know what we want, or where we want to provide some information without really knowing what the output should be, AI is quite good, because it now has the ability to, not necessarily create new things, but present them in a new way, or combine them in a novel way.

So, back to basics. I think we've all seen these very basic information security principles: keeping things confidential, keeping things correct, and keeping things available.

But to break those down: when you're defining a component for a traditional application, what you're really checking is that you have assurance it's working correctly, and that you have some way of checking up on it: reporting, monitoring, controls. I'll break that down shortly when we talk about how to apply it to AI, but in general those are the sorts of controls you're after when you're making a traditional app.

You also have your human controls. In an organization you hopefully have users who will be using the system, and they're going to need some limits on their behaviour, because you can't really define what a human will do with a system. If they have access to something, they will probably use it; if there's a big red button saying "delete all my data", they might well press it. So access control is about limiting what they can do and when they can do it. Then there's the really interesting principle of psychological acceptability, which I'll get into later, but essentially humans are really bad at doing things if they don't see any sense in doing them.

So, traditional application controls. First of all, redundancy principles. If you're building an application for a business, you want to make sure it's resilient, and one way of achieving that is redundancy: if the application has a particular function it needs to complete, and that function is unavailable, what happens? How do we prevent the system from breaking when that thing isn't available? For AI, what we're really talking about is resilience. If you're incorporating AI into your projects, you need to make sure it's resilient, and chaos testing is a great way of checking this: if there's unpredictable input or output, or the system is behaving in an unpredictable way, what happens? What does it do? Does the system continue to function as expected? Load testing is very similar: hammer it with data, see if it breaks, see if it does things it shouldn't do. And then maintainability: if you're going to keep this as a component in your application for a very long time, can you actually maintain it? Will it continue to function the way you expect over a length of time?

You also have equivalence, another component of redundancy, and this is going to be quite important for large organizations or high-traffic systems. If you have an AI processing information, any change in the way it's been trained, or in the logic it uses to form its decisions, could influence the output. So if you have, say, a spare version of it for redundancy, any changes to that spare may drastically change the outcome. You don't want a situation where component A is swamped with information, so component B takes over, and because it has very different learning you suddenly get very unexpected outcomes from that process. You can also look at things like containerization: any way you can make that artificial intelligence replicable, easy to use, and easy to deploy will make your lives a lot easier.
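One way to sketch the resilience idea above: wrap calls to the unpredictable component with a time bound and a single, defined fallback, so that hangs and crashes degrade to a known state instead of breaking the system. `model_fn` here is a stand-in for whatever component you're calling, and the fallback value is invented for illustration.

```python
import concurrent.futures

FALLBACK = "SERVICE_UNAVAILABLE"  # a defined, safe default the rest of the system understands

def resilient_call(model_fn, prompt, timeout_s=5.0):
    """Call an unpredictable component with a time bound and a known fallback."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(model_fn, prompt)
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return FALLBACK  # too slow: fail to the defined state instead of hanging
    except Exception:
        return FALLBACK  # crashed or misbehaved: same defined state
    finally:
        pool.shutdown(wait=False)
```

Chaos and load testing then become straightforward: feed `resilient_call` broken, slow, and oversized inputs, and assert you always get either a real answer or `FALLBACK`, never a hang or a crash.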

lot easier so fail safe um pretty basic um principle for information security if something goes wrong you don't want it to be less secure um than when you started so an example of this is if you have a two-factor authentication system and that two Factor authentication system isn't responding or is taking too long to respond you don't want your login system to say oh actually you know single fact is fine it should prevent the user from accessing it so you don't have a less safe outcome the problem when you're incorporating AI into these processes however is it's really unpredictable outcome as we've spoken about so what you need to do is try and identify some

failure States but they're going to have to be rather extreme so um if you're expecting text output from a particular uh algorithm what you would want to do is set an up a limit to it if you're receiving a report from it you're probably not expecting up to like a million words hopefully um that might be a lot of processing so at the extreme end you can put some boundaries in place where at least the output is not completely rid ridiculous or will fall will cause your system to fail there's also recoverability so if something goes wrong how do you get it back to the way it was and for AI you really need to be thinking about how

you're going to replicate the output from it if you're developing an AI to do a task and you cannot do it manually that is going to be a massive risk to a business because if it doesn't work how is the business going to continue to do that function especially if it's a critical function um we need to make sure that we can continue to follow the logic even when it doesn't operate even if it does continue to operate that AI you're basically beholden to any updates it needs any kind of Maintenance and if it becomes very expensive to run if you can't do it manually you have to keep doing it and then finally data validation um
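The extreme-boundary idea can be sketched as a small gate on the model's output. The word ceiling here is an arbitrary illustrative number, not a recommendation; the point is that anything outside the bound is treated as a failure, not a result.

```python
MAX_WORDS = 100_000  # extreme ceiling: a legitimate report will never be this long

class OutputOutOfBounds(Exception):
    """Model output fell outside the defined failure boundary."""

def accept_report(text):
    """Fail safe: refuse absurd output rather than passing it downstream."""
    if not isinstance(text, str):
        raise OutputOutOfBounds(f"expected text, got {type(text).__name__}")
    n_words = len(text.split())
    if n_words == 0 or n_words > MAX_WORDS:
        raise OutputOutOfBounds(f"{n_words} words is outside the accepted bounds")
    return text
```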

And then finally, data validation. Where you have data going into the model or coming out of it, you want to make sure it's as expected. This is particularly important for AI projects because you don't necessarily know how it's going to send data back to you: what if one day it decides to send you back numbers rather than words and letters? So put validation in place, and I'd argue, in line with your human processes: if you're expecting input from random users, you apply certain controls; use the same controls with an AI. It's unpredictable, it could be free text, so make sure it's in the format you expect, and make sure you reject anything you don't expect.

On to the human controls, because they also apply to AI. First of all, identity. Where we have human actors on a system, you want to know who has done what, and the reason is obvious: if something goes wrong, you want to be able to find the person who did it and ask them what went wrong. For AI, particularly if you have multiple versions running, or multiple components that use AI, you want some way of identifying the particular instance that was running at the time, and you need very strict version control. If you don't have strict version control, then as the logic evolves and starts making different decisions, how will you trace back and find out where an error could have occurred?

Then authentication. The AI is almost going to be running as its own actor on your platform, so you need some way of validating it. You can't really give it passwords, because as we know, AI will tell you things it maybe shouldn't, if you have a chatbot for instance. So instead of secrets it knows, we should look at other authentication methods. Integrity checks: can we prove that the process running is the one we expect, and that the input isn't coming from some external source? Are there certificates we can use to validate the exchange of data?
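One sketch of an integrity check of that kind: pin the model artifact to an approved digest before loading it, tying integrity back to the strict version control mentioned above. The file name and digest below are hypothetical, and a real deployment would pair this with signing or certificates.

```python
import hashlib

def sha256_of(path):
    """Digest a file in chunks so large model artifacts don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

def load_verified_model(path, approved_digest):
    """Refuse to run an artifact that doesn't match the approved, version-controlled digest."""
    actual = sha256_of(path)
    if actual != approved_digest:
        raise RuntimeError(f"integrity check failed for {path}: got {actual}")
    return path  # in a real system, hand off to the actual model loader here
```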

That sort of thing. And finally, the big one: authorization. We know what or who is acting on our system, and they've proven it, but how do we check what they're allowed to do? For AI in particular, we should be checking every single action it takes, and granular access controls will be especially important. The reason is that if you have AI in a project, doing things within your system or making changes to it, and something goes wrong, the ability to withdraw access for one particular function is quite valuable: it prevents other changes to the system, or other unintended impacts. You can withdraw access to just the troublesome part while you work on a fix; otherwise you have to stop it functioning altogether while you work out what's gone wrong.

Then separation of duties; these may all be very familiar to you. Separation of duties means making sure one individual actor can't do everything, and again this is really important for AI, because AI can have very unanticipated functionality, especially if you're asking it to do multiple things. It can get very good at doing one thing currently, but if you ask it to do multiple things, it can sometimes do them in unintended ways, and that's where you have a risk of emergent functions: the system doing things that weren't intended by its original use case. So we could look at separating out instances of AI: if you need it for several functions, have a different instance for each function, and make sure they're segregated; they don't share permissions and they don't share the same access.

And then finally, the psychological one. I know I said I wasn't going to get into the philosophical area of debate, but there is a possible future in which AI starts to become more sentient, or at least displays more autonomy than it does now. There's a massive topic of discussion around AI safety; it's its own discipline, and I can't cover it in 30 minutes, but you can look to AI safety for how to correctly motivate artificial intelligence, particularly as it becomes more autonomous, to make sure it's achieving the right outcomes. That's not one we have to worry about yet, but maybe in the future.

So, I did say I'd talk a little bit about AI-specific risks, because they are an important problem, and I don't want people to come away thinking AI is nothing new. There are things we need to consider as professionals in the industry when we're dealing with it, but as you'll see, they're not as big as you might otherwise think.
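The authorization and separation-of-duties ideas above can be sketched together: per-instance allow-lists, checked on every single action, where one troublesome permission can be withdrawn without stopping the whole component. The instance name and actions here are invented for illustration.

```python
# Per-instance allow-lists: each AI instance gets only the permissions its
# function needs, and instances never share an allow-list.
permissions = {
    "summary-bot-v2": {"read_tickets", "draft_summary", "update_status"},
}

def authorize(instance, action):
    """Check one action for one identified AI instance; deny anything not granted."""
    if action not in permissions.get(instance, set()):
        raise PermissionError(f"{instance} is not allowed to {action}")

def revoke(instance, action):
    """Withdraw just the troublesome permission while a fix is worked out."""
    permissions.get(instance, set()).discard(action)
```

After `revoke("summary-bot-v2", "update_status")`, only that action starts failing; the instance can keep doing its other work while you investigate.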

The first one is data quality and confidentiality. As I mentioned, if AI has access to information, it will use it when making decisions. This is primarily about algorithms computing a statistical outcome: the model just computes what is statistically most likely, and it will use anything at its disposal. There have been trials looking at recognizing tumours in X-rays where the AI was using file-name data or font data within the X-rays to identify which hospital an image came from; hospitals with more patients are more likely to have positive X-rays, so essentially the model became very good at identifying where an image came from rather than whether there was a tumour in it. So for data quality and confidentiality: obfuscate it if you don't need it, hide it if you can.

Next, testing and training. This is an important point, and there are now tools available to analyze the training data and the test data you're using for these models, because these data sets can be massive. You need to check the contents of the data sets to make sure that what you train the model on is not also in your test data; very often these large data sets contain information shared between multiple sets. If the model has already seen an example during training, it already knows what the outcome should be, so the test isn't really presenting it with anything new. The problem is that if you have an 80% success rate on your test data, but 50% of that data is things the model has already seen and been trained on, that's not a validation of how it will work in the real world, and very often companies find that once the model is deployed, its success rate is an awful lot lower than it should be.
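A toy version of that train/test overlap check, using exact matches only; real tooling also has to catch near-duplicates, which this sketch doesn't attempt.

```python
def overlap_report(train_examples, test_examples):
    """Return the fraction of test examples already seen in training, plus the
    offending examples; a high fraction means the test score is inflated."""
    train = set(train_examples)
    seen = [x for x in test_examples if x in train]
    fraction = len(seen) / len(test_examples) if test_examples else 0.0
    return fraction, seen
```

For instance, `overlap_report(["a", "b", "c"], ["b", "d"])` reports an overlap of 0.5, which would tell you that half of any measured test accuracy says nothing about real-world performance.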

And then finally, probably one of the biggest ones: unrealistic expectations in business. I'm sure over the last six or seven months we've all seen stories where somebody has tried to use AI for something, and you wonder: was it really the best use of AI, or were they just trying to do an AI project? There's a risk that businesses will run away with it and find a use case just because they want to be seen to be doing AI. The question you really need to ask in those situations is: OK, we've got an idea for how we might use AI, but are there better technologies available? Is there another way we could program this, or manage this? Very much in the same way we had a run of blockchain technologies over the last decade: it's an excuse to use the technology, and it's very exciting, but it's not necessarily the best use case for it.

So, to sum up: you don't need to reinvent the security wheel. That's the biggest takeaway I hope you take from this chat. We already have some very, very good principles for secure software engineering in place, and using them, maybe viewed through a slightly modified security lens, is perfectly fine. In fact, it's probably going to make your lives a lot easier when the business comes to you and asks you to start using these technologies, or maybe they already have. There are some challenges in how we're going to incorporate these models into applications in the future, but they're not insurmountable, as long as we've got the time and resource to look at them and we're not getting bogged down reinventing access controls or separation-of-duties principles. And it can be revolutionary: there are a lot of very interesting applications for artificial intelligence. I think there's a talk today about how we might use it for threat hunting, and there are going to be some really interesting use cases we can develop; it's going to be able to do things much quicker than we can. But please remember: that's given the right use cases, and when you're talking to the businesses you work for, sometimes the hardest conversation about AI will be trying to convince them not to use it.

So I think I've actually run ahead, which is great; that doesn't normally happen. That means we've got plenty of time for Q&A, but if you don't want to ask me anything right now, you can contact me on Discord or find me today. As I said, I'm very friendly,

I'm just a bit of an introvert, so please come talk to me. Any questions from anyone? I'm not sure if the stunned silence is a good thing or a bad thing.