
Model Robustness Isn't Security

BSides Las Vegas · 2022 · 54:57 · 31 views · Published 2022-09
Category: Technical
Style: Talk
About this talk
Sven Cattell argues that model robustness is often conflated with security in machine learning discourse, but the two are distinct concerns. The talk examines formal definitions of adversarial robustness, demonstrates the accuracy-robustness trade-off, and argues that for security-critical applications like malware detection, general accuracy and proper system hardening matter far more than adversarial robustness. Practical security layers—data protection, monitoring, defense-in-depth—outweigh robustness investments.
Original YouTube description
GT - Model Robustness Isn't Security - Sven Cattell Ground Truth @ 18:00 - 18:55 BSidesLV 2022 - Lucky 13 - 08/10/2022
Transcript [en]

Hey, good afternoon, welcome to BSides Las Vegas Ground Truth. This talk is, right, on the slide: Model Robustness Isn't Security, by Sven Cattell. A few announcements before we begin. We'd like to thank our sponsors, especially our diamond sponsors LastPass and Palo Alto Networks, and our gold sponsors Amazon, Intel, and Google. It is their support, along with our other sponsors, donors, and volunteers, that makes this event possible. The talks are being live streamed, except for the Underground track, and as a courtesy to our speakers and audience we ask that you check that your cell phones are set to silent or vibrate. If you have a question, raise your hand. We currently don't have the wireless mics active, so you will

have to speak kind of loud, and we'll just have the speaker repeat your question. There's interference from the mics in another room, so that's what we've got to do. As a reminder, the BSidesLV photo policy prohibits taking pictures without the explicit permission of everyone in frame. These talks are all being recorded, except of course in Underground, and will be available on YouTube in the future. We would like you to please keep your masks on at all times, and if you need to move closer to people to adjust your view or hearing, please also respect social distancing. With that, let's get started. Welcome, Sven. Thank you. Hi, so I'm Sven Cattell and this is my talk

on model robustness and security. Okay, so I'm giving this talk because there's some movement in the policy space, and a lot of academic work, claiming that model robustness is absolutely essential for security. I know of not one security company that actually thinks model robustness should be included in security, and dozens that think this is pointless. But because there are a lot of loud people arguing that this is security, this is a thing. So, about me: I founded a startup sort of in this space; it doesn't actually relate to model robustness. I've got a PhD in algebraic topology from Johns Hopkins and a postdoc in

geometric machine learning. I founded the AI Village six years ago; we're going to be at DEF CON for the fifth time this year. I used to work at Endgame and Elastic on their large malware models. My Twitter is comathematician. You can find the slides on GitHub. Because I'm making an argument that is a little bit of a thing here, I have a lot of slides; you can see there are 55 of them. Some of these slides are here mostly to cover my ass in the argument space and aren't necessarily here for the narrative. So this is mainly an argument about definitions, because

there's a technical definition and the layman's definition, and they're a bit different. So the layman's definition, which is here for everyone. From OpenAI: "Adversarial examples are inputs to machine learning models that an attacker has intentionally designed to cause the model to make a mistake; they're like optical illusions for machines." And from TensorFlow: "Adversarial examples are specialized inputs created with the purpose of confusing a neural network, resulting in the misclassification of a given input." This sounds like a legit thing that we need to be worried about. We need to worry about adversarial examples from these two definitions, because specialized input that causes misbehavior is sort of the key for a lot

of security. A stack overflow or use-after-free bug needs specialized input to actually take advantage of it; all of Metasploit is specialized input. So that sounds like a serious security vulnerability. But these are machine learning models, and because these are statistical models they are not guaranteed, so any model error can fit the definition if you kind of squint at it. This is not actually close to what practitioners who actually work in the space mean. And if you use this definition people can sell you snake oil, and even if you don't use this definition people can sell you snake oil, and snake oil is being sold in

this space. The definition for a robust model, and the thing is there are two different meanings of the word robust, but this is what gets thrown around for adversarial robustness and the other robustness, is sort of "your model is good". The problem is that both of those definitions basically mean your model is good: you've trained it on enough data that it actually generalizes well. But the problem is that they get to check that it's robust by wiggling some minor parameters and not actually testing on proper stuff. So these are the definitions I'm hoping you walk away with at the end, and we're going to go through them point by

point. If you want, go look at the slides online, but we're going to get started. What is a neighborhood? This is the definition of a neighborhood that I use as a mathematician. You see there are two different ones that look almost identical: one is the L2 ball and one is the L-infinity ball. The L2 ball forms spheres and hyperspheres, which are nice and neat and which you are familiar with, and the L-infinity ball forms cubes and hypercubes. That's the definition that I look at and think spheres and cubes, but we can't show that to executives.
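For reference, the two neighborhoods named above can be written in the standard way. This is a sketch in the usual textbook notation and not necessarily the notation on the slide; the symbols x, x', ε, and k (the number of dimensions) are assumptions.

```latex
B_2(x, \varepsilon) = \Bigl\{ x' \in \mathbb{R}^k : \sqrt{\textstyle\sum_{i=1}^{k} (x'_i - x_i)^2} \le \varepsilon \Bigr\}
\qquad
B_\infty(x, \varepsilon) = \Bigl\{ x' \in \mathbb{R}^k : \max_{1 \le i \le k} |x'_i - x_i| \le \varepsilon \Bigr\}
```

The first set is a sphere (hypersphere) of radius ε around x; the second is a cube (hypercube) of side 2ε centered on x.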

Part of the thing is neighborhoods get really complicated in higher dimensions. Here there's a big sum over k elements, and that k could be 3,000 or more in the case of machine learning, and that gets really funky. So I'm going to show you why it gets so funky. This is a classic problem that you give to little kids: if you want a rope that goes around the Earth and sits one foot off the Earth, how much longer than the circumference of the Earth do you need to make it? It turns out it's just 2π feet longer, and if you give this to a little kid they

get confused and tell you it's going to have to be much longer than that. Also with dimensionality you've got the scaling laws: the surface area of a 3D object grows with the square of the height, and the volume grows with the cube. Because of that, large mammals have difficulty with heat, so elephants have big flappy ears to combat the square-cube law. In higher dimensions it gets much more complicated than this. This is a puzzle that, if you ask at a math

conference, and people have done this and polled the mathematicians on what happens, they get it wrong. You build this thing where you take a cube, you put eight little spheres in the corners, and then you embed a sphere in the middle that's just touching them, like this 2D example. Then you take a hypercube, where you have 16 spheres in the corners and the little guy touching them in the middle. At 10 dimensions the little guy sticks out the sides of the actual cube, and as you grow the dimension the volume of the sphere in the middle eventually goes to infinity. So dimensions and things get

really weird, even for mathematicians who are trying; we give each other dimension puzzles all the time, and you can present this at a conference and you're going to get a lot of wrong answers. So dimensions are weird, and here's how it relates to the security space. MNIST is a dataset that everyone in machine learning tests their models against. It's been sort of a benchmark dataset for donkey's years; Yann LeCun released it in the early 90s. It's 50,000 digits, and each image is a 28 by 28 image with 8-bit pixels, so each pixel is one of 256 values, and there are 784 dimensions

because that's 28 squared. If you just vary each pixel value either up by one, down by one, or not at all, you have three options for each pixel, and there are 784 of them, so you get 3 to the power of 784. That works out to be about 1,200 bits of information, in the sense of the bits a cryptographic key would have. So dimensions get really weird even for really small machine learning problems, and this is the tiniest one; this was solved back in the 90s, this isn't a problem these days. So

what you really need to understand about what a neighborhood is, and how this relates to machine learning, is that machine learning operates in high-dimensional space and the volume of the space grows exponentially with the dimension, sort of. There are little white lies in everything I'm saying, but it grows exponentially, and this shows up in machine learning over and over again as the curse of dimensionality. If you have a clustering algorithm and you want to prove that it converges, pretty much always you're going to say, well, I require 2 to the d amount of data as the scaling factor to prove that this converges, where d is

the dimension. To converge you need an exponentially growing amount of data to actually prove that it converges. So that's the curse of dimensionality, and it keeps showing up over and over again. This is one major problem that shows up all the time in traditional machine learning. Since we don't make proofs about deep learning it doesn't really show up there, it's not really spoken about, but it does show up in the stuff that we're talking about today. So now that you've got a bit of an idea about the geometry of the space, we're going to talk about adversarial examples, and you'll see why this relates to what a dimension is in a bit.
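The MNIST count mentioned above is easy to check. This is a minimal sketch, assuming the three-choices-per-pixel model described in the talk (up by one, down by one, unchanged); the function name is made up for illustration.

```python
import math


def perturbation_bits(num_pixels, choices_per_pixel=3):
    """Bits needed to index every perturbation: log2(choices ** num_pixels)."""
    return num_pixels * math.log2(choices_per_pixel)


# MNIST images are 28 x 28 pixels, so 784 dimensions.
mnist_bits = perturbation_bits(28 * 28)

# Cross-check with exact integer arithmetic instead of floating point.
exact_bit_length = (3 ** (28 * 28)).bit_length()

print(f"one-step perturbations of an MNIST image: about {mnist_bits:.1f} bits")
print(f"bit length of 3**784: {exact_bit_length}")
print(f"times larger than a 256-bit key space (in bits): {mnist_bits / 256:.1f}x")
```

Under these assumptions the count comes to roughly 1,243 bits, which is consistent with the "about 1,200 bits" figure quoted in the talk and far beyond a 256-bit key.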

So this is the, I think, legally required image for adversarial examples. This is the panda-gibbon image from the second major paper, from Ian Goodfellow, Christian Szegedy, and a bunch of other authors whose names I do not remember. He built a cheap way of producing adversarial examples: you take a panda, add this formula here, which is designed to fit within the epsilon ball, a little ball around your point, and it produces this way of confusing a neural network. This is the actual definition that they're working with. So if you have a point, and you've got a decision boundary that

comes down the thing, you've got some green points, you've got some blue points, and you have this red area. So x is the point we care about; we put a little ball around it, and because the decision boundary is so close to x there are some points on the other side of it. Because of the curse of dimensionality this is always true, basically. So this is the definition that we work with: the adversarial examples are things that are close to my point and are across the decision boundary. So the actual definition of adversarial example that people work with is that, and that's kind of complicated.
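The definition above (a point inside the epsilon ball around x that lands on the other side of the decision boundary) can be sketched in a few lines. This is a toy illustration, not the speaker's code: the linear classifier, the function names, and the numbers are all assumptions, and the fast-gradient-sign step is simplified to the linear case, where the gradient of the score with respect to the input is just the weight vector.

```python
import numpy as np


def predict(w, b, x):
    """Toy linear classifier: class 1 if w.x + b > 0, otherwise class 0."""
    return int(np.dot(w, x) + b > 0)


def is_adversarial(w, b, x, x_prime, eps, order=np.inf):
    """True if x_prime lies in the eps-ball around x and gets a different label."""
    in_ball = np.linalg.norm(x_prime - x, ord=order) <= eps + 1e-12
    return bool(in_ball and predict(w, b, x_prime) != predict(w, b, x))


def fgsm_linear(w, b, x, eps):
    """One fast-gradient-sign step against the toy linear model.

    For a linear score the gradient with respect to x is w, so moving each
    coordinate by eps * sign(w) against the current class stays inside the
    L-infinity ball of radius eps.
    """
    step = -1.0 if predict(w, b, x) == 1 else 1.0
    return x + step * eps * np.sign(w)


# Hypothetical model and point, chosen so the arithmetic is exact in floats.
w, b = np.array([1.0, -2.0]), 0.0
x = np.array([0.5, 0.125])  # score 0.25, so class 1

for eps in (0.0625, 0.25):
    x_prime = fgsm_linear(w, b, x, eps)
    print(eps, x_prime, is_adversarial(w, b, x, x_prime, eps))
```

With the small radius the perturbed point stays on the same side of the boundary; with the larger one it crosses, which is exactly the situation the diagram describes.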

Basically it just means this: an adversarial example is a point that's on the other side of the decision boundary within my little epsilon ball. So that looks all complicated, but it's actually not complicated; you can code up what that means in Python, but in math that's how we write it. So for robustness you use that definition and you ask that there are no adversarial examples within an epsilon ball of any of my points. You see the math definition on the top, and all that means is all my points are

outside of this circle. Down here there's that red point, which is the only point that violates this definition, so if we didn't have that red point this model would be robust. So all we're asking is that the decision boundary is just outside of this little circle around my stuff. And now I'm going to get into the issues. The first major issue is data just moves: once you train a model and release it, the data moves after a while. Second issue: it's impossible to check, because of the volume stuff; you can't do it. Third issue: it's impossible

to make an adversarially robust model in most cases, in the cases that we care about as security people. Fourth issue: it lowers accuracy. So, first issue, data just moves. I have a model here; I've got a bunch of green points and a bunch of blue points, I trained up a model, and the model came up with this decision boundary. Everything on this side of the line is going to be classified as green, everything on that side of the line is going to be classified as blue, and that looks like a pretty good line. For some reason it's just kind

of gone up along this length, because the model doesn't really have data over here, so it just made something up. At this point I'm on a deadline: I have to train up to this point and then I have to deploy and send this to my customers. Then I get points from a month later, and these are my new points, the ones in these boxes, and you can see the decision boundary is bad. The model didn't predict that the data from later in the month was going to be over here; it just kind of made it up, and it's misclassifying all these green points. So if your model was robust before you

go, you're good, but your data moved, and it moved in a way that your model couldn't have predicted and doesn't know about, so that didn't help you. The robustness doesn't help you if the data just moves, and we have this problem. This is some data from when I was working at Elastic; we published this at ICML. You can see a very interesting thing that happens here. What this model is: I had trained up the production model that we released, sort of a smaller version of it just for testing purposes, and I trained it on all data up to January 1st,

2019. So it doesn't know anything about the future; it has only got data from before the new year of 2019. It predicts well: you can see the false positive rate, as I do a historical analysis, is good, it's very low, it's really low for a model. And that's what I care about: very low false positive rate, very low false negative rate, that's my accuracy, and then a very low error rate overall. But as you can see, about three months after I released the model (this one wasn't deployed, but three months afterwards) there's a spike in the

false negative rate. What happened was there was some malware family that figured out a bypass, and it became prominent and showed up. In our case the malware authors know what VirusTotal is; they can check if they have a bypass, they can check if their malware is correctly classified, they can check if the AV vendors and the EDR vendors are going to catch them. And they do that, so they will look for areas that aren't in their space and they will find a bypass. And you can see later on,

over a year after this model was trained, it's got this massive spike, which would be completely unacceptable. This is not a percentage. So this is a massive spike that would be completely unacceptable; you can't deploy a model that's that bad. But robustness wouldn't have saved this. The red line is a measurement of the drift; the green, blue, and orange lines are the ones that you really care about in this image. Another issue with robustness: it's impossible to check for robustness. So getting back to the dimension thing, if I want a robustness radius of one pixel,

the number of bits of information for an L2 ball is 10; that means you have to go through 2 to the 10 different things in order to test all the values. But when I've got an L-infinity ball, which is what a lot of these things are claiming robustness to, I have to check 2 to the 1,200, and this is just for MNIST; this isn't for a real model, this isn't for real images, this is just MNIST. And when I increase the size

of the ball, for L2 balls, once I hit seven, checking an L2 ball of seven pixel values around a point in an MNIST model takes many more bits than breaking a 256-bit encryption key. So you have to brute-force AES 256-bit encryption, and that's easier than checking seven pixel values away from a point in MNIST. And the radius is normally more like 70, not seven. So here's where there's a lie in what I told you. This is a slice of how a neural network actually perceives data. The idea for this comes from a field of math called tropical

geometry. If you know your cryptography and your finite fields, this is the geometry of the field of one element, and it relates to neural networks. So this is actually sort of what you have to check, instead of just the pixels. What I care about is: if I have my image, my point is going to be in one of these sectors, so for each solid-color polytope, my example is going to be in one of those, and when I cross the boundary to another color, that means that the classification could have changed. So the decision boundary of a neural network is going to be contained

in the edges of this image. This is a slice of a neural network trained on MNIST, and this is just a two-dimensional slice of a many-thousand-dimensional space, so it's much more complicated than this. And if I want to be completely honest about my estimate for the number of bits I have to check, using information about the geometry of the neural network, it's actually more like this. So that means using all the math to reduce the problem space as much as possible to save yourself as much time as possible, which is what cryptographers would do: if you have a really bad way of generating a key that

actually doesn't have the full 256 bits of entropy, it only has 90 bits of entropy, then cryptographers can take advantage of that because they know math tricks. Well, for neural networks, if you know the geometry you can take advantage of it to save yourself a lot of time. And with all that, well, you don't get to seven, you get to 10. So even with the math tricks you don't get much further. Another way of making things robust is adversarially training them, and this is basically the algorithm for adversarial training. So you can make

adversarial examples in some really cheap ways and just include those in the training data; always train your network on adversarial examples, to make sure that it classifies the adversarial examples correctly. So you're going to use the fast gradient sign method and just start tossing in adversarial examples generated by the fast gradient sign method. And that's what these are: r* is the adversarial example that you're going to generate, and you're just going to toss it in there and train on both your normal samples and these adversarial examples. So the problem with this is the first line, this line: find an

attack perturbation r*. Well, as we just said, there are that many attack perturbations; that's the space you have to search over to find a different attack perturbation. So basically you're going to use maybe one, maybe ten different ways of generating an adversarial example, and you're going to train your neural network to correctly classify that type of adversarial example. And there are many different ways to build an adversarial example; it is not just one, there are boatloads of ways. So robustness training is a dead end, and has been a dead end for a while, but there are companies that'll

sell you robustness training for making your model robust, and they are scamming you. Another way you can make things robust is Lipschitz continuity. What Lipschitz requires is: if I give a point, I can make this slash diagram, this cone of light if you're familiar with special relativity, and my function has to be within this shaded area. What this means is, if my function goes from here and spikes up and goes there, it's changing too quickly. So I can make my constant small and say

at this point I am robust, it's classifying these correctly, and then if I move right next door I'm over here and it's changed too quickly, so that's not good. So I'm going to require that my function changes slowly, and so for my neural network I'm going to require that it changes slowly, and so I'm going to make things Lipschitz, named after a mathematician. And this isn't ready for production. The way they do this is regularization: they toss in that your neural network can't do this, and the actual way they do this is wrong. So this isn't ready for production. This

is another way: if you have images you can make a thing robust. What you do is, because my adversarial example is sort of noise within the noise bounds, I add some more noise and then I remove the noise using, for example, a diffusion model. This model here is the latent diffusion model that's available on Hugging Face and GitHub. What you can do, if you have an image model, is just add a little bit more noise and then remove the noise using your diffusion model: add some noise, remove it. There are a bunch of other ways of doing this; you can use things from image

processing to remove the high-frequency information, because adversarial examples tend to be high-frequency information. Most of the ways don't work. This way probably works; the people didn't prove it mathematically, but it has some really good results, and that's the best we can do in machine learning. Cool. So the first couple of reasons were that you can't get it, you can't make the model robust. Here's a reason why you don't want your model to be robust: it's actually a bad quality in many cases. This is a diagram I

stole from this paper over here. I have my green points, my blue points, and my line separating them, and this line correctly classifies them all. I then put my little L-infinity spheres around them, my cubes, and I'm going to require that this model be robust, and you can see that the decision boundary goes through a bunch of those cubes. So this model isn't robust by these epsilon definitions. To be robust, my model actually has to be this red wiggly line, which wastes a lot of compute; that red wiggly line is possible to do, but you now have

this thing where you have to make this red wiggly line. What this means is your model actually ends up being less accurate. So zero is the standard model; this would be the accuracy of the standard model. Something that is one-pixel robust has got nearly four times, you know, three times the error rate when I have a small amount of data. So this is three times the error rate for one-pixel-value robust, then two-pixel-value robust, and so on. And as I add more data the

error rate goes back down, but I need a lot of data, and if you actually follow this through to its asymptotics you're always going to be making a trade. And this is for MNIST; for ImageNet and CIFAR it's worse. You require much bigger robustness, and with an L2 ball of a reasonable size the accuracy drops off drastically. So the accuracy cost is huge. So do you want your model to be robust? Well, maybe not; maybe you spend way too much in accuracy to actually make your model robust. In many cases it's probably much more secure to have an accurate model

that generalizes well than to have a robust model, because the accuracy is more important than the robustness, especially in malware models. You need your false positive rate to be below the annoyance threshold of your analysts, and the annoyance threshold of your analysts, because of how many binaries are passing through your network, is really low. So you need a false positive rate of 0.01 or something like that. To get an error rate that low you can't make a trade-off of that much; even for a tiny bit of robustness, this trade-off would kill the performance of your malware model. So you don't want to make this trade-off for

security data. And for image data: if you have an X-ray machine, would you feel comfortable that your model made a mistake because they made it robust? Like, oh no, we could maybe get attacked by a hypothetical adversary: if they got into my network and started messing with image data, instead of just ransomwaring the hospital, they could mess with image data in a very particular way that requires a lot of expertise. Well, to guard against that weird edge case some hospitals are spending money on robustness and not spending money on improving the model.
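The false positive argument above is mostly arithmetic. Here is a rough sketch with made-up numbers: the daily file count and the baseline rate are assumptions for illustration, not figures from the talk; only the roughly threefold increase in error rate echoes the MNIST chart discussed earlier.

```python
def expected_false_alerts(benign_files_per_day, false_positive_rate):
    """Expected number of benign files flagged as malicious per day."""
    return benign_files_per_day * false_positive_rate


# Hypothetical numbers, chosen only to illustrate the scaling.
FILES_PER_DAY = 1_000_000
BASELINE_FPR = 0.0001          # assumed: 0.01% of benign files are flagged
ROBUST_FPR = 3 * BASELINE_FPR  # assumed: robustness roughly triples the error rate

baseline_alerts = expected_false_alerts(FILES_PER_DAY, BASELINE_FPR)
robust_alerts = expected_false_alerts(FILES_PER_DAY, ROBUST_FPR)

print(f"baseline model: {baseline_alerts:.0f} false alerts per day")
print(f"robust model:   {robust_alerts:.0f} false alerts per day")
```

Because the alert count scales linearly with the false positive rate, any multiplier on the error rate lands directly on the analysts' queue, which is the annoyance threshold the talk refers to.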

There are other definitions of robust, but for security applications it is not the thing, and the other definitions of robust suffer from the same dimensionality problems as regular robustness. The only way to fight for robustness correctly is to get more data, more high-quality data. There's no point in spending money and effort on fancy tricks to make your model robust; just get more data. Even if you have a robustness technique you want more data, so don't invest in that, invest in getting more data. Anyway, now that I've said that it's a pointless thing, let's talk about it: the reason why you would want this is

you're trying to prevent a bypass, and now I'm going to tell you about how bypasses actually work. The first major machine learning attack was against spam filtering. Probably the first deployment of machine learning in security was in 2002, when they deployed naive Bayes for spam. If you want to learn what naive Bayes is, there's a workshop at the village that a friend of mine is giving. By 2004 Sophos, who were running one of these things commercially, had seen these attacks. People were obfuscating the text: they were using the normal thing of putting in HTML percent characters instead of actually writing out the text, so it would be rendered in the HTML engine as proper stuff, but it

was rendered as human-legible but was not legible to the machine. So they obfuscated the text, because they know that the AI model is a machine thing, so it has to read the text. They would also send a small email with just a link, like "follow this link to your Viagra". Another thing: they would hide the email in a non-deliverable, because the way that was delivered it was sort of a payload inside the email, and the spam system didn't open the payload the same way as it opened the regular email. So they would be able to get past

the spam filter by basically packing the email, which is a problem for machine learning models for malware as well. And the last thing, the most famous attack: packing in good words. If you have a spam email that's trying to get you to give money to, you know, your cousin, a prince, you would then just put a Wikipedia article at the bottom of it, because the Wikipedia article is good and the Bayesian filter counted the number of good words and bad words and checked on how that worked out. These aren't the adversarial examples that academics are studying; these are

someone who sat down, read the paper on how the Bayesian filters work, had a good think, and then made up these strategies. And this is still what's happening today. Spammers who are spamming Facebook: every few days Facebook deploys a new spam model to prevent things from going through; the spammers get together, try a bunch of stuff, figure out a bypass, and then communicate it to each other. It spreads quickly, Facebook now has a model that's out of date, and they have to redeploy three or four days later, because the value of getting spam on Facebook is that high,

that the spammers are going to put that much money into just getting 5,000 people in a room, who are very poorly paid, to bypass the model, and once one of them figures it out they communicate it, and they have a bypass. They work like this: they have guesses for how the spam model works, they have read some academic papers but they don't really understand them, they are just trying to guess where it breaks. So here's a machine learning attack against a malware model. If you want to learn about how malware models actually work, you want to look up Elastic's EMBER featurization system from Hyrum Anderson and Phil Roth. So the idea of how this works is

you take a PE file, and this is the classic picture of what a PE file looks like from Wikipedia, and you featurize it into a JSON blob that has all sorts of useful information, like the size of sections, the number of bytes, the histogram of the bytes, the entropy, the number of strings, the import list, all sorts of stuff, and then you put it in a high-dimensional space. EMBER puts it in a 2,381-dimensional space; 1,024 of those dimensions are just encoding your imports table, because imports are really important when we're doing static analysis of a binary. So in 2019 researchers reverse

engineered Cylance's malware detection system. It works the same way: you've got your PE file, projected into a 7,000-dimensional space this time, instead of a 2,300-dimensional space, but through the same kind of intermediary format, and then into a deep learning model, and the deep learning model figures out whether it's good or bad. So that's how the machine learning model is supposed to work. But the problem they were having was that Fortnite does some shady stuff to make sure that you're not cheating; Rocket League does the same thing. These are binaries that you might not launch on your corporate network, but Cylance sells to

people and they've installed Fortnite on their computer and Fortnite should run and it's very easy for the machine learning model to confuse Fortnite with malware because Fortnite does some weird stuff with process injection to make sure that you're not cheating um what Cylance did to fight this instead of putting Fortnite and Rocket League and everything on an exception list which they'd have to update every single time that Fortnite updates they used some machine learning so they normalize the features um and then put it into a centroid thing and they use the concepts from earlier they put a little neighborhood around the point that represents Fortnite and if you are a piece of if

you're a binary that gets featurized from PE space and ends up in that centroid for Fortnite they will let you run so instead of letting the machine learning model figure this out and then telling it like hey really don't misclassify Fortnite it needs to run on our customers' machines they used this trick to sort of put all of Fortnite hopefully forever on the exception list and they did this for a bunch of other games like Rocket League and other things so Adi and Shahar uh figured this out and they built an exploit for the machine learning model that was all over the news uh a machine learning model with an exploit this is big news in the AI security space
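
The centroid allowlisting described above can be sketched roughly as follows. This is a minimal illustration only: the feature values, the radius, the threshold, and every function name are assumptions made up for the example, since the talk does not give the real system's features or parameters. It shows why the trick is fragile: a sample that lands near an allowlisted centroid is allowed no matter what the model's score says.

```python
import math

# Hypothetical names and numbers throughout; the real system's features,
# centroids, and radius are not given in the talk.

def normalize(features, means, stds):
    """Scale each raw feature so distances are comparable across dimensions."""
    return [(x - m) / s for x, m, s in zip(features, means, stds)]

def near_allowlisted_centroid(point, centroids, radius):
    """True if the point lies within `radius` of any allowlisted centroid."""
    return any(math.dist(point, c) <= radius for c in centroids)

def verdict(point, model_score, centroids, radius, threshold=0.5):
    """The allowlist check runs first and overrides the model's score."""
    if near_allowlisted_centroid(point, centroids, radius):
        return "allow"
    return "block" if model_score >= threshold else "allow"

means, stds = [10.0, 10.0, 10.0], [2.0, 2.0, 2.0]
game_centroid = [0.0, 0.0, 0.0]  # normalized point standing in for the game

# A malicious sample far from the centroid is judged by the model.
malware = normalize([16.0, 18.0, 10.0], means, stds)
print(verdict(malware, 0.99, [game_centroid], radius=1.0))

# The same high model score, but features padded toward the centroid.
padded = normalize([10.6, 10.8, 10.0], means, stds)
print(verdict(padded, 0.99, [game_centroid], radius=1.0))
```

The first call is decided by the model score and the second by the allowlist, which is the gap an attacker aims for by shifting features toward the game's centroid.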

this is a poster child for the MITRE uh the MITRE thing and they didn't build an adversarial example they exploited this kind of stupid um centroid thing centroid whitelisting system that Cylance had built um it's not stupid but like there were probably better ways to do this part of the reason why they had this problem is they included a lot of string data in their malware model and that's not a really good idea as we've seen that's one of the oldest attacks against machine learning models packing good strings and that's how the uh exploit worked out in this case so what would actual security for machine learning models look like and here's my

opinions with some certificating so first thing um it's still software do the basics so validate your inputs if you're taking in a PNG file make sure that it hasn't got a weird payload that Pillow your Python library is going to uh run some old versions of Pillow which is one of the biggest libraries for handling images in Python had a remote code execution exploit and if you're using one of those old uh versions of Pillow you can get popped fairly easily with someone sending an image file that's got some issue double check your pickles it's really easy to get a you know pickles are RCEs inherently and this is how a lot of

machine learning models are distributed on the internet you want to download a model from Hugging Face it's entirely possible you get a pickle and to open that pickle you have to run remote untrusted you know untrusted code on your machine but machine learning people do this because that is the normal way of distributing a model because models get complicated and pickle is an easy way of doing this uh check your deployments for CVEs uh so you know scan your Docker containers make sure all the packages you're installing don't have a CVE and then check like for example the Pillow exploit that you have um make sure you don't have that
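
One way to act on the "check your pickles" advice above is to refuse to resolve any global that is not on an explicit allowlist, using the `find_class` hook that Python's `pickle.Unpickler` provides. This is a sketch under assumptions, not a complete defense: the allowlist contents and the names `ALLOWED_GLOBALS`, `RestrictedUnpickler`, and `restricted_loads` are hypothetical, and a real model file would usually need a carefully reviewed allowlist or a non-pickle format.

```python
import io
import pickle

# Hypothetical allowlist: the only (module, name) pairs a pickle may import.
# Plain containers, numbers, and strings need no globals at all.
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
}

class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that rejects any global not explicitly allowlisted."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(
            f"global '{module}.{name}' is not on the allowlist"
        )

def restricted_loads(data: bytes):
    """Like pickle.loads, but refuses globals outside ALLOWED_GLOBALS."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Plain data round-trips because it references no globals.
blob = pickle.dumps({"weights": [1.0, 2.0], "name": "demo"})
print(restricted_loads(blob))
```

A pickle that tries to reference something like `os.system` fails in `find_class` before anything is called, which is the behavior you want when the file came from somewhere you do not control.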

um harden everything so make sure you know if you have a whole big complicated machine learning pipeline for doing inference uh make sure that all of that's hidden and there's one hardened front endpoint um that they go to and everything else is hardened as much as you can and the last one which is you know uh a broken refrain in security is uh secure your S3 buckets uh that's happened um there's been places that have had people messing with their data sets on S3 because they messed up um this is the Institute for Ethical AI and Machine Learning recommendations and you can see a lot of these are um

basically um classic exploits just renamed for machine learning stuff so uh artifact exploit injection that's your file uh server side request forgery all these things are things that normally happen in uh you know an AppSec environment you know cloud deployment you can get all that sort of stuff happening it's no different for machine learning models don't think you're special because you're using deep learning okay so once you have done the basics um and only after you've done the basics to make sure that your S3 bucket does its thing the next thing you should think about is well this is probably yeah before you really start embarking on this your data set is more valuable than any

model you ever deploy treat your models as sort of disposable your data set is sometimes irrecoverable if someone deletes that permanently off your cloud bucket cloud infrastructure you can never get it back if your you know cluster FS messes up you're hosed if the attackers get hold of your data set they can ransomware this because it's worth much more than the thing uh a lot of companies are figuring this out um they're investing in data engineers and data quality is a much much more effective investment than um investing in more data scientists but this doubly applies to security make sure that like if you're saving a data set write it to a

write only S3 bucket that has no ability to overwrite data and then use a versioned a historical system so that a hacker cannot go along and delete all data just have you know versions of stuff and then make a pipeline that picks up the right version and does the right thing um you know if they have the ability to delete your data or mess with it even subtly you can get screwed um this is another thing that machine learning security people understand very intricately well but um data scientists don't necessarily um because it's not their

environment and they shouldn't have to um you need layers in the security if you just have a machine learning model checking whether your PE file is detonating you're now vulnerable to living off the land attacks or you're vulnerable to all sorts of things you should have a dozen things uh you know many layers of security to prevent you know uh phishing protection to prevent the PE file from getting on your system uh frequently updated uh things for that and signatures to make sure that you are catching all the stuff that's really important um that is recent so you don't even bother your machine learning model if Emotet

dropped on your machine and you have the hash for it and you're like okay cool just don't run it don't even bother wasting compute on this thing um rules observability you know machine learning is one thing and if you're deploying on a self-driving car well this is one of the classic examples for adversarial examples well what if they put an adversarial sticker on the stop sign okay what if they just took out the stop sign just removed it they don't need to make up a very intricate difficult delicate attack with a sticker on the stop sign they could just take a hacksaw uh and remove it uh so you should be aware that stop signs might not be there and maybe you

should have a database this would require geofencing and things oh cool here's all my stop signs of the entire city and I have an agreement with the city that they're going to tell me when they install new stop signs and traffic lights and they're going to communicate between my car and the traffic lights and we're gonna do double checks to make sure that my Tesla or other self-driving car is aware of how the traffic is going to be independent of the road sign conditions so that if it's snowing or hail or a tree grew in the way you've got a backup of a database that's a layer of security in your machine in

your self-driving car that is probably more important than adversarial robustness and there's so many other layers of security in your self-driving car that you can probably put in there other than adversarial robustness um another thing don't deploy and forget uh companies do not like retraining very expensive models you know if you have a large language model and there's a vulnerability that large language model might have cost 20 million dollars to train in just compute it might have cost the amount of CO2 Vegas generates in six months to train that model by itself they don't want to redeploy that so they don't and a lot of companies

don't even think about it they train a model deploy it and they're like cool we're done we have solved spam forever uh and there's bypasses or they've solved malware forever you can't just sit there when you're deploying a product you have a responsibility to your customers you have a responsibility to your people that you're going to monitor it and make sure that it's still working that there isn't a bug and machine learning models have a whole new range of bugs if you think back to the thing you could have a bypass where like a new malware family shows up or this new spam family or your self-driving car doesn't recognize that new other car because it looks kind

of funky um it doesn't recognize that as a car because it has weird tail lights and suddenly you start getting a bunch of crashes into it um you can't just sit there and think that any model is going to perform well over time you have to monitor it and this is really hard um in security in the real world you do not have access to the labels I made this graph knowing all the labels a posteriori a priori you have no idea what the label is so how do you know whether you're making an error if you just have the estimate I have my

prediction I don't have a label so I can't check whether I'm right you have to build estimates and complicated systems and this is a huge investment to do correctly um but we need to make those investments um the last um well penultimate point is you have to understand your attackers um so attacking a self-driving car your attackers might just want to embarrass you or they might want to like uh you know find a bug and now they're going to try to get a bounty out of you but they're just trying to embarrass you and get a bounty there's not that much value in confusing a car so it's not

going to be a widespread attack it's going to happen because people are going to want to embarrass you as a software if you think um but like it's not as valuable as getting ransomware in a hospital a hospital has to be running they're going to give you the five million dollars to unransomware the hospital you've got immediate monetary value in bypassing the security systems for that you have to actually do the threat modeling you can't just assume that you're going to be attacked in this one way because it's a sexier way of because you really want to do the math to defend yourself because I want to do that research and

so I'm going to make up a threat model so that I can do that research in industry if you're in academia I think you should be doing research into model robustness it is important and a breakthrough in this thing might be needed to bring the machine learning model you know the whole field of machine learning forward but it's not a security problem the last thing is be a moving target and this relates back to like the drift uh but don't just retrain a new model you have to make new features you've got a security onion you have to make new layers of that onion you can't just sit there because the security

people are going to innovate um but you make new features for your model don't just sit there on the model um one company has a really nice way of being a moving target for their model they have a bank of many many times more features than they actually need and they use a subset every month um they do spam so they have a lot of behavioral data so they use a subset of the features that they could use each month this means each month's model is not as good as it could be so if you use more features you could probably train up a better spam model but this month's model and that's not

month it's week this week's model performs and behaves fundamentally differently from next week's model so the spammers might be figuring out how to bypass this week's model and figure out a bypass but it's definitely not going to work next week because you're just moving ahead you're being a moving target they're going to figure out a bypass each week but if you can make sure that last week's bypass will not work on this week you're good you might not be able to gather enough data in that week for the bypass to figure out things to make sure your model is trained correctly to correctly classify it but if you change the way

the model works you might not have to worry about that bypass you have to worry about a different one so they've taken this concept of being a moving target to the next level which is actually securing their way and it is you know it is a drop in accuracy but it's a drop in accuracy that actually gives a good trade-off that they have found significantly improves their stuff um and finally um respond quickly uh what's the turnaround for training a new model uh so one of the embarrassing things that uh Google had was in their uh pictures product um they were misclassifying black people as gorillas which is really

super racist um and awful and they didn't do their model correctly the way they responded was they just removed all primates from their data set so that the model can't misclassify black people as uh gorillas because the model doesn't know what a gorilla is that's how they responded but the thing is they responded quickly and that actually is it's bad but it's not the worst idea if you want to prevent this from happening and you only have a few days to figure out how to quickly get your model to stop doing this it might take you six more months to gather the data to prevent this from happening but you need to respond in

three days so quickly removing the primates was a quick way for them to respond to this one problem and then to buy them time to actually invest and do it properly over time um they still have a little bit of issues with this but this is one of the you know the bias in machine learning all sorts of papers this is not a talk about that but they responded quickly in that sort of shitty way but it was a response um the code language model may generate vulnerabilities you've got a you know Codex might suggest code for you that has a vulnerability in it because it's memorized the code from a Stack Overflow post

and that's bad like you know a lot of people are gonna be oh cool Codex did this include it because I'm going to trust the machine learning model and well you've now got a vulnerability that Codex put there that is bad uh and we have found large language models suggesting vulnerabilities to include in your uh code and Microsoft sat on their ass for six months before they started doing anything um other linters and um uh code editors fixed it quickly because they are in the security space and they understand that they have to respond and the people in the machine learning space they sat on it for six months because how do I solve this problem with

memorizing things it's so much harder like you kind of have to do something quickly Google or Microsoft you can't just like sit on your ass for six months um so I think machine learning should be held to that standard anyway um so questions if you want references you can find them online um on the GitHub

[Music]
