
GT - Model Robustness Isn't Security

BSides Las Vegas · 54:57 · Published 2022-09
About this talk
GT - Model Robustness Isn't Security - Sven Cattell Ground Truth @ 18:00 - 18:55 BSidesLV 2022 - Lucky 13 - 08/10/2022
Transcript [en]

Hey, good afternoon, welcome to BSides Las Vegas Ground Truth. This talk is — oh, the slide — Model Robustness Isn't Security, by Sven Cattell. A few announcements before we begin. We'd like to thank our sponsors, especially our diamond sponsors LastPass and Palo Alto Networks, and our gold sponsors Amazon, Intel, and Google. It is their support, along with our other sponsors, donors, and volunteers, that makes this event possible. The talks are being live-streamed, except for the Underground track. As a courtesy to our speakers and audience, we ask that you check that your cell phones are set to silent or vibrate. If you have a question, raise your hand — we currently don't have the wireless mics active, so you'll have to speak kind of loud and we'll have the speaker repeat your question; there's interference from the mics in another room, so that's what we've got to do. As a reminder, the BSides LV photo policy prohibits taking pictures without the explicit permission of everyone in frame. These talks are all being recorded — except, of course, in Underground — and will be available on YouTube in the future. Please keep your masks on at all times, and if you need to move closer to people to adjust your view or hearing, please also respect social distancing. With that, let's get started. Welcome, Sven.

Thank you. Hi, so I'm Sven Cattell, and this is my talk on model robustness and security. I'm giving this talk because there are some movements in the policy space, and a lot of academic work, claiming that model robustness is absolutely essential for security. I know of no security company that actually thinks model robustness should be included in security, and dozens that think this is pointless. But because there are a lot of loud people arguing that this is security, this talk is a thing.

About me: I founded a startup sort of in this space — it doesn't actually relate to model robustness. I've got a PhD in algebraic topology from Johns Hopkins and did a postdoc in geometric machine learning. I founded the AI Village six years ago; we're going to be at DEF CON for the fifth time this year. I used to work at Endgame, now Elastic, on their large malware models. I'm, like, a kind of mathematician. You can find the slides on GitHub.

Because I'm making an argument that is a little out there, I have a lot of slides — you can see there are 55 of them. Some of the slides are mostly here to cover my ass in the argument space, and aren't necessarily here for the narrative. This is mainly an argument about definitions, because there's a technical definition and a layman's definition, and they're a bit different.

The layman definition is the one that's out there for everyone. Here's OpenAI: "Adversarial examples are inputs to machine learning models that an attacker has intentionally designed to cause the model to make a mistake; they're like optical illusions for machines." And here's TensorFlow: "Adversarial examples are specialized inputs created with the purpose of confusing a neural network, resulting in the misclassification of a given input." This sounds like a legit thing that we need to be worried about. From these two definitions, we need to worry about adversarial examples, because specialized input that causes misbehavior is sort of the key to a lot of security: a stack overflow or a use-after-free bug needs specialized input to actually take advantage of it, and all of Metasploit is specialized input. So that sounds like a serious security vulnerability.
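To see how loose that layman definition is in practice, here is a minimal sketch — my own illustration, not something from the talk — that "attacks" a made-up linear classifier by pure random search. Any nearby misclassified point technically satisfies the definition; the model weights, starting point, and epsilon are all invented for illustration:

    import numpy as np

    rng = np.random.default_rng(0)

    # A toy "model": a fixed linear decision boundary w.x + b > 0.
    w = np.array([1.0, -2.0, 0.5])
    b = 0.1

    def predict(x):
        return int(w @ x + b > 0)

    # Start from a correctly classified point and search its epsilon-ball
    # for any input that flips the prediction -- which is all the layman
    # definition of an "adversarial example" asks for.
    x = np.array([0.3, 0.0, -0.2])
    eps = 0.25
    for _ in range(10_000):
        candidate = x + rng.uniform(-eps, eps, size=x.shape)  # L-infinity ball
        if predict(candidate) != predict(x):
            print("found a 'specialized input':", candidate)
            break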
But these are machine learning models, so any model error can fit that definition. Because these are statistical models, they aren't guaranteed to be correct, and any model error can fit the definition if you squint at it. This is not actually close to the definition used by practitioners who actually work in the space. If you use this definition, people can sell you snake oil — and even if you don't use this definition, people can sell you snake oil, and snake oil is being sold in this space.

Then there's the definition of a robust model. There are two different meanings of the word "robust", but this is what gets thrown around for adversarial robustness versus the other robustness. It's sort of "your model is good" — and the problem is that both of those definitions basically mean your model is good: you've trained it on enough data that it actually generalizes well. But the problem is that people get to check that a model is "robust" by wiggling some minor parameters, and not by actually testing it on proper stuff.

These are the definitions I'm hoping you walk away with at the end, and we're going to go through them point by point. If you want, go look at the slides online. But let's get started: what is a neighborhood?

This is the definition of a neighborhood that I use as a mathematician. You can see there are two different ones that look almost identical: one is the L2 ball and one is the L-infinity ball. The L2 ball forms the spheres that are nice and neat, the ones you're familiar with; the L-infinity ball forms cubes and hypercubes instead of spheres and hyperspheres. That's the definition that I look at and think "spheres and cubes" — but we can't show that to executives.

Part of the problem is that neighborhoods get really complicated in higher dimensions. Here there's a big sum over k elements, and that k could be 3,000 or more in the case of machine learning, and that gets really funky. I'm going to show you why.

This is a classic problem that you give to little kids: if you want a rope that goes around the Earth, how much longer than the circumference of the Earth do you need to make it? It turns out it's just 2π feet longer to make it sit one foot off the Earth. If you give this to a little kid, they get confused and tell you it's going to have to be much longer than that.

Also with dimensionality, you've got the scaling laws: the surface area of a 3D object grows with the square of its height, and the volume grows with the cube. Because of that, large mammals have difficulty with heat, so elephants have big flappy ears to combat the squaring law.

And it gets much more complicated than this. Here's a puzzle that, if you ask it at a math conference — people have done this and polled the mathematicians — they get it wrong. You take a cube, you put eight little spheres in the corners, and then you embed a sphere in the middle that's just touching them, like in this 2D example; in 4D you get a hypercube with 16 spheres in the corners and the little guy touching them in the middle. At 10 dimensions, the little sphere sticks out the sides of the actual cube, and as you grow the dimension, the volume of that middle sphere eventually becomes infinite. So dimensions get really weird, and even mathematicians — we give each other dimension puzzles all the time, and you can present this one at a conference and you're going to get a lot of wrong answers.
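A quick sketch of that corner-spheres puzzle (my own illustration, not code from the talk): put unit spheres in the corners of a cube of side 4, so the inner sphere touching all of them has radius √d − 1. Once that exceeds the cube's half-width, the "inner" sphere pokes out the sides:

    import math

    # Unit spheres sit in the corners of a d-dimensional cube of side 4,
    # centered one unit away from each face. The sphere wedged in the
    # middle touches all of them, so its radius is sqrt(d) - 1.
    for d in (2, 3, 4, 9, 10, 100):
        inner_radius = math.sqrt(d) - 1
        half_width = 2.0  # distance from the cube's center to a face
        sticks_out = inner_radius > half_width
        print(f"d={d:3d}  inner radius={inner_radius:7.3f}  sticks out: {sticks_out}")

Run it and d = 10 is exactly where the inner sphere first sticks out, matching the claim in the talk.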
So dimensions are weird, and here's how it relates to the security space. MNIST is a dataset that everyone in machine learning tests their models against; it's been a benchmark dataset for donkey's years — Yann LeCun released it in the 90s. It's 60,000 digits, each a 28-by-28 image with 8-bit pixels. So each pixel is one of 256 values, and there are 784 dimensions, because 28 squared is 784. If you just vary each pixel value up by one, down by one, or not at all, you have three options per pixel, and there are 784 pixels, so you get 3 to the power of 784 — which works out to about 1,200 bits of information, in terms of the bits of information a cryptographic key would have. So dimensions get really weird for even really small machine learning problems, and this is the tiniest one — this was solved back in the 90s; it isn't a hard problem these days.

What you really need to understand about neighborhoods, and how this relates to machine learning, is that machine learning operates in high-dimensional space, and the volume of the space grows exponentially with the dimension. Sort of — there are little white lies in everything I'm saying — but it grows exponentially. This shows up in machine learning over and over again as the curse of dimensionality. If you have a clustering algorithm and you want to prove that it converges, pretty much always you're going to say "I require 2 to the d amount of data", where d is the dimension: to converge, you need an exponentially growing amount of data. That's the curse of dimensionality, and it keeps showing up over and over again. It's one major problem that shows up all the time in traditional machine learning; since we don't make proofs about deep learning, it doesn't really get spoken about there, but it does show up in the stuff we're talking about today.

Now that you've got a bit of an idea of the geometry of the space, let's talk about adversarial examples — you'll see why this relates to dimension in a bit. This is the, I think, legally required image for adversarial examples: the panda image from the second major paper, from Ian Goodfellow, Christian Szegedy, and a bunch of other authors whose names I don't remember. Goodfellow built a cheap way of producing adversarial examples: you take a panda and add this formula here — which is designed to fit within an epsilon ball, a little ball around your point — and it produces this way of confusing a neural network.
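That cheap method is the fast gradient sign method (FGSM), which comes back later in the talk. Here's a minimal PyTorch-style sketch of the formula — my own illustration, assuming a differentiable model and pixel inputs scaled to [0, 1]:

    import torch

    def fgsm(model, loss_fn, x, y, eps):
        # x_adv = x + eps * sign(grad_x loss): one gradient step that
        # stays inside the L-infinity ball of radius eps around x.
        x = x.clone().detach().requires_grad_(True)
        loss = loss_fn(model(x), y)
        loss.backward()
        x_adv = x + eps * x.grad.sign()
        return x_adv.detach().clamp(0.0, 1.0)  # keep pixels in a valid range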
This is the actual definition they're working with. You have a point, and a decision boundary that comes down through the space; you've got some green points, some blue points, and this red area. x is the point we care about. We put a little ball around it, and because the decision boundary is so close to x, there are some points inside the ball on the other side of it — and because of the curse of dimensionality, this is basically always true. So this is the definition we work with: adversarial examples are points that are close to my point but across the decision boundary. The actual definition that people work with is written up there, and it looks kind of complicated, but basically it just means this: an adversarial example is a point on the other side of the decision boundary within my little epsilon ball. It looks complicated, but it's not — you can code up what it means in Python; in math, that's just how we write it.

For robustness, you use that definition and ask that there are no adversarial examples within an epsilon ball of any of my points. You see the math definition at the top, and all it means is that all my points are outside of this circle. Down here there's that red point, which is the only point that violates the definition — if we didn't have that red point, this model would be robust. So all we're asking is that the decision boundary stays just outside of this little circle around my stuff.

Now, here's where I get into the issues. The first major issue: data just moves — once you train a model, the data moves out from under it. The second issue: robustness is impossible to check, because of the volume stuff — you can't do it. The third issue: it's impossible to make an adversarially robust model in most cases, and in particular in the cases that we care about as security people. And the fourth issue: it lowers accuracy.

First issue: data just moves. I have a model here that I've trained up; I've got a bunch of green points and a bunch of blue points, and the model came up with this decision boundary. Everything on this side of the line gets classified as green, everything on the other side as blue, and that looks like a pretty good line. For some reason it's gone off at this angle up here, because the model doesn't really have data over there, so it just made something up. At this point I'm on a deadline: I have to train up to this point, then deploy and ship this to my customers. Then I get points from a month later — the ones in these boxes — and you can see the decision boundary is bad. The model didn't predict that the data from later in the month was going to be over here; it just made something up, and now it's misclassifying all these green points. So even if your model was robust before, your data moved, and it moved in a way that your model couldn't have predicted and doesn't know about. Robustness doesn't help you if the data just moves.
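The standard way to see this failure mode is a time-split evaluation: train on everything before a cutoff date, then score each later month and watch the error rates. A minimal sketch — my own illustration; the file name, column names, cutoff date, and choice of model are all assumptions, not details from the talk:

    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    # Assumed: a dataframe with numeric feature columns, a binary `label`,
    # and a `first_seen` timestamp for each sample.
    df = pd.read_csv("samples.csv", parse_dates=["first_seen"])
    features = [c for c in df.columns if c not in ("label", "first_seen")]

    cutoff = pd.Timestamp("2019-01-01")
    train = df[df.first_seen < cutoff]
    model = LogisticRegression(max_iter=1000)
    model.fit(train[features], train.label)

    # Score each later month separately and watch the false negative rate:
    # a sudden spike means the data moved in a way the model never saw.
    future = df[df.first_seen >= cutoff]
    for month, chunk in future.groupby(future.first_seen.dt.to_period("M")):
        preds = model.predict(chunk[features])
        positives = chunk.label == 1
        fnr = ((preds == 0) & positives).sum() / max(positives.sum(), 1)
        print(f"{month}: false negative rate = {fnr:.3f}")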
Here's some data from when I was working at Elastic — we published this at ICML — and you can see a very interesting thing happen. I had trained up a smaller version of the production model we released, just for testing purposes, and I trained it on all data up to January 1st, 2019. So it doesn't know anything about the future; it has only got data from before the new year of 2019. And it predicts well: you can see the false positive rate as I do a historical analysis, and it's very low — really low for a model. That's what I care about: a very low false positive rate, a very low false negative rate, and a very low error rate overall.

But as you can see, about three months after I "released" the model — this one wasn't actually deployed, but three months afterwards — there's a spike in the false negative rate. What happened was that some malware family figured out a bypass, and that bypass became prominent and showed up. In our case, the malware authors know what VirusTotal is: they can check whether they have a bypass, whether their malware is correctly classified, whether the AV vendors and the EDR vendors are going to catch them. And they do that — they look for areas that aren't in your data, and they find a bypass. And you can see that over a year after the model's training cutoff, it's got this massive spike — and this axis is not in percent — which would be completely unacceptable; you can't deploy a model that's that bad. Robustness wouldn't have saved this. The red line is a measurement of the drift; the green, blue, and orange lines are the ones you really care about in this image.

Second issue with robustness: it's impossible to check. Getting back to the dimension thing: if I want a robustness radius of one pixel, the number of bits of information for an L2 ball is 10 — you have to go through 2 to the 10 different inputs to test all the values. But when I've got an L-infinity ball, which is what a lot of these things are claiming robustness to, I have to check 2 to the 1,200. And this is just for MNIST — this isn't a real model, these aren't real images, this is just MNIST. And when I increase the size of the L2 ball, once I hit seven, checking an L2 ball of radius seven pixel values around a point in an MNIST model takes more bits than breaking a 256-bit encryption key. You'd have to brute-force AES-256, and that's easier than checking seven pixels away from an MNIST point. And the radius people claim is normally no smaller than 70 — not seven.
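You can put rough numbers on those search spaces with a few lines of arithmetic — my own back-of-the-envelope sketch of the talk's counting argument, using integer pixel steps on 784-dimensional MNIST:

    from math import comb, log2

    D = 784  # MNIST: 28 x 28 pixels

    # L-infinity ball, radius 1: every pixel can move -1, 0, or +1.
    print(f"L-inf radius 1: {D * log2(3):.0f} bits")            # ~1243 bits

    # L2 ball, radius 1: only one pixel can move, by +1 or -1.
    print(f"L2    radius 1: {log2(2 * D + 1):.1f} bits")        # ~10.6 bits

    # L2 ball, radius 7: a crude *lower* bound -- count only the points
    # that move exactly 49 pixels by +/-1 (49 * 1^2 = 7^2).
    print(f"L2    radius 7: > {log2(comb(D, 49)) + 49:.0f} bits")

That last lower bound comes out above 300 bits, comfortably past the 256 bits of an AES-256 key, which is the comparison the talk is making.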
Here's that line I told you about. This is a slice of how a neural network actually perceives data. The idea comes from a field of math called tropical geometry — if you know your cryptography and your finite fields, this is the geometry of the field with one element — and it relates to neural networks. This is actually sort of what you have to check, instead of just the pixels. What I care about is that my point is going to land in one of these sectors: if there's a solid-color polytope, my example is going to be inside one of those, and when I cross the boundary to another color, that means the classification could have changed. So the decision boundary of a neural network is contained in the edges of this image. This is a slice of a neural network trained on MNIST — and it's just a two-dimensional slice of a nearly thousand-dimensional space, so the real thing is much more complicated than this.

If I want to be completely honest about my estimate of the number of bits I have to check, using information about the geometry of the neural network, it's actually more like this. That means using all the math to reduce the problem space as much as possible, to save yourself as much time as possible — which is what cryptographers do: if you have a really bad way of generating a key that doesn't actually have the full 256 bits of entropy, only 90 bits, cryptographers can take advantage of that because they know math tricks. Well, for neural networks, if you know the geometry, you can take advantage of it to save yourself a lot of time. And with all of that, you don't get to seven — you get to ten. So even with the math tricks, you don't get much further.

Another way of making things robust is adversarially training them, and this is basically the algorithm for adversarial training. You can make adversarial examples in some really cheap ways, and you just include those in the training data — you always train your network on adversarial examples, to make sure it classifies the adversarial examples correctly. You use the fast gradient sign method and just start tossing in FGSM-generated adversarial examples. That's what this is: r* is the adversarial perturbation you're going to generate, and you toss it in there and train on both your normal samples and these adversarial examples. The problem with this is the first line — "find an attack perturbation r*" — because, as we just said, there are that many attack perturbations; that's the space you have to search over to find
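For concreteness, a minimal sketch of that adversarial training loop — my own illustration of the scheme the slide describes, reusing the fgsm helper from the earlier sketch; the model, optimizer, and data loader are assumptions:

    import torch
    import torch.nn.functional as F

    def adversarial_training_epoch(model, optimizer, loader, eps=0.1):
        for x, y in loader:
            # Inner step: find an attack perturbation r* (here, one cheap
            # FGSM step -- this search is the expensive part the talk is
            # pointing at).
            x_adv = fgsm(model, F.cross_entropy, x, y, eps)
            # Outer step: train on clean and adversarial samples together.
            optimizer.zero_grad()
            loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
            loss.backward()
            optimizer.step()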