BSidesMCR 2018: Tricking Binary Trees: The (In)Security Of Machine Learning by Joe Gardiner

BSides Manchester · 49:04 · Published 2018-08


Hello, can you hear me in the back? Good. So I'm Joe Gardiner. I'm a final-year PhD student at Lancaster University; I hope to submit sometime in the next few months, and I have actually managed to do some writing. I'm also the president of the Lancaster University Ethical Hacking Group, LUHack. We're always looking for sponsors, so if you want to sponsor us, come and speak to me or Vic out there. I'm joining the University of Bristol in September, looking at ICS security and IoT and the convergence of the two. You can find me on Twitter at [handle unclear] and on my website; I'll put the slides for this talk up there afterwards. There are quite a lot of paper titles mentioned in here, so you can go

and find them afterwards. So, the background to this talk. Back in 2013, when I first started my PhD at Lancaster University, we had a project from the Centre for the Protection of National Infrastructure, CPNI. On that project I basically wrote a report on malware command and control techniques and the detection of these, mainly focused on the academic work in the field. If you want to read it, it's very long and boring; it's at c2report.org, and there are some pretty infographics you can look at as well. As part of this I read hundreds of detection papers for command and control techniques, and pretty much every single paper I looked at used

machine learning in some form, and the machine learning they used was very simple algorithms, the sort of algorithms you learn in your first-year or second-year AI course at university. One of the things we didn't see in any of these papers was an attacker model that took into account the machine learning part, which is the key part of the system, the part that makes the decision on what's bad. Where an attacker model was considered at all, it only covered the use case they were looking at. So if they were doing DGA detection, the attacker model just spoke about the attacker generating domain names [unclear]. So I read those

papers and came away thinking, is this a problem? So I started to look into adversarial machine learning in academia, which is quite a big field covering attacks on machine learning. I went through this and wrote a survey paper, which took years, and I published it in ACM Computing Surveys back at the end of 2016 with [co-author unclear]. There's a link at the end of the slides, so you can go and read a free copy of that. So, what I want to cover today. First off, why we use machine learning in information security; people don't seem to like it very much, but we do use it. Then a brief introduction to what is machine

learning and some algorithms. I'll talk about the models of the attacker, how we model the attacker [unclear] in these attacks. I'll describe a few attacks and some of the issues around how to apply them, then a few defences, and I'll have some time for questions. This is quite a wide talk, in that I'm going to talk about lots of different attacks but nothing in particularly low-level detail. If you want to read up on them, there are some papers in there you can go and look at; they're quite in-depth on how to actually do them. It's more of an awareness talk, so if you start using machine learning you know what to look out for and what to

consider later. So why do we use machine learning? Traditionally, to detect malware we used signature-based methods. Whenever you buy a Norton product or McAfee or whichever you use, it has signatures for each piece of malware that you can be infected with, and it matches those against your sample, or against a network session [unclear]. The problem now is the very large number of samples per day being generated through polymorphic malware, so we just can't generate enough signatures and store them to be able to do this. [unclear] If you kept a signature for every single sample, you wouldn't have the space to store them any more. We also have too much stuff to inspect

manually. If you're a network admin and you have to go through your network, you can't manually inspect everything; there's too much data. Even if you have a very low [unclear], a signature will still throw up results, and you can't manually go and inspect all of those [unclear] in a corporate network. So we can use machine learning to alleviate this problem a little bit, letting the computer make decisions for us to try and make it as simple as possible for the admins. As part of the survey we developed this model of the typical detection system, and all of the detection systems we looked at have the same

structure. You have your data generators: these are your hosts on the network, or if this is a host-based malware detector these could be particular processes generating system calls. You have your data aggregator: this is your tap on the network at your gateway [unclear], and this is where you start transforming some of the data into things like NetFlow. You may apply sampling here as well. You have pre-processing, so again you might apply more sampling here, or structure the data in a particular way for your use. Then there's an optional data reduction step. What a lot of these detection papers do is take the data they have and

apply some sort of stage to remove some of it. For example they use whitelisting, [unclear] based on domain names or servers: for anything in the Alexa top 1,000 websites, they remove the data from there completely. That removes quite a big chunk of the data, if you take out Facebook and Netflix. You then have your feature extraction. I'll talk about features in a second, but basically you take your data and convert it into the format for the machine learning to use. You may have some prior knowledge included: if you're using a classifier you have to train on some data, and if you're doing clustering you might have some points which you know about

from a honeypot, which you use to identify clusters later, so you can bring this stored data into the detection to help identify things better. The key part is the separation engine, and this is where the machine learning actually sits. This is the main function in your detector, which takes your data in and outputs the malicious points and the benign points. If it's a good detector it might even say, for the malicious points, which particular malware they are, but normally you just want to know what's bad and what's good, and the bad stuff you go and look at later. So, just an example of where we use this sort of stuff: domain generation algorithms. If you're not

aware, this is what malware often uses to generate domain names on the fly. The malware will have some sort of script like this one, which is taken from Wikipedia as a simple example. What this one does is take the current year, month and day, put them through some function, and output a domain name for that day. So one day it gives you this one, and the next day it will give you this one. The malware operator on the other end does the same thing, registers those domain names, and then they link up; the next day it's a different domain name, so it's hard to discover these each day. These are usually quite

recognisable. This is an example from the Torpig botnet, which was reverse engineered and taken over. All the domains it generates are about this length; the last three letters are derived from the current month, slightly jumbled, so [unclear] for January, [unclear] for February and so on, and the second three letters are always [unclear]. So the domains it generates have these characteristics. If I were to write a signature for this, it's quite easy: check that the domain is this length, has these particular characters in there, and has one of these twelve three-letter month suffixes at the end. [unclear] Alternatively, we can use

machine learning to do this. We build a classifier and give it a load of good domain names, say the Alexa top million or so, and a load from some DGAs, generated over a long period, and then it can work out how to separate them. Or we use clustering, which takes all the data at once and tries to separate it into separate clusters based on the structure. So what is machine learning? The thing I see on Twitter all the time is people saying machine learning is just if statements. Broadly, yes, but it is a little bit more than that. Machine learning is

a form of artificial intelligence, and the core goal is that we take some data, we learn about it (that's the learning part), and we train a model based on our data. For a classifier, we take a load of training data, we build a model, and then when we're given new data we should be able to assign that data a label based on the training data. Broadly there are two types, supervised and unsupervised. Before we go into those in detail, a word you're going to hear a lot today is features, and these are the properties of what's being observed. If you have your data point, you generate a set of features about it. For example, if you have a network packet

your features might be the source IP, the destination IP, the protocol in use, the packet length, and binary features such as whether a given token exists in the packet contents. You can have a few of these, sometimes three or four, but often thousands, and this gives you a massive n-dimensional dataset, and that's what you put into your machine learning to make its decision. The first form of learning is supervised, and this is characterised by having training data. This is a big dataset; for security this will have good data, so things that aren't bad, and bad data, so things that represent malware, and this will be

labelled, so you know this point is good and this point is bad. We then put this into our system and train that system to map those input points to the outputs. This is usually called classification, because we're classifying the new data: once the model is built, we give it a new data point and it outputs a label, benign or malicious. Sample algorithms that do this are decision trees, naive Bayes, regression and support vector machines. Let's go over two of these quickly, because these are the ones that appear most in the work. The first is the random forest classifier; this is the "if statements" version. What we do

here is generate decision trees, and lots of them. A decision tree is basically a structure where, if you look at these diagrams, each point is a conditional. So up here might be: if packet length is greater than 10, go right; if not, go left. At the bottom of the tree are the labels, here A and B. What a random forest does is train lots of these trees, each on a subset of the training data: we take a sample of the data and train a tree on that, take another sample and train on that. Or we take a sample of the set of features, so we use a subset of features on each tree. We put the data in and the system generates as

many trees as you want and picks the trees that have the best information gain, the ones it thinks will have the best chance of classifying later. For a new data point, you run it through every single tree, take the majority and output it. So in this example of four trees, if three of them output A and one outputs B, we take A as the output label; it's a simple majority vote. The other one is support vector machines. This is the one that's in week one of any AI course or machine learning course. It's again supervised, and what this one does is plot all the points in a space and produce the hyperplane separating them.
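As an editorial illustration (not from the talk), the prediction step of the two classifiers described so far can be sketched in a few lines. The four toy trees, their feature names (`packet_length`, `protocol`, `dst_port`) and the hand-picked hyperplane weights are all hypothetical; a real random forest or SVM would learn them from training data.

```python
from collections import Counter

# Hypothetical toy trees: each one is just a chain of if-statements over a
# feature dictionary. A real random forest learns these from training data.
def tree_1(point):
    return "A" if point["packet_length"] > 10 else "B"

def tree_2(point):
    return "A" if point["protocol"] == "udp" else "B"

def tree_3(point):
    return "A" if point["packet_length"] > 100 else "B"

def tree_4(point):
    return "A" if point["dst_port"] == 53 else "B"

def forest_predict(trees, point):
    """Run the point through every tree and return the most common label."""
    votes = Counter(tree(point) for tree in trees)
    return votes.most_common(1)[0][0]

def hyperplane_predict(weights, bias, features):
    """Label a point by which side of the hyperplane w.x + b = 0 it lies on."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return "malicious" if score > 0 else "benign"

point = {"packet_length": 60, "protocol": "udp", "dst_port": 53}
forest_label = forest_predict([tree_1, tree_2, tree_3, tree_4], point)

# Assumed, hand-picked hyperplane in two dimensions (not a trained model).
svm_label = hyperplane_predict([1.0, -1.0], 0.5, [2.0, 1.0])
print(forest_label, svm_label)
```

Here three of the four toy trees vote A and one votes B, so the forest outputs A, matching the example in the talk. An even split would need an explicit tie-breaking rule, which this sketch leaves out.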

So basically, for a two-dimensional space it would draw a line between the two classes; it's called a hyperplane because this might be in a hundred dimensions. Basically, when a new point comes in, whatever side of that plane it's on, that's the label it gets. The alternative approach is unsupervised learning, and this is where we don't have a training phase: we just put the data in and it tries to separate it out. The data is unlabelled and new; we don't know what's in there yet. The main form of this is called clustering, and the main algorithms are k-means, x-means and hierarchical clustering. This is a bit harder to evaluate than a classifier. With a classifier, what we can do is take our training data,

put a bit of it aside, train on most of it, and use the bit that's left to test it with labelled data. With clustering it's hard to do that because we don't have that labelled data to start with. The most common algorithm for this is k-means clustering. What we do here is very simple: we have our data plotted in space, we generate k random points in that data and place them in the space, and then we assign all the points to the closest centroid by distance. If you look at the GIF here, you can see at the start we have our data, and then we generate four random

points, which are the triangles, and all the points are assigned to those by colour. We then move each of those points to the mean of its assigned points and repeat the process: assign the points to the closest centroid again, and keep doing that until the centroids stop moving. In this case we eventually get four clusters, and you can see that they roughly represent the four main groups of points in that system. There's also a form of this called x-means, where we don't know what k is in advance; it repeats this for different values of k and tries to find the best one for us. The other one that comes up a lot today is hierarchical

clustering. Again this is distance-based, so it's trying to group points that are close together in space. What we do here is start with every point as its own cluster, then we combine the two closest points to each other, and we keep doing that until eventually we have one cluster. This can be represented as a dendrogram, and then we read off that dendrogram to find our clusters. So here, if you read off at, say, four or five, which is back here, we'll get four clusters, which are these four colours; if you read off at number six we get two. We just read off at the level we want to get the number of clusters we need. So we've got our algorithms, and to test them we

need to know how to measure performance. There are lots of different metrics, but there are four key ones that we use. The true positive rate is the malicious points that are labelled as malicious; we want this to be as high as possible. The false positive rate is the number of benign, or good, points labelled as malicious, and we want this to be as low as possible, because every alert we get from a false positive has to be investigated by an admin. The true negative rate is the number of benign points labelled as benign, which is good, so we want it to be as high as possible. And the false negative rate is the number of malicious points that are

labelled as good. We want this to be low, because if a point is a false negative it means we've missed it. Normally what we try to do is optimise to have the true positive rate as high as possible and the false positive rate as low as possible; if we can get those two to be good, that's usually what we're aiming for, because then the others will be good as well. So this is a table from my survey. The leftmost column is a list of some of the most commonly cited detection papers, and you can see here what each one detects: domain names, [unclear], network servers, hosts and so on, and [unclear] network

traffic or malware. We see the true positive rate and the false positive rate. True positive rates tend to be around 90 to 100 percent, bearing in mind these are perfect test conditions, and the false positive rates are down to 1%, or 0.00001% and so on, very low. The key column here is this one: these are the algorithms that these systems use for machine learning. You can see in that list that most of these are the ones I mentioned before, k-means clustering, random forest, SVM and so on, or slight variants on these which are slightly more advanced but still very basic. The main assumption that these systems make, which is what's

usually broken, is that the data is linear: they assume that the malware and the benign traffic are separate and that you can draw a line between them. Normally they're not. It's more like this, where the malware traffic is surrounded by legitimate traffic, and it's hard to separate out later. So that's the background for this, and now we get to the interesting part, which is how we actually attack this stuff. What does an attacker want to do here? There are two main goals they can have. The first is to evade detection: if they have an attack point, they don't want that point to be detected, so they try to increase the false negative rate on their

particular point. The alternative is denial of service: they want the admin to stop using the system, to stop paying attention to it. So they try to push the false positive rate as high as possible, and if there are too many false positives the admins will turn the system off. If you have a network with a billion DNS requests a day and a false positive rate of just 0.01 percent, that's a hundred thousand alerts you have to go and investigate, and most admins would turn it off straight away when faced with a hundred thousand alerts per day. Even if that was down to 0.001 percent, that's still ten thousand. We don't have to do

very much to stop them using it. There are a few different models from the literature about how to classify this stuff. The one whose keywords I refer to a lot today is the Barreno model, and it separates out the attacker in terms of the influence they have, the specificity of what they're doing, and the security violation they cause. The influence is either causative, where for a classifier they affect the training process, so they affect the training data in some way to make the actual model be wrong, or exploratory, where they start probing the system, putting points into it and seeing what comes out, as with a live detection system. Then we have specificity. We have a targeted attack, where

I want to make one particular point be classified wrong or clustered wrong, so an attack point is labelled as benign or a benign point is labelled as malicious; say we want to target [unclear]'s email and make his emails appear malicious so he doesn't get them and they go to the spam folder. And we have indiscriminate, where we're attacking the system as a whole and we want any point we can to be classified wrong. Then we have the security violation. We have integrity attacks, where we want attack points to be labelled as benign, increasing the false negatives, and we have availability, where we increase the

false positives so that the system becomes unusable and isn't used any more. One thing that was added to this [unclear] is knowledge: how much the attacker actually knows about the system. If the attacker is attacking a computer as a pen tester, they can do an Nmap scan; they know what they're dealing with and they can attack it. With this it's harder, because they don't know what machine learning is there. [unclear] The vendors don't tell you; it's hidden. The three main parts of attacker knowledge are the feature set, so what features are being used, which is

quite useful to know so you know what to change; the training dataset, so what it was trained on in the first place and roughly how the model is built; and the actual algorithm in use, so whether they're using k-means clustering or a random forest classifier. The attacker has some combination of knowledge of each. They might know half of the feature set from a publication or white paper. They might be able to estimate the training dataset: if it's a PDF detector, by downloading malicious PDF files off the internet and using those. And the classifier they can probably guess, or use reverse engineering: if they get a copy of the software they can reverse engineer it and work out what the classifier is itself.

Or, if we're lucky, they don't know anything at all, which makes it difficult for them. Then we have the attacker capability: this is what they can actually do to apply the attack. This is from Biggio, a name you'll see a lot; there's a lot of his work on these attacks. You have the type of influence, so whether it's causative or exploratory, whether they can affect the training data or not; how much they can affect the class priors, so can they affect the proportion of benign traffic to malicious traffic, for example by adding more malicious traffic to shift the probabilities; how many samples can be controlled in each class, so how

many of the attack points can they control, can they only add one or two of their own or can they control all of them; and what features can be modified, and by how much. In network traffic some features can't be changed: IP addresses, like source or destination IPs, can be quite hard to change while keeping things working, whereas with something like packet length you can pad packets or split packets up as much as you want to get the length to where you need it to be for the attack. That's all described in terms of classifiers, but the same basic rules apply to clustering as well. Just a couple of bits of

terminology you'll see here. I use the word learner to refer to the actual machine learning algorithm that's being targeted, so k-means is the learner. The production learner is the actual system that we're attacking, so the antivirus on your laptop or the detection system on the gateway. And the one which is quite important is the surrogate learner. With these attacks, what you have to do is build your own local version of the production system with whatever data you have available, and you practise your attacks on there. So you're not sending things to VirusTotal every few seconds; you build your malware sample and test it locally, and then you send it to the production learner to do

the actual attack, so the defenders only see the last attempt, which should be successful. [unclear] to find the best attack point. So now the interesting bit, the actual attacks themselves. The first is the mimicry attack, which in my opinion is the best and the nicest one. This is an exploratory integrity attack: we have a system that's already trained, we're attacking a live one, and we're basically trying to misclassify our point. We have an attack point and we want it to be classified as benign, so we pick a particular benign point and make our

point look like it, or just take any random point and try to make ourselves appear like any benign point, hence mimicry. This has been demonstrated against random forests, SVMs, naive Bayes and neural networks in the literature, but it applies to pretty much any classifier, because as long as you can change your point enough you'll make it look like the benign points. The main limitation here is how much you can actually change that attack point. This is the one I'll do in most detail, as it's the one best described in the literature against a security system. [authors unclear]'s paper targets a system called PDFrate using this attack, and

this is a website that's now disappeared. I wanted to do a live demo of this, but it's gone. What it did was: you upload a PDF file, it applies a random forest classifier to it and outputs a score from 0 to 100 percent for how malicious the file is. A hundred is completely malicious, zero is safe, and you apply a threshold to decide what you think is bad. The attacker's goal here is to take a benign PDF file and your PDF with your exploit or malware in it, and try to make your malware file look like the benign file. The goal here is to

reduce the score output by PDFrate far enough that the person who uploaded it sees it as benign. The main issue here is that in a PDF file the features [unclear], there are hundreds of them, and they're all quite interlinked: if you change the value of one feature it might change a thousand other features as well. So you have to plan ahead and be careful about what you change. This was all built using a surrogate learner. They took a bunch of PDF files from [source unclear], which has thousands of malware samples in it, and they took a load of good PDF files from the

internet, scraped them, and used those to build their surrogate learner, a random forest. They also knew what most of the features were, because this work came from a paper in which about seventy percent of the features are listed in the appendices. To apply the attack, they inject content into the PDF file to increment these different features. In a PDF file there's a space between the cross-reference table and the trailer where you can put data. A normal PDF reader will read the trailer first, and that trailer points to where the cross-reference table is, and then it uses that table and ignores anything in between

those fields. PDFrate, though, reads from the top of the file and just goes straight down and parses it all in one go, so anything in that gap will be picked up by PDFrate but will not be displayed by Adobe or Google Chrome or whatever. So out of the two hundred or so features in there, [unclear] of them they can increment arbitrarily, and thirty-five of them they can change to whatever they want. As an example, if the attack file has five object keywords in there, the object count feature is five; if the target is seven, they just put "object object" into that space, separated by whitespace, and that increases the count up to seven.
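As an editorial illustration (not from the talk), the counting trick above can be sketched on a toy byte string. Everything here is an assumption made for the example: `count_keyword` is a stand-in for a keyword-count feature and is not PDFrate's real extractor, `pad_to_target` is a hypothetical helper, the keyword is taken to be the literal token `obj`, and `toy_file` only imitates the layout of a PDF rather than being a valid one.

```python
def count_keyword(data: bytes, keyword: bytes) -> int:
    """Toy feature: number of whitespace-separated tokens equal to keyword."""
    return sum(1 for token in data.split() if token == keyword)

def pad_to_target(data: bytes, keyword: bytes, target: int,
                  marker: bytes = b"trailer") -> bytes:
    """Insert copies of keyword just before the last marker to reach target."""
    missing = target - count_keyword(data, keyword)
    if missing <= 0:
        return data  # this trick can only increase a count, never reduce it
    head, found, tail = data.rpartition(marker)
    if not found:
        raise ValueError("marker not found in data")
    padding = b" ".join([keyword] * missing) + b"\n"
    return head + padding + found + tail

# Toy file with five objects, then a cross-reference section and a trailer.
toy_file = (
    b"".join(b"%d 0 obj\n<< >>\nendobj\n" % i for i in range(1, 6))
    + b"xref\n0 6\ntrailer\n<< /Size 6 >>\n%%EOF\n"
)
padded = pad_to_target(toy_file, b"obj", target=7)
print(count_keyword(toy_file, b"obj"), count_keyword(padded, b"obj"))
```

The padding lands between the cross-reference section and the trailer, which is the gap the talk describes. A top-to-bottom counter like the toy one sees seven keywords, while the claim that a normal reader ignores that gap comes from the talk and is not checked by this sketch.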

What they can also do is this: if the author field has a target length, they just put a new author field into the space, and PDFrate will read whatever the last author field is and use that length. If they want the length to be three, they just put "ABC" and it will read that, whereas Adobe or other normal readers ignore it. So, the surrogate learner they built: as I said, they know about seventy percent of the features because they're in the paper, and they take their training files from the web. They tested this with a random forest as the surrogate learner, which matches the actual target, but they also tested with an SVM as the

surrogate. The reason for this is they wanted to know: if you don't know what the target is and you train with a different algorithm, can you still attack the target? And the answer was yes. If they use a random forest as a surrogate and train their attack points, on average they get a 28 to 42 percent reduction in the score output by PDFrate. In most cases a 42 percent reduction will be enough to put you right below the threshold, and the files will be misclassified. Even if they use an SVM as a surrogate, they still reduce the score by about 20 to 30 percent, so

they still reduce it enough that the attack PDF files would be labelled as benign. If you want to try this yourself, it's on GitHub; they've actually put the code up there and you can go and build attack PDF files yourself. Most of the datasets it uses you can't get any more, and the PDFrate website is now gone, but you could probably substitute a newer dataset, find a system like VirusTotal and test on there. Some other examples of this: Biggio et al. test this against SVMs directly and neural networks, with perfect knowledge, where they know everything about the

target, they know the training data, the classifier, the feature set, all of it, and also with limited knowledge, where they know about half of the data. Even with limited knowledge they can increase the false negative rate to 0.5, so 50% false negatives even when they don't know all the data. [unclear] against [naive] Bayes: they change traffic [unclear] in a system that identifies which web page a target is viewing, and they can reduce accuracy from 98 percent to 4 percent. [unclear] When they take the classifier and retrain it using the attack samples they generate, they still [unclear] by 63 percent, so even on a system that has been trained to prevent

this attack it's still effective. [unclear] This applies to clustering as well; it's pretty much exactly the same approach. In this example we want to move our attack points over towards a benign cluster, and eventually they'll be clustered with it. This has been demonstrated against hierarchical clustering by [Biggio et al.], and in theory it could work against any clustering algorithm: you just have to move the point close enough to the benign cluster and it will be absorbed. Again, this is limited by how much we can change our point. If they can't change the features very much, they're going to struggle to get the attack point close enough to the benign

points, and there is a limit to how far it goes: if you become too benign-like you become benign, and your malware won't work any more. The next set of attacks are gradient descent attacks. These are the more mathematical approach to these things: basically we take a gradient descent optimisation algorithm to find the optimal attack point. We start with two points, an attack point and a benign point, and find the best attack point to reach our target goal of misclassification. This one requires the attacker to know roughly what the target is, the classifier or the clustering; they need a bit of knowledge for this attack to work properly. If they don't know

very much, it's not going to be as successful as a mimicry attack. You have to have a surrogate classifier here to make it work. You need to test this, because it's a very iterative approach: you need to go through it lots of times to make sure it works. The mimicry authors tested this approach against PDFrate themselves, using the same basic attack as the mimicry one but using gradient descent instead, and they reduced the score by 29 to 55 percent. That's plenty to reduce it: even if a score was a hundred you can drop it down to 65, which would probably be classified as benign. It's also been tested against neural networks and SVMs. With an SVM

we can increase the false negative rate to close to one, so pretty much everything is misclassified, with only about 20 to 25 modifications. Neural networks are a lot more robust; they're better, but on there they can still achieve a false negative rate of about 0.3, which is still pretty high, although that's when you know everything about the classifier. So those are attacks where you already have a target to break. The next set is what we do if we can affect the training data, so we can get data in and modify how that system is learnt. There are three types of technique against classifiers: there's label flipping, there's a gradient ascent based attack, and

there's dictionary attacks. The usual idea here is that we somehow affect the training process. For example, if we can get samples onto a honeypot, we can then make that sample do whatever we want it to do and get that data collected. We're also able to submit samples to VirusTotal; VirusTotal sends samples off to the detectors, so we give it samples, they detect them, and they get into the training sets. The first one is label flipping. What we do here is take benign points and get them labelled as malicious. So if we're on a honeypot and we have malware [unclear], we make

that malware do something good, but it gets labelled as malicious, because it's on a honeypot and it is malware that's been captured there. Then that's taken as a bit of training data, and the system is trained with that point being wrong. This has been demonstrated against SVMs a few times. [authors unclear] get an error rate of 50% by changing 10% of the labels [unclear]. It's also been shown by [authors unclear] against [unclear] and LN-SVM, which has actually been designed to be more robust against label noise, [unclear], and they can still attack that

version. So the main limitation here is how much the attacker can actually change the training set, how much access they gain. So this example is taken from one of those papers, where they have SVMs on different datasets. So it's four different 2D datasets here, a linear one with a straight line and a parabolic one with a curve, and the black line is the hyperplane that's been generated by the SVM to separate them. So if you see, in the no-attack case in the first one it is a straight line. If we introduce random points here, so where these points are circled, that's where we've flipped the labels on them, on each side, then because they're distributed randomly it didn't change it that much in there. But then,

when we start putting them near the line and far away, we start moving that line in the space, and that's where it gets interesting, as soon as that line has moved. If you go down to this case here, where it's the parabolic curve, we can completely mess up the classification. So that's going to have lots and lots of errors in that case, because this one down here, this space was all benign here, and now that's all labelled as malicious, so all of that is going to be wrong. And in this one they're just flipping twenty labels in each case, not many out of the hundreds that are there. The next one is the

gradient ascent attack. So we take a benign point and we start changing this point to move it in space until the classifier becomes less accurate. So we take the point, we flip the label to say it's malicious, and we start moving it by changing features. So it's kind of like the earlier mimicry attack in reverse: we start changing it towards our points. And this was against SVMs, by Biggio again, and in this case they can achieve an error of 0.06 there, which isn't very high, but it's enough to start causing errors. And the next one is the dictionary attack, which is the last one against classifiers, and this is particularly focusing on

systems where we have token-based features. So as I said earlier, if you're looking at packet contents, you might have features that say this string is contained in there, so they'd be binary features, basically a dictionary of tokens that are either present or not. And what we do here is we take our malicious points and we start putting data into them which appears in the benign points. So this was tested against the SpamBayes filter, which detects spam emails. So what this does is it labels emails in one of three ways: as spam, as unsure, where it doesn't know, or as ham, which is benign. And what this does is, the

indiscriminate attack sends spam to a target who will be using the spam to train their detection system, and it starts making the spam emails contain words that are contained in legitimate emails. And what they want to do here is just randomly cause lots of false positives, so that the spam filter becomes unreliable. So lots of people's good emails end up going into the spam folder, because the spam emails start containing lots and lots of good words. And what they found here was that if they can control just 1% of the training set, which if you have a botnet is quite easy, you just send enough spam to do that, they could cause something like a 90 percent false positive rate. So 90 percent of your emails

go into the spam folder, which would annoy everyone and they would stop using the spam filter. Then the alternative is to do it targeted. So for example, if I want to make Matt not receive any emails, or a particular email, I will start sending emails to the spam folder which contain a particular string, so "Dear Matt" for example. If I send enough spam containing "Dear Matt", it's likely that any other emails sent with "Dear Matt" will also go into the spam folder and he won't get them any more, which you might want in certain cases. If you know something like thirty percent of what your target email is going to be, you can achieve a sixty percent false positive rate.
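
As a rough illustration of the dictionary attack described above, here is a minimal sketch in Python. It is not SpamBayes and does not reproduce its scoring or its "unsure" band: `TinyNaiveBayes`, the `HAM` and `SPAM` lists and the target message are all made up for this example, and the filter is a plain Laplace-smoothed naive Bayes over binary token features. The only point is to show how spam stuffed with legitimate words, once it lands in the training set, pushes a normal email towards the spam label.

```python
import math
from collections import Counter


class TinyNaiveBayes:
    """Toy token-based spam filter (hypothetical; not SpamBayes)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Binary token features: each distinct token counts once per email.
        self.counts[label].update(set(text.lower().split()))
        self.docs[label] += 1

    def spam_probability(self, text):
        # Smoothed class prior, then one smoothed likelihood ratio per token.
        log_odds = math.log((self.docs["spam"] + 1) / (self.docs["ham"] + 1))
        for tok in set(text.lower().split()):
            p_spam = (self.counts["spam"][tok] + 1) / (self.docs["spam"] + 2)
            p_ham = (self.counts["ham"][tok] + 1) / (self.docs["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


# Made-up training data for the illustration.
HAM = [
    "dear matt the project meeting is on monday",
    "dear matt please review the budget report",
    "lunch on friday with the project team",
]
SPAM = [
    "cheap pills buy now",
    "win money now click here",
    "buy cheap watches click now",
]


def build_filter(extra_spam=()):
    f = TinyNaiveBayes()
    for text in HAM:
        f.train(text, "ham")
    for text in SPAM + list(extra_spam):
        f.train(text, "spam")
    return f


# Dictionary attack: spam that carries the whole vocabulary of legitimate mail.
ham_vocabulary = sorted({word for text in HAM for word in text.split()})
poison = ["cheap pills " + " ".join(ham_vocabulary)] * 5

target = "dear matt the budget meeting is on friday"  # a legitimate email
before = build_filter().spam_probability(target)
after = build_filter(poison).spam_probability(target)
print(f"spam probability before poisoning: {before:.3f}")
print(f"spam probability after poisoning:  {after:.3f}")
```

In this toy setup the legitimate message starts well below a 0.5 spam threshold and ends up above it once the five poisoned emails are in the training set. A targeted variant would put only the strings expected in the victim's mail, such as "dear matt", into the poison instead of the whole vocabulary.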

And obviously if you're sending targeted, it's a bit more straightforward: if you know who the person is and what their job is, you could probably guess fairly well what email contents they're going to have, by their job title, what sort of thing people are going to send them, and so on. So the next stage is clustering. So we can apply poisoning to clustering as well. We don't have the training stage here, but we can start adding points in which start changing the clusters, and generally what we want to do here is make those clusters no longer represent what they originally did. So the first one is bridge attacks, and what we do here is, if we have two clusters, we'll put

points in between the two so those clusters either merge or split off. So two clusters could become three, or two clusters could become one cluster, and then the clusters no longer represent what they did, and it's much harder to evaluate them afterwards to find the attack points. So this was demonstrated against hierarchical clustering, but the same should work against k-means as well. And for this one you do need to know what the data is going to be like, so perfect knowledge is best, or at least a very good estimation of what the clusters are going to be like, because if you just put points in where there are no clusters they're not going to make anything merge, they'll just stay

where they are. So Biggio et al. again, they tested this against Malheur. Malheur takes the MIST malware behaviour reports, which describe the threads and processes, and it clusters them. They assume knowledge about the target, and they pose this as an optimisation problem: they try to find the optimal attack point to start causing these clusters to split. All they really do is keep adding more and more attack points until the clusters split or merge. So it takes more than one point to do this, because you need to start affecting the average points, but the more and more you add, the better the chance that the cluster is going to split. So this is hierarchical clustering. For their

sample they had 40 clusters originally, and just by affecting 2% of the training data they can reduce this down to five clusters. So forty to five, that's quite a big drop, and that messes up the output there. They have all of the maths worked out in the paper as well, which is well worth a read. The last one is the gradient descent attack. So it's a similar approach, but we're using gradient descent optimisation to find that optimal attack point as well. And this one again was against hierarchical clustering. So again with perfect knowledge, this was tested against three datasets: a PRTools dataset, which is a random set of sample

data, a malware command-and-control dataset, and some handwritten digits. And in all three cases, I think, all the clusters were collapsed to one by doing this attack. This also requires quite a lot of iterations, as in two or three hundred iterations per point need to be done, so it takes quite a long time to generate them. They do have an optimisation with estimation which can reduce that space down. So that's all the attacks which I'm discussing today. That's broadly how all the different attacks are modelled; there are different variants of them, but those are the main ones. So now the issues, and the main one is how difficult it is to actually do this. All the attacks I've talked about here,

they use very simple data. So it's usually datasets with a small number of features, where the attacker knows a lot about them. When you get to command-and-control data or malware data, it's a lot more complex: you have a lot more features and it can be a lot harder to change them, so it becomes more difficult to actually perform these attacks. As I said before, some features you just can't modify. So something like destination IP addresses are quite hard to change, because you need to make sure your malware can still talk to whichever server it needs to go to. If you're doing poisoning attacks, you can target honeypots, or you can send samples to VirusTotal and get them in there; it's not that

difficult to do that. The main limitation is how much knowledge the attacker has gained, so how much they know about the data, what system it is, what the features are. In most cases they won't know very much. For a lot of these things, for big commercial systems, it's a trade secret; they don't want to publish their machine learning part. Whereas with a lot of things, like a crypto system, you publish it to get it verified by people, when you have your trained detector that's quite a sensitive part of your product, that is the main part, so you don't want to release it for people to go and test and play with, and you keep it private. A lot of these

systems often come from academic papers, or they have technical white papers, and these will contain information about the feature sets they're using and the algorithm, so you can get some of the information from there by finding any documentation. The datasets aren't often available. A lot of these big systems won't be trained on one dataset; they'll be taken to a site, the customer, and they'll get data from them to train on, to give the best results. So you have to sort of estimate what that customer's data is going to be. Whereas something like Norton for your laptop will have one common dataset, which would be easier. The other things to do are social engineering, so you call up the vendor and say

"I'm one of your customers, tell me how it works", or we just reverse engineer it: so pull it down and try and work out what features it uses, what data it uses and so on, and if you're good at that you can figure out a lot of the details about how it actually works. The other question is, if you have a full-scale detection system, how would this work? So the detection systems aren't just the machine learning part; there's the entire process around that: they have the sampling, the data conversion to different formats and so on. So the attacks in the literature are always just tested against that one part, though the rest of the system might have protections against

some of these attacks happening in the first place. So if your data changes formats when it goes through these pre-processing steps, it might not actually come out the same as you need it to later. You also find that a lot of these systems don't just use one algorithm; they might use two or three in sequence. So they might apply clustering first to separate the data, and then apply a classifier to the clusters afterwards, so you have to attack both systems at the same time, which is again more complicated. So one of the big questions is: does this actually happen in real life? It's quite hard to know, because we'd only see the final attack point. If the attacker is good,

if the attacker's not an idiot, they're not going to be testing this iteratively on the actual target system, because the defender would easily identify that, with them submitting slightly different points each time. They'll be using the surrogate to train those points and work out the best attack, then they'll use that one point, and if they're successful it'll be misclassified as benign and you won't see the point at all. Since I submitted the talk, there have been a lot of talks about this: there was one on this subject at BSides Las Vegas last week, there was one accepted for DEF CON as well, and I think at Black Hat there was

a talk. So this is becoming a lot more common, certainly this year; there's a lot happening about this. The attackers know about it. The APT-type actors are almost certainly doing this already, because these papers have been out about four or five years now; they know about them, they'll be doing them. And lastly, how can we defend against this? So we've broken it, how do we fix it? The two main approaches from the literature are multi-classifier systems and game theory. So multi-classifier is: rather than use one classifier, we use lots of them. In one case we take one classifier and we train it on different subsets of the training data, different feature sets,

one at a time, so you end up with lots of them. This is kind of how a random forest does it already, so it's not the most effective way, but it can help a bit. The other one is you just use different classifiers: so you use an SVM and a random forest and a neural network at the same time, and then you have three different systems, and hopefully one of them will classify the point in the right way. The other one, that's the more interesting one, is the game-theoretic approach. So here we say we know what the attacks are, we know what the attack's going to do, and we take our attack points, we have the actual data, and we work out what the likely

attack points are going to be. So we try to estimate the attacker's game plan and work out what they're going to do. This works well as long as the attackers actually play the game with us; if they're not playing the game then it doesn't work as well, and they are attackers, so they're probably not going to play it. But these can go some way in helping to prevent these attacks happening. The defence papers for adversarial machine learning are good, but they have some limitations. The evaluation they have is against very simple attacks, only singular attacks: they take one of the attacks I've discussed, say a mimicry attack, and they test against that one but not others,

and pretty much every paper I've read on secure machine learning uses the spam email filter as its sample use case, pretty much all of them for some reason. I guess it's because it's quite an easy one to get data for, but that's what they all use. They don't test it on other domains, and spam's quite an easy one, it's straightforward data. And as I said, the game-theoretic approach relies on the attacker playing the game; if the attacker changes what they're doing and we don't know, it's useless again. So there are these secure machine learning algorithms out there, but why don't people use them? So the main one is lack of awareness: people don't know about them. If you don't read up on this stuff you don't

know they're there; you just use what you learnt in your year-two machine learning course, so you use k-means, you use SVMs. The other one is ease of use. If you take k-means or SVMs, you can find literally millions of implementations of these in every single language, for every type of data you want; there's software to do it, there are libraries to do it, it's easy. The secure ones, not so much: there are very few implementations of them, they're not easy, and you need really good data to match, or you have to modify it in some way, which is too much work that you don't want to do. A lot of these systems have reduced performance in the normal case, so the true positive and false positive rates are changed, which is less

attractive. So they're more robust against attacks, but in the normal case they don't work quite as well, so the headline figure, the true positive rate, drops, say it's reduced to 90%, and people don't want to use them as one sounds better than the other. One of the things we found was that there aren't really any good metrics for this, for these machine learning algorithms. So if you test them on pure performance, you have true positives, false positives, precision, recall, accuracy and so on, lots and lots of different metrics you can use to evaluate how good they are in performance. But for security there aren't really any that make it that simple. There's no, like, single number:

"OK, it's 90% effective". You don't have that, because there are different attackers, there are different levels of attacker and how much they can do, so it's very hard to come up with. We did think about how to do it, and I'm hoping somebody smarter than us will do this; if we had these it would make life a lot easier for picking these algorithms later. So, last slide: what to take away from this. Machine learning is good, it can work for you, but only if you do it properly. If you just stick an SVM or k-means into your system to have machine learning in there, because machine learning is the cool buzzword at the moment, you're potentially

making it worse. When you make your threat model for whatever you're doing, incorporate the attacks against the machine learning into that. As I said, of all the detection papers I looked at, none of them had the machine learning component as part of it at all. They are now starting to do it after a few years, with more of these sorts of surveys and attack papers coming out, but of most of the ones I read up till 2015, very few had machine learning in the threat model. And if you are interested, try to use the more secure variants, so try and find implementations of them; a lot of them do have basic implementations and libraries. If you can use

them at all, use them; they're more robust. So, questions? So he's asking if I looked at ROC curves as a basis for metrics. Not really, no, but it's quite difficult to visualise it, because the attacker model can change so much in terms of how much they can do. It's quite hard to have any sort of singular value or curve to represent this, and there are just so many different ways they can change. You don't know what the attack's going to be beforehand, so you'd have to do data points for each attacker variant, and that's quite high-dimensional data already, trying to visualise the different attacks and what they can do. Any more questions? No? Awesome,

thanks very much. Give it up for Joe Gardiner.
