Cutting Through the Buzz: Machine Learning & Artificial Intelligence Jon Zeolla TRISS 2017

Name: Cutting Through the Buzz: Machine Learning & Artificial Intelligence Jon Zeolla TRISS 2017
Uploaded: 2017-10-29
Duration: 39 min 44 s
Description: Cutting Through the Buzz: Machine Learning & Artificial Intelligence Jon Zeolla TRISS 2017 http://www.threeriversinfosec.com/

BSides Peru39:4470 viewsPublished 2017-10Watch on YouTube ↗

About this talk

Cutting Through the Buzz: Machine Learning & Artificial Intelligence Jon Zeolla TRISS 2017 http://www.threeriversinfosec.com/

Show transcript [en]

so Jonsey Ola is CTO for seaso and he also helps out with steel city InfoSec and besides Pittsburgh we or helped organize that I'll turn it now over to John hear me at all there we go all right great thank you thanks John [Music] okay thank you all right cool thank you all right everybody thanks for coming and sticking around for after lunch I know there's a bit of a tough spot to try and pay attention and learn about a complicated topic after you've just digested some delicious turkey I hope so my name is John seola which is funny because I don't really have it on that slide but these are some other things about me so I'm an intersect

professional like John said I'm the CTO and co-founder of C so which is a consulting company founder assisity info sack organizer of Issa it's Pittsburgh and I do various projects so that's who I am and so I wanted to start this off and kind of frame a little bit of why we're talking about this and actually we're lucky enough to have one of the members who of one of the authors of this report in the room Kyle raise your hand there we go that guy right there so he's one of the three people and that authored this risk survey for emerging technology domains that they you know cert CC works with us start to kind of provide this on a

yearly basis and I saw this the other day and I thought it was really relevant to this talk because as I skimmed through it I found that the number two domain that was prioritized for further study based on a number of factors was machine learning and so that's essentially what we're going to talk about today i-i've been to a lot of machine learning and AI talks and I've been to a lot of vendor pitches and I get aggravated a lot because we never actually get to the meat of like how does this stuff actually work it pretty much is just spoken like it's magic right it's just going to magically fix all of your problems it's always the

best and so I've been working over the last maybe three four years to better understand exactly now technically how does this stuff work and so this presentation is gonna be my attempt at explaining with lots of pictures because pictures help me at least with lots of pictures what exactly that means so another brief note about this report the reason why I was kind of prioritized as number two was based off of this mate matrix which says essentially how could this technology impact people you know safety privacy financial and operational related severity so this got a three a three or four then a two which kind of ranked it pretty high am I getting things right so far Kyle all right there

we go and then just a little bit more of setting the stage before we start digging into some of the details I'm gonna do some definitions you know we hear AI and we hear machine learning and there's some confusion are they the same thing are they different how are they different what's a GI what status science you know all these things so I just have five quick definitions and then we're gonna start digging in the the first definition that I wanted to give is a GI is anyone in the room actually heard of a GI before so a very small number maybe two or three percent of hands went up it's it's not a very

widely used term at this point people in the community in the the math stats machine learning communities aren't even a hundred percent sure that this is attainable but what it means is what most people when most people will call this AI but they actually kind of mean a GI which is a fully autonomous kind of system it says in the definition you know a machine that can perform any intellectual task that a human being can so this is artificially generally intelligent it can it can do any sort of task that a human could then you have AI we just kind of like the next step down consider this like a subset of a GI that

which is a device that can perceive its environment and then take actions to to maximize the chances that success at a single goal so you kind of have to establish what this goal is you know where you are going and then put it in an environment and then it kind of attempts to get to that goal in a maximally successful way which is slightly different than AGI right and then machine learning is a further subset so machine learning we're just saying it can learn with it without me giving it a set of roles on what exactly to do so I kissed or eclis we've had like expert systems and things like that where you have a knowledge base and then you have

a bunch of roles that are set by an expert and you know that's kind of like the predecessor you can consider to machine learning where we're talking about specifically what this means but you're you're happier you have an algorithm that learns without being told exactly what to do and then two more definitions I kind of just threw in here at the last minute because I get these sorts of questions a lot you know what is deep learning or data science so deep learning is again a subset of machine learning so we've got AGI under that AI than machine learning and then finally deep learning which is it's a family of machine learning algorithms based on

learning data representations and they often mirror neurons in the brain so if you've seen you know another term would be like artificial neural networks right deep learning is a little bit broader than that term but I have a picture later on that shows exactly what that is but it's a bunch of algorithms tightly connected together that then give you a result at the end that's essentially what deep learning refers to the deepness of and the width of the number of those algorithms you know you might have heard the term perceptrons that are in this neural net and that's why it's deep because there's many layers and you'll see that later and then finally data science which is just a field about

extracting data or I'm sorry extracting knowledge or insights from data so this is just kind of a scientific method that you can use to apply to data to understand what is in there what you know to extract patterns or in for any sort of information from this set of data so let's talk briefly about where we are now because it's interesting to look at kind of the basic hype cycle so you know with any new emerging technology there's this basic types and I have a so this slide kind of it comes from Gartner is that easy to see probably not so great but it has kind of five basic stages and it starts out with

the innovation trigger so like this can actually work we know that this can actually work and so that's your innovation trigger so it'll start and and you know your expectations for that technology will rise and continue to rise and when they get to the top kind of where deep learning and machine learning is right now you're kind of considered at the peak of your inflated expectation so what people the way people perceive how this can be used is higher than how it will actually ultimately be able to be used at the end and the plateau of productivity right there's also kind of deep reinforcement learning on here as well which is earlier in earlier stage and in the kind

of those three those three technologies are relevant to this presentation talking about AI machine learning but to be honest machine learning hasn't necessarily followed the normal hype cycle cycle its hype cycle hooked a little bit more like this so you know way back in the 1950s was kind of the real innovation trigger and it's went up and it's went down and up and down and up and down ever since and you know the downsides for these are typically considered AI winters or AI down terms usually AI winters is the term you'll hear and there's three predominant ones and there's a whole interesting history as to why those happened and why we were able to get out

of them again just to reenter another winter but kind of at the point where right now is a little bit better of an inflection point where it seems to be solidifying a little bit more in AI winter isn't really expected the main difference in my opinion for this is kind of the confluence of a lot of data that is available to the people doing machine learning because a lot of data is required to to do machine learning and the ability kind of like over time you know Moore's law has kind of graced us with much more compute power and the ability to actually work on that large amount of data so we simply couldn't scale up before like we can now and in

my mind that's kind of the biggest reason why I I believe that this is going to be a little bit more sustainable and also in in the past you know for the prior winters there has been expectations high expectations but they haven't lived up to them and now we're actually seeing things happen seeing things happen happening using machine learning that are tangibly good right I don't know just anecdotally I recently was looking at the pixel - from Google and around the same time they came out with the Google pixel buds that anyone hear about these Google pixel buds they're like completely blowing my mind so these are headphones that are Bluetooth you know they you listen to

music on them whatever but they can in real time translate languages from other people so you could go to another country put these earbuds on and literally get real-time translations of what people are saying to you and understand them like that is completely driven by Google Translate which is completely driven by machine learning underneath and that the fact that that could work completely blows mind but also validates that this is probably gonna stick around for a little bit because we're actually seeing tangible benefits from it and so you know this has been around for a while like I said the 1950s so there's been a lot of technologies the point of this slide is just to show that there's a lot not to

look at a new specific one and this is this is relevant to kind of the big data landscape so these are the technologies that hold and can manipulate and look at the data that you might be storing but even specific to AI machine learning here's another beautiful infographic I guess that shows all of the machine learning technologies that have been coming on there's a lot of them right so it's been is been growing and growing and coming more popular there's a lot of different vendors claiming that they do this and I can guarantee that all the vendors claiming that they do machine learning are not in this slide because it would be way more no way I could fit

it on on this slide but so another question that I get from a lot of people is that isn't machine learning really just like advanced statistics and I'm going to attempt to dissuade you from that idea using the next couple of slides because although it kind of leverages statistics you know it leverages a lot of math statistics probability income psyphon fundamentals it is not just statistics so if you were to look at something statistically a lot of times you know so I guess this this picture was an illustration of if you asked a computer to randomly generate what a network would look like what network traffic would look like a lot of us I'm sure do

network security monitoring or some level of network-based IDs things like that and can you drive a general idea of what network traffic looks like and in this computer simulation is what's it called like a normal distribution or Gaussian model and you can see you know there's a lot in the middle with this mediocre level of linkages between other systems and then to the left you have lowly linked systems so things that only talk to a sparse number of other distinct systems and then a little bit to the right which is highly linked so that'd be like your Active Directory domain controller or your web server that's constantly getting hit by like everybody but in reality you know we

don't you know networks are not do not look like this they do they're not distributed normally they're distribute a little bit differently they're distributed using which you consider like a power-law distribution so where you have a large amount of lowly linked systems a mediocre amount of mediocre linked systems and then a long tail of highly linked systems and if any of us you know and all of us that have done information security operations before you know that a lot of the strange things or the bad things that happen are kind of considered anomalies or or on the on the edges so you're not typically going to be looking at that far left as much as on the far right and the far

rights a little bit more more difficult to pull out sometimes but if you don't you're gonna miss the pile of bodies right so the pile of bodies is in that long tail and the interesting things are in that long tail not to say that they aren't in the short tail but if you completely ignore them then you gonna be missing important information and then I loved I love this jiff this is this is this next one is one of my favorites because it shows exactly how you can lie with statistics so I'm sure a lot of people have heard of like pkp hacking and things like that but this this shows like exactly what I try to explain to

people when I say the statistics sometimes aren't enough so before I show well actually I'll just go to it right now cuz it'll it should just run yeah great so all of these dots are so for each kind of iteration of this display it's recalculating the summary statistics for these dots we obviously as humans see patterns and these that that is an X right and these are horizontal lines and there's two of them and vertical lines and there's two of them but if you look at the x mean y mean standard deviations correlation things like that they are to a reasonable degree exactly the same so if you were to just to take these different data

sets and run summary statistics against them that circle would look the same as the dinosaur would as the same as the star and the same as the horizontal lines right so summary statistics are simply inadequate at a at a very high level and this would be a kind of a rudimentary way of using statistics but it it is simply an adequate for looking at the data we have right now that that's a that's a dinosaur like come on so so what I'm trying to say is that the techniques that we use matter and so what we have here are these red dots that were generated and the dashed green line you can see is kind of the ground

truth so that's the actual line and these dots were generated with that line in mind and some associated jitter or some randomness added and the other two lines the black line and the blue line are two different ways of doing a linear aggression against the red dots so they're both the same type of algorithm with differences in how that they're executed and you can see that they're drastically different right the the blue one is essentially garbage that is nowhere near correct and not useful but if you but if you wanted to naively use a linear regression technique against these red dots it is possible that you would get that blue line and that's what you could base future you know ideas

about and that would be a bad thing right so the techniques here are really important and so next we're gonna talk we're gonna start getting into the meat of so what are the core characteristics of machine learning I don't remember how my neighborhood bees into but there's about like five or six core characteristics of of machine I mean the first one is the one that anytime you read an article about machine learning you're pretty much going to hit and it's the difference between unsupervised and supervised algorithms so how many people in this room have heard these terms before supervised unsupervised machine learning all right so significantly more than before but still not up to half so

there's there's two kind of distinct differences although there's many more than that there's there's two key differences that I use to differentiate unsupervised from supervised algorithms the first one is that with unsupervised sorry with unsupervised machine learning algorithms you're what you're attempting to do is you're trying to cluster so you can see the circles the kind of outline circles there's there's two groups here right there are two clusters the algorithm is saying absolutely nothing about what those clusters are they're saying that these things are similar and these things are similar but it's not saying that these things are good or bad if these things are good or bad it's saying nothing about that it's just

these are similar I think these are similar and then it would take kind of a human to take the next steps and say are these similar things bad right the difference on the supervised side would be you are actually going to try to say this is something you're going to classify these points of information that you have the data that you've collected you're gonna say you know these blue things are bad and the orange things are good and then the other things are unknown right but you're actually going to put them into a classification and it doesn't have to be good or bad it could be good 80 to 80 percent positive good 90 percent

positive good fifty percent positive bad you know 90 percent positive you could have a ton of different groups depending you can kind of specify how many groups you have you could say I mean one of the things that you might classify is what is this so you have a picture and we'll talk about pictures a little bit later but if you have a picture of an animal the classification might be what type of animal is it right it doesn't have to be you know good we're not necessarily talking about good or bad we're just talked about putting things into applying classifications to things that's that's generalities do and and then the core of them because of this

classification and clustering difference is that for unsupervised you need it typically would use unlabeled data and unlabeled data just means a blob of things that is not typically or not purposefully structured and with supervised algorithms you would need label data which means it is structured so if you guys are familiar with you know like JSON or CSV files or anything like that you know a supervised algorithm would require you know this CSV format right or like the JSON format specifically saying that this you know tying attributes you know they're kind of considered in machine learning McPheeters right like extracting features or identifying features of the data that you have you know this is the height of the animal and this is the

length and this is the weight whereas with unsupervised you're just saying here is a bunch of numbers here's letters here is binary information it is here you go right and then you give it a couple examples of this and you say you know kind of compare them and say are they similar or not but you can't classify them you just cluster them another thing that's interesting about machine learning is that data whoops is that data is necessary to create the program itself so there's kind of traditional programming and machine learning here just trying to illustrate you know with traditional programming you have data and you have a program and then what you get from it is the output that is not

how machine learning works with machine learning you have data hopefully piles and piles of labeled data would be preferred and you give it you know kind of what the output should be and then it's going to give you a program and that program isn't necessarily answer on its own that program that was output by machine learning then needs to be applied to your data again right so you know some of the mechanisms that you might use for this is essentially called training right you're you're training the model you're training the program to get the output that you need and then you can apply that mod or you can apply that program to data as it comes in

there's a lot of different ways to do that which I'd be happy to talk about but I'm gonna leave out for now the other thing is that there's a lot of different choices machine learning is this disjoint term and there's you know there's there's lots of different choices of algorithms to use with the machine learning and some of those algorithms have assumptions in them so this is pretty easy to understand you look at it you say okay it's kind of like a bull's-eye right there is a there's a circle in the middle and there's a ring around it that's easy for us to understand but that's not necessarily easy for machine learning always understand depending on what algorithm

using so there there's a concept of kind of a big word it's parametric or nonparametric algorithms but what that means is that a parametric algorithm assumes a shape and a nonparametric algorithm does not assume a shape if you like I'm gonna fall off this is bad I'm not used to being elevated like this so this is how that would look like so this is me this is trying to apply a parametric linear algorithm against these kind of this bullseye that is not a very good that's not a very good classification to me right the things on the left are definitely not blue and the things on the right are definitely not red right the kind of the background is

showing how it would classify these things if it had been given the choice and it would be extremely wrong it chose a line because it assumed that it was a line right it is assuming a shape that is linear so a nonparametric algorithm would look at the exact same data but it's not going to assume any sort of shape it's gonna do a much better job it's gonna say well there's a cluster of things in the middle and there's a ring of things around the outside so this is the boundary that I'm going to establish for the differences between the two so the point here is that you could say that you're doing machine learning on data

but if it looks like the thing on the left then you're doing it wrong but you can still say that you're doing machine learning and data cuz that's true and so I hope you like this is kind of a great way to illustrate you know what some of the core assumptions are with with different machine learning algorithms so it's not just something you can't just like throw machine lap learning at it and it's gonna work great right you could end up with what's on the left another kind of core component of machine learning is how can you interpret the results of your machine learning so these called explain ability or interpretability essentially and this is a great quote that I found from the

Google brain one of their blogs said we want our machine systems to be explainable and frankly many of them are already more explainable than humans are so I don't know how many of you guys are into behavioral economics but there's a great book called Thinking Fast and Slow from Daniel Kahneman if you're interested in how humans make decisions I would suggest reading that book it is rather thick I did not read it but I did audiobook it twice because it was so good I did it back to back it's really great book and it explains you know how to understand how people make decisions and how flawed that decision-making process is because if you're talking

about implementing machine learning at your company if you want machine learning to be perfect then you're expecting more of it than you are of people there is definitely there are definitely some ways some algorithms that you can choose I'm going to show three examples in a second some some options that you can choose which are explainable that's very easy to say when I put this in this comes out and there are other ones that say that you put something in and then magic happens it is not explainable you cannot say why it made this decision especially at this point in time but this is the result that it gave me and especially with certain types of algorithms like neural

nets that are on that are considered online which means they're constantly evolving if you do if you ask it the question now and you ask it the same question ten minutes from now you may get different answers and that's because the algorithm itself is changing and evolving over time so that's that's kind of maybe make a little more concrete what does explain ability or interpretability mean if you were to visualize it so this first one is a simple logistic regression so this is just like we said before there you have a line and that line delineates the difference between one thing or the other the red dots on the top are considered one thing and the blue dots

on the bottom were considered a different thing and that blue line that goes through them is the difference so if I took this and I and I wanted to know why it said a dot was blue I could just plot it and say well it's below the line so it's blue that's why this is making that decision I can explain why now it might be that the line is incorrect and that's very reasonable and probably going to happen from time to time but you know why you can explain it you can understand another this is a great one this is probably my favorite when it comes to explain ability is a decision tree so this one is another machine learning

algorithm and this is kind of a human example of how you would understand using it you know decision trees will kind of identify where these splits should happen on their own what this arbitrary number is to split data up into sections to then make this tree but once it's established you know exactly what the rules are so I don't even have can read this so it says you know the score of a customer's interest in watches if the score is between 1 and 4 the decisions know between 4 and 7 then you have to then you have to do something else you have to look at their salary is their salary low average or high and if it's average you then need

to look at their age is their age greater or less than some number and that decision tree will get you to the end which is a decision yes or no this is applying machine learning you would apply machine learning to create this decision chart this and then you could automatically apply this to your data and if something were to be given a decision yes or no it's very easy to identify why oh whoa you you know your score was a six and your salary was high so the decision was yes that's why we did this so that might be intuitive and if that's all you had seen about machine learning you'd say well you can explain

all machine learning until you get something tool it looks like this where this is kind of deep learning or a neural network right and so you what you have on the left is kind of your your features your inputs right and what you have on your right the far right there's four there's four little buttons those are called perceptrons but that is kind of your output so in this in this example you have some level of inputs and it's going to give you one of four outputs right one of those four is going to be essentially the ultimate decision for this neural net and you're going to put you're going to take your data you're going to map it to all of those

points at the beginning and it's going to flow through and there are different mechanisms to do this you know you have you know you can just have like a feed for forward neural network where it's only going to go in one direction you can have backpropagation where things are gonna kind of loop back around there's a lot of ways to implement this this is kind of descent one of the simplest but not only does that happen and not only there are a lot of choices but essentially each one of those little dots is its own machine learning algorithm you know that could be each one of those could be a logistic regression and that's a lot of

logistical questions so that makes it complicated but not impossible right but there are some mechanisms that are used in training and using models that get very complicated so there's there's a concept there's a well understood and accepted concept of just randomly removing perceptrons randomly literally randomly like I'm just going to get rid of these ones because I feel like it will help the output generalize better to other use cases I am going to remove connections I am going to have dropout so these ones as they come through will be randomly picked and just not just thrown away just not gonna be factored right so how can you explain something that is randomly messing with you effectively

right randomly removing components and changing components and and you know not not accounting for them and then in addition to that these are evolving so they're changing over time so like I said before just because you did it now and you got answer a doesn't mean if I did it in two hours that I'm also gonna get answer a I could get something different and then I think this is the final one yeah this is the final kind of characteristic of machine learning is understanding that everything has trade-offs so whenever you're training a model you kind of establish a lot of different parameters to that training method and you may under fit you may over fit or you might have kind of a

balanced approach and this this is a great picture I think it's from AWS Amazon guys that illustrates this really really well and so on the left you've got underfitting which is like okay I've got these points I'm just gonna fit a line to it and there you go and on the far right you say well I want this thing to be perfect I want it to be absolutely perfect so you might over fit to it and the problem with overfitting is that it might work well for the data that you're currently training it on but you're not gonna be able to use it for data in the future because it's going to be inaccurate it's

only working you made it work really really well and I thought you gave it but now you can't reuse it for other things so it's not useful I'm put on the left if you make it generalize super well then it's gonna be really uh knack Europe so you most in most cases you're gonna have to do this what they call the bias-variance tradeoff and have kind of a balanced approach but even with the balanced approach you're still gonna have some level of false positives so understanding that pretty much all machine learning is gonna have some level of false positives so I'm a little bit low on time so I'm gonna burn through some of these really quick the

potential benefits of this are enormous I'm not going to show that but I'd love to give it out to people it's a video of how machine learning helps improve traffic and self-driving cars and it's really amazing and here's another example this is kind of a gadget that's used on a Bray bridge to you know connect cables together in different sort of things that is one that was engineered by a person by someone who's probably very good at their job right that is the optimal solution they were able to come out up with but as you kind of edge towards using machine learning to do the same solution you get something that looks like on the right

and that's kind of alien to me I don't know about you guys but I don't think I would ever design something that looked like the thing on the right and at the same time you might say like what does that even accomplish the goals that you had actually it did so it provided a 50% reduction in height 75% reduction in weight and exactly the same structural ability so I know that's not super pertinent to InfoSec but it kind of illustrates what these solutions might look like you might create a solution it looks like the thing on the left that's highly structured and rigid and does exactly the what you know fixing the problem you're trying to solve but on

the right is a much better way to do it but it's so abstract that it would be near impossible or impossible for a human to do on their own and so given all these enormous benefits and kind of how machine learning works let's talk about why it's such a mess so this is a computer vision machine learning algorithm convolutional neural network that classified this thing as a wolf now can anyone guess why this would be classified as a wolf it looks like a wolf sure that makes a ton of sense any other reason maybe why this would be a wolf so I would think the same thing I said what looks like a wolf so it's a

right well that's not that's not exactly how these machine learning algorithms work the reason why it said that this was a wolf is because there was snow in the picture so you know explain ability and interpret II and understanding why it's making these decisions it's a really interesting thing right I put a dog in in the that might have looked like a wolf like maybe a husky or whatever and with snow and that might even be what that is it probably is gonna call it a wolf because it looks like a wolf or because not because it looks like a wolf but because there's snow on the ground how many people have seen have saw this

this post so this was a researcher who made a debug build in c plus plus of hello world and then they compiled it in debug and uploaded it to virustotal and it got this pretty much every machine learning AV flagged it as malicious so so machine learning isn't magic right it makes mistakes and in fact it was in this case the only thing that said that this was that this HelloWorld script was bad and there's there's actually a great post that came out explaining why I think it was in medium I might have it in here but so I'm gonna skip that and just get to the kind of the core assumption that we have

here is that if you're gonna use machine learning you have to have clean data because like we all know garbage in garbage out right you bring in garbage you're gonna get out garbage and so that's a kind of look at an example of computer vision again and say you know if you were trying to make a decision make an engine that made a decision that that said what a Chihuahua looked like there are probably certain things that you wouldn't be concerned about I can tell you for sure for me that I wouldn't be concerned that this engine might consider a muffin a Chihuahua is it you want to the next one there a muffin a

chihuahua I mean I guess they're pretty similar but I wouldn't guess that up front similarly I would not probably think that this would confuse a labradoodle and fried chicken nor would I think that it would confuse mops with sheepdogs although that one is really confusing for me sometimes I have a hard time differentiating so I'm down to the five minute mark so I'm gonna talk to really briefly when it cannot be used and when should it be used and then throw you a bunch of pointers about more things that you can you know more ways you can learn about this so when is machine learning maybe not helpful and I think I've tried to illustrate a couple of those one is

that your data is unique very unique and I guess in addition to this I'm assuming that someone else is going to be providing you this machine learning model so if your data is unique and I mean legitimately unique and someone else give you a model I think we kind of understand that with machine learning data is a part of the program you trained this model using data it's you know k stratified cross holds or cross validation wait so you perform this training this cross validation early on but you're using data to do that and if their data that they trained it on is the is not indicative of your data then this may not help because it's not going

to generalize in that way there is no poisoning concern so if some bad guy is slowly trickling bad things into your environment that's running it through this machine learning algorithm and that this algorithm is online meaning it updates itself then it's going to start out as kind of normal because it's kind of innocuous and ramp up and it's going to adjust it's going to adjust to those changes and consider them normal even as they slowly increment become worse and worse and worse so it's gonna completely ignore that in certain scenarios and this stuff is hard so like I was just saying with the the muffins and chihuahuas that it's not super intuitive so if you don't have really really

skilled people working on this it's very easy to make really big mistakes hence hello world right and I'm not even a hundred percent sure that I would call that a mistake but that is the sort of thing that could happen there's also kind of live intuition and live interaction so if you need to interact with something that's using machine learning on the back end it needs to be intuitive the best example for this I've heard is credit approvals if I increase the salary that the person trying to get credit has I need to be able to give them more credit it needs to be intuitive if you if you you're gonna end up with something that looks like that

alien that that alien connector for the bridges where you know it's gonna be very specific and it'll say you know even though you increased your salary I still think that you should get less I think in this case you should get less credit based off of what I knew it could be very confusing and then if your data formats are constantly changing that might not be a good use case for machine learning because it needs to have things in a standardized format and you can account for that but it can get difficult - deep learning is pretty much magic I'm convinced a lot of it looks like this there's even more examples than I gave earlier boundary erosion

entanglement hidden feedback loops undeclared consumers data dependencies all kinds of different things that could affect your algorithms so it gets complicated but when is it useful I think this is that kind of like the most poignant quote that I've been able to find on this topic is if the environment is complex and feedback is delayed or ambiguous algorithms will generally and relatively consistently outperform human judgment so you know it could be useful when traditional methods are insufficient when you can't do signatures when you care about 0 days when you have focus adversaries and when you've hit the limits of basic anomaly detection and basic statistical information when you need to speed up your first tier triage and if I had to

summarize this entire talk I would say be careful if you're going to do machine learning on their own and a primary use case in security for me is initial triage and suggestions on how to classify not how to solve the problem so now what machine learning is not magic you need to find your company if you're gonna use this find your company's bounds between privacy which is data collection and data analytics you need to have lots of data to do this but you also need to respect people privacy so find where your balance is if you're going to do this if you're interested in doing this in the next few years I would suggest start now and pick

a Big Data Platform start shoving low the data into it and cleaning it up normalize it normalize it and whatever is okay for you so that if you need to renormalize it later into a different format you can programmatically and easily do that but have it stored and then for as long as possible retain your data because being able to go back and look at the history is a good indicator of the future so I don't have any time for questions because my timing is so impeccable that I got it right on like the zero here but I did want to leave you guys with a couple of pointers to more and I'd be happy to distribute these links however

or the slides but there is a Pittsburgh data science meetup group that that meets I think the first or second Wednesday in the evenings or the late afternoons on the North Shore at the Microsoft offices I would suggest if you're looking at a big data platform and we can have a conversation about this if you're interested but Apache Metron specifically for security data is a project that I'm a committer on so you know by us they're a little bit but I think it's a good solution and the Hadoop ecosystem in general and Apache spark is also prevalent and I've also assembled some additional materials and inputs like labs and there is a conference in DC on the 28th that I just

found out about I'll be there which is going to talk about specifically applying machine learning on InfoSec use cases and that additional materials is a gigantic list of white papers and I'm sorry I wish I wish that I actually had the links up there so you could take good pictures of this but I can distribute afterwards links of papers white papers and presentations and recordings of security talks at other conferences and kind of a curated list of good information that I use to learn about this stuff so that's all I've got thank you very much [Applause]

Cutting Through the Buzz: Machine Learning & Artificial Intelligence Jon Zeolla TRISS 2017

Related talks