
Applying Information Security Paradigms to Misinformation Campaigns

BSides Las Vegas · 2019 · Published 2019-08

All right, welcome to BSides Las Vegas 2019. This talk is Applying Information Security Paradigms to Misinformation Campaigns, by Pablo Breuer and Sara-Jayne Terp. Before we begin, we just have a few quick announcements. First off, we'd like to thank our sponsors, especially our Inner Circle sponsors, as well as our special sponsors Amazon, BlackBerry, and Paranoids. It's support from these sponsors, as well as our other sponsors, donors, and volunteers, that makes this event possible. Now, this talk is being live streamed, so as a courtesy to the speakers and to the audience, we ask that you right now make sure that your phone is on silent.

If you have a question, please use the audience mic so that the YouTube audience can hear you. When you have a question, just raise your hand and I'll make sure to bring over the audience mic. And if you have any feedback about this talk or these speakers, there is a survey in the Sched entry for this talk. We are BSides, and we will not be silenced. [Music]

[Music] Thank you.

Ladies and gentlemen, Mr. Pablo.

[Applause] Give it up for the Angry Hobbit dancers. Thank you very much. [Music] That was exciting.

Wow, be careful what you wish for. Okay, so welcome to Applying Information Security Paradigms to Misinformation Campaigns. Today we're going to be talking about mapping misinformation attacks into the existing infosec space. We're going to talk about why this is actually an infosec concern. And are my AV guys around, so they can deal with this feedback?

Okay, we'll try that. Okay, so, a quick agenda: we'll do a little bit of an introduction, and we'll talk about information warfare and nation states. Misinformation and mass influence is something that, like a lot of attacks, started out in the nation-state military realm, and has now been democratized and can be done by anyone.

Hello. I actually have to talk into the mic. Can you hear me now? Good. Yeah, bye-bye Sprint; no, don't buy Sprint. So we're going to talk a little bit about information warfare and nation states: how this got started, how it became more commonplace, and why we're seeing it now. Then, how the influence actually happens: what are the mechanisms that allow for this? We'll talk about some of the mitigations, and the challenges to existing mitigations; we have attempted mitigations, not very well. And then we're going to talk about designing shared responses, and a different way to look at this problem. We'll introduce you to AMITT, our misinformation security framework,

and then we'll talk about how we built that, and the way ahead. So, information warfare and nation states. Information warfare has been around since the dawn of time. Ramses the Second used information warfare against the Hittites, where he would paint these murals about what was going to happen during the war. The problem was that there was no transmission distance: you had to go to where the message was and see it. But go back as far as Sun Tzu and Clausewitz. Sun Tzu said all warfare is based upon deception. Clausewitz said warfare is an act of force to compel the enemy to do our will. It's not to destroy things and hurt people; it's to change their minds. It's

still influence; that's what warfare is really about. So why are we sitting here talking about it at an information security conference? Where's the cyber? Well, if you look at information warfare (and it's a shame that they changed it), the Department of Defense had this joint publication where they explained how the DoD did information warfare, and they had it very nicely broken out into these five pillars: psychological operations, computer network operations (or cyber, if you will), military deception, electronic warfare, and operational security. And really, when you talk to, pardon the expression, the muggles, and they say cyber, usually what they mean is some combination of two or more of these

pillars. So they mean psychological operations combined with computer network operations, or electronic warfare combined with computer network operations, or some combination therein. So when you look at cyber warfare, or information warfare, or offensive cyberspace operations, they're all based on influence. At the end of the day, you want the adversary or the target to make decisions based upon information that you're showing them, that you're hiding from them, that you're changing for them. You want to deny or degrade their information stream so they make decisions that are advantageous to you, or you want to give yourself a leg up, so that you enhance your decision-making process by giving yourself more

information than you allow the adversary to have. Either way, if you're doing it on the internet, it's some form of influence. Very rarely is the end result that you're going for the effect on that box; it's usually on the air-breathing unit at the other end of that box. So let's talk a little bit about nation states. I'm not a political scientist, I'm a computer scientist. Are there any lawyers or political scientists in the room? Oh good, don't hate me. This is the fat-crayon version: Westphalian sovereignty as explained by a computer scientist. Most international law is still based upon the Westphalian model of the nation state, and there are three basic precepts

of it. The first one is that each nation has sovereignty over its own territory and domestic affairs. I'm a country, I'm a nation state; my territory, and the things that happen inside that territory, nobody should be messing with those. Those are mine. The second principle is non-interference: don't mess with my internal affairs, and I won't mess with your internal affairs. And the third one is that each nation is equal under the law, regardless of size. Now I want to skip ahead a little bit, and I don't want to go down a rabbit hole of whether or not the Russian influence attacks actually changed votes or changed the result of the election, but I think we can all agree that there was at

least an attempt to influence those elections. I would say that those elections are internal affairs, therefore sovereign affairs, and had this been done by dropping leaflets into Times Square, as opposed to memes on Facebook, we would be having an entirely different discussion. We would absolutely not put up with that. So part of our question is: why do we put up with it on the internet? Now, nation states do try to influence each other, and if you work in government, or you work in the military, they break it out; they call it the DIME model. These are the big, heavy levers that nation states can pull to influence other nations: diplomatic, informational, military, and economic. You

go, well, that's great, I'm not government, I work for a corporation, why do I care? Well, it turns out that you've got very similar levers if you're a corporation. For the diplomatic, you've got business deals and strategic partnerships. Your informational instrument is your PR and your advertising. Most corporations don't have militaries; however, not all mergers and acquisitions are friendly, right? Anybody that's been offered a buyout by a large corporation like Facebook or Google, and turned it down, and then found out that they have a competing product that's almost just like yours, would think that that is not necessarily a friendly thing. And the last one is the economic: your research and development and your

capital investments are certainly a way that you can affect your business sector and your competitors. So these things do certainly apply. So let's go back to the nation-state thing for a minute. Most of us... anybody here not live in a democracy, or something relatively close to a democracy? Oh good, I'm glad we still kind of agree that in theory we're supposed to be there. So Bruce Schneier actually wrote this great paper about a year ago, where he asked: if you're really going to attack a democracy or an autocracy, what do you do? And it turns out that a democracy requires that we have common political knowledge. It requires that all of the

members of that democracy agree on who the rulers are, that the rulers are there legitimately (they were legitimately elected), and that they understand how the government is supposed to work, and that those things are transparent. The things that we disagree on are things like, well, how much government influence do we want in my retirement, or my medical care, or the types of weapons that I buy. Those kinds of disagreements, that contested political knowledge, is how we as democracies solve problems, and we can do it that way because we all understand how the government works and how leaders were elected. Until you get influence that deals with the elections; now people start to question the

legitimacy of our elected officials. So these are foundational attacks on the things that make our democracies work. So what's different? Why now? I just said that influence has been around since the dawn of man; what's changed? Well, let's go forward from Ramses, where I mentioned he did the mural with the hieroglyphs. Let's go forward to paper and parchment. If you were the church or the state, you would have these learned people that were literate, which was very uncommon at the time, and you could get parchment or scrolls, which were hard to come by, and ink, which was hard to come by, and they could

mass-produce (massive air quotes) messages for you, by hand-copying these manuscripts and then giving them to messengers, and having those messengers ride out by horse or sail and transmit that message. And so transmission was limited: who could transmit the message was limited, and who could receive it was limited, because you could only make so many copies, you could only travel so far, and when it got there, again, literacy wasn't very common. So you fast-forward to the 1440s and you get the Gutenberg press and movable type, and this allows for further mass reproduction. It takes a while to set up the press, but once you set it up you can mass-produce things quickly. But they

still have to be carried, and these presses are still expensive. Literacy is becoming a little bit more common, so more people can receive the message, but the number of people that can mass-transmit is still very small. Move forward again to the telegraph. The telegraph requires infrastructure, so now you need to be within range of a telegraph station, you need to convince that telegraph station to transmit for you, which requires specialized knowledge of Morse code, and then, this is key, your recipient has to know to go get the message. The Pony Express didn't exist, right? Brown didn't help you; they didn't come by with your telegram and just hand

it to you at your door. You had to go to the telegraph station. But now you can transmit at near the speed of light over long distances. You go to radio, the Marconi radio, and now the Marconi radio allows you to transmit at very long ranges, and it requires no specialized knowledge to receive the message: you just need a radio, and to tune it to the frequency, and you can get it in your home. Now, something interesting happened about this time. Let me go back a second to the Gutenberg press. One of the things that the church and the state probably didn't account for is the fact that once the

press is out there, yeah, it's expensive, but I'm guessing the Catholic Church didn't foresee Martin Luther printing his 95 Theses on a Gutenberg press and nailing them to all the church doors, right? When Marconi invented the radio, and people listened to the news on the radio, they probably didn't foresee that a lot of people would mistake The War of the Worlds for actual news, and panic. So we fail to account for the fact that these mass transmission mediums can be misused, abused, or misinterpreted. So we move forward from the radio. You now go to television, and for the first time you can now transmit not only the spoken word, but you can actually transmit pictures and sounds.

And really, up until the mid-90s, this was the predominant way to reach the public. Now, if you wanted to reach the American public, if you were an American living in the United States in the 1980s, you can't just walk down to your television station and go, well, you know, I'd like to do a news broadcast. It doesn't work that way. You'd better be somebody like the President of the United States, who says, look, you're going to put me on at seven o'clock because I'm doing a presidential address, or I'm going to pull your FCC license. And so again: wide audiences, you can reach into people's homes, it requires no specialized knowledge, they don't need to know what frequency

the message is coming in on; but you still have to be somebody of import to transmit. So what's different now? What's different now is that social media has democratized talking to the masses. You have to be no one of import to get a message to a mass populace. We live in a world where Katy Perry, God bless her, has roughly twice the number of followers as the President of the United States, and 50 times the number of followers of the Prime Minister of Britain, and she doesn't have to answer to anybody before she reaches her 107, almost 108, million followers right now. And so that's what's different: we don't have authoritative sources anymore, and anybody can transmit to a

mass populace, and they can do it instantly. So what's really going on here is that powers that used to belong to the nation state now belong to the individual. So the good news is: the internet, as conceived to give everybody a voice, worked. The bad news is: the internet, as conceived, gave everybody a voice. So now that we understand that we can all transmit, how does that actually work? How do these mechanisms of influence happen? Well, it turns out that if you're going to be on social media and you're going to transmit mass influence, you need certain resources. You need certain types of accounts, and this is not all-inclusive, these are some basic types. So the first one that

you've got is bots. Bots are relatively simple, stupid accounts. They don't create content; what they do is amplify content. They like, they retweet, and they otherwise send out your content. Parody accounts are exactly that: they're parodies of real people or real organizations. They are not intended to misrepresent the actual entity, they're not intended to deceive you into thinking that they're the actual entity, but it happens. The next one up is a spoof. These are intended to somewhat fool people into thinking they're the legitimate account. I think it's funny that the president's Twitter account is @realDonaldTrump, and it's @realDonaldTrump because if you put in Donald Trump on Twitter, go

have fun going through the accounts; there are some really entertaining ones there, right? So those are intended to actually have you believe that they're the real person. Camouflage accounts are used to infiltrate certain groups. So if there's a group whose narrative you want to change, pick any group, you know, the Scouts of America, then you create an account and you camouflage as a member or an interested party, so you can get invited to the chat groups, the news groups, the information exchanges inside, so that you can take in the narrative and hopefully steer the narrative in some way. Deep cover accounts should never, ever be discovered if they're done correctly. Those are very, very time-intensive.

They're very expensive. You can't just go out now and create a brand-new persona on the internet, because somebody's going to go and do a Google search and go: this person didn't exist six months ago, and they're telling me that they're a 35-year-old that's been working in government forever? It just smells fishy. So these take a long time to create; they require substantial knowledge and substantial resources. But the most dangerous one is the takeover. The takeover is when the legitimate account of a legitimate person or organization is taken over. In 2013 the Associated Press Twitter account was hijacked, and somebody tweeted out that there had been a bombing at the White House and President

Obama had been injured, and the Dow Jones fell so precipitously that it tripped the circuit breakers; they had to suspend trading. And that was 2013. So those are the really dangerous ones. So now I've got these accounts; what do I do with them? Well, we say that there are five tactics; we call them the 5Ds: distort, dismiss, distract, divide, and dismay. So I'll walk through these and give you some very brief examples; hopefully they're not too abrasive or disturbing. Distort is when you take a fact and you distort the actual fact: no, no, the Russians aren't invading Ukraine, we're freeing and protecting ethnic Russians. Dismiss: you're presented with a fact, and you just

outright dismiss it; don't deal with it. China famously uses this all the time. They're routinely accused by the United States of stealing intellectual property, and of industrial theft and espionage, and their standard response is: not only do we not do that, but we are the poor victims of American aggression and hacking; you are the greatest offenders out there. Distract is: you don't deal with the narrative presented to you, you create a new narrative. MH17 was broadly reported as shot down by Russian missiles. The Russians didn't even address it; what they said is: wow, I wonder why a commercial airliner was flying over a combat zone, right? Divide: you take the population, you

divide them into two polarizing groups, and you have them fighting with each other. If they're fighting with each other, they're not paying attention to what you're doing. And the last one is dismay. Those are ad hominem, personalized attacks, and those attacks are so personal that you can't even address them; trying to address them, and saying that the attackers are ridiculous, would lend them credence. So who remembers the Pizzagate scandal? Right: yes, the upper elite of the government have a secret sex dungeon in the basement of a pizza parlor. It's like, I can't even address that, right? Just by addressing it, I'm lending credence to the attack and the person making the attack.
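As a rough sketch, the 5Ds can be treated as a small labeling vocabulary for annotating observed messages during incident analysis. The enum and the annotation helper below are hypothetical illustrations, not part of any official taxonomy; the message ID and notes are made up.

```python
from enum import Enum

class Tactic(Enum):
    """The 5Ds: the five influence tactics described in the talk."""
    DISTORT = "distort"    # twist an existing fact
    DISMISS = "dismiss"    # flatly deny the fact
    DISTRACT = "distract"  # change the subject with a new narrative
    DIVIDE = "divide"      # split the audience into warring camps
    DISMAY = "dismay"      # personalized attacks too toxic to answer

def tag_message(message_id: str, tactic: Tactic, notes: str = "") -> dict:
    """Attach a 5D label to an observed artifact (hypothetical schema)."""
    return {"artifact": message_id, "tactic": tactic.value, "notes": notes}

example = tag_message("tweet-1234", Tactic.DISTRACT,
                      "responds to MH17 reporting with a new narrative")
```

Labeling artifacts this way is what lets separate analysts compare notes on incidents in a shared vocabulary, which is the point the speakers return to later with AMITT.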

So, what are mitigations? The only defense against these kinds of things is to really understand why they happen and how they happen, and start working your way down the line and seeing what things you can affect, to break that kill chain, or break that process. So the first way this has historically been done is by using fact-checkers, either manual or automatic. Everybody here, I'm assuming, is familiar with either PolitiFact or Snopes, right? We've all probably used those; those are manual. The automated ones work in similar ways: they take a purported fact, they split it up into a triple,

and then they use one of two models, either an open-world or a closed-world model. And here's the difference. In an open-world model, you can introduce new facts, and they're assumed true unless you later find out that they run afoul of another accepted fact. In a closed-world model, you assume that all new purported facts are false until you can verify them against previously approved facts. Neither of those models is ideal, right? In infosec we would call these whitelisting and blacklisting, and they're not ideal, and they definitely don't deal with things like editorials and satire, and God knows we never find any of those things in the news when we turn on the

TV, or on Facebook, or on Twitter, or on any of the social media. So this totally works. So, moving on to social media, you can look at things like propagation-based detection: hop-based cascade and time-based cascade. So if you look at the graphs there, the top graph on the left is fake news, and the top right one is confirmed news. Anybody want to hazard a guess as to why the fake news has multiple peaks and the real news has only one peak? Lots of shares. Lots of shares by whom? By botnets, right. So we talked about those bots that periodically re-amplify messages, and so that's what ends up

happening, right? Real news comes out, it comes out in the news cycle, it's accepted as fact, we see it, everybody understands it, and it goes out; we're on to the next news cycle. Fake information is periodically re-amplified, because they want to keep it in the public consciousness. Now, because it's periodically re-amplified, what ends up happening is that every time it gets re-amplified, it reaches broader and broader networks, and so that actually affects the cascading. So if you go to the bottom graphs, the green line there is real news and the red line is false news, okay? And what ends up happening is you end up reaching a broader network with the fake news, again because every

time a new bot re-amplifies, you reach new parts of social media, and new accounts that haven't seen it before. So it's fairly easy to tell these things apart. The last model there is the epidemic diffusion model, that little flow chart there. That is a model that comes from tracking infectious diseases, and so rc is your rate of contact with false information, rf is the rate at which the subject is infected, and the third rate is the rate at which they're cured. Now, this is a nice, simple model, but the problem is that we really don't understand the catalyst that takes somebody from "okay, this time that I'm in contact with false information, I'm actually going to believe it" to becoming infected. We

don't understand why that transition happens, why that phase change happens. So I think most people can understand the model, but it's arguable how useful it is. So when you take a look at these, they all kind of fall short. We now have more devices on the internet than we have people on the planet. The rate at which we create information is huge; the internet-minute infographic gets worse and worse every year. And really, you want to determine if something is false news ideally before everybody else sees it, but you're never going to be able to analyze the data and verify it as quickly as it's created. You're just not. So the speed of analysis is a problem. The

computational power that's required is a problem; you'd need a whole new internet, plus some. There's a lack of a common framework, or at least there was; we'll get to that shortly. There's a lack of understanding of emergent characteristics. What do I mean by that? We understand fairly well, in certain circles, that hey, after your 15th or 16th tweet, you're likely to believe something. What we don't understand is how much more you believe it after your 16th tweet, your third YouTube video, and your fourth Instagram post. We don't understand that. There's also a bit of cognitive friction and cognitive dissonance, and here's the difference: misinformation works because you're already biased to believe it.
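As a toy illustration of the epidemic diffusion model mentioned a couple of paragraphs up, here is a standard SIR-style simulation in which "infected" means believing a false story. The population size and the rate constants are made-up values for illustration, not fitted parameters.

```python
def simulate_belief(pop=10_000, contact=0.3, p_infect=0.5, p_cure=0.05, steps=100):
    """SIR-style sketch: susceptible -> infected (believes the story) -> cured.

    contact is the per-step rate of exposure to the false story,
    p_infect the chance an exposure converts to belief,
    p_cure the per-step rate at which believers are debunked.
    """
    s, i, r = pop - 1.0, 1.0, 0.0  # start with a single "patient zero"
    for _ in range(steps):
        new_i = contact * p_infect * s * i / pop  # exposures that start believing
        new_r = p_cure * i                        # believers who get debunked
        s, i, r = s - new_i, i + new_i - new_r, r + new_r
    return s, i, r

final_s, final_i, final_r = simulate_belief()
```

The missing piece the talk calls out, why a given contact flips someone from exposed to believing, is exactly the `p_infect` term, which this sketch simply assumes to be a constant.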

Right: if you fall for misinformation, you are probably already leaning that way. And so, once you believe it, convincing somebody that what they believed is not correct is hard; that's the cognitive friction. And on top of that there's the cognitive dissonance, because they don't want to believe what you're now telling them is true, so it just doesn't sound true, it doesn't ring true. Those things are very hard to get over. So far we've been talking about social media, and we've talked about leaflets. Anybody familiar with this copy of the Washington Post? This was fascinating. This was an actual printed newspaper; it was actually distributed at Union Station in Washington, DC, this past

December. You could find it at the newsstands, at the coffee shops. It is not a real Washington Post; it was a piece of psychological propaganda. It's the first time since World War II that the US populace has been the target of physical psychological-operations products. It was put out by the Orange group. They did leave little clues in it that it was not the real Washington Post, but it was convincing enough that the actual Washington Post felt compelled to release a story, put out via their official social media, that this was not them, and they've since gone back and sued the group. So if people are falling for this, how's it going to work when we get to

deepfakes, right? You talk to the average citizen, and you show them a video, and they're inclined to believe that video, and you want to tell them: here are all of the metadata reasons why I can tell that that video is fake. What they're going to tell you is: I know what I saw, I know what I heard, and you're a government shill, or you're a shill for whatever group, and you're trying to fool me. So this problem is really only going to get worse. Okay, so what does the data scientist think? So, I would love to say we have this one-ring-to-rule-them-all, beautiful data-science solution. The answer is actually no. This is

a cross-platform, across-the-world, huge community problem. It's going to take a big, joined-up response to solve it. Do I have a clicker? Cool. So we're going to need to build communities. We spoke last year about the problems; this year we're going to talk about the things we've done. First thing we've done is we've created co-opted adaptive communities. We've got some MisinfoSec people in the room; David, you're one of them, anyone else? Okay, I think we've just got David in here today. But we went out, we looked for people who were working on this boundary between misinformation and infosec: people who were applying infosec principles to misinformation, people who

were looking at it in that light, and we threw them all into a channel together and got them talking to each other. I was one of the people at the founding of the Credibility Coalition, which is the standards body working on the standards for describing misinformation. We built a working group within that for applying infosec principles to misinformation. The leads in that, just to give an idea of the variety in there, are an information operations person, a data scientist, a warfare specialist, and a social scientist. Pablo and I are both part of People-Centered Internet, who are working on misinformation at the government level. Then there are the CSIRTs and the

(help me here, the international association, thank you, yeah) the ISAOs, the bodies which are response bodies for cybersecurity incidents. So we're building communities on the left to deal with misinformation that include all of the people from the right, because these are all the people you need. This isn't advancing; sorry. So the reason we're doing this is because we need not just to admire the problem. There are lots of people looking at misinformation events, and again, one of the problems is the words: we started talking about incidents, and going, oh look, that's a nice incident, and not actually responding to it. So we need people to join up. One of the existing

bodies is the ISACs, the information sharing and analysis bodies, and the ISAOs, which are the ones that the president doesn't have to sign off on. So we're talking about (sorry, sorry, I won't touch it again) misinformation ISAOs, response bodies. But actually it's better to talk about cognitive security, because instead of focusing on the problem, you want to focus on the thing that you want to secure. You want to secure these groups, right? You want to secure the endpoints, which is people, communities, the things that are being attacked. And you also want to feed back to things like the financial sector, things like

agriculture, all of the other communities that get affected by misinformation, because it's not just misinformation for its own sake; it hits everywhere. And the reason you want to do this, again, is not just to admire the problem: you actually want to respond to it. You want to start talking about, okay, I've seen an incident, I've seen the units of this incident, I've seen the parts of this; how can we start responding, how can we start being resilient to this? This is starting to sound familiar; this is starting to sound like infosec. We kind of thought so too. So one thing we looked at was all the different views. We found people who were looking at

it as information security, so people like Dan Gordon, people like the Grugq, people like Danny Rogers, and we found people who were looking at it as an influence operations problem, so people like Lin, people like Pablo; yeah, I know, you're at the top end of it. We found people on both sides. So Pete Singer and Gerasimov, US and Russian, were seeing it as conflict. Most people talking about misinformation were really seeing it as a social and political problem: you hear people talk about this as a political problem, and originally we talked about it as a media problem. It was like people were talking about fake news as though it was

just like news pollution. But they're all talking about the same thing, just from different angles, and that leads to the second problem, which was: there was no common language. So, A, we threw all those people in the same place and got them to talk to each other, and the second thing is, okay, let's start building language. So we've talked about incidents; we talk about campaigns as the longer-scale, one-year, two-year things (the 2016 US election work is a campaign), but within it you get things like Pizzagate, which is a small-scale incident, and within that you see artifacts, which are the messages, the users inside that. And

there's a lot of argument about what misinformation actually is versus what disinformation actually is, and that's fine, we can have all those discussions, but we've got to go do stuff. We don't have time to argue about definitions right now; we've got to go respond and fix. So we've just put up a working definition and got on with it. And the things we need: we need this lingua franca, but we also need to start those defenses. We need to counter-move, to counter-move against a reused technique, for example. We've seen this: hack, grab a document, adapt the document, leak it; used a few times, not very well, in France, which is

wonderful. We see people building defense tools; we'd like to know if they actually work, so let's start assessing those. Let's start worrying about the next thing. So far a lot of the stuff we've seen has been pretty dumb; it's dumb, but it works. We haven't really seen much in the way of machine-learning adaptation, the type of MLsec attacks; that's going to come, and I'm expecting 2020 to be quite exciting. I'm very much on the narrative level at the moment: people are still talking about messages, artifacts, misinformation at that level. What's really happening out there is narrative warfare. People are fighting in terms of the stories that people have as their groundings. They're fighting on

that level, they're fighting where memes are fighting with stories, and unless we're working at that level, we're going to lose. And the things that we need behind this are those languages, those common languages: if we're going to join up, if we're going to join communities, if we're going to have these joined-up responses, we need to be able to talk to each other across all of those communities. And the first piece of infrastructure we need is frameworks. So the first piece we built was a framework structure. So meet AMITT. This is the pyramid; I've already mentioned campaigns and incidents and narratives and artifacts.
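One way to picture that hierarchy: a campaign contains incidents, incidents carry narratives, and narratives are pushed through individual artifacts. The sketch below is a hypothetical data model for the pyramid, not the actual AMITT schema; the campaign, incident, and artifact names are illustrative.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Artifact:
    """A single observable: a message, an account, an image."""
    kind: str
    ref: str

@dataclass
class Narrative:
    """A story being pushed or adapted."""
    summary: str
    artifacts: List[Artifact] = field(default_factory=list)

@dataclass
class Incident:
    """A small-scale event inside a campaign (e.g. Pizzagate)."""
    name: str
    narratives: List[Narrative] = field(default_factory=list)

@dataclass
class Campaign:
    """The long-scale, one-to-two-year effort."""
    name: str
    incidents: List[Incident] = field(default_factory=list)

campaign = Campaign("US2016", [
    Incident("pizzagate", [
        Narrative("elites run a secret ring",
                  [Artifact("message", "tweet-001")])
    ])
])
```

A designer works down this structure from the campaign; a responder, as the talk goes on to explain, has to work up it from the artifacts.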

Let's talk about what these things are. This whole pyramid is what somebody designing a misinformation campaign sees. They're building this longer-scale thing; they might be what Clint Watts calls an advanced persistent manipulator, by analogy with an advanced persistent threat. They have a target, probably a target country, and a thing they want to do. They might be running a charm campaign, like China; they might have a goal of weakening another country; they might have a secondary goal, and they might have useful idiots on that secondary goal. For instance, if you are weakening a country through its vaccination scheme, there are certainly plenty of useful idiots in there. So you have this longer-scale campaign, and you build these smaller-scale incidents within it.

Then you use the narratives of that population: you adapt those narratives, you target those narratives. You've just seen, with the recent shootings, that there were original narratives and then suddenly counter-narratives being pushed in, taking the narratives people already held and adapting on top of them. Underneath that you'll see the artifacts: individual messages, bots, users, those useful idiots coming through. As a designer, you see all of that. As a responder, you start from the bottom; info ops people come from the top, data scientists come from the bottom. So you're going to see the messages first. You might see the bots; you might be lucky enough to see unusual activity. The days of really dumb bots posting all the time at wonderfully fast rates, practically screaming "hey, I'm a bot, look at me," are pretty much over. We're looking for the subtle things now, the subtle anomalies, and you're going to have to fight your way up from that artifact level to the narrative level, up to "okay, something's happening." This is the same fight we have in MLsec, and very similar tooling is needed. So this is where we are: you have attackers working from the top down, defenders from the bottom up, and

the third set of people you don't see in here: the endpoints, the targets, the transmission medium. Keep that in mind as we go through the next part. So it's a problem, but infosec has things we can use for this. We needed to build response, and build it fast, so we looked around and we found frameworks, and we found STIX. There's already a set of messaging formats that connects those top-level, info-ops, incident-level entities to those bottom-level, data-science entities, so great, we can use that. And there are frameworks that already use it, so we found a bunch of stage-based models and settled on the cyber kill chain, which already uses STIX: the cyber kill chain with the ATT&CK framework underneath it.

In fact we looked at a whole bunch of frameworks. We looked at radicalization models, which are really useful: they show you how people get radicalized. We looked at some of the existing models starting to come out for misinformation, and we took a bunch of them apart. The ATT&CK framework was useful because it has this idea of stages, but also this idea of techniques for each stage; you can pull out individual techniques. Take phishing, for instance: if you say "phishing," you know what it is. You don't have to give a long-winded explanation of "this thing I've seen that involves email"; you just say "phishing," and that shorthand for the thing is a shorthand for the responses you've seen to the thing, a shorthand for who probably does it, and you can get on with actually responding to it. We want to do that with misinformation.

So how do we build this thing? We went and looked at existing campaigns and existing incidents, and we also looked at failed attempts. France has been wonderful, because France is incredibly resistant to this, especially to Russian attempts; they just kind of don't take, there's a cultural mismatch there, which is great, because failures tell you stuff. We found about sixty-something different incidents. Our first problem was that there wasn't a master list of misinformation incidents, and there certainly wasn't one in a standardized form, so we built one.
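A standardized incident record is the kind of entry such a master list needs. Here is a hypothetical sketch; the field names and IDs are ours for illustration, not the official misinfosec schema, and the technique labels are paraphrased:

```python
# Hypothetical standardized records for a master list of misinformation
# incidents. IDs, field names, and technique labels are illustrative only.
INCIDENTS = [
    {
        "id": "I00005",
        "name": "Columbian Chemicals",
        "year": 2014,
        "summary": "SMS and social posts claim a chemical-plant fire; panic ensues.",
        "techniques": ["SMS spam", "Fabricated news sites", "Sock-puppet accounts"],
    },
    {
        "id": "I00017",
        "name": "Macron leaks",
        "year": 2017,
        "summary": "Hack-and-release ahead of the French presidential election.",
        "techniques": ["Hack and release", "Amplification via bots"],
    },
]

def incidents_by_year(records):
    """Index the master list by year so analysts can scan the timeline."""
    out = {}
    for rec in records:
        out.setdefault(rec["year"], []).append(rec["name"])
    return out

print(incidents_by_year(INCIDENTS))  # {2014: ['Columbian Chemicals'], 2017: ['Macron leaks']}
```

Once records share one shape, cross-incident queries (which techniques recur, which years cluster) become one-liners instead of archaeology.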

And we picked out 22 of these and pulled out all of the techniques we could find in them, so we've got a catalogue. Here's one of the incidents, one of the really early ones. The first one I saw was 2010 (some of the early Russian tests were 2010), but this is 2014, and it's a really simple one: one day, a bunch of people woke up in an area with a bunch of chemical factories, with a message on their phones saying there's been a fire. Panic. That's Columbian Chemicals. So we built formats for describing incidents, and we built formats for techniques; this is one of the technique sheets.

So now we can talk about paid targeted ads: what it is, when it was used, who uses it, which incidents it was used in. We know it was used in Brexit. We have a GitHub repo with the latest version of the AMITT framework in it, and this is just one of the technique sheets from the repo. You link this together, and the way we pulled it together was top-down ("let's look at what we think the usual suspects are doing") and bottom-up (what we've seen in terms of artifacts and techniques), to build out this thing called AMITT. You're not supposed to read this slide; it's there so you can go and see everything in gory detail later. The top two lines are the big important ones. The blue line is the stages; this is the equivalent of the cyber kill chain, and the next slide shows them in more detail. These are the twelve different stages we think somebody creating a misinformation campaign will go through. The four above that are an aide-mémoire for ourselves: the four phases we think those stages belong to. And underneath, the grey lines are the techniques we found in those 22 campaigns, plus a few we realized we'd missed when we ran through example campaigns. So: phases and tactics.
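That phases-to-stages-to-techniques layering can be sketched as a small matrix, in the same spirit as ATT&CK. The phase, stage, and technique names below are paraphrased placeholders, not the canonical AMITT entries (the real framework has twelve stages):

```python
# Illustrative AMITT-style matrix: phases -> stages -> techniques.
# Names are paraphrased for the sketch, not the canonical AMITT labels.
MATRIX = {
    "planning": {
        "strategic planning": ["Define target audiences", "Set campaign goals"],
        "objective planning": ["Select incident objectives"],
    },
    "preparation": {
        "develop people": ["Recruit useful idiots", "Build troll backstories"],
        "develop networks": ["Set up botnets", "Create sock-puppet accounts"],
        "microtargeting": ["Buy targeted ads"],
    },
    "execution": {
        "pump priming": ["Seed distortions into fringe communities"],
        "exposure": ["Amplify via bots", "Push to mainstream media"],
    },
    "evaluation": {
        "measure effectiveness": ["Track engagement metrics"],
    },
}

def techniques_in_phase(matrix, phase):
    """Flatten every technique catalogued under one phase."""
    return sorted(t for stage in matrix[phase].values() for t in stage)

# The planning and preparation phases are where traces appear before
# an incident is widely visible.
print(techniques_in_phase(MATRIX, "preparation"))
```

Structuring it this way means "show me everything attackers do before the public sees anything" is a dictionary lookup rather than a literature review.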

Part of the problem is that most people looking at misinformation campaigns only look right of boom. So, left of boom and right of boom: right of boom is after an attack is widely visible, in this case after it has hit the general public. That's when most people see the artifacts, and that's where most of the analysis has happened. But that's not where you really want to stop these things; you want to stop them at the planning stages, to the left of boom. On the left side you've got planning a campaign, and you've got the preparation work: developing people, things like finding your useful idiots; setting up your botnets; setting up the backstories for your trolls; and then microtargeting, things like the ad-network stuff. Most of this leaves some form of trace, so how do we look for those traces, and how do we stop it at that level? So we've got a bunch of work going on left of boom, and other pieces are getting missed, things like measuring effectiveness: if you run one incident, you're going to run more incidents, so how do people do basic measures of effectiveness, and how do they rerun?

So that's phases and tactics, and this is where to go find it: misinfosec.org is where we're hiding out, and there's an issues list, so if you see stuff you want to add, add it in. We've put a CC license on it, so you can just pull it and use it. And I think this is where I hand it back to you.

Okay. So now that we've told you this is all horrifying and broken and you want to run and hide, we feel we should give you a little hope: where do we go from here? We've created this framework. It was a small coalition of the willing, and we'd like to get more of you involved: take a look at what we've built, add on to it, disagree with us, help us fix it.

We want to grow that coalition. SJ and I are leaving here at the end of this week and heading up to DC to help various entities stand up a cognitive security information sharing and analysis organization. As that gets announced, be on the lookout for it; convince your companies, your businesses, your corporations to join and to share threat indicators so that we can get a handle on this. And contribute at misinfosec.org: we want to continue to build that alert infrastructure. It's easy for us to take a kind of US-centric view, but not everybody takes their cues from the US; certainly Europe, Asia, Africa, the rest of the world need this, because we're not the only targets. It would be great to have an international consortium that could do this.

We also need to refine the TTPs and the framework. TTPs are tactics, techniques and procedures. We looked at 22 scenarios; we certainly don't know all of the campaigns that have happened, and all of you have access to campaigns, so I'm sure we missed stuff. The really important work happens when you find not only the gaping holes, but when you get into those deeply held, irrational, religious arguments over whether a word means this thing or that other thing. I'm a simple person: if it doesn't exist in the 8-box of Crayola crayons, I don't recognize it as a color. So we need some help there. We've got a response meeting coming up soon, where we'll be talking about response at the technique level, the tactical level and the procedure level. And for those of you already sharing threat indicators using the MITRE ATT&CK framework, we are building schemas for STIX and TAXII so that you can actually share these indicators among the various information-sharing and intelligence-sharing communities. And just for the back end, because this is the data science track: there are some really good data scientists out there working at that artifact level; you should follow them, because they know what they're doing. And, because there are always references: this is us.
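The STIX idea, one format connecting campaign-level and artifact-level entities, can be sketched with plain dicts shaped like STIX 2.1 JSON objects. A real deployment would use the python-stix2 library and the misinfosec schemas the speakers describe as in progress; the objects below are hand-rolled and illustrative:

```python
import json
import uuid

def stix_id(obj_type):
    # STIX 2.1 identifiers take the form "<type>--<uuid>"
    return f"{obj_type}--{uuid.uuid4()}"

# Top-level, info-ops entity: the campaign.
campaign = {
    "type": "campaign",
    "id": stix_id("campaign"),
    "name": "Columbian Chemicals hoax",
}

# Bottom-level, data-science entity: a catalogued technique.
technique = {
    "type": "attack-pattern",
    "id": stix_id("attack-pattern"),
    "name": "Paid targeted ads",
}

# The relationship object is what joins the two levels in one shared format.
rel = {
    "type": "relationship",
    "id": stix_id("relationship"),
    "relationship_type": "uses",
    "source_ref": campaign["id"],
    "target_ref": technique["id"],
}

bundle = {"type": "bundle", "id": stix_id("bundle"), "objects": [campaign, technique, rel]}
print(json.dumps(bundle, indent=2)[:80])
```

A bundle like this is exactly the kind of payload a TAXII server would distribute to sharing communities, which is why reusing the format beats inventing a new one.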

[Applause] So this is pretty cool. I think it's awesome to be able to plug into the STIX/TAXII framework and maybe pull this stuff into tools like Anomali. One of the questions I have is: it's awesome that we're doing this, awesome that things like the Multi-State ISAC are going to have eyes on this, but what can government actually do? Because we have a First Amendment, and nobody trusts the government anyway when they say things are true or false. So if you're sitting in a government position trying to deal with these threats, what do you actually do with the information?

Yeah, that's an excellent question. The short version of "what does government do to solve the problem?" is, speaking for myself and not for my employer: the government doesn't solve this problem. Industry, the community and the citizenry solve this problem. What the government can do is foster those relationships, provide resources and ways for the communities to have those conversations and to share that information, and provide intelligence and analysis, so that the people who can address these problems do address them. Some of these things are relatively easy to address. I'll give you a very simple one, and it's not a panacea, there are lots of problems with this solution, but it gets at one of the problems with the internet right now. This happened this past December: an ultra-right-wing blog had taken pictures legitimately shot by the Associated Press at the protests in Paris. Specifically, they took a published picture of a huge bonfire and a picture of a much smaller, trash-can-sized fire, and the narrative the blog told was: this is the left-leaning news trying to make a mountain out of a molehill, these are both the same fire. The Associated Press handled it exactly right: within a couple of hours, on their official Twitter, they posted, I think, about 19 tweets with the pictures, saying here are the original pictures, here are the original stories the pictures went to, and here's an analysis pointing out differences to show that these pictures were taken at different times in different places. And the reason the right-wing blog could do this at all is that when you look at media on a website, you don't know who actually published it; you don't know the original narrative. Wouldn't it be great if we could digitally sign our media with a cert, so that I could right-click on a picture and confirm, yes, this was taken by the Associated Press? And if part of that signature were the hash of the original URL, I could go back, read the original story, and ask: does the narrative I'm being told now match the original narrative? If yes, great; if not, well, the Associated Press signed this one and the other outlet did not. Or maybe they both signed it; maybe it was the Associated Press and Fox News telling you different narratives.
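The signing idea can be sketched in a few lines. This is a toy: a real scheme would use X.509 certificates and asymmetric signatures so anyone can verify without the publisher's secret; HMAC with a shared key is used here only to keep the sketch runnable with the standard library, and the key, URLs, and image bytes are made up.

```python
import hashlib
import hmac

PUBLISHER_KEY = b"ap-demo-key"  # hypothetical; a real publisher would hold a private key

def sign_media(image_bytes, original_url, key=PUBLISHER_KEY):
    """Bind the image to the URL of the story it was originally published with."""
    payload = image_bytes + hashlib.sha256(original_url.encode()).digest()
    return {
        "url_sha256": hashlib.sha256(original_url.encode()).hexdigest(),
        "signature": hmac.new(key, payload, hashlib.sha256).hexdigest(),
    }

def verify_media(image_bytes, original_url, sig_record, key=PUBLISHER_KEY):
    """Recompute the signature; any change to image or claimed URL breaks it."""
    expected = sign_media(image_bytes, original_url, key)
    return hmac.compare_digest(expected["signature"], sig_record["signature"])

img = b"\x89PNG...bonfire"
record = sign_media(img, "https://apnews.com/original-story")
print(verify_media(img, "https://apnews.com/original-story", record))       # True
print(verify_media(img, "https://fake.example/recaptioned-story", record))  # False
```

The point of hashing the URL into the payload is exactly the speaker's proposal: a recaptioned copy of a genuine photo fails verification, because the claimed source story no longer matches the one that was signed.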

I'm not going to tell you which one is true. What I'm telling you is that you, as a consumer, can now make an informed decision about whom you want to believe. So we can enable those kinds of things. Hopefully that answered the question. Any other questions?

The Chinese government, as an example, seems to be really good at dealing with what they would consider disinformation, especially at utilizing their citizenry to help. Do you have any thoughts on whether examining the Chinese model and methodology would be useful in the American space?

All right, I'll take the first shot at this: it won't work. And it won't work because when you talk to the average Chinese citizen, they legitimately believe that everything the Chinese Communist Party does is to protect them from bad outside influence. As Americans, we historically have a healthy distrust of our government, so if the government tried to implement that, we, me included, would immediately shout censorship. There's social background and cultural context that goes along with this: you have to understand the ins and outs of the society you're trying to influence or protect. Solutions that work in China won't work here; solutions that work in France won't work here, and vice versa. You really have to analyze. Now, if the same solutions were put forward by freedom-of-the-press organizations rather than the US government, you might find they had very different receptions.

Right, that's what I meant: not from the government, but a similar type of thing.

So this is my side of it. I actually ran a prototype elf campaign a couple of years back, using Australians, who have a hell of a sense of humor. So there is a place for citizen-led elf campaigns, but you've got to work at keeping people safe, so there's infrastructure to put in place, and we figured this framework was the important piece of infrastructure to build first. There's definitely a place for it, and some of it exists already: PolitiFact does some of these things. It just needs to be bigger and broader, and to address more than just the latest presidential campaign or the latest debate. Let's go over here. Oh, I'm sorry, you've got the microphone, you pick them.

So it was mentioned that the real impact of this is in meatspace, in the human mind. Do you have any recommendations on hardening that attack surface? Is there a way to hack my next-door neighbor into being more of a skeptic, into critical thinking? And as a follow-up: is American society in particular susceptible to these sorts of things, given the things that certain large percentages of our population believe?

You Americans are so vulnerable. There was a lovely little leaflet put out recently about pineapple on pizza. So, one of the five Ds is "divide"; that's his fault, he got the fifth D added. If you look up pineapple pizza and misinformation, you'll find it, I think the State Department put it out, and it's a beautiful explanation of how this works. Just show your neighbors that; it'll help. I think if people know it's happening to them, it helps a little.

So I actually do have a suggestion. Let me first start off by saying: we kicked you Brits out once, we'll do it again. "Vulnerable," indeed. You have to do things that are uncomfortable. Anybody here not on social media? Okay, good. By show of hands, who intentionally follows people that absolutely infuriate them?

Okay, that's less than half the room. For those of you who don't: you should absolutely follow people that infuriate you, to see what the other side sees. If you were an American in the 1980s and you went to the news, you had three options: ABC, CBS and NBC. You and your neighbor could agree or disagree on how accurate the news was, but at least you saw the same coverage. If you watch the same story on Fox News and on CNN, they look entirely different; there's a vastly different reality, and so we're not even living in a consistent reality where we can have civil discourse. So you should definitely follow people who make you uncomfortable, see what they're saying, and try to at least understand, not necessarily agree with, but understand, the argument, so that you can have that civil discussion. Or you could do the extreme thing of spending six months driving around the country listening to people, but that's okay. I think we have time for probably two more questions; I've got five minutes left. This gentleman over here has had his hand up for a while.

You mentioned France being particularly adept at combating these, or not being susceptible to them. Do you know some of the reasons behind that? It can't just be that they're more skeptical; it must be more subtle, I imagine.

Actually, their educational system is based on skepticism, so they are more skeptical, and that does help. They were also prepared for it: they watched what happened in America, and their system was set up for it, things like a pre-election moratorium. Their election had a hack-and-release too, and they had everything in place; they were simply ready for it, and they had better tactics. I think we've got time for one more question, please.

Very interesting stuff. I'm interested in the partnerships you're building with groups, whether it's your Facebook or your AP, and I can see the possibility of progress there: when the fake AP news comes out, they can address it. But how do you partner

with or address the grassroots groups, your 8chans, your subreddits, that are created by or co-opted by nefarious groups?

Yeah, let me go back for a second to what I said before about that digital signature of where media comes from. A couple of things: if you were to do that in the United States, first of all, it would have to be an open system; anybody who wanted to apply for a certificate should be able to get one. So if you're an ultra-right-wing, horrifying Nazi group and you want a certificate, I'm not going to tell you no; I'm going to give you one. But here's the difference. As I said, before the internet you had authoritative sources; you couldn't be just anyone and transmit to mass media. Now anyone can transmit to mass media, so the onus is really on the consumer to look at the source and decide: am I going to grant them authoritative status or not? And I'm not going to tell you what counts as authoritative; that's an individual choice, if you want to believe something you see on 4chan or from some fascist group.

But those things trickle down into the mainstream.

Sure, they trickle down, but again, it's one of those things where you ask: okay, where did you hear that? Well, I read it on this blog. Okay, I'm not going to grant that blog authoritative-source status, and we'll go from there. We may have small disagreements; we do sometimes disagree.

No, we don't.

I believe strongly that everyone has the right to a voice, but they don't have the right to a megaphone. So that's where I sit on that discussion.

And I think we're out of time. Thanks so much for coming.

[Applause]


All right everyone, good afternoon and welcome to BSides Las Vegas 2019. This talk is Building an Enterprise Security Knowledge Graph, presented by Jon. Before we begin, just a few quick announcements. First and foremost, we want to give a thank-you to our sponsors, especially our inner circle sponsors, including Critical Stack, as well as our special sponsors Amazon, BlackBerry and Paranoids. It's the support of these sponsors, as well as our other sponsors, donors and volunteers, that makes this event possible. Now, this talk is being live-streamed, so as a courtesy to your presenter and to your fellow audience members, we ask that you right now make sure your phone is set to silent.

Also, if you have questions, please use the audience mic so that our YouTube live-stream audience can hear you; just raise your hand, and I'll be sure to bring the audience mic over to you. And if you have comments or feedback about this talk or the speaker, there is a survey in the Sched entry for this talk. With that, we're ready to begin.

Cool, thanks everyone. A few quick housekeeping notes before we begin. First, in the spirit of what happens in Vegas stays in Vegas: this talk reflects my own views and not those of my employer. We are going to move fast through a lot of material, so please keep questions to the end, and don't worry about trying to absorb everything on the slides; instead, treat this talk as field notes for the deck, which you can come back to later, or indeed now: the presentation with full speaker notes is pinned on Twitter.

Just a few more seconds. Your first week as head of incident response at DigiCorp will draw to a satisfying and uneventful close. You walk past the last of the meeting rooms, along the corridor that takes you to the elevator, down to reception, and out to the weekend. That's when you hear it: the unmistakable ping. The email has the CEO on cc. Thousands of dollars have been lost, something about an application called Sunways, code has been deleted, the suspicion: sabotage. You're needed immediately on the 120th floor. As you press the button to call the elevator, your brain starts to cycle through all the things your team will need to find out. But it's cold comfort that you've dealt with incidents like this before, because the first thing you did when you got to DigiCorp was ask everyone about the firm's most critical applications, and this is the first time you've heard anyone mention Sunways. Past experience tells you you're going to need to pull together a picture with only a few pieces to go on. You know where the rest are likely to be, though, in all their fragmented, partial and inaccurate glory. And so it begins: the blur of phone calls, emails and messages, coffee, late meetings, more coffee, a tragicomic and constant struggle to keep everyone on the same page about what's going on, punctuated every now and then by a few slices of cold pizza. Eight days later, your team has solved the puzzle. You've found all the pieces and joined them up; the picture of what happened is clear. You've just come from briefing the board, and as you think about the story your presentation told in that hastily assembled slide deck, part of you is simply glad the last few days are over, relieved that you managed to join the dots. But another part of you is frustrated: you know the mental model the team built up will soon fade from corporate memory, slower for some people, faster for others, until eventually it returns to the fragmented state you found it in a mere eight days ago.

It's a privilege to be back at the Ground Truth track, no mask this time, a little bit of music, and a huge thank you to Gabe and Urban for once more curating an amazing space here at BSides where we can share ideas at the intersection of data science and security.

Make it stop now. So, in the next 45 minutes we're going to look at how knowledge graphs can help security teams address the problems we just touched on in our make-believe incident scenario, and also at how we can flip the script on thinking in lists, to reap the rewards of thinking, and indeed operating, in graphs. This talk is the product of nine months' work in a live operational environment, testing a hypothesis that ran as follows: to solve the problems we face, we need to be able to join all the component parts that relate to security, across business and technical dimensions, in a scalable knowledge graph, so that it's easy to capture, link, visualize, contextualize, share, interrogate and update information in seconds, for executive management and operations stakeholders alike.

At its core, this hypothesis focuses on a user need that extends far beyond incident response and security operations, because this problem doesn't just affect every function in a security team; it also affects colleagues in many other areas, as in this quote from a CIO: "Each quarter my control functions bring me a report, probably a PDF, about what went wrong, but I need to run my business today using today's information." The struggle to produce meaningful, timely insights from data and analytics, insights that can help understand security status, justify priorities and track results, is not disappearing anytime soon. And even if we solve that problem, we've only won half the battle, because, as per this fantastic blog by Chris Swan, once we've mined valuable insights from the data, they then need a proper insertion point into the decision-making process. That means identifying the stakeholders who should receive the information; identifying the right lens for presenting it across the various levels of the business; adjusting that lens to provide the right amount of zoom based on the audience's context, concerns, priorities and accountability; and, last but not least, finding time for them to consume it. This is not a simple problem. Nor is it unique to the world of cyber security; as data engineers and data scientists know only too well, it appears regardless of the industry they work in.

So in this talk we're not just going to look at building knowledge graphs; we're also going to look at how we can create and deliver context-relevant, stakeholder-appropriate interactions with the information the graphs link together, which, of course, needs continuous updating. Is our approach valid? Is our implementation practical? Can the concepts we're working with transfer to other organizations? Please share your opinions and questions with us during this talk and on Twitter. For all things philosophical and technical, ping Dennis, who at some point might be in Portugal and might be online, I'm not quite sure; he also wrote most of the code we're open-sourcing today.

George has been operationalizing a lot of what we'll look at, and he's the best person to go to for a programmer's and detection engineer's perspective on day-to-day usage of the stack; for questions about knowledge graph ontology and user needs, you can point them at me. A note of reflection before a probably ill-advised live demo attempt: years ago, the paper A Market for Silver Bullets described a dynamic in which neither buyers nor sellers in cyber security had the information they needed to know what effective solutions look like for their problems. I'd argue this largely still holds true today. If recognizing that no one has the right answers is one necessary step to surviving in this industry, the other point the paper asks us to acknowledge is that any deviation from best practice incurs costs when individual members go it alone. Our team are big believers in the value of open source and Creative Commons. The continued perpetuation of bad API outputs, two-dimensional dashboards, endless XML joins, and mirror mazes of macros and pivot tables makes it clear that we need to collaborate if we're ever to break out of the current equilibrium and move from lists to graphs. Yes, as with anything that requires process change, the shift to thinking and operating in graphs, in a hyperlinked way, does have side effects in cost, time and effort; I can also tell you from personal experience that it can be a massively frustrating journey to navigate. No, not everything you see here will be immediately transferable, or indeed applicable at all, to your business. But the goal today is not to suggest that this is how things should be done, only to share one possible path, the experiences we've had, and the mistakes we've made. So please treat this talk like a meal: eat what you like, leave what you don't, let us know which dishes confuse you, and tell us what you'd like to see added to the buffet. With that, may the odds be ever in our favor. Let's cast our minds back to the imaginary incident we talked through

earlier: the early blur of phone calls, emails and messages, the problem of keeping everyone on the same page, and the dots we had to find and then join up. Here is a different version of how our story could have unfolded. Let's imagine that when we joined DigiCorp, they'd actually been building a security knowledge graph for about nine months, and that, excuse me, they'd also grabbed some readily available data sets: HR data, application user lists, alerts from some endpoint technology, a few cloud systems, and of course good old manual data entry. So let's just check that we're still live, and alive; okay, I feel very ill right now.

So, as we step into the elevator to the 120th floor, and we're going to be very glad there are so many floors in this building over the course of this demo, our first concern is knowing who we're going to be talking to, who's going to be in the room when we step into it. We've got an email with the CEO on cc, and that means we don't want to walk in without knowing all the names we'll be dealing with. So we're going to do a search for the SVP of Special Projects, the gentleman who emailed us about the incident, there we are, and we're going to do a Jira search; we don't have time to look for Pete Smith, there could be many of him. And what we've got back is a result: great, Pete Smith is in our knowledge graph. That's good news. Now that we know he exists, we're going to ask our knowledge graph to show us Pete. We've got his unique identifier in Jira, and what we've pulled down is the information that lives in Jira, the node and the edges around Pete, so we can start exploring them, in Slack, on our mobile phone, in a lift. We've got some options here: let's get a quick screenshot of the Jira page, because we want to see what else it can tell us, and let's view some links, because that'll probably tell us some good stuff. Here we are, off we go. First, we've just rendered a little graph; this takes Pete and gives us all the direct links that Pete has, and here we've got the screenshot, so let's look at that first. This is the data we have in Jira. We can see that Pete is assigned the role of SVP Special Projects; he reports to the CEO, interesting, good to know, this must be a fairly senior guy; he owns a risk, that's interesting, we'll come back to it later if we have time; he manages a few people; and he's funded

by the data science team okay well that's fairly helpful uh let's skip out of that and go and take a look at the graph here's exactly the same data visually represented a little bit easier for us to consume okay cool we know what we're doing and that risk looks interesting because it says in the next 90 days if vulnerabilities need mitigating or the acne is taken down there's no single person who can make a priority call expected loss is one million if similar incident occurs interesting all right so we're now going to look at the ceo so we now know that the ceo has a data tag here gsp 353 and we can use that tag

to go and see who reports to the ceo so what's the reporting line we can get if we go down a bit so now we're going to get another graph and tell us some stuff and here we are so ceo's manager of sbp special projects again this looks like the reporting line obviously the demo data usually if you went down from the ceo you get a much bigger hierarchy it's just showing the content so we know pete's a senior person he's got engineering team underneath him uh the question is though who are these people who have these roles who's the lead engineer who's the platform specialist who's the engineer we know from our training in the

knowledge graph uh that when we have nodes like this we can ask who is that role assigned to using a search called role is assigned to and we're going to hit that so we're going to try and use some natural language now to explore the graph so that when we teach people this um a work they they know where they're going and it's fairly easy for them to do and we get an expanded graph right this is fairly handy this is what we're after when we start our search we want to know who's going to be talked about in this room who are the people we're going to be meeting with and right here we can

now see that our ceo who we haven't met yet because it's our first week is alan lee uh we know that uh pete smith is the guy who wrote the email bruno lyon norman ligo sophia berlin and alan champion great okay so we've now got our org structure so uh we know that we are dealing with an incident about an application called someways we have absolutely no idea what some ways is uh so the next search that we're going to do as we continue going up the elevator is for some ways so let's see what jira tells us about this hmm this does not look good it looks like some ways has had some pretty serious

incidents associated with this before for example we've got admin accounts here shared across users we've got questions about detections for malware and hacking and if we expand that out we can see there's really an awful lot of stuff that looks like we should know before we walk into that room uh let's take a look at one of them so we're now gonna dive from here into one one two nine six come on brain there we are so with any luck ah right so sorry that's done a pop-up on my sec one two two nine six on my current screen which is not helpful so rather than put that in there all right cool so we have now dived into jira
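A full-text search like the one just described — every issue that mentions the application — maps onto Jira's standard REST search endpoint. Here's a minimal sketch in Python; the instance URL is invented, and real use would need authentication:

```python
import json
import urllib.parse
import urllib.request

JIRA_BASE = "https://example.atlassian.net"  # hypothetical instance

def build_search_url(text, max_results=20):
    """JQL full-text search, e.g. every issue mentioning 'Someways'."""
    params = urllib.parse.urlencode({
        "jql": f'text ~ "{text}" ORDER BY updated DESC',
        "maxResults": max_results,
        "fields": "summary,issuelinks",
    })
    return f"{JIRA_BASE}/rest/api/2/search?{params}"

def parse_search(payload):
    """Reduce a Jira search response to (key, summary) pairs."""
    return [(issue["key"], issue["fields"]["summary"])
            for issue in payload.get("issues", [])]

def search_issues(text):
    # Real use would attach an Authorization header or API token here.
    with urllib.request.urlopen(build_search_url(text)) as resp:
        return parse_search(json.load(resp))
```

`parse_search` deliberately works on the already-decoded JSON, so the same reduction can be reused when the data arrives via a Slack bot rather than a direct call.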

We've gone from Slack, because we wanted to explore the graph, through to Jira, and what we're seeing is a question that was clearly asked during this previous incident. There's a ton of material here, where it seems it was established that there were an awful lot of detections from the EDR software against certain users, so obviously we're interested in taking a closer look. Something jumps out — or at least it did when I was preparing the demo — ah, there we are: brute forcing. Obviously we're interested in detections of brute forcing. Great, we've got a detection here, and here we've got a computer — a device

that it's linked to — which is pretty cool. We want to see who owns the device, so we skip into that, and Jira tells us it's owned by Alan Champion. Interesting — we saw Alan at the bottom of the chain, and he's the one who had this weird detection for something we were interested in. We jump into Alan, and what we see is that he is the admin for Digi Inc. If we jump into that, we can see that at some point the account had a login detection against it from a non-Digicorp IP — bad news — and it doesn't seem like there's any option to enforce or enrol 2FA, which

seems fairly serious. We can also see who else shares that admin credential. So what we're going to do now is go back to Slack and dive into the Jira update about Someways. We've been on a little exploration through Jira, but Jira wasn't really working for us — we got caught up in this minefield of information — and we want an easier way to consume it. So here, once more, we have our handy view in Slack, and we're going to ask what the linked issues are. As we go down we can see loads of incident facts: at some point the team that dealt with this incident

gathered a ton of data as they went through it, and this all looks very helpful. We can see the admin account, we can see the user account; if we wanted, we could edit some fields — change the assignee, add a description, add labels, change the workflow, and so on. If we dive into this, let's look at what the first fact tells us in our graph. The Lambda functions are working away, hopefully.
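That "what are the linked issues" question is a walk over the `issuelinks` field Jira returns on an issue. A sketch of flattening it, assuming the standard REST payload shape:

```python
def linked_issues(issue):
    """Flatten a Jira issue's links into (relation, key, summary) triples.

    `issue` is the JSON payload for one ticket; Jira stores the far end of
    each link under either `inwardIssue` or `outwardIssue`.
    """
    out = []
    for link in issue["fields"].get("issuelinks", []):
        if "inwardIssue" in link:
            other, relation = link["inwardIssue"], link["type"]["inward"]
        else:
            other, relation = link["outwardIssue"], link["type"]["outward"]
        out.append((relation, other["key"], other["fields"]["summary"]))
    return out
```

The output is exactly the "loads of incident facts" list described above, ready to render as a Slack message or feed into a graph.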

So: no single accountable lead for technical support and triage in the team in the event of an incident. That doesn't sound great. We can see there's a vulnerability there, so let's take a look at what it is. Okay — in the event of another incident, or a requirement for architectural change, there's no person who can make a final decision on the impact or appropriateness of a change. That doesn't sound too great either. Let's go into Jira and take a little look at that.
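The pattern the demo keeps leaning on — a person owns a risk, a risk is triggered by a vulnerability — is just a traversal over typed edges. A toy version with invented demo data, so the query shape is visible without any Jira plumbing:

```python
# Edges are (source, relation, target) triples; all names are made up.
EDGES = [
    ("pete.smith", "owns", "RISK-1"),
    ("RISK-1", "triggered_by", "VULN-7"),
    ("VULN-7", "affects", "someways"),
]

def neighbours(node, relation=None):
    """Follow edges out of a node, optionally filtered by relation type."""
    return [t for s, r, t in EDGES if s == node and relation in (None, r)]

def risk_chain(person):
    """person -> risks they own -> the vulnerabilities that trigger them."""
    return {risk: neighbours(risk, "triggered_by")
            for risk in neighbours(person, "owns")}
```

`risk_chain("pete.smith")` returns `{"RISK-1": ["VULN-7"]}` — the same shape of answer as the "show me Pete's risk" query in the demo, just without Jira behind it.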

Huh — we must have deleted that one. All right. So, long story short, what we're running into is that we're having to jump back and forth between Slack and Jira to answer the questions we have. That's fine in some circumstances — there are instances where we'll want to do exactly that — but really what we're after is an easier way to consume the information. So let me see if... oh my.

So, if this works, this is going to be a transcontinental live demo. George is currently either not watching this like he said he would be, and telling fibs, or he's about to come online — which I might need him to do for the Jupyter support bit. George, if you're out there, I need your help now. Don't leave me, man. All right, I'm going to assume there's a long lag on the online support — I'm going to forgive you. Oh, my man!

Awesome stuff. So George, are you able to run the Jupyter server command here? Because I'm now out of the lift, I've got to run down a really long hallway, and I just don't have time to call down this Jupyter server myself. Some form of signal if that's okay, or I'll try to do it from the commands I've got here. Let me —

now this is how live demos should be done; I don't know why I haven't done this before. Right, quickly — very busy, very busy. So what George is hopefully going to do is call us up a Jupyter notebook, which is basically a pre-packaged way to view all the things we're interested in in this demo, and once it comes up — fingers crossed — we'll see some more stuff. While he's doing that, I'll show one other thing. As I mentioned, getting hold of information in a consumable, stakeholder-ready format is very often

difficult. We've seen a number of components in this knowledge graph today: risks, the vulnerabilities that trigger those risks, the people who own those risks, and a ton of facts — ground-truth data that's been collected, and evidence that's been gathered, during our day-to-day work. One of the things Dennis did when he built this — because a lot of the pain of our lives in security is putting together slide decks — was to create auto-templates, so that all the data living in the graph, once you begin structuring it in the way we'll discuss in the ontology section of this talk, can be used to start

automating the generation of PowerPoints. Not only that, but you can begin sending them through Slack or Mattermost or whatever messaging service you have to the right stakeholders, with interactive buttons for people to say "I accept that risk", or "I want a meeting", or "I don't have time to read this slide deck right now — can you give a briefing to one of my team?" In doing that you remove a huge amount of the pain we normally go through to extract information. This is just an example Dennis put together of what one of these slide decks looks like. I'm going to ping George again

in a sec, to see if he can finish off the demo. What's happening in the background here: an instance of Chrome is spun up by a Lambda function, we log in non-interactively, open this page in Jira, take a screenshot, send it back to storage, and put it into a slide deck — and that all happens in a matter of seconds. So this is programmatic creation of content from applications, which is super cool. Dennis has also done a huge amount of work to clean up Jira's formatting — it's super messy when you

first start doing this, but with a few tweaks you can create really nicely consumable information out of the way Jira presents things. You can generate timelines — the great thing about Jira is that everything has a timestamp, so if I want to say "this happened, then this, then this", all I need to know is how to create that path in the knowledge graph. If we want to pull down graphs, we can; I could instead ask to show just part of a graph, or break the text out of a graph and put it in a table. For example, if I wanted to

show all the detections we have from our EDR software, I could create a heat map of them. And since we saw earlier that we map roles and teams, I could build that heat map for each team I have and send each team owner, via Slack, a heat map of the detections their team has had: "this is what your lay of the land looks like this week — that guy who's had twenty detections for adware? That's all software he needs to do his job, so you should probably buy him a properly licensed version rather than the crappy one he's downloaded, which comes packaged with a load of

adware. And McAfee." [Laughter] This is just another example: if I know I want to understand which accounts have access to what in an application scenario, then once I have those paths in the graph, all I have to do is create a reusable, almost microservice-like Lambda function to pull that data out of my graph and dump it into the format I want. You can do a ton of stuff with this. Dennis, to his credit, is on holiday and has basically worked 18 hours a day to make this demo happen. And — ah, George, it's your server. Great, cool. So, fingers crossed, I can click on

this and it's going to load up... yeah, that's it. Let me get my mouse back to base, create another desktop, drag and drop this in, expand that out — here we are. Okay, here's the presentation. Thank you, George. Legend. "What a time to be alive," as he likes to say. So that's the slide generation. But the other thing George did, when he was running this incident, was create a pre-packaged recipe book in Jupyter. One of the things that pains me about our industry is that we very often end up asking the same questions multiple times, and very often

I end up asking questions that many others in our industry have asked before me — because the issues we face generally aren't new, especially in incident response. Wouldn't it be great if there were a way to codify that knowledge so that everyone in our industry could access it, open source? This is how we're beginning to do it. What you have here is a recipe book that gives me, as the head of incident response, the lenses I want into the data whenever I have an incident — and to create it, all I have to do is drop in the title of the incident. So

once I create an incident in Jira, once I create an application, this recipe book works for anything, because it's agnostic of the particular data: it just asks questions through a path in a graph. So we can create this scalable thing that says: here's the playbook; these are the questions that the person who managed that incident had in their brain and knew to ask, in this sequence; here's how they went from A to B to C. We haven't got the timeline for some reason — it was working this morning, for which I apologize — but hey, at least the

demo has worked up to now, and we're almost at the end, so breathe a sigh of relief. Running through what we've got: an incident view, and graphs to help us understand the application — oh, "Subways" is still there; sorry, Someways, this is not an incident about Subways — plus control capability failures. The other cool thing is that we can bulk-edit data right from Jupyter, so we don't have to create things individually. If I'm in a war-room meeting during an incident, I can literally be adding questions, adding

decisions, adding people to my knowledge graph in real time through this interface. This is the control panel for graphing things, and once you get this power you end up graphing everything — we did get a little too much into it, I think, though we've pulled ourselves back from the brink. Here you can see the queries that generate this graph, and as long as my graph has standardized queries, I can replicate this for loads of other things. If I wanted to — I'm not brave enough, because I'm not technical enough — I could change this now, add

something else in, and it would change the graph live. So if I know what questions I want to ask of the graph, I can use this recipe book to build a whole new dish of my own based on the scenario I have; it's the building block for expanding things out — understanding Someways accounts, and so on. You can graph things like this — we use PlantUML just because it's super easy to read — but if you want to do more Neo4j-style things, you can do it like this: the middle is the Someways application, around the edges are the accounts, the next layer in, and then

all the alerts and the people who own the accounts associated with them. And I know we've talked a lot about incident data, but when we run incidents, one of the things we do is look at what the facts we discover tell us about the successes and failures of our security controls. So in PIR, after an incident, this lets me go through and say: this fact — that there's no authentication gateway for an application — indicates a failure of identity and access management, secure architecture, and secure engineering. I can begin to build out a picture of how successful my controls

are, across coverage of the elements I'm interested in in my environment, and I can look at that across my entire enterprise. All I need to do is start linking the data together. And that concludes the demo.
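That fact-to-control mapping is easy to mechanize once facts carry tags. A sketch with invented PIR data — the fact wording and control names here are ours, not a framework's:

```python
from collections import Counter

# Each fact discovered during an incident, tagged with the controls whose
# failure it indicates. All entries are invented demo data.
FACTS = [
    ("no authentication gateway for the app", ["IAM", "secure-architecture"]),
    ("admin credential shared across users", ["IAM"]),
    ("no option to enforce 2FA enrolment", ["IAM", "secure-engineering"]),
]

def control_failures(facts):
    """Count how often each control is implicated across incident facts."""
    counts = Counter()
    for _fact, controls in facts:
        counts.update(controls)
    return counts
```

Run over one incident this reports IAM implicated three times; run over every incident's fact list, it becomes the enterprise-wide control-coverage picture described above.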

Cool. Thanks, George — thank you!
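Back in the demo, the slide-generation step — a Lambda drives headless Chrome, screenshots a Jira page, drops the image into a deck — splits naturally into a capture step and a layout step. Here's a sketch of the layout half; the field names are our own invention, and a renderer (python-pptx, say) would consume the manifest:

```python
def build_deck(issues):
    """Turn (key, summary, screenshot_path) tuples into a slide manifest.

    The screenshot paths would come from the headless-browser capture
    step; here they are just strings.
    """
    return [{"title": f"{key}: {summary}",
             "image": path,
             "notes": f"Auto-generated from Jira issue {key}"}
            for key, summary, path in issues]
```

Keeping the manifest as plain data is the design choice that makes the "send it through Slack or Mattermost with accept/meeting buttons" step trivial: the same structure can be rendered as slides or as an interactive message.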

The goat sacrifice was not in vain, people — not in vain! Turns out you can have anything delivered to your door in Vegas. Is it about to go horribly wrong now, though? That would be typical... ah, we're back. Right, cool. Let me navigate back to the slides — and how am I doing for time, please?

Ready yourselves, people — this is not going to be easy. Let's have a look at the tech stack; there were quite a few moving parts there. Before we go behind the scenes, a couple of principles. First, the stack should

help us move faster, in partnership. That is to say: the primary function of our tech stack is not to automate pain away, it is to enable us to deal with that pain faster. It should basically be an Iron Man suit which any member of the team can put on to manage new problems quickly and efficiently, and only once knowledge and analytics are baked together into a process that is stable and reusable will we add automation. This approach draws inspiration from a super-cool article called "Automation Should Be Like Iron Man, Not Ultron"; the goal is to free us for the creative work, enhance our data

system, and keep ourselves moving faster, rather than doing boring repeatable tasks. Second, we want to make it easy for anyone who joins our team — and eventually the wider business — to find and understand patterns in data, at both a systemic and a local level. Rather than providing set ways of looking at something, we want people to benefit from our knowledge base but also add their own thinking, evolve it, and take it to the next level. To do that we want to take lists of questions like this and build them into something like this: lists of ingredients in data, examples of recipes,

and eventually a library of microservice runbooks which can be taken and joined up as patterns emerge that are similar to the ones that created them. This slide is a rather abstract visualization of that process. From the top: to deliver a specific mission — in incident response, say, diagnosis mode versus solving versus mop-up and PIR — you use different data building blocks. These are your ingredients. Those ingredients get combined in different recipes over that life cycle to uncover knowledge, answer questions, complete tasks, and so on, and in future we hope those recipes become reusable. Finally, there's a heavy focus in our

system on the transformation phase of ETL. Why? Because curveballs in consuming data are sadly the rule, not the exception — messiness is a feature, not a bug, especially, it seems, when you need to consume and correlate data at short notice — so we designed heavily for that. So let's take a tour through the data system. Here are its current components, the lines between them indicating a route of input or output for data to flow. At this point — well, probably before this point — you may have been thinking "Jira?!" More on that in a moment, but when we began building this we had a shoestring budget, and we needed a system we could choose to scale as we wanted it to scale.
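One concrete route between these components: Jira records every change to a ticket in its changelog, and flattening one changelog entry into an index-friendly document for ELK is a small transform. A sketch, assuming Jira's standard changelog shape (the output schema is our own):

```python
def to_elk_doc(issue_key, history):
    """Flatten one Jira changelog entry into a flat, searchable document.

    `history` follows Jira's changelog shape: an author, a created
    timestamp, and a list of per-field change items.
    """
    return {
        "issue": issue_key,
        "author": history["author"]["displayName"],
        "timestamp": history["created"],
        "changes": [f'{i["field"]}: {i.get("fromString")} -> {i.get("toString")}'
                    for i in history["items"]],
    }
```

Documents in this shape are what make the "search and visualize from Slack" workflow cheap: one Lambda per changelog entry, indexed as it happens.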

And sure, it may look a little weird at first glance, but sometimes you have to sail with the ship you have rather than the one you want. If I'm honest, of all the data systems I've seen built and worked with to try to solve analytics problems in security, this is by far the most elegant. A few definitions, in terms of how we think about the system components. Jira is shorthand for our graph data store and ontology management system: as well as having a lot of highly configurable fields, it logs every single change made under a ticket, providing a full audit trail of who made what updates to what ticket and when — and who doesn't

love a good audit trail? ELK is our friendly neighborhood index, where we store Jira data so it's easy to search and visualize from Slack; it's also a good place to analyze and visualize trends relating to nodes and edges, albeit more in terms of how people are using the data system than actual operational scenarios. Slack is basically our command-line tool, as well as the communications fabric our company runs on; it lets us automate all kinds of feedback loops via a medium our colleagues are already familiar with and engaged with a huge amount of the time. A big shout-out here to Ryan Huber, whose blog post on distributed alerting informs a lot of

our thinking — all we've really done is add a sprinkling of Jira into the mix. Here are a few examples of the kinds of commands we can call down through Slack; here's another one. And then finally Jupyter: our more advanced interface for creating and working with both ingredients books and recipes. Here's an example of an ingredients book, designed to enable easy exploration of the relationships between all our asset, vulnerability and risk data. When I think about this stack, these are effectively a choice of interfaces — either for users like me, who are non-technical, or for those like my

colleague George, who are highly technical — so we can pick and choose how we want to interact with our graph. In the middle we have gsbot acting as the API broker that makes all these connections possible. That's the tech — but what about the problem set this actually solves for real people? Here's the frame of reference we use for thinking about the modes and triggers people have when they need to interact with data and information — and a huge thanks to Russ Thomas, Mr. Meritology, for sharing the triggers part of this with me years ago, as it inspired my thinking. We can map our modes to the parts of the data system where they fit best,

and consider which interface is best under which triggering condition; this helps us consider routes for inputs into our knowledge graph, and the outputs that support feedback loops. With our robot army of Lambda functions acting as the glue between these systems, we can now do things like this. When we were graphing manually, on the fly — as you'll see in a moment — we were effectively under a crisis trigger, in interrogate mode, and this is what's happening in the background: tickets are created from Slack, tickets sync to Jira, Jira syncs to ELK, and then as we query it, results come back to Slack — all in a matter of seconds. Dennis is, if

nothing else, super focused on speed of execution in this stack, and it's very impressive to see how quickly it responds to commands. For batch graphing, we can basically dump stuff — as we touched on in the demo — through slides into Jupyter, sync to Jira, index in ELK, search and explore in Slack: that's the workflow. Here's an example of a Jupyter recipe book — it's a video, so I'm not going to go through it now; I'll make it available online. All it's saying is this: once you've taken two or three messy data sources and you've done

the parsing, why not batch it? Take that data and start to identify weird things about it. This is a representation of an ontology, and it's amazing how much business context you can get from just two or three data sets — your HR database, some random application access lists, and anything else you can get your hands on. The gray boxes represent data that will need to come from elsewhere, but even with a few data sets you can start building out context between business and technology dimensions — which, after all, is exactly why we set out on this journey. And it's a short jump from getting data like this

to using the parsing process to identify inaccurate, incomplete and incongruous data. For example: who owns that active generic account in that SaaS system, with a dot-com email domain that isn't yours, which seems to have last logged in three years ago? And why is there a disagreement between System X and your HR database about whether this employee still works for you? These are the kinds of questions that leap to the fore when you take these very basic data sets and ask simple questions of them. Those sound like facts we might want to capture and present to management — and potentially a vulnerability — and now we have the beginnings of a graph built from just a few data sets

that we already have. To begin operating in graphs really doesn't take much, which is super cool, and we're more than happy to talk to anyone about how to speed up that journey if you're interested in going on it. Here's an example of a recipe: what we've done is map an application, via its reporting line, up to the CEO, just so we can better understand the stakeholder landscape — I showed you that example earlier of mapping down from the CEO, and this is a real example we use in our company. And here's another example with a slightly different lens, where we actually want the names of the

people — maybe there are people leaving, maybe there are people joining, and we want to add names in the meeting we're in, while we're discussing changes to the application during a threat model. All of this becomes interactive every time we go out and touch a stakeholder, either via a meeting or a conversation on Slack. So here's a workflow that combines automation with user interaction in the loop — this is Ryan Huber's distributed alerting — and this is a video of the user experience, built by the man, the legend, George, who you've already met. Let's have a quick look at what's going on. What's going to happen in a mere moment

is this: an alert has come in, triggered by something going off, being put into ELK, and a Lambda function seeing that data in ELK. It's been sent to George, saying: "Hey, we've seen an unauthorized login. It came from a non-company IP address; here's the date; it used the method 'password'. Can you please let me know?" This is him acknowledging here — and it's now synced to Jira, auto-synced, and you can see the workflow there is In Progress. Usually this would be an alert we'd have to go and triage — figure out, "hey, was that you?" — but thankfully George is going to click "This was me",

and that's going to update our workflow: not only logging that the person clicked that it was them, and that it was them who clicked it, it will change the workflow, close the event, and we don't have to touch it at all. All that's happened is that a user performed an interaction they're very used to — going to a Slack channel and responding with a single click. These are the kinds of workflows you can begin to build, and they can be much more complex than this once you get into it; really, your imagination is the limit, and then it's just a question of figuring

out who writes the code and who maintains it. What this enables in the long run is micro-population analysis. Don't worry about the details on this slide — the main takeaway is that what you've just seen begins to let us better understand a specific set of users with a specific pattern of life within our business, and as a security team we can use data to better understand their reality. Say we have a shared email account that multiple users access: with normal detections, all we'd see is a detection against that account. But with the workflow you've just seen — because someone clicks to acknowledge it was them — what we begin to see is

alerts against a specific user for that inbox, and we build up that pattern of alerts. For example, if a person is burning the midnight oil, seemingly working 24/7 because their colleagues are on holiday, frankly that might be data HR should get. So there's a huge amount of benefit we can start giving other parts of the organization by doing this. The aim here, however — and I want to stress this — is not to be a 1984-style security team. We want to use data to better understand the business process, so that if we do need to apply controls to it, we give ourselves every opportunity to minimize friction or

avoid it altogether. This one is a workflow in progress: someone reports a vulnerability, an alert goes into Slack, the risk team gets notified, the risk team explores, the risk team links stuff to stuff, and then they change some data in Jupyter. At the moment this is a great example of a much more involved manual workflow where we're trying to identify opportunities to automate. How we want it to end is that the exec gets sent that neat slide deck in a Slack channel, to say "yes, I accept the risk" or "no, please have a meeting with me". If we can use feedback loops from these kinds of interactions,

perhaps that gives us a better picture of what risk appetite truly is, because what we're doing is creating a system of record for the decisions our business takes about the issues it has in the real world — a great proxy for understanding the different risk appetites that exist across our business. The code for all of this is here: we've created a fake mini-company, a serverless version of Jira, with incidents, alerts, people, roles — everything you've seen in this demo exists in that environment, and you can go have a play with it. The integrations with Slack and Jira aren't out of the box yet — watch

this space as we begin to develop it out but basically you know feel free to go have a play it's a graph database that you can mess around with and begin to see whether or not there are stuff that's there that's useful for your business so then i've got 13 minutes left um probably not going to get through all the slides frankly it's a good thing um but the reason that we wrote this deck was actually provided almost like a book right so you know this this is designed to be the guide to the thinking as other teams kind of you know if they want to go on this journey or are on this journey oh that's interesting like wow

that was wrong and you shouldn't do that um which of course we would love to know because again you know we then don't have to make mistakes but let's whiz quickly through the ontology section see how far we get um and uh take a look at it so ontology is really obviously kind of at the core of graph thinking it's a set of concepts and categories which show properties and their relationships so the one we've arrived at uh in our noise graph has evolved over time it is definitely not a butterfly at the moment and this section of the talk gives you an overview of the evolution process and some learnings let's start with some early work that we

are ashamed to put on our kitchen fridge uh so this is a flow diagram of our instant response process about eight months ago there's a blog online about this uh if you search for it um the highlighted areas just show the different jira tickets we'd raise across this workflow and frankly while this created a varying amount of administrative overhead during an incident uh from oh this is okay to dennis i will never work in this way again um the detail it enabled us to capture was really invaluable both for post-incident reviews i took what went well and what needed improving as well as for capturing knowledge about all the business context that we have so early

on we were using graphs uh just to kind of map out our incidents right in pir and go right how many questions did we have to ask against a thread um you know how how many what threads got connected to each other did we successfully complete the incident or not unfortunately loads of really valuable data was getting lost and it was getting lost in the place that we capture incident tasks which is the description field the free text description field in jira this was immensely frustrating because incidents would pull on lots of threads across the business and there was no good way to weave this data together we began experimenting and after realizing how damn expensive it is

to refactor node and edge ontologies in jira because at that point we did not have jupiter and i was doing this manually late at night much to the frustration of my very patient wife we moved to plantuml and this made it cheaper to discard mistakes and rebuild the graph differently but frankly the results weren't helping almost everything was linking to everything and while we developed some key components of the graph during this challenging phase overall things were getting more confusing when people would say that looks complicated dennis would buoy our spirits for which i thank him to this day pointing out the complexity we were creating was a reflection of a complicated reality but that didn't

change the fact that our system of nodes and edges was increasingly hard to work with this is an example of the nodes we have in just one project and we have multiple and that was before you got to the problem of deciding what edges to use to link what nodes and while our graph certainly lacked nothing in terms of freedom of expression the consequence was inconsistency which made it incredibly hard to navigate the graph and ask it questions with the confidence that you were seeing all the data that you wanted the result was confusion in our own team let alone when we tried to use the data to communicate with the business and to a large extent frankly

this was because our graph had become removed from operational reality our nodes and edges reflected abstract concepts that we were trying to mold together that didn't make sense to anyone outside our team and for some of us you know not even those that were in it or indeed creating them after a while so forcing functions are funny things and just as incident response had been a trigger for us to work in graphs with practical and beneficial results at operational level budget season helped us to make an evolutionary jump in a more strategic direction one of the many challenges we face in security teams at budget season is articulating effectively what won't be done either based on the investment the business is

prepared to make or on our security team's ability to actually operationalize the budget we're given so we began focusing on the function of data to solve this problem drawing some inspiration from the bauhaus movement our need for fact-based narratives was clear we needed a common theme of questions and the things that were coming our way needed to be translated into business context without requiring a human to sit there so in classic two-choice presentation style we stole an idea from a friend of mine who works in management consulting who once said there are only two presentations that you give to management cloudy day sunny day in which things are bad but if

you do xyz things they will get better and sunny day cloudy day things seem good but they won't stay that way and so once we started to develop narratives like this one here when we transferred them to our graphs all of a sudden the graphs began to get a lot simpler and a lot cleaner even when our plot lines got a lot more complicated and had a lot more actors in them we were confident that the storylines were still easy to track and we cycled back to jira and began implementing the ontology we trialled while our nodes and edges often did not correspond to a human readable version of the storylines we were telling that

mattered less and less because you could read those stories across the data in the graph the nouns and verbs that we needed were becoming far more human readable emerging through the shape of the graph and the storylines were working really well as we went and presented them to stakeholders even if they never saw the graph um so began the era of the great refactoring the informal creation of the entropy crushing committee hello james if you're watching we started standardizing and formalizing our nodes our edges and the relationships between them that could exist in the graph which represented a trade-off yes i'll come back come on don't die on me now okay so uh we chose a rigorous structure over

the ability to enter arbitrary data because of these things we wanted a logical narrative we wanted to be human readable we wanted a clear idea to be in someone's head of the possible outputs when they asked the graph questions this made it easy to see when data was missing because knowing what isn't in your collective graph's knowledge when you ask it questions is just as helpful as knowing what is especially if you have an incident and you might need to call in your colleague who works in threat modelling to do a super urgent threat model of an entire application so thankfully this choice fitted hand in glove with the way jira allows you to

organize data this is a rough translation of how jira organizes information into graph speak happily from an administrative perspective this structure does support innovation so the goal of the entropy crushing committee is not to stop experimentation and innovation it's to stop pollution of the graph that leads to not being able to ask those structured questions upon which so much of our recipe books rely this is important because change to an ontology is a feature of knowledge graphs not a bug until it becomes cheap to mass refactor your knowledge graphs however i would highly recommend avoiding the pain in doing so so two quick examples of lessons learned which

i failed to learn at the time which cost me an awful lot of sanity this is a generic version of what an incident graph can look like this is a narrative from a few months ago just to give you guys a picture of past history compared to where we are now here just for reference are the workflows that we associate with each of those different layers in jira so that as incidents tasks facts etc move through their life cycle we can track those i'm not going to dwell on those one of the things that i failed miserably to capitalize on was investigating the metadata that people

were adding to issue types in the knowledge graph so here's an example from our security incident issue type which shows all the various fields that were added over time as we tried to tag things we needed or add in details that we wanted to search for or organize by this is just a different view of all those fields and what i should have done earlier was to look at those fields and ask the question are there other issue types in other projects that could benefit from what we are trying to do have they come up with a better way than we have to do it and is there value in creating a new

node and edge relationship that bridges across our projects within the graph had i done that i would have probably found that we were all trying to tell the same story in slightly different ways and we could have identified what those common themes were throughout those narratives and glued them together a lot better than we did a second example of lessons learned is from the red team project this is the project's ontology as it exists today this is the narrative it supports the ontology did not start like that originally it reflected more of a very tightly constrained pen test that we called a red team as we and the business began to get comfortable with testing in

production on a regular basis over time our scope got a lot more free form and as that happened as our red teamers began to think creatively within the structure we began to introduce things like security control observations from the red team about controls that could maybe have stopped them quicker and cheaper when they were able to jump from one task to another or exploit or find a vulnerability and then once we got the blue team fully involved in end-to-end tests and evaluating findings we began linking that concept of control coverage to the information the blue team had about it systems and specific it assets so that we could understand and then begin searching across our knowledge graph to

see if we had similar gaps in other systems from the one attack path that the red team would be able to take because even though they only get one you can then mine a graph to begin to see right okay what are the commonalities across other systems that might indicate control gaps control failures changes in operational states for example or lack of coverage so at a certain point it was obvious that security controls and it systems and assets should not live in the red team project unfortunately we developed the control ontology pretty much in isolation we missed some major opportunities to evolve it and make our data richer across all projects for example in relation to how regulators

articulate controls and when we changed it again we had to do a lot of refactoring so lessons learned you know get out of your bubble go speak to members of your own team who are probably battling with similar things it's easy to get into a spiral of focus and lose the bigger perspective when you're doing this stuff so where are we today the focus at the moment is leaving the detail behind our time is spent thinking about the big building blocks so that we can begin to fit the detail in them in the best way here are some examples this is kind of the business structure ontology which you've already seen here's the it

asset ontology here's the projects here's interactions so once you get into graphing meetings it's actually really valuable well that decision you made that led to that task like that task hasn't been done it's been three weeks so like when you said in the meeting it would be done last week could you move that into your workflow so you can really begin to drive accountability through doing this cool five minutes left here's a good one for third parties this is the kind of stuff that i want to know and link to here's our incident one as it is today here's the threat model this is still

very much an evolution we'll come to this in a moment there are some confusing parts of this graph that don't quite work yet but again like this is where we are right and the goal of this is to share in all its imperfection what's going on and here's the red team project that we've already looked at so the plantuml code for all of these is on github too even if you're not a coder like me you can kind of understand plantuml so you can now go in you can grab all these ontologies you can experiment with them you can change them you can work them back and

get back on github and show us what you're doing and show us you know where we can take the next evolution just by way of a few guiding principles here's some things that i have found helpful to avoid mass refactoring there can be many ways to explain relationships between nodes in different projects but there should be just a few ways to describe node relationships within projects this is okay because the people project is separate to the incident response project man tell me about it so i know it's a lot of information i apologize it'll be over in three minutes now so for both of us like i can't i just need

i need a beer i need to go and cry in a corner so this would not be okay this is not a good example of ontology next be careful of node to edge paths that create unstable narratives so here's a great example you start building your knowledge graph this incident used a vector the threat in this instance used a vector and caused that impact right there it's super clear what's happened bad things happen when you start linking through non-mutually exclusive relationships because the context for which node links to which node quickly disappears and you have no idea of which storyline links to which storyline finally look for node to edge joins that create narratives with the fewest number

of touch points so here's a great example this is just you know a random example like joe bloggs who works for i don't know acme methods he reports an incident if the incident concerns an application plantuml does weird things with the way the visual is laid out you just kind of have to live with that here's a link into the application so we can begin seeing that and then this example which is kind of cool so what we're doing here is we're actually saying we ran a project so for example if you run a project to do technology oversight right and investigate say a firewall you have or you know

you might find a bunch of issues with that waf right and those vulnerabilities can then be articulated to the business in terms of the things they affect and they can be linked to the people that need to make decisions about those risks and we can then propose another project that says hey if you want to fix these things this project is going to do these tasks over this time period to deliver that shift and in that way you can begin to articulate the movement of time and resources through the graph that allows the business to understand where its money is going and then at the end of the year all you need to do is basically

say well like here's the graphs and here's how we've changed them this is basically where your money's gone which is very very helpful and stops the usual kind of zero budget thing that everyone has to do going back to the beginning and building everything up in nightmare excel sheets that said sometimes you just have to accept that security makes a mess out of everything here is a project migration graph before security gets involved and here's what it looks like after you've started adding in what i reckon is a fairly minimal set of relationships in a threat model so in closing i think i've just got time hopefully for this like two minutes what next i have been thinking a lot

about this quote recently from my friend dan about how to put graphs in context one of the reasons that we rely so much in this industry on generic patterns of play and best practices is there is no widely shared knowledge base that helps us identify the right pattern of play for our business despite the allure of consultants selling us what good looks like there is no single repeatable pattern it's more like playing 50 games of chess and changes on one board affect all the others as we desperately try to tailor our strategy and operating model to deliver stage appropriate results and build the boat while we're rowing it it's very hard to understand i find in a

given moment what the best choices we have are if we lack a picture of the landscape and every visit to sfo reminds me of this a short geographical distance that does not account for hills may not be the smartest route simon wardley has written a lot on maps and patterns of play when we think about the inputs and the outputs within controls that rely on data and analytics and we think about what that would look like if we started connecting it to the business perhaps what we need to do is to start considering and combining graphs and maps to understand where we need to put the focus of our investment and those data scientists that we are hiring and who

are trying to do such great work for us for example if the internal feedback loop in our soc looks like this and the one for our red team looks like this and the data feedback between these two controls involves this then maybe the smart place to invest is here incidentally along the bottom axis what you're seeing is things move from genesis a state of kind of unstable creation into commodity at the end so we have a bunch of ideas on this that we just haven't had time to work on if anyone loves graphs and maps please get in touch this is kind of the next phase of experimentation for us where we're going to take stuff i'm also super

excited about building the fair model into our ontology you know jack jones has done some great work in helping us articulate risk and i think some of the stuff we can do with graphs is especially helpful in quantifying unknowns through the lens of knightian uncertainty for example where we're actually unable to say what a risk is because we don't have the information to make that judgment last two slides we face a really tough challenge in this industry to hit a target that's basically context dependent and moving all the time i hope some of what we've shared today helps us all escape a common enemy which is the risk of tron and that's

all thank you so much for listening and i hope it's been helpful [Applause] i've overrun so there's no time for questions but i'll be around all day so thank you
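The constrained-ontology idea at the heart of the talk — agree a fixed vocabulary of node types and edge verbs so every storyline reads as noun-verb-noun — can be sketched in a few lines of Python. The node types, edge verbs, and ticket names below are hypothetical illustrations, not the speaker's actual Jira setup.

```python
# A toy sketch of the "entropy crushing" idea: a small, agreed set of
# (source type, verb, destination type) triples is the whole ontology,
# and any edge outside it is rejected rather than polluting the graph.
ALLOWED_EDGES = {
    ("person", "reported", "incident"),
    ("incident", "concerns", "application"),
    ("project", "proposes", "task"),
    ("task", "mitigates", "incident"),
}

class Graph:
    def __init__(self):
        self.types = {}   # node name -> node type
        self.edges = []   # (src, verb, dst) triples

    def add_node(self, name, node_type):
        self.types[name] = node_type

    def add_edge(self, src, verb, dst):
        triple = (self.types[src], verb, self.types[dst])
        if triple not in ALLOWED_EDGES:
            # the committee's job: keep the edge vocabulary consistent
            raise ValueError(f"edge {triple} not in ontology")
        self.edges.append((src, verb, dst))

    def storylines(self, start):
        # walk outgoing edges and emit human-readable noun-verb-noun lines
        return [f"{s} {v} {d}" for s, v, d in self.edges if s == start]

g = Graph()
g.add_node("joe bloggs", "person")
g.add_node("INC-17", "incident")
g.add_node("billing app", "application")
g.add_edge("joe bloggs", "reported", "INC-17")
g.add_edge("INC-17", "concerns", "billing app")
print(g.storylines("INC-17"))   # ['INC-17 concerns billing app']
```

Because the edge set is closed, you also know what questions the graph *can* answer — which is what makes missing data visible.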
all right welcome everyone good afternoon this talk is grapl a graph platform for detection and response presented by colin o'brien now right before we begin we have just a couple of quick announcements first of all we'd like to thank our sponsors especially our inner circle sponsors critical staff bail mail as well as our special sponsors amazon blackberry and paranoid it's support from these sponsors and other sponsors donors and volunteers that make this event possible also this event is live streamed so as a courtesy for the audience and for our speaker we ask that you right now check to make sure that your phone is set on silent now if you have any questions just so

that our youtube audience can hear you we're gonna ask that you use the question mic just raise your hand and i'll be sure to bring over that microphone for you lastly if you have feedback for this event or for the speaker there is a survey in the sched entry for this talk now we're ready to proceed so please let's welcome colin o'brien

awesome all right thank you all for coming to my talk as you mentioned i'm colin o'brien i'm a software security engineer working at dropbox on their detection and response team if you're not very familiar with detection and response the very short summed up version of my job is that i am tasked with tracking suspicious behaviors across dropbox's various environments if something looks particularly bad i'm going to spend some time scoping it out figuring out the root cause and then if it's actually an attacker or more likely our red team i'm going to ensure that they've been completely removed from the environment today what i'm going to be talking about is a project that i work on outside of

work so this is not affiliated with dropbox it's called grapl grapl is an open source graph analytics platform that targets detection and response work graph is kind of the key word there sort of a graph based approach to detection and response from a higher level as well as how grapl leverages that to make a lot of the things that i do at my job a lot faster and more ergonomic so before i jump into what grapl is let's just do a very quick overview of graphs graphs are a data structure just like a list array or hash map they hold on to information graphs are composed of nodes and edges nodes are going to be oh shoot is it not

updating okay i'm gonna do it like this cool that that should help uh so nodes are um entities or things nouns right uh you could imagine a person maps very nicely to a node edges are going to be those lines in between them they denote relationships between nodes as an example you might have two person nodes and an edge between them because they are friends graphs are a really powerful data structure for a whole number of reasons i think one of the easiest ways to demonstrate that is with a very empty plain graph like this right there's no explicit labels there's not a lot of data here but even still i can say a lot

of interesting things about this graph i can say things like the purple node has a relationship with the green and the blue node and the green node has a relationship with the blue node the reason i can do this is because of a really great property of graphs where they encode information about relationships into their structure itself that's going to make them a very powerful visualization tool but that same property is going to come up in a lot of different places okay graphs are they're out of sync here graphs are a really powerful data structure so of course all these different companies are leveraging them google's knowledge graph powers their search engine facebook's graph api is what underlies

all of their public apis graphs are also leveraged in things like tensorflow as part of the way it represents its computation tensorflow is a form of data flow programming which means it executes as a graph for context tensorflow is a machine learning library that was able to power alphago and defeat the top go board game players and that was a really big deal when that happened graphs also tend to be very emergent so bgp and the internet right is essentially a graph of routers communicating with each other and packets sort of traverse that graph and that just sort of sprung about given that all these companies are leveraging graphs for these different

use cases it makes a lot of sense that the security community has started paying more attention to them a couple of years ago john lambert at microsoft wrote a post about some of the areas where he sees graphs in security in this post lambert makes a very bold claim he states defenders think in lists attackers think in graphs so long as this is true attackers win lambert then goes on to give an example of this list and graph based thinking he talks about how when defenders are given a network to protect one of the first things they'll do is start creating lists such as who are the domain admins who are the high value users what are the risky

assets right and from this work they'll begin prioritizing what they're going to do to defend that network this is very different from how attackers go about doing their job attackers will gain a foothold on whatever asset they can actually get their hands on they're going to leverage the capabilities of that asset such as by dumping credentials from memory using tools like mimikatz and then they'll begin abusing the trust relationships of your users and your network to move laterally across it according to lambert this mismatch in approach is so fundamentally bad for defenders that we simply cannot get around it without a shift in thinking at the end of lambert's post he has this quote he says manage from reality

because that is the prepared defender's mindset so i think if you read that post and you think about what he's talking about and you try to get at that fundamental reality i think there's a more generalized concept here which is that if you take information that fits really cleanly into one data structure such as a graph and then you try to force it into another data structure like a list you lose information you make certain operations less optimal in this case we're losing that trust relationship information that a graph makes very clear but a list completely removes okay that slide is dead cool bloodhound is a tool that i think has

really managed to um let me give me one moment i'm gonna see if i can move this over yep this is why i have two tabs open cool so ah this is the speaker view okay i have three tabs open just backups for my backups okay cool [Applause] very prepared so uh i think bloodhound is a really great demonstration of this graph based approach uh bloodhound allows you to visualize your active directory structures so that you can move from a world where you do things like um say who are my domain admins and start asking questions like what are the paths to my domain admin and it does this by visualizing all of your active directory

data as a graph and letting you query it as a graph so that's a really fundamental shift in how you do that work it's a new capability altogether when you start thinking in that way so in detection and response sort of the fundamental primitive that we use for our work is the log logs are these digital representations of events across a network what we do is we index billions and billions of these logs every single day collecting them and storing them in what's called a siem which is effectively just a giant list of logs and we'll search through that list of logs trying to find suspicious behaviors and if we find them we go back

to those other logs to pivot off them and get other behaviors right so everything is based on this log construct now if you pull out a couple of logs right and you put them next to each other like i have here you can start to see that there's actually these implicit relationships between them right i can see that the pid in one log actually matches the parent pid in a different log right when you start pulling these relationships out and turning them into graphs it becomes very obvious what's going on we're exposing not just the event in isolation but we're telling a whole story about what's happening here we're seeing those

relationships we're seeing those behaviors so this is really what grapl is all about what grapl aims to do is what a siem does with logs grapl wants to do with graphs so your detection work your response work scoping all of that's going to be graph based grapl runs in an aws account so after you set it up what you'll do is send it some raw logs grapl supports sysmon as well as a more generic json format that you can target grapl will do very much what we saw on that last slide it's going to pull out a subgraph representation based on those logs mapping things like pids to nodes it'll perform some identification steps
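That pipeline — raw logs parsed into per-log subgraphs, then merged into a master graph keyed by canonical identifiers — might be sketched roughly as follows. The field names and the hashing scheme here are assumptions for illustration, not Grapl's actual implementation.

```python
# A loose sketch of logs -> subgraphs -> merged master graph.
# node_key stands in for the canonical identifier discussed below:
# pid alone is reused, so mix in the asset and the process start time.
import hashlib

def node_key(asset, pid, create_time):
    raw = f"{asset}:{pid}:{create_time}".encode()
    return hashlib.sha256(raw).hexdigest()[:12]

def log_to_subgraph(log):
    # one process-creation log yields two nodes and one parent->child edge
    child = node_key(log["asset"], log["pid"], log["create_time"])
    parent = node_key(log["asset"], log["ppid"], log["parent_create_time"])
    return {
        child: {"process_name": log["image"], "children": []},
        parent: {"process_name": log["parent_image"], "children": [child]},
    }

def merge(master, subgraph):
    # union each subgraph node into the master graph, deduplicating edges
    for key, node in subgraph.items():
        existing = master.setdefault(key, {"children": []})
        existing.setdefault("process_name", node.get("process_name"))
        existing["children"] = sorted(set(existing["children"]) | set(node["children"]))
    return master

master = {}
log = {
    "asset": "host-1", "pid": 4242, "create_time": 100, "image": "powershell.exe",
    "ppid": 310, "parent_create_time": 20, "parent_image": "winword.exe",
}
merge(master, log_to_subgraph(log))
print(len(master))   # 2 -- one node per unique (asset, pid, start time)
```

The point of the merge step is that every further log about the same process lands on the same node instead of becoming yet another near-duplicate log line.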

because we want canonical identities we don't want to think in pids we want to think about a process being a node as a self-contained identifiable construct these identified subgraphs get merged into a giant master graph so there's this graph database that's constantly in real time being updated representing all the entities and behaviors across your network as that master graph is being built and updated grapl will orchestrate the execution of your attack signatures or what grapl calls analyzers and those are going to perform a sort of pattern matching and query that master graph for suspicious behaviors and finally when enough analyzers go off and you decide that this is something

you need to investigate grapl provides a tool called an engagement and that's going to allow you to really quickly pivot across your logs and fully scope an attack so i'm going to be going over how all of this works sort of the graph based fundamentals behind it and just try to give you a high level overview of what grapl's able to do so when i talk about that master graph i'm really talking about what you see here at least today there are some core provided graph abstractions we have things like processes files external ip addresses and connections to them right so we can represent all of these things in our graph today in the very near future

this should actually all work it just has to go to master i can do things like provide asset nodes as well as internal network communication which would be really great for lateral movement detections beyond that grapl provides a dynamic node construct which the plug-in system uses i'll go into that later but essentially you can just expand this graph with plug-ins however you want you can see that nodes are pretty standard processes have things like process names files have paths and then there are these sort of special edges these special properties like a process has children or files that it's created right and those edges point to a list of other nodes so this is what grapl is under

the hood so i mentioned that there's an identification stage this ends up solving a lot of really important problems that a log based system will have the two problems i run into with logs are that for one thing pids those pseudo identifiers process ids they get reused they're not actually really good identifiers it's the same thing with file paths right if i create a file and delete it and then you know some attacker puts a file there it's important to realize that those are two distinct entities so one thing that instrumentation tools like sysmon will provide is a process guid construct so you don't have to worry about pid collisions you get true identity but it doesn't solve

the second problem i have which is that when i want to understand some kind of construct like a process i will search for that process guid right and okay no more pid collisions but i'm going to have to comb through tens or even hundreds of log lines just to understand what it's doing there's tons of redundancy in between there if you've seen a sysmon log probably 70 percent of that data is duplicated across every log so what grapl's going to do is much like sysmon it'll generate a canonical identifier for any logs that you send up not just windows but anything grapl can parse and then it's going to coalesce all of the unique information into just one

place one node right and that canonical identifier is called a node key i'm going to talk very quickly about one method of identification just so you can kind of internalize what this looks like this is session based identification sessions are things like processes or files they have a pseudo-identifier like a pid or a path but it's only good for a period of time from when the process has started all the way until it's ended right the way grapl solves identification is it will look at logs like process creation or termination logs it'll start building up timelines for every pseudo identifier for every asset here you can see that there's two process creation logs and so we can say that pid 250 on this

asset has the id 0 for the time span of 20 to 50 and then there was another process creation log so we know that there must be a new entity from 50 to wherever that next log is when this other log comes in because the process has actually done something to find its identity we just look it up in the timeline now keep in mind that grapl's going to handle all of these crazy edge cases for you one obvious one is that your instrumentation might start up after a lot of your processes have already started so you'll never get those process creation logs there's a lot of heuristics and other work that goes into handling those cases but this is the

happy path and this is what identification means in grapple all right let's talk about detection so log base detection i think tends to drive us towards uh properties or artifacts right we will look for hashes and we might even look for things like command line arguments i think the command line argument example is a really uh really useful one because command line arguments aren't actually interesting attackers can change them they can use other binary different command line arguments or bring their own binaries right uh what we're using command line arguments for when we use logs to build rules is as a proxy for the underlying behavior i don't care that you know curl executes with dash f i care that the attacker is

sending a file off of the box that's the foundation and logs don't do a very good job of exposing those those core behaviors on top of that those sims the the indexes kind of punish you for writing searches that have to look at more than one log at a time so if i want to pivot in part of my rule going from an execution log over to a network log over to something else the sim might actually just break if i try to do that joining performance is generally something that starts getting exponential very quickly one demonstration of where this relationship based rule uh would be really helpful is if we have these two logs here these are just to process

creation logs one is for word and one is for powershell now these are both uh valid digitally signed microsoft binaries they are almost certainly executing in the vast majority of environments yes powershell is a tool that attackers like to use but so does system admins and again if we if we put these logs next to each other we can see that there is this implicit relationship right and when we turn this into a graph i can see that it's not word or powershell or any properties of those processes that i really want to build a rule around it is the fact that word is executing powershell it is actually the hidden information that i care the most about
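The implicit relationship described here can be made concrete in a few lines. This is a toy sketch (hypothetical field and function names, not Grapl's API): two process-creation logs become an explicit parent-to-child edge, and it is the edge — not either binary's properties — that the rule keys on.

```python
# Toy sketch (hypothetical names, not Grapl's API): two process-creation
# logs become an explicit parent -> child edge. A real system would use
# canonical ids rather than raw pids, since pids collide over time.

def build_edges(logs):
    """Each log: {'pid': ..., 'ppid': ..., 'image': ...}."""
    by_pid = {log["pid"]: log["image"] for log in logs}
    edges = []
    for log in logs:
        parent = by_pid.get(log["ppid"])
        if parent is not None:
            edges.append((parent, log["image"]))
    return edges

def anomalous_pairs(edges, known_pairs):
    """Flag (parent, child) pairs never before seen in this environment."""
    return [e for e in edges if e not in known_pairs]

logs = [
    {"pid": 100, "ppid": 1,   "image": "winword.exe"},
    {"pid": 200, "ppid": 100, "image": "powershell.exe"},
]
edges = build_edges(logs)
# The implicit relationship is now explicit:
# edges == [("winword.exe", "powershell.exe")]
```

If the (winword.exe, powershell.exe) pair is not in the set of known parent-child executions, it gets flagged — regardless of hashes or command line arguments.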

In this case, even more so, it's not Word and PowerShell specifically having that relationship — it's really just the fact that there is an assumption that those two processes shouldn't have a relationship in this environment. So the more generalized, structure-based query that we want to get to is tracking things like unique parent-child executions. When you start designing your attacks as if they are graphs, I think the attack signatures become really obvious. We have executions of Word talking to a non-whitelisted IP address; this one is going to be kind of property-heavy, but the properties are more of an optimization to increase our sense of risk. We can also use more generalized searches, like the one I talked about earlier — the unique parent-child process — more complex searches that require multiple hops, or even some causal analysis, like we have here: a process that has talked to an external IP address, then created a file, then executed that file. Building that kind of rule in a log-based system is prohibitively difficult; you wouldn't reach for it because you would know it would be painful. So my opinion is that this graph-based approach we've designed here is already strictly better than a log-based approach. I can take advantage of properties when I want to — that word.exe talking to evil.com, where I'm thinking a lot about specifics — but I can also track fundamental attacker behaviors. If an attacker has to worry about process executions being unique, you are changing how that attacker is going after your network.

Of course, the truth is that we have to treat that really nicely crafted Word-with-an-anomalous-connection rule very differently from the parent-child process anomaly. One of those is going to fire hopefully never, but the parent-child one could fire fairly often in your environment. For this reason we introduce a concept of risk: I'm essentially labeling the master graph, saying these small pieces here are very risky, and these other pieces here are not that risky. Again, this is a nice improvement; it fits more with my mental model of how I want to track attacker behaviors and make their life harder.

Still, there's one more level we want to get to. As an example, what if it happens to be the case that Word talked to a non-whitelisted domain and also Word has a unique parent-child relationship? I don't want to have to write a whole new signature for that — I already have two signatures here; I want those to just automatically compose together. For this reason Grapl has a concept of an asset lens or a username lens. These lens nodes allow you to view otherwise isolated risks under a specific concept, like a username or a computer. Here we can see that all of the independent risks I had started tracking actually overlap, and I can take that overlap into account when describing the asset lens risk itself. I can say that this specific computer is not just the sum of the risks under it, but add a multiplier — like an extra 10 for every node that overlaps — because it's extra sketchy when these things overlap.

So the actual implementation of this is going to be in Python; that's the language you would use to write these analyzers. I chose Python for a number of reasons.
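A minimal sketch of that lens-style aggregation (hypothetical names, not Grapl's API): sum the individual risk scores, then add an extra 10 for every node that appears in more than one risk.

```python
# Toy sketch of lens-style risk aggregation (hypothetical, not Grapl's
# API): sum the individual scores, then add 10 per node that shows up in
# more than one risk -- overlap is extra sketchy.

def lens_risk(risks):
    """risks: list of (score, node_keys) pairs emitted by analyzers."""
    total = sum(score for score, _ in risks)
    seen, overlapping = set(), set()
    for _, nodes in risks:
        overlapping |= seen & nodes      # nodes already seen in a prior risk
        seen |= nodes
    return total + 10 * len(overlapping)

# Two signatures fire on the same asset and share one node:
risks = [
    (90, {"word-node", "powershell-node"}),   # anomalous parent-child
    (15, {"word-node", "external-ip-node"}),  # non-whitelisted connection
]
# lens_risk(risks) == 90 + 15 + 10 == 115
```

The two signatures compose automatically; no third signature for the combination is ever written.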

My experience with query languages and domain-specific languages, which are very common in the state-of-the-art systems, is that they front-load a lot of their power. They're really purpose-built for specific scenarios, and any time you try to move into some other scenario they start fighting back: you get performance problems, you get huge queries because you can't abstract things away. Compare that with something like Python. Python is an extremely general-purpose language — I think it's probably fair to say it's the language of choice for a large part of both the data science and security communities — you can build out powerful abstractions, and you're never going to feel particularly limited by Python.
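As an aside, here is a self-contained toy sketch of what one of these Python analyzers can look like. The class and method names (ProcessQuery, with_process_name, query_first) are illustrative stand-ins, not Grapl's actual interfaces, and the in-memory "graph" is a plain dict.

```python
# Toy analyzer sketch (illustrative names, not Grapl's real API): flag any
# process whose name is winword.exe and which has spawned a child.

class ProcessQuery:
    def __init__(self):
        self.name = None
        self.require_children = False

    def with_process_name(self, eq):
        self.name = eq
        return self

    def with_children(self):           # unconstrained: any child matches
        self.require_children = True
        return self

    def query_first(self, graph, node_key):
        node = graph[node_key]
        if node["process_name"] != self.name:
            return None
        if self.require_children and not node["children"]:
            return None
        return node

def analyzer(graph, node_key, hits):
    node = graph[node_key]
    if node.get("node_type") != "process":
        return                         # file or network nodes: not relevant
    q = ProcessQuery().with_process_name("winword.exe").with_children()
    if q.query_first(graph, node_key) is not None:
        hits.append(("word-spawned-child", 90, node_key))

graph = {
    "k1": {"node_type": "process", "process_name": "winword.exe",
           "children": ["k2"]},
    "k2": {"node_type": "process", "process_name": "powershell.exe",
           "children": []},
}
hits = []
analyzer(graph, "k1", hits)   # fires: Word executed a child process
analyzer(graph, "k2", hits)   # does not fire
```

The real analyzer described next has the same shape: ignore irrelevant node types, constrain a query, and emit a hit with a risk score.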

There's no scheduling of a search query with periods or anything like that; it's just going to happen on every single update. This function will get called and passed a client to talk to the master graph; you'll get a node view — node views are a concrete representation of some node that already exists in the graph — and then a sender, which we'll use to emit hits. This analyzer is going to look for suspicious executions of processes whose parent process is Word, because we don't expect Word to be creating subprocesses. This query only involves processes, so if it's not a process node we're just going to ignore it — if this is a file node or a network node, it's not relevant to this signature. We'll create our process query and constrain it: it has to have the name winword.exe, and it has to have some children, which we won't constrain — there's no whitelisting here to do; any child process is going to be suspicious to me, and I want to track it. I'll then call query_first and pass it the client, and I also pass it the node key for the node that was just recently updated. That's really important, because it allows this query to execute in constant time. So you might be thinking:

this graph is going to have billions and billions of nodes, and this analyzer is going to get executed over and over again, easily hundreds of times in a minute. But it's always going to execute in roughly the same amount of time, because it's a constant-time operation. You're going to notice that's a trend in Grapl: operations will always be constant time whenever possible. The reality is that we collect more and more data every single year; even just having linear access times is not going to scale to next year, when I've doubled the volume. So if we get a response back, we will emit the execution hit: I'll give it a name, I'll give it a risk score, and I'll say what the concrete node was that I'm considering sketchy — so pass in p — and, you know, this is pretty bad, so give it a risk score of 90. One example of how Python is able to provide these powerful abstractions is that we can leverage a parent-child counter that Grapl provides. This is just a specialized interface; it encapsulates all the nice logic for us and exposes a single method called get_count_for. We don't have to worry about how this works under the hood — and there's actually a lot to it. For example,

there's actually a Redis connection behind this parent-child counter, and in the vast majority of cases, because of the constrained API, we don't even have to talk to the graph database to get these counts. This is just an example of how you can build really powerful, reusable constructs in a way that most SIEMs are not going to allow you to.

We can also build out more complex queries. Here we have a signature for processes that have a binary file, where that binary file was created by what I'm calling an unpacker — something like 7-Zip or WinRAR. Here we've got a really low risk score; I'm calling it 15. If you try to spend your time whitelisting something like this, you're just going to waste hours or days. You're never going to finish whitelisting it, because new software will be deployed and it's going to use this same approach — but it's still an interesting behavior that I want to track in my environment. So I can move away from thinking about whitelisting in black-and-white terms of good and evil and just say: let's track this, and if it correlates with something else, we get that nice multiplier because of lenses, and I don't have to waste my time. That's really important — I've spent a lot of time on whitelisting searches that would just be better served if I could downgrade them.

I think maybe the best part about leveraging tools like Python is that you get to benefit from standard best practices. If you compare the number of people writing Elasticsearch's query language or Splunk's query language to the number of people writing Python, it's orders of magnitude of difference. If you start googling how to write a unit test for Splunk, you're not going to get very nice results; you will find that it is essentially unsupported. Compare that to Python: it's in your standard library — just import unittest, that's it — and we can already start building out testing infrastructure. I can deploy this code to GitHub, which Grapl supports via git hooks; I can add linters and code reviews; I can roll back and revert changes if they're broken; continuous integration. My opinion is that as an incident response team starts to scale, alert management is one of those problems that creeps up on you more and more, and you're going to say, "I thought we had an alert for that," and it turns out it was actually just broken the whole time — you didn't have tests for it. So this is a huge value-add for actually managing the searches that you're creating.

Cool, I'm going to talk about investigations. Log-based investigations usually start off with one or more logs, and maybe a ticket that tells me why I should care about the information in this log — here, maybe that hash is just known to be bad for some reason. The way I usually start off an investigation like this: I'm going to take a look at the suspect process and see what it's done, but I'm not going to spend too long on it; I'm immediately going to start tracing it backwards to find the root cause and see where this thing came from. So I'll open up a search window in my SIEM — say the last eight hours, somewhere around a business day — and pick a field to start looking over. Let's look at the pid: I want to see what this process has done, and I'll get tens or hundreds of logs back. Maybe I get too many and have to spend some time cutting it down, but okay, I've got a general idea of what this thing is doing. It doesn't look great — not an obvious false positive — so it's time to start tracing it backwards. Let's look at the parent pid. Maybe I find that the parent is something like launchd or cron. It's legitimate — this is a dead end; the attacker has set this up to execute in a week or two weeks. So I'll start looking for that file; that's the only thing I can start pivoting off of. I'll search for the file's hash, and I don't get anything back. Not great, but okay, there are a lot of reasons why that might happen. I'll try the image name, but I'm still not getting anything; clearly this file was dropped a long time ago, so I'm not getting any new logs back. What I have to do at this point is extend my search window back, and now I've got some logs related to that image name, which is great — but I'm paying a very serious cost. These are linear searches at best, which means that if I extend my search window by, say, doubling it, every search from this point forward is now twice as slow. On top of that — and really much worse — I now have pid collisions. Pid collisions, in my experience, on a client laptop running Chrome in particular, are going to happen basically every couple of hours; if my investigation runs for days, they are essentially a guarantee. Pid collisions suck. They are really annoying to deal with — you have to start saying "this pid after this time but not before this time," stuff like that — and it's because logs don't have that strong sense of identity.

There are clearly a couple of other problems here. One that's maybe a little harder to see is that I don't actually have a good idea of how I'm pivoting. I want to know more things about that file — really, I just want to know what created it — but all I've got is this hash, and I'm just going to hope it shows up somewhere in other logs. I don't know that I'm pivoting to the information that I want.

I don't know if it's on the other side — let me switch tabs and see what we got. Oh wow, that's not good. [The demo fails to load.] All right, sorry, bear with me for exactly two minutes while the speaker wi-fi is not working for me,

and so I'm going to tether to my phone, which will just take less than 30 seconds — I apologize. Tethering, mobile hotspot, let's connect to the wi-fi and reload... cool. Not what I was hoping for.
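While the demo reconnects: the session-based identification described earlier is exactly what removes the pid-collision pain above. A minimal sketch of that timeline lookup (illustrative only; names are hypothetical, not Grapl's actual code):

```python
import bisect

# Sketch of session-based identification (illustrative, not Grapl's
# code): per (asset, pid) we keep a timeline of (start_time, canonical_id)
# entries built from process-creation logs; resolving any later log is a
# lookup for the last session started at or before the log's timestamp.

class SessionTimeline:
    def __init__(self):
        self.timelines = {}              # (asset, pid) -> [(start, id)]
        self.next_id = 0

    def on_process_created(self, asset, pid, ts):
        key = (asset, pid)
        self.timelines.setdefault(key, []).append((ts, self.next_id))
        self.timelines[key].sort()
        self.next_id += 1

    def resolve(self, asset, pid, ts):
        timeline = self.timelines.get((asset, pid), [])
        # Find the last session whose start time is <= ts.
        i = bisect.bisect_right(timeline, (ts, float("inf"))) - 1
        return timeline[i][1] if i >= 0 else None

t = SessionTimeline()
t.on_process_created("host-a", 250, 20)   # canonical id 0, valid from t=20
t.on_process_created("host-a", 250, 50)   # pid 250 reused: id 1, from t=50
# A log for pid 250 at t=30 resolves to id 0; at t=60 it resolves to id 1.
```

Once every log is stamped with a canonical id, "pid after this time but not before this time" gymnastics disappear.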

Okay, that should do it — I again apologize for that, and it's going to be slow wi-fi at that, so, spoilers. Cool, okay, that didn't take too long.

Grapl takes a completely different approach to investigations from this. At the heart of Grapl's investigation process is the Jupyter Notebook; maybe some people in this room are familiar with it. Essentially it's a Python environment that you can interact with in your browser, and you can do all these crazy things: split your Python code up into cells, inline Markdown, upload images, replay different cells. It's a really powerful tool, and importantly, it is the tool that the data science community has been leveraging for years. My opinion is that the detection and response field has a lot of intersection with the data science field — we do a lot of the same work; we're all just trying to hunt through data to find something that looks like signal — so we should be paying attention to what that looks like.

[Switches tabs.] So Grapl has a sort of two-browser-pane user experience. In one pane you have a live-updating view of your engagement. The engagement graph that you see here contains just two nodes: one represents the engagement and the metadata around it; the other is this svchost.exe. As I mentioned earlier, we don't have to comb through hundreds of logs — I can see every unique piece of information about that svchost right there; it's all in one place for me. At the bottom of the screen you can see an excerpt from a Jupyter notebook where I actually instantiated this engagement, called demo, and then pulled in a node based on its node key. I can't really show both panes at once very easily, so I'm going to show you the engagement and then the graph.

Essentially we're going to kick this thing off. I'm going to do exactly the same workflow as before: I want to understand what a process has done. One of the most important things to me is which child processes it has executed, so I'll say get the children for this process and just print out their process names, and I can see cmd.exe three times. This thing is shelling out to subprocesses through cmd — obviously really sketchy. Again, this is a constant-time operation: it doesn't matter how much data is involved, it doesn't matter if cmd.exe was executed months or a year later; these edge lookups are constant time. All of these get_ methods are also causing that live-updating graph view to pull in the new nodes, which you'll see in a moment.

So let's keep going; let's trace this process backwards. I'll go up its process tree: we can see its parent is cmd.exe, which you would see in the graph, and there's a grandparent process called dropper.exe, and we can keep going, and eventually we'll see that the user downloaded this dropper.exe from Chrome. This is what the graph ends up looking like, and it's a pretty nice story: the user executed Chrome from Explorer, Chrome executed this dropper process, we've got dropper shelling out to svchost, and then svchost shelling out to these other cmd.exe processes. As you type the commands on the right side, you see all those nodes automatically get pulled in on the left side.

So we've solved a lot of the problems we talked about with the logging system. I'm not fighting my data anymore. I don't have to worry about how much data is there, because it's all constant-time operations. There are no search windows — before, I really wanted to keep my search windows small so that my queries would run fast, but I also wanted them as large as possible so that I could search over the most relevant data; here that's a non-issue, we don't even think about search windows. I have identity, so I don't have to look at multiple logs or nodes; I just look in one place and see all the unique, relevant information. And I have my pivot points: if I want the children, I just ask for the children — I say get_children — I don't have to search for child pids and hope I get the information back.

That's all the graph stuff for Grapl — the detection, the response, and the identification. There's also another important word with Grapl, and that is that it is a platform. Grapl does not intend to solve every single problem on its own; that would be silly. Instead it is built to be very modular and extendable, and it actually provides

a plug-in system. On top of that, everything in Grapl works through event submission or receiving. You have these AWS Lambdas — those are just serverless compute functions — that get triggered every time something is uploaded to specific AWS S3 buckets (that's just a storage interface), and they can read and write to those buckets and even emit new events. So it's very easy to extend: if I want to add, say, a custom parser, all I have to do is subscribe to the right event stream. Grapl provides a plug-in system for this. The plug-in system is still in its early stages, but this is fairly representative, and it will only get better from here.

There are three main components to building a plug-in for a net-new source type. In this case I'm going to walk you through setting it up for AWS GuardDuty — that's an Amazon offering where you pay them money and they send you logs telling you when something bad is happening in your environment. We're going to build a subgraph generator, which parses logs and turns them into graphs — that's what you see here. We're going to build that query construct, like the process query we saw earlier, so we can build analyzers around these EC2 instances and GuardDuty alerts. And we'll build a view construct so that we can represent the concrete nodes in that graph. The first component is built in Rust; the other two are built in Python. I do not expect anyone to be particularly Rust-savvy — this is pretty simple code, it's mostly boilerplate, and the macros take care of a lot of it.

So what we're going to do here is just focus on this AWS EC2 instance. At the top here I'm just putting whatever properties exist on the node. You can see there's an ARN — that's the Amazon Resource Name, and it uniquely identifies the resource; every EC2 instance or IAM user, anything in AWS, always has a strictly unique identifier. We also have a launch time, and there are other properties that I've omitted here just for screen space. I've added a macro — which is not showing up super well, gray on gray — but essentially what we're saying is: turn this structure into a dynamic node. That dynamic node is a construct that Grapl knows how to understand, and we'll identify that node using a static strategy. Earlier you saw the session-based strategy; static strategies are even easier — it's just a lookup. We can say the ARN maps exactly to its canonical id. That macro generates two things for us: one is the dynamic node version of that structure, so it'll be AwsEc2InstanceNode, and then an interface for that structure. Implementing that interface is pretty trivial — it is one method, and it is always the same exact code — and that's going to allow Grapl to do all these things on top of that node. Everything from here is pretty simple: we get logs, we parse them using a standard JSON parsing library, we get a structure out, and we create our node — we just instantiate it by calling new, passing in that static-strategy method generated by the macro, so you don't have to implement it yourself. At this point we just populate the node with information: we'll set the ARN, we'll set the launch time, and we put it into

a graph description construct, and that's it — this is all the code that's necessary. My opinion is that if you do this once, it'll be really easy to do a second time. You'll probably trip on a couple of things from not knowing Rust very well, but it's really quite simple once you know what you're doing, and everything from here on is going to be Python, which will be even easier.

This is the EC2 instance query — you saw the process query earlier. This is how our analyzers are going to start looking for sketchy patterns involving EC2 instances. Really, technically, the only thing you have to write is this code here; it's just a constructor. We inherit from the dynamic node query, and it provides all of these interfaces under the hood for us; but so we can offer a prettier API, we wrap those internal interfaces with ones that have nicer names, like with_launch_time or with_instance_id. And that is it — this is all the code you need to start building analyzers.

So, views are what we use in engagements, and they are what we actually consume from the analyzers when something is updated. We'll create a view construct for this AWS EC2 instance. It's really quite simple: a constructor that has to take a couple of values — the Dgraph client, which is the graph database that the master graph uses, a node key, a uid, and then whatever properties our node has. Pretty simple: construct your superclass, and again, mostly we'll be adding helper methods. There are two methods we do have to implement here, and they're pretty simple — everything is just a mapping of field name to type: launch_time is an integer, arn is a string, that's it. Because the Python code doesn't know the schema of the graph database, you have to create these mappings for it. Same thing for the edge types. There are reverse edges, denoted by tildes, so

AWS EC2 instances have no forward edges, but they do have reverse edges — like all the GuardDuty alerts that they are a part of. So we say: for any finding-resource reverse edge, there can be many GuardDuty alert views, because we might be a part of multiple alerts; and when we, from an EC2 instance's perspective, talk about those alerts, we call them GuardDuty findings. It's the mapping of a reverse edge to a forward edge, and saying what the type is. And again, we'll add these helper methods. So this is all pretty simple — these are all of the constructs that you technically need to implement to plug into Grapl. This is still a work in progress; I'm hoping to cut some of this down and provide more intuitive interfaces, but it's really not much work. I built this GuardDuty plug-in — which included users and alerts and all these other things — in, I think, less than an hour, and I was building the system as I was doing it, so it's pretty quick.

There is only one more piece here, and that's the actual deployment. Technically you can deploy this however you want, as long as it goes into EC2, but Grapl provides a construct for the AWS Cloud Development Kit — that's an infrastructure-as-code group of libraries. Grapl provides a library in TypeScript (it'll be in Python soon, but Python wasn't supported when I started building this). It's pretty simple: we use the event emitter construct to create a new event emitter for GuardDuty logs, so AWS will ship those off into an S3 bucket, and this sets up all the events and notifications; a service construct sets up our AWS Lambda, which will actually run the subgraph generator; and we set up an integration with the output bucket — the place we emit new events to, in this case the unidentified-subgraph bucket. And that's it: you run cdk deploy, all of your code goes up there, and it's all managed by AWS. Really nice and easy.

Setting up Grapl is intended to be as easy as possible. If you clone the repo and go to the grapl-cdk folder, you can fill out a .env file. There is one field in that file, and it is a bucket prefix — just use an org name, anything that's not going to be taken by somebody else; it just has to be unique — then run the deploy-all script, and that's it. It'll take about 15 minutes, but you don't have to sit there and even hit yes; it'll take its time and set up those resources. At this point, once you've run this script, the Lambdas are all set up, the graph database is set up, everything needed for identification, the S3 buckets, and the user interface we saw earlier for engagements — that's it. About 15 minutes, a couple of commands.

There is one last thing that has to be done: we provision the schemas for the graph database so that Grapl knows how to talk to it properly. For this we go to the AWS console, select SageMaker, and pick the notebook that Grapl has already created for us — at this point, either create a new notebook or just use the one provided by Grapl — and run it; it's going to provision everything, and at that point you're done. Send up some test data, which is also part of the repo — these are Sysmon logs. You can see that dropper.exe and the malicious svchost; you can investigate those, check out what the network traffic was — dropper.exe pulled that file from somewhere, and you can figure out from where — look at the child processes and see what they're doing, and perform your own investigations.

So I believe I should have time for questions, because I went as quick as I possibly could. But like I said, Grapl is open source — if you're interested in contributing or in using it, feel free to hit me up; I'm always happy to talk to people about it.

[Applause]
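As a companion to the plug-in walkthrough above, here is a minimal sketch of what the static identification strategy amounts to. The function name and hashing scheme are assumptions for illustration, not Grapl's implementation.

```python
import hashlib

# Sketch of the "static" identification strategy (illustrative, not
# Grapl's code): an AWS ARN is already globally unique, so the canonical
# node key can be a pure function of the ARN alone -- no session
# timelines needed, just a lookup.

def static_node_key(arn: str) -> str:
    # Derive a stable, collision-resistant node key from the ARN.
    return hashlib.sha256(arn.encode("utf-8")).hexdigest()

# Example ARN (made up for illustration):
arn = "arn:aws:ec2:us-east-1:123456789012:instance/i-0abcd1234"
# The same resource always resolves to the same node, no matter which
# log source mentioned it.
```

Contrast this with the session-based strategy shown earlier: processes need timelines because pids are reused, while an ARN never is.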

Nice job. A couple of questions. First, how do you think about recursive state? For example, if you look at those EC2 instances, network firewall rules might go on, off, on, off — how do you think about that? And second, two entities with multiple edges — because there could be multiple relationships between them — how do you think about that?

Yeah, those are both really good questions. I would say that by far the hardest things I deal with in Grapl are modeling questions like that, because they have really big implications for the workflows. I'm going to answer your second question first; it's a little easier. What that comes down to is just a decision. As an example, right now Grapl does not have a concept of writing into a file multiple times. That's going to change by having an intermediary node — call it a write node — which will store the metadata for every single write. That's usually the abstraction you want, and then the query interface can abstract over it very easily; it's really just a question of whether you want to differentiate those different writes, which has come up a lot with Grapl. Connections are another good example: the way connections exist in Grapl, you have a process with an outbound connection, which has an edge to an inbound connection, which has an edge to a process. So there are these intermediary nodes we can hold that information in. It's really just a matter of making that call.

As for recursive systems — I'm not sure I fully understand exactly what you're asking; maybe you could clarify? I think I kind of understand the IP question.

So it's not as much about recursive systems as about successive state changes. If you think about something like a firewall rule, typically a vulnerability gets set, gets opened, gets closed — that time sequence. It's not clear how that's represented, because it's the same state being changed multiple times on the same entity.

Yeah, that's a really good question, and again one of the tougher modeling questions I've run into. State changes are something you can abstract away by saying: here's a node representing the firewall rule from this time to this time, and another node with the new firewall rule. If you wanted to expose that for something like a firewall rule, that makes a lot of sense, and you could build it into your EC2 plug-in.

To clarify — in both those cases, those somewhat permute the constant-time calculation piece, right? Because those grow linearly or exponentially based on the number of connections or states.

Yeah, totally. Let's say your firewall changed a hundred times; then you have to perform a hundred edge expansions. The good news is that edge expansions are extremely efficient, and hopefully your firewall isn't changing millions of times — but yes, a totally reasonable point. One thing I've thought about: the nice thing about being in AWS, and about being a platform, is that we're not constrained to the graph database. If I wanted to, I could have a DynamoDB table that just tracks firewall state for everything; the analyzer would write to that table and then query it, saying: okay, summarize this information — has it changed for this port? So in theory, if you wanted to solve that problem, there's nothing stopping you from setting up a new system. In terms of representing it in the graph, it's a hard call; it really depends. For a firewall I might be inclined to put it into the graph, because I don't expect tons of changes. Graphs are awesome for a huge amount of these workloads — I think they're a great native fit for most cases — but if something doesn't fit, set up another database and put your data in there. Optimize for the workload you're looking to solve.
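The time-bounded-node idea in this answer can be sketched as follows (illustrative only; the class and method names are hypothetical, not part of Grapl):

```python
# Sketch of "state change as time-bounded records" (hypothetical names):
# each firewall-rule revision becomes its own record with a start time,
# so "what was the rule at time t" is a timeline lookup -- and, as
# discussed, cost grows with the number of changes.

class RuleHistory:
    def __init__(self):
        self.revisions = []              # list of (start_time, rule)

    def change(self, ts, rule):
        self.revisions.append((ts, rule))
        self.revisions.sort()

    def rule_at(self, ts):
        current = None
        for start, rule in self.revisions:
            if start <= ts:
                current = rule           # latest revision at or before ts
        return current

h = RuleHistory()
h.change(10, "port 22 open")
h.change(50, "port 22 closed")
# rule_at(30) -> "port 22 open"; rule_at(60) -> "port 22 closed"
```

Whether these records live as nodes in the graph or as rows in something like DynamoDB is exactly the modeling call described in the answer.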

cool hey uh thanks for the great talk and of course for open sourcing it um i was just wondering i don't know to what extent you have used this or deployed this at scale and relied on it uh but sort of the fundamental basically the splunk problem that everybody's got is sort of the volume of logs that you've got to deal with and uh it sounds like to some extent you know you're getting rid of a bit of the redundancies that you would deal with all these like tons of logs revolving around the same thing yeah do you have a sense of sort of the storage efficiency that you can reach by having sort of everything already

identified codified into this graph database versus holding on to all the logs that you would over a month two months three months for an entire fleet yeah that's a good question so um in terms of scalability it's kind of variable right so the logs that i work with at home are going to be sysmon logs and sysmon logs just they have so much redundant information um i have something like uh 60 megabytes of sysmon logs i run tests with it it takes 3000 seconds the actual storage footprint is about an order of magnitude less than that in terms of the data that grapl actually has to hold on to that should scale really well as you add

more and more redundant data but it's going to be super workload dependent right so if you have i think the good news is that for the systems that are the worst offenders for something like splunk they're actually the best case for grapl so an extremely noisy process that's doing tons and tons of things that's grapl's best case because very few of those things will actually be unique in terms of the logs uh for systems where it's like tons and tons of unique different processes and files executing it will certainly be no worse than splunk um its worst case is linear scaling so yeah it's hard to give a good metric there because it's so use case

dependent uh the system itself has a huge factor on it uh the log type has a huge factor on it so it's hard to really answer that directly yeah but it is no worse than splunk certainly and i have felt those exact pains that's always the goal yeah anything else
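The "noisy process is the best case" point above can be sketched as: identical log events collapse into one node, so a process repeating the same action bumps a counter instead of storing a new row. The node identity key below is an assumption for illustration, not Grapl's actual scheme.

```python
# Illustrative dedup: redundant log lines map to a single unique node.
def ingest(events):
    nodes = {}  # identity key -> node with a hit counter
    for e in events:
        key = (e["pid"], e["image"], e["target"])
        node = nodes.setdefault(key, {"count": 0, **e})
        node["count"] += 1
    return nodes

# 1,000 log lines from one noisy process touching the same file:
events = [{"pid": 42, "image": "svchost.exe", "target": "C:\\tmp\\x"}] * 1000
nodes = ingest(events)
print(len(nodes))  # 1 unique node instead of 1,000 stored lines
```

Fully unique events would produce one node each, which is the linear worst case the answer mentions.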

do you see uh standardization coming down the pike for like predicate names types or like rdf namespaces as this concept gets more popular yeah absolutely uh so the biggest focus that i have been having over the last couple of months is getting to stability api stability that's actually why the plug-in system exists so that i can stabilize the core of grapple start moving other concepts into separate systems that can stabilize at their own pace most of what you saw is entirely stable so the query interface for processes should be stable that's not an api guarantee yet but i intend it to be very soon the common information model that i'm using uh is standard it's it's a little

bit modified because i have all these edges and stuff like that but for the most part those properties are are absolutely stable edge names it's it's subject to change at this point um on top of that i mentioned that i'll probably have these uh sort of intermediary nodes to handle things like writes once that's out of the way uh which is probably a matter of weeks i will be stabilizing it and i'll make strict api guarantees yeah but that's it's a big focus right now

[Applause] sorry about the wi-fi issues by the way

hey man i just want to introduce myself i'm urban good to meet you yeah you as well do i i know you from the selection committee that's what i thought okay cool cool yeah yeah it's great i'm really glad to be here i was uh i don't know how to get this thing off of me i don't want to break it yeah great job though thank you very much yeah i um i was telling you thank you uh b-sides was my first conference uh when i when i first got into security and it's been really nice to actually come out here and and for one thing provide an open source project but also to give

a talk and everything so yeah please do if you run into any problems or you got questions just hit me up this is uh anytime after 5 p.m i'm probably available for this so that's good yeah yeah yeah you as well i appreciate it hey good to meet you yeah thank you as you think about ec2 and other infrastructure

not so terrible just normal good email and i can label these as such and then i can feed these to my classifier and they now know my definition for spam and my definition for ham cool so then i can take this other text down here this other like set of emails and i can feed it to my classifier and based on the definitions i've provided my classifier will assign a label to these emails right so it's kind of like if i tell you to draw an apple and i put an apple on the table in front of you or give you a picture of an apple so i'm telling you both what i want from you but also

giving you an idea of kind of what that should look like so contrast that with unsupervised learning so this is when we provide input and we just let the model figure it out we don't define the categories as we did in the spam classifier example but we let the model do it for us and a really popular example of this is using an algorithm called k-means to cluster handwritten numeric digits together so we don't provide any source of truth we just say you know sort all of these into groups that look the most similar to each of the others if that makes sense and k-means is actually one of the algorithms we're talking about in a

moment okay data types so categorical these are categories these are things that don't have numeric properties and the numeric i think that's pretty self-explanatory they are numbers um so before we get into talking more about the results i want to talk about imbalanced classes and so this essentially is when the number of observations or data points in each of your classes or groups is not evenly distributed this is a really really important concept if you're wanting to do any sort of anomaly detection or detecting things that don't typically happen a lot of the time so to kind of explain how this would work with mailchimp so here's here's our user base and these are normal users the

majority of all of our users these are people who are just marketing their small business they're creating landing pages for their like community group whatever they're doing normal stuff and then we have these users so these are the users that we're talking about these are the ones we're interested in they're doing bad stuff uh they've been attacked or they're sending malicious content these are the ones that we really want to focus on so if i were to randomly sample this data and let's say just pull you know 10 000 users out of the entire mailchimp user base the likelihood that the distribution of okay versus not so okay users would match this is really high like i'm

probably going to get the same distribution in my sample data set which is great from a sampling perspective but it's not so great from the perspective of training our model to recognize these because there aren't enough examples of these if i just sample randomly so that's something that we're going to have to address when we get into feeding data to the model okay so now we're going to talk about the models and i will say this is i'm going to talk through this this has been a trial and error process and it's also still in progress this work isn't finished so it's not going to be like a super straight shot of just i did this and i

did this and it worked now it's a little bit more real than that so i hope that that's useful okay so first we have to figure out what data we're going to feed to the model and there's a lot of options this is obviously not all of them but it's just a sample of them um there are tons of different ways i could look at this and i eventually ended up filtering things down to something that looked like this and you'll notice that these are mostly security related or what i would consider security adjacent features so i want to call attention to this specifically because this is where in particular if you've been in

infosec for a long time you've been in security space for a long time and you want to do stuff like this this is where your expertise is so critical and this is a place where you can really shine even if you feel like you're not sure about the modeling piece of it because you already have a sense of what might be important like i chose these features because i thought i feel like these are going to have some bearing on how users might be grouped and they might have some bearing on whether a user is deemed more or less risky or at risk of being abused or attacked and so i just want to point that out

like domain expertise here is really really really important and can save you a lot of time so yeah so i start with security only or security adjacent features and seems like a good place to start so this is kind of what my data looks like this is a simplified version of course but you can see that i have two factor status represented and then the two other features that we talked about earlier times found in breaches and then the number of logins and you can see how these accounts differ one thing i want to point out here is that these accounts are excuse me this variable the two factor uh presence or absence this is a

categorical variable i've encoded it numerically because i want it to work with a model that takes numbers but these are categories so just keep that in mind so i started with k-means um i started here because this was really like i was just sort of sitting at my desk one day and i was like this would be a cool thing to do okay k-means is a thing to cluster okay i'm going to use k-means that was literally how i selected this to begin with okay so i mean come on like the one of the cool parts about this type of stuff is that it's all sort of like try stuff and see what happens

like there's experimentation to this and so that's that's really fun um this part wasn't so fun but anyway um so k means essentially it's it's a distance based algorithm and it i will point out it only works with numeric data so that will be important but the way it works is just like the geographic segmentation that we talked about earlier so it basically will take a distance measure of each of the data points so these will be your cluster centers the restaurants and then it will measure the distance from each of those data points to each cluster center and then assign each of those data points one of those clusters cool okay so i said that you know k-means requires

numeric data well most of the um a lot of the features that i had were categorical in nature so i had to transform them and there are a number of ways to do this but i essentially encoded them each uh each of these things as numeric values because i was like you know that seems like that'll work so i ended up with what essentially looks like a sparse matrix of ones and zeros for these mostly binary categorical features so k-means uses distances right so what could end up happening here is k-means is going to consider as really close objects that are actually very distant from one another just because they've been assigned two

close numbers so just because two things have the value of one doesn't mean that they're close it just means that was the only other option because it's a binary variable so i stopped right here i didn't actually go further with this i was like no this is a waste of my time i'm not going to do this but it wasn't really a waste because it kind of made me think more about how this would work so it was like okay i have categorical data let me find something that's going to work with that instead of trying to fit my data to this model is a weird way to say that instead of trying to make my data work with this

model let me try and find a model that will work with this data so i found k modes and k modes is pretty similar in concept to k-means but instead of distances it uses dissimilarities and so essentially um it quantifies the total mismatches between two objects or two data points and the smaller that number the more similar the two objects are the larger the less similar they are instead of means it uses modes so here's an example of the data that i fed to uh k modes again you can see a lot of a lot of this is binary categorical there's a lot of other stuff that's over off the screen that you can't see
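The k-modes dissimilarity described above can be sketched in a few lines: count the mismatches between two records of categorical features. The real `kmodes` Python package implements the full clustering algorithm; this is just the distance measure, shown on made-up account records.

```python
# Minimal sketch of k-modes' matching dissimilarity: the fewer
# mismatches between two categorical records, the more similar they are.
def matching_dissimilarity(a, b):
    """Total mismatches between two equal-length categorical records."""
    return sum(x != y for x, y in zip(a, b))

# columns: has_2fa, found_in_breach, role (illustrative features)
u1 = ("yes", "no",  "owner")
u2 = ("yes", "no",  "viewer")
u3 = ("no",  "yes", "viewer")

print(matching_dissimilarity(u1, u2))  # 1 -> quite similar
print(matching_dissimilarity(u1, u3))  # 3 -> very different
```

Unlike Euclidean distance on one-hot encodings, this never pretends that two arbitrary category codes are numerically "close".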

and so here's what i ended up with all right so after trying a number of different numbers of clusters trying different ways of doing this i ended up using eight clusters just for no real reason let me be very clear no real reason again i'm just kind of throwing stuff together and seeing how well it works and i ended up with this distribution so each of these here to orient you these are these numbers over here on the left are the classes or the clusters and over here are the counts of users who fall into each of those clusters so you might think as i did oh my gosh okay these are the anomalies that

five and four and seven are the classes this is what i wanted to find that was not actually true um it ended up being that there were a couple of accounts that looked really similar they just had different role-based like access controls like one was a viewer one was an author so that wasn't quite as exciting as i hoped and then i started reevaluating uh everything like all of my choices and um and so i went back to the drawing board and i was like okay um let's add some features maybe instead of focusing on security attributes only maybe what i need to do because the whole point of this is to generate kind of a holistic picture of a user and

understand all of the attributes that go into that whether they're more or less secure and it's not just security related attributes it can't be that just doesn't make sense and so i ended up adding a couple of other features things like account size so like how many email addresses do they have saved in their account how much do they pay us do they pay us at all and a couple of other things in addition to these security features but then i wound up with both categorical and numeric data so that's a problem because now i've got to find another algorithm and i actually came across this i don't want to call it

obscure but there's not a whole lot of stuff out there about it i found this algorithm called k prototypes and it was introduced i think in 1997 or 98 paper um that was essentially for this type of problem not necessarily security related but clustering with mixed data types and i was like okay you know what this seems like this might work pretty well and so i have my data both you can see we've got like a nice mix of different types of attributes and uh okay let's run it through k prototypes k prototypes by the way works kind of it's the same idea as k-means and k-modes uh but it uses dissimilarity instead of just pure distance
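The k-prototypes idea (Huang, 1997/98) can be sketched as a combined measure: squared Euclidean distance on the numeric features plus a weighted mismatch count on the categorical ones. The gamma weight and the features below are illustrative assumptions, not the talk's actual configuration.

```python
# Sketch of the mixed dissimilarity behind k-prototypes.
def mixed_dissimilarity(num_a, num_b, cat_a, cat_b, gamma=1.0):
    numeric = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    categorical = sum(x != y for x, y in zip(cat_a, cat_b))
    return numeric + gamma * categorical

# numeric: (logins, times_in_breaches); categorical: (has_2fa, role)
d = mixed_dissimilarity((120, 2), (118, 2), ("yes", "owner"), ("no", "owner"))
print(d)  # 4 numeric + 1 categorical mismatch = 5.0
```

Gamma controls how much a categorical mismatch counts relative to numeric spread; the `kmodes` package's `KPrototypes` estimates it from the data if you don't set it.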

so we now have this and again these are the clusters and these over here on the right are the counts of users that fall into those clusters and so okay this looks a little more balanced than the results i got before but it's still kind of i'm not really sure what to make of it so like i'm curious specifically about clusters one and seven just because they're small and they seem like they're outliers and so i was just very curious my like the security mind in me was like okay two-factor authentication maybe there's something different about them here and there was so um these were the charts that i showed you earlier and so cluster one

and cluster seven again just two separate groups of users and they these these two clusters had the highest rates of adoption of two-factor authentication across all of those clusters so there's something and then these clusters four and five were the clusters that had the lowest uh across the board adoption of two factor and cluster four and five let me just go back for a second so they're not the largest but they're somewhat similarly sized so that's kind of interesting as well this this is a really i'm so sorry for the quality of this um this is really embarrassing but there is also an extremely extremely extremely weak but positive correlation between whether someone um pays us monthly so has a

non-zero average monthly payment and whether they have some type of two-factor you actually can't even see it on the projector because it's so slight but it's there so i feel like i'm on the right track okay but ultimately these results are still kind of imbalanced so what am i going to do and i so to be clear from this randomly sampled data these these were 10 000 user accounts it shook out to be about i think 13 000 like accounts that could log into these accounts there's a chance that the actual target class wasn't even represented in this data so there's a couple of ways to deal with this with the imbalance class issue so one

is over sampling so this would be taking all the normal users and excuse me the not so normal users and basically creating copies of these not so normal like these bad users and inserting copies into a duplicate data set so i would have the same or almost the same number of observations of the class that i'm really curious about and also just the normal user base i could also do something where i under sample so i could just cut this large normal user set down and make it equivalent to the smaller more interesting target set that i want to look into i will say that i have a little bit of a hunch that this

is probably going to work better that over sampling will probably work better just because i think giving whatever model a chance to see this this much will probably help out a lot as far as uh giving extra visibility into kind of what these accounts look like and being able to recognize them but i'm probably going to try this too just because this whole thing is you know in the spirit of just let's randomly try stuff so i'll probably do that um and then i kind of want to i've touched on the fact that this isn't finished this is still like where i stopped just there is essentially how far i've gotten in this

research so i want to talk about a couple of questions i presented this to the data science folks at my organization a week or so ago i got some really good questions and i want to talk about those so one of the the big big questions that i've gotten is how are you going to evaluate the results of this particularly given that you're using unsupervised learning this is a really hard problem just like in any in general across any domain across any problem where you're using unsupervised or excuse me using clustering clustering isn't supervised but a lot of it really depends on why you're using unsupervised clustering techniques to begin with i keep saying unsupervised clustering clustering is

unsupervised there's not really a type that's supervised so it really depends on why you're using this method in the first place and unfortunately most techniques that that allow you to evaluate like formally evaluate the success of a clustering model um require access to some sort of you know ground truth some sort of labeled data and so that's kind of tough um i could generate some labeled data myself because i going back to the whole idea of domain expertise i think i have a good idea of what constitutes a fairly secure versus a not so secure account but again that kind of almost defeats the purpose of feeding it to a clustering model to begin with
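For what it's worth, internal metrics are one label-free option here: silhouette score needs no ground truth, though it only measures how cohesive and separated the clusters are, not whether they match the security behavior you care about. The data and labels below are made up for illustration; this is not something the talk uses.

```python
# A label-free sanity check on clustering quality using silhouette score.
import numpy as np
from sklearn.metrics import silhouette_score

X = np.array([[0, 0], [0, 1], [1, 0],         # one tight group
              [10, 10], [10, 11], [11, 10]])  # another, far away
labels = np.array([0, 0, 0, 1, 1, 1])

score = silhouette_score(X, labels)
print(score > 0.8)  # True: tight, well-separated clusters score near 1
```

A high silhouette still says nothing about whether a cluster corresponds to "risky accounts", which is why the manual review she describes next remains necessary.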

so i'm not really sure um i think this is probably going to take some manual review i think i'll probably have to go through and just look to see you know do the clusters make sense which doesn't really sound super cool and like advanced but that's the reality of what i'm probably gonna do and then another common question i get is you know what other models have you considered so all the models that i talked about here um are centroid-based clustering and so i'm probably going to try dbscan or some sort of density-based clustering model at some point um again i'm probably going to throw a bunch of things at the wall and see what sticks just because i'm really

curious how this will shake out and what i'd really like to do is eventually long term put this data uh into a supervised model use this to train a supervised classifier and you know i'll probably do something like look into a random forest or some other type of ensemble model so and i think this is really the crux of this whole thing is what is the point of this and how does this actually fit into the defensive security program at mailchimp like how does this fit into what you already have so ultimately my goal for this is to at the simplest layer if you are a high risk account if we deem you high risk or

a security risk maybe we don't let you fail at login so many times um or maybe there are certain things around the perimeter of the account that we don't let you do quite as much as we would an account that we deem not so much of a security risk and that's super broad i realize that but part of it's going to be figuring out what that looks like but really what i want to do is provide a heuristic for my team and for our anti-abuse team anyone else who touches this type of problem to focus monitoring efforts on accounts that really need it because we have millions and millions of users and it's impossible to catch

everything so if if i can help carve out these groups so one of them would be the like these kind of um the accounts that i showed on the left side of the spectrum earlier if i can help provide attention on those and give a spotlight to those then i feel like i'm doing something to help my team and that's really what i want to do and know you know this isn't perfect at all this is just one tiny piece of the puzzle that you know security is a defensive security requires a layered approach and this is just one more layer on top of other things that we are doing that we will be doing um and that i hope

will help both my team and other teams get get to the important stuff quicker so thanks [Applause] and i think we definitely have time for questions so i am up for up for that
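The random over-sampling option described earlier in the talk (copying the minority-class accounts until the classes balance) can be sketched with the standard library on toy data; libraries like imbalanced-learn offer the same idea and smarter variants off the shelf.

```python
# Random over-sampling: duplicate minority-class records to balance classes.
import random

def oversample(records, labels, minority_label, seed=0):
    rng = random.Random(seed)
    minority = [r for r, y in zip(records, labels) if y == minority_label]
    majority = [r for r, y in zip(records, labels) if y != minority_label]
    # add random copies of minority records until counts match
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    balanced = majority + minority + extra
    new_labels = ([l for l in labels if l != minority_label]
                  + [minority_label] * len(majority))
    return balanced, new_labels

records = ["u1", "u2", "u3", "u4", "u5", "bad1"]
labels  = ["ok", "ok", "ok", "ok", "ok", "bad"]
X, y = oversample(records, labels, "bad")
print(y.count("ok"), y.count("bad"))  # 5 5
```

Under-sampling is the mirror image: randomly drop majority records until the counts match, at the cost of throwing information away.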

first i'll so you looked at 2fa to see if there was anything special about it did you look at the other variables as well to see if there's special things about them or not yet yes so i didn't i can't really share all of it but yes there are definitely other pieces as well that was just an example and so it looks like you're using jupyter notebooks and that's been something that i think several of the speakers today have actually kind of demonstrated so could you tell us a little bit about how jupyter notebooks have facilitated doing this kind of analysis for you yeah for sure in fact let me go back to one of the screenshots just as an

example so jupyter notebooks if you've attended any of the talks in here today you've probably seen some examples of this but essentially it's an interactive python environment that allows you to break up your code into these neat little cells and it provides kind of like a living document you can reuse it um so i can plug in any type of other data set like i could just change what data this is and still run this analysis again so it makes it really easy to continue to like hand analyses off it also has the nice benefit of like i can create notebooks that have these types of visuals in them along

with markdown and other types of text annotations so that i can pass this off to my boss but like i can pass this off essentially to an executive with this kind of content in it so yeah interactive environment but also one thing that i really like about it is that it makes it easy to share that kind of stuff so did looking at either the medians or the prototypes or the means when you did your clustering give any insight as to what the clusters actually meant yeah so unfortunately the way they kind of took shape was based on um account attributes that could be found really evenly spread across

different groups which doesn't i realize that doesn't make sense i'm not explaining that well yes but it wasn't super helpful for discerning why an account would be in one bucket versus another if that makes sense
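The inspection the question asks about can be sketched as a per-cluster summary: modes for categorical features, means for numeric ones, to see what distinguishes the buckets. The cluster assignments and features below are made up for illustration.

```python
# Summarize each cluster: mode of a categorical feature, mean of a numeric one.
from collections import Counter
from statistics import mean

rows = [
    {"cluster": 0, "has_2fa": "no",  "logins": 10},
    {"cluster": 0, "has_2fa": "no",  "logins": 14},
    {"cluster": 1, "has_2fa": "yes", "logins": 300},
    {"cluster": 1, "has_2fa": "yes", "logins": 280},
]

for c in sorted({r["cluster"] for r in rows}):
    members = [r for r in rows if r["cluster"] == c]
    mode_2fa = Counter(r["has_2fa"] for r in members).most_common(1)[0][0]
    avg_logins = mean(r["logins"] for r in members)
    print(c, mode_2fa, avg_logins)
# 0 no 12
# 1 yes 290
```

As the answer notes, when the discriminating attributes are spread evenly across clusters, these summaries look similar everywhere and don't explain the bucketing.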

can you talk about security education for the users and i mean is there an outcome of the research are you retraining users are you communicating with them changing the variables yeah so this is actually something i've been discussing with some folks very very recently um we historically have been really careful about reaching out to users or suggesting proactive things partly because you know let's say we decide we want to email everyone who doesn't have two-factor authentication on be like hey lock your stuff up like what are you doing that can also kind of have a different interpretation people will see that and go oh no mailchimp's had a security thing we're really concerned this looks

like it could they might have been hacked oh no so there's a fine line and we're still trying to decide where that line is we're talking about it though and i am fighting very much for us to be able to somehow talk to these users all of these and say please do yourself a favor and turn on these types of security features that we offer so yeah it's a really hard problem i mean it's you know there's the whole like policy versus what we want so yeah uh first of all great talk i love seeing machine learning applied to like data science problems in the security world and it kind of brought up for me like

this question of like what is the dependent variable in this model like it seems like you have a lot of independent variables about users and not a ton that the model would even be able to pick up on in terms of outcomes and so maybe just to build on the previous question like do you have metadata about the kinds of outcomes you're interested in or is that just not available or like what are you hoping to see at the end of this model aside from identifying user accounts that are anomalies yes so ideally what i would do um and i didn't really get into this in the talk just because there was a lot to cover um

we have a known data set of accounts that have been attacked and taken over i know what those look like and so those are things that i want to go back and be able to potentially feed to some of the the future models and say can you discern whether these are good or bad one interesting i'm sorry i'm like going on about this but one interesting problem that we have is when an account is attacked or taken over or abused in some way almost immediately we have a team that reaches out to that user to say hey something happened you need to change your password you need to turn on two-factor you need to set up these

notifications and so it's great for the user but it kind of is unfortunate for this research because what happens is we don't get a picture of what that attacked user looks like at the time of attack so that's actually an initiative i'm working on with some of our data engineers right now to get those account snapshots so that we'll be able to have that source of okay this is what they look like so i hope that answers your question so i'll follow on to that um over here no you're good you're good um so it seems as though you would have two types of malicious users so the first type is an account that wasn't malicious

where their account got hacked taken over and now is being used for nefarious activity and then you'd have the other type of malicious user account that was set up with the premeditated idea to be to use that account maliciously did you find anything from the clustering that would help identify or correlate with the first uh those premeditated ones i'd call them yes so this is actually another thing that i am working on kind of in conjunction with this this is this is completely different so the way those accounts look those accounts that are created to be bad they're just garbage they look different um they there are specific things about them that unfortunately i can't go into

here but there are definitely things about them that mean they can be very easy to detect and so i don't want to say that's an easier problem but in some ways it can be because particularly if it's something automated if we see bot signups or things like that we have some methods in place to deal with that but this is a little bit harder because or at least it's harder and it's hard in a different way and we haven't really spent a lot of time on this particular piece of it which is why i'm focusing on that that's also another really big vector though for sure yeah

so it sounds like this is something that you kind of you had the idea and then you kind of have learned and evolved your knowledge of it as you've researched it what kind of tips would you have for other people that are in their organization and might want to try to apply that kind of to learn how to apply data science to their own internal problems starting from kind of the same position you're starting at and trying to kind of follow in your footsteps yeah so that's a really good question so one thing i will say is that i think i've been very lucky because mailchimp encourages a culture of experimentation and so if i do all of this and it turns

out to be you know we can't really use this in production we can't really do anything with this that's still okay because we've learned something so i will say i will put that out as like as a caveat but i would say you know think about your user base or think about you know what questions keep you up at night and is there a way to somehow start thinking analytically about this and i realize it's a lot easier said than done but thinking through the lens of an analyst or a data scientist what types of data do you have available what can you query easily what can you get your hands on get data even if it's i mean if

it's in a google sheet if it's in a spreadsheet it doesn't matter start getting familiar with it because the only reason i knew to go to some of these things was because i i lived in an analyst role for so long i knew what what the user account landscape looked like so i would say get really familiar with that um and then just start playing with things like start getting your hands dirty like don't be afraid like that so much of this failed but i learned a lot from it so

i'm sorry but i don't know if um this was covered earlier before i was able to get in but um if i did you have did you you're doing this work did you discover that there was a gap in data so you had to go back and add new data sources in order to be able to get it the correct sort of information yes sorry i wasn't sure if you had a second part to your question yes absolutely so one of those situations was in going through the different models realizing that i needed not just account security related data i needed just normal what does your account look like how much do you pay us how much how many

users do you have like other types of just account attributes but there's one piece so i touched on this just a minute or two ago um we don't have data around what an account looks like at the time of attack and that's because those uh values aren't time stamped and stored essentially um when the user turns on two factor that gets updated but we don't necessarily like time stamp that and say this is the history this is the change log of when a user enabled these things and so that's something that i'm working on like right now uh with our data team to get that in because

that's been huge that's actually created a whole lot of extra work for this because that's something that i don't have that source of truth so does that answer your question okay
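The "change log" idea the speaker describes, keeping a timestamped history of attribute changes instead of overwriting the current value, so you can reconstruct what an account looked like at the time of an attack, might be sketched like this. All names here are hypothetical illustrations, not Salesforce's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Append-only record of one attribute change (illustrative, not a real schema).
@dataclass
class AttributeChange:
    account_id: int
    attribute: str
    value: object
    changed_at: datetime

class AccountHistory:
    def __init__(self):
        self.log = []  # append-only change log

    def record(self, account_id, attribute, value, when=None):
        when = when or datetime.now(timezone.utc)
        self.log.append(AttributeChange(account_id, attribute, value, when))

    def state_at(self, account_id, when):
        """Replay the log to get the account's attributes as of `when`."""
        state = {}
        for change in sorted(self.log, key=lambda c: c.changed_at):
            if change.account_id == account_id and change.changed_at <= when:
                state[change.attribute] = change.value
        return state
```

With this in place, a question like "did the account have two-factor on when the attack happened" becomes a single `state_at(account_id, attack_time)` lookup instead of guesswork against the current row.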

[Applause] thank you all so much i appreciate it

thank you thank you so much i'm going to give you an assist with this thank you so much

there we go thank you thank you very much

hey thank you so much

so when you say small how small do you mean like a couple thousand or um

is there any way you can so when i talked about the over sampling in this kind of situation is there any way maybe you could create copies of the classes that you're interested in and do something like that that's kind of hacky but that's the first thing that comes to my mind i might try something like that so yeah good luck absolutely okay that's kind of a general understanding of what i took away from your answer
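The oversampling suggestion in the exchange above, duplicating minority-class examples until the classes balance, can be sketched in a few lines. This is a minimal illustration; in practice a library such as imbalanced-learn's `RandomOverSampler` would usually be used instead.

```python
import random

def oversample(rows, labels, minority_label, seed=0):
    """Randomly duplicate minority-class rows until classes are balanced.

    A deliberately simple sketch of random oversampling: pick minority
    examples with replacement and append copies of them.
    """
    rng = random.Random(seed)
    minority = [r for r, y in zip(rows, labels) if y == minority_label]
    majority = [r for r, y in zip(rows, labels) if y != minority_label]
    deficit = len(majority) - len(minority)
    extra = [rng.choice(minority) for _ in range(deficit)]
    return rows + extra, labels + [minority_label] * deficit
```

As the questioner notes, this is hacky: exact copies can encourage overfitting to the duplicated points, which is why techniques like SMOTE synthesize new nearby examples instead of copying.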

yes yes exactly yes we do okay so then the next question is other than the obvious questions that i can look at in my data and figure out are there unobvious things like for example


there are some i don't feel comfortable talking about just because they are kind of integral to some of this um but there are definitely attributes to the account so once i zoomed out instead of just looking at security related features i will say there are definitely account attributes that end up having a little more bearing on that so to your example time zone like i said i can't go into it a whole lot more than that and i'm sorry i wish i could um but yes how much i don't know that's a dangerous question

sorry say that one more time

so i don't recall off the top of my head

yeah so i feel like that's definitely where we are like i've kind of been able to say hey this is important i think we should try this and i've been loud enough about it that i think people have been like well we either need to take her seriously or placate her so she'll shut up so that's where that's gone um i think a lot of it is dependent on organization so much of it is um you know and whether this ever sees the light of production i don't know this is still literally just

you know in a jupyter notebook in git so

so i've seen a ton of people do it internally and that's kind of why i was interested in looking at it specifically user facing just because that was such a big problem we have um but

no i totally get it we have to be careful because people will


hi my name is gabriel bassett my voice is my passport verify me [Applause]


please welcome john seymour awesome cool can everybody hear me awesome hi everyone welcome to my talk reducing inactionable alerts by a policy layer my name is john seymour aka delta zero and i'm a lead data scientist at salesforce where i work on the detection and response team performing machine learning on security logs to alert on new attacks to improve our existing alerts and rules and to find and make new contextual data for use in investigations my goal here today is to inspire you to be creative in where you apply data science and machine learning techniques a major impact can be had with very small amounts of effort if you come out of here with a nagging feeling that

machine learning would help you somewhere that it's not normally applied then i'd call this talk a success right so like a lot of good presentations let's start with definitions oftentimes humans analyzing model generated alerts will throw an alert away immediately right and as we've deployed our models we found there are two main reasons this happens first when there are issues with the data pipeline so these are things like when necessary logs are missing when parsers fail when joins between host and network artifacts behave unexpectedly when third-party information is bad or corrupted when there are deployment inconsistencies throughout the fleet when added contextual information like hostnames is wrong when added contextual information is stale where it used to be

right when it's wrong when added contextual information is right in an unexpected way um this list goes on and on right contrast this to obvious false positives where we mean the model is not able to capture the complexity of the instance or the activity can easily be determined to be a low priority or a non-existent threat to the business such as a model alerting on beaconing to a company resource generally we've seen these handled after the fact a whitelist is added which simply says even if the rule or the model says this is bad don't alert on it you can think of whitelists as a simplistic example of a policy layer

which addresses the two causes for inactionable alerts in this talk we'll demonstrate how even simple modeling improves upon whitelists and further we'll argue that modeling whitelists separately from modeling suspicious events is actually a natural approach to the problem so here are some reasonable examples for whitelisting we've seen in the past right for a large number of alerts generated we don't actually care if a connection is completely internal to the network so for these it might actually make sense to whitelist anything where both the source and the destination are internal or we might only care about a connection attempt that's successful right um so we might think filtering alerts where the connection was ultimately unsuccessful

like maybe the firewall blocked the connection we might think that's a good idea or another widespread use for whitelists is in filtering if the domain is obviously benign so like take the top you kn
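The whitelist examples just described, internal-to-internal connections, connections the firewall already blocked, and obviously benign domains, could be combined into a small policy layer like the sketch below. The alert field names, internal ranges, and benign-domain set are all invented for illustration, not the speaker's actual system.

```python
from ipaddress import ip_address, ip_network

# Hypothetical internal address ranges and "obviously benign" domain list
# (the latter standing in for something like a top-N popular-domains list).
INTERNAL = [ip_network("10.0.0.0/8"), ip_network("192.168.0.0/16")]
BENIGN_DOMAINS = {"google.com", "microsoft.com"}

def is_internal(addr):
    return any(ip_address(addr) in net for net in INTERNAL)

def should_suppress(alert):
    """Apply the three whitelist rules from the talk to one alert dict."""
    if is_internal(alert["src"]) and is_internal(alert["dst"]):
        return True                      # completely internal connection
    if alert.get("action") == "blocked":
        return True                      # firewall already stopped it
    if alert.get("domain") in BENIGN_DOMAINS:
        return True                      # obviously benign destination
    return False

def policy_layer(alerts):
    """Keep only alerts that no whitelist rule suppresses."""
    return [a for a in alerts if not should_suppress(a)]
```

Sitting downstream of the detection model, a layer like this drops inactionable alerts regardless of which rule or model raised them, which is exactly the separation of concerns the talk goes on to argue for.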