
All right, thank you everyone for sticking around. Those are great talks earlier on, but I would like to get started with our next talk and introduce our next speakers. I'd like to introduce Peter and John, and they'll be doing our next talk. Thank you.
Hi everybody, thanks for having us. My name is Peter Smith, I'm the founder and CEO of Edgewise Networks. I'm joined here by our first employee and chief data scientist, John O'Neill. John is the brains behind the operation. He's actually Dr. John O'Neill, PhD from Harvard, in a variety of things. He'll be giving his talk in French or Italian; his PhD is in linguistics. Oh, Sanskrit, perfect. So no, we'll try to keep it to English.
The point of the talk is really about how we establish multiple different perspectives on our network attack surface. I was at Black Hat this past season, and I was standing in line at a Starbucks for coffee. I see this kid walk up to an ice cooler, he grabs a bottle of milk and drops it in his pocket, and I'm like, wow, that is brazen, at a casino, to steal something. So I immediately look around to see where the cameras are, and I noticed a bunch of them. The thing was that the environment had changed: there were other people surrounding this kid, so from every perspective that the camera was in
they couldn't see this kid put something in his pocket. It got me thinking: perspective is really the most important aspect of observing your network, and the majority of what we do today is a fixed perspective, in most cases just a single point of view of the network. How can we broaden that to do some interesting things? I won't kill the punchline; I'll save the interesting things for later on in the talk. Quick agenda here: I'll give you a little background on the problem, we'll talk about what we're doing today as a solution, and why we think there's a better approach. Everything that we show you is free
and open source. It's something that John's been working on as somewhat of a side project at Edgewise, and it's in a GitHub repository which we'll share at the end. Then we'll cover why it's a better solution, and then I'll hand it over to John, and John's gonna get into the details of what this thing actually is and how it works, right down to the graph analysis that we're using to make all of it function. There could have been a different title for the presentation: it could have been "It's All a Matter of Perspective" instead of "Multi-Dimensional Attack Path Analysis", which sounds very, very fancy. So what I mean by a difference in
perspective: I have one perspective of you guys right now. I'm looking at all of you, I can see all of you individually. That's a single perspective, me looking out at you. In the network, we do this every day: that's an nmap scanner sitting on some box scanning your network. Multi-perspective would be distributed nmap: you're all running nmap, looking at me, looking at your neighbors, I'm looking at you, we aggregate the data, and now we've got a multi-dimensional perspective of the attack surface of the network. The next part is position. Right now, because of this microphone, I have a fixed position. I can only see the fronts of you, but there's no reason
that I couldn't walk around and see the backs of you. So position is also important: can I change my orientation relative to the attack surface? Do I have a single view of the attack surface or multiple views? There are really interesting things you can do once you have all of this data in one place and can do some analytics on it. But before that, I think there are some interesting physical analogues to this that are worth pointing out, just to make the points crystal clear. So this is a friendly neighborhood coffee shop, and what it's showing you is the perspective of their camera. It is a single perspective,
looking over the counter into the customer environment, and it's a fixed position. As a side note, I think it's kind of interesting that a lot of the camera angles you're about to see are indicative of our social values. What this one is saying is that there is no camera looking down into the register's cash drawer, so they trust the customers, or at least they feel like their processes would figure out if somebody was stealing cash out of it. They aren't monitoring the clerks; they are just looking at the customers. Now if we change our view, this is, you know, any cybersecurity conference, Black Hat, or a casino. What we see here is 22
cameras just in this one view. It's multi-perspective, but they are also fixed position. They've got a lot of different dimensions of visualization of the total attack surface. Now we could do something else, which is actuated cameras that change their orientation, or a robotic camera. Actually, I used to work out of Manhattan for a company in Boston, and I had one of these robots in the office, and even though I had a fixed position or perspective, I could actually move myself around, so I could get different orientations and views of my surroundings. You can take this to an extreme, and what we have is drone swarms: each independently movable and each with its own
perspective. I mean, the military does this on a daily basis: you're surrounding an object to get every vantage point you can, each individually movable. So I thought it would be interesting to do a case study. I walked into a convenience store with a camera to see what their perspective was in terms of the floor layout, what the obstacles were to your ability to look into aisles, and so on and so forth, and how they position cameras. What you can see here is that through the front door there are two cameras. When you're looking down an aisle, it's worth noting there are four aisles and only two of them have camera coverage. And then the cash register: again, this is a social values
problem. The cash register has three cameras. There's a cash drawer camera, so they clearly are looking for the insider threat stealing money out of the cash drawer; there's the perspective of the customer at the counter; and then there's the perspective of the clerk at the counter. So I think what this tells us is they don't care about people stealing candy bars in aisles, because they can't even see down 50% of their aisles. They clearly care about the cash in the cash drawer. So what I did, and, is anybody here willing to admit that they watch the Property Brothers on HGTV? Okay, okay, perfect. So this is the Property Brothers view, where we're gonna do the home
makeover here. We're gonna drop in the counter with the cash register, put in a couple of aisles, put the cooler around the sides and the door at the back, and we'll drop in the cameras just as I saw them in the photos. So there was the camera that was on the front door, there were actually three cameras on the cash register, there were two cameras on four aisles, and there was a camera in an open general space. What we can see here is that there are actually quite a few gaps in their view: two of the aisles aren't covered, there's the back corridor that's not covered, and this general open space
that's not covered. So really my point here is that when we limit our perspective of the attack surface by having fixed vantage points or single views, what we're doing is leaving blind spots, whether that's your network or the local convenience store. I'm sure you're all wondering why I'm talking about convenience stores and stuff. It is highly relevant to cybersecurity, and it's because of this: how do we scan our networks? Do we use single perspective or multi-perspective? If you've got a Rapid7 scanner that is sitting on some box in some secure enclave that has all of your security tools, you are using a single perspective, and it is a fixed position.
It's not changing. So what are other examples where it would be multi-perspective? If you have a SIEM, that's multi-perspective: you're aggregating logs and correlating the events from multiple distinct perspectives. If you're using InsightIDR from Rapid7, as an example (and I obviously don't work for them, but I'm a big fan of that product), you would be collecting this sort of information from desktops, which are movable devices. So that's an example of multiple perspectives on movable devices, aggregated into a single location and having correlation done across it. So what is the real main point of this talk? It's this: one of the most important things we do, we do from a single perspective, and
we should start doing it from multiple perspectives to get the multi-dimensional data, and that is network scanning. Right now you're gonna scan your network with a single nmap, and you're gonna tell your administrators to open up the firewalls to allow access so that your nmap can see everything. Well, that's great that it can see everything, but what it can't see is what the pathways are between the hosts that you just scanned. Those are the pathways that the attacker can use to move laterally. You don't actually know what the perspective is from the domain controller or from the file server; you only know the perspective of the nmap system. So what
if I told you that we could make every system an nmap scanner, and every system scanned every other system, so you have this multi-dimensional view of the total attack surface of the network? Instead of me telling you what the network looks like from the scanner's perspective, I can actually tell you what the network looks like from every perspective, and that's super powerful. When I say we're gonna make every device an nmap scanner, in the bottom right-hand corner there, I'm actually saying that you would deploy nmap, or distributed nmap, to every one of your servers and every one of your cloud instances. I mean, if you're up for it, you could do it to desktops as
well. And once you aggregate this data, there's one more step to make it really compelling: what if we could overlay the vulnerability information for all of the scanned ports onto the edges of these connected vertices? Every host is a vertex, and they're connected by edges, which are the paths that you've learned from the nmap scanning. If I can overlay vulnerability data, then you know what I can do? I can do a graph analysis to identify the open shortest paths between any two points. So the one on the left might be the internet, the one on the right might be a core database with all of your credit card information, and if I
know the connectivity and the potential connectivity (these are not used paths; scanning is telling me what is possible, not what is used, and that's the purpose of nmap), so if I can say that there is potential connectivity between all of these, and the services that are listening are vulnerable, then it's one graph analysis away from telling you a path from a vulnerable entry point to the vulnerable target through the mesh of services. That's exactly what we're gonna do, and John's gonna show you how that works. Before we get there, why do you care about this? Well, you care about it because there are 19 vulnerabilities published every day, and four of them are critical: remote exploit, root
compromise. It takes an average of 50 days to remediate, if you've got everything in order, on average across the industry, and what that means is that there's perpetual exposure and you likely need to prioritize. What better way to prioritize than to identify core assets and the paths to them that are vulnerable, start picking off the closest vulnerable paths first, and then focus on the rest? So prioritization is number one. Number two is to fortify the paths from internet-connected systems. You know your core assets, and with this approach you can easily identify every entry point that receives globally routable address space. So now that you know all of your globally routable entry points and you know your asset, you can
map every path between them, and the point is: do you have appropriate controls in place at all of those intermediate junctures in the network? If not, you've got a task to do; that is a very clear gap that needs to be closed. Number three is network overexposure. It's probably not a familiar term, it's something that we made up at Edgewise, but network overexposure is how many paths to a given target I discovered versus how many of those are actually needed. So if I've got a database that has the potential to accept 100 connections from various sources, yet only ten of them are actually needed, I'm ten times more exposed on that database
than is necessary. So how do you accomplish least-privilege access on the network, to ensure that only the required entities are communicating and only the required entities can communicate? This is sort of laying the groundwork for a segmentation project, or micro-segmentation, or something that would constrain access to those core resources. I think at the end we should have a chat about how you would use this data. These are just three possible options; there's a multitude of ways you could leverage this data. Before I hand it off to John, the only thing: you didn't come here to learn how to use nmap. I'm sure everybody in this room knows how to use
nmap. I just brought this slide up to point out that there are really two ways to get this highly dimensional data. Number one, you can use dnmap, a distributed nmap, which is really just a wrapper that orchestrates some of the heavy lifting around collecting these nmap files from a variety of sources. But probably the easiest way is just to use a configuration management system (Puppet, Chef, Ansible, SaltStack, SCCM, whatever you happen to use) and just have it execute nmap. You're gonna give it a very broad set of ranges that it should scan, and then collect the resulting file. And once you have that, that's when what John is about
to tell you is really important. Everything he's about to tell you is available here. I have the domain attackpath.com; I haven't had enough time to redirect it to this, but over the coming days I will redirect attackpath.com to that URL. And with that, here's John.
Works, good. In fact, originally in the slide he had attackpath.com.com, and I thought, oh, he's making a joke, waiting for me to put in the right URL. Anyway, this is not so much a slide as it is a Jupyter notebook, so I hope you don't mind that too much. I've been writing this code
in real time, so if you like going to the circus to see people falling off the high wire, this is the place to be. What I did, of course, was write this in Python. I was actually out of the office until Thursday morning, and then I was told, by the way, how would you like to do a talk on Saturday? Sure, no problem. Oh, can you do this, here's the description, can you program this? Well, I've got something already done, so I guess so. So let's see what happens. Here I'm just importing stuff. What we're actually
doing here is starting with the nmap data which Peter just talked about, which implicitly has inside of it all the network topology we need to build. We want to import that (I won't go into that deeply, because it's really boring to parse files, and not too hard) and then turn it into a graph structure. Then we'll show a few examples of using that graph, just to find the shortest paths from internet-accessible hosts to some target. After we do that, we'll apply the vulnerability data (we'll talk about the vulnerability data) to the graph and then rerun that same search,
going from shortest paths to most vulnerable paths. And there will be some graphics, so you can actually see it instead of having to read a lot of print statements. By the way, at some level of abstraction, of course, the nmap data is just source IP, destination IP, destination port, and service, and the typical nmap data looks like that. Now let's go down. We'll start with a nice small graph, and we'll define the initial target. Now we have the graph; G is just the graph. What we want to do is
actually find paths in that graph going from a selection of internet-accessible nodes to the target which we've named. There are a couple of ways we can do this. If you want the n most vulnerable paths, we can return that. Here we'll be returning all the paths with a weight, a distance, less than a certain score. There are several search techniques available; I'll come back to this in a second after I show you. So let's actually get the paths (I'll come back to this) and just go down to the first picture
so you can see what I'm doing here. Oh, actually, make that... all right. Basically this is just the first graph. The red node is the target, the green nodes are internet accessible, and you can see the red paths there. We'll come back to this graphing later, but if you look at, say, the first path, it's two hops, from 22 to 16 to 228, and in this case it's 22 here, 16 here, 228. Pretty straightforward. Now let me make this a little bigger again so we can read it. There are lots of ways to find paths in graphs. A lot of them are specialized in finding the single extremal
path of some sort, and since we're not interested in finding only one, what I did was implement a quick version of beam search, which essentially advances incrementally through the graph from the target outwards and looks for the least expensive way to get to each node, or keeps a small number of them, and then keeps spreading out until it finds the internet-accessible nodes, and then returns the scores that it finds for those internet-accessible nodes. That's basically what I did to get these scores. Now, I gave it a distance of three. We only had three paths here, so we could actually reduce it if you
wanted to. Let's just redo that with two and then redo the chart, and we see, again, not very exciting at this point. Well, you see the 22 and 38. Oh right, 38, 22, there we go: those two paths from there. Okay.
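As a rough illustration of the path search described above (not the notebook's actual code), the sketch below treats each scan record as a (source IP, destination IP, destination port, service) tuple, builds a reversed adjacency map, and walks outward from the target, collecting every loop-free path that reaches an internet-accessible node within a hop limit. It is an exhaustive bounded search rather than a true beam search, since it never prunes the frontier. The addresses, `SCAN_RECORDS`, and `paths_to_target` are hypothetical names chosen for the example.

```python
from collections import defaultdict

# Hypothetical aggregated scan records: (source_ip, dest_ip, dest_port, service).
SCAN_RECORDS = [
    ("10.0.0.22", "10.0.0.16", 22, "ssh"),
    ("10.0.0.16", "10.0.0.228", 5432, "postgresql"),
    ("10.0.0.38", "10.0.0.22", 80, "http"),
    ("10.0.0.38", "10.0.0.228", 445, "microsoft-ds"),
]


def paths_to_target(records, sources, target, max_hops):
    """Return all loop-free paths from any source to target using at most max_hops edges."""
    # Reverse adjacency: for each host, which hosts can reach it.
    preds = defaultdict(set)
    for src, dst, _port, _service in records:
        preds[dst].add(src)

    found = []
    frontier = [[target]]  # partial paths, stored target-first
    for _ in range(max_hops):
        next_frontier = []
        for partial in frontier:
            for src in sorted(preds[partial[-1]]):
                if src in partial:
                    continue  # skip cycles
                extended = partial + [src]
                if src in sources:
                    found.append(extended[::-1])  # report source-first
                next_frontier.append(extended)
        frontier = next_frontier
    return sorted(found, key=len)


internet_facing = {"10.0.0.22", "10.0.0.38"}
for path in paths_to_target(SCAN_RECORDS, internet_facing, "10.0.0.228", 2):
    print(" -> ".join(path))
```

With a limit of two hops this toy data yields the direct 38-to-228 path and the 22-to-16-to-228 path; raising the limit to three adds 38 to 22 to 16 to 228.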
And actually, for this data I didn't have a target and internet-accessible nodes set up, so I'm choosing them essentially by analysis of the graph. You can use an eigenvector centrality analysis to figure out, for any graph, what the most central node is and what the most peripheral nodes are. So what I did was make a random selection of the most peripheral nodes the internet-accessible nodes, and I made the most central node the target. There's a certain amount of randomness in this, but it works pretty well. So now we have the actual output, and we
have this somewhat nicer... there we go. Sure, of course. Yes, it's saying exactly that, because we set the distance metric for this graph to two hops. There are other ones. I originally set it to three, so go back up here and just say, okay, we'll make that three, and we get more paths. Then we go down and rebuild the graph, and you can see that now there's a three-hop path from 104 up here to 238 to 16 to the target. This is just [inaudible]; this is just getting the toes
into the water here. Now let's add vulnerability data to this. There are lots of sources for vulnerability data. Oh, thank you, thank you very much, that's very helpful. Wow, great timing. There are lots of sources for vulnerability data, but I'm cheap, so I'm not gonna use any of them; I'm gonna use a free source. What we want to do is take that vulnerability data, extract the information we need to add arbitrary weights to the graph, and then see what changes. Because I said I was cheap, we're using the National Vulnerability Database.
The advantage is it's open source, it's community created, it's comprehensive. The disadvantage is that it is what it is: it's a complex data structure and it's not really optimized to get the answer I was looking for. Basically, it's a bunch of files, 50 megabytes of stuff that looks like this. You see the point: it's not friendly. So I went to one of the basic tools I use for almost everything: I dumped it into Elasticsearch. Again, it's free, works well, you provide your own tech support, fine. And so from that we can say
for the nmap data we have the service on each port on each host, so we can search for the vulnerability information associated with that service on each host, and then use that to reweight the graph based on the vulnerabilities. We'll do that right now. We have the vulnerabilities already, so let's apply them to the graph to get a new graph. Here we have real distances: a target distance of less than or equal to nine, same internet-accessible hosts and same target, and we have, what was that, one, two, three, call it four. And so we can
then draw it. Let me make it a little smaller. You can see that again we have a different-looking graph: the same internet-accessible hosts, but different paths to the target. And of course we can also change the weight. We could reduce it if we wanted, say to 7.0, and we get [inaudible]... that's ugly. Anyway, you get the idea. We could also increase it, say to 11, and [inaudible] very few. Anyway, so what about the
vulnerabilities there? Basically, at a certain point you have multiple vulnerabilities on each host, and I wanted to combine them in a relatively straightforward way that still gave me a single number. So I used harmonic averaging, which essentially is a way of saying that most people will tend to use the most vulnerable of the opportunities they have to get onto a particular server, but they might use a less vulnerable one, and so we weight those together through the harmonic mean. Also, in the vulnerability database, the scores all have larger numbers being more
vulnerable, but everything underneath was set up to find shortest paths, so I basically inverted the weights so they were more convenient for my purposes. So for each of the servers in this small set we have a vulnerability weight that looks like that. Okay, we're fine, but it's kind of boring to only look at 13 nodes, so let's see what we get when we create a large graph. Let's spin it up, wait for a few seconds, and now we have 1,313 nodes and 17,470 edges
between them, and the target is 10.10.22.228. Now we can do it by distance. Let's do a distance of two: we get three two-hop links from the periphery to the target. And with three, well, you can see it's a lot more. Now this is a pretty big graph, and if you try to actually draw that whole graph, you get a piece of beautiful modern art, which is not really that useful, to me at least. So I thought, why don't I create a different way of drawing the graph
that only includes the paths and leaves out all the other nodes I don't want, so I don't get drowned in the big blue cloud. We can do that, and so we get something (I should have said, this is for the distance of three) that looks a bit more friendly. It's not friendly friendly, but you can actually kind of read it. If you want to go back and look at the really smallest numbers, we can go back and draw that, and we get something that, well, you can't get much
easier than that. These are all the two-hop paths for the shortest distance. This is before the vulnerability information was added to the large graph, so we get something pretty straightforward, but you can also increase the noise, as we saw before, to get something that's more comprehensive. We have the vulnerabilities for the large graph too, so we just apply them. Now we're at distance five for this: we get the vulnerable paths at distance five, and there are none. Let's increase it to seven. What do we get now? Oh, there we go. So at distance seven we get probably half a dozen. That's
what we get at distance seven: this vulnerability graph, where we have the host here and the paths that lead to it from the periphery.
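The reweighting step can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's code: each vulnerability score (assumed here to be a CVSS base score from 0 to 10) is inverted into a cost of `11 - score` so that more vulnerable services give shorter edges, the costs on a host are combined with a harmonic mean (which leans toward the cheapest, most vulnerable option), and hosts with no known vulnerabilities get a maximum cost. The inversion formula, `NO_VULN_COST`, the sample scores, and all function names are hypothetical.

```python
from statistics import harmonic_mean

# Hypothetical CVSS base scores found for the services on each host.
HOST_CVSS = {
    "10.0.0.16": [9.8, 5.3],
    "10.0.0.22": [7.5],
    "10.0.0.228": [8.8, 4.0],
}
# Hypothetical potential connectivity: source host -> hosts it can reach.
EDGES = {
    "10.0.0.38": ["10.0.0.22", "10.0.0.228"],
    "10.0.0.22": ["10.0.0.16"],
    "10.0.0.16": ["10.0.0.228"],
}
NO_VULN_COST = 11.0  # assumed cost of entering a host with no known vulnerabilities


def host_cost(cvss_scores):
    """Invert scores so higher vulnerability means lower cost, then combine."""
    costs = [11.0 - score for score in cvss_scores]  # assumed inversion, always > 0
    return harmonic_mean(costs)


def vulnerable_paths(edges, host_cvss, source, target, max_cost):
    """Return (cost, path) for every loop-free path whose total cost is <= max_cost."""
    cost_of = {host: host_cost(scores) for host, scores in host_cvss.items()}
    results = []
    stack = [(0.0, [source])]
    while stack:
        cost, path = stack.pop()
        if path[-1] == target:
            results.append((round(cost, 2), path))
            continue
        for nxt in edges.get(path[-1], []):
            if nxt in path:
                continue  # skip cycles
            new_cost = cost + cost_of.get(nxt, NO_VULN_COST)
            if new_cost <= max_cost:
                stack.append((new_cost, path + [nxt]))
    return sorted(results)  # cheapest (most vulnerable) paths first


for cost, path in vulnerable_paths(EDGES, HOST_CVSS, "10.0.0.38", "10.0.0.228", 9.0):
    print(cost, " -> ".join(path))
```

Raising or lowering `max_cost` plays the same role as the distance threshold in the talk: a tighter bound leaves only the most exposed paths, a looser one gives a fuller but noisier picture.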
We can also increase the weight here to get a slightly bigger one, so let's go to eight and then redraw the graph, and we get something again a little more comprehensive, a little noisier, more hops, but that's more of the same thing. Sure. You've got me at a point where I don't know; I'm using a graph drawing package. Now we can look at the actual paths here. Take the first one: its weight is 6.04. We have a couple at that weight, actually three, it looks like, at that weight, and they all
look like two hops.
To directly answer your question: it is a prioritization question. You've got this nice long list; the first ones you should take action on are the ones that are immediate adjacencies to the target, and then it's really just a question of diminishing priorities as you get further away from the target. This is early days on this approach, and again, we're releasing this out to the community. What John has done here in the Jupyter document fully elaborates everything that he said textually, so you can literally read this like a story, and it explains the math; you can run the math right there
just by hitting return, see the resulting charts, and continue the journey. The point here is that we want to make it open to the community, so that you could plug in Rapid7, you could plug in really whatever you want, and get these answers with data that's more consistent, reliable, and familiar to your internal operations. It's the same algorithms; it's just hooking up different APIs. Thank you.
So, again, as Peter said earlier, we have the code in the repo. That notebook, in case I changed it during the talk, will be going up after the talk, along with the data. I assume most of you
know at least a little bit about Python; you need to install that. There are a couple of third-party libraries you need to install as well, and also Elasticsearch, which, given it's a large Java application, is actually fairly easy to install and use. And that's it, and now we'll go back to Peter.
Thank you. So, certainly open to any questions. I think the too-long-didn't-read on this is that right now we take slices of data by having a fixed-point perspective in our scanning of the network, and if instead we take slices of data from the perspective of every node in the environment, what you start to look at is
this cube of data. This is a familiar topic to both John and myself. My background is from Endeca Technologies, the company that invented faceted search, which of course today is commonplace on every website you ever visit, and John was the chief data scientist of Attivio, which was the data lakes company, or is the data lakes company. Really what this is about is OLAP modeling and faceted search, discoverable navigation, where you structure your data with multiple faceted views, so that if you can rotate and reorient your perspective on the data, you see different outcomes. That's exactly what John has done here. He's taken individual slices of data from the vantage point of every
device in your network, looked for the commonality across those data sets, and joined them together, so that it's now structured as this big interconnected mesh, a graph. You only have to reorient your view of the graph to see different data sets and different results, and then you can augment the data set just by superimposing new data on top of the graph topology. So it's a super powerful, super flexible way of looking at network topology that gives you every possible multi-dimensional vantage point from every device, everywhere, always. So that's it. Any questions? Happy to answer. Yeah, go for it. Of course.
You hit the nail on the head. So that is not what Edgewise does, but for the purposes of this project, yeah, you hit the nail on the head. For anybody who didn't hear him, the point is that you're gonna go through a remediation process as you learn data like this and you prioritize these fixes, and you're going to regenerate this data set by doing another nmap scan across all of your systems at the rate that you remediate, because you want to see those changes reflected in the graph. So you're continually doing these nmap (or, you know, it could be Rapid7) scans and aggregating the data together so that you can see how
your landscape is changing over time from a vulnerability exposure and multi-step attack path perspective.
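One simple way to see that change between two rounds of scanning is to compare the aggregated edge sets. The sketch below assumes each scan has already been reduced to a set of (source, destination, port) tuples; the sample data and the `diff_scans` name are hypothetical.

```python
def diff_scans(before, after):
    """Compare two aggregated scans, each an iterable of (source, dest, port) edges."""
    before, after = set(before), set(after)
    return {
        "closed": sorted(before - after),  # reachable last time, not any more
        "opened": sorted(after - before),  # newly reachable since the last scan
    }


# Hypothetical scans taken before and after a remediation pass.
scan_monday = {
    ("10.0.0.38", "10.0.0.228", 445),
    ("10.0.0.22", "10.0.0.16", 22),
}
scan_friday = {
    ("10.0.0.22", "10.0.0.16", 22),
    ("10.0.0.22", "10.0.0.17", 8080),
}

changes = diff_scans(scan_monday, scan_friday)
print("closed:", changes["closed"])
print("opened:", changes["opened"])
```

Re-running the path search on the newer edge set then shows whether the closed edges actually removed any routes to the target.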
Yep, yep, choke points, that's a fantastic point. So if you see a common node that is the intermediary for a subgraph... it's a topic John didn't get into, but he educated me on it, so let's see how well I can relay this. You've got graphs that are highly interconnected, but off of those graphs are subgraphs, and where they interconnect is this join point. If you can block the vulnerabilities at that point, you fracture the graph, and it means there are fewer possible paths that arrive at that ultimate target. But it doesn't change the fact that your highest priority is adjacencies to the target itself, so things that directly talk to
the target. Then after that would be these choke points, and after that would be things that directly touch the internet. Those are the ingress points; you clearly want to lock those down. Any questions are on the table, go for it. All right, with that, thank you so much. [Applause]
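The choke-point idea from the Q&A can be sketched as a brute-force check: remove each intermediate host in turn and see whether any internet-facing source can still reach the target. This is a simplification for illustration (a proper articulation-point algorithm would scale better on a graph of thousands of nodes), and the sample topology and function names are hypothetical.

```python
def reachable(edges, start, target, removed):
    """Depth-first search from start to target, pretending `removed` does not exist."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in edges.get(node, []):
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def choke_points(edges, sources, target):
    """Hosts whose removal cuts every path from every source to the target."""
    nodes = set(edges) | {n for outs in edges.values() for n in outs}
    candidates = nodes - set(sources) - {target}
    if not any(reachable(edges, s, target, None) for s in sources):
        return []  # nothing reaches the target, so there is nothing to cut
    return sorted(
        n for n in candidates
        if not any(reachable(edges, s, target, n) for s in sources)
    )


# Hypothetical topology: two ingress hosts funnel through .22, then fan out again.
TOPOLOGY = {
    "10.0.0.38": ["10.0.0.22"],
    "10.0.0.39": ["10.0.0.22"],
    "10.0.0.22": ["10.0.0.16", "10.0.0.17"],
    "10.0.0.16": ["10.0.0.228"],
    "10.0.0.17": ["10.0.0.228"],
}
print(choke_points(TOPOLOGY, {"10.0.0.38", "10.0.0.39"}, "10.0.0.228"))
```

In this toy topology only the .22 host qualifies, because the two hosts behind it offer alternative routes to the target.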