
and over to you. All right, hi everyone. I'm Luis de Valentin, an associate principal at the Accenture security lab. As quick background about Accenture Technology Labs: we're the most forward-facing part of Accenture, looking at things about three to five years out that are going to disrupt industry, and we've got seven different labs scattered across the world. My work is mostly focused on analytics and different mathematical models of risk or attack surfaces. Organizations have always sought to mitigate risk in their environments, and IT professionals in particular have used a lot of different methods to mitigate or reduce risk. In the past we've
seen compliance as something that people have done as a bare minimum. More recently we've seen people start to migrate to rolling access certifications, so privileges expire and then have to be recertified. Patch management in general is another risk mitigation tool. And lately we've seen them start to use analytics to try to quantify risk within the organization as well, via intrusion detection software, so analytics specifically based around detecting attacks, and user behavior analytics, so profiling users and figuring out when they are actually doing risky behavior. However, there's a part of risk that they really haven't quantified yet, and that is the human interaction risk. Human interaction risk is rarely modeled; social
engineering is just uncalculated risk that they haven't built into their mitigation policies. One of the keys here is that proximity, how close you are to a person, is a measure of your trustworthiness. People trust people that they know, and people are more likely to respond to spear phishing from people that they know. The example use case that we like to use is the executive assistant. The executive assistant does not necessarily have access to a lot of things; however, just based on their proximity to partners who have client-confidential information or things like that, they are still a huge risk factor, because if you compromise their account you have essentially given yourself
a location from which you can spear-phish a lot of different people. So they're a riskier account than someone who is provisioned with the exact same privileges. So we've been trying to model what this risk actually is and how risk spreads from person to person, and the way we've been doing this is through graph analytics. If you model all your assets and users as nodes, then you can create a graph of users to users and users to assets and the interactions between them, and see how risk spreads within your social network. The strength of these relationships can also be built into this model, so people who are closer to
each other will have stronger risk transference properties. We've been trying to understand the network of access available to each individual within the organization: if I want to reach a particular asset from a particular person, how many spear-phishing pivots would I need to take in order to reach that asset? From that, if you are a common link that a lot of people have to go through in order to do a spear-phishing attack, then you have a much higher risk inherent in your position. This actually correlates very well to a very
well-studied graph analytic metric called betweenness centrality. Betweenness centrality is, at its core, a score for each individual user based on how important they are to shortest-path travel across the entire network, so how important is this person to reaching other parts of the network. Once we do this calculation, we can identify who the golden key holders are within our organization, and how the risk from those golden key holders spreads to people who may not necessarily have the access but who, if you compromise them, could get you access from one of these golden key holders. Once we do this we can
actually start to prioritize protections and build them into our risk mitigation policy, like making these people have dual-factor authentication or other security controls, so that it's harder for risk to travel from these people. This sounds like a lot of manual data collection and building a social network; it sounds like a lot of work. However, there is an approximate repository of all these user-to-user interactions, and that is in the AD. Most well-organized organizations have AD information that has hierarchical models of the entire organization, so who is your manager versus who are your
subordinates, and from that we can build a user network model that starts to build that network. In addition to that, they usually have departments and locations as well. If I see you every day, that is another way of measuring how well I know you, so if you're in the same location we can say, okay, this is another interaction that we want to model. The other thing in the AD is that it has all the privilege information as well. If we harvest all of these AD entries, then we can figure out, okay, this person has access
to these assets based on the memberOf fields that are associated with this account. From those memberOf fields we can even run some filters, saying if there are admin accounts in here, we're going to make them more important as well, so it's harder to get to these accounts. Specific PII is another example of that. Once we harvest all this information we can actually start to build a network. Like I said before, nodes will end up being users and privileges, and edges will end up being user-to-user interactions or user-to-privilege interactions, and then
we can do some variable weighting based on the importance of the privilege: if it's an admin account, we're going to make that edge worth more. Once we have this network we can calculate the betweenness centrality score, so how much each node contributes to the travel distance. For those of you who have taken some algorithm theory, betweenness centrality is essentially a calculation of, for every single node on the network, how many of the shortest paths between other nodes in the network run through this node. So it's essentially a measure of how important you are to moving throughout
the network. For the algorithm people out there, the big-O calculation of this is just the number of vertices cubed, so it's a fairly lightweight algorithm. Okay, so to test whether the analytics would actually capture the behavior we would expect of it, I went ahead and built an actual network of influence for our technology lab. I scraped the AD, pulled all this information, and built an actual user network that corresponded to our network, and then we wanted to see if the behavior we expected would actually emerge when we ran this betweenness centrality algorithm. The behavior we were expecting is that users in proximity
to valuable privileges would receive higher betweenness centrality or importance scores. So this is the actual graph of our network. Blue nodes like these are users, whereas orange nodes right here are privileges, and the size of the node corresponds to how high they ranked in the betweenness centrality score. These two users right here are our network operations manager and our security manager, who also functions as our systems admin, so they have a lot of access to these admin privileges right here that no one else has. As expected, they're receiving extremely high betweenness centrality scores;
that's the behavior we would expect from them. And these edges are all weighted very heavily because they are admin accounts. But what we really want to know is how risk propagates from these two people. Our workshop coordinator is provisioned with basically the bare minimum of the privileges that we have; however, they work very closely with our network operations manager and our systems admin in order to deliver the workshops that we give to clients, so they have to be in close contact with them, and they work with them almost every day. We can see here that they have actually received a higher centrality score despite being
provisioned the exact same privileges, so they would be a perfect person to spear-phish. To put a real-life attack vector to this: a spear-phishing email sent from our workshop coordinator's account, asking those two managers to approve an agenda or something like that, but that agenda has macros enabled, so they would actually be compromised. It would be much more likely to succeed coming from them as opposed to someone else in our lab or someone outside the organization. So we can see that the betweenness centrality algorithm is providing value in identifying how risk spreads from our high-impact users to
users adjacent to them or in proximity to them. For the specifics of this algorithm, I limited it to four nodes out. It would not be reasonable to expect someone to do more than four spear-phishing pivots in order to get from one thing to the other, so we did limit the length of the shortest path that you could actually travel.
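
As an editorial sketch of the calculation described above (not the speaker's code, which was written with PySpark), the following plain-Python function computes betweenness centrality with Brandes' algorithm while ignoring any shortest path longer than `max_hops` edges, mirroring the four-pivot limit. The toy graph, the node names, and the names `bounded_betweenness` and `make_graph` are assumptions made for illustration, and edge weights are left out for brevity.

```python
from collections import deque


def bounded_betweenness(adj, max_hops=4):
    """Betweenness centrality over an undirected, unweighted graph.

    adj maps each node to the set of its neighbours. Only shortest paths
    of at most max_hops edges are counted (hypothetical pivot limit).
    """
    score = {v: 0.0 for v in adj}
    for source in adj:
        dist = {source: 0}      # hop distance from source
        sigma = {source: 1}     # number of shortest paths from source
        preds = {source: []}    # predecessors on those shortest paths
        order = []              # nodes in the order BFS visits them
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            if dist[v] == max_hops:
                continue        # do not pivot beyond the hop limit
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    sigma[w] = 0
                    preds[w] = []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, crediting intermediaries.
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: s / 2.0 for v, s in score.items()}


def make_graph(edges):
    """Build an undirected adjacency map from (a, b) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


# Toy lab: two managers share an admin privilege; the coordinator works
# with both managers and with two analysts. All names are made up.
toy = make_graph([
    ("priv:admin", "net_ops_mgr"), ("priv:admin", "sec_mgr"),
    ("net_ops_mgr", "coordinator"), ("sec_mgr", "coordinator"),
    ("coordinator", "analyst_1"), ("coordinator", "analyst_2"),
])
scores = bounded_betweenness(toy, max_hops=4)
for node, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{node:12s} {value:.1f}")
```

In this toy graph the coordinator sits on most short paths between the analysts and the admin privilege, so it scores highest even though it holds no privilege of its own. Lowering `max_hops` to 2 drops the three-hop analyst-to-privilege paths from the count.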
So going forward, risk quantification analysis like this can help inform our decisions when we tailor risk mitigation techniques. If we wanted to account for this risk that we haven't captured yet, we could do things like focused training for specific people who are close to these high-impact users but are not specifically provisioned those accesses. We can also tailor security controls to target these users and make it harder to access their accounts, for instance dual-factor authentication or something like that, or even just limit the amount
of privileges that they have access to. Determining all this information is great, but it's really just a starting point. This model doesn't take into account a lot of the other complications that risk comes from. For instance, we can also factor in hardening. As the cybersecurity lab, we are probably less likely to respond to a spear-phishing attempt because we know what to look for, whereas in a marketing department maybe they don't know as much what to look for, so they would be more susceptible to responding to a spear-phishing attempt. We can factor that into our
risk models. Then we also have individual susceptibility to compromise. There's an NYU Poly psychological study, and one of the key things that came out of it is that ten percent of people are going to click on the spear-phishing attempt no matter what. They're just going to respond to it, and that is bad, like, that's very bad. But if we know who those people are, then we can also adjust for them in our risk profile. For instance, we can do test spear phishing. I know there are a bunch of companies out there that actually do fake spear-phishing attempts to see
who in your organization is more likely to respond, and if we can get that data we can incorporate it into the model as well, so we have a better representation of who the people are that can be compromised. Then we also have dual-factor authentication. A lot of accounts are not necessarily dual-factor authenticated; however, we don't know that in this model right now, and if there is dual-factor authentication it makes it much harder to traverse from person to person, so we can maybe reduce how the risk spreads over certain accounts. And then finally there's a lot of social media information that
just isn't accounted for in our network. I mean, you might be friends with someone and not work in a close relationship with them at all. So we could improve the underlying data by scraping Facebook or Twitter to figure out who people actually talk to in real life. Those edges can be built into the base data to better model how the risk spreads. The end goal is to have a quantified risk landscape for pretty much the entire organization. Like I said before, betweenness centrality
actually scales very well, so this algorithm could be scaled out to do enterprise or larger types of analyses, and having that information would help organizations when they are tailoring their risk mitigation strategies. Okay, and it looks like I sped through these slides, so, questions? Apologies for that. Yep.

Hello, yes. I'd like to know, if your crystal ball is working, if you'd be able to offer a prediction about how far in the future we are from actual security audits beginning to take these kinds of metrics into account and possibly saying, hey, if you have these employees that failed at the spear phishing, they need to not be in
certain positions, please reassign personnel accordingly, or you fail the audit, or things like that. How soon do we see these kinds of social engineering metrics actually made part of the audit process?

Okay, so there is some actual social engineering being made part of the audit process already. I know of a couple of companies that already do the risk testing, essentially to say, okay, these are the users in your organization that are actually responding to spear-phishing attempts. It's just an automated thing where they send out spear phishing to every single person in the organization. But I would say maybe five to ten
years before they can actually incorporate human interactions into that risk profile. I do believe it is coming, but it may not be as complicated as this. I know as of now that a lot of user behavior analytics is starting to take into account peer groups, which is another way to model user interactions: peers you work with should have the same type of activity, and if we see activity that deviates from those peer groups, then we're going to flag that and say it's a risk.
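
One way to fold the refinements listed in the talk (hardening, individual susceptibility, dual-factor authentication) into the graph is to attach a compromise probability to each account and look for the most likely pivot chain. The sketch below is an editorial illustration, not the speaker's model: the multipliers in `pivot_probability`, the function names, and the sample click rates are all hypothetical. It runs Dijkstra's algorithm on negative log-probabilities, so the cheapest path is the chain with the highest product of probabilities.

```python
import heapq
import math


def pivot_probability(click_rate, hardened=False, two_factor=False):
    """Chance that a spear-phishing pivot onto an account succeeds.

    The 0.5 and 0.2 multipliers are illustrative placeholders only.
    """
    p = click_rate
    if hardened:
        p *= 0.5
    if two_factor:
        p *= 0.2
    return p


def most_likely_chain(adj, prob, source, target):
    """Return (path, probability) for the likeliest chain of pivots.

    adj maps node -> iterable of neighbours; prob maps node -> chance of
    compromising that node once a neighbour is compromised (default 1.0).
    Probabilities are assumed to lie in [0, 1].
    """
    best = {source: 0.0}            # lowest -log(probability) found so far
    prev = {}
    heap = [(0.0, source)]
    while heap:
        cost, v = heapq.heappop(heap)
        if cost > best[v]:
            continue                # stale queue entry
        if v == target:
            break
        for w in adj.get(v, ()):
            p = prob.get(w, 1.0)
            if p <= 0.0:
                continue            # this account cannot be pivoted onto
            new_cost = cost - math.log(p)
            if new_cost < best.get(w, math.inf):
                best[w] = new_cost
                prev[w] = v
                heapq.heappush(heap, (new_cost, w))
    if target not in best:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, math.exp(-best[target])


# Hypothetical example: two routes to the same privilege node.
adj = {
    "attacker": ["marketing_user", "security_user"],
    "marketing_user": ["priv:client_data"],
    "security_user": ["priv:client_data"],
}
prob = {
    "marketing_user": pivot_probability(0.4),
    "security_user": pivot_probability(0.4, hardened=True, two_factor=True),
}
path, p = most_likely_chain(adj, prob, "attacker", "priv:client_data")
print(path, round(p, 3))
```

Phishing-test results could replace the made-up click rates, and the same per-account probabilities could instead be used to discount edges before running the centrality calculation.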
So I noticed that when you had your node map up, you had a couple of examples of folks who you knew were important and one example of somebody who you expected to have a higher risk centrality based on their position going in. Isn't part of the big value from this the unexpectedly high target? Did you have any examples of somebody who you wouldn't have expected to have as high a risk centrality rating as it turned out they had, and what sort of mitigations did you put in place for that person afterwards?

Okay, so we did actually find some surprising things there as well. One of our consultants had access to
a lot of the admin accounts because he was doing engineering, and he generated a much higher risk score than even our workshop coordinator. However, he does need those accesses to do his job, so there wasn't really much we could do in the risk mitigation profile besides making sure that he's pretty hardened to phishing attacks and all of the rest of that. And we do actually do hardening and trainings for risk profiles at Accenture.

Was that the guy, there were like two big nodes in that big yellow cloud, then one... Yeah, if you'll take a look. Yeah, the tracing parallel? No, it was this guy. Right, that guy, right, that
was the one I was trying to ask about, actually. Who was that person that had that large of an impact? Because when you first put up the workshop coordinator I was going, that's outsized, no, the guy way over... Yeah, yeah.

Now, the workshop coordinator is just very interesting because they are essentially provisioned the barest level of access. They're more in charge of making sure that the workshop runs smoothly, so they have to be able to do a lot of the AV, making sure the agenda is right, and coordinating with the client in order to figure out the
topics and the agenda for the entire meeting. And I do actually want to call out the visualization software that we use, Graphistry. It's very good. That's Graphistry. This isn't live here; however, the actual network graph is interactive.

So that gentleman actually took my question, so I'll have to ask another one. What tools did you use, any tools outside of your own coding, to gather this kind of data and information?

Yeah, so the tools that I used: I used PowerShell to scrape the AD, and from that I pulled it
into Python, and I used PySpark in order to do the graph analytics. PySpark is distributed; it's a lambda architecture that you can do large-scale analytics on. I probably didn't need it for this; however, if I had done the entire AD for an organization I probably would have needed to leverage that. And I did use GraphX as well, which is a package of PySpark.

Next question. I know you scraped AD for this. I would be interested to see how effective this process would be if you were taking information off LinkedIn. You find, here's a CEO, and then here are all the
people that work at this company, and see if you're able to, I guess, build a first-order model, right? You may not get the actual admin rights, but you can make educated guesses.

Yes. Well, I did want to do this from the defender standpoint. That is something you might do from an attacker standpoint: scan a LinkedIn organizational chart and figure out who actually works in close proximity to each other when you target your spear-phishing attacks.

Well, then you could also find out what groups they're members of, what their interests are, and...

Yep, mm-hmm.
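
The AD-to-Python step mentioned in the tools answer can be sketched roughly as follows. This is an editorial illustration under stated assumptions, not the speaker's code: the record keys (`name`, `manager`, `department`, `location`, `member_of`), the function name `build_edges`, and every weight value are hypothetical. In a real export the `manager` and `memberOf` attributes hold distinguished names, which would need parsing first.

```python
from collections import defaultdict
from itertools import combinations


def build_edges(records, admin_weight=3.0, colocation_weight=0.5):
    """Turn AD-style user records into weighted, undirected edges.

    Returns a dict mapping a sorted (node_a, node_b) tuple to a weight.
    Record keys and weight values are placeholders, not real AD schema.
    """
    edges = defaultdict(float)

    def link(a, b, weight):
        if a != b:
            edges[tuple(sorted((a, b)))] += weight

    by_department = defaultdict(list)
    by_location = defaultdict(list)
    for rec in records:
        user = "user:" + rec["name"]
        if rec.get("manager"):
            link(user, "user:" + rec["manager"], 1.0)   # reporting line
        if rec.get("department"):
            by_department[rec["department"]].append(user)
        if rec.get("location"):
            by_location[rec["location"]].append(user)
        for group in rec.get("member_of", []):
            # Simple filter: group names containing "admin" weigh more.
            weight = admin_weight if "admin" in group.lower() else 1.0
            link(user, "priv:" + group, weight)
    # People who share a department or an office get a weaker tie.
    for members in list(by_department.values()) + list(by_location.values()):
        for a, b in combinations(members, 2):
            link(a, b, colocation_weight)
    return dict(edges)


# Made-up records standing in for a scraped directory export.
records = [
    {"name": "alice", "manager": None, "department": "NetOps",
     "location": "DC", "member_of": ["Domain Admins", "Staff"]},
    {"name": "bob", "manager": "alice", "department": "Workshops",
     "location": "DC", "member_of": ["Staff"]},
    {"name": "carol", "manager": "alice", "department": "Workshops",
     "location": "NYC", "member_of": ["Staff"]},
]
for pair, weight in sorted(build_edges(records).items()):
    print(pair, weight)
```

Summing the weights means two people who share both a reporting line and an office end up with a stronger tie than either signal alone would give, which is one simple reading of "people who are closer to each other have stronger risk transference".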
So you're currently using the AD to show how connected people are, but really most people spend their time communicating with other people through email or some type of chat. Is there a way you could scrape an email server to see how much email traffic there is, or use the metadata of emails to figure out who's more connected to each other?

Possibly, if I had access to that data. The thing that's nice about the AD is that I don't need too many privileges in order to go ahead and scrape it and figure out who is who, like who your manager is versus who your subordinates are. All
that stuff is publicly available within the organization; you don't actually need many privileges in order to get that information. You might be able to look at SMTP traffic, just the quantity, and build weightings based on that in order to figure out who actually talks to each other versus who doesn't, which would be another way of incorporating it. Most of these things are about improving the underlying data that you're building the network of influence, or of access, off of. Any improvements to that are just going to reap gains in the analysis. You're only as good as your
underlying data.

I was just wondering if any of the code you used to do this is publicly available, or if you plan to release it in the future.

I need to clean it up a little bit, because there is PII and client-confidential information in it. This is essentially a map of our AD, so I anonymized it for this; however, the values are actually still there, and I mostly just turned them off from showing in the snapshots that I had. But I am planning to eventually put this out there.

Yeah, clean it up a little bit first.
Cool, we have time for one more question. Anyone? One last question? No? Okay, so, less time... oh, you have, wait. Okay, all right, okay, maybe I'll steal that. So that's it, and let's thank Luis.
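
Returning to the email-metadata question from the Q&A, the idea of building weightings from the quantity of mail traffic could look something like the sketch below. It is an editorial illustration only: the input format (sender plus recipient list), the function name `email_edge_weights`, the `min_count` threshold, and the logarithmic scaling are all assumptions, and nothing here reads real SMTP logs.

```python
from collections import Counter
import math


def email_edge_weights(messages, min_count=2):
    """Derive user-to-user edge weights from message counts.

    messages is an iterable of (sender, [recipients]) pairs. Pairs that
    exchanged fewer than min_count messages are dropped as noise, and
    the weight grows with the logarithm of the count so that very
    chatty pairs do not dominate the graph.
    """
    counts = Counter()
    for sender, recipients in messages:
        for recipient in recipients:
            if recipient != sender:
                counts[tuple(sorted((sender, recipient)))] += 1
    return {pair: 1.0 + math.log(n)
            for pair, n in counts.items() if n >= min_count}


# Made-up metadata: only sender and recipients, no message content.
messages = [
    ("a", ["b", "c"]),
    ("b", ["a"]),
    ("a", ["b"]),
    ("c", ["d"]),
]
weights = email_edge_weights(messages)
print(weights)
```

These weights could be added to, or used in place of, the directory-derived edge weights before computing centrality.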