
GT - Lessons learned while modeling rare catastrophic cyber loss events

BSides Las Vegas · 59:12 · 87 views · Published 2022-09
About this talk
GT - 'It was a million to one shot, Doc. Million to one' -- Lessons learned while modeling rare catastrophic cyber loss events - Russell Thomas, Christopher Vos. Ground Truth @ 11:30 - 12:25, BSidesLV 2022 - Lucky 13 - 08/10/2022

Hi there, everyone. Okay, we are from RMS. Next slide. My name is Russell Thomas, and I've been at RMS about three years, give or take. I come to the organization with a background in data science at a regional bank, PhD work in computational social science, and before that high-tech enterprise R&D, marketing, and manufacturing, with a Bachelor of Science in electrical engineering and management. Chris?

My name is Chris Vos. I've been at RMS for about seven years, and I've been working on our cyber risk models since the outset. I have a background in mathematical modeling, particularly in the context of natural catastrophes, so I have a master's in risk and environmental hazards and a bachelor's in physical geography, both from universities in the UK, as you can probably tell from my accent.

So, as Russell mentioned, we do cat modeling, and for many of you in the audience, when we say those words, this might be what comes to mind. However, unfortunately, today we won't be spending the next 45 minutes checking out different designs of denim jackets for cats and that sort of stuff. Instead, what we'll be talking about is catastrophe modeling, which is a field within mathematical modeling focusing on quantifying the risk associated with rare, severe phenomena: things like hurricanes, earthquakes, floods, pandemics, and of course cyber attacks. What we have on this lovely slide here, compliments of Russell, is just an example of some realistic but synthetic hurricane tracks from our North Atlantic hurricane model. So, pass over to Russell.

So when we say rare catastrophic events: one of the features of this talk is going to be some Seinfeld memes. I think we would all agree that this particular episode, where Frank Costanza fell backwards onto a little Fusilli Jerry statue and ended up in the proctologist's office, is a very rare, very catastrophic event.

So, as we mentioned, Russell and I focus on cyber
risk modeling, but this is just one of a whole suite of different catastrophe models that RMS, our firm, builds: everything from North Atlantic hurricane models to European flood models, Asian earthquake models, and terrorism, pandemic, and of course cyber risk models. Our primary clients sit within the insurance sphere. This little slide here gives an overview of the insurance value chain, starting from individual companies who, if they want to buy insurance, will speak with insurance brokers. They'll get a policy from an insurance company, but those insurance companies typically don't want to hold all the risk on their balance sheet, so typically they interact with reinsurers to shift some of that off, and you can see the flow of risk from left to right here. Now, RMS as a company provides risk quantification tools and services throughout this value chain. In the context of cyber risk, we're primarily on the right-hand side: insurers, reinsurance brokers, and reinsurers themselves.

So what do we actually mean by risk? Well, very simply, risk is often defined as being the product of likelihood and impact. Now, in our context, impact actually means the direct financial losses experienced by companies that unfortunately are on the bad end of a cyber incident. On the right-hand side here you can see some of the types of impacts that are considered by our modeling: everything from lost revenue that occurs during an incident (as we know, when a bad ransomware incident occurs, often that means a company can't operate at 100%, so we'll be quantifying that sort of thing) to forensics costs, incident response costs, some kinds of fines, notification costs, ransom payments, and that sort of thing. What we don't quantify, however, is things like post-incident upgrades, loss of share value, and this sort of thing, and that's primarily because these are the sorts of losses that are
not covered by insurance contracts in the cyber insurance sphere. As a result, we really focus on incidents that are above a particular severity threshold. We're really only interested in stuff that causes realized financial pain to companies, not so much in the other kinds of incidents that perhaps network security folks are concerned about, intrusions that they want to follow up, etc. If it results in a financial loss, we're interested; if it doesn't, not so much. Next, please.

As part of this we model a diverse range of different types of cyber incidents: everything from data breaches to ransomware attacks, wipers, cloud outages, and this sort of thing. Each of those different, what we would call, sub-perils of cyber has different likelihoods and different associated financial losses. And of course, depending on what type of company we're talking about, a small company versus a large company, the likelihoods might be different, and depending on the industry we're talking about, the likelihoods also might be different.

Okay, a quick show of hands: anybody here work in the insurance industry, in cyber risk? Okay, interesting. Anybody here do risk modeling as a business, as opposed to... okay. So what I want to contrast in this next series of slides is how these different perspectives of risk vary: they overlap, but they are significantly different. That's really critical to understanding how we approach modeling and how it may be different, especially from the enterprise. If you're a risk manager in an enterprise, essentially all risks, all bad things that can happen to your business, are important. So if you get hit with a really bad event that causes big financial losses or big reputation damage, and you've got to go in front of your board or something, that could be a catastrophe as you define it. But from a population standpoint, if you're the only organization that gets hit by that, the people that look at populations, like governments and regulators, may not see
this as an extraordinary event. It's bad for one, but not necessarily for the population. But if you start seeing events hitting many organizations at roughly the same time, especially with the same type of attack and attack severity, now we've got a population-level catastrophe.

Now, it's critical to understand that insurance companies view and manage cyber risk in the context of a portfolio. That's how they decide what the premiums are going to be, what the rules for coverage are going to be, how much capital to allocate, and even how to acquire reinsurance and report to regulators. Portfolios have boundaries: who's in and who's out. Portfolios have rules of coverage in terms of conditions, and every customer is going to buy different levels of coverage. So while a single insurer may look at the population and take that vantage-point perspective, they're always looking at a subset, and a key part of the RMS product is to help insurance customers go from the population, or macro, view down to their particular portfolio view and say, what does this mean for the types of customers we cover?

So what RMS does in our model, in this version 6 that we've just introduced, is model a synthetic population of all firms above a certain size threshold, and further we separate this synthetic population by industrial sector as well as geographic jurisdiction. Critical to our modeling is the footprint of given attacks. We're concerned about how many threat actors there are, what their campaigns are, and what their footprints are. Different campaigns can have different footprints, and that can affect who's affected and how many. Some may be horizontal, some may be vertical, and in the worst-case scenarios they may cover a very, very large portion of the population. This is really a prime concern to our customers and to our models.

Cool, thanks, Russell. So to formalize this, I think it's helpful to reiterate a couple
of things. The first is that we can think about risk in two categories. We can think about what's called attritional risk: essentially these are incidents, like Russell mentioned at the very beginning, which are independent, which can still be substantial in scale but are not associated with many, many companies being hit. An example of that would be something like the 2017 Equifax data breach, which was brutal for Equifax, but it's not like it hit thousands and thousands of companies simultaneously. Then we have tail risk, which in our parlance is really focusing on low-probability, high-severity events that hit many, many companies simultaneously. WannaCry and NotPetya might be examples of tail risk, and of course we can all imagine much more terrifying examples than either of those that might potentially occur.

We wanted to touch on how this influences insurance premium, because that might be a touch point that you folks have directly with the insurance industry. Insurance premium really is covering the average loss, or the mean loss, that your company might experience in a given year, and this includes both attritional and tail risk. On the right-hand side I've got a super simplified example of an imaginary company. Let's say an insurance company comes up with a ten-thousand-dollar technical premium. A technical premium is essentially defined as the amount they're charging directly to cover the claims you might bring. It doesn't include profit or other sorts of costs. Next one, please.

Again, these numbers are made up, but hopefully they will help us follow through. The attritional component, you can see here, is about seven thousand five hundred dollars, and the way that this might be computed (again, this is a simplification) is that you might take the mean loss that the company might experience
conditional on an incident: given that they've experienced an incident, on average what's the dollar cost? Then you combine that with information about the likelihood, or the probability, of that happening, in this case five percent for easy numbers, and you get what's called a probability-weighted loss, which in this case is seven thousand five hundred dollars. But that only covers the attritional component, those independent events. What you also need to consider is the tail component, which is often called the catastrophe load. This is then considering, okay, what about all those events which might hit loads of companies simultaneously? In this case our example is saying that if our company gets caught up in one of those events, the loss is going to be $250,000 on average, but there's only about a one percent chance in a given year that this company gets caught up in one. When we do the probability-weighted loss we get $2,500. Sum them and we get $10,000.
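
[Editor's worked example] The arithmetic in the premium example above can be reproduced with a minimal Python sketch. The `LossComponent` class and its field names are hypothetical, introduced only for illustration. The $150,000 attritional mean loss is not stated in the talk; it is inferred from the quoted 5% probability and $7,500 probability-weighted loss. As the speaker notes, the numbers themselves are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossComponent:
    """One component of expected annual loss (illustrative structure, not from the talk)."""
    name: str
    mean_loss_given_event: float  # average dollar loss, given that the event happens
    annual_probability: float     # chance that the event happens in a given year

    def probability_weighted_loss(self) -> float:
        # Expected annual loss contributed by this component.
        return self.mean_loss_given_event * self.annual_probability


# Assumption: $150,000 is inferred as 7,500 / 0.05; the talk gives only
# the 5% probability and the $7,500 result.
attritional = LossComponent("attritional", 150_000.0, 0.05)

# Tail component ("catastrophe load"): figures as quoted in the talk.
tail = LossComponent("tail", 250_000.0, 0.01)

# Technical premium = attritional component + tail component
# (no profit or expense loading, per the definition given in the talk).
technical_premium = sum(c.probability_weighted_loss() for c in (attritional, tail))

for c in (attritional, tail):
    print(f"{c.name:<12} ${c.probability_weighted_loss():>9,.0f}")
print(f"{'premium':<12} ${technical_premium:>9,.0f}")
```

Keeping the two components separate mirrors the split described in the talk: the attritional part reflects independent incidents, while the tail part is the share of the premium driven by events that hit many companies at once.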
Next, please. So one thing that we thought is useful to mention: Russell spoke about the different scales of risk, and what's catastrophic in the eyes of an enterprise versus the eyes of an insurance company or the eyes of a population. It's worth mentioning that for an individual company you could imagine that a brutal double extortion event might be the worst-case scenario, where all of their commercially confidential information gets stolen, personal information they hold gets stolen and leaked, and at the same time their operations grind to a halt because everything's encrypted. Horrendous. However, due to their scaling properties, double extortion and those sorts of attacks might not be the drivers of population-level catastrophe risk. It might be something like a wiper that is just rolled out through a worm, or something like that, instead. Ultimately, depending on the angle from which you're approaching risk management, you might be concerned about certain types of incidents over others.

So before we really get into the meat of the presentation, we thought it'd be helpful to spell out what catastrophe models, and in particular our cyber risk model, do and don't do. What it does is assess the likelihood of different loss outcomes. We're not trying to make predictions of exactly what will happen. To use a simple, maybe appropriate analogy in Las Vegas: if we imagine a dice, what our model is saying is that the dice has six sides and each side has a one-in-six probability of being rolled. What we're not saying is that the next roll is going to be a two. If we did know that, then we wouldn't be here; we'd be in the casino instead. [Laughter]

What else does it do? Well, what it really aims to do is capture the key drivers of risk. What we're not trying to do is reflect all of the complexity of the real world, all of the huge depth of
technical complexity that you're all super aware of, and all the complexity of decision making of threat actors. Ultimately we're trying to identify what is really driving risk, and this comes back to the fact that all mathematical models are simplifications of the world. They are helpful decision-making tools, but they're not supposed to reflect all of the ugly details of the real world. And finally, what our models do is complement expert judgment as a decision-making tool. They're not supposed to replace humans, because as I mentioned before these models are simplifications, so we need expert judgment on top of them. Some people might think that the output is too high, too low, etc.

So now that we've laid the groundwork, we'll move on to our first lesson, which is the benefits of causal risk modeling. A causal model is a model that represents the causal or mechanistic relationships in a system. On the flip side, statistical models are models that reflect the mathematical or statistical relationships between different variables. Next slide, please.

Here we have a toy example of a statistical model that you might use in the context of cyber risk. This follows a very, very popular framework, the frequency-severity framework. On the left-hand side we have a probability distribution showing the likelihood of a company experiencing an incident. You can see there's about a 75% chance that it doesn't experience an incident, just over a 20% chance that it experiences one incident, and a small percentage that it gets hit by two, three, or four incidents. Then on the right-hand side you have the severity side, which basically says: given that a company has experienced an incident, what is the range of potential dollar loss outcomes and their associated likelihoods? On the right-hand side that's a probability density function, so essentially where the curve is highest you're most likely to see an outcome, but
we can see that all the way out to very large numbers there's a chance that the loss plays out this way. One thing that these kinds of models are good for is when you have a lot of data, but what they're not very good for is trying to quantify very, very extreme outcomes that you haven't observed. If I want to ask the question, what's the likelihood that next year 25% of companies get nailed with a wiper, this sort of model can't help me, because we haven't observed anything like that in the past. We need to use different techniques.

Yeah, so just to underline that last point, I want to share a brief story, a conversation I had in 2008 or 2009 with a famous security consultant and keynote speaker, and he... is that 15 minutes? Did you just wave 15 minutes? Thank you. Scared me. So anyway, he was arguing against the possibility of ever quantifying low-probability, high-magnitude loss. His argument was that if you take infinitesimal probabilities against incredibly large numbers, then with your margin of error you can end up with any result. So he was trying to dissuade me and other people from going down that path.

Coming back to our Seinfeld meme: how would any of you estimate the likelihood, the probability, the risk associated with this particular loss event? I would challenge you to take a standard statistical model of frequency and severity and apply it to this. It's very hard to get off the ground and have any credible information. So what we at RMS do in our version 6 model is model a synthetic world that has all of the key elements (firms, software, vulnerabilities, campaigns, threat actors), and we connect them in a mechanistic or causal chain. In this case, if we wanted to include this, we'd have to have a threat actor of the type Kramer, and Kramer would have to have the capability of building weird things like little statues, and his attack pattern would be leaving that statue on the ground where somebody might fall on it, and then the causal mechanism is how
is somebody like Frank, the potential victim with the vulnerability here, likely to fall, and if he falls, what's the likelihood that he's going to fall in a particular way such that he's going to have to visit a proctologist?

So to go a little bit deeper into this synthetic world, as Russell described, we're really trying to call out, as I mentioned earlier on, the key drivers of risk in the cyber risk ecosystem. Here what you'll see is components: everything from individual threat actors that we spawn, which have various different characteristics (size, skill, motivation, etc.); to software, the different kinds of software that exist, with different market shares, and what kinds of companies use those pieces of software; to vulnerabilities, of course, which are crucial to understand, the rate at which those vulnerabilities spawn, their characteristics, and what their exploitation characteristics look like. But then also things like the different ways in which threat actors can gain initial access into corporates. Things like social engineering have extremely different scaling characteristics to worms, for example, and modeling the specific ways in which those play out is super important.

Those of you who work in a field a little bit closer to ours might be wondering, how on earth do you operationalize all of that? This is a bit of a simplification, but ultimately the way that this works is that, for fifty thousand synthetic years, we initialize a world in which you have software with different market share characteristics and different vulnerabilities that have been spawned with various different characteristics, and then of course we put into place different ways in which threat actors can get into companies and different bad things they can do once they're inside. Those threat actors are essentially able to evaluate their different options each year, and Russell will talk a
little bit later on about how we go about that. Then these threat actors will choose what nefarious thing they decide to do, and we essentially rinse and repeat this process, exploring lots of different states of the world, because none of us can write down on a piece of paper now exactly what's going to happen from a vulnerabilities perspective over the coming year. So we need to explore a broad range of potential outcomes, and once you repeat this process, essentially what happens is you get synthetic attacks occurring, some of which are much larger in scale than others. [Music]

Yeah, for those of you who might be familiar with epidemiological models of infectious disease, especially across a network or a geography, it's a similar structure. Those models take the bacterium's point of view or the virus's point of view, and the susceptible hosts sort of provide the backdrop for the virus to move around. So in this case the threat actors take the primary acti