
Metrics Mess: Why the Lack of Clear and Common KPIs Undermines SecOps

BSides NYC · 2023 · 52:00 · 480 views · Published 2023-06
Style: Talk
About this talk
"Critical SOC KPIs? MTTD. MTTR. Right? Except… “Detect” starts from compromise? Or Alert? One “expert” says “Respond” means “time to resolve an incident” but defines neither “resolve” nor “incident.” Time-To-Qualify? Identify? Triage? Contain? Resolve? Recover? I run SOC/CIRT for a F500. My vendors all use different words. Metrics don’t align. It’s ridiculous. I’ve tabulated 30+ sources and… no one has a frickin’ clue. We need a common language, like ATT&CK gives TTPs. If “a problem well-defined is half-solved” we’re doing a crap job solving this. EVERYONE’S Security suffers as a result. I’ve diagrammed one possible solution as a conversation-starter."
Transcript [en]

Thank you for having me. I'm going to tell you three quick things about me. I don't like to do a whole "here's my professional bio." There are only three things you need to know about me. One, I am actually from New York. I don't live here anymore, and the reason I tell you that is that if I occasionally slip into fits of profanity, which I will try not to do now that I know we're recording, you'll have to forgive me. The second thing you need to know about me is I am not actually a security person, although after 24 and a half years of faking it, kinda sorta. I am actually an MBA bean counter weenie who stepped into cyber by accident in 1999 and haven't scraped it off my shoe yet. And the third thing you need to know about me is yes, I do in fact work at JetBlue Airways, where my remit is currently threat intel, threat hunting, SOC, incident response, attack simulation, and detection engineering: basically everything that has to do with threat or detective controls rather than preventative controls. I only tell you that part for two reasons. One, it's germane to my talk, as you will see, and two, New York's Hometown Airline, so let's hear it for JetBlue. All right, see, I've got to stay close to the mic.

The next thing you need to know: I always like to start with thank you. All right, we are going to do a little icebreaker, and I am going to encourage you all to participate, because you will win valuable prizes. And by valuable prizes I mean cold hard cash, folding pocket money. All right, show of hands: who knows what that is? Awesome. Okay, keep your hands up. Now, those of you who know what this is: in the context of that picture, for one greenback, can you guess what... just shout out. This is initial access, and he's sitting closer than the other guy. Would you pass this down to the gentleman in the green shirt, please? All right, cold hard folding money. So see, now you've cracked the code. What's this one? You know what, I'll give you credit for
that, because I go to the gym about once a year, so I'll give you credit for that. It wasn't the one I had in mind, though. Elevation or escalation. All right, you know what, I've just got to start handing out money. All right, you said persistence, you said escalation, which is also true, and who said lateral movement? I mean, yeah, lateral movement. All right. By the way, I had to rob the 7-Eleven to get this many one-dollar bills. Would you kindly help me out with that? Thank you.

All right, click, click. What's my point? Any Douglas Adams fans? Okay. I love MITRE ATT&CK, and it's not just because I'm a Katie Nickels fanboy, although I am also a Katie Nickels fanboy. I love MITRE ATT&CK because it gave us a common language. I'm also a former linguist. I speak a couple of languages. I'm also annoyingly anal-retentive about definitions. I believe the old adage that a problem well defined is half solved, and MITRE ATT&CK gave red teamers, blue teamers, purple teamers, vendors, and government officials a lingua franca where, when they say "T1504-b, sub c, number two," we all can look it up and agree that means persistence on a Windows box established by modifying a registry key. Hallelujah. After 20 years as a vendor and the last three defending a Fortune 500 enterprise, let me tell you, just being able to talk to each other is a huge win. So I am a big fan of MITRE ATT&CK. It's a Rosetta... sorry, was there a question? No? Okay. You can raise your hand or shout or scream at me. I've got a thick skin. I've done 700 of these, so just talk over me if you need to, or tell me I'm wrong.

By the way, I call this talk "Metrics Mess," blah blah blah, and how we can fix it. We are gonna fix it. Not me. We, all of us: the community, practitioners, students, academics, attackers, defenders. Because nobody else is going to. Okay, I just cheated on one of my later slides. I love MITRE ATT&CK because it is that Rosetta Stone. It allows, Ian, I don't know if you're here, but the guys from Tidal Cyber to say, "Here are the most common TTPs for initial
access this month," and a guy who's writing SIEM rules can be like, "Oh, cool," and a guy who's working at the company that makes your mail security gateway, all go, "Ah yes, I see: poisoned email attachment, an Excel file with a VBS script that downloads a batch... bash script that downloads BATLOADER that ultimately brings in Royal ransomware. Got it." And everybody can look at that attack chain with its little tags on it, and we all know what we're talking about.

Now, I mentioned that I also run, among other things, threat intel and hunting and detections. I also run, or oversee, SOC and CIRT, and that kind of common language and mutual understanding of what everything means when we talk about the job that we do is also... anybody here ever work in a SOC? Okay, so the same is true of metrics when you work in a SOC, right? No? No. Okay.

By the way, I totally blew my first joke. The icebreaker is Russian. I was gonna do a whole routine. Anybody here? No. So I was going to do a whole thing with an accent, and I would keep doing callbacks through the whole talk. Anyway: no more funny, now is serious part.

Okay, show of hands if you have worked in a SOC. Cold hard cash, [unclear]. Yeah, raise your hand, because otherwise I'm going to run out of dollar bills too fast. Raise your hand, I'm calling on you. If you have worked in a SOC, what is one of the things you should measure or track in a SOC? Okay, what does that mean? And you said... you two guys just made the whole point of my talk and you don't even know it. Dollar for you, dollar for you. All right, sorry if I shorted somebody a buck; come see me after.

Okay, I wasn't kidding. You just made the point of my whole talk in the first five seconds. Two people who have experience working in a SOC said "MTTR," and he said "mean time to respond" and he said "mean time to resolution," or vice versa. Anyway, thank you for making my point. It's a... I'm from New York... [unclear], right? It's a mess. It's a disaster. So I decided, being an old MBA bean counter weenie
spreadsheet guy, I decided, what else, to make a spreadsheet. So I reviewed 30 sources from credible people. I would do air quotes with both hands, but I've got to hold that. Anyway, you get the point. It was so bad, the data was so bad, that I had to reduce it to 12 more formal publications, like white papers or blog posts from the chief something-or-other, from 12 security vendors who have collectively raised 2.5 billion dollars from people who are supposed to know what they're doing. Okay, so let's just leave aside the fact that I had to drop 18 out of the 30 sources because they were so bad they were unusable. Here's the top of the pile. Ready? By the way, my Google searches were things like "SOC KPIs" and "key SecOps metrics." Okay, it contained pearls of wisdom like this one.

Hang on: who said mean time to respond? Define it for me. I'll give you another dollar, double your money. "The time it takes for you to go..." Thank you. Would you hand that down to... I need my tip. All right, I know this is not fair to you, but [unclear]. All right, so it's a small industry. I'm old. I know a lot of people. All right, so here's my point: your definition was actually clear, and it included an end point. Congratulations, you are now better than 11 of the 12 vendors who collectively raised 2.5 billion dollars in funding.

Here's my first pearl of wisdom, before I had the good fortune to meet you, sir: "Mean time to respond is the average time it takes to respond." I believe, being an old spreadsheet guy, this is what Excel calls a circular reference problem.

Okay, more pearls of wisdom. I'm going to go in ascending order of people's money, right, because MBA bean counter guy. Time to detect is "the time it takes to become aware of an incident." My fetish for definitions: how about first we agree what, excuse me, both "become aware" means and what an "incident" is? Sorry, I'm gonna let these guys off the hook because they only raised eight hundred thousand dollars of other people's money. Come on,
clicker. Is it... I didn't make these up. "The time to become aware of IOCs and other threats." So if I become aware of jetblue.com being registered or going live, is that when I start the clock or when I stop the clock? If it's when I stop the clock, when do I start the clock? If it's when I start the clock, when do I stop? These guys raised like 30 million dollars in funding.

But let's go with somebody we all know. If you've been around this industry at all, you know Mimecast. Theirs was: time to detect is "the time to identify an attack." So is that when the alert goes off, or when the person reads the alert, or when the person decides the alert is not a false positive and it actually is an attack, or... I don't know. Do you? It would be helpful if they'd actually explained that in their white paper about SOC metrics, but they did not. That is better than the one from Cipher. Oh, by the way, Mimecast: 90 million dollars in other people's money. Cipher just listed "time to," "time to," and "time to," and included no explanation at all. Awesome.

Okay, my personal favorite, and you'll see why, and obviously I'm not afraid to name and shame. My personal favorite is "the time it takes your security team and technologies to notice abnormal behavior." First of all, I think we all know, especially you students on the college bar scene, that noticing abnormal behavior is rather subjective. And second of all, you cannot have one metric for both when the control notices and when the human notices that the control noticed. My favorite part of this story is the company is called SecurityScorecard.

Okay, now, mean time to detect can mean a lot of things. There's ambiguity. Okay. But mean time to respond, or resolve: we would hope MTTR is better. And by the way, those are the only two that people consistently quote or think of, MTTD and MTTR, and I think we've shown that you can't really get consensus on that. So: "The average time to respond to a cyber threat is the mean time to resolve." Okay, for anybody who's been an
incident responder, my whole life just changed. Responding is resolving. I don't have to do any work. This is awesome, right? I think you're getting the point. I haven't actually seen which Marvel movie this is from, but I'm told it's a really famous quote, so I kept it.

All right, by the way, forgive me for putting my phone over here, but I have no idea how fast I'm going, so I'm going to keep an eye on the clock. There's one on the screen. Okay, I'm doing fine on time. In fact, I'm probably going a little fast.

So let's talk about the actual data, not from the 30 sources but just from the 12 that were credible enough to have raised billions of dollars in capital. Spreadsheet guy: there's my fancy spreadsheet. In case you want me to do the math for you, that is 46 instances of 13 different metrics. My favorite part of this one is that, if there's nothing else we could agree on, a SOC, and no disrespect, again, I oversee one, their job is to sit there, stare at things, and when they go ding, notice them. You would think that we could at least get agreement on the fact that mean time to detect, whatever "detect" means, should be on the list. Nine counts for time to detect, from 12 sources writing white papers and thought pieces on SOC metrics. In other words, 25 percent of the people in this business do not think that time to detect is an important metric for a SOC. I'd slam my head on the desk, but it'll mess up the recording.

Okay, of the publications that I included in the data set, this is the number, other than the fellow in the audience, that actually provided definitions that were both explicit, meaning it was clear what they meant, and functionally usable by a guy who actually has to report metrics and run a SOC: one. Hat tip to LogRhythm. I should note that it took their CTO authoring the piece to get it vaguely right, and by the way, they were not great. They were only the best of a bad bunch.

Okay, now, I assume that you all have your opinions on these next questions, or you wouldn't
be sitting here, or the red team side was just too incomprehensible to go to, so you're like, "Well, what am I going to do until five o'clock?" Either way, here's the part that the MBA weenie bean counter cares about: so what? Why do any of us care about this? Why am I talking about it? Why are you listening to it? Okay, let me break that down a little further. This lack of common metrics, this lack of a common language, what MITRE did for TTPs and what clearly doesn't exist for SOC and incident response as business processes: why is that a problem? If we fix it, what do we get for all the trouble? And if we agree that it's worth doing, which I will try to convince you of, how do we do it?

Okay, let's take a step back. I'm going to ask someone, anyone, to go out on a limb. This is not a right-or-wrong-answer question. There are no right answers. This is an opinion question. Cold hard cash. If you are a practitioner, if you work in a SOC or a CIRT, or you fiddle with a SIEM or an EDR, or you do intel or threat hunting or engineering or anything that has to do with securing an enterprise or an organization, here's my question: why do we have a job? Any of us in this field, or students who want to be in this field, why do we have a job? Trying to double his money. "To keep the business..." Thank you very much. [unclear]. Anybody else? I've got a couple of bucks left. "To get paid." No, no, that's why we do it. Why does the job they pay us to do exist? "Because people are dumb and click on things." Yeah, I mean, I can't disagree with that statement. I think I've got a dollar left. "Keep the lights on." No wrong answers. He should get two dollars, but I don't have that many.

All right. Like I said, I have been doing this for 25 years. There's no right answer, there's no wrong answer, so I'm going to give you my answer. I know this seems like a tangent, but it isn't. It's at the core of my talk today, what I'm talking about and why I'm talking about it. We may take the job to earn a living, but the job exists,
I would argue, for only one of four reasons, and you are welcome to raise your hand and tell me something we do as cyber professionals or IT security professionals that doesn't fit in one of these buckets. I haven't found one yet, which is why I held at least one dollar in reserve in case you can prove me wrong. All right, here we go. I propose that everything we do in cyber boils down to one of these things. Ready?

Reduce the likelihood of a loss event. Loss can be money, loss can be data, but basically this is your preventative controls, your SSDLC, your network security architecture: doing all the stuff right so you don't have to deal with the problem later. Keep the bad guys out. Which is, as I said, exactly the half of cyber that I do not do.

I work in domain number two, which is: assume that that failed and the boogeyman is in the house. Right? Intel, hunting, draft detections, institutionalize them, SOC playbook, escalation, CIRT playbook. And then when you think your controls are in order, you test them, because an untested control, an unvalidated control, is not a control. It's a hope. If you haven't tested it, assume it doesn't work. Detect faster to reduce the likelihood of damage from that event.

Ensure compliance. I know this is the one that most people in cyber roll their eyes at. NIST 800-171. I promise you, there are fines for regulatory violations that will dwarf any ransomware payment. Okay? The government can impose bigger fines than Russian criminal gangs. I guarantee it. Don't believe me? Ask Equifax. Okay, the obvious joke.

Finally, defend the brand. And for those of you hardcore security geeks who want to open a C prompt and think that defending the brand doesn't matter: I used to work with a very large investment bank, and they said, when we started doing threat intel, this was third of our three priorities, right? Keep operations going, minimize financial losses, oh yeah, and protect our brand. Over the 15 years of evolution of both their thinking and actor
sophistication, that literally reversed. They now consider defending their good name more valuable than the loss from any particular event.

Okay, so today's talk is about this one. Okay? Defend faster, detect faster. Riddle me that, Batman: why does that matter? By the way, does anybody have something in cyber that does not meet one of those four domains? I challenge you. Pen test, vulnerability management, SOC, SOAR, SIEM, take your pick. Okay, the one we're talking about today is detect faster. Why? Because dwell time equals damage, period. Right? The dwell times on some of these well-documented cases are unbelievable. A guy gets in on a Tuesday, and it's not like they run through the network and CryptoLocker everything on Wednesday. The average dwell time before the really adverse event happens is like weeks or months. That's a lot of time to read your intel, draft your detection, run a threat hunt, put things in with your SOC. There's time there. Shorter time, less damage, full stop. I just clicked it. Okay, so what do we get if we fix it? Less damage. Pretty simple, right? We do this better, we do it faster, we have less damage.

Okay, so: lots of pictures, lots of stories, lots of hand-waving. All of a sudden I'm going to get into the, like, where's the guy with the stopwatch, hard part, right? Because right now I'm just philosophizing, waving my hands, telling stories. How do we fix it? Okay. It is "we." It's all of us. I mean, literally, I have a colleague, not where I work, who is struggling with the following. They're relatively new to their job. They have been asked to report metrics for the performance of their SOC and incident response functions, and they cannot make the numbers line up except by doing monthly gymnastics with a spreadsheet, because they have a SIEM, a SOAR, and a ticketing system that do not agree on the definitions of time to detect, time to respond, time to contain, time to mitigate, time to remediate, time to close. Right? If you're using SentinelOne, Chronicle, Cortex XSOAR, and, God help me, ServiceNow, and they don't line up, you're literally... and by the way, of course, half of them are local and half of them are GMT. By the time you put out your monthly metrics report, it's like painting the Golden Gate Bridge: you get to the end and you've got to go back and start again, because the other end's rusted. That's all this person does, put out these stupid reports, because there is no common language between the systems that are producing them.

So how do we fix it? I have a proposal. I call it the B6 timeline model. Bonus question... oh, by the way, I should say, I promised him I would do this: this is largely the brainchild of me and a brilliant young engineer named Andrew Malone. If you like it, I get all the credit. If you don't, andrew.malone jetblue.com, feel free to send your critiques to him. All right, bonus prize. I'm actually gonna go big here: five dollars. This is a trivia question outside of security. Anybody want to... have any idea why I call it the B6 model? It is, all right: our IATA code. Hand that man a finster. Thank you very much. So if you ever get a ticket on JetBlue flight 1545, your ticket actually says B61545. Thank you. I don't want to put it out... I'm not looking for credit for me or JetBlue. I had to call it something.

Here we go. Can we agree that, and I know there are outliers, and we could argue about DDoS, which I would argue is actually not a security issue, it is a resource exhaustion issue, which is not actually a threat in the same sense, but that's a get-me-over-a-beer conversation. Can we agree that, generally speaking, if we're looking at a MITRE-type event, there is an initial compromising act? Okay, somebody gets in. Initial access, right? Then, click, there is some amount of stuff that happens, and then, God willing, unless your time to detect is infinite, a control goes ding. Now, the reason I put brackets around letter B is because, depending on whether what they're doing is something very well known and your EDR recognizes it and goes ding, or it's not, or your EDR sucks, which by the way
all of them do I believe tr