
Billy Huang - Taking Back Control of Your SOC with Risk Based Alerting

BSides Augusta · 29:11 · 330 views · Published 2022-10
Category: Technical
Style: Talk
About this talk
Is your SOC inundated with alerts that end up as false positives? Are your analysts experiencing alert fatigue? These are challenges that every SOC faces with the increase in technologies generating logs. Utilizing Risk Based Alerting (RBA), we no longer tie a single alert to a single detection, but rather utilize risk scoring and a security framework like MITRE ATT&CK for higher fidelity alerts. Traditionally, alerts are created based on a narrow set of criteria, which often need tuning and produce a lot of noise. As new and sophisticated attacks emerge, we try and keep up by increasing our detection mechanisms, which generate more alerts and more noise, thus overburdening our SOC analysts. All that can change with Risk Based Alerting. We will be showing you how RBA works inside Splunk Enterprise Security (ES) and the benefits of building detections and even conducting high level threat hunting.
Transcript [en]

so all right thank you can you guys hear me hear me all back there no you can't hear me why are you kidding are you uh for real flip the switch they gave me two of these things like which one is

this one okay there you go now can you hear me all right cool audio technical difficulties are solved all right let's put this back

all right hey well thanks for uh thanks for coming I know I got the after lunch special so I'm going to try to entertain you guys keep you guys uh awake and focused but today I'm going to be talking about um how you can take back control of your security operations center or SOC as you all know with risk-based alerting you might hear me say RBA just short for risk-based alerting all right so who am I so my name is Billy Huang I'm a practice lead for data analytics uh at GuidePoint Security we have a booth out there so if you guys want to come visit us and talk uh we partnered with Tines and we're between

the raffle counter and the lock picking um so it's not hard to find us so I currently live in Raleigh North Carolina drove through the hurricane to get here just to talk with you guys yesterday uh but I'm a cyber security engineer by nature I primarily work with Splunk the SIEM you know Splunk but I also then ventured and started working with different security orchestration automation and response tools like Splunk SOAR which was formerly called Phantom and then also Tines as well so gotta give a shout out to them um and then I also spent quite a bit of time working insider threat detection as well utilizing RBA uh I did nine years

in the Army uh I was in Air Defense Artillery then uh went cyber uh for my last three years and then I got out in 2019 I've been just doing primarily security since then what I do outside of work I got three boys three little boys and I'm busy coaching them breaking up fights you know running around with them doing soccer and baseball and things of that nature all right so uh this is a short lightning talk so I'm gonna try to power through these I do want to do a demo uh in Splunk at the very end so uh we'll try to save questions for the end if possible but if you've got something

pressing just raise your hand so traditionally SOCs you know they've got correlation searches that they built or detections and those detections create alerts now you know these alerts what we've seen is that in 40 percent of organizations they're getting 10,000 alerts a day this may not just be coming from their SIEM but it might be coming from their EDR solution you know different solutions that are just pounding you know pounding the SOC with alerts and what I've seen is at least 50 percent if not more you're getting a lot of false positives uh and so you know you're chasing down these alerts you're constantly um what we're seeing is you know you're abandoning these alerts you know like

when you pick up your phone and you get the same notification over and over from the app you actually end up being blind to that right because you see that notification so many times you're blind so people are abandoning their alerts they've created them they determine that they're horrible and they're just forgetting about it people are constantly tuning their searches um tuning is the right thing to do right because you want to make your searches uh more efficient more accurate but they're just spending so much time tuning then they're also creating suppressions and whitelists so many suppressions that you may actually end up suppressing you know true positives so that's bad and then also just a very noisy

environment so many of you guys might have heard we're getting a lot of alert fatigue in your SOC um so actually before I go on how many people here have played with Splunk all right so cool cool a lot of people um yeah so just getting tons of alert fatigue uh you know it's just busy and mundane work right so we've got two problems that we want to try to solve the first how can security teams reduce alert fatigue while improving their overall security coverage right we want to decrease alerts but we want to make our alerts better second is how can security teams shift their triage investigations to higher fidelity alerts so now let's create

higher fidelity alerts let's create only alerts that give us that meaningful and contextual information all right so let's introduce RBA so now right we create correlation searches or detections now instead of having those give you a one for one alert now we're sending all that to your risk index so you guys all know what an index in Splunk is right pretty much just a data repository somewhere where you can organize your data um and search efficiently in Splunk so now all those uh alerts that we had that were pounding the SOC instead of having alerts let's shift all that information into a risk index and let's store it all there so in this scenario we're going to be

using Splunk Enterprise Security has anyone heard of ES Splunk ES so it's a premium app that you have to get on top of your Splunk so you'll create another ES search head and then that's got a lot of um really cool like security configurations and stuff that you can do all right so now inside that risk index we're going to add some extra contextual information so the first thing we're going to add a risk score and the cool thing about this risk score is it's not just static it's going to be very dynamic based on you know who the user or what nt_host we're talking about we're going to tie in the MITRE ATT&CK

framework we're going to add a tactic and a technique to each of these alerts next we can also add any type of vulnerability data if you have a vulnerability um system or you want to add something from the outside like a third party you can integrate that into each of your risk index events now this risk index holds an immense amount of contextual data it's great data now we're going to formulate our alerts and notables based on the risk index all right so we've got um two different things I want to talk about here we've got risk rules or RR and we've got risk indicator rules so on the

left side you've got our old alerts we're renaming them as risk rules so you had you know an anomalous login rule or detection that originally was an alert now we're just going to send that to the risk index you've got data exfiltration data you want to put in there all that really interesting stuff that may not be malicious in nature we're going to put that in the risk index right we don't want to get alerted on every single one of those right for example a um uh an account lockout right that happens like depending on how big your organization is you might get hundreds of those account lockout event codes and

they may not be I mean 99 percent of the time those are false positives so we're going to shift that all into the risk index now we're going to create risk indicator rules or RIRs and that's going to look at all that data inside your risk index and we're going to make the alerts from there so a couple of the ones you can create right off the bat uh the first one is the risk score threshold exceeded so we can now say hey if a certain user or certain nt_host exceeds this particular risk score now I want to get an alert
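Editor's sketch: the "risk score threshold exceeded" rule described above can be illustrated outside Splunk with a few lines of Python. This is a minimal sketch under stated assumptions, not Splunk ES's implementation: the event fields, the sample rule names, and the threshold of 100 are all illustrative.

```python
from collections import defaultdict

# Hypothetical risk events, shaped loosely like rows in a risk index.
# Field names and values are illustrative assumptions, not Splunk's schema.
RISK_EVENTS = [
    {"risk_object": "alice", "rule": "Anomalous login", "risk_score": 40},
    {"risk_object": "alice", "rule": "Account lockout", "risk_score": 10},
    {"risk_object": "alice", "rule": "Large outbound transfer", "risk_score": 60},
    {"risk_object": "bob", "rule": "Account lockout", "risk_score": 10},
]


def objects_over_threshold(events, threshold):
    """Sum risk per object and return the objects whose total exceeds the threshold."""
    totals = defaultdict(int)
    for event in events:
        totals[event["risk_object"]] += event["risk_score"]
    return {obj: total for obj, total in totals.items() if total > threshold}


# Only alice accumulates enough risk to raise an alert; bob's single
# account lockout stays in the risk data without paging anyone.
print(objects_over_threshold(RISK_EVENTS, threshold=100))
```

The point of the design is that no single low-score event produces an alert on its own; only the accumulated total per risk object does.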

um how about multiple ATT&CK tactics now maybe this particular user they've done a certain number of um things that have crossed over two or three different MITRE ATT&CK tactics right they went from um detection to exfiltration to something else right so now we're interested in that um maybe we can look at exfiltration in a certain sequence of events this is really good for insider threat because let's say we want to look at a particular order of events for example someone plugs in a USB now they're requesting a permanent or a temporary exception to the rule to use that USB now we're seeing them actually exfiltrate you know hundreds of megs or gigabytes to that USB
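Editor's sketch: the USB example above is an ordered-sequence check per user. The following is a minimal Python illustration, assuming hypothetical event type names and integer timestamps; in Splunk the same idea would be expressed in SPL rather than Python.

```python
# Hypothetical ordered stages for the insider-threat pattern described above.
# The event type names are assumptions made for this illustration.
SEQUENCE = ["usb_inserted", "usb_exception_requested", "usb_exfiltration"]

# (timestamp, user, event_type) tuples; values are invented sample data.
EVENTS = [
    (1, "carol", "usb_inserted"),
    (2, "dave", "usb_exfiltration"),
    (3, "carol", "usb_exception_requested"),
    (4, "dave", "usb_inserted"),
    (5, "carol", "usb_exfiltration"),
]


def completed_sequence(events, sequence=SEQUENCE):
    """Return the users whose events contain every stage of the sequence, in order."""
    progress = {}  # user -> index of the next stage we are waiting for
    for _, user, event_type in sorted(events):  # process in timestamp order
        idx = progress.get(user, 0)
        if idx < len(sequence) and event_type == sequence[idx]:
            progress[user] = idx + 1
    return sorted(user for user, idx in progress.items() if idx == len(sequence))


# carol walks through all three stages in order; dave's events are out of order.
print(completed_sequence(EVENTS))
```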

um right so now we see that particular sequence that we can alert on all right so key terminology uh so risk rules and risk indicator rules we talked about those risk objects so now that's going to be your point of interest inside the risk index um so this could be a username this could be an IP address an nt_host right this is a particular unique um point of interest that we have the second is the risk object type we're going to categorize these in two different types we're going to call them an asset or an identity an asset would be like a physical machine a server a computer an identity is going to be

associated with a user or an email now with Enterprise Security assets and identities is something you might hear a lot because you're going to configure that initially right off the bat when you set up Enterprise Security what you do with the assets and identities is you're going to now say hey I see um this particular asset it may be a domain controller or some type of privileged server that's going to have a critical priority level compared to a regular user host machine right because that's going to be way more interesting than if we see something happening on a regular host machine just like an identity if we see a

particular user like an admin user a service account a privileged user that's going to be way more interesting than um just a regular user now I'm going to show this in another slide but now you can actually set your risk score and you can vary that based on the criticality level of your asset or identity so going on your risk score would be your assigned numerical value so generally the higher the risk score the riskier and then your threat object is an optional field that you can put in a lot of times people will put specific IOCs in there so if you see a hash or

anything like that you can actually say hey set this MD5 hash into the hash field and we can use that for threat hunting in the future all right so the benefits of uh risk scoring so RBA gives you the ability to modify scores based on the criticality level of a specific asset or identity so kind of like I was saying your priority levels in the middle are critical high medium low and informational um on the left side here you can see we're in the risk factor editor inside ES and this allows us to say whenever I see a high priority user I'm going to take that score and I'm going to multiply it by 1.25
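Editor's sketch: the risk factor described above is a multiplier applied to a rule's base score according to the priority of the asset or identity. A minimal Python illustration follows. Only the 1.25 multiplier for high priority comes from the talk; the other multipliers and the base score of 80 used in the example are assumptions.

```python
# Hypothetical multipliers keyed by asset/identity priority. Only the 1.25
# for "high" comes from the talk; the other values are assumptions.
RISK_FACTORS = {
    "critical": 1.5,
    "high": 1.25,
    "medium": 1.0,
    "low": 1.0,
    "informational": 1.0,
}


def adjusted_risk_score(base_score, priority):
    """Scale a rule's base risk score by the priority of the risk object."""
    return base_score * RISK_FACTORS.get(priority, 1.0)


# The same detection yields different scores for different risk objects.
print(adjusted_risk_score(80, "high"))    # 100.0
print(adjusted_risk_score(80, "medium"))  # 80.0
```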

so pretty much one particular correlation search that you create may not always give the same score so on the right side you can see um we have the risk analysis this is an adaptive response action so when you create a correlation search in Splunk you can actually use this drop down for setting the risk score setting the risk objects and then all that gets then sent to your risk index so you can see here we've got a default risk score of 80 so anything that comes in here that particular correlation search is going to give you a score of 80 now let's say we find out that the person who um uh the person who you know wrote this

malicious PowerShell process maybe uh they had a user priority of high so now we're going to take that 80 and we're going to multiply it by 1.25 we're gonna get a higher score down here you can see um at the bottom we got threat object so you're able to include the MD5 field as a hash so you can include that in your risk index all right so how do you create a risk rule uh a risk rule is a correlation search very similar to a saved search in Splunk core but in ES you're going to create it as a correlation search so the first thing you're going to do is you're

going to write your SPL your Splunk search processing language so that's going to be the base of your detection sometimes you're going to use just you know index equals and sourcetype equals you may use tstats if your data models are configured correctly um a lot of times in ES one of the things to configure in ES is to set your data models and um have those being whitelisted so we're going to create the Splunk SPL uh next thing is we're going to choose a MITRE ATT&CK tactic and technique so you can see down here

yeah there you go you can see down here MITRE ATT&CK we're using the T1485 um and I think that is that's for Data Destruction that's the particular technique because this is looking for ransomware so we're looking at Data Destruction you can also use the CIS 20 you can use kill chain you can use NIST I primarily like to use MITRE ATT&CK uh next thing is you're going to schedule the search so you're going to look up here you're going to schedule it based on a cron schedule you look at your earliest and latest time that's going to be your time range then down here you can actually configure the throttling so throttling is going to tune out any duplicate

events say you look back 60 minutes now you're going to throttle and you're going to say I don't want to look at anything within that 60 minutes so you're going to be able to factor out duplicates and the last thing is setting your adaptive response action so that's going to be the risk analysis adaptive response action which sets a risk score and then um sends all that information to your risk index now how do you create a risk indicator rule so a risk indicator rule is also a correlation search in ES we name this one the ATT&CK tactic uh threshold exceeded for an object over the previous seven days so we're looking at

a seven day period and we're checking to see if a particular risk object so a user an asset an identity if they cross over three different MITRE ATT&CK tactics so here you can see we're using a tstats command we're looking at the risk index data model and we're saying hey let's check to see when um a particular MITRE ATT&CK tactic ID count is greater than or equal to three uh this one on the right side is actually looking at the um risk score if that exceeded a certain threshold in the last uh last 24 hours so this one we actually set to be static we're saying we're trying to see whenever um a risk

score for a risk object has exceeded a hundred in the past 24 hours now one thing to note is as you create more risk rules and you've got a lot more contextual data in your risk index your actual risk score threshold you may have to actually edit that and so I've done things in the past where instead of having a static uh risk threshold in this case it's a hundred I actually will look at the risk index and I'm going to do some statistical standard deviation and I'm going to see hey based on my environment what is the average risk score amongst everybody now I'm going to do some standard deviation and

I'm going to say I want to look at the top um 2.5 percent uh of you know when you do like a bell curve you're going to look at that top and you're going to look whenever someone exceeds that bell curve we're going to now um uh trigger an alert so there's different ways to do this you can create a static threshold or you can do it dynamically by looking at your overall environment all right so after uh your risk indicator rule is created all that stuff actually gets sent to your notable index so your uh so index equals notable that's where that stores all your alerts and then you can actually see all that

in the incident review page um that's the dashboard that a lot of the SOC analysts will utilize to check their alerts they can assign them to themselves they can then run drill downs and queries from the uh the incident review page all right so other considerations so the risk index is a great place to do preliminary threat hunting uh so like you can look for IOCs um the data is organized in there so it actually allows you to look over a long period of time so anybody here if you're looking at like firewall data or proxy data if you start looking over 24 hours right your search becomes super slow because of you know just the

quantity of events you have in the firewall or proxy index right so now all that data is in your risk index all that interesting data you can look back 30 days 90 days you can look for those low and slow um uh attacks and that's pretty neat uh and then also like I mentioned before you can look for patterns or sequences of events this is great you can use the transaction command and you can actually start looking um uh for that sequence of events another great thing is you can build dashboards to quickly see patterns and trends in your environment so this one right here is your risk analysis dashboard um great for you know SOC managers and

management and stuff they can start looking at their environment and seeing you know some of those trends they can see if they're going up or down they can see um pie charts of different uh scoring and different MITRE ATT&CK or you know CIS 20 or NIST you know the different frameworks where all of your alerts and detections are coming from um they can quickly see here like top uh risk objects right so you can see um certain IP addresses and certain users right they bubble up to the top because they have the higher risk scores compared to other users you can even look at um most active sources so these are your risk rules uh you know you can see which

ones are firing the most do I need to maybe tune them a little bit um are these maybe not interesting to me maybe I can actually take them out so you can see that um over here uh and then the second thing is the more risk rules you create the more accurate your alerts actually become right so the more data you can populate into your risk index um just the better fidelity alerts that you'll have um that's really important all right so let's look inside Splunk I'm gonna do it in time 10 minutes okay thanks oh thank you there we go okay all right so first thing I want to show

you guys is let's uh maximize this a little bit first thing I want to show you guys is uh the assets and identities that I talked about so here you can see we're looking at lookup assets and by the way sorry this is a demo environment so it's not actual production data and so has anyone here done Boss of the SOC yeah so we're actually using BOTS uh version 4 data here so you might see some like interesting like Splunk uh users here so the first thing I wanted to show you here is our assets lookup so let's go ahead and run this all right so you can see so some of these nt_hosts right we've

got Acme o one to six we've got different hosts we set the priority level uh for each of these now this lookup can be generated from a CMDB right so we take a CMDB bring that data into Splunk then we run a search that will then output the lookup to this lookup assets.csv and we're going to assign a priority level to each of these so we assign high to all the Acme's we assign critical to this particular one zero zero six and we also check the category so these are all PCI related nt_hosts um and you can see here we got PCI virtual these are all assigned medium

and as we go down we've got um lows and we've got you know different different categories here so same thing if we were to look at um identities instead of assets
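Editor's sketch: the demo builds the assets lookup by pulling CMDB data into Splunk and writing it out as a CSV with a priority per host. The following Python illustrates that shape only. The host names, categories, and the category-to-priority mapping are invented for illustration, and the column set is a reduced assumption rather than the full Enterprise Security asset schema.

```python
import csv
import io

# Hypothetical CMDB export; host names and categories are made up.
CMDB_ROWS = [
    {"nt_host": "dc-01", "category": "domain_controller"},
    {"nt_host": "pos-17", "category": "pci"},
    {"nt_host": "laptop-42", "category": "workstation"},
]

# Assumed mapping from category to priority; a real mapping is a local policy decision.
PRIORITY_BY_CATEGORY = {"domain_controller": "critical", "pci": "high"}


def build_asset_lookup(rows):
    """Return CSV text with one row per asset and an assigned priority."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["nt_host", "category", "priority"],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        # Anything without an explicit mapping falls back to low priority.
        priority = PRIORITY_BY_CATEGORY.get(row["category"], "low")
        writer.writerow({**row, "priority": priority})
    return buffer.getvalue()


print(build_asset_lookup(CMDB_ROWS))
```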

okay

very good okay um so if we sort this by priority level now we can see particular assets that are critical versus uh high medium low here so it's very important that when you configure ES assets and identities have to be properly configured and tuned next thing I want to show you guys is uh the threat intelligence management so if you were to come to configure data enrichment and come to threat intelligence management this is where you can actually bring in third party data so in this case we're bringing in MITRE ATT&CK data and we're grabbing it from GitHub this actually gets updated um every 24 hours and so if there's new tactics or techniques T numbers that get

added uh it's going to continuously update here so then here we actually take all that MITRE ATT&CK data and we're going to put it into the MITRE ATT&CK lookup now here I'm searching for the particular tactic of discovery but this is where you can see all the T numbers right so the tactic is discovery we've got all the sub-techniques here and our T numbers here so this is actually what allows you to um in your correlation search um assign a MITRE tactic and technique to your uh to your correlation search all right so next we're going to be looking at our risk index now

in this case I'm using our data model because your risk index if you accelerate it you can put it into a data model so the first thing you can do I'm going to create a search where I'm going to look for just risk objects that have a high count of particular risk rules that triggered so in this case if we sort by count we can see that some of these have bubbled to the top so this person fyodor at the splunkshardcompany.com we've got frothly we've got particular IP addresses and so these are all particular risk objects that have a high number of risk rules triggered
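Editor's sketch: the search described above counts risk rule hits per risk object and sorts by count. A rough Python equivalent, using invented sample data rather than the demo's BOTS dataset:

```python
from collections import Counter

# Hypothetical (risk_object, rule_name) pairs; the names are invented for illustration.
RISK_HITS = [
    ("user_a", "Malicious PowerShell process"),
    ("user_a", "Unusual volume of outbound traffic"),
    ("user_a", "Anomalous login"),
    ("10.0.0.5", "Unusual volume of outbound traffic"),
    ("10.0.0.5", "Anomalous login"),
    ("user_b", "Account lockout"),
]


def risk_objects_by_count(hits):
    """Count risk rule hits per risk object, highest count first."""
    return Counter(risk_object for risk_object, _ in hits).most_common()


for risk_object, count in risk_objects_by_count(RISK_HITS):
    print(risk_object, count)
```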

now

all right so now I want to actually get a value of all the different search names that that person has triggered so in this case you can see frothly they've detected four different AWS login type logins to the console uh fyodor they've done some type of malicious PowerShell process there's been an unusual volume of outbound traffic just different searches that uh that this particular person has triggered now if we want to go a little bit deeper now we can actually pull out the scores for these people so in this case we're going to look at the risk object type we're going to look at their search name their risk score risk message and all different things
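Editor's sketch: the per-object rollup being built in the demo (rule names, summed score, MITRE ATT&CK coverage) can be combined with the tactic-count rule described earlier in the talk, which fires at three or more distinct tactics. A minimal Python illustration with invented events; the field names and tactic labels are assumptions.

```python
from collections import defaultdict

# Hypothetical risk events; rule names, tactics, and scores are invented.
EVENTS = [
    {"risk_object": "user_a", "rule": "Malicious PowerShell process", "tactic": "execution", "risk_score": 80},
    {"risk_object": "user_a", "rule": "Unusual volume of outbound traffic", "tactic": "exfiltration", "risk_score": 40},
    {"risk_object": "user_a", "rule": "Account discovery", "tactic": "discovery", "risk_score": 20},
    {"risk_object": "user_b", "rule": "Account lockout", "tactic": "credential-access", "risk_score": 10},
    {"risk_object": "user_b", "rule": "Account lockout", "tactic": "credential-access", "risk_score": 10},
]


def summarize_risk_objects(events, min_tactics=3):
    """Roll up events per risk object and flag objects spanning enough distinct tactics."""
    grouped = defaultdict(lambda: {"rules": set(), "tactics": set(), "total_score": 0})
    for event in events:
        row = grouped[event["risk_object"]]
        row["rules"].add(event["rule"])
        row["tactics"].add(event["tactic"])
        row["total_score"] += event["risk_score"]
    return {
        obj: {
            "rules": sorted(row["rules"]),
            "tactic_count": len(row["tactics"]),  # distinct count of tactics
            "total_score": row["total_score"],
            "alert": len(row["tactics"]) >= min_tactics,
        }
        for obj, row in grouped.items()
    }


for risk_object, summary in summarize_risk_objects(EVENTS).items():
    print(risk_object, summary)
```

Repeated hits from the same rule raise the total score but not the tactic count, which is why user_b does not alert here.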

so we're still sorting by count so that's why fyodor is at the top but you can see that over the last seven days their summed risk score has been exceedingly high um this person has also you know triggered I think six distinct different searches uh risk rules these are all the different risk messages for them and uh different MITRE ATT&CK so you can see it's crossed over uh five four different techniques and six different or five different no sorry four different tactics right here um so if we want to actually do something else we can actually do a distinct count so let's say a distinct count of

the MITRE ATT&CK tactic here dc MITRE tactic

and now we can see here that it's um triggered four distinct different MITRE tactics all right so the last thing I want to show you is the risk analysis dashboard this I had a screenshot in there but as you can see so we're looking over the past seven days here we can actually um we can say star for risk objects so we can see all risk objects and all of the different searches and risk scores that everybody in your environment has populated but you can also drill down to particular risk objects so in this case you can see that um over the past seven days we've had an increase in three different distinct

sources so these are your different risk rules you can see we've actually increased by one for a unique risk object so this is interesting um and then here you can see we've got overall um we can see a trend line for different risk rules over time we have our MITRE ATT&CK annotations here so we've got different MITRE ATT&CK tactics and techniques you can see we've got a large one in remote services here as well as a data transfer size limit now if you scroll down here we can actually see some of our most recent data from our risk index okay thanks

all right so are there any questions

yep

yeah so my um my suggestion there is take out all those alerts that you originally had and just have that MITRE ATT&CK tactic exceed a certain threshold or the risk score exceeding a certain threshold and I've seen like you can set that notable in your incident review page so when you drop that down and you look at that you can actually see all the sub risk rules in there so it'll show you um and I can show you afterwards if you want to come down but it'll show you the risk rules that you've triggered it'll show you the different MITRE ATT&CK tactics and techniques and then you can even drill down so you can just drill

down you know click a hyperlink it'll take you to your risk index and it'll show you that data yep appreciate it
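Editor's sketch: the answer above relies on a risk score threshold, and earlier in the talk the speaker describes deriving that threshold from the environment instead of hard-coding it. A minimal Python illustration of that idea follows. It assumes that "top 2.5 percent of the bell curve" is approximated by the mean plus two standard deviations, which only holds if scores are roughly normally distributed; the per-object totals are invented.

```python
import statistics

# Hypothetical per-object risk totals for one time window; values are invented.
RISK_TOTALS = {f"user_{i}": 10 for i in range(1, 10)}
RISK_TOTALS["user_10"] = 200


def dynamic_threshold(scores, num_std=2.0):
    """Mean plus num_std population standard deviations of the observed scores."""
    return statistics.fmean(scores) + num_std * statistics.pstdev(scores)


def objects_above_dynamic_threshold(totals, num_std=2.0):
    """Return the risk objects whose total exceeds the environment-derived threshold."""
    threshold = dynamic_threshold(list(totals.values()), num_std)
    return {obj: score for obj, score in totals.items() if score > threshold}


print(dynamic_threshold(list(RISK_TOTALS.values())))
print(objects_above_dynamic_threshold(RISK_TOTALS))
```

Compared with a static value such as 100, this threshold moves as more risk rules feed the risk index, at the cost of being skewed by the outliers it is trying to find.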