
GF - Detection Engineering Demystified: Building Custom Detections for GitHub Enterprise

BSides Las Vegas · 43:40 · 318 views · Published 2024-09
About this talk
Ground Floor, Tue, Aug 6, 12:30 - Tue, Aug 6, 13:15 CDT

For many organizations, GitHub houses critical intellectual property and is a prime target for attackers seeking to steal valuable source code, disrupt software development operations, or carry out supply chain attacks. Security teams must proactively monitor their GitHub Enterprise environments and have the capability to detect and respond quickly to any suspicious activity. This presentation is for defensive practitioners curious about the world of Detection Engineering and how to build detections that are focused on identifying attacker behavior. As Detection Engineers, we'll receive some intelligence on a threat group's modus operandi for stealing intellectual property, analyze the attack technique, identify relevant data sources, and build & test a detection step-by-step. You'll leave with practical Detection Engineering techniques that you can apply to other use cases to bolster your organization's defenses against threats.

Speaker: David French
Transcript [en]

All right, thanks everyone for coming. This is my first time at BSides Las Vegas, so thanks to the organizers for having me. This talk is for anyone who's curious about the world of detection engineering and wants to learn a process and some techniques for building detections focused on identifying attacker behavior. Specifically, we're going to look at building detections for GitHub today, but my goal is to share techniques you can use to improve your detection coverage for all the platforms and systems you monitor, not just GitHub. This is on the Ground Floor track, so I'm going to try to make it so that people with beginner or intermediate knowledge can follow along.

Here's a little bit about me before we get started. I've been in IT and security for over 18 years. During the last eight years or so I've gone back and forth between being a defensive practitioner, defending a single organization, and working on the vendor side doing detection engineering and threat research, building out content for SIEMs and EDRs. Currently I'm at Google, in a bit of a different role for me, a blue team advocate role working on Google SecOps, and I enjoy doing stuff like this, sharing research and knowledge. When I'm not working, you can find me enjoying the outdoors in Colorado, when it's not on fire.

A brief overview of what I'll be covering today. For those of you who are new to detection engineering, I'll start by explaining what it is and some of its benefits for a company that has that capability. Then we'll look at some threat intel that provides details on an attack group's tactics for stealing data from GitHub environments, and we'll move on to developing a detection that identifies a specific behavior. To build that detection, we're going to identify data sources, simulate the behavior we want to detect, and then develop our detection logic. Then we'll look at the concept of monitoring your data pipeline and testing your detections, stuff that is really important but often neglected by security teams, and it comes back to bite them.

I'll walk through a couple of example techniques for doing that, and then I'll leave you with some key takeaways and links to resources to learn more about detection engineering if you're interested. For the benefit of folks who are new to this, I want to take a couple of minutes to review what detection engineering is and its benefits. I like to think of it as a specialization within security focused on implementing detective security controls.

The goal is to detect and respond to potential security incidents before they can cause our company significant damage. There's a focus on continuous improvement: a team of detection engineers has a process for continuously developing, testing, and improving detections to stay ahead of threats. Our detections complement our preventative security controls, either acting as a safety net if prevention fails, or letting us implement controls where prevention is impossible or impractical. There's an emphasis on detecting attacker behavior versus indicators of compromise; I'll talk more about that later, but the idea is that our detections have a longer shelf life if they're behavior-based.

based and yeah this terms I think appearing more frequently during the last 3 to four years um it's now I think accepted as its own specialization within security uh plenty of job postings on LinkedIn that either have you know detection engineer in the job title or within the job description and just to take a second actually if you're uh my friend Wade pointed this out the other day when he went through a dry rad us with me um you don't have to be you know your title doesn't have to be detection engineer to do detection engineering you could be a sock analyst or part of a you know detection of response team that could just be part of your job that you're

A couple of slides on the benefits for a company that has this capability. It reduces risk: by detecting malicious activity early on, before an attacker can achieve their goals, you've got a chance to respond before a data breach or a disruption to your business operations occurs and causes significant damage. This can not only save a significant amount of money if you catch threats early before they cause a bigger problem, but it can also save lives, depending on the industry you work in. In the study cited at the top right, which looked at ransomware attacks being carried out against hospitals, they found that mortality rates increased by 20%.

So we're not writing detections for fun; by being good at security, depending on the industry you work in, you can help save human lives. Next, by developing detections that generate actionable alerts, the security team can reduce the time spent working on security incidents. This is an opportunity too: if you're responding to incidents quickly, you can build and maintain trust with your customers. People are definitely paying attention now when you go through a security incident; they want to see that you've got your act together when they bank with you or store their personal information with you.

A final slide on the benefits before we move on to look at this piece of threat intel. Having a continuous process for identifying and integrating new data sources or logs can increase your visibility into what's happening in your environment over time. Detection engineers are continuously assessing your detection coverage as new threats and attack tactics emerge, and they're constantly developing and refining their detections. And finally, depending on the industry you work in, you might have an auditor come in and ask for evidence that you've got certain detections; if you're working in the financial services industry, you might have an auditor ask if you've got detections related to SWIFT.

All right, we're going to move on to a practical example of how to transform some threat intelligence into a detection. While I was working at another company, I was on the phone with a security engineer who worked in the same industry as I did, and they shared some intel on a threat group's tactics for stealing data from GitHub Enterprise environments. Let's take a look at the details they shared, among several other things; some of you might recognize this threat group. This is what the attackers were up to at the time.

They started by compromising a software engineer's Okta user account via a phishing campaign: they lured users into logging into a fake Okta single sign-on portal and stole the users' credentials, so their username, password, and one-time password token as well. Then they used the stolen credentials to log into the target's legitimate Okta portal, using VPN services to mask their IP address and geolocation information. They logged into the target organization's GitHub Enterprise account via the Okta dashboard tile, and then created a personal access token under the compromised user's account; we'll talk more about what that means in a moment. Then they used a tool called ghorg to clone all of the code repos that user had access to.

Needless to say, after hearing this I became interested in what logging, monitoring, and detection opportunities existed for GitHub. I'd never looked at GitHub logs before and didn't know what kind of auditing was available, so we're going to look at some of that as well. For the remainder of this presentation, we're going to use this threat intel to create a detection that alerts if a specific behavior happens in the environment that we're defending.

Let's take a minute to consider why attackers target GitHub and why, as defenders, we should care about monitoring and defending our GitHub environment. Some code repos might contain intellectual property; after stealing that data from an organization, an attacker might try to sell it or use it in extortion attempts against you. They can examine your source code for vulnerabilities, which they could either sell or exploit in follow-up attacks. If they're able to harvest secrets from your GitHub environment, they can use those to further infiltrate your environment; perhaps they can establish persistence in one of your cloud environments. And finally, if they're targeting a company that develops software, they can look to inject malicious code into that company's CI/CD pipeline and deliver malware or backdoors to unsuspecting customers, which we've seen before.

For people who aren't familiar with GitHub Enterprise, here's a brief overview of some key concepts for the platform. GitHub Enterprise is a commercial offering that provides companies with the tools and features they need for collaborative software development. We're going to look at the GitHub Enterprise cloud-based platform and imagine that your organization has a subscription to that offering. An enterprise can contain one or more organizations, and an organization essentially lets you group certain projects together for people to collaborate on; you might have a GitHub organization for each of your company's core product offerings, for example.

These organizations contain repositories, or repos, which is where the code is stored and worked on for each project, and GitHub users are invited to your GitHub organizations, where they can collaborate on projects. I mentioned personal access tokens; in the threat intel we received, the attackers created one. A personal access token acts like an alternative password for your GitHub account: you can create a token under your account and grant it specific permissions to interact with GitHub's API. These tokens need to be kept confidential, and people should assign the minimum necessary permissions to limit the damage if they're compromised.

One of the first things I did when I heard about that tool was to try to find it. The tool was on GitHub, ironically. It lets you clone an organization's or a user's repositories into a single directory, and it works with GitHub, GitLab, and Bitbucket. I found it funny that the list of use cases includes creating backups; the attacker is creating involuntary backups for people. At this point, there are a few questions to ask ourselves as detection engineers: should we review the code for the tool and look for detection opportunities there? Who's the developer of this tool, and do we trust them? What if the tool contains malicious code to infect unsuspecting users? We definitely wouldn't want to download this and run it in our production environment, interacting with our company's code in GitHub.

So do we want to look for opportunities to write signatures to detect the tool, or do we want to build detections that identify the underlying behavior? In this presentation, we're going to build a behavior-based detection. Now that we've got an understanding of the attacker's tactics and the tool they're using, let's look at the differences between an indicator-based and a behavior-based detection, if you're not familiar with this.

We could analyze the tool and look for opportunities to fingerprint it and write signatures that detect its use in our environment. The tool might have a specific user agent string that it uses, but with this particular type of detection, the attacker could just modify their code to use a different user agent string, maybe one that blends in with other traffic and evades our signature. Also, the attacker might try a different tool, and then our indicator detections would be broken and might miss that behavior.

Alternatively, and this is what we're going to do, we can focus on detecting the underlying actions that need to happen for the activity to occur. The idea here is that it's usually harder and more expensive for attackers to change their behavior than to just swap out their tools, malware, or infrastructure. With this type of detection, we're monitoring for a sequence or pattern of behaviors, which we'll look at next. Before I move on, I want to note that with an indicator-based detection, if you're able to deploy an accurate signature that detects malware or an active intrusion, then that's a win, and we definitely shouldn't discount the value of indicator-based detections.

All right, let's move on to developing a new detection to alert us if that specific behavior occurs in our environment. This is a very basic design for the detection we're going to build to identify a specific behavior, and we're going to expand upon it as we go. We're going to detect the following atomic behaviors in sequence; think of these atomic behaviors as the building blocks for your detection logic. By combining multiple atomic behaviors, you can create more complex detection rules that alert on patterns or sequences of activity.

The first behavior is permissions being granted to a personal access token; the second is that same user account being used to download more than five GitHub code repos. We're going to build this simple detection and expand upon it from there. There are opportunities to detect other behaviors that were in the threat intel we received, but we're going to focus on this single detection use case, given the amount of time we have. Now that we've got the basic design for the detection we want to build, let's look at what data sources are available to us as detection engineers. A detection needs to be fed relevant data or events; otherwise, it will never generate an alert to tell us that the behavior happened in our environment.

In this example, we can see GitHub Enterprise has an audit log that records events as they happen in our environment. The documentation tells us those logs are retained for 180 days, but Git events are only held for seven days before they roll off. Git events include people cloning code repos, like the attacker is doing, and people pushing code to repos, so we're definitely interested in those events for our detection use case today. In my opinion, a decent audit log includes details on the who, what, when, and where for each event.

The why for an event is usually implied, or you need to look for another data point to tell you why the user carried out that action. In this example, why did this user disable this setting in our GitHub organization? We need to either speak to the user or go find a ticket or another data point that tells us why that happened. In this event we can see the who, a unique ID for the user that initiated the action; the what, the specific action or event that took place; and the when, a precise timestamp for when the event happened.

The where is missing from this event, which is the location from where the action originated; let's take a look at why. There are a couple of noteworthy things to call out regarding GitHub's audit log. By default, the source IP address is not going to be in your events; I think it's a privacy thing, and you have to go in and enable it. We definitely want to see that. The second thing to call out is that, by default, API request events are not streamed out of GitHub, and since we're going to be developing our detection in a SIEM, from a centralized location, we're going to enable that option too. It's going to let us see attackers, or regular users, cloning the contents of GitHub repos via the API.
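To make the who/what/when/where idea concrete, here's a rough sketch of pulling those fields out of a raw audit log event. The field names follow GitHub's documented audit log schema, but the event itself is made up for illustration:

```python
# Illustrative GitHub audit log event. Field names are based on GitHub's
# documented audit log schema; the values are invented for this example.
event = {
    "@timestamp": 1722960000000,                     # when: epoch milliseconds
    "action": "org.disable_two_factor_requirement",  # what happened
    "actor": "alice",                                # who did it
    "actor_id": 12345,
    "org": "example-org",
    # "actor_ip" is absent: source IP disclosure is off by default
}

def summarize(evt):
    """Pull out the who/what/when/where of an audit event."""
    return {
        "who": evt.get("actor"),
        "what": evt.get("action"),
        "when": evt.get("@timestamp"),
        "where": evt.get("actor_ip", "missing (enable IP disclosure)"),
    }

summary = summarize(event)
print(summary["where"])   # the "where" is missing until you enable the setting
```

Once IP disclosure is enabled, the same extraction would return a real source address for the where.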

These are the types of nuances we need to understand when we're looking at new data sources for detection. By reading GitHub's documentation and exploring the settings for the audit log, we can configure it to be used for our detection use case. And to point this out, not just for GitHub: if you're working on a security team building detections, or you care about logging for investigations or hunts, definitely make friends with the people who administer and own these platforms. They might not always have security in mind or know to turn these settings on to make the logs valuable to you as defenders, so build those relationships and get your data into a good state.

The next step is to configure GitHub to stream its audit log to our SIEM for ingestion. Our SIEM is going to normalize these events into a common schema and index them so we can use them to build our detection. In this example, we're streaming the audit logs to a Google Cloud Storage bucket, and our SIEM collects the logs from that storage bucket and ingests them.

The next step is configuring our SIEM to ingest the GitHub audit logs from that Cloud Storage bucket; in this example, I'm ingesting the logs into Google SecOps. This is a community event, and my goal is to keep things as vendor-neutral as possible and share practical detection engineering techniques, so you can apply them using whatever tools you use. At this point, we've configured the GitHub audit log settings and we're ingesting the logs into our SIEM, so the logs are being normalized and indexed, and they're available for us to search.
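As a rough illustration of what that normalization step does, here's a toy mapping from a raw audit event into a simplified common schema. The target field names here are invented stand-ins; your SIEM defines its own schema (in Google SecOps, for example, that's UDM):

```python
# Sketch of normalizing a raw GitHub audit log event into a simplified
# common schema. Target field names are invented for illustration only.
def normalize(raw):
    return {
        "event_type": raw["action"],                       # e.g. "git.clone"
        "principal_user": raw["actor"],                    # who acted
        "target_resource": raw.get("repo") or raw.get("org"),
        "timestamp_ms": raw["@timestamp"],
    }

raw_event = {
    "action": "git.clone",
    "actor": "alice",
    "repo": "example-org/payments-service",
    "@timestamp": 1722960000000,
}
normalized = normalize(raw_event)
print(normalized["principal_user"])   # "alice"
```

The value of the common schema is that detection logic written against it keeps working even if the raw log format changes upstream, as long as the parser is updated.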

The next logical step is to simulate the behavior we want to detect, so we've got some events we can use to develop our detection logic. (There's a party going on next door, okay.) If you skip this step, it can feel like you're shooting in the dark when you're writing your detection, because you don't have any events to test it against. So that's what we're doing here. In my GitHub Enterprise environment, I created a GitHub personal access token, granted that token access to six GitHub repos in my environment, and then used that token to clone those six repos.
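The clone step of that simulation can be scripted. Here's a minimal sketch; the org, repo names, and token are placeholders, and the actual `git clone` call is commented out so you only run it deliberately in a lab you own:

```python
import subprocess  # only needed if you uncomment the clone call below

TOKEN = "ghp_exampletoken"   # placeholder PAT; use a throwaway lab token
REPOS = [f"example-org/test-repo-{i}" for i in range(1, 7)]   # six test repos

def clone_url(repo: str, token: str) -> str:
    # git authenticates over HTTPS with a PAT embedded as the password;
    # acceptable for a disposable lab token, never for a real credential
    return f"https://x-access-token:{token}@github.com/{repo}.git"

for repo in REPOS:
    url = clone_url(repo, TOKEN)
    # subprocess.run(["git", "clone", url], check=True)   # lab only
    print(repo)
```

Running something like this in a lab environment generates the exact audit events the detection needs to be tested against.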

After executing that test scenario, I went back to my SIEM and explored the events to understand the various field names and values that were logged. At the bottom, in the middle, you can see access being granted to the personal access token, with the Git clone events above that, and on the right you can see these events were carried out using a personal access token. Now that we've simulated the behavior, we can develop the first version of our detection rule.

This example is written in the YARA-L language; you can adapt it to work with the technology your company uses. The events section of the rule specifies the field names and values we want our detection to match on. On lines 20 and 21, we're searching for events where access is granted to a GitHub personal access token. On lines 24 to 27, we're searching for events where a private GitHub repo was cloned using a personal access token. On line 28, we're creating a placeholder variable named github_repo_name to store the name of the repo that was cloned. On line 31, we're joining the GitHub personal access token event to the GitHub clone event based on the user ID, because we want to see which user carried out the activity.

On line 34, we're creating another placeholder variable, which will become clear in a minute, to hold the user ID that carried out the actions. And finally, on line 37, we're searching for events where the personal access token event happened before the repo clone events. Reviewing the other sections of the YARA-L rule: in the match section, we're telling the rule to return results if the events we specified are found for a single user within a 30-minute time window.

In the outcome section, we're storing the count of distinct private repos that were cloned, and in the condition section, we're specifying that the rule should trigger if a match is found for those events and more than five GitHub repos were cloned. Now that we've written the first version of our detection rule, we should test it. To do that, we could either run our detection logic over the events we generated earlier or simulate the attacker behavior again. In this example, we can see a user created a personal access token and then cloned six distinct private GitHub repos, so our initial detection logic works.
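The same two-step, user-joined, windowed logic can be sketched outside of YARA-L. This toy Python re-implementation uses field names loosely modeled on GitHub's audit log; treat both the field names and values as assumptions, not the exact schema:

```python
from collections import defaultdict

WINDOW_MS = 30 * 60 * 1000   # 30-minute match window, as in the rule
THRESHOLD = 5                # condition: more than five distinct repos

def detect(events):
    grants = defaultdict(list)   # user_id -> PAT access-grant timestamps
    clones = defaultdict(list)   # user_id -> (timestamp, repo) clone events
    for e in events:
        if e["action"] == "personal_access_token.access_granted":
            grants[e["actor_id"]].append(e["@timestamp"])
        elif (e["action"] == "git.clone"
              and e.get("repository_public") is False
              and e.get("programmatic_access_type") == "Personal access token"):
            clones[e["actor_id"]].append((e["@timestamp"], e["repo"]))
    alerts = []
    for user, grant_times in grants.items():
        for g in grant_times:
            # distinct private repos cloned after the grant, inside the window
            repos = {r for ts, r in clones[user] if g <= ts <= g + WINDOW_MS}
            if len(repos) > THRESHOLD:
                alerts.append({"actor_id": user, "distinct_repos": len(repos)})
    return alerts

# Replay of the simulated behavior: one grant, then six distinct clones.
events = [{"action": "personal_access_token.access_granted",
           "actor_id": 42, "@timestamp": 0}]
events += [{"action": "git.clone", "actor_id": 42, "@timestamp": 1000 * i,
            "repository_public": False,
            "programmatic_access_type": "Personal access token",
            "repo": f"example-org/repo-{i}"} for i in range(1, 7)]
print(detect(events))   # one alert: six distinct repos for actor 42
```

Note how joining on the user ID and requiring distinct repos keeps a single re-cloned repo from inflating the count toward the threshold.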

Let's assume that after testing the new detection, we handed it over to our SOC analysts to respond to alerts, and after a week or so they tell us it's generating false positives that are taking up their precious time, so we need to fix that. Let's say, hypothetically, that in this organization, when a software engineer gets a new laptop, it's common for them to create a new GitHub personal access token and then clone all the private code repos they work on. We're going to look at an option for filtering those false positives and increasing the detection's precision. We don't have time for a deep dive on precision, recall, and the other classification metrics that can be used to measure the performance of your detections, but if you're interested, definitely check out the link on this slide.

If you recall, our threat intel said the attackers were using VPN services to hide their IP addresses and geolocation. One way for us to filter these false positives is to modify our detection logic to generate an alert only if the activity comes from a VPN service. When I talked about precision, I mentioned the tradeoff when filtering false positives: you might in turn introduce false negatives, or missed behaviors. In this example, let's say our users should only be using our company-approved VPN, not Mullvad VPN or NordVPN. If we move forward with this tuning option, we could create another detection that looks for activity from non-company-approved VPNs, or for installation of those VPN clients on endpoints.

To tune this detection, I'm going to use a third-party data feed from Spur. If you're not familiar with them, they provide data feeds on VPN services, residential proxies, and bots. The value there is that the IP addresses for these services churn quite quickly; an IP address that's used by NordVPN this week might not be used for the same service next week. Having these kinds of up-to-date feeds is useful for correlation during detection and investigation, or for enriching events.

In the highlighted portion of this screenshot, we're modifying our detection logic to match on the GitHub activity when it comes from an IP address that Spur attributes to a VPN service. In this example, we're joining the IP address from the GitHub events, which we enabled in our logging earlier, with an IP address in Spur's data feed, if it exists in their data. This is an example of how we can use third-party data sources to adjust the precision of a detection.
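Conceptually, the tuning is just this lookup. The feed contents below are invented stand-ins for Spur's data, and the lookup shape is an assumption about how you'd index such a feed:

```python
# Invented stand-in for a Spur-style feed: source IP -> attributed service.
vpn_feed = {
    "203.0.113.7":  "MULLVAD_VPN",
    "198.51.100.9": "NORD_VPN",
}

def vpn_service(event, feed):
    """Return the VPN service attributed to the event's source IP, or None."""
    return feed.get(event.get("actor_ip"))

clone_event = {"action": "git.clone", "actor_ip": "203.0.113.7"}
laptop_event = {"action": "git.clone", "actor_ip": "192.0.2.10"}  # office IP

# Only the VPN-sourced activity survives the filter:
print(vpn_service(clone_event, vpn_feed))    # "MULLVAD_VPN"
print(vpn_service(laptop_event, vpn_feed))   # None
```

With this filter in place, the new-laptop scenario (clones from a corporate IP) no longer alerts, while the attacker's VPN-sourced clones still do.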

When you're developing and tuning your detections, it's crucial to test them after any modifications are made. In this example, I went ahead and simulated the behavior again in my lab environment and validated that the detection generated an alert. On the left, you can see the detection matched on the same GitHub events as earlier, and on the right, Spur is telling us that this IP address is associated with the Mullvad VPN service. If you're not familiar with Mullvad, they accept cash and Bitcoin as payment methods; it's popular with people wanting to do certain things while attempting to remain anonymous, so it's definitely weird if you see it in a lot of environments. Twenty-nine minutes in and I haven't mentioned gen AI yet; I almost got away with it.

One of the important steps when developing a new detection is to document it. This ensures that the goals and design of the detection are understood and the team knows how to triage and respond to alerts. Arguably, we could have done this earlier, but we're going to do it now. A popular method for documenting your detections is to use Palantir's ADS format, the Alerting and Detection Strategy framework. Wade Wells, who I think is in here, a lead detection engineer, has built an AI assistant that helps us document our detections. In this example, we're asking the assistant to document the new GitHub detection; we provided some details on the behavior we're trying to identify and the data sources the detection relies on.

The assistant responds with documentation for the detection in ADS format. The MITRE ATT&CK technique mapping was incorrect, so we'll have to fix that, but it includes a technical explanation of how the detection works, any blind spots, how to validate and respond to alerts, and so on. The output's not perfect, as with a lot of these LLM models at the moment, but it saves us a lot of typing and can help speed up our documentation workflows as detection engineers, so I encourage you to check it out if you think it might be useful.

All right, let's move on to look at why we should monitor the data pipeline we've built, and a technique for doing that. This is something that's often neglected or not thought about, and it comes back to bite you as a defender. Here's a simple diagram of the data pipeline we've built so far; as you can imagine, this will get more complex as we integrate additional data sources with our SIEM. It's going to be important for us to monitor and test the various components in this pipeline, as our investigation and detection capabilities rely on the quality of our data.

On the left, we've got the GitHub audit logs and the Spur data feeds I mentioned. In the middle, we've got a couple of services running in Google Cloud that are helping us ship this data from the left to our SIEM on the right. When data is shipped to our SIEM, it's typically normalized into a common schema, then indexed before those events or records are searchable. Those events might be enriched, either before they go into the SIEM or after, maybe with metadata about an employee, like a job title or department, or geolocation information for IP addresses. Then those events are available to the detection engine, which runs our rules over them and generates an alert if a match is found.

All of these components and connections in our data pipeline can fail, which is why it's important to be able to monitor for issues and jump in quickly to fix them. As an example of a failure, GitHub might stop shipping its logs to the storage bucket; our SIEM checks the storage bucket every five minutes for new logs, doesn't find anything, and doesn't see an error. This is a silent failure: when something bad happens in your GitHub environment, you won't see it, or when you start an investigation, your logs aren't there for you.

So let's look at how to handle that. Some reasons to monitor your data pipeline: our environments tend to drift over time. When you configure a data source and write some shiny new detections, everything might be working fine today, but that might not be the case in a week or a month. Infrastructure and technologies come and go, system configurations change, software is updated, and all of those things can impact the data pipeline we looked at a minute ago. Monitored systems might stop logging, or somewhere along the line their logs stop making it to our SIEM.

Logging spikes can cost a lot of money if you don't identify and fix them quickly. The SIEM might have issues parsing events from the logs it receives; maybe the vendor, say GitHub, changes its logging schema, our detection relies on a specific field name that changed, and our detections fail. And latency issues between certain components can result in behavior getting missed or logs not being available when we need them. A lot of these things can result in missed behaviors, false negatives, and missed opportunities to detect and respond to threats early on. Has anyone experienced any of these issues affecting their detections? Yeah, okay, it sucks.

A quick call-out here: if you're interested in learning more about data pipelines and how to monitor and improve your data quality, I highly recommend this talk by Josh Liburdi. All right, here's an example technique to get you thinking about monitoring some components of your data pipeline. We need to know when there are issues with the pipeline so we can jump in and fix them before our detections are impacted. Some people like to create detections that alert them when a system goes quiet, when they stop seeing its logs in their SIEM. That can be fine with a noisy platform like Okta or Google Workspace, but some systems are just quieter than others; just because a system doesn't log a thousand events an hour doesn't mean there's a problem.

What I've seen is that those detections can create false positives that just waste more of our time. Another option is to implement health checks for the systems and data sources that are important to us. A health check can carry out a small, basic read operation against the monitored system and then validate that the resulting events are in your SIEM, indexed, and searchable. These automated jobs can run on a schedule in your automation tool or CI/CD pipeline, whatever you use, and they can alert us to any issues that occur; they don't have to be anything complicated.

pipeline, whatever you use, and they can alert us to any issues that occur. They don't have to be anything complicated. Let's look at an example: here's a health check that we can use to monitor for issues with our GitHub audit logs being ingested into our SIEM. I've created a couple of GitHub Actions jobs for this example, but you can put these in whatever automation tool you use. This first job is scheduled to run daily; you could run it more often if you like. It's just making an API call to GitHub to read the information for one of

my GitHub organizations. It's a simple read operation, we're not making any changes, and at this point if authentication or the API call fails, an error will be raised and we can jump in to take a look and investigate. So this first job passed. The second job runs after the first one; it's searching for the events that we expect to be generated and indexed in our SIEM based on that first health check job. We're running a search query via the SIEM's API, and we can see in the output of this job that we're searching for the values of the specific events that we expect to be there. One event was

returned in this case, and our job completed successfully. If that had failed, we could jump in and fix the issue before it impacts our detections or other security operations. This isn't a comprehensive solution for monitoring all components of your data pipeline; we didn't talk about latency between log events being generated and when they're searchable in your SIEM. But hopefully that gets you thinking about monitoring these things. You can get started with some minimal code, and it helps you avoid getting blindsided by missing attacker behavior or red team activity, or finding out that you don't have any logs to support your investigation. It sucks

when you're under pressure to figure stuff out. Now let's look at the importance of testing our detections on a regular basis, and an example of how to do that. This is another step that's often skipped by security teams, and it comes back to bite you: again, your detections are broken and you don't know about it. By testing our detections on a regular basis, we get to say with confidence that our detection and alerting capabilities are working properly. Here are a few common issues that can impact these capabilities, and they might sound familiar to you. A system might stop logging events, or the events that are being shipped to the SIEM might not be

parsed properly. Data sources might be misconfigured, right? We looked at the GitHub audit log settings earlier; if those two settings were turned off, we wouldn't see IP addresses or API calls. Our detection would be running, but it would never fire. And pesky vendors changing their logging schemas on us; I spoke about this earlier. If a field name changes, our detection might break. And if you're running rules in detection engines that are never going to fire, we're wasting resources, right? So by implementing automated tests, we can be alerted to issues and fix them before we miss attacker behavior. Let's look at an example of how to test that new GitHub detection we

created earlier. We want to test the detection regularly, maybe once a day, and be alerted if the detection doesn't generate an alert. In a perfect world, we would create a test that simulates that attacker behavior end to end and validates that an alert was generated. In this case, that isn't realistic for a few reasons. To my knowledge, we can't create a GitHub personal access token via their API and configure it with permissions, and we probably don't want to do that anyway, right? We've got a test, maybe it fails, and we don't want to leave these tokens out there dangling with permissions assigned. It's just not real world.

We probably also don't want to develop a test that connects from some VPN service and interacts with our data just for the sake of testing this one detection. And finally, developing tests can often take longer than writing the detection itself, right? So is it worth the effort to develop an end-to-end test that does all this? Probably not. Let's look at an alternative. A practical technique is to take the events we generated earlier and then replay them to our SIEM for ingestion. To do this, we've got another couple of GitHub Actions jobs that run on a schedule. The first job replays those historical GitHub log events into the SIEM for ingestion.
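As a rough illustration of these two replay-and-validate jobs, here is a minimal Python sketch. The `log_replay` label comes from the talk; the `@timestamp` field name, the event shapes, and the alert format are assumptions for illustration, not the speaker's actual implementation:

```python
import json
import time


def prepare_replay_events(raw_json: str) -> list[dict]:
    """Job 1 (sketch): load saved GitHub audit log events, move their
    timestamps to now so the detection's lookback window still covers
    them, and label them as replayed test data."""
    events = json.loads(raw_json)
    now_ms = int(time.time() * 1000)
    for event in events:
        event["@timestamp"] = now_ms  # assumed timestamp field name
        event["log_replay"] = True    # label so test alerts can be auto-closed
    return events


def check_detection_fired(alerts: list[dict]) -> dict:
    """Job 2 (sketch): given alerts returned by the SIEM's search API,
    confirm at least one was generated by the replayed test events."""
    test_alerts = [a for a in alerts if a.get("log_replay") is True]
    if not test_alerts:
        raise RuntimeError(
            "detection test failed: no alert generated from replayed events")
    return test_alerts[0]
```

Shipping the prepared events to the SIEM's ingestion API and running the alert search are deliberately left out, since both calls are SIEM-specific.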

We're modifying the timestamps so that it's today's date, because generally our detections are only looking back so far in our logs when they're running in an engine. The events are also labeled to make it clear they're related to testing activity. So in this example, we loaded up seven events from this JSON file and shipped them to the SIEM for ingestion, and then our second job is going to validate that the detection generated an alert. In this example output, the second job checked for any alerts generated by our detection; zero alerts were found and an error occurred. So this is an example

of the job failing, right? That alert will come to us so we can jump in and fix whatever it is: logging, maybe an issue with the detection engine, or something else entirely. Here's an example of what it looks like when that second job passes, or completes successfully. The job found one alert that was generated by our detection, and it validated the alert was generated by testing activity by looking at this label, log_replay equals true. If you have alerts opened up in your SIEM or your case management system, you can look for that flag as well and just close them out, right? So analysts don't need to spend their time looking

at them. This approach can be applied to other detections. You could start small and test one detection per log source that you care about and that you're ingesting into your SIEM, and expand from there. A risk with this example, and you probably thought of this as I was explaining it, is that GitHub could change its logging schema while we're validating our detection against older data. This is still far better than having no tests at all, because we're testing the components of our SIEM, detection engine, ingestion, that kind of thing. But you could commit to refreshing your test data every few months, or commit to testing a

detection manually end to end on some agreed-upon schedule. All right, we're at the end here, so let me summarize and leave you a couple of key takeaways, and then some links to resources you might find useful. If you're new to detection engineering, hopefully you've seen that this is a proactive capability focused on identifying attacker behavior. Your security vendors might provide you with out-of-the-box detections that are quite generalized; they don't want to blow up their customer base with false positives. No one knows your environment better than you do, right? So you can start diving into analyzing attacker tactics and developing custom detections that are accurate, and

you can detect malicious activity before it causes harm to your business. We spoke about the importance of monitoring your data pipeline and testing your detections; this lets us find issues before we miss any attacker behavior or find out that our logs are not there when we need them. And if you're not doing any of that kind of monitoring and testing, I'd just challenge you to start thinking about how you can implement some of it and get started with some minimal code and automation. And then finally, I didn't speak much about the skills and experience required on a detection engineering team,

but I think it's important for a team to have a diverse skill set to build the best detections. I haven't met a 10x detection engineer yet; it's not realistic to expect that one person can bring everything to the table. So that's something to bear in mind if you're building out a team and recruiting. And then finally, links to some useful resources. I wrote a blog post recently about getting started with monitoring and detection for GitHub; it includes 26 free rules to help you get started. My colleague John Stoner gave a presentation recently about strategies for testing and validating detections. Wade's AI assistant for documenting detections is here. If you're

interested in a different perspective, or a deep dive on a different detection engineering workflow, Dan Lucio's got a blog post. If you have a training budget, SpecterOps' detection engineering course is awesome; you should check that out. And then finally, Megan Roddie's book gives you a good intro into the world of detection engineering with practical examples. That's it, thanks for coming. [Applause] Any questions? Okay. Do you have any recommendations on tools that smaller organizations could use to implement some of this? Yeah, anything specific, like a SIEM, or the testing part? Mostly the SIEM aspect. The SIEM aspect, yeah. I would, trying to be careful here, I work

for a vendor that offers a commercial offering. Just Google free or community editions of SIEMs, and you'll see some popular solutions out there that also offer a commercial option if you get to the stage where you need to pay for support, or maybe more log ingestion capacity, that kind of thing. So yeah, just Google that, or we can talk in the hallway after. Anyone

else? Sorry about that. As a factor of the age of a security program, in years, what is your expected time to turn out a detection? I'm kind of working on the assumption that a security program in its first year is going to be a lot slower to turn out a detection than a security program in its tenth year, so from your experience, what's a good goal to shoot for? Yeah, it's a tough one. It can vary depending on the complexity of the attacker behavior you want to

identify. I'm trying to think of a recent example. Say you had, I don't know, Okta session cookie theft as the tactic of the month, right? It's the hotness, all the companies are getting breached. You can quickly either simulate that behavior using open source tools, or if it's a popular technique in the industry, other people are churning out detections, so you could get something going probably in an hour or two in that case. I think it's more the tuning for false positives; the testing can take a long time. But yeah, what do you think, Wade? Like,

just an emerging threat, and getting a detection out quickly? I think it depends on the organization, right? If you're a huge organization, one of the struggles is actually knowing whether you already have a detection for it. You mentioned an organization that's like a year old; I would say it's almost easier then, because you know what you have. The harder part is actually passing on the knowledge of what the detection is, and, like you said, building and testing it. Writing the query is the easy part; I think everything else after that is the hard part. Yeah, and figuring out if you actually have the data available

to write your detection: if you actually have the logs, if the logs are coming in from the right spot, if they're set up correctly with the right configuration, like you said. And then there could be red tape in your organization to get those logs ingested, that kind of thing. So yeah, if you've got the logs, you can maybe get a prototype of a rule out in an hour or two. Long answer. Thank you so much, David. Yeah, no worries, thanks everyone.