Considering Cloud Coverage in SIEM/XDR Design

Name: Considering Cloud Coverage in SIEM/XDR Design
Uploaded: 2025-06-05
Duration: 43 min 50 s
BSides SLC · 2025 · 43:50 · 55 views · Published 2025-06

Speakers

Chris Beckman

Tags

Category: Technical

Topic: Cloud, IAM, Detection Engineering, Threat Modeling

Team: Blue

Style: Talk

Mentioned in this talk

Tools used

AWS CloudTrail, Azure Monitor, Elasticsearch, Splunk

Platforms

AWS Lambda, AWS Secrets Manager, Datadog, Google Workspace, Microsoft 365

Service

Amazon S3, New Relic

Concepts

CSPM, Endpoint Detection and Response, Web Application Firewall

Vendors

Okta

About this talk

SIEM and XDR implementations often fail to capture cloud visibility where it matters most during incident response. Chris Beckman explores system coverage as a critical extension of threat modeling, examining what events actually belong in your security architecture—from IAM logs and SaaS integrations to custom application logging—and how to prioritize coverage where it counts rather than chasing compliance checkboxes.

Original YouTube description

🔍 You bought the SIEM. You’ve got the SOC. But are you really seeing the full picture?

In this vendor-agnostic BSidesSLC 2025 talk, Chris Beckman (Principal Security Engineer at TaxBit) takes a deep architectural look at SIEM/XDR system coverage, with a focus on cloud infrastructure and what your SOC team can actually see during an incident. Too many orgs implement SIEM or XDR solutions just to meet compliance checkboxes (CIS8, NIST, ISO 27001), without critically assessing event visibility or integration depth. Chris explores the overlooked concept of system coverage as an extension of attack surface and threat modeling.

What you’ll learn:
- What SIEM/XDR solutions are and how they differ from EDR, HIDS, and NGAV
- What kinds of events and integrations actually matter during a real incident
- Why cloud visibility (IAM logs, SaaS events, WAFs, outbound traffic) is often missing
- How response capabilities vary, and whether they’re useful in your environment
- How SOC teams (and MDR providers) interact with alerts in the real world
- Why transparency, privacy, and access to your own security data should be part of the conversation

Whether you’re designing, purchasing, or tuning a detection and response system, this talk will help you rethink your assumptions and maximize your coverage where it counts.

🎤 About Chris Beckman: Chris is a Principal Security Engineer at TaxBit with extensive experience designing, implementing, and integrating SIEM/XDR solutions across tech companies. Outside of work, he’s a Raspberry Pi tinkerer, photography enthusiast, and proud dog dad to Toast.

👉 Learn more about BSidesSLC: https://www.bsidesslc.org/

#BSidesSLC2025 #ChrisBeckman #SIEM #XDR #CloudSecurity #SOCVisibility #ThreatDetection #MDR #SystemCoverage #AppSec #SecurityArchitecture #TaxBit

Transcript [en]

Okay. So, I feel like I should make you all gather down here, but it's all good. Thank you for coming. I'm very excited to give this talk. It's on considering cloud coverage in SIEM/XDR design. This is a talk about something that I've lived and needed to deal with, and so I'm really excited to share some of the things that I've learned in working on it. First of all, in case you want to download the slides, I did create a short link. Feel free to take a screenshot or write it down or whatever works for you. No pressure. Hello.

Hey. All right. So, let's get started. First off, just an overview. We're going to first talk about some definitions. I feel like in security we have endless acronyms. We have endless definitions. We have multiple definitions that mean the same thing as other definitions. It's really important to get it straightened out at the start. Second, we're going to go over the most important thing in a SIEM, which is the events. What are you actually ingesting? Then we're going to look at investigations, then coverage, the concept of coverage and why it matters. Then we're going to look at the SOC's view, which hopefully will be some good fun. And then some final

thoughts. All right. So who am I? I'm Chris Beckman. I'm a principal security engineer at TaxBit. TaxBit is headquartered here in Salt Lake City. I'm actually from Seattle. We have another office out there. I'm excited to be here, excited to visit. I got to meet and hang out with a lot of my co-workers, so this has been great. And BSides here is incredible. I'm really impressed with everything I've seen. I've worked in Seattle for around 10 years at tech companies. I've worked at everything from public companies to very tiny startups. I have been the 11th person at a startup before. So I've seen kind of every size of these implementations,

which I think has provided me a little bit of perspective. These things aren't the same for everybody's environment, and it's not realistic that they would be. So I'm going to try and give general advice that might fit different companies. I also hold my JD from the University of Iowa College of Law with a focus in cybersecurity. That is definitely not where I learned how to do my engineering role, but it has provided a lot of useful context. All right. So first we're just going to do a quick survey. There will be no quizzes. I will not ask you any questions. You don't

even really have to raise your hand, but it might help me a little. So, first, has anybody here just never used a SIEM? Maybe heard of it, but never actually touched one. Totally okay. No judgment. Okay. Second, light usage. So, maybe you've played around with one or it's on your team, but you're not the person responsible for it. Cool. Next, heavy user. So, you use one a lot. It's a major part of your role. Okay. Next, you have coded an integration. This might be different than those other ones, because you might not normally use a SIEM, but I just want to see if anybody out there has had that

fun. Okay. And then last, SOC master. You have run a SOC. You are on the SOC. You are telling people how to use it and you're really frustrated with customers or love them. I don't know. One or the other. Okay. Cool. All right. So, what is SIEM/XDR and why do I keep using both terms? A SIEM is sort of the older term for it. It is the one that I think is most accepted: security information and event management. I think of it as an immutable store for security events. XDR is sort of a marketing term, but it generally signifies... I said "ridiculous marketing buzzword." That's a little harsh, but it signifies an

endpoint-focused detection system and, typically, the ability to respond to threats. It's extended detection and response. MDR stands for managed detection and response. This means that there is a SOC on the other side of the tool that is reviewing your events and alerts for you 24/7, and typically you communicate with that SOC somehow through the tool. All right, so I did mention there are a lot of acronyms. Here's the big list. Hopefully I don't offend anyone. I sometimes wear my opinions on my sleeve, but I think it's slightly ridiculous that we have this many terms for what are really only a few concepts. But they are

important, some of them. So the SOC, I've used it already: security operations center. This scales all the way from an in-house SOC with a couple of people all the way up to follow-the-sun SOCs across the world. Next would be SIEM. We already covered that. XDR, covered that. MSSP: I think of this as overlapping a little bit with an MDR, except that it's typically providing you some other services. And, congrats, you have a SOC. SOAR: that's security orchestration, automation, and response. I think of this as a fancy term for response. There are similar functionalities between this and XDR. So MDR, we kind of covered that. MXDR, I

don't know, it's an MDR. EDR, that is endpoints. Typically, this is more like next-gen antivirus, but you can often hook them into a SIEM or they can be the agents for a SIEM. So, it's complicated, but these are important definitions. So, why do we use them? Just because of the roles that I've had, I always like to think of things as being split between pure security and compliance. And we shouldn't have to think that way, right? In a wonderful world, compliance would just be the things that we want done. That's not always the world we live in, right? But compliance standards do provide frameworks that can be really

useful to understand what we need to do. So the pure security side of the world is: we just need to know what's happening. We need to know if there's an intrusion, an event that we want to respond to. We want an efficient response. We want forensics. I keep saying immutable; that just means it can't be changed. We're putting it somewhere that is going to be a store for evidence later. And then last, and I think this is really important, though I don't hear it as often: it informs defensive efforts. If you have a really good SIEM/XDR product, you kind of

know where you're being attacked, even light attacks, or you know where you might be attacked. It provides information that means you will have a better time prioritizing your defenses later. I think that's really important. From the compliance side of the world, ISO 27001 does say that you need to store events and security events. It doesn't specify a SIEM, doesn't say an XDR product, but it kind of lightly suggests things that would be a lot easier if you just had a SIEM. SOC 2, similarly: SOC 2 can be kind of choose-your-own-adventure, but there are standards within it that suggest that you need a

SIEM. CIS 8 is much more explicit. This is more like a general framework that helps companies understand what their security posture is, and it's similar with NIST cybersecurity. CIS 8 really clearly says you need a SIEM. NIST is again a little more general, but it's pretty embedded in compliance standards that you should have one of these. So first off, let's talk about events. The whole idea of a SIEM is that you're pulling in events that you think are going to be important later. How does that work? First, the events happen and you record them because you think they might be relevant to a

security event later. Then you ingest them into your SIEM/XDR, and we want to put them in an immutable, searchable store. Third, and this varies from solution to solution, but in general from there they become generic events. At this point it's really like any other logging solution, like a Splunk or any of these. You're really just putting them in something that's going to store them like a log. Where they become more like a SIEM is what happens next, which is that we often try to normalize those events. We try to say this is part of this category, or these events are duplicates of each other and we don't want them to be

duplicates. Then we start to make those events more useful for security purposes. And then last, we trigger alerts. There are lots of different ways that alerts get triggered. Different products do it differently, but the general idea is that the events you normalized somehow trigger alerts. So, here are 10 common event types. I say common because I don't know what SIEM you're using, and I really want this to be general, but they often overlap. These are the most common ones you'll see. Generic, we already described. It's, you know, we don't know what it is yet. Sometimes this is called raw. I've heard it called raw events.

Second would be process. This is the processes that happen on your Windows computer or your Linux server. They often look like the commands that are run, but they don't always. It kind of depends on the system. There is sometimes a separate category for commands, which will actually separate out the commands. Network, which is network flow or network traffic. Auth, which is, you know, I tried to log in and I was denied, or I was approved. Audit: audit is really broad. Think of it like external audits of what a user did in a system. So if they tried to access something within the system or they promoted

another user to be an administrator, those would be audit events. I've heard this called cloud audit before, which is kind of suggestive of what it's often used for. Next is DNS. We all know what DNS is. It's always the problem. Seven is HTTP. That's like a load balancer, so you can record HTTP events as they come through. Alert: this is one place where things get complicated. Alert is an alert from outside of your SIEM system. So, if you have an EDR product that has separate alerting or you're using a cloud alerting system, that might get pulled in as a separate alert, and it may or may not create alerts

within the SIEM itself. File modification: very cool if you have this coming into the SIEM. Awesome to be able to see what files were modified. Sometimes your agent will do that. And then registry or persistence. These are events that would suggest that a machine has been compromised. Sometimes it is literally the Windows registry. Other times it is things like certain files being modified in a way that suggests they've been compromised. Okay. And then we investigate. So events create alerts. Only the highest-severity alerts are typically investigated by a SOC. They're not going to go investigate all your low alerts. Lower alert levels provide context. So

they're less likely to suggest that there's a breach, but if you think there is one, they're sure helpful to provide context. Here's what your SIEM/XDR hopefully provides, and these are my opinions. There are lots of different opinions on how a SIEM or an XDR product should operate, but I think it's really important that you have the ability to use SQL, that's, you know, database-style searching tools. So you want powerful, advanced ways to search it, and I'll show you later what that looks like. Next, either built-in intelligence or the ability to hook up feeds. You want to be able to know that this hash is malicious or this IP is malicious or

these DNS names are malicious, and you need to get that either from your SIEM or an external provider. Report generation: at some point you've got to show your boss what's happening, and for compliance reasons it is genuinely important that you have report generation. Investigation tools: this is actually often used to communicate between the SOC and you. So if you're not in the SOC and you've hired an MDR service, it's really good to be able to see what the SOC is doing, what they've changed, what they've investigated, and then get their report back. And then last, this is very much my opinion, and it is not something that is true for a lot of the different products, but I

think it's really important that you as a customer be able to see the same view as the SOC. It can be really confusing if the SOC has really advanced views into your data, provides you a report, and you can't see that same search. You don't know what they saw. In particular, I like the ability to see exactly what the SOC team searched on, or what they looked at, or what their changes were to an event. This is not common in SIEM products, but it's something I look for when I'm selecting one. Okay, so system coverage. This is very broad, but I think of it

as network, endpoint, cloud, and application. In particular, one thing that I want to point out as part of this presentation is: are you monitoring all of your cloud attack surface? Because in my experience this is often the part that gets left behind, and we're going to go into it in a bit here. So, network coverage. This is a little bit of a hard thing to put, but it's kind of become less important over time. Network detection systems, where you have something on the network that ingests all traffic and looks for specific network packets: that's how we used to do it, and it is still how some

organizations do it, and it has its merits. It might be a good part of your detection strategy, but overall it's become less the core way to ingest events. However, that doesn't mean the network data isn't important. If you don't know what talked to what, you don't really know how an attack went through your network. So, in general, I've seen a shift from having the full packet output of all of our network to just what's talking to what. I'll also say IP addresses are still a key indicator of compromise. Obviously, there are problems with IP addresses. The attacker may have switched IP addresses or used lots of IP addresses

in the attack. But through your system, it's still very helpful to know what talked to what by IP, and externally we can still get a lot of information from what IP address came in. So next is endpoint coverage. This is kind of the XDR part of it. Typically this looks like running an agent. In the world of containers, it gets more complicated. This usually looks like putting an agent in the container itself or running that agent on the host system. Where it gets really complicated is serverless systems. In a serverless system you don't have access to the host. You're not the one running the container. And so typically the people

making these EDR and XDR products have moved to sidecar containers, where you modify your definition. A lot of the important information from the OS goes through to the sidecar container, and then we can pull that out before it actually hits the host. It's a pain to set up, but it's one of those things where if you do it once, the pattern kind of applies to everywhere else that you need to do it. So what does an endpoint agent collect? These are just a small number of them. Usually there's a big list, but the most important would be network, process, DNS, auth, and sometimes

specific logs that they know are important. So next is cloud coverage, and I'm talking about cloud broadly, like how grandma or grandpa thinks of the cloud: both your public cloud and all the SaaS products that you bought. The issue here is that usually you deploy a SIEM/XDR product, you get endpoint agents out, and you get some of the basic things in, and that gives you broad coverage across a lot of what you're running. Your employees' laptops: you get maybe the alerting, and if you push out actual agents then you can see more than that with the process and network events. You get

visibility into your AWS, but you're still missing a lot of external cloud systems. Think of an accounting product, or, well, we'll get to more, but there are a lot of things that can be missing there. In particular, there are a lot of public cloud things that are missing. What happens is you get this good-enough thing onboarded, and it's hard to justify going back and pulling in more data from new systems as you roll them out. So last, and I'm just going to briefly touch on this: application detection. There are some really cool new products that look at how applications run and whether there's something that might be indicative of

attack. You can also just pull in application logs, and that can be useful. It's another layer that can go into the SIEM. So what are you missing to detect an attack? My dog is constantly thinking about this question. Never misses it: what new part of our house might the intruders come from? But I think it's a good brainstorming exercise. Sit down and think through your infrastructure and think, well, if we have just a standard SIEM setup, what other weird areas might somebody be able to attack us through? If this sounds like threat modeling, it's because it kind of is, right?
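As an editorial illustration (not from the talk's slides), the brainstorming exercise above can be sketched as a coverage-gap check: list the systems you run, list what actually reaches the SIEM, and rank whatever is missing. The system names, categories, and the 1-to-3 priority scale below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    name: str
    category: str  # network, endpoint, cloud, or application
    priority: int  # hypothetical scale: 1 = biggest concern from the brainstorm


def coverage_gaps(inventory, ingested_sources):
    """Return systems with no events reaching the SIEM, biggest concern first."""
    covered = {source.lower() for source in ingested_sources}
    missing = [s for s in inventory if s.name.lower() not in covered]
    return sorted(missing, key=lambda s: (s.priority, s.name))


# Hypothetical inventory produced by an "evil brainstorming" session.
inventory = [
    System("laptop-agents", "endpoint", 2),
    System("aws-cloudtrail", "cloud", 1),
    System("identity-provider", "cloud", 1),
    System("accounting-saas", "cloud", 3),
    System("app-logs", "application", 2),
]

# Hypothetical list of sources already onboarded into the SIEM.
ingested = ["laptop-agents", "aws-cloudtrail"]

gaps = coverage_gaps(inventory, ingested)
for system in gaps:
    print(f"P{system.priority} {system.category:<12} {system.name}")
```

Keeping the inventory as plain data means the same check can be rerun whenever a new system is rolled out, which is the point at which coverage tends to fall behind.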

So, are you monitoring this big giant list of things? IT infrastructure, VPN logs, self-hosted internal tools, third-party SaaS. The big one on here is your identity provider. That is a really important source of information and is sometimes left entirely out of SIEM and XDR products. Auth is the major path of attack. If you don't know who's trying to log in as different users, you're going to have a problem. So, IdP integrations. Sometimes there are third-party MFA integrations; those could be useful. Another good one is Google Workspace and Office 365. They sometimes have alerting for phishing or suspicious user activity. Great thing to pull into your SIEM. Data loss prevention tools,

if you have one, that can be telling that an attack might be happening. Miscellaneous SaaS audit and auth logs: this is the catch-all, but if you have some weird application and you can get some kind of extra logs out of it, you might be able to pipe it into a SIEM and get really interesting information. The other big area is public cloud. So, IAM audit logs, load balancer HTTP logs, web application firewalls; that can be really helpful. You can see every attack that's coming against your public endpoints and see what kind of trends there are in the bot scanning that happens against your infrastructure. Database authentication logs: I think this is

really important. Custom application logs. Phishing email detection, we mentioned that. Developer VPNs: this kind of goes in with zero-trust dev access tools. Again, auth is critical. If your developers have a specific path into your application and you can monitor when and where they go within that, that's great information in your SIEM. And then last, cloud security posture management tools. This is more on the vulnerability side of the world, and it is potentially nice to have vulnerability information in there, but this gets kind of muddied. Sometimes there is useful information here to pull into your SIEM. The last note on here: when you threat model, so when you get together with your peers or

with your company and you sit down and try to think through how your company is going to be attacked, when you prioritize your systems, when you find out what is the biggest concern for you, that's how you prioritize these. Realistically, you're not going to add all of these things to your SIEM. You probably don't have time. But you can get coverage of the most important systems, and threat modeling is a process that can help you get there. I like to call the short version of threat modeling evil brainstorming, where instead of going through a very structured process, you just get in a room with some people and say, how are we

going to get attacked? And that can be enough to help you prioritize this. All right. So why expand coverage? First, visibility. If an attack happens in a system that you don't cover, you won't see it. It's as easy as that. If you were attacked through some third-party SaaS product and you didn't have any way to see it, you won't know about that part of the attack until it gets to you. And if the data you needed is actually in that system, you may never see it at all. Second, earlier detection. Even if it's just an early way for someone to get into your system, you still want to

know about an attack as early as possible, and you want to get the full trail to know it. Last is confidence. With alerting systems, more is generally better. It's good to be able to filter it down and know what the most critical alerts are; that's all really important. But if you have multiple ways of showing that an attack is happening, and someone gets a reverse shell on an instance and you have lots of different ways to see that the attack happened, you have high confidence that you've been attacked. Whereas if you just have an agent and it says, "We think there is an attack," you don't

know if that's potentially something else happening. I think we'll get a better sense of that in a minute. So, this is just something I sketched out very quickly. It doesn't include a lot of the third-party SaaS, but it shows you how an example company might start to diagram out how things get into the SIEM. They're going to pull in Okta, if they use Okta. They're going to pull in Google Workspace, their admin console data. AWS: they know they've got some of these things running, so they're going to pull in that data. Then the EDR, potentially. A lot of SIEMs

support EDR as essentially an input, and so you hook up that integration. Then ideally at the end you've got two groups watching it: your SOC team, which is either internal or external, and then your security team, which has the same view but is essentially dealing with the escalations that happen. So again, no pressure, but what unusual systems can you think of that might provide useful data? It's totally okay if nobody has any ideas, but any proposals? Yeah. Yeah, definitely. HVAC would be, yeah, that'd be a great idea. And actually, your local infrastructure: if you're a cloud-focused company, I think often companies forget about all of

their hardware and internal stuff. I've certainly seen that, where we think our critical thing is in prod, and so everything that gets further away from that matters less. Anybody

else? Cool. All right. So, let's keep rolling. Putting this all together, I'm going to try and give you a SOC team's view of what might happen. And hopefully this is the same view that you can see. Now, I really wanted to make this a vendor-agnostic presentation. I like the idea that I don't know what you're using, I don't know what your tools are, and I don't want to make this about one of the ones that I happen to have used. So, in order to get there, I mocked up my own SIEM. This is a very basic view. It only includes the things that are important here. I also made up an

SQL language. So, please don't take screenshots of this and try it in your SIEM. It's not going to work. So first of all, we have some good radial bars, and no one at the company knows why they're at those numbers. We assume that if they get higher, that's either good or bad. The diagram on the right, I think that's bad for sure. And then the response actions. We have nuke, which, I don't know. Panic. Shears: an old manager of mine told me that shears and the Ethernet cables are a great way to stop a cybersecurity attack. I mean, if you think about it, some

of the great attacks, maybe they should have tried the shears; it would have stopped the ransomware. And then doom, which I think is just despair, right? So this is my little editorial about response actions. Response actions can be dangerous. There is an availability concern. I mentioned earlier that a lot of these systems really promote the idea that there is an automatic response, or that the SOC team can click a button and respond. I think that can be very useful. I've seen it be useful where you know that, for an endpoint, it's okay if it shuts down and we'd rather it shut down than cause a problem. But just be very mindful that network isolating prod is

really bad, and SIEM and MDR providers might tell you to just turn it on universally, and the SOC will know and it's okay: network isolate anything that has a problem. Network isolating prod is bad without some serious consideration of what it's going to do, and not even necessarily from a security perspective. Availability is a part of the CIA triad. We should be considerate about what happens with our actions. So, just my little editorial. We're going to put network isolate back where it belongs and move on from that. So first of all, you are a SOC team, and you wake up, you just got on

your shift, and you see two critical alerts. In this case we're reviewing critical. I think in most systems it would be critical and high, but we've got two alerts: one is S3 anomalous behavior and the other is S3 known malicious IP. You can see on the left we've got alerts and event search. We're currently just looking at what our critical alerts are. So we're clicking on that and we get alert details. There'll usually be some kind of extended details for the alerts. In this case we see that there's a source IP attached to it. And so,

you know, I think it's a reasonable thing at this point to say, let's go look and see what this public source IP is doing, if we've got some kind of really high alert that we've set up for anomalous behavior. I will say anomalous-behavior alerts are often the noisiest of alerts, but hey, we'll give it a go. So next I'm going to walk through it. This is again gobbledygook SQL, but it would usually be something like from CloudTrail, or from some type of data, where this source address equals the address. Sometimes there's a blanket IP field, so like "at IP," and you can search everything. And in a lot

of these systems, you'll have some kind of pivot where you can click on one of the events from the last search and it will narrow it down to just the time range that would be relevant considering the alert you're looking at. All of that's really nice. But in this case we've got it all laid out here, and we see a bunch of GetObject calls to this, from this bucket I should say, because it's a GetObject, from a bucket that we've called sensitive data. So that's not good. And there are a lot of page numbers on the pager. Take a guess, if we keep going, what the rest of them look like. Right?
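As an editorial illustration, since the talk's query language is made up, here is a minimal Python sketch of the same search: filter CloudTrail-style records by source IP, summarize what that address did, and collect the access key IDs to search on next. The field names follow CloudTrail's record layout (`sourceIPAddress`, `eventName`, `userIdentity.accessKeyId`, `requestParameters.bucketName`); every value, including the IPs, the key ID, and the `sensitive-data` bucket name, is invented sample data.

```python
from collections import Counter

# Documentation-range address standing in for the alert's source IP (made up).
SUSPECT_IP = "203.0.113.7"

# Invented sample records shaped like CloudTrail events.
events = [
    {"eventName": "GetObject", "sourceIPAddress": "203.0.113.7",
     "userIdentity": {"accessKeyId": "AKIAEXAMPLE0000000001"},
     "requestParameters": {"bucketName": "sensitive-data", "key": "reports/q1.csv"}},
    {"eventName": "GetObject", "sourceIPAddress": "203.0.113.7",
     "userIdentity": {"accessKeyId": "AKIAEXAMPLE0000000001"},
     "requestParameters": {"bucketName": "sensitive-data", "key": "reports/q2.csv"}},
    {"eventName": "DescribeInstances", "sourceIPAddress": "10.0.0.5",
     "userIdentity": {"accessKeyId": "AKIAEXAMPLE0000000001"},
     "requestParameters": {}},
]


def by_source_ip(records, ip):
    """Equivalent of 'from CloudTrail where source address equals the address'."""
    return [r for r in records if r.get("sourceIPAddress") == ip]


def summarize(records):
    """Count (API call, bucket) pairs so bulk reads of one bucket stand out."""
    counts = Counter()
    for r in records:
        bucket = (r.get("requestParameters") or {}).get("bucketName", "-")
        counts[(r["eventName"], bucket)] += 1
    return counts


def access_keys(records):
    """Collect access key IDs seen in the records, to pivot on in the next search."""
    return {
        r["userIdentity"]["accessKeyId"]
        for r in records
        if "accessKeyId" in r.get("userIdentity", {})
    }


hits = by_source_ip(events, SUSPECT_IP)
summary = summarize(hits)
keys = access_keys(hits)

for (event_name, bucket), count in summary.items():
    print(f"{count:>3}  {event_name:<20} {bucket}")
print("access keys to pivot on:", sorted(keys))
```

A real SIEM would run this as an indexed query with a time range rather than a list comprehension; the sketch only shows the shape of the filter and the pivot.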

So we look at all the event details on the right and we see an access key ID. That's AWS in this case; there are access key IDs. Elsewhere they're different, but they're similar concepts. But we've got this. So the next step might be... and by the way, we've got these "add to investigations." In many SIEM systems this is how a SOC team would build an investigation that they would later give to the customer. They would add all of these searches and events one by one into an investigation, write it up, and ideally you would then be able to see all of the searches that they did. So the next thing we're going

to do is search for that access key. Similar idea, from CloudTrail. If anybody doesn't work on AWS, CloudTrail is like the audit log for the management layer of AWS. It tells you who's done what, even resources, what they've accessed with an identity. So it's very powerful information if you've got a cloud object. So, we're going to search for that cloud ID. Here we see we've got a combination of IPs. Some of them look pretty normal, like start instances and describes, and those are coming from some kind of very home-network-looking IP, but anyway, ignore that 192.168.2.10 IP, and some are coming from a public one. So that's strange. So

You know, given that we now have another IP, another reasonable thing to look for would be what alerts have fired for both of these IPs. And then we see some interesting additions. We've got some RDS auth failure alerts. These are only medium severity, so they wouldn't have set off anything that the SOC would normally look at, but they provide context. Maybe something weird is happening with that instance, where it is trying to log into a database and failing. And again, there was an amazing session yesterday (I should have looked up his name) on doing structured analytics on investigations. And I think it's important to understand

that we don't necessarily know these are related at this point. These are just different things we found. But we can kind of provide correlation, where over time we're seeing that together they might provide a picture. So we're going to go ahead and look into that. It seems bad; we've got auth failures. So the next thing we're going to do is look at that particular instance. If you've got the endpoint hooked up, you can search something like "from process where" this IP, and we can start to look at the actual processes that are running. And again, if you've got this coverage hooked up, you'll see we've got some really suspicious

looking activity. Somebody using root is looking at the different users' home directories. They're looking at what commands those users have run. They looked at the password file, and they've looked at the AWS config, which typically just has the actual config in it. But the AWS config, if it's set up in maybe not a great way, might actually have the secrets in it as well. So from there we're going to look at the auth that happened on that instance. And we can see there's a user called Charlie Brown that logged in. (I don't know why it's doing the flippy thing.) There's a privilege escalation, and

then they log out. So from there, again, this is correlation. We don't necessarily know, but we're going to look in Okta for that particular user. And maybe we get some impossible travel. At this point you might say, well, what if Okta is hooked up to a dev access solution, and that's how that user is getting into that instance? There are a lot of questions from here. But I think the most important part of all of this (and you can see the login is from Japan) is that all of what I just showed you is only possible if we have all of these different systems hooked up. And if all you have is the endpoints, you're

going to get a tiny part of all those attack steps that might reasonably have happened all at once. So it's going to be really hard to piece together all the stuff I just showed you. But if you have a broader set (and this isn't even close to all the stuff I showed you earlier, this is just kind of a core set of it), you'll get a much better picture of the different systems as somebody attacks, and you'll have better coverage of that attack. So, first of all, any questions at this point? Next, I'm going to dive into another small topic. Cool. So the next thing I wanted to talk

to you about is custom integrations. Realistically, your SIEM is not going to support all the things that you have out there, and that's going to be a major problem. OK, a lot of them have an Okta integration, right? Okta is a really common IdP. A lot of them have Google and O365. But I keep coming back to your old-school accounting application that has logs, or your dev access solution. I really doubt that those things will have an integration from your SIEM. They might give you logs in a structured format, but it's not going to work with your SIEM out of the box. And so, how do you get

those in there? This can be easy and it can be hard. It really just depends. Typically, a good SIEM or XDR product supports custom integrations. The simplest form this takes is good old-fashioned syslog. If the source is in your network, and there's a way to send its logs out via syslog in CEF or LEEF format to a syslog server, usually a SIEM will support that. You essentially start up a collector instance in your network, you send the logs to that collector over (hopefully) encrypted communication, and then they will just show up as generic events

within your set. So that is a really nice path if it's available. Of course, the world is much more complicated than that. Not everything these days runs in our network, and not everything supports syslog. In fact, I feel like that's getting pretty uncommon these days. So it can also be as complex as processing the logs and uploading them to an API supported by your SIEM, using OAuth to handle keys. I'm going to show you a little bit of what this looks like. Once the logs are ingested, you will need to create a normalizing script. That said, it's actually fine if they're in there as generic. If that's all you

can do, that's wonderful. They're still going to show up in a lot of the searches. They're still going to provide context. But they're not going to show up in the broader searches on specific event types, and they're not going to trigger alerts. I think that's the most important part. If you do create normalizing scripts, the events will then show up as, say, HTTP events. Let's say you're integrating a custom load balancer or something: if you make this script, its logs will then show up as HTTP events. Last, if you're really ambitious, you can set

up your own custom alerts. This can actually be not that bad; I think most people know what kinds of things are bad within their environment. And actually, with some SIEMs, once the events are in, say, the auth format, there are general rules about what might be bad within the auth format, and your events might be picked up at that point. So, I've created a custom integration for a SIEM in Go, using S3, Lambda, and AWS Secrets Manager, for a third-party WAF's logs. This third-party WAF is hosted in the cloud. It's not in a specific environment, and it was not at all supported by the SIEM.
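
One plausible shape for the S3-plus-Lambda piece of such an integration is a handler that receives an S3 event notification and works out which newly written log objects it needs to process. The sketch below is not the speaker's code; it is a standard-library-only illustration, and `NewObjects`, `ObjectRef`, and the bucket and key names are hypothetical. In a deployed Lambda the decoding would normally be handled by the AWS Lambda Go runtime library rather than by hand. The JSON layout (`Records[].s3.bucket.name`, `Records[].s3.object.key`) follows the S3 event notification format, in which object keys arrive URL-encoded.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// s3Event mirrors only the part of the S3 event notification used here.
type s3Event struct {
	Records []struct {
		EventName string `json:"eventName"`
		S3        struct {
			Bucket struct {
				Name string `json:"name"`
			} `json:"bucket"`
			Object struct {
				Key string `json:"key"`
			} `json:"object"`
		} `json:"s3"`
	} `json:"Records"`
}

// ObjectRef identifies one newly created log object.
type ObjectRef struct {
	Bucket, Key string
}

// NewObjects extracts bucket/key pairs from a raw S3 event notification.
// Keys are URL-encoded in the notification, so they are decoded before use.
func NewObjects(raw []byte) ([]ObjectRef, error) {
	var ev s3Event
	if err := json.Unmarshal(raw, &ev); err != nil {
		return nil, fmt.Errorf("decode S3 event: %w", err)
	}
	var refs []ObjectRef
	for _, r := range ev.Records {
		key, err := url.QueryUnescape(r.S3.Object.Key)
		if err != nil {
			return nil, fmt.Errorf("decode key %q: %w", r.S3.Object.Key, err)
		}
		refs = append(refs, ObjectRef{Bucket: r.S3.Bucket.Name, Key: key})
	}
	return refs, nil
}

func main() {
	// Hypothetical notification; the bucket and key are invented.
	raw := []byte(`{"Records":[{"eventName":"ObjectCreated:Put",
		"s3":{"bucket":{"name":"example-waf-logs"},
		      "object":{"key":"2025/06/05/waf+log-0001.json"}}}]}`)

	refs, err := NewObjects(raw)
	if err != nil {
		panic(err)
	}
	for _, r := range refs {
		// A real handler would fetch the object here, convert it to the
		// SIEM's expected format, and upload it.
		fmt.Printf("would process s3://%s/%s\n", r.Bucket, r.Key)
	}
}
```

Keeping the "which objects are new" step separate from the fetch-and-upload step makes the handler easier to test without any AWS access.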

What that looked like was: the third-party WAF supported sending its logs to an S3 bucket in your environment. So they would just show up one by one as objects; usually it's JSON, CEF, or LEEF, there are a couple of formats. And then what I did was essentially create a Lambda, coded in Go, that ran on every object that was created. Hopefully your SIEM will give you examples of how to do this. Mine didn't, but the documentation can vary, right? If they're giving you a custom integration path, it's great if they show you: here's how you spin up a Lambda in Python, here's how you send it to us by

running a Lambda with this. It just depends on which one you're running. These are good things to look for in the documentation before you buy the SIEM, to know what you're getting into. Then the Lambda ran an OAuth flow. This was a complicated part of it. You gave it a long-term secret, and then it had to pull down a temporary token, store that in AWS Secrets Manager, and keep presenting it until it expired, then go get another one and do that flow over and over again. This process then created a request to upload to the SIEM, which in this particular SIEM's case created an S3

pre-signed upload URL, which is one way to do it. It worked fine, but you might see different paths. It's just however they want you to get those logs there after that. So this experience was really enlightening. A couple of things stood out. One is that my particular SIEM did not provide great visibility into what happened if events were ingested but there were issues after that. One of the frustrating things that would happen is I would get it working and send events to the API, the API would accept the events, but then: where are my events in the SIEM? It might be an error that was never surfaced to the

user. Or, in my particular favorite part of this (I may have put it on the slide, I don't remember), one was my mistake. Yeah, it is on the next slide; I'll get to it in a second. Next, lacking documentation from the SIEM. I think I said that: oftentimes the documentation may not be complete. And last, handling failure states. But I did get it to work, and then it ran happily ever after. What was nice was that once we got that working, it was a pattern that we could apply over and over. If anybody gave us logs that could go into an S3 bucket, as long

as I could get them in this format, I could then repeat that process and get them into the SIEM. So the first time was really painful; everything else we built after that, not so bad. I mentioned that there are custom parsers. This looks like trying to get the event from generic into HTTP. In this case it was a WAF, so it was actually HTTP with some particular notes as to whether the WAF had blocked the request or allowed it through, because that's really all that the WAF adds, for the most part. I think there was actually a category, too, for why it blocked it, which is really useful. So again,

this is not the exact formatting, but this is an example CEF log. You've got the lines (pipes) in between each thing; there's usually a standardized header at the front, and then you have fields after that. And this on the right is the most basic version of the script. You're just saying something like "values equals," and we're going to parse the CEF from the original data (these are defined in the documentation), and then we're assigning the different values to different outputs, and we've noted here what they are. There was no documentation about how that script worked.
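
The slide itself is not reproduced in this transcript, so the following is only a minimal sketch of the same idea in Go rather than in any SIEM's scripting language: parse the pipe-delimited CEF header, split the extension into key=value pairs, and assign them to a normalized HTTP-style event. `HTTPEvent`, `ParseCEF`, and the sample log line are hypothetical. The extension keys used (`src`, `request`, `requestMethod`, `act`, `cat`) are standard CEF keys. The sketch does not handle escaped pipes in the header or escaped equals signs in values.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// HTTPEvent is a hypothetical normalized shape, not any SIEM's real schema.
type HTTPEvent struct {
	Vendor, Product string
	SourceIP        string
	Method, URL     string
	Action          string // what the WAF did, e.g. "blocked" or "allowed"
	Category        string // why the WAF acted, when provided
}

// extKey matches the start of each key=value pair in the CEF extension.
var extKey = regexp.MustCompile(`(?:^|\s)([A-Za-z0-9_.]+)=`)

// parseExtension splits the extension into a map. A value runs until the
// next " key=" boundary, so values may contain spaces.
func parseExtension(ext string) map[string]string {
	fields := make(map[string]string)
	locs := extKey.FindAllStringSubmatchIndex(ext, -1)
	for i, m := range locs {
		end := len(ext)
		if i+1 < len(locs) {
			end = locs[i+1][0]
		}
		fields[ext[m[2]:m[3]]] = strings.TrimSpace(ext[m[1]:end])
	}
	return fields
}

// ParseCEF turns one CEF line into an HTTPEvent.
func ParseCEF(line string) (HTTPEvent, error) {
	start := strings.Index(line, "CEF:") // skip any syslog prefix
	if start < 0 {
		return HTTPEvent{}, errors.New("no CEF header found")
	}
	// Seven pipe-delimited header fields, then the extension.
	parts := strings.SplitN(line[start:], "|", 8)
	if len(parts) < 8 {
		return HTTPEvent{}, errors.New("incomplete CEF header")
	}
	ext := parseExtension(parts[7])
	return HTTPEvent{
		Vendor:   parts[1],
		Product:  parts[2],
		SourceIP: ext["src"],
		Method:   ext["requestMethod"],
		URL:      ext["request"],
		Action:   ext["act"],
		Category: ext["cat"],
	}, nil
}

func main() {
	// Invented example line; vendor, product, and values are placeholders.
	line := `CEF:0|ExampleVendor|ExampleWAF|1.0|100|Request blocked|7|` +
		`src=203.0.113.7 requestMethod=GET request=https://app.example.com/login act=blocked cat=SQL Injection`
	ev, err := ParseCEF(line)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", ev)
}
```

The action and category fields are the part a WAF adds on top of a plain HTTP event, so those are the ones worth mapping carefully.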

It looks really simple, but I will say, when you don't have anything to go on, you wind up really reverse engineering it. Fortunately, we are really close to the end. It was an adventure. There were several parts where I felt like I was reverse engineering really basic logic, but we got it to work. So that was the end of that particular custom parser adventure. But I did go back and look at my SIEM. I gave a lot of feedback to the SIEM provider during this process, which sometimes felt really frustrating. It felt like I was throwing

feature requests into the void. There was an "add suggestion" button, and every couple of days I was adding something like: Where's my event? How does this work? There's no documentation. And my favorite bug that happened during it was that I got the conversion from epoch time wrong, which is a classic programming error. So I wound up putting all of my WAF logs in Egyptian time. I was like, where are my WAF logs? Where are they? And then I realized what I had done, searched way back, and was like, oh, well, they're there. But I went back to my SIEM provider and looked at the documentation today, and

they actually did go and build out everything that I had run into trouble with, which really gave me a lot of hope for SIEMs everywhere. I think it can be frustrating, but they are slowly building and adding, and it's worth communicating with your SIEM provider if you think the documentation is lacking. So anyway, that's everything. I hope this has helped your understanding of cloud coverage and why it matters, and thank you so much for coming. Are there any questions? Yeah. Yeah, I mean, you can. No, I explicitly don't want to,

because I think it's hard for one person to have enough of a vantage point to really say. I've seen vendors get so much better over time. I've seen ones go into obscurity. I've worked with a number of different ones over my career, and I just wouldn't feel comfortable comparing them, given the different dates and timelines. If anyone has specific questions, I'm happy to answer after. Anybody else? Yeah. So, I would say... I've seen really good ones. One of the classic approaches is to just spin up your own... oh, so

there are some... I think the one that I'm most interested in is... oh, shoot, what's the name of it? It's a really funny name, but there's a big open-source XDR product (it starts with a Z, I think) that I've been playing with, and it looks really interesting. I love the idea that you could host it yourself and it would be standardized. I'll look it up if you have questions after. I've also done quite a bit of work with Secureworks Taegis, and I liked their setup. They had nice parsers and that kind of thing. Yeah, there are

good products out there, I think. And I've seen people with really good setups where they just used Splunk and customized everything from there. That can work really well. It just depends on your environment, your size, and what you set up. I really don't think there's a one-size-fits-all for this. Yeah. Yeah.

I think the biggest thing for me is that visibility. I mentioned it before, but there are a lot of systems where, if I were going to name and shame, my main complaint is that they clearly have all this functionality, but you don't get to see it. And another concern with that, to me, is that you wind up relying on them. You give them all this data, and if you can't go and validate it, you're kind of stuck with them. You know what I mean? You don't have enough information to tell whether they're doing a good job. So my biggest complaint is if

I feel like the SIEM or XDR product is hiding the data from me. Anybody else? Okay, I'll give you a little bit of time back. Thank you so much. This was a lot of fun. [Applause]
