
So, hi everyone. I just took your picture, so I hope that's okay; if not, let me know and I will delete it. Today I'm going to speak about AI agents leveraged in the SOC. I actually changed the subtitle of this presentation, because at first it implied that AI agents are going to replace SOC analysts in the future. That is not what I think, and it is not what you will take away from this talk. And
when I look at the future SOC, I see it as a collaboration between humans and AI agents, and we'll dive into it right away. First, let me introduce myself. I'm Woody, and I currently work for Microsoft as a research program manager for our XDR and SIEM solutions, Defender and Sentinel, mainly on the quality aspect of the product, meaning I spend a lot of my time making the product better. Before that I worked for many years at Big Four companies on the offensive side, leading red team assessments, research programs, and so on, and the beginning of my career was in the critical infrastructure domain. So I know the challenges, let's call them that, from both sides: the attacker's perspective and the SOC's perspective. Let's dive into it. I won't read through the agenda; I just want to say thank you to Sarah Young, who mentored me for this talk, and to the BSides team for accepting it. This is my first time speaking at this kind of international conference, so thank you very much. To set the scene, I will start with the challenges we have when operating a SOC. The
first one is that SOC analysts spend a lot of their time chasing nothing: half of it goes to triaging signals and investigating incidents that are not really there. You can see two reports, one by SANS and one by Microsoft, that describe this challenge. There is also a lot of volume; we saw an increase of 13 trillion signals from 2023 to 2024, and most of my work is about getting that number to decrease through 2025 and beyond. So this is a main challenge: it causes fatigue, and it makes analysts feel that much of their work is for nothing, because it has no impact on the business afterwards. Of course, the threat landscape is also evolving, mainly toward more sophistication. Attackers use AI in order to attack; I did it, my team did it, and real attackers targeting companies are doing it too. And note that most organizations actually have a live attack path to a critical asset in their network, meaning there is something out there to chase. I don't remember any penetration test or red team engagement where we didn't achieve
our goal. I spent three years attacking Fortune 500 companies, and I don't remember a single case where we failed to get what we wanted out of the goals we set up front. So this is a very big challenge that most organizations are facing. What would the ideal experience look like? If I were sitting in a SOC as an analyst, I would want only true positive incidents, meaning every operation I perform actually makes sense and has some impact on the company. I would also want something to prioritize for me, so that my next action is the optimal one; even among true positives, I want to chase the thing with the most impact on the company. I would want some kind of automation for triage and also for response; we will spend a bit of time on that later. I don't think it should be fully automatic, but it should be there in a controlled way. And the system should learn from past experience and apply it to the future work that both humans and agents will do. So this is the ideal experience as I see it. There are many more things, but these are the main ones.
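To make the "optimal next action" idea concrete, here is a minimal sketch of an impact-aware triage queue. The scoring fields, weights, and incident names are my own illustrative assumptions, not anything from a product: it simply ranks incidents by confidence times asset criticality and hands the analyst the top one.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Incident:
    # heapq is a min-heap, so we store the negated score to pop the
    # highest-priority incident first.
    sort_key: float = field(init=False)
    name: str = field(compare=False)
    confidence: float = field(compare=False)         # how sure we are it's a true positive (0..1)
    asset_criticality: float = field(compare=False)  # business impact of the affected asset (0..1)

    def __post_init__(self):
        self.sort_key = -(self.confidence * self.asset_criticality)

queue: list[Incident] = []
heapq.heappush(queue, Incident("port scan on test VM", confidence=0.4, asset_criticality=0.1))
heapq.heappush(queue, Incident("BEC on finance mailbox", confidence=0.9, asset_criticality=0.9))
heapq.heappush(queue, Incident("malware on dev laptop", confidence=0.8, asset_criticality=0.4))

next_action = heapq.heappop(queue)  # the analyst's optimal next action
print(next_action.name)
```

The point of the sketch is only the shape of the decision: whatever richer scoring a real system uses, the analyst should always be popping the incident with the highest expected impact, not the newest alert.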
This slide describes how detection evolved, from the pre-2000s until now, and I will describe each phase and where I think we are right now. We started with alerts, which are distinct, discrete pieces of logic that point at a specific event that happened. It can be a port scan from IP X against machine Y. But an alert doesn't tell us the story; it tells us about one event, and it says there is some security significance you need to act on in order to block that specific threat. This causes a lot of noise, and most organizations are still at this level: ingesting alerts from dozens of products into a SIEM, with analysts investigating them one by one and, as we said, spending half their time chasing nothing. Then it evolved to incidents, or cases, which group several alerts into one container that suggests a story. It's a big leap, but it's still not where we want to be, because here we understand what happened, but most of the time the story isn't complete; we see bits of it, but we don't know the intent, and not the impact. Where we are right now is what I call gen 2.5, which means we have machine learning and graph algorithms enhancing these alerts. Incident-centric signals were built on correlation of entities: the same IP appears in specific alerts, and from that we get the concept of an incident. With machine learning we can also leverage patterns and behaviors that we see, but it still doesn't have reasoning behind it; it's just patterns the machine can learn and use to spot incidents. So the next generation is to understand the reasoning behind all of this, as a human analyst would, whether tier 1 or tier 2: understand the reasoning behind what happened, find more relevant signals, and build a complete end-to-end story that describes what happened, why it happened, and what the impact was.
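The "correlation of entities" behind gen-2 incident grouping can be sketched as a connected-components problem: alerts that share any entity (an IP, a host, a user) land in the same incident. The alert shape and entity values below are invented for illustration; real products use far richer matching.

```python
from collections import defaultdict

def group_alerts_by_entity(alerts):
    """Union alerts that share any entity into incidents (connected components)."""
    parent = list(range(len(alerts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    seen = defaultdict(list)              # entity -> indices of alerts mentioning it
    for idx, alert in enumerate(alerts):
        for entity in alert["entities"]:
            seen[entity].append(idx)
    for indices in seen.values():         # link every alert pair sharing an entity
        for i in indices[1:]:
            union(indices[0], i)

    incidents = defaultdict(list)         # component root -> member alert titles
    for idx, alert in enumerate(alerts):
        incidents[find(idx)].append(alert["title"])
    return list(incidents.values())

alerts = [
    {"title": "port scan",         "entities": {"10.0.0.5"}},
    {"title": "brute force",       "entities": {"10.0.0.5", "srv-01"}},
    {"title": "new admin account", "entities": {"srv-01"}},
    {"title": "phishing email",    "entities": {"alice@corp.example"}},
]
print(group_alerts_by_entity(alerts))
```

Note what this cannot do, which is exactly the gen-2.5 limitation from the talk: the phishing email stays orphaned because no shared entity links it, even if its text clearly belongs to the same campaign.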
There is a model from information management that describes all of this, called the DIKW model. At the base we have data, which covers all the discrete events that happened. On top of that we have information, which is where we are right now: it contains incidents and alerts that suggest what happened and provide some insight, but usually without context. Above that there is some knowledge today, with many products, but we are not yet at the wisdom level, the one that tells us why it happened and what we are going to do about it. That is the reasoning I'm talking about. So why do intent and reasoning even matter? As I said, prioritization: as analysts we want the optimal next action to take, so we can do our job better, and here the agents can help us; I will describe them in a minute. We want to know what happened, but also why, how important it is, what the priority of handling it is, and what the potential impact is. We also want the reasoning behind all of that. The machine shouldn't just say "this is important, go do it"; if we are going to make decisions based on its output, we need to understand why. So what are we going to do? We can harness a bunch of agents. This is not a complete list, of course, and later I may describe some more functions that can also be leveraged. We will go through them one by one, and I will describe each agent, its intent, and how it can assist us. And it's 24/7: it doesn't depend on whether an analyst slept badly, had a fight, or isn't concentrating. It's always there, helping us, assisting us, and making sure the analysts focus on the right things. Now, this is something I need to read out, because there are demos [laughter] in the upcoming slides. All the examples you will see are not from production. They use real client data, but it was copied and scrubbed before it entered our demo environment. So it's real data in there, but scrubbed; it's a test environment where we can exercise our agents and other things, to see whether the results make sense.
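One way to picture the DIKW layering I described is to watch a single signal climb the pyramid, with each layer adding what the one below lacks. Every value here is invented purely for illustration:

```python
# Data: a discrete event, no interpretation at all.
data = {"event": "port scan", "src_ip": "185.220.101.7", "dst": "srv-01"}

# Information: what happened, as an alert (where most SOCs are today).
information = {**data, "alert": "Possible reconnaissance", "severity": "low"}

# Knowledge: the same alert with context attached.
knowledge = {**information,
             "src_context": "Tor exit node",
             "dst_context": "internet-facing finance server"}

# Wisdom: why it happened and what to do about it, i.e. the reasoning layer.
wisdom = {**knowledge,
          "reasoning": "Anonymized scanning of a critical asset suggests targeted recon",
          "next_action": "hunt for follow-on activity from Tor ranges"}

for layer, doc in [("data", data), ("information", information),
                   ("knowledge", knowledge), ("wisdom", wisdom)]:
    print(layer, "->", len(doc), "fields")
```

The only point of the sketch is that each layer is a strict enrichment of the previous one; "wisdom" is the reasoning and next action, which is exactly the gap the agents below are meant to fill.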
So the first agent I'm going to describe is the orchestrator, which is, let's call it, the shift manager: the agent that triggers the other agents and drives the process. It can run automated playbooks; think of it as importing all of your SOPs, the standards and procedures that you have in the SOC, and having the orchestrator act based on them. It can parallelize tasks, which I will show you in a minute. It also coordinates between everyone, meaning it can take one agent's output and feed it as input to another agent, so the whole thing runs as a process rather than discrete, disconnected tasks. And most importantly, it can enforce the guardrails: if there is some automated action you don't want to happen, you set that here in the orchestrator, and nothing will happen until you mark it as approved.

The investigator is where things start to become interesting. Think of it as a tier 1, or even tier-less, agent whose goal is to ingest all of the information, parse it, and then look up the context: what is that IP, does this machine belong to, say, our CEO, what is this server doing? When we have an alert saying "port scan on this IP", we don't understand what that IP is, right? So it looks up the context. It can also leverage threat intelligence, both general threat intelligence, meaning a report for a given threat actor, and our own threat intelligence, for example knowing that we are going to be targeted by a specific threat actor with specific TTPs. Its goal is to gather all of this data, connect the dots, and assemble a story end to end. If anyone here works as an analyst, this is exactly what tier 1 analysts are in charge of: ingest all of the data, build the story, and later act on it. And of course it cuts the noise, because it takes only what belongs. Here there is a real benefit, because it's not entity-based. It doesn't just take a specific IP we saw and look for other alerts with the same IP or the same machine; it reads the entire text of the alert and tries to find patterns in other alerts in order to build a complete story.

In this example you will see a business email compromise attack that it assembled from four alerts. I know you can't read all of this, but I will highlight the important parts. If I remember correctly, the first alert that someone actually noticed was a high volume of spoofed emails sent from a given account, but we had four alerts in the system. One said "anomalous IP address", and we will get back to that in a minute. Another was an inbox rule changed by an unauthorized person. So those alerts were there, but they weren't connected into an incident. The agent connected them, and it also found something that wasn't an alert at all but an event, a search event: the same user who got compromised had searched for specific words in SharePoint and OneDrive. One was "tax" and the other was "mobile bank", or something like that; we will see it in a minute. So that added more context. And the anomalous IP address, which, for anyone here who works in a SOC, is an informational alert if anything: when the agent did the reasoning behind it, it found that this specific IP is actually a Tor proxy IP, which points to a specific attacker group in this case. It also found the root cause, which is another task we do in the SOC, and provided the timeline, the executive summary, and the root cause. So this was the output of this specific agent. Now we have the story, but is it the most important one? Is it even true?
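A heavily simplified sketch of what such an investigator step might do: enrich each alert with asset and threat-intel context, then emit a timeline-ordered story. The lookup tables, field names, and values are all invented stand-ins for a CMDB and a TI feed; a real agent would reason over the full alert text, not a pair of dictionaries.

```python
# Invented context sources standing in for a CMDB and a threat-intel feed.
ASSET_CONTEXT = {"mb-finance-01": "CFO mailbox", "10.9.8.7": "unknown external host"}
THREAT_INTEL  = {"10.9.8.7": "Tor proxy exit node"}

def investigate(alerts):
    """Enrich alerts with context and assemble a chronological story."""
    enriched = []
    for alert in sorted(alerts, key=lambda a: a["time"]):  # order events into a timeline
        entity = alert["entity"]
        enriched.append({
            **alert,
            "asset": ASSET_CONTEXT.get(entity, "no context"),
            "intel": THREAT_INTEL.get(entity, "none"),
        })
    story = " -> ".join(a["title"] for a in enriched)
    return {"timeline": enriched, "summary": story}

alerts = [
    {"time": "09:12", "title": "anomalous IP sign-in",        "entity": "10.9.8.7"},
    {"time": "09:03", "title": "high volume of spoofed email", "entity": "mb-finance-01"},
    {"time": "09:20", "title": "inbox rule changed",           "entity": "mb-finance-01"},
]
result = investigate(alerts)
print(result["summary"])
```

Notice how the "informational" anomalous-IP alert only becomes interesting once the intel lookup tags it as a Tor exit node, which mirrors the reasoning step in the example above.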
We don't know yet. So the next agent is the prioritizer, which is like a tier 2 analyst. It will triage the event, classify it as a true or false positive, provide us with a confidence level, as we will see in a minute, and then provide the reasoning behind that. This is a skimmed output of the agent: we can see it triaged the incident as a true positive with a confidence level of 90%. Based on that, the orchestrator can also carry logic such as "if the true positive score is above X, then do Y", so this output can be leveraged later. It also provided the reasoning, which is the most important part of this output: it's not just a number; behind it there is the story. And of course it provides all of the IOCs, which we need later in order to respond, and the recommended actions, which, as we will see in a minute, we leverage to drive manual or automated efforts afterwards.

The responder is next in line. Now we have the story, we know it's true, we have the confidence, and we need to act on it. The responder is our firefighter, let's call it that. It can suggest what to do, but it can also do it, if you connect it to the right systems, of course: it can block a specific account, add rules, and so on, and do it at machine speed, so you don't need an analyst always there. If specific actions were permitted by the orchestrator and an incident happens, say, during the night when there is only one analyst on shift, the agent can act instead. And of course it's consistent; it's always there.

So until now we had a process, right? We investigated something, then we triaged it, then we responded. But we can also parallelize some things, and the hunter is one of them. Right now we have one incident, but maybe more phishing emails landed and other things got compromised. The hunter is the way to hunt further and find similar threats within our environment, a second line of defense. You can see its output here, and because we don't have time I will just say that this is specifically about false negatives: there may be things we missed, and this is the way to miss less.

The last one I will describe is the blacksmith. Once we have a specific incident, we can write a new detection rule based on it. Say we previously had no rule to detect specific keywords in search; now we can write one. So it takes what we learned and applies it to future detections. This is the output for this specific incident: here we see the JSON, we also have KQL for it, and it wrote a rule. We also compared it to the rules that real humans write in the SOC, and we saw it's actually comparable.

So this is the process I described, and we can also parallelize some of it. The impact is less noise; faster containment, meaning the mean time to respond, the time analysts take to respond to threats, goes down significantly; efficiency goes up; burnout goes down; and the SOC becomes more efficient and more effective.

As for the adoption path: I'm not saying you need to go tomorrow and implement all of this. Start small. Define a process, find your pain points, and pick a specific agent to address one of them. Pilot it, then integrate it into your existing stack. I think the most important thing here is that this is product-agnostic: it's all text, so it doesn't matter whether it's a SIEM by company X or an XDR by company Y; as long as you ingest everything, the agent can read through it. Then you can expand. Always measure and iterate, and of course train your SOC to leverage all of this, to collaborate with agents, and to know where the boundaries are, the way we have the shared responsibility model in the cloud; it's kind of the same here.

So, to wrap up with key takeaways and a look ahead: as I mentioned at the beginning, I don't think agents will replace the analysts in the SOC, but they will make their lives much better. They will let analysts concentrate on the more important things instead of spending all their time chasing false positives. That is how I see it. I don't know if we have time for questions, but if you have some you can also reach out.
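Pulling the pieces together, the flow described in the talk, investigate, triage, then respond only when the orchestrator's guardrails allow it, might be wired up roughly like this. Every function is a stub of my own invention; the 0.8 threshold and the allowed-actions list are the kind of guardrail a team would set for itself, not product defaults.

```python
# Stub agents: each consumes the previous agent's output, coordinated by the orchestrator.
def investigator(signals):
    return {"story": " -> ".join(signals), "iocs": ["10.9.8.7"]}

def prioritizer(case):
    # Pretend tier-2 triage: a verdict plus a confidence the orchestrator can gate on.
    return {**case, "verdict": "true positive", "confidence": 0.9}

def responder(case):
    return [f"block {ioc}" for ioc in case["iocs"]]

def orchestrate(signals, *, auto_respond_threshold=0.8, allowed_actions=("block",)):
    case = prioritizer(investigator(signals))
    actions = responder(case)
    if case["confidence"] >= auto_respond_threshold:
        # Guardrail: only pre-approved action types ever run automatically.
        executed = [a for a in actions if a.split()[0] in allowed_actions]
    else:
        executed = []  # below threshold: suggest only, leave the decision to the analyst
    return {"case": case, "executed": executed, "suggested": actions}

run = orchestrate(["spoofed email burst", "anomalous sign-in", "inbox rule change"])
print(run["executed"])
```

The design point is that autonomy lives only in the orchestrator's gate: the responder always proposes, and whether a proposal runs at machine speed or waits for a human is a policy knob, not a property of the agent.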
>> ...question.
>> I must say that this is key. Currently we are leveraging large language models, and the more time you invest in training, actually using SLMs rather than LLMs, the more you decrease the cost, by ninety-something percent, and this is what we are currently chasing in order to implement it at scale. We have the technology right now, but for it to be beneficial for our companies we must shrink it and have it targeted at security, rather than at, I don't know, fishing trips and hiking in the desert.
>> Thanks for the talk. Very interesting. Throughout this entire process, are you also using data that is not trusted? I mean, do you have snapshots of data from the alleged attacks, and if you do, how do you deal with the concern that some prompt injection somewhere will actually take over and do intent manipulation of the entire process?
>> So it's actually back-end; it's not something where someone is prompting it directly. We do want to have what we call "vibe hunting" in the future, where the SOC inputs their questions or what they want to do, but it's not there yet. So currently we're not dealing with that, but this is a great question and something that is very concerning, I guess.
>> Thank you.
>> Thank you.
>> Yeah, likewise, great talk. My question is on the orchestrator, or, I can't remember what you called it, the planner agent at the very beginning. You mentioned it's going to be fed with your SOPs and things like that. How are you dealing with the unstructured nature of those, in that some are playbooks and some are just handwritten notes, and ensuring that the agent always follows at least the top SOPs you've defined, say for business email compromise?
>> Again, a great question, because that's also something we invest a lot of time in. When you are ingesting a lot of data from a whole range of sources, the model is supposed to take care of it, but then, as you described, you want to know that it will actually follow your SOPs. So there are, let's call them, strict boundaries or guardrails that we embed in the code in order to do that. Right now, since this isn't scaled to all of our customers, it's quite easy; it is a challenge, though, and we will need to figure it out.
>> Sure.
>> All right. Thank you, everyone. If anyone else wants to reach out, please feel free. Thanks. [applause]