BSidesSF 2026 - Architecting the Modern SOC: The Evolving AI Reality for Blue Teams (Panel)

Name: BSidesSF 2026 - Architecting the Modern SOC: The Evolving AI Reality for Blue Teams (Panel)
Uploaded: 2026-05-12
Duration: 45 min 5 s


About this talk

Architecting the Modern SOC: The Evolving AI Reality for Blue Teams. Bryan Fite, Dean De Beer, Nicole Grinstead, Swathi Joshi. The architectural decisions behind SOC modernization determine whether AI-assisted security operations succeed or fail when faced with reality. We’ll dissect integration realities, policy-flexibility tradeoffs, and human-in-the-loop design patterns that actually work at enterprise scale. https://bsidessf2026.sched.com/event/e113e91f7a28a3f2a312abd85beb0491


All right. Good afternoon, BSides 2026 here in San Francisco. Welcome. All right. For our panel, we have our moderator, Nicole Grinstead, right? Grinstead. Okay. She's a senior director of platform, enterprise, and application security at Roblox. And then we have... Oh, yeah. You like her? Do you know her? Did you plant them? >> I plant AI. >> You did? Okay. Yeah, I was a plant last panel. All right. Then we have Bryan Fite, AI Sherpa at World Wide Technology. Round of applause for him. And then we have Swathi Joshi. She's an SVP of cyber defense engineering and operations at TransUnion. Very nice. Last but not least, we have Dean De Beer

and his title is a fancy one. He is a co-founder and CTO at Command Zero. Did I say that right? >> Closely. Okay. Very good. All right. So, I'm going to have Nicole take it away. And we welcome and please no heckling. If you heckle, I will call security. All right. Be nice. Play nice. >> Positive heckling is welcomed though. How's everyone's BSides? >> Yeah. Yay. That was a great reaction. I was mentally preparing for, like, a "It's fine." This is a hyped crowd. I love it. So, welcome. Really happy to be here with you all in this panel. And we are going to talk about embracing an agentic future in our SOC. And I want to start

with a little bit of a story that is shared between Swathi and I. So circa fall of 2017 we were joining Netflix and we had this vision: we are going to be totally SOC-less. All our detections are going to be high-fidelity true positives; we won't need SOC analysts to review them. Swathi, did we reach the SOC-less future? >> I'd say we operated SOC-less for a while and then stuff happened. >> Yeah. >> You know, as alerts grew, the money that we were spending on content grew. So employees were growing, the attack surface was increasing, our obligations were growing. We were lucky in the

sense that at the time Netflix was a very homogeneous environment, right? And it was not that regulated, so there were a lot of advantages with it. Detection engineering was grown in-house; we had an in-house detection platform, so we could control, like, hey, here are the types of detections we're writing. And we're going to go in the reverse order, right? We're going to look at what the high-loss scenarios are, then we're going to match what specific telemetry we need, then we're going to write the detections, and then we're going to push them to production. So we had a lot of control over it. So a lot of the factors to

build the SOC-less future, both culturally and business-wise, existed. So that was the advantage, and we did operate, and then the alert queue grew. >> And I think at the time the AI future didn't look like this, right? >> And if it did, I'm sure we would have capitalized on it. >> Yeah, absolutely. And that's why I wanted to dovetail into, you know, we had the alert queue grow and grow and then we built the SOC. And now, fast forward nine years: are we able to be SOC-less? Is the SOC analyst going away? And if so, why are there so many open job

postings for SOC analyst roles? I'll start with you, Dean; we'll go down the line. >> So I think for the past 20 years every CSO or CISO at every organization has wanted to replace tier one. I think what they really meant was: I want my tier one to function as a tier two or three, and really operationalize them. I don't need them to do pattern recognition. Hence farm it out to MSPs, bring it back in, farm it out to MDRs, bring it back in. With the advent of AI, the SOC is a natural place for it to actually present itself. I don't think the role goes away. I'll caveat that: within six

months, I have no idea. But I don't think the roles in a SOC go away. I think the roles evolve. I think the idea of tier one through three is probably archaic at this point. And the tier one analyst has traditionally been the pipeline for tier two and three; we need to maintain that pipeline at some level. So you can be ultra-pessimistic and say we don't need tier one and we can use AI to replace them, and in certain functions you probably could. But that also means that in three to five years, when your tier twos

and threes move out, you end up with an organization whose only role is to validate whether or not, if it's agentic, your agents are functioning, not whether they're functioning correctly. So for the foreseeable future I don't think the role goes away. I think it evolves. You'll see the functions that they perform change as well. So rather than doing pattern recognition, you'll find that the analyst will now evolve to do, you know, ontology engineering, for example. Bad data in, bad data out. So a deeper understanding of data, the ability to build ontologies of data and apply that to those language models

becomes critical. The more structured data you feed a model, the better structure you can actually extract from it. As you start dealing with agents in the SOC, and if you talk about the SOC going away, maybe the boundaries of the SOC evolve, right? And it's no longer the classic centralized SOC that we have. You now have agents that have access to different data types in different departments or in different groups. So roles like identity and access management, which was traditionally, you know, who has access to what at an audit level, now become trust and

authorization. So how do you build programs that allow you to provide trust to a set of agents that have a security function? That becomes more critical when you ask those agents to start performing containment or remediation actions, which, from where I sit, it's surprising, but it's actually a big ask of folks that remediation is now performed automatically, or at least containment. So how do you control that? How do you do crisis management if something does go wrong? How do you ensure that the person who is looking at those agents understands what went wrong? And in order to do that, you still need technical knowledge, right? I think it's

just applied in a very different fashion. >> Wow. Amazing. So, real quick before we get deeper into the subject, how many people in here have heard of Slido? Okay. Slido, slido.com. We would love your participation. Nicole's going to do a 10-minute Q&A at the end of the panel to open it up. If you go and log in to slido.com, you want to register at BSidesSF2026 and we're in theater 12. Again, that's slido.com, BSidesSF2026, and we're in theater 12. Take it away, Nicole. >> All right, back to you, Swathi. The SOC: is it dead? >> Yeah, I think SOC as a function is definitely changing. The manual alert

handling is going to be a thing of the past. So what's that replaced with? The new SOC is going to be queueless and alertless, but investigation as a function, I don't see that disappearing. And in terms of how we square that with so many SOC analyst roles open, right, I think AI absolutely reduces low-value triage. That'll absolutely happen. But then work shifts from alert handling to authority, decision making, like Dean said, crisis response, and then detection life cycle management. And also the life cycle management of the new SOC is going to be with agents that are context-aware, where it's human-augmented, so there's some decision

making or clicking going on. So that's my vision for what the new investigation era is going to look like. >> Yeah, I don't think the humans are going to go away anytime soon. Being a human, I'm biased, much less a large language model in that case. But no, I think automation, you know, AI might not be the hammer to your nail, and automation has a lot of business value and I think we've kind of stopped doing that. And one way to fail at scale is to automate a broken business process. So I think optimizing those processes that have playbooks today is probably a good way to get some value, but

absolutely, it's going to change. I am a little concerned that there'll be entry-level jobs that won't be there. So there's this idea of being able to be agile and pivot. That's good, because you're probably never going to get bored as a human in the business. But I think we're going to have dynamic teams, and we're going to have synthetics with us, and that's a good thing. We're going to have to learn to live with these weird machines that we've created. So I ultimately think that we benefit there, but it's definitely not going to be the traditional, you know,

start here, work for three years, go to tier two. And I think having some business acumen and understanding parts of the business that we didn't necessarily need to before. You could stare at screens all day and be really good at your job and have a really good career. I think we're going to be expected to do more. And we were talking about this before as far as, you know, alert fatigue, portal fatigue. We will probably see some of these systems start to evolve to not be human-friendly but more agentic-friendly, and maybe at even a lower level, because the volume and the speed at which they'll have to react

will necessitate that, and, you know, all the pretty pictures are just for us, so they won't need those. >> Yeah, absolutely. And building on that a bit, as our analysts are becoming more decision points, crisis management, as we're building out these new agentic-ready systems, what are the kind of signals that you're exposing to the humans? How are you designing for that human in the loop? What are you exposing in terms of what the agent has already done to help the human be more effective? I'll start with you, Dean. >> So I think the role of human in the loop will change as well, right? Today in most SOC

automation scenarios you still need a set of checks and balances, right? The problem with new technologies is those trust boundaries. So you need to be transparent, you need to prove trust before you allow for any levels of automation that have a real impact, for example containment. So today the human-in-the-loop role, especially tier one doing triage: one, it'll speed up their role, but the role is still to determine whether something is a true positive or false positive, right? At a very simple level, like, is this

investigation that you produced for me correct? Are the entities in this investigation correct? Is the breadth of the investigation correct? Did you cover all bases? Did you ask the right questions? Did you follow process? Right, those are things that are deterministic and you're dealing with nondeterministic systems. So you have to build them in such a fashion as to allow for determinism, but you still need someone to be your gatekeeper. Right? So that series of gates is not going to change. But I think as the SOC evolves, as we see the SOC become something that is more collaborative, that crosses departmental or even vertical boundaries, the

role for human in the loop changes from gatekeeper to collaborator, right? We have systems today that can run thousands of investigations. Someone still needs to look at them, and you don't look at all of them; that's sort of recreating the problem, right? So you surface those which are important and you reach out to an analyst to collaborate on whether those investigations are correct or not. So I think we'll see an evolution of human in the loop as well, where it becomes more collaborative and less gatekeeper. >> And how are you all measuring whether AI is actually improving or making our

investigations more efficient, or are we just shuffling around human cost to token cost? >> Well, that's Bryan. >> Yeah. You know, I'm a child of the '80s, so I used to pump a lot of tokens and quarters into video games, and I see people spending their tokens like, you know, drunken sailors. But no, I think definitely the business case needs to be scrutinized, because at least in our practice what we're seeing is two extremes. We're seeing people who are blindly running in and using it everywhere, building some really cool prototypes but not understanding the life cycle or total cost of ownership. That would be: we're shifting

cost around. So somebody's going to pay the bill, maybe not your department. And then there's the FOMO, or, well, sorry, the Bureau of No, where they're essentially trying to put prohibitions on these and they're stifling innovation. So what we try to find is that middle space of risk-reward optimization. Don't go for the big bang. Go for continuous improvement, and hold your stakeholders accountable for whatever business case they're bringing forward. So one of the easy ones in SOC, any kind of operations: if I can take 20 clicks to two clicks, that seems like it would be pretty easy, and if I can do

that across, you know, 200 use cases or 200 workflows... And you're not going to get full automation. For me the two clicks are the humans in the middle, or in the loop, wherever they happen to be in the process, because ultimately they can be held accountable. And we do know that these weird machines, they hallucinate, they have bias. There's what I call the dirty dozen in NIST AI 600-1, those 12 areas of harm. So you can just pick any one of those and look at your day-to-day operations in a SOC and theorize a scenario: what could happen if that

machine exhibits that behavior in a production environment? So if you make big bangs, if you're going to do everything at once, you're probably going to fail. But if you take something that is a pain point today, maybe it's the role that you haven't been able to fill, because maybe there's not a good worker pool out there or it doesn't pay well, or there are just some things that humans are not good at, like that portal fatigue and looking at lines and lines of logs. We're not good at that. So if you pick the right use case, make sure that you

have a way to measure success, whether it's clicks or token consumption, and then compare that against another way to do it. And I would also suggest not bringing in more new tools, because that's just more opportunity to fail. This is what we would always tell vendors: oh, you have something new? Great, which two things get to leave the environment? So if you actually have a use case that's going to take 20 clicks to two clicks and maybe get rid of something that's redundant, that becomes a pretty easy business case. >> Yeah. Love it.
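
[Illustration, not from the talk] The "20 clicks to two clicks" business case, weighed against token spend, can be put into rough numbers. The following is a minimal sketch under stated assumptions: the `WorkflowCost` class, the 30 seconds per click, the 60 USD hourly analyst rate, the 50,000 tokens per run, and the token price are all hypothetical placeholders, not figures given by the panel.

```python
from dataclasses import dataclass


@dataclass
class WorkflowCost:
    """Per-run cost inputs for one SOC workflow (all values hypothetical)."""
    clicks: int               # human interactions needed per run
    seconds_per_click: float  # assumed average analyst time per click
    tokens: int               # model tokens consumed per run (0 if fully manual)

    def cost_per_run(self, analyst_usd_per_hour: float, usd_per_1k_tokens: float) -> float:
        # Human cost: analyst time spent clicking, converted to hours.
        human = self.clicks * self.seconds_per_click / 3600 * analyst_usd_per_hour
        # Machine cost: token consumption at the assumed price.
        machine = self.tokens / 1000 * usd_per_1k_tokens
        return human + machine


def savings_per_run(before: WorkflowCost, after: WorkflowCost,
                    analyst_usd_per_hour: float, usd_per_1k_tokens: float) -> float:
    """Positive means the new workflow is cheaper per run; negative means cost only moved."""
    return (before.cost_per_run(analyst_usd_per_hour, usd_per_1k_tokens)
            - after.cost_per_run(analyst_usd_per_hour, usd_per_1k_tokens))


# Hypothetical example: 20 manual clicks versus 2 clicks plus 50k tokens.
manual = WorkflowCost(clicks=20, seconds_per_click=30.0, tokens=0)
assisted = WorkflowCost(clicks=2, seconds_per_click=30.0, tokens=50_000)
delta = savings_per_run(manual, assisted, analyst_usd_per_hour=60.0, usd_per_1k_tokens=0.01)
print(f"estimated savings per run: {delta:.2f} USD")
```

A negative result would be the "shuffling human cost to token cost" case raised in the question; multiplying the per-run figure by run volume across workflows gives a first-pass number to hold a business case against.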

>> Yeah. I think in terms of productivity gains, from my perspective, that's really easy to check. Success is measuring false positive reduction. Have we improved deduplication rates? Are we doing alert clustering and then deduping it? And as a team, are we running multiple parallel escalations? Are we calculating the load? I think those are all great measurements for productivity. The goal is to reduce cognitive load on the analyst and to expand coverage, right? It's not really alert closure rate. I don't like that metric. So that's not

the focus. In terms of tool efficiency, you also have to check now with the agentic systems: do we have the right tools in place to do that? And that from my perspective is also easy. Is the tool able to give you all the decision making it has already done and present that to the analyst? Is there an investigation memory that's already done? Is there alert clustering, like I said? And is all the decision making related to the case already presented to the analyst? Then yes, we have the right tool. And then if it's human-assisted decision making, the analyst has all the information in front of them to either say yes or to say no,

and then you look at your false positive or recall rates and see if that's good, and then you push that into prod and say the machine can make the decision. So for me it's two types of measurements. >> Yeah, I'm hearing: try focused experiments over all the tools. Pick good measures of success, start experimenting, and measure and iterate on the human in the loop toward full automation. We've touched on this quite a bit, but I'd love to dig in a little bit, starting with Dean. As AI is becoming more and more embedded in our SOC, what are some of the key places and architectural decisions

you're seeing made around how to gracefully insert a human in the loop? >> I think probably the most important place is the ability to structure data. In order to be an efficient SOC, you have to bring in data from multiple sources, right? It's not just security sources. It's not just what lives in your SIEM, but operational sources, whether it be GitHub or AWS data or SharePoint content, HR data, right? That all forms part of an investigation to give an understanding of the context in that organization. So I think one of the critical places for a human in the loop, not just in terms of am I gatekeeping the agent or am I

collaborating with the agent, is controlling and managing the data that is provided to the systems. In security we work with highly structured data, which is a good thing and a bad thing. The good thing is that if you have highly structured, highly typed data, you have an understanding of what it looks like. You have an understanding of what you can get from that data, which means you can very easily build context around it and articulate it in such a fashion that a system or a language model can take advantage of it. But it does require a lot of upfront work. So I think the

the human analyst um you know SLengineer slashcontext engineer at this point um becomes more critical you know at at the get-go and then throughout the life cycle of you know that organiz organization um you know products don't have context you know language models don't have context and by context I mean they don't understand the organization that they're in they might understand the vertical like I don't know it's whatever vertical but they don't understand the nuance of an organization what is a critical you know incident in one organization might be policy violation in another. So that is what has to be brought to the system by that that that team right. So whatever the shape it takes um you've got to codify

that in some form and make it available to those systems. So I think at least for the near term you'll see an increase in the importance of data management, and that means not just understanding what data is available to you but understanding the business context of that data: what is critical, what is not critical, should you escalate something if you see whatever artifact, or is it just business as usual? >> Yeah. So there's a lot of context to be added. We can't just let loose the agents yet. We have to get our data well structured, architect these systems to be ready for handling and

knowing what is business critical. What do these alerts mean? Do they actually matter? And I'd love to hear Swathi compare and contrast a little bit what kinds of alerts and tasks you're seeing being able to be fully automated and where you are needing that human decisioning. >> Yeah, I think humans will continue to be in the loop for, say, escalation, right? Like, hey, does Swathi need to know about this specific incident? Do we need to send an executive report out? Does Swathi have to inform the rest of the executive team? There's going to be a human in the loop for high-risk prioritization. Like, there are multiple investigations happening,

here is the attack surface, here are this many systems affected; how does that work with the rest of the investigations that are coming in? So for high-risk prioritization a human needs to be in the loop. If there are any compliance-based decisions, yeah, you can completely automate, like, here are the different frameworks that are going to be affected by this incident, or here is the specific control that's missing. But then what do you do with that data? I expect a human to be in the loop. I definitely see automation completely handling enrichment. If there is, you

know, alert decoration, which so many tools have tried so hard for so many years to do, there's great news there: alert enrichment and decoration can absolutely happen. If there are telemetry gaps, it's going to be really easy for us as the defense team to say here are the gaps that exist, here is where we don't have visibility, right? So every time you're doing an investigation and you need any kind of change analysis: here is what's happening at the infra layer, here is what's happening at the app layer, and here is what's happening at the user layer. I think it's getting really

amazing to put all of that data together, and you have your chain of events. I think that's kind of exciting. And then any kind of basic blocking, sandboxing, segmenting the specific machine, all of that is going to be completely automated, and that's really exciting. >> We touched a little bit earlier on some of the pitfalls with LLMs, like hallucination and non-deterministic outputs. I remember some of my early days working with LLMs or just chatting with chatbots, and you'd get these really funny outputs, like: how many Rs in strawberry? Oh, it's two. Like, no, no, I don't think so. Oh, yes, you're absolutely right. It's five.

You're like, oh my gosh, what? I don't think the future is here yet. But now, fast forward, and in my experience, and I think what many others are experiencing as well, these tools are becoming more and more powerful, and we're seeing a lot fewer of these silly mistakes. They're really getting it right more than they're getting it wrong. But there is still hallucination risk and the risk of prompt injection. So let's talk a little bit more about that. What architectural patterns help reduce the hallucination risk? What are you seeing work? I'll start with Bryan. >> Yeah, well, picking the right model. And of course that's not just

large language models, which I think are great for being creative and maybe inspiring, but maybe not so much for redundant tasks. So we've done a lot of work on small language models that are more deterministic but focused. So I have, like, a tracker that understands everything about IP and traversing those networks but doesn't really know anything else. It doesn't necessarily have the primitives to tell me how to make ricin or suggest I go hurt myself. And so those types of guardrails, or anything that makes it easy to do the right thing and hard to do the wrong thing from an architecture standpoint, is really, really important. And we're getting

opportunities now that are kind of once in a generation to build the SOC of the future today. And, you know, the converged SOC, which is cyber and physical. Right now there are some fairly perverse business models out there. EPS, anybody familiar with that? How you actually pay for signals? So there's a disincentive for you to consume more signals. And if you're asking these weird machines to help you co-pilot or make decisions on their own, they don't have the right information or context because you don't have that data. So we have this concept of total telemetry. We want more data. So from a business standpoint, how do you create the perfect storm

to allow that to happen? You've got to get rid of some stuff. And so you can't be the data hoarders; in most existing organizations it's kind of the land of 10,000 data lakes. So there's this concept of a federated data fabric, where you only have what you need when you need it, but you can get as much as you can. And I know that we were talking earlier about shared context and that being good, especially with MCP. I mean, there's a whole bunch of different agent-to-agent protocols, but I'll pick on MCP right now. You know, you've got your toolboxes, that's great,

but some of the threat catalogs that come from that are nuanced. So you can actually poison that well, and you can have mass hallucinations, and when you start stringing either prompts together or API to API... I mean, certainly alert fatigue is bad, but API sprawl is ridiculous. And so in this next-generation SOC of the future today: more data. But how do you get more data to have what I'll call digital doppelgangers, even low-fidelity digital twins that you can start to build from these signals? Imagine a world where you can test every permutation, or millions of them, against the change that's going to go

into operations. So I'm kind of moving up the pipeline a little bit, maybe being more proactive. But no, the guardrails... Honestly, governance, risk, and compliance is awesome. I was a risk officer in compliance and I love effective governance, but threat and control are so much more exciting, and they're quantitative. And so I prefer that: if you have a good threat catalog, then you can start to understand what you need as far as controls to have system confidence, and there's an RFC on that. So if anybody's interested, hit me up after the panel. >> I think the systems now are shifting to be API-first ingestion, right? Like

that's going to be the big architectural shift. And then of course streaming pipelines and near-real-time detections have to be the norm moving forward. And we talk a lot about structured data as input, which is true. And now the output also needs to be machine-readable, right? Whatever the system is putting out is now going to be fed into a different component of the machine. So that's happening as well. Now agents are also going to be treated as consumers, just like humans are. Hence the output is going to be important. The other big change is going to be kind

of, you know, how we've been operating in structured telemetry. So we have, like, EDR, say, IM, you know, structured telemetry will of course result in reliable reasoning, which is what we want for level one to be completely automated. So now, because of this normalization and the increase in reasoning, there's this multi-source correlation, right? Having this many sources to correlate is a positive. That way we are not relying too much just on EDR telemetry, but we are actually looking at various sources, and that increases the reasoning. The last thing I will say, I don't want to say, hey, this is how it's

shifted, but from my personal experience running the AI SOC, it takes some time to train the model, right? It'll have this global context. So if you have a specific vendor, it'll have the global context; say, it'll get threat intelligence from various sources and all of that. But you have to feed it your localized business context. If you want to check for an anomaly in your payment system, what's the actual payment workflow in your company, right? And what does an anomaly look like? So it took us about 10 to 15 months to actually

train the model to go toward human-assisted and completely AI-led, system-led decisions. It took us 10 to 15 months to get there. So that's the other big change. >> Yeah, I think we've got a few minutes left before we open it to audience participation. So, let's see. I want to very quickly zoom out a little bit beyond just the SOC workflow in the detection and response life cycle. Where are we seeing other accelerations in writing detections, rule generation, validation, tuning, all these things? Want to maybe start with you, Dean? >> Yeah, certainly. So at least in the security domain, detection engineering has been a discipline in and of itself, right?

so with the advent of you know various tooling whether it be you know whatever type sim you have you have folks that have spent a wealth of their career learning and understanding the data that's available to them how it's been normalized what kind of queries they can write they end up learning one query language then another query language um so I think um applying language models to that space is it's a force multiplier without a doubt um it gives gives you the opportunity not just to for a junior individual to come in and understand what that query means. And I'm sure you guys have seen it. Some queries are like a dissertation in themselves. Um so you

know, the ability to explain a query to someone who's new in their career is incredibly important. The ability to take a query and say, okay, I need variations of this, I'm dealing with different data sources, different timelines. I think that's an incredible opportunity in a space where you don't need tooling for that. You have enough models that are open source and available out there to do this for you. And so I think we'll see, and we are seeing, a lot more activity in that space, where detection engineering is not going away. I think it's just becoming a lot more efficient. Folks are coming up with queries that

they didn't previously come up with. The language model has a broader understanding of the data. We talk about data architectures. So if you can provide data, or the structures and the shape of that data, to models, you're giving them a blueprint to come up with queries that provide for highly efficient types of hunting. And not just one and done, but to be able to go through and take data and say, from this data I need to do another step, or another two or three steps, versus putting that initial query result in front of an analyst and saying, okay, read through this, figure out

what the critical piece is. So what we are seeing, and we develop for that as well, is the ability for detection engineering not to go away but to become even more powerful, because the knowledge that those analysts have is something that you can't codify overnight. >> Swathi, I think you've done some interesting things here as well. >> Yeah, I think there are two areas: one area where I've seen mixed results and another area with really strong results. Specifically for fraud and bot detection, we've been using ML in that space for a while. So in some ways,

the fraud and bot detection space is a bit more ML-forward, and there we've seen real headway, like huge improvements in terms of defining user risk efficacy or even, say, device risk. We are able to do that successfully using models there. The other area, where in general I would love to see industry improvement, and us push for it too, is writing multi-stage detection logic, right? To Dean's point, if it's, hey, here is a specific simple rule, that's easy. We won't see a whole lot of human-led pioneering there. But in terms of, like,

hey, here is a multi-stage attack that the red team has done, chaining things together: there's an 0-day, and then there is another thing. Putting all of that together, we need to write a really complex detection for that. There, I haven't seen a machine be able to just spit it out and say, here it is. But each stage of detection writing, to Dean's point, has gotten faster, right? That in itself is a lot of efficiency gain. >> Yeah, absolutely. So we're seeing writing simple detections from threat intel feeds, more global things, speeding up really quickly, but with these multi-stage, more complex attacks, we're not quite there.
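
To make the multi-stage point concrete, here is a minimal sketch (not anything the panelists showed) of why chained detections are harder than single rules: each stage can be a simple rule, but the detection only fires when hits for the same host occur in order within a time window. The `Hit` type, the stage names, and the `find_chains` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Hit:
    """One match of a simple, single-stage rule (hypothetical shape)."""
    host: str
    stage: str
    ts: float  # event time in epoch seconds


def _first_chain(host_hits: List[Hit], stages: List[str], window: float) -> Optional[List[Hit]]:
    """Find the first ordered chain covering every stage within `window` seconds.

    `host_hits` must already be sorted by timestamp.
    """
    for i, start in enumerate(host_hits):
        if start.stage != stages[0]:
            continue
        chain = [start]
        for hit in host_hits[i + 1:]:
            if hit.ts - start.ts > window:
                break  # later hits are outside the correlation window
            if len(chain) < len(stages) and hit.stage == stages[len(chain)]:
                chain.append(hit)
        if len(chain) == len(stages):
            return chain
    return None


def find_chains(hits: Iterable[Hit], stages: List[str], window: float) -> List[List[Hit]]:
    """Return one complete chain per host that went through all stages in order."""
    if not stages:
        return []
    by_host: Dict[str, List[Hit]] = {}
    for hit in sorted(hits, key=lambda h: h.ts):
        by_host.setdefault(hit.host, []).append(hit)
    chains = []
    for host_hits in by_host.values():
        chain = _first_chain(host_hits, stages, window)
        if chain is not None:
            chains.append(chain)
    return chains


# Illustrative data only: web01 completes the chain, db02 does not.
STAGES = ["initial_access", "privilege_escalation", "lateral_movement"]
HITS = [
    Hit("web01", "initial_access", 0),
    Hit("db02", "initial_access", 50),
    Hit("web01", "privilege_escalation", 120),
    Hit("web01", "lateral_movement", 400),
    Hit("db02", "lateral_movement", 9000),
]
CHAINS = find_chains(HITS, STAGES, window=900)
print([chain[0].host for chain in CHAINS])
```

In this sketch, the individual stage rules are the part a model can help draft quickly; the choice of stages, ordering, and window is the analyst knowledge the panel says still has to come from people.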

>> Yeah. >> All right. And I'm going to shift over. We've got about 10 minutes and we've got some really interesting questions here. I love this one. I'm going to start with a spicy one. What do you think the outcome of the first major breach blamed on an AI SOC triage miss will be? So similar to Target missing the FireEye alert, but now it's the agentic SOC that has missed it and the breach has happened. What happens? >> I'll give it a try. We accept fallibility in analysts, right? They're going to miss something. They're dealing with a large workload. You know,

they get tired, something slips through, they don't escalate it. Or it's just an innocuous, I don't know, application consent grant, right? You look at it, you move on, but it's a precursor to something a little more nefarious. So things get missed. I don't think, at least for the near term, that an AI SOC is somehow going to provide you with 100% true positive or false positive analysis, right? So at least for the foreseeable future, you have to accept that there can be a miss, which is why we spoke about having checks and balances. And ideally, if an analyst is not overwhelmed and they're looking at

that which is important, or things that they wouldn't have looked at previously, ideally something like that would be caught. I don't know who you would blame if it does lead to a breach because they missed something. I don't know, like, erase... >> Let the lawyers sort that out. >> Erase the agent, you know, rewrite his prompt. Yeah, I guess it could happen. But I think it's the same way that the analyst team gets blamed for missing something in the current state of things. The AI SOC is not a collaborator. I think it'll end up there. The tooling is just that: it's a tool, right? So it still

requires a level of responsibility from the analysts, and for them not to assume that what is provided to them is ground truth. >> So yeah, I think it's a great question because it's absolutely going to happen. And I'd also say, okay, then why didn't you catch that one person who applied for the job and then pretended to be somebody else on camera, right? There are also recruiting scams happening, so I think it's a similar argument there. I will say that in terms of ownership, of who to blame or who's responsible, the security leadership is still responsible, right? So I kind of

see the SOC kind of evolving. Even in general, in a defense and product organization, there is a control plane, and in this case there is a control plane miss, something happened. But the governance layer and the decision authority still exist. And whoever is the governance layer and whoever is the decision authority, that's who is going to take responsibility to fix the system. Also, maybe a couple of weeks ago, I was on a panel and somebody asked a question like, hey, as leaders we manage people, as leaders you serve your team. Now, are we far from the future where you're

managing agents? I think that'll also become part of the responsibility. >> I think you need to have a creative threat catalog. So we have two classes: one's called "Lying to Robots" and the other is called "Hiding from Robots." And presumably, if they're going to ingest things, you could poison them via poison pills. So ensure that you've thought about those. What I call it is POO, point of origin hacking. Think of the craziest one. You can distill the world of possibilities down to TTPs, thanks MITRE for that. So now we do have a way to create this

simulation. So, I would suggest that it'll probably be something, especially when we reach that point where everything is being reviewed by an agent or an LLM or some type of weird machine ahead of the human, and it's those signals that they can't see. And if you want to get into a whole new area, cognitive security, if anybody's interested in that: all of those attacks that work on humans actually work on these weird machines, too. So just social engineer your systems and see if you can get them to do stuff. Now, the irony of that is you're using weird machines to

test weird machines. So, that doesn't end well. I think it's probably not a good use of your tokens. >> So, I'll add one more thing. In the same way, if there is a breach, any organization should be doing an after-action report, right? That's what lessons learned are all about, right? You take those lessons learned and you bring them back into the team. Either your workflows change, your playbooks change, or your guidelines change for the analyst, so that the next time something happens, they're aware of it and they can follow up on it. Let's assume that the agent, or whatever tooling it is, does miss something. The opportunity to bring

that data back and codify it means that, in an ideal world, it shouldn't miss it ever again, and you do it once, right? You're not requiring the analyst to be on point 24/7. So I think that's one advantage. If there is a miss, you can teach it, if you will, that context so that it doesn't happen again. >> We're almost out of time. I'm going to take one more question to round us out, which is: how do we need to influence other security and engineering orgs to make the SOC more effective? What changes are urgent to truly enable the AI-powered SOC? I'll quickly add my two cents, and I think we just have to start

experimenting responsibly. We're not going to throw AI agents at everything and think it's going to be magic. We have to be responsible. There is still accountability, but to the points earlier from the panel, we start something, we measure, and we iterate. And I'd love to hear your thoughts here, Swathi. >> I think, maybe to answer it a bit differently, we have to be more autonomous, and that's a win for the SOC. So if we can reduce dependency on other teams, then that's really a win. So I think a really tactical point is getting the right SOPs from other teams: okay, these are the exact actions that need to happen. Then we actually

don't need your team. We are actually going to make this easier for you, and then the SOC's going to take it and automate it, and then that specific problem you don't have to worry about. I think that would be the selling point and a win. >> Do we need to wrap up now or can we... Okay, does anybody else want to hop in on that one? >> I would just say, let's build trustworthy and responsible AI systems, please. I know that's a difficult subject. It means something different to everybody, but have a position on it and let that be your North Star, because I don't want to work with

unethical machines. >> I'll take a bit of a different approach, maybe. Security operations tends to be a centralized entity. Security is that element of the business that's always been difficult to quantify. And I think, as we move forward, we need to see the boundaries between security and different departments dissolve. So we're talking about breaches, we're talking about, oh, there's an attack, and so forth, but there are a lot of other investigations that happen, right? Whether it be a user's last day, whether it be just

anomalous activity. All these investigations are oftentimes not driven by security but by different departments. So build the organization, evolve the organization, and use the technologies that are available to us, language models and the tooling around those language models, to minimize the friction between, I don't know, HR asking about a user's last day and the report that is produced. In the future, there should be no reason why HR can't simply ask your systems for that information. Certainly with checks and balances and controls and all that in place, but they shouldn't have to go through a ticketing system and wait two weeks for an individual to return them a report. So I

think that's an opportunity for evolution in security operations. >> Right. I think we're just about at time. So thank you all for joining us, and thank you to our panel members. >> Fantastic. Big round of applause for our panel. Thank you, Nicole, Bryan, Swathi, and Dean.
