
A Trusted Teammate: Helping AI Agents to Help You

BSides Seattle 2025 · 22:56 · 66 views · Published 2025-06 · Watch on YouTube ↗
About this talk
How we work with software will fundamentally change, and knowing how to manage and mentor AI systems will become a critical skill for cybersecurity professionals. This session will cover why humans will be needed as reviewers and managers of AI agents, why human-in-the-loop design is critical for these systems, and how cybersecurity roles will change. Just like new junior analysts, AI agents are smart and capable but need guidance and feedback to get better at their jobs. Security professionals who learn how to manage AI agents will have a major advantage in the evolving SOC. This session will cover how to:

- Benefit from the automation of repetitive tasks by focusing on high-value cybersecurity projects
- Manage AI agents by onboarding them with documentation and providing ongoing feedback
- Look for human-in-the-loop design that helps AI agents learn on the job
- Recognize and correct automation bias to maintain trust in AI-driven security decisions

Speakers

Bri Hatch, Founding Engineer, Dropzone AI
Bri Hatch is an adamant open-source advocate and security buff, author of Hacking Linux Exposed, Building Linux VPNs, and numerous online articles on the topics of Linux, security, and coding. He has been securing and breaking into systems since before he traded in his Apple II+ for his first Unix system. He does not know how to use a mouse.

Tyson Supasatit, Product Marketing, Dropzone AI
Tyson grew up in the Seattle area. He worked at the UW computer lab when they still used PINE and got his first job writing news summaries for the Association for Computing Machinery's weekly email newsletter. He's worked at several cybersecurity vendors over the past decade.
Transcript

Okay, Tyson, we're live. All right, thank you for coming to our talk, A Trusted Teammate: Helping AI Agents to Help You. I had to come up with a clever title, but a simpler title for this talk would be "managing AI agents." Basically, what we're going to cover today is: what are AI agents, thinking about routine versus strategic cybersecurity tasks, and how you should be thinking about AI agents. They're productivity tools, things that are going to help you be more effective. Again, helping AI to help you, and then things that

you should be preparing for in the future. So, it's a pretty short talk, 25 minutes, and we'll try to make time for Q&A. First, before we get to that, a little bit about us. My name is Tyson Supasatit. Hey, Jeff. I grew up in Auburn. Is anybody else from the 253? Yes? Okay, got to represent the 253. I did have a job in IT at some point, in the computer lab at the University of Washington, but today I work at Dropzone AI, a cybersecurity AI company, with Bri. I've been working with Bri for over a decade. You want to introduce yourself? Yeah. My

name is Bri Hatch. Not sure if this is on. I'll just talk loudly if it doesn't seem to be working. It is working? Great. I'm Bri Hatch. I started way back in the days of the command line, exclusively the command line. My security career pretty much started when they kicked me off the computer systems for hacking, and then they hired me for my security skills. After that, I worked in contracting and pen testing, and took a stint at Google as an SRE, where I sneakily embedded some Easter eggs inside the HTTP headers of very high-QPS services. And now I'm one of the founding engineers at Drop

zone, along with Tyson. Cool. So, just to level set: how many people feel 90% confident about what we're talking about when we say "AI agents"? Can I see a show of hands? What about 50%? Okay. So, AI agents. What we're talking about today is not AGI or AI superintelligence. We're just talking about tools, okay? They're productivity tools. What's interesting about AI agents is that they use LLM technology, and what LLM technology brings to the table is the ability to quote-unquote reason. So it can automate tasks that weren't previously able to be automated, tasks that require

reasoning capabilities. Now, there are a lot of drawbacks to LLMs, and there are a lot of reasons to be ambivalent or uncertain about the impact, but they can help you quite a bit. The difference between an AI agent and just a chatbot or a bare LLM is that it's autonomous. It's using LLM technology on the back end, but it's chaining calls together: it's breaking up a large task into smaller, discrete tasks, and then it's able to do some pretty interesting stuff. They don't have contextual knowledge, and they can make mistakes. It's not a deterministic system where you enter in one value and

you're going to get another predetermined value, something you can anticipate, coming out the other end. But it's "look ma, no hands": they just work on their own. They don't need you to be there pressing the button. You just tell it, "Hey, this is the thing I want you to do," and it's going to go and do that thing. AI agents are going to impact every white-collar job, every knowledge-worker job, including cybersecurity. What we are showing here on this slide is the types of cyber

security tasks that are routine and repetitive, and probably not what you got into the field to do all the time. And then there are strategic, high-value tasks that actually make you very valuable to your organization. So, Bri, which of these do you not like doing, and which of these do you actually like doing? Yeah. The things that really don't bring me joy are the drinking-bird things. Oh look, it's an alert, it's boring. It's an alert, it's boring. All those things where I feel like I should be able to write playbooks that can easily run this

through, anything where I could say, "Hey, intern, you came off the street, here's a little bit of instruction," anything that doesn't require somebody with good thinking skills. The difference is when, let's say, you find an alert that's real and actionable. Everyone wants to jump in. Everyone wants to look at the thing, because it's new, or maybe because it's high-value. We got an alert that somebody's laptop is definitely running malware. Maybe you should check whether it was the CEO giving a presentation in front of the board before you just manually turn it off. That could be a little bit of a

career-limiting move for you. That's one of those things where you don't trust the person off the street. You can decide how you're going to tackle the more interesting and more fun work. I like seeing things that are actionable, that are valuable. I like problems, not because I like having things to solve, but because I like to use my brain, and there's so much out there right now that is trivial, that should be automatable. Yeah. So we can look at this list: updating user accounts; testing, deploying, and validating patches; monitoring for vulnerabilities. All of these things are routine, and after a while it gets

very repetitive. Fatigue can start to come in. On the other hand, the strategic things are more like proactive security, the things that are actually helping you improve your security posture rather than just reactive firefighting. These turn out to be the less automatable things, the things AI is not as well suited for, which is great. Ideally, it's going to free you up to do more high-value tasks. Also, with AI agents, we've seen this movie before. Bri, do you want to talk a little bit about automation?

Yeah. So, Tyson asked me to give an example of something I've done in the past, and I'll warn you right now, this is an example of what you shouldn't do, right? We've all kind of been there before. You've probably heard of Larry Wall, creator of Perl. One of his big things was that good programmers have three characteristics: hubris, laziness, and impatience. I want to get stuff done fast. I want to get it done lazily. Laziness might mean doing a whole lot of work to automate a task that only takes up a little bit of time, but takes up that time day after day after

day. You want to do something that makes it easier for you, to have systems work for you rather than you working for the systems. So, decades back, I had the Unix side of the house and there was a Windows side of the house. Everyone's heard of Patch Tuesday in this room, I assume; at Microsoft that's probably always been a thing. So my work was to upgrade all the Unix machines, and they upgraded all the Windows machines, and they had a team of ten and I had a team of me. While I thought it might be convenient to just upgrade

them every Tuesday at 12 o'clock from cron, that would be great, that would be awesome, unless we decide we're actually going to do it on Wednesday that week, and then I have to quickly go update the cron jobs. So I would sit down the same as everybody else, and I would log into the machines and run apt update, yum update, all the standard stuff, and then I would automate a little bit with SSH. But for some of these machines, I kid you not, I would literally have to dial up over a modem in order to get in, behind

multiple layers of firewall. So I wanted some sort of RPC mechanism to trigger these on demand. What's the most logical idea if you don't have direct TCP connectivity? If you said DNS, that's the wrong answer, and that's what I did. Why not? It goes through all the firewalls, it's always available, it can be retried. So I wrote a cron job that would check for a DNS record based on the machine's hostname, going to a DNS server of my choosing, and when that record was ready, it would say, "Oh, it's time to update," and do it all itself. And when it's done, it has to, of course, report back. So I needed a reliable

mechanism that goes through all the firewalls, so: use DNS. And so there's a DNS reporting-back mechanism. If anyone is shaking their head thinking this is a horrible idea, congratulations, you're in the right room. That's the right answer. But again: laziness, automation, and the wrong amount of hubris. These are the kinds of things you do when you're early in your career, when you prioritize exciting, fun, ingenious, no-one-else-has-thought-of-it ideas. Maybe they thought about it and realized it was a bad idea. But these are the kinds of things you do when you're early on. But again, that patching process was not something that sparked joy for you. The patching

was not the thing; writing something cool, yeah, that did it for me. And I could sit back and say, "Oh yeah, I'm working." Play Doom, play Doom, play Doom. No, totally working on upgrading everybody. How you doing, Windows team? You know, different levels of effort once the kickoff happened. Ideally, you are going to get to a point where you are thinking about AI in this way. It's going to automate the boring stuff, automate the things that you don't want to be doing. It's going to require some specific skills, which is the meat of what we're presenting today. Because they're not deterministic systems where you enter in

a value and get something expected out, you actually need to think about these differently than you do normal software applications. It's helpful to think about them as a new hire on your team. I'm not trying to anthropomorphize these, like they're human or have consciousness or anything; it's just that this framing of the system's behavior will help you interact with it more effectively. So what do I mean by this? You need to onboard these AI agents. We're talking primarily about commercial off-the-shelf AI agents, but you could build them as well; that would just require a lot more management. But, like,

Microsoft a couple of weeks ago announced their AI agents, and last week Google security announced their AI agents as well, and ServiceNow is doing AI agents, not in cybersecurity, but these things are coming. And then you have a whole bunch of vertical, cybersecurity-specific AI agents, like what Dropzone is doing, tackling different parts: application security, alert triage, pen testing. There are a lot of startups out there creating AI agents. And what you need to do is onboard these. They're going to work out of the box, just like a new hire on your

team will be able to do some things out of the box. They say they graduated with a four-year degree and they have some certs; they know how to do certain things. But they don't know the specifics of your organization. They don't know the IP ranges that are used for test and development. They don't know that Joe, who sits over there, sometimes does security testing on his machine, and that's okay. So you need to onboard them the same way you would a new hire. Give them access to internal wikis, give them any documentation that you have, runbooks or workbooks that you give to your

human security engineers. And then, besides onboarding, you need to continually work with them, giving them on-the-job feedback. So it's a different way of thinking about interacting with software systems. There's a term called human-in-the-loop design. When you're working with an AI or an agentic system, you need to be looking for four attributes that I think you could classify under human-in-the-loop design. First, it needs to give you a trust-but-verify approach: it should show its work, and when you need to, you can go in and verify it. You

can follow its reasoning: okay, you came to this conclusion; what questions did you ask, and what were the findings? What evidence did you have? Which brings up the next attribute, the evidence locker. You need to be able to see the raw result, like the JSON that came back from the query it ran. Then, you should be able to give it easy, natural-language feedback. You shouldn't have to go in and manage the prompt logic and inspect that. You should be able to just talk to it, like, "Hey, Joe sometimes does security testing. That's okay. That's expected."
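A minimal sketch of what this kind of natural-language feedback can look like under the hood, assuming a simple keyword-matched note store. The names here are hypothetical illustrations, not Dropzone's actual implementation; real products use embedding-based retrieval rather than word overlap:

```python
import re

def _words(text: str) -> set:
    """Lowercase word tokens with punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

class FeedbackStore:
    """Collects analyst feedback as plain-text notes the agent can recall."""

    def __init__(self):
        self.notes = []

    def add(self, note: str) -> None:
        self.notes.append(note)

    def relevant(self, alert_text: str) -> list:
        # Naive relevance: keep notes sharing at least one word with the alert.
        alert_words = _words(alert_text)
        return [n for n in self.notes if alert_words & _words(n)]

def build_prompt(alert_text: str, store: FeedbackStore) -> str:
    """Prepend organization-specific context to the triage prompt."""
    context = "\n".join(store.relevant(alert_text)) or "(no prior feedback)"
    return f"Org context:\n{context}\n\nTriage this alert:\n{alert_text}"

store = FeedbackStore()
store.add("Joe sometimes does authorized security testing from his machine.")
store.add("Nightly batch jobs run at 02:00 and generate bursts of traffic.")

prompt = build_prompt("Port scan detected from Joe's machine", store)
```

The point is the workflow, not the matching: feedback stated once in plain language becomes context the agent consults on the next similar alert, instead of an analyst editing prompt logic by hand.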

Or, "We have these regular batch jobs that run at these periodic times." That should be part of the design of the AI agent. And then lastly, and this is really important: remember, we talked about how one of the drawbacks of AI is that it just doesn't know the things it doesn't know. If something is not included in its training data set, you can't blame it. That's what retrieval-augmented generation is for. It should have a local data store where you can add contextual details. And if you think about this, it sounds like we're talking about something new, but this is what you do every time you have a new hire, you have

an intern, a co-worker, whether you're their manager or just a peer. What's the first thing they do? They say, "What is system 12783? No idea. Do I have to worry about this network compared to this network?" There's institutional knowledge that you bring to these people when you're doing code reviews. That's the same thing as training: well, there are 25 different ways we could do this particular thing, in Ansible, in Python, whatever; this is how we do it here. Let's make sure we all do our things the same way. It's the same process teaching the AI agents as it is teaching your

co-workers. And when you do that, everyone works similarly and together, and there are fewer surprises, fewer unexpected situations. That is what you do for every human, and we're just suggesting you think about it also for the AI agents you're going to be bringing in. Yeah. So, human-in-the-loop design is something that you're going to hear more about. AI is replacing tasks, and if your job is only doing those tasks, yeah, your job might be at risk. That's why you need to shift focus to those strategic, high-value tasks. But AI also needs to be managed. It's not like AI just runs on

autopilot and works great by itself without any human review or management. You are still going to be very important. Humans need to be the arbiters of good and bad output from the AI systems. And this is actually going to impact what your future resume looks like. You need to be thinking about showing, measuring, how you improved the accuracy and performance of the AI systems. You should be able to say, okay, it got 95% of the cases correct out of the box, and I got it up to 98%,

for example. You need to take credit for the results of the AI agents that you are responsible for. And so, what I am proposing, and I hope this is not being pedantic: we always talk about ourselves as users of software, and I think with AI agents we are no longer users. We are now reviewers and managers of the software, because a user uses a tool to do the work, but now you're not doing the work; you're reviewing the work the AI agent has done. So it's a shift in mindset. Again, I want to come back to the routine-and-repetitive versus strategic-and-high-value

tasks. Pick the ones that are on the right side, right? Those are the ones you should be advocating for: I need to spend more time on this. If we are going to bring this AI agent technology into our organization and integrate it into our processes, that means I need to spend less time on the left-side stuff and more time on the right-side stuff. So, final steps, things to take away: study examples. Like I mentioned, two weeks ago Microsoft made an AI agent announcement for cybersecurity, and Google did as well. On the

non-cybersecurity side, Salesforce and ServiceNow are doing a lot, and then there are all these vertical-specific AI agents. Experiment with GPTs, build your own GPTs, and just learn concepts like few-shot prompting (giving it examples), retrieval-augmented generation, and chain of thought. Not that you have to build these things, but they're helpful for managing these AI agents better, because you kind of know how they work, or how they behave. Anything you wanted to add? No, I think it's a good time for Q&A. [Audience question] The question was about how specific it needs to get to the task that you're trying to train it

to do. What I'm talking about is mostly commercial off-the-shelf stuff. These should be AI agents that are built to do a specific thing, like automated pen testing, or alert triage, or prioritizing application vulnerabilities. So that's the vendor side, but it's the same with any vendor: they have to build their solution to meet their whole customer base, but each customer is going to have some knobs and dials to dial it in for their particular organization. And that's where human-in-the-loop design is going to be super important. Yes, sir. [Audience question] What's the advisability of using AI to generate

a process for the AI itself? Yeah, like, "How should I be interacting with you? Can you create a good prompt for this?" Yeah, totally. I think using AI to help you use AI also works. Yes, sir. [Audience question] Have you been using, or baking in, Model Context Protocol features so the agents can use external tools? I know that at least for our company's solution, we do give it skills, so it has skills modules. So yeah, it's an agentic system; it's not just

one LLM. First a planner comes in and decomposes the task, and then it calls the different specialists who know how to do the different things. And I would say the answer is similar to the previous one: if it's working well, then yes. How much should we do before we stop? Train it as much as you need and no more. Think back to the keynote on risk: do what you need, and don't go further than that. And we're hitting the zero-minute mark here, so thank you, Jesse, and thank you, everybody, for taking in our talk. Really appreciate it.
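The planner-and-specialists pattern Bri describes in that last answer can be sketched roughly like this. The subtask names and handlers are illustrative stand-ins, not any product's actual API; in a real agent, each handler would be an LLM call with its own tools:

```python
# Specialist handlers: each "knows how to do" one discrete subtask.
def lookup_ip(alert):
    return f"IP reputation checked for {alert['src_ip']}"

def check_user(alert):
    return f"User activity reviewed for {alert['user']}"

# Registry mapping subtask names to specialists.
SPECIALISTS = {"lookup_ip": lookup_ip, "check_user": check_user}

def plan(alert):
    """Planner step: decompose the investigation into discrete subtasks."""
    steps = []
    if "src_ip" in alert:
        steps.append("lookup_ip")
    if "user" in alert:
        steps.append("check_user")
    return steps

def investigate(alert):
    """Run each planned subtask and collect findings for human review."""
    return [SPECIALISTS[step](alert) for step in plan(alert)]

findings = investigate({"src_ip": "203.0.113.7", "user": "joe"})
```

The design choice worth noting is the separation: the planner only decides which steps apply, and the specialists only execute their one step, which is what lets a human reviewer follow the reasoning step by step afterward.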