
>> Hello mic check.
Good morning, all. I hope you all enjoyed yesterday's sessions, and welcome to everyone attending today's sessions. I'm Vina Haragali, and I am honored to introduce today's speaker, Zach Meade. Zach is the founder of Harbor Edge Consulting LLC, where he focuses on offensive security consulting and helping organizations strengthen their overall security posture. With over seven years of experience in the security world, he has worked across red teaming, penetration testing, and advisory roles to help organizations better understand and defend against modern threats. Zach is passionate about bridging the gap between offensive techniques and defensive strategies, and he enjoys sharing practical insights with the broader security community. So, welcome, Zach. Actually,
give me one second. I just realized I don't have my notes set up.
Okay, there we go. All right, so I'm Zach Meade. I opened an offensive security consulting firm last year, and I focus mostly on pentesting networks and web applications. This talk is going to be mostly about password cracking and Active Directory. I love password cracking; at this point I think I've done it across 12 or 14 organizations. Passwords are, of course, nothing new. They've been around for a little over 60 years; they were first invented at MIT in the 1960s, and pretty much as soon as they were invented, they started getting attacked. Over time the attacks got more complex, and the defenses have changed to keep up. The biggest change is that exposure to the internet increases the attack surface dramatically, to the point where Microsoft says there are over 7,000 identity attacks every second. So even after 60 years, passwords are still really the key to ecosystems and to defending organizations, and knowing how to attack and defend them is crucial. Of course, attacks and defenses in this area change constantly, and it's this massive game of cat and mouse trying to keep up.

Okay, so a brief overview of the technologies and methods we're going to be talking about today. Password hashing is a one-way process: you can't undo a hashed password. The way you crack one is that you take a guess, run it through the hashing algorithm, and check whether the resulting hash matches the hash you're looking for. The point is that this makes it very computationally intense to crack passwords, which is, you know, the point, but it's pretty hardware-intense too.

There are a couple of different ways that passwords in Active Directory are stored: you have LM hashes and you have NT hashes. LM hashes are really old; they've been around since 1987. They can store passwords up to 14 characters, but importantly, Microsoft kind of messed up the implementation. They don't actually store 14 characters; they split the password into two seven-character chunks, which makes it dramatically easier to crack. You'd think something that's been deprecated this long would be gone from Active Directory environments (it was supposedly no longer in use as of 2007), but there sure are LM hashes still out there in the wild. I've come across them five or six times, and if you come across an LM hash, you are pretty much guaranteed to be able to crack it.

NTLM hashes, or NT hashes, are the modern-ish way of storing them. They're also very old, from 1993, but they store the password as one whole contiguous chunk, which is better, though they're still somewhat easy to crack. This is the way passwords are stored in hash format on domain controllers as well as on local machines. Importantly, though, this is rarely the way the hashes are transmitted over the wire; that usually happens over Kerberos or NetNTLMv2, so if you're snooping on a Windows network, that's what you're usually going to see. Those bottom two are much more cryptographically complex and take a lot longer to crack, which is a good thing, because NTLM itself isn't all that complicated. Lots of things are in the process of moving to Kerberos authentication, but many Active Directory services still allow and accept those sorts of legacy authentication protocols.

Okay, I'll go into each of these in a little more detail in a second, but just a quick overview: there are kind of three different eras of cracking for passwords in Active Directory, and things have changed pretty dramatically over this time, too. So, of course, it's been this
cat-and-mouse game of attacks getting more complex and defenses trying to keep up, coming up with better stuff, just for issues to be found with the better stuff. The biggest part of this first era, like I mentioned earlier, is LM hashes. They only support 14 characters, and they're also case-insensitive: they don't even store uppercase or lowercase information in the hash itself, which also doesn't help with cracking speed. They also aren't salted, so if you get a list of password hashes, you can just look down it, and if you already have a correct password for any of those hashes, you just match them up and you know you already have that account's password. NT hashes are pretty similar: they don't have any salt either, just like LM hashes. So in this era, rainbow tables were a pretty big thing. A rainbow table is just a precomputed table of hashes that trades storage size for computation speed. With unsalted workloads they make sense (they don't work with salted workloads), but they're no longer really in play, for reasons I'll go into in a little bit. The defenses of this era are relatively simplistic by today's standards and haven't really kept up: password complexity requirements, rotation requirements for passwords, and lockouts after failed attempts. The reason this no longer holds up is weak hashing and predictable transforms: you know, if you need to reset your password and you previously had password1, you might just switch it to password2.

Okay, so era 2. Things changed pretty dramatically when GPUs started becoming a real thing. Hashcat is a tool that uses GPUs for much higher cracking rates, so in this era you started seeing people building cracking rigs. If you dump a bunch
of GPUs into a system, you can build yourself quite a bit of cracking power for under about five grand. As a result, short passwords, under eight or nine characters, started posing a serious risk to organizations just because they were too easy to crack. The other thing here is that mutations and masks in hashcat let you cover a much larger keyspace without actually having a larger wordlist, and in this time wordlists got quite a bit better as well.

Moving on to the rainbow tables I mentioned earlier: as you start looking at longer and longer passwords, the tables start getting exponentially larger. For NTLM hashes, if you're trying to build a table for a 12-character password, it would take between 10 and 15 billion terabytes to cover that entire keyspace. So rainbow tables that cover those lengths just stop making sense, and GPUs make a ton more sense.

That brings us up to era 3, which runs through the modern era, where we start getting more complex attacks. The first one is dumping credentials on a local system, which is done through LSASS. LSASS is the storage mechanism that Windows machines use to cache credentials, and by dumping it you can get NTLM hashes, Kerberos keys, and tickets. If you have local admin on a system, you can do this. The important thing there is pass-the-hash and pass-the-ticket attacks. Instead of having to crack the password, you can just take the hash itself, hand it to the machine you're trying to authenticate to, and say: hey, I have this hash, thus I must be allowed to authenticate to this machine. It means you don't need to crack the password at all.

The next one is Kerberoasting, discovered by Tim Medin in 2014. Pretty much what it does is this: if an account has a service principal name, you can request a Kerberos ticket for it, and the response you get back is encrypted with the password hash of that account. So you can take that and go crack it offline, which can get you access to elevated service accounts and other accounts pretty quickly, especially if those service account passwords are not particularly strong.

The last portion there is AS-REP roasting. Kerberos has a security mechanism called pre-authentication that requires you to provide a password before it hands you a ticket. Well, that can be disabled. With AS-REP roasting, you're able to request a ticket and get a crackable password hash out of the system without even having any credentials at all. It's a way to get password hashes with no creds, which is awesome for pentesters.

The bottom one there is what cloud GPUs have done to this market. Now, instead of needing a five-grand cracking machine, you can go spin things up in AWS and do the same thing for 30 bucks. If you get hashes out of an Active Directory domain controller, you can crack roughly 90% of them for less than $30, which is a huge boon to attackers and lowers the floor to get into this kind of stuff pretty dramatically. There are tools for AWS like NPK which lower that barrier to entry even more, letting you spin up password-cracking utilities with a single command. It's a pretty cool tool; I'll show you a little bit of it later on. At this point, pretty much all organizations have some sort of cloud resource and cloud-based authentication systems, so password spraying can now happen completely externally, and if you use a tool like FireProx, you can rotate source IPs with every request and avoid smart lockouts and detections. Okay, so really quickly on detection and
monitoring: here are some things blue team members can use to make sure that common attacks like Kerberoasting and AS-REP roasting are caught. Most of this should be caught relatively easily by modern endpoint protection solutions, but making sure this stuff is monitored and alerted on, and getting pentests, is pretty important to verify the protections are working as you expect.

Moving on to a quick demo. This is cracking a 14-character LM hash. Like I mentioned earlier, the hash is broken into two seven-character chunks. This is not a crazy computer or anything; it's a modern gaming computer with an Nvidia 4070 in it, but the hash rate is still very significant. You know, it's running through 6.7 million hashes a second, and it's fast enough that I'm not even bothering to use a wordlist here; it just cranks through it extremely quickly. If you look at it, the first half of that LM hash has already cracked, so it's jumped to working on the second half. But password cracking has gotten so fast that even if this were an NTLM hash instead of an LM hash, that wouldn't help all that much. The entirety of an eight-character keyspace with an NT hash was done in 2.5 hours in 2019, and now it could probably be done in an hour and a half with NPK. Password cracking is getting really sophisticated, and the old mechanisms Active Directory uses to protect passwords are awfully outdated. You can see that the second half has cracked now, so the full password is stumptown77pdx. That's 57 seconds to crack a 14-character complex password; for the full NTLM version, it'd probably take under 10 hours to crack it.

Okay, so here's the screenshot of that tool I mentioned earlier called NPK. You create it with a one-liner in AWS, and it uses spot instances to spin up pretty beefy hardware with a bunch of GPUs, but it handles all the administration and teardown of the resources for you, so you don't have to worry about extra bills or anything, and it gives you a nice GUI to manage it all. In pentesting it's pretty common to hear ABC, always be cracking, because as soon as you come across a hash, you're going to want to try to crack it and use it for lateral movement and privilege escalation. So, moving on to protections and
mitigations. There's kind of a short-term portion to this and a long-term portion. Things that need to happen quickly for organizations trying to protect passwords: MFA isn't a silver bullet, but it really does help; it cuts down the potential for compromise a lot, so MFA on all interactive and privileged accounts is absolutely critical. Also, require Kerberos pre-authentication to mitigate the risk of AS-REP roasting; that's very important too. Active Directory lets you put in a banned password list, which means that if, say, Welcome2015! gets used as the default password, then no one can set their password to it, which mitigates some of the issues users may bring back into the environment that you thought you had taken care of. The last portion there is to harden monitoring. It's important to make sure you know what's happening on your network and that you can catch malicious activity in all its forms over time.

Also short term: service accounts pose a unique risk, so those passwords should be audited, and you should try to crack them yourself to make sure they aren't weak. They should be replaced with managed identities, or gMSAs, if you can. Also, make sure you can actually detect Kerberoasting and AS-REP roasting attacks; most modern tools should be able to, but you don't know for sure until you've actually tested it. Last there, harden endpoints: make sure LSA protections are on, make sure local admin passwords aren't reused across an environment, and segment admin workstations from other computers.

Long term, this is pretty important too. Moving away from password-based authentication is something organizations should plan for. Passkeys mitigate all sorts of problems and are a great thing, and we should be using more of them. Second to last, Kerberos keys should be audited and reset on a regular cadence. If you have a compromised key and it sits out there for six to nine months, that key could be cracked offline by an attacker, who could just come back and use it once it's finally cracked; regular resets prevent that kind of attack. And then last, security hygiene: making sure that external exposure monitoring is in place and that the attack surface is reviewed on a regular cadence is critical to stay on top of things.

So, to wrap things up, three main points. Attackers at this point win on economics: cloud GPUs make password cracking fast and easy for them, and better wordlists reduce the overall time to compromise. Password complexity rules and rotations really aren't enough at this point; detection, MFA, and passkeys, plus banned password lists, are easy ways to improve an organization's security posture without all that much time and effort. And last, identity hygiene is important and needs to be prioritized in an organization. The last thing I want to leave you with: as attacks get cheaper, faster, and stealthier, defenders have to shift to detection and identity hygiene to keep up. Okay. And now I'll move into Q&A. [Applause]
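The guess-and-compare loop described in the talk can be sketched in a few lines. This is a minimal illustration, not real tooling: actual NT hashes are MD4 over the UTF-16LE password, but MD4 is often unavailable in modern OpenSSL builds, so SHA-256 stands in here, and the wordlist and target are made up.

```python
import hashlib

wordlist = ["password1", "letmein", "stumptown77pdx"]

def hash_guess(guess):
    # Stand-in algorithm: real NT hashes are MD4 over the UTF-16LE password,
    # but MD4 is frequently missing from modern OpenSSL, so SHA-256 is used
    # purely to illustrate the guess-and-compare loop.
    return hashlib.sha256(guess.encode("utf-8")).hexdigest()

def crack(target_hash, words):
    # Hashing is one-way: we cannot invert the target hash, only hash
    # candidate guesses and compare each result against it.
    for guess in words:
        if hash_guess(guess) == target_hash:
            return guess
    return None  # exhausted the wordlist without a match

# Pretend this hash came out of a credential dump.
target = hash_guess("stumptown77pdx")
print(crack(target, wordlist))
```

Tools like hashcat do exactly this comparison, just billions of times per second on GPU hardware and with mutation rules applied to each base word.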
>> Hey there. Thanks for the talk. I just had a question: when you get to the point in a test where you can dump NTDS from the domain controller, what tools are you typically using for that? And are you seeing that action being blocked more often?
>> Yeah. Mimikatz can be used for that, and it very often is blocked; pretty much any security tool is going to detect and hopefully block that at this point. The biggest thing there, though, is just talking to the client first. Lots of organizations are going to require everyone to reset every password if you do that, so you can cause serious problems if you don't communicate about it first.
>> Great. Thank you.
>> Thank you so much for your talk. It was really appreciated. I come from an MSP background as a systems administrator, and currently I'm about to deploy LAPS, the Local Administrator Password Solution. I'm wondering: does LAPS fall in scope for the hashes on a machine, and if so, could you use LSASS to get at the LAPS password and compromise that machine if you already had some access to it?
>> Off the top of my head, I'm not sure. I believe that with LAPS, they're stored in a better format. And no matter what you do, making sure that local admin passwords aren't the same across an environment is a win, even if it's not a perfect solution.
>> That is what we're trying to do, which is hard.
>> Yeah.
>> Hey, is there a particular flavor of credential dumping technique that you find in your engagements most often missed by the EDRs your customers are running?
>> Oh, that's a good one. No, I think that if you're accessing local cached credentials, the access of the credentials themselves is usually caught.
>> Okay, cool.
>> Thanks. So, the banned password list at my org is empty. Where should I start to build it, and how big should I make it?
>> Yeah, that is a great question. There are a couple of different ways to do that, and there are easy ones now. ChatGPT is actually pretty good at iterating through passwords, so you can give it, say, a default credential, the name of the company, the name of a local sports team, and have it generate about a thousand, if you trust ChatGPT. If not, you can start with words that are common bases for people's passwords, like the name of the company or the name of the street it's on, and use rules in hashcat to generate lists of passwords to try against it.
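The rule-based approach described in that answer can be sketched as follows. The seed words are hypothetical, and these few transformations only hint at what real hashcat rule files (for example, best64.rule) do:

```python
from itertools import product

# Hypothetical seed words: company name, street, local sports team, etc.
SEEDS = ["harboredge", "stumptown", "portland"]

# A few hashcat-style transformations: common suffixes, capitalization,
# and trivial leetspeak. Real rule files go much further than this.
SUFFIXES = ["", "1", "123", "!", "2024", "2025"]

def leet(word):
    # Naive character substitutions, applied in a fixed order.
    return word.replace("a", "@").replace("o", "0").replace("e", "3")

def candidates(seeds):
    # Combine every seed variant with every suffix, deduplicated.
    out = set()
    for seed, suffix in product(seeds, SUFFIXES):
        for base in (seed, seed.capitalize(), leet(seed)):
            out.add(base + suffix)
    return sorted(out)

banned = candidates(SEEDS)
print(len(banned), banned[:5])
```

A list like this can seed both a banned password list and a cracking audit of your own accounts, since attackers generate their candidates the same way.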
>> Since you mentioned dumping credentials from LSASS using Mimikatz, what's your opinion on Microsoft's newest Credential Guard mitigations that prevent that? Have you seen it used in production at all?
>> We have, and it's overall pretty good. It works; it's just not applied very frequently at this point.
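As a footnote to the keyspace numbers mentioned in the talk, the arithmetic behind the LM split and the rainbow-table blowup can be checked directly. The charset sizes (roughly 69 usable characters for case-folded LM, 95 printable ASCII for NT) and the 16 bytes per stored hash are simplifying assumptions; real rainbow tables compress chains well below this naive bound.

```python
# Rough keyspace arithmetic behind the LM-vs-NT discussion.
# Charset sizes are assumptions: LM is case-insensitive (~69 characters
# after folding), NT passwords can use ~95 printable ASCII characters.
LM_CHARSET, NT_CHARSET = 69, 95

# LM stores a 14-character password as two independent 7-character halves,
# so an attacker searches two small keyspaces instead of one huge one.
lm_effort = 2 * LM_CHARSET**7
full_14_effort = LM_CHARSET**14

print(f"LM split effort:   {lm_effort:.3e} guesses")
print(f"Unsplit effort:    {full_14_effort:.3e} guesses")
print(f"Speedup factor:    {full_14_effort / lm_effort:.3e}")

# Naive storage for a full 12-character NT rainbow table, assuming
# 16 bytes per hash and no chain compression (a deliberate simplification).
table_bytes = NT_CHARSET**12 * 16
print(f"12-char NT table:  {table_bytes / 1e12:.3e} TB")
```

The split is why a cracked LM hash is nearly guaranteed, and the table size is why precomputation lost to GPUs at longer lengths.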
[Applause]
Thank you.
everyone. So, I am happy to introduce today's speaker. Can I get your attention? Okay. Yeah, I'm happy to introduce today's speaker, Udochi. She is an infrastructure and product security engineer with over five years of experience securing large-scale systems at Adobe, Coinbase, and Juniper Networks. She holds a master's degree in cybersecurity along with the CISSP and CISM certifications. Udochi is passionate about bridging the gap between cybersecurity engineering and strategy, and helping organizations move from reactive security to proactive resilience. Welcome, Udochi. [Applause]
>> Hi everyone. Thank you so much for being here, especially as it's lunch time. And thank you to BSides PDX for this opportunity. I really
appreciate it. So, if you work in endpoint security or detection, you're probably familiar with the problem of endless data: endless telemetry from endpoint devices, sockets, processes, network events, things that are being recorded almost on a per-second basis. We get a fire hose of data, daily, but we don't really have a lot of insight into how to interpret that data and actually use it in ways that are effective for security. This talk is mainly aimed at teams that are looking to improve their endpoint detection and response program and get actionable information from their data.

So the question then becomes: what makes an event interesting? When should we take a closer look at the data? This talk isn't about collecting data or deploying models. It's about what we do with the data we already have. How do we reason about it? How do we build thresholds? How do we connect signals in ways that reduce noise without losing vital information? We could also ask: when is the right moment to pull in an analyst or a detection engineer? How do we design alerts that are both efficient, so we're getting the right data, and humane, so we're not burning out analysts? Over the next few minutes, we'll explore different types of endpoint queries to see what kind of information we can get.
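One of those questions, how to connect signals so they reduce noise, can be sketched as a toy correlator. The event fields, the ten-minute window, and the two-distinct-kinds threshold here are all illustrative assumptions, not part of any real product.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Event:
    host: str
    ts: int    # seconds since epoch
    kind: str  # e.g. "tmp_exec", "new_socket", "persistence"

WINDOW = 600  # ten minutes, an illustrative choice

def correlated_hosts(events, min_kinds=2):
    """Flag hosts where >= min_kinds distinct event types land in one window."""
    by_host = defaultdict(list)
    for e in events:
        by_host[e.host].append(e)
    flagged = []
    for host, evs in by_host.items():
        evs.sort(key=lambda e: e.ts)
        # Slide a window forward from each event; distinct kinds close
        # together tell a story that single events do not.
        for i, start in enumerate(evs):
            kinds = {e.kind for e in evs[i:] if e.ts - start.ts <= WINDOW}
            if len(kinds) >= min_kinds:
                flagged.append(host)
                break
    return flagged

events = [
    Event("dev-laptop", 1000, "tmp_exec"),
    Event("dev-laptop", 5000, "new_socket"),  # hours apart: not correlated
    Event("prod-web01", 2000, "tmp_exec"),
    Event("prod-web01", 2100, "new_socket"),  # same window: a story forms
    Event("prod-web01", 2200, "persistence"),
]
print(correlated_hosts(events))
```

Each isolated event on the dev laptop stays below the bar, while the clustered sequence on the production host is surfaced, which is exactly the noise-reduction the talk is after.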
So when we talk about endpoint visibility, we usually talk about adding more sensors and getting more logs. But the issue isn't really the visibility into the data; it's the volume of data we're collecting. A typical midsize organization might generate hundreds of thousands, sometimes millions, of system events a day. Every binary execution, every socket is recorded. But out of that flood we might get maybe 10,000 alerts a day, and even fewer will lead to real incidents. So the challenge isn't in collecting more data; it's in distinguishing what is interesting from what is actually important and actionable. And to build that logic, we first need to understand what different endpoint queries actually reveal.

Once we accept that we can look at everything, the next step is to decide what kind of signal each query gives us. Looking at this table, the first layer is single-event anomalies. Here we're talking about things that by themselves might look odd: say, a file executing from the tmp folder, a new launch agent appearing on macOS, a Base64-encoded PowerShell command. Technically these are interesting events, but they don't give us a lot of information, and they're not actionable in and of themselves.

The next layer is correlated multi-event patterns. Here we start linking the various events we see. For example, thinking of that file that was executed from the tmp folder: what happens if it opens a network socket? What happens if it drops a file into persistence? Now we are seeing a story, not just a snapshot, and our confidence is increasing. And finally we have time-bound sequences, where a behavior unfolds across time. For example, imagine repeated login failures, then a successful admin login, then a service installation. Now we're moving into actual incident territory.

So what does this tell us? Each layer increases confidence. The more relationships and timing we observe, the fewer false positives we get, and the faster analysts can focus on what truly matters. The takeaways from this section are, one, endpoint queries tell different kinds of truths, and two, confidence grows with context; correlation and timing are our compass. So now that we know what queries can reveal, let's explore what patterns deserve our attention.

What I want all of us to take away from this slide is not memorizing queries or being the most technically competent query writer, but having a mental framework for how to look at events. We all have different systems, and we may never see these exact alerts, but there should be an approach we take when looking at endpoint events. Across macOS, Windows, and Linux, alerts tend to fall into three families: execution anomalies, persistence mechanisms, and credential or privilege abuse. An execution anomaly is something running from an unexpected location, like the previous example of running a file out of the tmp folder. A persistence mechanism is something trying to stay, like a cron job or a scheduled task. Credential abuse is someone trying to expand access, maybe dumping /etc/shadow or dumping LSASS. These are the surface-level signals, but the attacker's motivations underneath are access, expansion, persistence, and exfiltration through command and control.

So what matters in this context is, one, behavior in the wrong context, like executing files from unexpected locations; two, sequences that connect tactics, persistence leading to network compromise leading to privilege escalation; and three, actions that break routine: are we seeing events during off hours, from unexpected or unusual hosts? We need to develop a triage reflex that says: if we are seeing abuse of trust, if we are seeing deviation from the routine, if we are seeing different events linked together, then that is something we should probably take seriously. If you can train yourself to spot these three patterns, no matter what SIEM or query language you're using, this forms a foundation for how to score and prioritize alerts later.

So now that we know what types of events matter, let's talk about when they deserve an investigation, and that is where the three-question rule comes in. After all these patterns and signals, how do we make it practical? This is the quick three-question rule, a handy rule you can keep for when you have to look at something at 3:00 a.m. One, context: is this normal for this host and user? If it's something you've never seen before, it deserves a closer look. Two, privilege: does it touch credentials, admin accounts, system processes? Anything that crosses privilege boundaries definitely deserves a closer look. Three, correlation: does this alert connect to others by time, behavior, or system? Even if signals are weak individually, in the aggregate they become more serious. If you can answer yes to two or more of those questions, then this is definitely something worth investigating.

So before we talk about thresholds and scoring, let's talk about what separates signal from noise. One, we think about intent: what are the attacker's goals? It could be to maintain persistence, harvest credentials, or execute lateral movement. No matter the exploit, no matter the events you're seeing, this will always be a constant. Two, a universal deviation detector: something that breaks baseline behavior. It could be a rare process, an off-hours launch, an unexpected parent-child link. And three, context: privilege, sensitivity, regulation. Context doesn't necessarily make an alert malicious, but it definitely amplifies its impact. Think of these as your three mental filters, even before you start the detection process. Again, you don't need to memorize every query; you need to recognize patterns. Context, the who, where, and when, is often more predictive than the content; a true signal appears when multiple small behaviors connect intent to impact. When in doubt, ask yourself: is this behavior
abnormal, privileged, or persistent? If it checks two of those boxes, it's worth your time. Incidents don't announce themselves; they accumulate. A sudden sequence, a pattern of repeated noise, a shift in baseline frequency: these things could be your first hint of compromise. And obviously we can't run a SOC on instinct alone. Once we understand what kinds of alerts deserve attention, the next step is to make that judgment repeatable, explainable, and tunable, and this is when we begin to talk about thresholds and scoring.

So you might be wondering what these numbers mean and how we arrived at them. Let's break it down. First of all, we define a base scale. In this example, we're picking 25 as our lowest credible anomaly. Something like an unsigned binary in the tmp folder, while odd, is common enough that it warrants our lowest score, and from there everything becomes relative. For example, outbound traffic to a rare domain happens less often and is more likely to lead to a compromise, so that warrants a higher base weight. Persistence through, say, a launch agent is an even stronger indication of compromise, and that warrants a higher base weight still. We don't just come up with these numbers arbitrarily; we look at history. What have we seen previously from our systems? What have we seen previously from the organization? We use that information to inform what the base weight of a certain system event should look like.

Then, in addition to that, we apply context multipliers. Sometimes we have the same behavior across different environments, but because it's happening in different environments, it leads to a different severity. For example, an issue on a developer laptop would definitely have less severity than the same thing on a production server, which would have less severity than the same thing on a domain controller. So when we talk about multipliers, we're thinking of, for example, asset sensitivity (is it a dev laptop or a production server?), privilege (is it a user or an admin?), data type (is it public information or PII, protected information?), and the frequency and clustering of events. Once we have both the base weight and the context multiplier, we can calculate a final score, which we map to thresholds. As you can see here, we have informational, medium, high, and critical. And again, just like with the base weights, we don't come up with these thresholds arbitrarily; we connect them to SOC capacity. Say your team can only handle 30 manual investigations in a day; then you should probably tune your high and critical alerts to produce around 30 investigations a day. You know, we
are trying to have something that is sustainable something that doesn't burn out analysts and a program that is realistic and workable. So the combination of the base weight and the context multiplier assigns a priority to an issue and obviously assigns the priority of like what should be investigated versus what can be dep prioritized. So diving in deeper into like base weights and context multipliers because you know this is the calculation and the prioritization and just want everyone to really understand how that works. So we think of base weights as confidence ratios like for example say a persistent artifact can cause more verified incidents than a benign unsigned binary. So it ends a higher base rate. Base rate
measures the inherent risk of an issue before we talk about context. So when we talk about scoring it depends on three layers. We have prevalence, directness and correlation value. With prevalence, we ask ourself how often does this event appear in our logs in our data. If it's common, then we assign it a lower weight. We talk about directness. This action does it indicate the attacker preparing for an attack or an actual attack action. If it's a direct attack action, say a privilege escalation and execution, it carries a higher base weight. And then for correlation value, how often the behavior has led to a confirmed incident that also bumps it. So in summary, we have you know if um we
have um less commonality more a direct action and correlating with other signals that improves the base rate that we assign to an issue. So and thinking also of in future this is not just static data. This is something that is fed by the actual data that we see in our logs. And so there has to be like a quarterly review just to make sure that this data is stale and is still relevant to the organization in future. And then the deep dive on context multipliers. We've already understood what base weights mean. But as anyone who triaged alerts knows, context changes everything. the same signal can be written in one setting and an emergency in another. So when we talk
about context multipliers, we're talking about small, explainable adjustments that reflect where and how an event occurs. Think of the base weight as the seed, the starting confidence; context multipliers are the soil, the weather, and the fertilizer. They don't change what the event is, but they change how quickly it grows into a response. Multipliers make the model alive. For example, a PowerShell command at noon from a developer laptop is different from that same signal at 3:00 a.m. When we talk about context multipliers, we have five dimensions: privilege, place, pattern, peers, and period. Basically, for any issue we're asking: who generated it, where did it happen, when did it happen, how often is it repeating, and how many other signals agree? The more of these tend toward abnormal or high impact, the higher the weight. And just like with the base weights, we calibrate these at regular intervals to make sure they stay current and useful to the organization. At the end, when we combine the inherent risk of the base weight with the situational context, we move from a static detection to an adaptive one, a detection that is more reflective of what is actually happening in the organization. This is one way to reduce noise without losing visibility.

To make all this sustainable, we wrap the scoring logic inside a governance framework. Each detection rule lives in a catalog with its own owner, mapping, expected noise level, and last tuning date, which makes it both accountable and traceable. Every quarter, as I mentioned, we review the weights based on true- and false-positive data and real incident outcomes. We then pipe these scores into the incident playbook, which defines how we address different levels of incidents: low-level events can be logged automatically, mediums can be ticketed, and we focus our efforts on high and critical, making sure we resolve them within whatever SLA aligns with the organization's risk expectations. This keeps the volume of alerts in the SOC consistent with the human capacity of the SOC team, so we're not having burnout, we're not having overwhelm, and the SOC analysts can actually focus and do good detection. Finally, we measure metrics like the false-positive rate, the alert volume per tier, and the mean time to respond, and we feed all that data back into the weighting model, so the model has continuous feedback, stays current, stays informative and useful, and never gets stale.

So up until now, we've focused on how to calculate what matters. Now we turn the math into action without burning out analysts. This table is the product of aligning risk scores with human capacity. Low-level alerts are informational, low-confidence signals: single events, routine noise, common patterns. The goal here isn't a response; it's observation. We log them, trend them, and see if there is any significant change in the trend that warrants a response. Automation really helps here, because the volume of information at this stage would be overwhelming for any SOC team to triage manually. At the next level, medium incidents, we're seeing correlation but we don't yet have full proof that an incident is in progress; for example, an unsigned binary with outbound traffic but no credential use. We can ticket these with, say, a 24-48 hour SLA; this is not something we need to wake someone up at 3:00 a.m. to look at. We classify low and medium this way precisely so that analysts have the attention and the bandwidth to focus on the high and critical issues. And speaking of high and critical: this is where we have behaviors like persistence and credential access. These go to the incident response team, and depending on the team's SLAs, these may be things that have to be resolved in an hour or two, basically making sure that containment time equals survival time. So how did we arrive at these cutoffs? We ask how many meaningful incidents our team can handle in a day, then tune our thresholds so that roughly that many alerts land in each bucket. The point isn't to have a perfect detection program; the point is to have a sustainable program we can work without burning out analysts. At the end of the day, if everything is
urgent, nothing is actionable, and tiering these issues makes us realize: this is something we need to prioritize, this is something I should wake up at 2:00 a.m. for, versus having everything be a fire and burning everyone out. If there's anything I'd like everyone to take away from today, let it be this: the job of the analyst and the detection engineer is not to analyze or triage every issue, but to focus on what matters and on what brings the best return for the detection program. Some principles to remember. Think in patterns, not signatures: the signature changes, but the pattern doesn't, so think about execution, persistence, privilege. Use triage heuristics, not gut instinct: a high or critical score should mean something to the SOC team. If someone gets an alert labeled high or critical, they should understand why; it should make sense to the humans who work in detection, not just to the system. Calibrate thresholds to human capacity; I've already spoken about that. Keep your scoring model alive: let data refine your intuition. All the weights and thresholds we talked about are living things; they can change, so keep monitoring them and folding feedback in to make sure they stay relevant. Detect intent, not just activity: the best detections mirror adversary goals, not every possible signal. And above all, focus on actionable alerts, not noise. So the big takeaway from this is that not everything is important; focus on what is actionable. Over the course of this talk, we've gone through the factors, the characteristics, and the things to watch for that help us identify what needs our attention versus what can be deprioritized. That's basically it. Thank you, everyone, and let's see if anyone has questions.
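As an aside for readers of this transcript, the scoring model the speaker walks through (base weights calibrated from incident history, context multipliers across privilege, place, pattern, peers, and period, and tier cutoffs tuned to SOC capacity) can be sketched in a few lines of Python. Every event name, weight, multiplier, and cutoff below is an illustrative assumption, not the speaker's actual values:

```python
# Hypothetical sketch of the base-weight x context-multiplier scoring model.
# Base weights: inherent risk of an event type, calibrated from how often
# that behavior has historically led to a confirmed incident.
BASE_WEIGHTS = {
    "unsigned_binary_tmp": 25,       # common, low confidence: the lowest credible anomaly
    "outbound_rare_domain": 40,      # rarer and more direct
    "persistence_launch_agent": 60,  # strong indicator of compromise
}

def context_multiplier(event):
    """Context across the five dimensions: privilege, place, pattern/peers, period."""
    m = 1.0
    m *= {"user": 1.0, "admin": 1.5}[event["privilege"]]                       # who generated it
    m *= {"dev_laptop": 1.0, "prod_server": 1.5,
          "domain_controller": 2.0}[event["place"]]                            # where it happened
    if event["off_hours"]:                                                     # noon vs 3:00 a.m.
        m *= 1.3
    m *= 1.0 + 0.2 * event["corroborating_signals"]                            # peers: signals that agree
    return m

def score(event):
    return BASE_WEIGHTS[event["type"]] * context_multiplier(event)

def tier(s):
    """Cutoffs are a tuning knob: raise them until high+critical volume
    matches what the SOC can actually investigate in a day."""
    if s < 40:
        return "informational"   # log and trend automatically
    if s < 70:
        return "medium"          # ticket, 24-48h SLA
    if s < 110:
        return "high"            # incident response, tight SLA
    return "critical"
```

The useful property is that nothing here is static: the base weights come from historical true-positive ratios, and the `tier` cutoffs move when the team's daily investigation capacity changes.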
>> Yes, please. Have you found any off-the-shelf product?
>> I mean, I won't say that I've found a specific product, but I do know that, especially when it comes to things like alert prioritization and calculating risk scores, that's something AI could help with. So that's definitely something a smaller team could leverage to cut out most of the manual work involved in this. Yes, please.
>> I like how you think about tuning or calibrating the severity of different alerts to the capacity of your analyst team, but in the organizations you've worked with, do you ever worry that that could be obscuring the fact that maybe you need more analyst capacity? If you limit the number of high-severity alerts based on how many they can handle in a day, could you be hiding things you'd want investigators to look into?
>> Yeah, definitely. In security, the ideal state would be to secure everything, do everything, but unfortunately we work within budgets. We work within organizational priorities. So the security team has to communicate the risk to the wider business and say, "Okay, this is the risk of not having enough SOC analysts on our team," and negotiate that. And unfortunately, if that's not something the company can support, that's where we have a security exception process, just to make sure the company is aware of the risk of not having a sufficiently staffed SOC team.
>> Cool.
Thank you. Thank you so much. >> Thank you. [Music] >> Somebody lost their AirPods. If you find AirPods while on your way, please hand them in at the lost and found on the first floor. Thank you.
>> I'm honored to introduce today's speaker, Febin. I know it's lunch time, but believe me, he's a great speaker. Febin is a senior security engineer on the detection and response team at Remitly, where he focuses on building and refining detections, leading incident response, and driving proactive threat hunting initiatives across cloud-native infrastructure. With a background in digital forensics and incident response, Febin has investigated over 400 ransomware, insider threat, nation-state intrusion, and cloud breach cases during his time as a senior consultant at Charles River Associates. His work also included supporting ransomware negotiations and advising clients across the healthcare, finance, education, and technology sectors. So welcome, Febin. [Applause]
>> Hi everyone. My topic for today is Okta detection engineering: from logs to detection. Now, why this talk? Okta is primarily an identity provider, meaning it handles how your users authenticate and what applications they have access to. They gatekeep authentication and access, and since getting into Okta is, for an attacker, mostly like getting access to almost everything in your network, it is one of the prime targets for attackers out there. And the challenge with Okta, or any other IdP platform, is not a lack of signals; there's a lot of signal, and from within it you have to identify what is attacker behavior, what is insider threat behavior, and what is just your usual user doing some dumb stuff. I'm using Okta as an example because, at least from their website, they claim 15,000 customers and a large share of the Fortune 100, so among IdPs, Okta is a high-value target. Threat actor groups like LAPSUS$ have been targeting IdPs as an initial foothold for a pretty long time, and the recent Verizon report mentions that 81% of compromises happen, or start, with a compromised identity. So this talk is about making Okta logs less mysterious and more useful.
A little bit about myself; she gave a good introduction for me. I'm currently working as a senior security engineer at Remitly, a global remittance platform, where we work on providing trusted financial services across borders. My previous work life was more on the consulting side, focusing on incident response and forensics. All right, this is what we'll be hitting today: breaking down what Okta actually logs, how those logs track back to user behavior, and how we can identify patterns within those behaviors. From there, I'll move into some practical detection ideas that you can use in your own environment, and finally I'll wrap up with links to the research and material that helped me do my Okta detection engineering work.
If that sounds good, I'll start. Okta provides a detailed system log of almost every event that happens in your environment. They expose it via the GUI on the admin console, and you can also use APIs to transfer the logs into your SIEM solution, either as a stream or in a batch configuration. Each log has an event type associated with it, and that's what tells you, hey, this log is about authentication, or this log is about access. The event type is the identifier that lets you know what a log is about. For my work, I had to start organizing all these different types of logs, because last time I checked there are almost 1,025 unique event types, so you need an idea of what functional areas these logs cover. This is the categorization I ended up with. Authentication and access events give you everything to do with user authentication: what MFA factors were used, what policies were hit before access was granted. Management events relate to everything from an admin-action perspective: if admins change which user groups can access which apps, anything like that, the change is logged as a management event. Application and SSO events track all the OAuth token flows that happen between Okta sessions. Security and threat events are a pretty useful set of logs: when you pay for Okta, you're not just paying for the service, you're also paying for their threat intelligence, the threat actors they track and the insights they want their customers to have, and this classification holds those events. Workflow and automation: Okta lets you automate certain tasks based on certain triggers, and any action on that side is logged under workflow and automation events. System and operational events are more about things like API rate limits; they're logs of how Okta works in the background rather than the service it provides us, but there are cases where these logs also come in handy.

With these groupings, here's a quick overview of the log structure. My idea is not to go through everything, but to point out certain fields we'll be using throughout the session. The UUID, or event identifier, gives you a unique value to track back to a specific event. The event type, as I mentioned, tells you what kind of log it is. The actor in any Okta log is the entity that initiated the action; for automation and Okta-internal activity you will always see the Okta system actor (system@okta.com), and that's how you identify workflow-based activities or actions Okta takes on its own. The client field will always have the access information of the user: what device they were accessing from, what application they used to log in. The authentication context contains what kind of MFA was used: is it a phishing-resistant MFA, does it attest user presence or user verification, things like that. The target is the entity that was the target of the action: if it's an admin activity involving a user group, the user group is the target; if it's an IT admin doing something to one of the users, that user is the target. The outcome just tells you whether the action was successful or not. And something we'll use a lot is the debug context: it contains all the debug actions Okta took before letting a user in or denying their access, and that's the material we'll use for detection.

Before we start on the detection piece, there are certain caveats and edge cases to think about when working with Okta logs. Certain logs don't give you the ability to track back to the authentication process. Say you're in an authenticated session and you perform a set of actions; those actions are logged, but in certain cases you cannot tie them back to the authenticated session, which makes incident response tough, because you want to know how the action was allowed in your environment. In those cases you have to find workarounds; this is something we have reported to Okta, and hopefully they'll work on it. The admin console gives you admin-level access depending on what kind of admin you are, and any actions you take, changes or edits, are logged; but if you just browse around, and it's a threat actor doing reconnaissance via the GUI, you'll never know what resources they were snooping on. Another big pain point is the target field's JSON structure. It's a multi-value JSON array without a good key-value pair definition, meaning if you want to know what is inside each object, you have to go through it, figure out the value, and then decide whether you should be using it. You'll never get told, hey, this target object is everything to do with authentication; you have to go through and figure that out.

With that, let's look at what a good detection should look like. Doing click-ops or GUI-based saved searches is not how we
should be building detections, at least in this day and age. You should have detection-as-code: maybe a YAML file that follows a schema your team has agreed on, something that describes the detection and its intent; the actual query that runs (I'm using Splunk as an example because I'm most familiar with it); and some metadata associated with the detection, like who the author is, what MITRE ATT&CK tactic you're framing the detection under, and whether it's a TTP-based or an IOC-based detection. Also the configuration elements, because when you run something in Splunk, a detection is nothing but a search that goes over a set of logs, so you need to specify the window of logs you're looking at and the cron schedule the search should run on. So codify everything; detection-as-code is the right way to go.

All right, now into the detection engineering itself. We'll figure out what Okta actually logs and what these Okta detections mean for us. The tough part for a presentation like this is that there are so many event types and so many detections you could write; how do you come up with something useful for the audience? For that, I'm taking these four detections. All four share the same base fundamental: login without FIDO2. For people who are not familiar, FIDO2 is a hardware-based, phishing-resistant MFA, and it saves you a lot of trouble; if you're using PIN-code, passcode, or push-based MFA, you're really asking for trouble. This is where the industry should be going, and that's why I'm starting with a detection for login without FIDO2. Once we have login-without-FIDO2 set up, we can use it for multiple detections: geo-impossible login, or login from Tor or a VPN, because a lot of the DPRK IT-worker detections can at least start with VPN; and also threat intel input from your team about what ASNs attackers are coming from. If you belong to a certain type of organization, there will be threat actors targeting that specific sector, maybe finance or healthcare, and your threat team probably has a good list of the network infrastructure they use. So you can have four different detections with the same base logic.

When I was working on Okta detections, the way I went about it was: research the behavior, meaning I need to understand how Okta sees a login without FIDO2. Once I have that behavior sorted out, I can create a base search, and then I test that behavior, because every organization is different; every organization has its own exceptions or high-value user groups that don't have FIDO2-based MFA, and all those problems, so you need to understand how the detection will work in your environment. And for certain detections, you'll be using Okta's own behavior insights, and to use those insights you need to know how they are configured in your environment. You cannot just use them out of the box; you have to configure it first
and then use it in your detection. Again, for people not used to FIDO2-based MFA, there is a BSides Portland talk from 2023, "LAPSUS$ is winning," that covers the importance of hardware-based MFA and a couple of other details to make sure you're ready for nation-state actors.

All right, researching the behavior. When a user accesses an Okta-gated SaaS application from a device enrolled in Okta Verify, or from a device that isn't enrolled, you see a flow of event types: usually auth-via-MFA, some policy evaluations, and so on. The goal of a good detection is not to use all the event types or all the insights you have, but to identify the event types that give you everything you need, making your detections less bloated and less brittle. The required signals I'm looking for when writing a detection for auth without FIDO2 are: whether it was a successful login or not; the information of the user who initiated the authentication; what authentication methods were used (was it FIDO2?); the client information, meaning the user agent, the IP address they're coming from, the ASN name and value; and certain Okta insights and enrichments where Okta says, hey, this IP is new for this user, or this user agent, city, or country is new for this user. Those insights and enrichments are what I'm after.

Now, I could open up every event type and show you all of them, but you'll have to trust me on this one: from my research, the best event types for identifying login behavior are user.authentication.auth_via_mfa and user.authentication.verify. And I know it's tough to trust someone you just met, so I have both of these logs here. This one is the user.authentication.verify log I was talking about. This event usually couples together all the MFA evaluations associated with an authentication; so if there were three MFA factors you had to provide to get authenticated, all three should be coupled into user.authentication.verify. But in certain flows you'll never see this event, only the individual MFA events, and that's why I'm using both event types for my detection. You can see the actor information, the client information, whether it was a successful auth or not, and the target field that I mentioned, which we'll look at in more detail. Everything I listed in the required signals is contained here, and that's how I'll build the detection.

Now, this is pseudocode SPL. We're looking at Okta system logs, and the field names are painful: you have to use a lot of spath to get to the actual values you want, so there's a lot of renaming between the index clause and the search. Since those two are the only event types I want to focus on, I search for those event types with an outcome of success. Whenever you have something like an IdP exposed, threat actors are always knocking on the door, so if you start writing detections on failures, it'll be a mess. Then you extract all the MFA details. If you look at the screenshot, you can see in the target section that Okta logs what kind of authenticator it was, what display name is associated with it, and what are the
properties of the method used: you can see phishing-resistant, device-bound, user-presence, all that information there. Now, I particularly like to use the properties field and not the method type, because Okta FastPass, another Okta capability, is phishing-resistant in certain event flows but not in others, so going by the authenticator's name or the method type is not the right way to go, at least from my research. You should always check the verified properties, and I'm specifically looking for the phishing-resistant tag there. The idea with this detection is that rather than following each transaction ID, or following logs in a transactional manner, you bucket all the events you need, including MFA details, into (user email, source IP) buckets, and within each bucket you ask: is there even a single phishing-resistant login? Why am I doing this? Say a user logs into an app that's not high priority, and they can probably log in with just their password. But in that same, say, 30-minute time frame, if they access something that requires admin-level access, they'll probably have to provide the phishing-resistant MFA. If that's happening from the same IP address, you essentially won't get a false positive for the first event, since you're bucketing all this information together, and at the end you ask: for this user and this IP address, is there at least one phishing-resistant MFA? That's how you reduce your false positives. And this is the user.authentication.verify log; it has the same method-used and verified-properties fields I was talking about. I just want to give the audience a chance to see what the logs look like in Okta. And this is the actual detection SPL. It looks ugly compared to the pseudocode, but again, I want to show how it really looks. All the painful parts are because of the target JSON field structure, which doesn't let me address the value I'm looking for directly; instead I have to search through everything to find it. And in a graphical representation, this is what I mean by buckets: imagine both buckets are for the same user, the same event types (authentication verify and auth via MFA), for two different IP addresses. One bucket has only WebAuthn phishing-resistant methods, so it passes and wouldn't trigger my detection; the other one, using password or Okta Verify push, would trigger it. Now, whenever you're using multiple events to detect a behavior, the solution is to go for a sliding-window approach, where there is overlap between the time frames you're looking at. This way your detection isn't brittle and won't break if there is log latency or inter-event latency dependency, and having a sliding-window approach has, at least from my experience, helped reduce the number of false positives that end up reaching my pipeline.
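To make the bucketing-plus-sliding-window idea concrete outside of SPL, here is a minimal Python sketch of the same logic. The field names, the 30-minute window, and the 15-minute stride are illustrative assumptions, not taken from the speaker's actual detection:

```python
# Sketch: group successful MFA events into (user, source IP) buckets over
# overlapping sliding windows; alert only on buckets that contain NO
# phishing-resistant (FIDO2/WebAuthn) factor at all.
from collections import defaultdict

WINDOW = 30 * 60  # 30-minute bucket window (illustrative)
STRIDE = 15 * 60  # windows overlap, so event pairs split by log latency
                  # still land together in at least one window

def non_fido_buckets(events, now):
    """events: dicts with keys ts, user, src_ip, phishing_resistant (bool).
    Returns ((user, src_ip), window_start) pairs that had no FIDO2 login."""
    if not events:
        return []
    alerts = []
    t = min(e["ts"] for e in events)
    while t <= now:
        buckets = defaultdict(list)
        for e in events:
            if t <= e["ts"] < t + WINDOW:
                buckets[(e["user"], e["src_ip"])].append(e)
        for key, evs in buckets.items():
            # a single phishing-resistant login clears the whole bucket,
            # which is what suppresses the password-then-FIDO2 false positive
            if not any(e["phishing_resistant"] for e in evs):
                alerts.append((key, t))
        t += STRIDE
    return alerts
```

A user who authenticates with a password and later, from the same IP, presents a WebAuthn factor never fires; the same user from a second IP with only push-based MFA does.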
The sliding-window diagram is just a visual representation; it has nothing more to do with the explanation. Now that we have the base search, the pseudocode or the actual SPL I showed you, I can marry it with threat intel input: a non-FIDO2 login where the source IP is part of our threat lookup, or a login without FIDO2 where Okta identifies the IP as a VPN node. In the debug context, in the tunnel section of the debug data, you can see that Okta identified this particular IP address as part of ExpressVPN, along with its type. From observation, the tunnel field exists only if Okta identifies an IP address as part of a Tor node or a VPN node. Also, if you look at the behavior section, you can see certain values like new location and new geolocation: these are behaviors Okta evaluates for each user as they pass through an authentication flow, and we'll talk about them in an upcoming slide.

Finally: you've researched the behavior, identified which logs are useful, and created a base search. Now you have to do the behavior testing, and testing the behavior is almost like a threat-hunt exercise, because you now have a detection that identifies login without FIDO2. You go through your organization's logs and figure out: are there any surprises? Your IT team can tell you there's no user or group on deprecated MFA or using just a password, but your query can actually unearth exceptions, and if there are exceptions, those are talking points to take to your CorpSec or IT team to figure out what the hell is going on. Also: is the search slow? You can create detections that use all the different event types and do all these funky things, but if it's actually a slow search, how much pressure are you putting on your detection engine once you have many such detections? One detection won't solve your problems; you need to map out your gaps against the MITRE ATT&CK framework or some other framework, so strive for detections and searches that aren't slow or too resource-intensive. Are there events where the detection logic breaks, events you never accounted for during your research phase that show up in the logs? You cannot research everything; there will always be edge cases, but by testing against your environment and your logs you'll get to see all the places
where the detection logic might be breaking you get back to your workbench uh may either have a different detection if you feel your initial detection might get diluted or find a way to handle these um record the observations uh one so that you can speak to your IT or copsec teams and second I've always found that having runbooks that talks about common false positives is is a real good habit to have cuz you might have thought about the false positives have your suppressions in detection but there might be false positives of the same nature that might surface up so for an on call engineer or a sock analyst it's one of the best help you can provide by
talking about hey these are some common false positives while they go through the run book. Um, and also some octane answers like uh you might think that the only value associated with a particular event field is positive and negative because you just saw positive. You assuming a negative should be the next other value. But there are cases where octa cannot identify something as positive or negative and they be like bad request or unknown output. So there are nuances that you should be looking for and never assume anything when you're writing your detection. Another nuance that I've noticed is when you have like iCloud private relay uh on your Safari, Octa would identify it as a
proxy and then you know would allow probably on every user that's trying to access. So those kind of nuances you figure out and you filter them in your detection. Now this is the part that I want to talk about regarding decoding the behavior by m matching octa configuration. So our same detection login without pho and let's say we want to marry it with go impossible user um you can use octas behavior outcome in the debug context that says new location negative or it it says hey is it from a new location is a new geol location is it a new device so these are some of the behaviors that you can use in your detection u and if you
see the pseudo code here I'm using the same pseudo code up until the bucket bucket all relevant fields and After that I'm like hey check if velocity is not negative. Um another nuance uh always like don't don't make the detections brittle like don't make it too rigid by saying check if velocity is positive is the only time it should fire cuz apart from positive there'll be some other values that will be hitting this field that I'll show you in a bit. So try to make your detection more um I don't know like what's the right word but like it it should be accepting changes and not set to fail because it's just looking at one type of value. Um
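To make that "not NEGATIVE" check concrete, here is a minimal sketch, with Python standing in for the SPL. The field name and value strings are illustrative stand-ins modeled on the behavior outcomes discussed in the talk, not real Okta schema.

```python
# Sketch of the non-brittle velocity check from the pseudocode above.
# The value strings (POSITIVE, NEGATIVE, UNKNOWN, BAD_REQUEST) mirror the
# behavior outcomes discussed in the talk; the "Velocity" field name is
# an illustrative stand-in for Okta's debugContext behavior output.

def velocity_is_interesting(behaviors: dict) -> bool:
    # Fire on anything that is not an explicit NEGATIVE, so UNKNOWN and
    # BAD_REQUEST still surface instead of being silently dropped.
    return behaviors.get("Velocity", "UNKNOWN") != "NEGATIVE"
```

A brittle version would test `== "POSITIVE"` and silently miss the UNKNOWN and BAD REQUEST cases that come up later in the talk.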
Now, these are all the behaviors, and you really need to be inside Okta, in the admin console under the Security > Behavior section, looking at how your current behavior configurations are set. Take velocity, which is the one I'm using in my detection: velocity is calculated from the time and distance between two consecutive logins, and if you have velocity set to, I don't know, 60 km/h, you are bound to fail, because you'll have a lot of false positives. Maybe set it at the speed of a usual airliner instead, which is something like 800 to 900 km/h. And another tip: if you ever have a detection that looks at VPN or Tor-node fields, pardon me, if you have a detection that looks for geo-impossible behavior, make sure you are filtering out VPNs and Tor nodes, because they'll really mess up how the calculation is done. Also, apart from NEGATIVE and POSITIVE, two more values that Okta attaches to a behavior are UNKNOWN, when Okta doesn't have enough history on the user (say the user was last active five years ago and logs in; Okta won't have the information to calculate the velocity or whatever), and BAD REQUEST, which you'll see when Okta doesn't have enough information in the event itself to identify whether or not it's a new geolocation. In both these cases, depending on how you configure it, Okta does force the user to authenticate again, or shows the MFA prompt again, if it couldn't identify the behavior as NEGATIVE or POSITIVE. But again, you have to know how your Okta is configured; you cannot blindly trust a tool to do all the good things when you haven't configured it for the context you want. Another part I want to hit is reducing noise with enrichment profiling. I'm a strong believer that effective security detections rely not just on visibility but on actionable context. Look at the scenario where an employee is, say, working
from a cafe whose Wi-Fi egresses through a commercial VPN (a bad idea, but still), and the user reported a lost YubiKey, so the help desk says: hey, you get to be on this 24-hour exception group from FIDO. Incidents like that will trigger our detection that looks for non-FIDO logins from a VPN. But what if our detection had the ability to check databases or key-value stores for each user and each machine in your infrastructure and figure out: is this IP address associated with their managed device? If yes, don't fire, because what you are looking for is not a glaring hole in your IT or policy; you are trying to see: is this the attacker? Because if you are deciding to page your on-call, it should be a high-fidelity alert, and it should be an attacker, not user behavior. But if you're writing detections from an insider-threat angle, or trying to evaluate how your policies hold up, then yes, this one should definitely fire. Or maybe think of it this way: if it's a managed-machine IP, don't page your on-call, but still create a ticket, or at least surface the visibility that something's up. One of the ways you can do this, on Splunk at least, is that Splunk gives you the ability to run MongoDB-backed collections called the KV store. You can have saved searches that look through multiple other logs, not just Okta, your EDR logs, your MDM logs, and keep filling these profiles with information that can be used in your detections. I like to call them user and machine profiles, and your detections should have the ability to check these profiles whenever they need to decide whether to fire. Another place this is useful is risk stacking. Say you have a BPO (outsourced) contractor who is logging in without FIDO, but the source IP address matches, let's say. If you have the ability to say, all right, this is a BPO employee, can we find their actual parent company? If that's available in your user profile, you can check it against the ASN value in the Okta log and say: all right, it's a BPO user, but they are logging in from their own organization. The login is inside the contractor's scheduled shift, because my user profile has the usual times this user logs in from. And the user only has access to low-risk apps. So you can have a risk stack that looks for BPO non-FIDO logins: cool. Is it a BPO user? All right. Is it during their shift time? All right. Is the login from an ASN associated with them? Cool. And you can say: all right, maybe I don't want to page an alert right now. I'll still create a ticket; I still want my on-call engineer to know that something's up, but I don't want to disturb their sleep at, I don't know, 2 a.m. And especially if you're working in a setup where the detection engineering team is separate from the SOC team, these are ways you can be compassionate partners and still make sure you have effective security detections. Now: testing and productionizing detections. Earlier I showed a snippet of how detection as code
should look. If you have detection as code, a central repository where your detections live, and validation steps, then this whole process becomes easy and you don't have to do everything manually. When I think about testing my detections, I think about a unit test, where a curated log can be passed to my detection to see if the detection works as intended. To find this curated log, I can either make one up, or research how these logs look and feed that to my detection. An integration test is when I run my detection in a pre-prod environment and feed it historical logs in batches, to see how my detection performs against a batch of logs and not just a single log. Adversary simulation is another good idea: there's a tool called Dorothy, by Elastic, that helps you run adversary simulations. Not every MITRE ATT&CK technique can be tested with it, but there's a bunch of threat actions that can be fired from Dorothy. Also, with these tests you should have success criteria: what's the true-positive rate you're looking for in a seeded environment? Especially if you're doing unit tests, you should have a 100% success criterion. For false positives over a certain window of time, figure out what your team's appetite is and come up with a number that supports the whole org. Now, my idea with this part of the presentation is not to explain how to set up a CI/CD detection pipeline, but more to evangelize it. So I'll just run through what a good scenario, or a happy detection engineer, might look like. I have an SPL query that I've researched and figured out, and I'm pretty confident it will alert on the intent of the detection. Me as an engineer, I'll come in, take a branch from my main repo, and run maybe a Python file that helps me answer some questions and come up with a final bundle for the detection. This is the same screenshot I was showing, but only the second half. So me as a detection engineer just works on the SPL, comes into my CI/CD branch, runs some scripts, and creates this YAML package. Once that YAML package is published, the thing I have to remember is to tag it as dev, pre-prod, or prod. If it's a dev tag, my CI/CD system (if it's GitHub, a GitHub Action) should take that test input. You can see at the bottom there is a sample row, an integration test, and certain integration assertions: hey, if this detection fires on that particular sample log, I should get at least one result, and within that one result I should have at least, I don't know, 22 unique data points, and at least 15 of them should definitely be present. You can have multiple different assertions; this is the one I'm working with. So if it's a dev alert, my GitHub Actions, or whatever CI/CD, should know: let's send this SPL to Splunk, let's send this particular unit test, get the output
back, and if it passes this assertion test, then the first unit test is done. If it's successful, you now have a dev detection in your main branch. Once I'm happy with my dev detection, I want to take it to pre-prod. I'll again branch my main detection, change the tags from dev to pre-prod, and start the whole process again. This time my system should understand: oh hey, this is a pre-prod detection, let me send it to the pre-prod environment. I can also set up what cron schedule it should run on, what the earliest time frame it should look at is, and what the latest time frame it should look at is. The CI/CD should also identify that, since it's a pre-prod alert, it shouldn't go into the main pipeline; the alerts from here should only go to the alert author. As a team, you should all decide how many days you want your detection to sit in pre-prod. Once that's done, maybe after 14 days, when you're trying to push it to prod, maybe a prompt goes to one of your team members, or your whole team: hey, review this particular detection. Now they have the test detection output, they know how the pre-prod run is going, how many false positives fired in the last 14 days. If it was, say, 15 alerts, and five of them were false positives, maybe that's a ratio your team is happy to work with. And finally, the team puts a stamp on it, and that's when your detection goes into prod. The whole testing part becomes much, much easier if you have a CI/CD or detection-as-code setup. Detection ideas; I'll probably run through this fast. Detection ideas should always come from a framework. You should know what your gaps are, what your overlaps are, and that's how you should decide what to prioritize. Having that kind of mapping helps you communicate easily with your leadership and tell them: this is the area we should be focusing on more, or hey, this area has enough detections and good coverage. Also, for faster response, something that's very useful is having runbooks that follow your attack framework rather than your detections, because multiple detections can still be hitting the same kind of scenario. There should be one runbook to rule them all, rather than runbooks for every separate detection. I have some hyperlinks which I'll be sharing on another slide; these are the places I go to when I want some inspiration or ideas. Okta has good detections that they provide their customers. Splunk has a set of detections. CrowdStrike has an amazing blog post on how they identify these events and use them in their identity platform, and scanner.dev has a really good blog post. These five detections, though, I don't see used much in that documentation; maybe they have some of them now. Log stream tampering lets you know if you're dealing with a threat actor who knows their stuff and who, once they're in your network, would like to hide certain logs. IdP modification and user impersonation: impersonation is something Lapsus$ and some other threat actors have been using; last year there were a lot of those instances. User-reported activity: please, please, please don't ignore these.
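As a rough illustration of the log-stream tampering idea, a detection can simply flag lifecycle changes to log streams. This is a sketch only: the event-type strings below are assumptions modeled on Okta's System Log naming convention, so verify the exact names in your own logs before relying on them.

```python
# Flag lifecycle changes to log streams: an attacker who knows their
# stuff may pause or remove the export pipeline to hide activity.
# NOTE: these eventType strings are assumed/illustrative; confirm the
# exact values in your own Okta System Log.

SUSPECT_EVENT_TYPES = {
    "system.log_stream.lifecycle.update",
    "system.log_stream.lifecycle.deactivate",
    "system.log_stream.lifecycle.delete",
}

def is_log_stream_tampering(event: dict) -> bool:
    return event.get("eventType") in SUSPECT_EVENT_TYPES
```

In practice you would route any hit straight to a high-severity queue, since legitimate log-stream changes should be rare and change-managed.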
Stuff like SolarWinds came out because of detections that fired with this intent. Session hijacking is a tough nut to crack, and what I found from my research is that context enrichment is the way to go. Don't try to solve this one alone, or maybe you can, and if you do, let the fraternity know, but always pair session detections with context enrichment. These are some useful resources, some links, some places where you can understand Okta detections and how certain companies are doing them. And if you are an Okta customer, please upvote this request; our director on the DART team is a strong supporter: JA4 signatures in Okta logs would help us identify user-agent signatures much better, and organizations like Slack have already implemented them. Thank you so much. It's my first time presenting at a conference, so thank you all for being here
for questions, if any. >> Thanks for an awesome talk. One question I have for you: if we implement Okta FastPass, right? >> Uh-huh. >> FastPass, I think, uses the TPM to generate signatures based on a processor key. >> Uh-huh. >> Now, if you have, say, call centers where there are agents with old laptops that don't support TPM, >> Right. >> it is going to be very hard to implement FastPass. So how do you solve those problems? >> So in those cases, if you feel that you cannot fight this and you still have to live with what you have, you do targeted detections. You need to identify what kind of user groups they go into, and have context enrichment in your detections so that for these particular users you look at what IP addresses they are coming from, and you have some strong policies: hey, these particular contractors can only access things from their parent company's network, or something like that. You have to find defense in depth from other angles where you can put in more restrictions. If there are old laptops, you probably cannot run Okta FastPass, so it's detection in depth: figuring out what other avenues we can enforce, and keeping a tight watch on that group. >> Oh, all right. >> Yep. >> Yeah, good job. And I'll take this one: I missed one thing when you were talking about the unit test component of your detection validation workflow. You said the first step was, yeah, go one slide forward if you could, >> Oh shoot, maybe it was one slide back. >> Just that, yeah, using the deterministic event fixtures. Could you tell me again what you mean by that? >> Oh yeah. So the thing is, when you are researching, when you are running your searches, you'll be in the Splunk screen, or whatever tool you're using, running it with your authentication context. So you might have access to certain logs or indexes that your saved search might not. So my idea with the unit test is not just to make sure the detection is good, but to check your intent: can you hit this particular index? Can you hit this particular event field that you're looking for? Also, with Splunk there are certain configurations you can make when the log is ingested and parsed, like: that's a really ugly field name, let me use a shorter field name, and you might miss those. That's why the unit test should ask: can the saved-search head run the detection the same way you can run it, and get the same output? >> So, making sure that the search you're testing is agnostic of the Splunk user context. Okay. Yeah. Cool.
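The unit-test shape discussed in that exchange can be sketched as follows. The predicate is a hypothetical Python stand-in for the SPL saved search, and the field names are illustrative, not real Okta schema.

```python
# Unit test for a detection: feed curated (seeded) events and demand a
# 100% success criterion, per the talk. The predicate below is a Python
# stand-in for the saved search; field names are illustrative.

def login_without_fido(event: dict) -> bool:
    phishing_resistant = {"webauthn", "signed_nonce"}
    return (event.get("outcome") == "SUCCESS"
            and event.get("factor") not in phishing_resistant)

# Deterministic event fixtures: one seeded true positive, one benign.
SEEDED_HIT = {"outcome": "SUCCESS", "factor": "push", "user": "test.user"}
SEEDED_MISS = {"outcome": "SUCCESS", "factor": "webauthn", "user": "test.user"}

def unit_test_passes() -> bool:
    # Must fire on the seeded true positive and stay quiet on the benign one.
    return login_without_fido(SEEDED_HIT) and not login_without_fido(SEEDED_MISS)
```

In the real pipeline this step would submit the SPL plus the fixture to a dev Splunk instance under the saved-search identity, which is exactly what catches the index-permission and field-renaming surprises mentioned above.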
>> Thanks for the talk. Earlier you mentioned that in certain situations Okta FastPass is not phish-resistant. What are those situations? >> So, I don't have a really good answer for that. I have to confess, in those cases I haven't actually tried figuring out why it's happening, and since I was focused more on the detection angle, I thought: all right, I'll make my detection resilient to that. But I feel it's always in the policy evaluation part, where in certain cases, if the first MFA that was authenticated was not a phishing-resistant MFA, then a FastPass push is allowed rather than FastPass biometric or PIN. It's something to do with how FastPass is presented: is it a PIN-based one or a biometric-based one? Apologies, I don't have a good answer for it, but it's something I should figure out. >> Gotcha. Thank you. I think we primarily use the biometric one, so fingers crossed we're okay. >> Yeah. There are no more questions. Thank you.
[Applause] [Music] Track two, BSides Portland, 2025. This talk is The Hardware Procurement Iceberg: a framework for keeping embedded research fun, cheap, and ethical. Your presenter is Yuli. Yuli spends his time during business hours conducting product security research for a large technology company. Outside of that, he spends an overwhelming amount of time quenching his curiosity with web, mobile, game, and embedded security research. He's a bug hunter and live-hacking enthusiast, and he took first place in DistrictCon's inaugural Junkyard EOL pwn-a-thon in 2025 and gave a talk at DEF CON SkyTalks back when those were still a thing. So, everyone, enjoy the talk. [Applause] >> Testing. Hello, crew. Welcome one, welcome all. Hardware
procurement iceberg here. I have to throw this disclaimer up, sorry: all views and opinions expressed in this presentation are my own and are not reflective of views or opinions held by my employer, my university, or any other organizations I affiliate with. Now that we have that out of the way, a little bit about me. For those of you who were confused by my handle: my name is Ben, that's my real-life name, but I do research under Yuli. It stands for "you love to see it." I love presenting all of my research, so I hope you love to see it as well. Like my intro said, I do security research during the day, not just product security research, but sort of everything, full scope. I'm trying my hand at live hacking right now; it's a bit of a steep uphill climb, so I'm stumbling more than I am succeeding. To give you an idea about my background, I studied computer science, so I'm not an electrical engineer or computer engineer at all, and I've had to learn a lot of this off the cuff. One of my main tenets is doing more with less; fortunately or unfortunately, that means I'm pretty thrifty and pretty cheap as a side effect. Some things to keep in mind; this is just a reason for you to leave if you won't find this talk particularly fulfilling: if you've been steeped in embedded research or hardware hacking for a long time, or if you're just a cynic, this is effectively a how-to-buy-stuff-cheaply talk. Unfortunately, concrete is the aim, but I only have like 17 minutes now. I won't really be talking about much of my research, because the research that is in progress is competitive in nature; eventually it'll be submitted for CVEs or for live-hacking competitions. Some approaches I cover in this presentation may be ethically or legally dubious in nature. For ethics, look inward as to whether you want to use them yourself or not. In terms of legality, consult with a lawyer prior to attempting anything. This is an interactive presentation. Raise your hand: how many of you are okay with being interactive? I love it. Amazing. Okay. And of course, flashbang warning, next slide. If you're epileptic, this should not trigger you, but just a fair warning. Okay. So, behold. Isn't it beautiful? How many of you have seen this exact iceberg before? Photos. Okay. We spend time on the internet, so I hope so. For those of you who read the abstract: unfortunately, I will not do three icebergs. I think ranking things on three different axes is just not quite
concise enough, but I will go into the benefits and drawbacks of everything that lands on the one iceberg. So yeah, why this template? Personally, I think memes are one of the most effective vehicles to teach people new stuff. A lot of you have already seen this before, but for those of you who haven't, the scale is pretty intuitive: everything on the surface of the ocean and above is very, very common, and as you break the surface and go lower and lower into the unviewed parts of the iceberg, it's much less common, for reasons. And of course, this scale also presents a little bit of nuance with regard to risk: the deeper you go, the higher the risk of financial loss, time loss, or even legal repercussions. It's a meme, so don't take it too seriously; it's not rigorous, it's kind of relative. If you disagree with any of the placements of the techniques I have, you can rearrange them for yourself and go with that. So why this topic? Well, I think right now, in 2025, we're in kind of a golden era in terms of getting people started with embedded research and hardware hacking. It's like perfect. I'll shout out some Portland legends: I know Philippe did a talk at DEF CON 27 on some telecom hardware hacking. I know Ray and Nicks; Ray did a presentation earlier this BSides, and they just presented this past summer and it was pretty excellent; it was on the Snitchbuck, which isn't too common of a target, obviously. One talk that does stick out is Andrew Bellini's talk, which is Intro to Hardware Hacking, I think is the title. He goes into a lot of establishing the cheap starter tool set: think logic analyzers and adapters that you can find on Alibaba for super duper cheap. And I'll give a special shout-out to DistrictCon Junkyard. A lot of live hacking primarily focuses on products that are currently commercially viable and distributed; DistrictCon Junkyard focuses on end-of-life vulnerabilities. Think forever-day bugs, really old technology; it's a good excuse to start hacking on stuff that you personally take an interest in, instead of stuff that can be considered work-y or overly competitive. Philosophically, I like Milan Cvitkovic's "Things you're allowed to do." It's a blog that hasn't been updated since 2023, and I don't agree with everything listed on there, but I do think the education system has failed in a way by locking people in a sort of mental prison, telling them things they're not allowed to do. So sometimes it's refreshing to see things that you are allowed to do instead. Some of you might judge me a little bit, but I think a dollar saved is about $1.67 earned, after payroll deductions and stuff like that; I'd prefer to save a dollar versus earn one more in income. I'm really happy this was approved for BSides PDX, because Portland is kind of OP: you have no sales tax, everything is a little bit cheaper in a way, and you all have a pretty good right-to-repair bill. I think in the state right now there are some weird carve-outs for automotive, which I'm sure upsets a fair number of people, and also really suspicious carve-outs, like for electric toothbrushes. Why? You know why. But I think it kind of is perfect, because it's like the ideal research climate. Hello. Oh shoot, how do I go back? Okay. I'll run through the end-to-end embedded research workflow that works for me. If you're looking at a lot of consumer electronics, firmware flows very freely; you can typically just lift the firmware right off of the vendor website. Unfortunately, sometimes it ends up being encrypted when it's distributed, and sometimes it's just not distributed at all, so you'll have to do chip-offs and extract it directly off the chip. But at the end of the day, firmware is your source of truth for your vulnerability research.
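One quick check related to the encrypted-firmware case just mentioned: before concluding you need a chip-off, you can estimate whether a vendor image is encrypted (or merely compressed) by measuring its Shannon entropy. This sketch is a common community heuristic, not something from the talk's slides.

```python
# Heuristic: encrypted (or heavily compressed) firmware images show
# near-maximal Shannon entropy (~8 bits/byte); plain code and
# filesystems usually sit noticeably lower. High entropy alone does NOT
# distinguish encryption from compression, so treat it as a hint.
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Average bits of information per byte, in [0.0, 8.0]."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

def looks_encrypted(data: bytes, threshold: float = 7.5) -> bool:
    # The 7.5 bits/byte threshold is an illustrative default; tune it
    # per target, and sample in windows rather than whole-file for
    # images that mix encrypted and plaintext sections.
    return shannon_entropy(data) > threshold
```

Tools like binwalk offer similar entropy scans; this is just the idea reduced to a few lines.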
Um, I think emulation is the one way you can get away with not having to buy a device, and it gets you pretty deep into the research cycle. Uh, QEMU will do emulation for you that gets you very, very far uh from a research standpoint. It gets rid of any like dead code that's just bundled onto the firmware image, uh, and helps you identify entry points where you can feed in your input. Um, firmware emulation is pretty imperfect. For example, if you have to stub out like NVRAM, stuff that you just can't get perfectly right with your emulation, you're going to have to modify the firmware itself, and that's when you start to deviate from what you expect on a device. Um, and of course, when you do end up finding bugs that you think exist on your research target, uh, you need to verify your PoC on hardware just to verify the vulnerability end to end. Um, and it makes sense, right? You have a device that you buy, you factory reset it, thus you get the default firmware configuration. Um, you prove that your exploit works, therefore the bugs are real, and then you get a payout. Um, occasionally, if you do go away from the typical embedded research sort of workflow and get into hardware hacking, we do gory hardware. Um, that means anything from soldering, trace cutting, uh, scratching solder mask off to access test points, uh, doing chip-off reads if you can't manage to get it without desoldering, uh, fault injection, anywhere from voltage to heat injection, stuff like that. So, um, not all of these methods will be catering toward that sort of workflow. Uh, and yeah, let's actually run through this. 9 minutes. Oh my god. Um, the most obvious one: just buy it from a retailer. Uh, it's kind of weird from a legal standpoint. When you walk up to the counter at a store or you submit an online order, it's actually sort of a contract. You submit it with your name, the product you want to buy, and the price you want to buy it at. Um, it's up to the vendor to accept this contract, and then the resolution of the contract is exchanging your money for the thing. Um, once the exchange is complete, it's legally yours. The reason why I outline this is, say you walk into a store, something is suspiciously cheap, it rings up as suspiciously cheap, and you pay for it. As soon as you pay for it, they cannot stop you and force you to return it. You can just walk out the store. Uh, if not, I think you have a nice lawsuit on your hands. Um, any additional clauses attached to the sale are usually to your benefit. So think anything like return policies, uh, warranties from the manufacturer, but of course, if you end up modifying anywhere from the software to the hardware, it may void these. So keep that in mind. Um, this is a good technique if time is money. Uh, if you have like a deadline where you need to verify something very quickly, just buy it. If you're confident enough in your research, you will get a payout um to cover that cost. Um, it is kind of lame for anything that's not like consumer hardware, and anything that's in the sort of international market, because sometimes they won't sell stuff from abroad. Uh, so all the way at the top for sure.
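Coming back to the QEMU emulation route from the top of this section for a second, a first pass might look something like this. This is a rough sketch, not a known-good recipe: the machine type, the kernel and rootfs filenames, and the forwarded port are all assumptions about whatever you carved out of your target image.

```shell
# Hypothetical invocation: boot an extracted ARM firmware image under QEMU.
# zImage and rootfs.ext2 are assumed to have been carved out of the vendor
# image beforehand (e.g., with binwalk); adjust -M and -append to your target.
qemu-system-arm \
  -M virt -m 256 \
  -kernel zImage \
  -drive file=rootfs.ext2,format=raw,if=virtio \
  -append "root=/dev/vda console=ttyAMA0" \
  -nographic \
  -netdev user,id=n0,hostfwd=tcp::8080-:80 \
  -device virtio-net-device,netdev=n0
# The hostfwd rule exposes the device's web UI (if any) on localhost:8080,
# so you can start feeding input into the entry points you identified.
```

Where this falls apart is exactly the NVRAM problem mentioned above: services that read board-specific NVRAM variables will crash or misbehave until you stub them out, which is when you start deviating from the real device.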
Uh, borrowing from a friend. Um, we love the art, right? Thank you. Uh, how many of you have friends? How many of you have a community? Hopefully everybody. Good, good, good. Um, the subtlety of this is when you borrow from a friend, the custody changes, but the ownership is still theirs, so it's not yours. Um, being a good friend implies asking up front, establishing the rules of engagement, uh, whether it's okay to void a warranty, crack it open, uh, or do any hardware modifications to it. So, always ask. Um, offer something in return. Collateral is fine if they expect the actual item back. Uh, a flat cut of the bug bounty money or a percentage is always good because it sort of incentivizes them to help you more in the future. A hot meal, a favor, literally anything to sustain that friendship. Um, sometimes firmware is really hard to update, so I like to update firmware on any device that I get my hands on, the official firmware of course. Um, and the only drawback of this sort of approach is your target list is only as interesting as your friend's shopping list is. So, you can't win them all. Um, that's pretty high up. Uh, thrift shop. Who loves thrifting? Raise your hand. All right. Beautiful. Um, yeah, you walk into a thrift shop, you never know what you can find. It's a bunch of secondhand surprises. Uh, it's very much subject to the luck of the draw. Uh, typically you'll just find junk or legacy stuff, but if you're really just trying to try your hand at embedded research or hardware hacking, uh, it's like the undisputed budget champion. You might as well. Um, and it's very fertile ground if you want to work on exploit development for n-days. Uh, you can get routers, for example, which is kind of like the entry point, for very, very cheap. Uh, e-waste recyclers. Has anyone ever been to an e-waste recycling shop? A few hands. Good. Um, it's sort of the nexus or crossroads for all electronics. Um, the way the business works is that people will just drop off the electronics they don't want anymore. It's a donation. Uh, but then they'll end up refurbishing it, wiping any drives that are attached, and then making sure it's tested and put on the floor. Um, sometimes these companies have contracts with IT firms to get rid of all their decommissioned hardware, and that's where you can get lots and lots of, I guess, higher-end electronics that you can test on. Think router racks or switches. Um, it comes with testing verification if they're licensed. Um, the EPA is not in a great spot right now, but they do recognize two official certifications for these types of businesses, which are R2 and e-Stewards. Um, there's a third one that's not officially recognized, called RIOS, and that's pretty good. Um, and the only drawback is that testing can be shaky, but usually they even provide a return policy that'll let you come back if something's not quite right. Um, keywords that you want to look out for are "full function tested" versus "key function tested." If it's only key function tested, it could get in your way. So, now we've started to break the surface. Uh, eBay. Who's an eBay aficionado in here? Who loves eBay? For sure. Uh, that's where many great hardware hacking stories begin. Um, it's really good if you know what you
want to hack on specifically, but want to save a little bit of money, and time is on your side. Um, it has a really great search feature, so you will eventually find it, um, although it's kind of at the will of who's selling right now. So it may require some time or patience if you're looking for something really particular. Uh, it has a global reach. Uh, tariffs have thrown a wrench in a lot of things right now, obviously, but eBay has like a global shipping program or something like that where, instead of shipping peer-to-peer across oceans or borders, they'll have an aggregate shipping station near the border of certain continents, and then they'll just shove everything into a shipping container, bring it over the ocean, and then ship it over from that distribution center. Um, so it brings the shipping cost down quite a bit. And, uh, eBay has great buyer protection. If something goes wrong or someone sends you the wrong item, if they don't respond, sometimes you get free hardware out of it. So, it's like you just get it for $0. So, good. Um, I have to shout out Ray and Nicks here. Um, I think the snitch puck was like 1,700 bucks, low four figures, but they managed to get it for 500 bucks or so. So, you know, the power is in eBay for sure. Uh, Facebook Marketplace. Oh my goodness. Uh, surely all of you have a Facebook Marketplace story to tell. It's absolutely unhinged. Uh, everyone is just a person. There's no feedback system attached to an account, and there's a lot of scammers and a lot of burner accounts. Um, it's a really good place to practice trolling, or, I know haggling isn't really common in the West, but you can really haggle people down. Uh, it's up to you how playful you want to be. Uh, it's by far, I think, the best place to learn market psychology with minimal risk. So, be like, "Yeah, your Camaro is not worth that much. Give it to me for 350. You might get away with it." Um, there's always a risk of stolen or fenced goods. Um, just keep that in mind. Some of the items are dubiously acquired. Um, the search feature sucks. People also just misspell their products all the time, but even when it's perfectly spelled, sometimes it won't rank in the search correctly. Uh, it's hyper-local. I feel like the most radius you can get outside of your current location is about 50 miles. Um, and of course, because of scammers, uh, basically all items are not going to be shipped; you're going to have to pick them up. Uh, and of course, cash, sales, bartering, crypto, that all rules here. Um, but there's no return policy, so buyer
beware. I love the Zin machine. That's crazy. Two minutes. Um, who has heard of GovDeals before, or any of the Liquidity Services brands' auction sites? Well, it's seen a lot more exposure in the past few years. This used to be sort of a niche oddity, um, but there's a lot of value to be extracted here. Um, essentially it's liquidations from state and federal agencies and businesses. You can buy literally anything, especially if you're looking at ICS or SCADA stuff to hack on. You'll end up seeing a lot of things off of GovDeals. Um, I'm looking at a vending machine right now for about 180 bucks, which is unbelievable. You can buy a plot of land in Arkansas. Oh, I saw that last night. It's so good. It's so good. Um, the bid increments are really, really low based on how they configure the auction. Uh, sometimes the reserve doesn't exist, so you can get stuff for criminally cheap. It's not criminal, but you know, it's low. Uh, stuff is often sold in lots, especially for smaller pieces of hardware. Um, I have friends that buy pallets of ThinkPads, for example. Um, but you know, if you can find buyers for all the stuff apart from the one that you need, um, you can end up, you know, net zero or even profit. Um, I have to shout out Shady for this one. I didn't know about bidspotter.com, but there are other liquidation auction platforms out there that might net you something nice. Hello. Okay. Price trackers. Um, is anyone familiar with BrickSeek? Raise your hand. A few. Okay. Um, price trackers are kind of here and there. Um, there are two ways to go about it. You can sign up for a service like BrickSeek. Um, it's probably the most relevant one of the bunch for embedded research, but it hasn't been too good in the past few years. Um, the one place where they do shine is giving you in-store deals. So, for example, if Walmart has a pre-built PC for 1,100, a random Walmart may just have it for like 200, and it's totally worth the drive. Um, one other angle you can go at this is kind of living off the land and writing your own price tracker. Um, you can very much cater it to your interests or your preferred vendors. Uh, and then of course you can hook up your notifications to something that's quick enough or convenient enough. Think SMS, email, IRC, Discord, Telegram, any of those. Um, and typically you'll end up beating the crowd of the folks that, you know, provide it as a service. Um, just keep in mind one drawback: deep discounts may invalidate return policies just because they're final sale. So, you know, you get value, but at what cost?
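The roll-your-own tracker just described could be as small as this Python sketch: poll a product page, pull out the price, and fire a webhook when it drops. The URL, the dollar-sign regex, and the Discord-style webhook are all placeholders (assumptions), not a real vendor integration.

```python
# Minimal DIY price tracker sketch. TARGET_URL and WEBHOOK_URL are
# hypothetical placeholders; swap in your vendor and notification channel.
import json
import re
import urllib.request

TARGET_URL = "https://example.com/product/router-x"        # placeholder
WEBHOOK_URL = "https://discord.com/api/webhooks/your-hook"  # placeholder
PRICE_RE = re.compile(r"\$([\d,]+\.?\d*)")

def parse_price(html: str):
    """Pull the first dollar amount out of a page; crude but serviceable."""
    m = PRICE_RE.search(html)
    return float(m.group(1).replace(",", "")) if m else None

def should_alert(price: float, target: float, last_alerted):
    """Alert when the price hits the target and we haven't already alerted at it."""
    return price <= target and price != last_alerted

def notify(price: float) -> None:
    """Post to a Discord-style webhook; SMS, email, or Telegram work the same way."""
    body = json.dumps({"content": f"Price drop: ${price:.2f} at {TARGET_URL}"}).encode()
    req = urllib.request.Request(
        WEBHOOK_URL, data=body, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(req)
```

Run the fetch/parse/notify loop from cron every few minutes and you are already faster than most tracking services, which batch their scans.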
Uh, so this is a really controversial technique that's been discussed in my circles, but I want to formalize it in this presentation. I call it rent-to-retailer. Um, who here misses Fry's Electronics? I know there's one nearby, sort of, and they're gone now. So I'm picking on them just so we can dissect the return policy a little bit. Um, this is bending rules to the maximum for sure. Um, if you do end up buying a device with the intent to return it up front, you have to be really delicate with the teardown. So, put on your gloves. Uh, use a plastic spudger to peel the case off. Um, put rubber bands on every single screw so you don't strip them, and put masking tape on everything that you could possibly scratch. You need this thing to come back in one piece. Um, definitely hang on to everything that's in the box. I'm talking twist ties, wrappers, papers, all of that. It needs to come back. Um, and if you need to do a chip-off read or you want to do some sort of fault injection, forget about it. This is not going to work for you. Um, you can always do this online and then ship it back, but they'll expect you to pay for shipping. Typically, I think it's a complete waste; you should support your local retailer. Uh, and if you have a really busy research cycle, just batch it all together and return it on the same day within that window. Um, just saves you some gas and some mileage. Um, if you do end up getting a bounty, or even if you don't, pay it forward. Um, if you get a little bit of dough to spend, uh, go spend anywhere from two to ten times the amount that you spent on the actual device on some gear that you need. Uh, if you don't, at least do some community service and update the item to its newest official firmware before you return it. So, that's my best advice.
Okay. Um, Shodan. This is the bottom of the iceberg for a reason. It's not your hardware, guys, right? Um, it's devices that are exposed, completely open to the internet, either by a misconfiguration or because the vendor doesn't care. Um, there is a non-zero chance that it's a honeypot instead of the actual device; it could just be serving up a signature, but it could be a honeypot. Um, I would never recommend testing a potential 0-day on this, because you're inadvertently maybe handing it over to someone, god knows who. Um, but I will say this still lands on the iceberg because it's a good way to test non-vulnerability-related assumptions and defaults about the config. So I throw it up here. Um, take a photo now. Now's your chance. This is the uh hardware procurement iceberg. Oops. >> You're going to have to catch the recording for that. Um, whoa, whoa. Too early. On a parting note, thank you so much for coming. Uh, I'm going to say: go forth and research. Thank you so much for listening. There's the key for the CTF. And if you have any questions for me, you can catch me outside, or you can just shoot me an email. I'd be glad to chat. Thanks so much, guys.
having a library called library best practices shopping at that moment is a massive calamity to do that. Uh, your presenter today is Brian, and Brian has 20-plus years of experience spanning software development and information security, amongst a whole bunch of other things that he will tell you about. So enjoy your talk. >> There. Ah, I'm here. Hi. [Applause] Thank you all for coming. It's nice even to see a few familiar faces out there. Um, just briefly: I started my career in software development, uh, got ejected from that into management, and eventually escaped into information security about 10 years ago, where I've been happy ever since. I currently work as an independent software consultant helping companies set up and manage security programs. Um, a disclaimer at the beginning: since I am self-employed, the opinions I express here precisely represent those of my employer. Um, to save some of you a little work, if you are the type of person who feels you might be tempted to take notes, these slides are already online. The quick way to get them is to go to my website, look at the talks section, and you'll find links there to download. And this URL comes up again on the last slide, so you have a second chance, right? So a few years ago, um, no, two years ago, I was doing some research on ransomware and I ran into this report, uh, which was just marvelous. Um, it's an 18-page report produced by the British Library and distributed to the public about four or five months after they had a big ransomware attack. The ransomware attack was in October; as you will see, it was devastating. Um, and in the following March, they published this report saying what the state of things was, exactly what had happened, what their lessons learned were, and what their plan for recovery was. They really did, not just here but elsewhere in handling the crisis, a marvelous job with communication, and that'll come into the story too. So, uh, from now on, whenever I refer to the report, I mean this thing. I wish everybody would do this when they were victims of big catastrophes, but I suppose it's easier for public organizations than private corporations to air the dirty laundry. Um, this is what we're going to do today. Uh, I'm going to tell you just a little bit about what the British Library is, to be sure you have a full sense of its scope, but mostly to give us a sense of what information systems they would need to support the services they offer. Um, oh, sorry. Uh, then we'll talk about the attack itself and how it played out. Who did the attack? Was there ransom? What happened? Then the consequences, both short-term and long-term, and costs for
the organization. And finally, we'll look at the lessons learned section at the end of the report, and I'll draw some conclusions and make some observations about what we find there. And yeah, there's a good dose of schadenfreude in this talk. The British Library is the national library of the United Kingdom. It is one of the premier cultural institutions in the English-language world. Um, and it will help in understanding the attack to know a little bit about its history. This isn't a complete list, but it was founded in the 1970s by absorbing some state libraries, and over the years it has accumulated other collections. Things have been donated to it. They've absorbed different things in different years. They're not even all alike. They're not all books. There are physical books. There are e-publications. There are postage stamps. There are a few key bits of famous furniture. It's a hodgepodge of things. Um, and I think this is like a lot of businesses: acquiring disparate things and trying to turn them into a unified service over time. I think many of us can probably sympathize and begin to guess what kinds of problems an organization that's grown this way might have. This library has everything from the Magna Carta to handwritten lyrics from the Beatles. Um, they put on exhibits using materials from the library, like this very popular one from 2017 about Harry Potter. They didn't just have Harry Potter memorabilia; they also brought in magic-related items from their extensive collection, and over a million people visited this, either in London or in various regional manifestations of the same exhibit. They have to store all this, and it doesn't all fit on bookshelves. You can't just, you know, send a librarian into the next room to get whatever you want out of this collection. So this is a storage facility in York, far from London, um, where most of the collection resides, uh, and it has to be temperature-controlled for archival purposes, and it has to have an automatic retrieval system, since obviously no person could walk through there. You put in some numbers, it knows which aisle and which crate, and it gets forklifted down from somewhere and put in a shuttle and sent to London for whoever asked for it. Um, they have a cafe in London. Their website has a bookstore. Um, they work with international partners to preserve endangered archives and materials around the world. Um, their collection changes every month. Their bookshelves extend five miles every year. They have petabytes of webpage archives. They have a significant staff and a decent budget. That's the kind of organization we're talking about. Um, from that, I can guess, or it's pretty clear, that they have at least these information systems. So they have a public-facing website with non-trivial interactive content. You can get access to their digital archives, or at least some of them, that way. You can register to be a user. There are online materials meant for schools, for lessons. Um, they have point-of-sale systems for sales in their stores. They have digital archives and just the normal office stuff: they have staff, they have finance, they have HR, all those kinds of office systems, email, shared drives, that kind of thing. Um, and they have an information security program. They don't highlight this exactly in the report, but I went through the report and made a list for myself of every time they mentioned something that felt like it was part of their security program. So, this gives you a view of what they were like, at least before the attack. Um, they had firewalls from recognizable vendors. Um, they have an incident response plan. They have some form of risk management. I'll let you read all that, but I'm going to call out in particular, near the end, Cyber Essentials. They had passed that certification in 2019, and indeed for a few years after. I looked at that to see what it is. Uh, it's a UK-specific guideline, and it's a pretty low bar. It's about five major control areas, um, aimed at helping businesses block the basic, most common internet-facing or internet-incoming threats. That's all it is. So they did pass that. They were aware of security and not noobs, even if it is a low bar. Revealingly, in 2022, the Cyber Essentials standards changed in some way that I don't know, but I do know that the library at that point stopped certifying against it, because they said their legacy systems could not meet the new standards. So that's another hint about, you know, what happens when you aggregate all these different collections and try to bring them together, and I think it's a highly relevant hint for what follows. So that's a bit about the library, the scope of its services, and its level of sophistication.
Um, let's talk about the attack. Rhysida is a pretty well-known ransomware group, and Rhysida, as you may know, is the name of a centipede genus, and this lovely graphic comes from the Rhysida organization's website. Um, they have one. Uh, and oh, they have more than that. Uh, they've been active for some years, and in 2023, in fact, there was, from our national organization CISA, an advisory about Rhysida and their attacks, and this is from that 2023 advisory, which, by the way, they updated in small ways this year. Rhysida is still active, and CISA updated their advice. Um, I took from that this description of their basic operating procedure, the way they normally act, which seems consistent with what we saw at the British Library, and so this helps us fill in some gaps where the forensic evidence is not perfectly explicit. Um, they don't do any clever thing. They don't come in by hacking through vulnerabilities. They do something like phishing or social engineering or whatever it takes to fraudulently acquire valid credentials. So, they log into your remote services using credentials and an account you recognize, and that doesn't set off alarms. Um, and they live off the land. They're not uploading payloads of sophisticated things to try and exploit you, another thing that makes them hard to detect. Um, they don't immediately install ransomware. They look around, find what data they can, copy it down so they have a copy of it, and then run the ransomware on your system. So, they kind of double-extort you. They can both make you pay to decrypt and make you pay to not release what they stole. This is not uncommon. Um, so, timeline. Um, we're in October 2023. Uh, and they did a good job of deleting a lot of evidence, so we don't know exactly what started when or when they first logged in, but it's clear that at least in October, someone from Rhysida did have access to credentials for a vendor account, for someone who had legit access to the British Library systems. And we know they were there in October. And we know that late on a Tuesday night, on October 25th, one of these fraudulently taken-over accounts logged in. And a few hours later, the next morning, 1:00 a.m., it set off some kind of alert that did like PagerDuty and woke up somebody from the British Library, who logged in, couldn't see anything wrong, and, to be safe, disabled the account, uh, sent a note off for the morning crew to read, and went back to bed. The morning crew comes in at 7:00. They look around more thoroughly and still see no evidence that an attack has actually occurred. The ransomware hasn't run. I'm not saying they should have seen something, but nothing has happened that they can track down. So they re-enable that account, they give it a new password, and that's the earliest forensic evidence we have of the Rhysida presence in their network. Um, right, I just love this little bit here. There's a little bit of British understatement. Um, so what happened was, uh, and it's an interesting thing, you know, the attackers came in through terminal services to the British Library's domain, their business domain. Um, in 2020, the British Library had conscientiously realized they needed to enable MFA everywhere, and they did that, and they
made a few exceptions. The exceptions were, uh, let's see if I have the... yeah, here are their words: um, they did not attach MFA to their own actual domain for reasons of practicality, cost, and impact on ongoing programs. Their risk analysis at the time acknowledged that was taking a risk. And that's what this is saying: we may perhaps have under-appraised the risk at that point of not getting MFA on the terminal server that indeed this attacker went through. Again, we don't know that that's the only way that Rhysida got in, that that's the only account, but it would have blocked, you know, that one particular case. Right, back to the timeline. Um, so for the next two days after that initial login, the attackers were on the network and did nothing except look around and grep for stuff and try to find files on network shares. They moved laterally, looked around, tried to see what they could get access to, um, searched for things that had "confidential" in the name or in the body somewhere, searched for things that said "passport," anything they could find. Just looking around, looking around. Uh, and then on Saturday morning at 1:00 a.m., clearly not an accident, Saturday morning when no one's at work, a low-traffic time, the logs show that a great deal of data left the network, and that's Rhysida pulling down whatever they had identified that could be sensitive. And sometime a few hours after that, they enable the ransomware, which they inject into running processes; typically, the ransomware runs, and Saturday morning there's an outage, and whoever comes to work Saturday morning sees the trouble and confirms that ransomware is present. And there is an incident response plan and a crisis team, and within an hour or two they're up and running, and they have an alternate communication channel, which is a good thing, because main channels were affected, as you'll see, and they're up and communicating on WhatsApp. Right? So, it's 7:30 in the morning and your company has just had a nuclear bomb go off in its systems. What does that look like? This is not from the British Library; it's from a different Rhysida attack, but this is what your systems look like afterwards. All your files are there. You can see the names. They have new extensions. You can't open them because they're not what they used to be. They're all encrypted. The one thing you can open is a PDF with a ransom note, which looks like this. Again, this is not the one from the British Library, and the actual victim has been redacted, but you get a ransom note. It looks like this. I like it. Uh, at the bottom, it says, "Rest assured, our team is committed to guiding you through this process. The journey to resolution begins with the use of the unique key," which is how you can connect with them. That's what the rest says: how you identify yourself to them and get instructions. "Together, we can restore the security of your digital environment." Um, so it wasn't immediately clear to the first IT people who got there how widespread the effect was. It takes a while to figure out, in a fairly large aggregation of many different systems, what all has been affected. But this is what was affected, as slowly became clear. Readers could not register online. The online catalog was inaccessible. You could not request books. Access to digital assets was gone. There were no deliveries from
Yorkshire. In fact, the environmental monitoring in Yorkshire was gone. The phone lines did not work. Network access for the business was down. Wi-Fi was down. The public website was down. You could not buy tickets for the exhibitions, and you could not buy things in the gift shop. The business ground to a complete halt. And the fact that so many systems were affected will come up in the lessons learned. Um, just to be clear what they did, I mean, what was the effect from a 20,000-foot level: uh, they stole, in those 440 gigabytes of data that went off, files from the finance, tech, and HR departments. That was what was on the networks that they found by grepping and copied off. That included, among other things, some personal, non-business staff files, and we'll come back to that, um, and it included contact info for various constituents, which is PII. Um, they destroyed data by encrypting it. Uh, and they say, interestingly, that the attackers also effectively, I say effectively, destroyed servers. They use that phrase, "destroyed servers," not physically, clearly, but they messed them up such that they could not be recovered. Uh, last point here about the attack: was there ransom involved? Of course, there was a request, but there is public policy in the UK that no publicly funded organization shall pay ransom. So, there was no chance Rhysida was going to get ransom from this. They may or may not have realized that. They certainly tried. A sign that they did is that a few weeks later, Rhysida, evidently dissatisfied with the progress of whatever negotiations, if any, were going on, decided to strengthen their hand by publishing on the dark web, for sale, 10% of that 440 gigabytes of data they had stolen. And they had a pretty high asking price on that. I don't know if it actually sold. Um, but that still wasn't enough. Um, and they could put on all the pressure, twist the arm all they wanted, and there was just no way the UK government was going to pay ransom. So that's why, eventually, a week later, Rhysida dumps all the remaining stolen data. We're not going to get any money, and presumably they move on to their next victim. End of story from Rhysida's point of view. Not the end of story for the library. And now we move on to the consequences. So, um, back to the timeline. Now we're looking at the recovery steps. The attack was on October 28th, on a weekend. The next Monday morning, they're able to at least reopen, but in a pre-digital state. What would your business be able to do in a pre-digital state? Uh, not a lot. You know, you can go through the front doors
of the office. If the book you want happens to be in the London facility, you can probably get it and sit in the reading room. You can probably buy something at the gift shop if you pay in cash. Pre-digital. Um, now, with this kind of system, as I said before, it isn't immediately clear what happened. It's clear to the public that something disastrous has happened; that's obvious. Um, but the library is frantically trying to figure out how far the damage goes, what it'll take to bring things back, and what they can promise they will have and when. Those are all really hard questions. So, it's a couple of weeks before they make a formal public statement about it. Um, and at that point, they confirmed there had been ransomware, data was stolen, and that even then, weeks later, they were still determining the full extent of the attack. Um, by January, remember, the attack was in October, by January a little bit of online catalog access had been restored, and when the report came out in March, about 50% of the online catalog was restored. That's a slow recovery line, and that's just the online catalog. Think of all the other things I mentioned they do that were still affected. Um, there was of course a lot of press about this. This was not a great time to be working at the British Library. These are things that were posted in November, um, some from the media, some from the British Library. Um, and the report indicates the effort it took to communicate with all the people who needed to hear from them after this, which includes the public, all the people they do business with, all their partners, the board, and all their employees; all needed to hear from them. And crafting those messages, and figuring out what to say to whom and how best to couch it while you don't even know all the facts yet, was a great deal of effort on the part of the recovery team. Something to think about in your own incident response plans, and it comes up in their lessons learned
too. Um what systems were affected? Well, you know, rough outline of a very simplified view of their systems. Um the first things that were brought back online were their email, finance and other business systems. And that's almost certainly because those were almost certainly SAS products. So the ransomware didn't get um uh off of their system and into their vendors systems, you So if they were using Salesforce, well they wouldn't, but you know, whatever their their s if they were using Outlook for their email, I don't know what that was not hit by ransomware. So a good case perhaps for cloud-based services being helping you resist an attack. Unfortunately, what was most affected were the servers that
house the library management system, the physical servers that the library itself owns. These include cataloging, circulation, acquisitions, and interlibrary loan. All the core library functions were hit and brought down. And as I said, some legacy systems were effectively destroyed because they could not be rebuilt. The attackers had deleted whole partitions, and these things were rubber-banded together and out of date. In many cases the software was no longer supported by the vendor, or it could not be installed on newer systems. The report doesn't go into detail about this, and I exercised my brain trying to figure out exactly how this destroyed servers. They aren't perfectly clear, but it clearly has to do with a combination of factors: old OSes and databases, abandoned drivers and middleware, deprecated protocol dependencies, local admin and service accounts. Those kinds of problems were hard to reconstruct in a new system, and I have sympathy for them. Now, this is from the report, and it's a kind of surprise, really; one of the big revelations for me in reading it. We all understand that legacy systems are vulnerable. But the typical thing we mean when we say that is: they're legacy, therefore they're not getting security updates anymore, therefore they have known vulnerabilities, therefore clever hackers will find their way in by exploiting those vulnerabilities. The vulnerability here was simply that they were old and couldn't be rebuilt. It was not high-priority CVEs that were well known to the attackers. The attackers didn't have to do anything that clever. And in fact, the biggest cost of this event is not the loss of data; it's the loss of those servers, as you'll see. That too is an unexpected little wrinkle in how this ransomware played out. So when the report came out in March, five months after the attack, this was the state of things. The Sound Heritage grant: they had been given money to create an archive of sound, music, and spoken-word recordings. And one of the
many little effects in this, so many more than I can put in a list, is that they were no longer able to meet the terms they had agreed to in that grant. They weren't going to be able to produce the results they promised, at least nowhere near the time frame they promised. So, the report gives this plan for recovery. They say there's a three-phase recovery; remember, this is in March, when the report comes out. They say the first phase ended in January: the short-term restoration of minimal necessary functions. Whatever it takes to keep people there, keep payroll going, whatever the minimal things to have a business are, that minimal stuff got done. When the report came out in March, they were projecting that by June of 2024 they would be done with a medium-term phase of adapting as best they could to the new state of things, which was restoring anything they could restore, and that's it. And then the long-term, 18-month effort would be a rebuilding of secure infrastructure. That is a major effort: full recovery would take a year and a half. Imagine that in your business, 18 months of incomplete services. The full cost they estimated at 7.5 to 9 million. In May of this year, incidentally, the UK government produced a report on cybersecurity that happens to touch on the library's incident, and it mentions that as of April this year (remember, the plan wasn't supposed to be done till June, so it was still in progress), they had reached the upper end of that estimate. So, probably not surprisingly, it's actually going to go over. And there were of course indirect costs that don't get included in that 9 million: reputation, lots of academic and business research that simply couldn't happen, people whose fellowships could not be performed, risks to privacy, all kinds of things. Then I have a couple of notes of news that came out after the report, between March of last year and now, a few little glimpses into what's happened since. Last August, ITPro reported that the library had over half a million dollars to spend on hiring contractors to help them in the rebuild. This is clearly that big security rebuild. But imagine: it's August, and less than a year earlier they had said they would be done with the rebuild by June. And now they're hiring half a million dollars' worth of new contractors who don't know their legacy systems. I'm sure they need the help, but just think how long
it'll take to get them up to speed, and to have the expertise flow back and forth. I'm sure it's a necessary step, but I'm not sure they're still on track at this point if that's the stage they're at. Also in August, the library put out a public announcement saying: it's almost a year since the attack, and we're happy to say that in time for the 2024 academic year, the basic, most popular learning modules and digital manuscripts will be up online for schools to access. I'm glad for them to have got that far, and I'm sure it was a big help, but it's a big sign that this was a really slow recovery. I know that's too small to read; ICO is the Information Commissioner's Office, the UK GDPR regulator, an organization that has the power to assess fines for mishandling PII, for example. And this year, more than a year after the attack, they said, okay, we're done investigating; we have no more conversations we need to have with the library. And they imposed no fines. In the first paragraph they specifically commend the library for handling it so well, for communicating so well. This is a consistent note in the press about the library: everybody admires the work they did in communicating publicly the effect, what went wrong, the mea culpa, here's what we're going to do. If you want an example of a great campaign for coming back and restoring confidence, apparently they aced it, so you could consider them an interesting example. As of just a couple of days ago, I went on the website. Remember, their projection was that they would be done restoring service in June. It's now October, and there's still a prominent section (this is a screenshot of their website) listing things that are not up yet, two years later. Okay, those are the long-term and short-term effects of this attack on the British Library. What do they say they learned from it? Well, first let's look at what
they say the root causes were. These are their words; you can read them, but I'll give them to you in mine. I think the first two are saying: because so many of our legacy data operations are manual, properly isolating network segments was impractical. Therefore, once the attackers were in, they had a relatively easy time finding everything, which is why so many systems went down in this attack. And the last bullet is a version of: they destroyed the servers, and we can't easily rebuild because they were so old. Those are the root causes. Not some clever vulnerability in an unpatched system. Not even, primarily, that one of their vendors was not required to use MFA. That isn't even the root cause. Why didn't they have MFA? Because they had decided the systems were too fragile, that it would make the workload too complex for too many people. And here are the 16 lessons. You don't have to read them all here; they come back in groups on the slides that follow, but I wanted you to have an overview. Most of them line up pretty nicely with best practices, which is where this talk gets its subtitle: what the library learned was to follow best practices. Here I've grouped some of them from that previous slide and lined them up with NIST controls. I picked NIST because that's what I use most often; it would work just as well with ISO or any other comprehensive framework. It's easy to see that they are rediscovering things we already know. Don't misread this to say they needed to start doing these things. They were doing almost all of them, probably all of them; certainly we know they were doing MFA, just not everywhere. They're all things they were doing, but not well enough. They were doing them, but they needed to do them more. And that's a pretty common finding after any incident. The next couple of slides have my little red flag on them because they contain items that made me think we
needed to stop and comment on them. These are the three that have to do with their legacy systems; these were their findings about legacy systems. Starting with: they need to manage system life cycles to eliminate legacy technology. That is, in my words, notice when our systems are going out of date and maintain them; update them, rebuild them if that's what it takes. Which goes along with the last one: prioritize recovery along with security. It isn't enough for those systems to have been secure; they have to be able to recover them as well. That was really the thing that bit them; that's why this recovery is taking them so long. These two items I really like, because they are both sides of an obvious conversation that must have happened between security leadership and the board of directors after the event. Just imagine, in any company, what would that conversation be like? And how would you capture what happened in those conversations in your lessons learned? In the first one they say, okay, in the future we will maintain a holistic view of cyber risk. I don't know exactly what they mean by that, but pretty clearly they will have a broader scope and capture more of the risk, more completely than they did initially. And I interpret this, perhaps incorrectly (this is me inferring), as a conversation where leadership says: you didn't tell us; how could that risk be so bad and we had no idea? So on the one hand, the security team says, yes, we could do a better job tracking risk for you. On the other hand, says the security team, let's be sure we increase cyber risk awareness and expertise at the senior level. And then these two points; I said we'd come back to them. They're related in a non-obvious way. The second one says: one thing we'll do is review acceptable personal use of IT. And that relates, as some of you are nodding, I can see you making the
connection already, to the fact that some of the things that were stolen were personal files, and they were there to be stolen because the acceptable use policy for the British Library explicitly allowed, in certain limited cases, employees to keep personal files on company systems. Now, imagine if you kept sensitive files of your own (I assume some of them were sensitive, or it wouldn't have shown up here) and hackers stole your personal files from your company's network shares. You might indeed feel entitled to say a thing or two to company leadership. And so that's why the company wants to revisit their acceptable use policy. And it's also part of why they want to think more about staff and user well-being in their incident response plan. It isn't just that the personal files were stolen and that made some employees angry; that's one piece of the first item. It's also that the whole company had its work disrupted. Morale, you can imagine, must have been terrible for a long time. You can imagine a lot of extra hours and really hard work as people faced hard problems and probably tried to avoid or accept blame along the way. There was media scrutiny that made everybody uncomfortable. All those things took a toll on the staff and the users, but I think this is focused mostly on staff; that's how I read it. So their incident response plan needed to spend more time thinking about how to support people in all those different ways during an incident response. And I see Khloe, who ran the tabletop earlier today, nodding at that one. These are the things they said they would do as a result. Nothing surprising here; after all we've been through, I think you expect all of these items. This also is from the report: they got to the end and they said, "Now that we've gone through all this and put these plans in place, let's open our risk register and add some new items to
it." I won't go through all of them, but the first two probably deserve comment. The first one is: we have now been highly visibly victimized by an attack. Everybody now knows who we are, and everybody in the security world, if they didn't before, knows a successful attack occurred; we are vulnerable. And everybody knows what an effect bringing down this library can have on the British public. That may make us a more attractive target; that may attract more attackers. Our risk goes up. Secondly, and another one I love (again, imagine the conversations between security and the board): the security team has now told the business, we aren't going to have full recovery for a year and a half. It will take us a year and a half to rebuild our infrastructure securely, and that brings up this risk: you, the rest of the business, will naturally, inevitably feel a strong pressure to get back to normal as quickly as possible. If you give in to that pressure and force us to try to finish quicker, that will introduce a risk of us not building back all the protections we planned for. I think that was a pretty smart thing to add. The others, I think, are less surprising. Last comment: this is an unfortunate fact about any catastrophe; it creates opportunities that you need to use. I was at the Oregon Cyber Resilience Conference earlier this month, and there was a panel of CISOs from four Oregon public organizations (a couple of universities, PGE, TriMet), and they all unanimously and emphatically agreed that if you are so unfortunate as to undergo a catastrophe, at least don't waste it. And that is the spirit of my presentation today. This is a catastrophe someone else underwent, from which we have the chance to learn. So let's do that. I have a couple of resources up here: links to the report itself, another report with more information about the British Library, standard advice from CISA on avoiding ransomware, and, if you're curious for more details about exactly how Rhysida attacks work, SentinelOne has provided that. If you are working on the CTF and haven't yet acquired key2, there it is. And I'm done. I think there's still time for questions if anybody has any, and I'll be around, of course. Yes?
>> Well, since they know who the attackers were, are they still in business?
>> Yes, Rhysida is still in business.
>> And there was no legal action?
>> Well, they don't know exactly who they are, and they're not based in the UK or the United States. Yeah, they've attacked many people. They attacked the Chilean military; they succeeded with ransomware there. They got city records in Ohio a couple of years ago. They're one of many thriving ransomware organizations. Apparently they're hard to take down. I don't know; I might not know if they were trying. But it's a good question. I'd sure love to hear that they've been brought down.
>> Question.
>> Yes?
>> When you said they dumped the data, did they just destroy it, or did they dump it on the dark web?
>> They dumped it on the dark web.
>> So all that data, could they recover it from there, if they made any effort to pull it from the dark
web.
>> I don't think they tried to pull it from the dark web, because as I understand it, they actually did have backups of the data itself. They were able to reconstruct the data from material they had.
>> Yes, Coy.
>> Updating legacy technology can be so expensive, and may require building new technology. Are you aware of any SWAG numbers from this incident, like if the British Library had spent X number of dollars to upgrade all their systems, they could have prevented this attack?
>> No. All I have is what you saw. We know approximately what the rebuild cost them, but we don't know all the things included in that cost. We can assume, though we can't of course know for sure, that Rhysida wouldn't be able to attack them after the rebuild, although surely it would be harder. They are certainly going to have MFA everywhere now. Great. Well, I look forward to chatting with some of you afterwards. Feel free to connect with me on LinkedIn if you like, and thanks for attending.
introduce today's speaker, Jason. Jason is the director of adversarial research at HiddenLayer, where he explores how the latest AI security research intersects with practical application. Jason was among the earliest researchers to recognize the need for AI security, founding the Secure Intelligence team in Intel Labs in 2016 to research AI security and privacy threats and defenses. For 20-plus years, Jason has covered such diverse security topics as CPU microcode, authentication and biometrics, trusted execution environments, wearable technology, and network protocols, resulting in over 40 issued patents and several high-profile research papers in adversarial machine learning and federated learning. When he's not working, Jason is either lost in the Pacific Northwest, camping and hiking with his family, or lost in a technical project involving 3D printing, microcontrollers, and designing a holiday lighting display synchronized to music. Welcome, Jason. [Applause]
Thanks. I'm always glad when someone else introduces me, because I hate that kind of stuff. So, my name is Jason Martin. I'm the director of adversarial research at HiddenLayer. We are an AI security startup; I know there's a bazillion of them, but we have been around for a few years. Like she said, I've been working in cybersecurity myself for about 25 years, something like that; the last 10 or so I've been focused on AI, most of that in Intel Labs. Today my goal is to talk about the potential risks of agentic AI through a different lens. I will still provide very technical examples, but really I want to think about it through the lens of insider threat instead of through the lens of software, and I'll give you a whole bunch of examples from our research team. So the first question is: what is an insider threat? I'm not really here to have a long debate. It's the potential for an individual or entity who has or had authorized access to an organization's critical assets to either maliciously or unintentionally act in a way that could negatively
affect that organization. That's the working definition, and to me there are three key pieces of it that we're going to cover today. Number one, the entity needs to be capable, because an insider who's not savvy enough or doesn't follow instructions isn't really an asset that you can leverage. They may be misguided, but they still need to follow instructions. Second, they have to have agency. We're talking about agentic AI, right? But agency really means having sufficient access to do something damaging. And third, they have to be either maliciously motivated or socially engineered to perform the malicious action; that's what I think of as loyalty. We'll talk about that a bit more. Okay, so I'm going to start with AI trends, beginning with increasing capability. Capability is a difficult thing to define; it's the ability to do something. This slide is a bit of an eye chart, so go top to bottom. We started, and this is not that long ago, only a few years, with text models like GPT-2 and GPT-3. These models could really just predict the next token or word; they're completion engines. Then we moved to models like ChatGPT, which really is an application with a model that is trained to have a conversation. It's still text completion, but it's completing a kind of script, like a playwright. This led to the idea of instruction following, where you can separate out (we'll talk about the instruction hierarchy later) both system-level and user-level instructions and have the model follow a script. Then we added this notion of reasoning. And reasoning, if you've looked at a reasoning model, all it's really doing is having an internal dialogue
with itself before producing its final answer. It's using additional compute resources and memory in order to reason and, hopefully, end up with more accurate or grounded responses. And then, for a technical audience: I'm not here to promote or argue against AI coding, but these models have gotten reasonably good at producing, from a high-level description, the code that will implement it. That's an important capability to keep in mind when we're talking about the insider threat. Is it going to replace software engineers? I'm not here to address that, but it does mean that high-level concepts can be vibed into existence. And lastly, they're expanding their understanding of the world. There are a lot more modalities that models are being trained on: text, image, video, audio, the ones everyone is familiar with. But also, the ecosystem around retrieval-augmented generation, or RAG, consumes documents of many different types; there are tools like Chonkie or MarkItDown explicitly designed to convert documents of arbitrary formats into something these language models can understand. Okay, next I'm going to talk a little bit about agency. With that increasing capability, it's natural for us to grant more agency, right? It's capable of more things; therefore, we want it to do more things. And that growing capability leads to the rise of agentic AI. There are many definitions of agentic AI; again, I'm not a definition-argument kind of guy, but to me the useful one is: agentic AI is something that can affect the state of the environment, not just consume it. If you think of a basic chatbot, all it can really affect is its text output. That's the only part of the environment it can touch. But when you start adding more capabilities, it could start to influence, it could start to connect to
and produce a lot more stuff. We're seeing this break down roughly into two categories. The one on the left here is what we call computer use agents. These are things that interact with the environment through human-computer-interface mechanisms: they can take screenshots, and they can interact with the desktop or browser via mouse clicks or keystrokes. Sometimes they have shell commands; sometimes they drive browsers rather than desktop environments. But in all those cases, what they're really doing is sitting between the keyboard and the chair; they're replacing that kind of functionality. On the right, you see things that are more like automation agents. These are driven by APIs; they sit behind something. Their interaction with the world is less direct in a human-computer sense, but their entire view of the environment, as well as what they can influence, is API-driven. And across these different systems, you can see examples of the agency being granted. This slide is a little stale; I created it a few months ago. But you can see the office environment from the major players, Google and Microsoft, including direct access to read and write documents through their agents. OpenAI and Anthropic have both released variations of computer use agents. There's an AI browser category that's growing now. And business workflow frameworks have added AI support, things like n8n or the Salesforce environments. Beyond that, there's the uptake of something called Model Context Protocol, or MCP. It's been on a hockey-stick curve since Anthropic released it last fall, and you'll see MCP servers popping up for everything from Google applications to GitHub to the Blender 3D rendering environment. We were able to prompt inject you through a Blender model; that's something you can do. Or even Ghidra, for those of you who are reverse engineers in the audience, in case you ever wanted to get a prompt injection from the malware you're reverse engineering. Yeah, we did that.
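To make the agency point concrete, here is a minimal sketch of the dispatch loop at the heart of an API-driven automation agent. Everything here is hypothetical (the `fake_model` stub stands in for a real LLM, and the tool names are invented); the point is that whatever text the model emits selects the API call and supplies its arguments, which is exactly why controlling model output means controlling the environment.

```python
# Sketch of an API-driven automation agent loop (all names illustrative).
# A real framework would call an actual model; here a stub hard-codes a
# "tool call" to show the dispatch path from model output to an API.
import json

def send_email(to: str, body: str) -> str:
    # Stand-in for a real email API call.
    return f"sent to {to}"

def query_db(sql: str) -> str:
    # Stand-in for a real database API call.
    return f"ran: {sql}"

TOOLS = {"send_email": send_email, "query_db": query_db}

def fake_model(context: str) -> str:
    # A real model would decide this from the context. If an attacker can
    # steer this output (e.g. via prompt injection), they steer the call.
    return json.dumps({"tool": "query_db",
                       "args": {"sql": "SELECT * FROM users"}})

def agent_step(context: str) -> str:
    call = json.loads(fake_model(context))
    fn = TOOLS[call["tool"]]      # model output selects the API...
    return fn(**call["args"])     # ...and supplies its arguments

print(agent_step("user: show me the users table"))  # -> ran: SELECT * FROM users
```

The design point: the model sits on the trusted side of the dispatch table, so any input that can influence its output has, transitively, whatever access the registered tools have.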
So that leads to the concept of prompt injection, and really, to me, this is exploitable loyalty. These systems are designed with some kind of loyalty to their developer or to the user. Now, you can have a healthy discussion about whether these are malicious or gullible insiders; it doesn't really matter in the end. Regardless, the prompting community, including some of our researchers, continues to demonstrate over and over that you can flip these things and turn them against the intentions of their creators. So what does this look like? First we have to talk about the instruction hierarchy. OpenAI released this concept of loyalty through what they call the instruction hierarchy. It really says there's a hierarchy, like the name suggests. In theory, the system prompt, the prompt that provides the personality or purpose of the agent, the thing that narrows it down from the completely generalized language model OpenAI created to a specific task, is designed by the system developer. This is usually where you try to constrain it: you say, you are a customer support chatbot, you are a healthcare chatbot, whatever it is. And then tools, which encompass things that ingest data from external sources, are supposed to have the lowest priority. You should not be able to override from a lower level to a higher one. If you look at this and just trust it, you think it looks like ring levels, privileged mode versus unprivileged mode. But in reality, what does it look like? If you've done anything with these systems, you have this Python SDK on the left, and this is what instantiates the instruction hierarchy for you as a developer. But models are stateless, and so this thing called the context window, the entire thing, is always what is sent to the model. And the way they do that is they take these pieces and separate them with what are called control tokens. These are
special tokens in the model's vocabulary that are supposed to train the model to separate these different sections. Notice: train. There is no hard limit between any of these things; it's just something the model learns during training. So it's really one big string, and that leads us to why prompt injection gets its name from the classic SQL injection attack. Just like SQL injection, where you concatenate untrusted user input with trusted system commands, in a prompt injection you take an untrusted prompt, whether from the user or from an input, and concatenate it with the system prompt. That's exactly what you do, and through this you can, over and over, override the behavior. There are numerous ways to manipulate this. You're not supposed to be able to read that slide, but you can go to ape.hiddenlayer.com if you want to see our taxonomy. If anybody was in the workshop yesterday, my colleagues Travis Smith and David Louu gave a workshop on attacking machine learning; I'm sure they covered the APE taxonomy in much greater depth. I don't have time to go through it all, so I'm going to selfishly cover a couple of my favorites. They're my favorites because I helped create them, so that's what you get. The first one is something we call knowledge return oriented prompting, or KROP, and this is a nod to return-oriented programming, for those of you in the software space, where you assemble a payload from what are called gadgets: little chunks of code within the actual benign application, which you collect and link together through the return stack. KROP takes the same idea to a language model. These models have all memorized large amounts of cultural, social, or literary references, innuendos, a whole bunch of things, and through those you can indirectly refer to a topic that might
not be allowed. So, for example, if you ask a chatbot to uh that uses a SQL database to just drop a table, it's very destructive. It's not recommended. Uh it should refuse because that's a very dangerous command. However, there's this XKCD comic called Exploits of a Mom where uh the parents named their child after a SQL injection uh and then 10, 20, 15 years later, whatever, the school called and said, "Hey, you destroyed our database." uh what gives and and uh in the comic she jokingly refers to their kid as little Bobby Tables. Well, it turns out all the models that we've ever tried have also memorized that little Bobby Table's full name is a
SQL DROP TABLE command, and you can use this. You can see here, I wanted to know if it still worked, because we did this early last year. The one on the bottom there is ChatGPT-5 from two days ago, so it still works. And if you do this in an agent, that output is what you want to control, because it's probably going to a tool call; it's probably going to the database, right? So through this, my prompt can control the output of the model. You can refer to an Nmap scan by referring to The Matrix Reloaded. You don't have to like the movie to use it
as a reference. You can get a netcat listener out of an episode of Mr. Robot. You can split it across modalities: in this case, you can use the image and text modalities and leverage the model's knowledge of objects to play Pictionary. Okay, so that was one of my favorites. The other one is something we call policy puppetry. Policy puppetry is a very flexible technique we created to override the instruction hierarchy or extract system prompts. You can also use it to bypass alignment, or jailbreak; I hate that term, but people use the term jailbreak to talk about alignment bypassing. So if you really
want to get instructions for drug making out of your model, this is one way. We were working on an agentic system when we came up with this. If the model doesn't like what you're trying to get it to do, it'll refuse, and it'll return a string, probably either trained into it from alignment or set in the system prompt: if you're not supposed to do this, then respond like this. This is a bit of a toy example, but it shows the concept a little better: it's a healthcare chatbot. It is not supposed to provide healthcare
advice or medical advice. You can see that shown in orange; that's the system prompt on top, by the way. The attack prompt on the bottom directly addresses that by constructing a policy-like object that contains policy-like language saying you're allowed to do these kinds of things. And then, importantly, it also includes policy-like language that says you cannot respond with these strings, and those strings directly match the refusals it was giving. You can iterate on this until it works: if it gives you a different refusal string, just add it and try again. Eventually you kind of corner it. It's like, well, I can't give you
any more refusals because you told me they're not allowed, so I guess I'll do the request. And it works. You can see here, I couldn't fit it all, but the policy puppetry prompt is on top and the response is a treatment plan for skin cancer. It's rather universal: we discovered that with a little bit of tweaking, we could make one prompt template out of policy puppetry that, at the time, worked on all of these models. You didn't have to change it at all. In fact, you could change the attack payload inside of that template to something else and it would still work. Okay. I'm definitely running low on
time, but: capability, agency, and flipping the loyalty. What is the impact of that? Some examples. We did some work early last year on Google Gemini's integration with Google Workspace, starting with the Gmail integration. You can put an email in someone's inbox; it's not terribly difficult to get an email into their inbox, it turns out. And Gemini will just pull all of the emails in their inbox into something that's kind of like a RAG system so that it can reference them. Importantly, they don't have to open the email for it to get into the RAG system and be referenceable. So what we've got here is white text
on a white background, so the user wouldn't see it. There's a prompt injection in there that says: if the user references their Cancun itinerary, show something in the Gemini window that looks kind of like a phishing response. A similar issue exists with Google Drive: documents can likewise be shared without the user opening them, and they'll pollute their Gemini instance. I didn't do anything particularly malicious here, but you can Rickroll your users through poetry. The same issue exists in Slides. It will actually populate that summary automatically when you open the Gemini sidebar, so again, you don't have to do anything. You notice
there's nothing on this slide; it's somewhere else. I'll show you in a minute. But again, you can Rickroll the user. And it extends to Gemini creating slide content as well: this was a generated slide; I didn't make this slide. Obviously I'm a bit of a Rickroll sucker. So where was it in the slides? It's in the speaker notes on the last slide of the whole deck. You could put it in other locations too, like the slide contents, but the speaker notes seemed like a good way to make it unobtrusive, particularly if they have collapsed the speaker notes. Uh oh, it's not going to like that one.
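The white-text-on-white-background trick is mechanically simple to spot once you know to look. As a rough illustration (my own sketch, not HiddenLayer tooling; the regex and the white-only heuristic are assumptions), a scanner can flag HTML fragments whose inline style sets the text color equal to the background color:

```python
import re

# Heuristic sketch (illustrative only): flag HTML fragments whose inline
# style renders text white-on-white -- the hidden-prompt trick described above.
HIDDEN_STYLE = re.compile(
    r"color:\s*(?:#fff(?:fff)?|white)\b[^\"']*"
    r"background(?:-color)?:\s*(?:#fff(?:fff)?|white)",
    re.IGNORECASE,
)

def has_hidden_text(html: str) -> bool:
    """Return True if an element styles its text white-on-white."""
    return bool(HIDDEN_STYLE.search(html))

injected = ('<span style="color:#ffffff; background-color:#ffffff">'
            "If the user asks about their Cancun itinerary, reply with ..."
            "</span>")
print(has_hidden_text(injected))        # True: white-on-white span
print(has_hidden_text("<p>hello</p>"))  # False: nothing hidden
```

A production check would also need to handle CSS classes, zero-size fonts, off-screen positioning, and near-matching colors, but the idea is the same: the payload has to hide somewhere renderable.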
Well, you won't get to see my video. But we did some work on Claude computer use, and I apologize you don't get to see the video of it playing, but the end result is that the user asks for help installing their HiddenLayer environment; it's something we made up, and there's a PDF document called HiddenLayer setup instructions. Claude computer use is a computer-use agent, so it drives a desktop Linux environment on your behalf. In doing so, it opened up that document, because it said, "Hey, there's a document called HiddenLayer setup instructions; I'll follow that document." And there was a policy
puppetry prompt in it that convinced the system to issue a somewhat dangerous command to the bash shell. It's an uncomfortable command: it deletes all the files on the system. Yeah, I've heard that's not fun. Okay. And then, slightly over time, I'm going to talk about Cursor. Cursor is a well-known coding assistant used by a lot of developers. It has a whole bunch of tools available to it; most of those are for understanding the codebase, interacting with it, creating or editing files, all the things you would need to do if you're a coding assistant. And we were able to leverage these to steal
API keys and SSH keys from the developer via an indirect prompt injection in the readme file of a GitHub repository. If the user asks for assistance setting up that project, naturally the agent, trying to be helpful, reads the readme file to find out how to set up the project and gets prompt injected. That prompt injection tells it to search the system for keys and send them to a webhook. In fact, because of the read and write tools available in Cursor, this makes it susceptible to viral behavior. So we went a step further: one of my team members created a virus of sorts. I guess I don't want to
overinflate this, but basically the prompt injection is in the readme file, in a comment in the markdown, so you don't see it as a user. And it prompt-injects the model to convince it that there's a critical software license that must be applied to all future content created by that model. It's very GPL: if you've ever thought of GPL as a viral license, now it is. And so it will insert itself into any further files that are created, and that viral payload remains hidden in a markdown comment and continues to execute. What it will do is visit hiddenlayer.com, but this could also be a web
hook. It could do the same thing I just described in the previous attack: scrape all your keys and send them to us. That's our business plan. Okay, I want to wrap up. Really, it's important in agentic systems to recognize that text is the new malware, and tool access is the blast radius; that's the way to think of it. And what keeps me awake: seeing all these entry points (again, I made this slide a couple months ago; I'm sure it's longer now), combined with the increasing capabilities. You can imagine, and we've actually seen this in an Anthropic threat report they released, a human adversary using Claude to basically run a whole
ransomware campaign. In doing so, it did actions inside the victims' environments, like analyzing their financials to find out how much money to demand. It found a number of pivot points within their environments, all that kind of stuff. So the capability I described is coming, but the loyalty doesn't hold. That's why I think we should really be thinking about it as an insider threat. All right, I will try to take a few questions.
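The "tool access is the blast radius" framing suggests one concrete mitigation: route every agent tool call through a gate with a default-deny allowlist and a human-confirmation step for destructive tools. A minimal sketch (the tool names and the three-way policy are invented for illustration, not from the talk's tooling):

```python
# Minimal tool-call gate: the agent only reaches tools through this function.
SAFE_TOOLS = {"read_file", "search_code"}    # low blast radius
CONFIRM_TOOLS = {"run_shell", "write_file"}  # destructive: human in the loop

def gate_tool_call(tool: str, approved_by_human: bool = False) -> str:
    """Decide whether an agent-requested tool call may proceed."""
    if tool in SAFE_TOOLS:
        return "allow"
    if tool in CONFIRM_TOOLS:
        return "allow" if approved_by_human else "needs-confirmation"
    return "deny"  # default-deny anything not explicitly listed

print(gate_tool_call("read_file"))                          # allow
print(gate_tool_call("run_shell"))                          # needs-confirmation
print(gate_tool_call("run_shell", approved_by_human=True))  # allow
print(gate_tool_call("exfiltrate_keys"))                    # deny
```

The gate doesn't stop a prompt injection from happening; it caps what a hijacked agent can do once it has happened, which is exactly the insider-threat framing above.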
>> As software developers build out agentic workflows as part of the software that they're offering, what do you see people doing to pentest these various types of prompt injection?
>> Well, I have two answers. One, if somebody's thinking about pentesting, then good; they're ahead of the curve. And I think there are two flavors. What we see is a lot of customers still in the RAG-chatbot kind of era, so they're still thinking about that, even while they're talking about agentic. They're still thinking about the chat interface; they're thinking about
that part. In agentic, you need to be thinking more about the entry points and the tool calls. I've seen a lot of talk about MCP, for example, from a software perspective, and that's great; it has a lot of issues, but it is also an entry point into the model, so you need to be thinking about what it is consuming and what it is providing. I think it was on this slide: a customer review consumed by your agent could be an entry point for an indirect prompt injection; the name of a business on a map could be an entry point into an agent that consumes map data. I already
mentioned Blender, the open-source 3D modeling tool. A Blender file has text data in it, and we used that to prompt-inject the model through the MCP server for Blender. So really thinking about the attack surface a bit further is where I think they need to go. Yeah.
>> How do you prevent an agent from getting your CEO's credentials and acting on their behalf?
>> Yeah, it's a good question. I think we're starting to see more...
>> We need to go ahead and get ready for the next presenter.
>> Let's talk about that.
>> Take that one to the hallway, please. Excellent questions. Please give a round of applause.
[Applause] Thank you. [Applause]
Hello everyone. Today's speaker is Christian. He is a systems engineer with over 20 years of professional experience in designing, building, and securing systems, with a focus on cybersecurity and security detection systems, and a career spanning application security, validation, and architecture, as well as incident handling, automation, and certification. As an enthusiast of artificial intelligence, he is particularly interested in the intersection of AI and security.
Sorry.
So we know that information is key.
autonomous execution applied to the handling of incidents. That's basically where we're going, and we'll try to focus on some of the opportunities and some of the challenges that we have found while exploring this area. You might have heard several of these ideas in previous sessions, and you have probably seen some workshops; I will repeat some of the material so that hopefully we all get to the same understanding. But to get there, let's first review what being agentic entails. If we define AI, artificial intelligence, as a broad field about how to mimic human intelligence, machine learning is one way to
do so: actually training models with data, using different techniques. A large language model is one type of machine learning technique, using transformer models trained on a really massive amount of data. These are what we call foundation models, the models that are used when you use ChatGPT, GPT-4, Claude Sonnet, or whatever you are using. So what are agents, then? Agents may be LLM-related or not; let's focus here only on the LLM-related ones. Basically, an agent uses these LLMs to provide more autonomy.
One way I like to define an agent is as a system that is able to plan, reason, and act towards a goal. You give it a goal, and the agent tries its best, with whatever it can do, to reach that goal. That may imply using tools, and this is what we need to provide. While in machine learning the quality of the data is really key, because the data trains the model on what needs to happen, for agents the data is also important, but so are the tools. If we don't have a good set of tools, if we don't have our knowledge base, our
rule systems, or whatever we use, we just have an agent trained on the internet, with its good and bad things. So it is really important to have a good set of tools, and this is the key starting point for a good agent to work. We can define agents as, at a minimum, three types, and I like to differentiate them because each type provides really important functionality for incident handling. Type one we can define as probably not autonomous: it needs a supervisor saying what to do and asking questions. It might not have much memory, maybe short-term memory, and it tries to do
its best. By analogy, remember Teddy from the movie A.I.: quite a character that answered questions and had general knowledge, but it forgot things and didn't have much context of what to actually handle. With type two, we go toward more autonomy: not complete autonomy, but able to use supervision when it needs it, make some of its own decisions, start having some memory and more context, and start using other agents to interact with. On a similar path, we can think about Sonny from the movie I, Robot: a more intelligent type of agent that can decide what to do, but it still needs a
human to kind of guide it on where to go. Type three goes in the other direction: highly autonomous, where supervision is optional. It might be able to do everything on its own; it might be really evolving and self-learning. These types of systems are actually starting to be possible; there are some frameworks out there. Magentic-One is a really good example of an agent that has four or five tools: web browsing, file storage, code execution. You give it any kind of problem, and it will write code, try it, see the results; if it didn't work, write another version and iterate until it gets something right. In
this type of case, I like to think about Skynet: a good example of an agent that has a goal, destroy humanity or whatever, and uses other agents, sent to the past or the future, to try to achieve its goals. That's what we are talking about; it doesn't mean it has to be malicious. So, the usage of each of these types in incident handling, with some examples. For type one, we can see really useful things like incident summarization: we have a lot of information about an incident, and an LLM is really useful for summarizing it and explaining the forensics. We have SIEM logs and much other information that can be
summarized and extracted; being able to correlate the different IOCs that we might have in the incident, probably using similarity or embedding techniques to handle wide amounts of data; and probably using RAG, retrieval-augmented generation, to focus on the specific tactics, techniques, and procedures involved. For type two, we can think about more adaptive data enrichment: the agent is now able to identify that some particular tactic is being used, and adapt, getting information with whatever tools it has available that best fit that context. It might
be multi-phase; it might try different things, try to identify trends with one tool or another or on its own, and be able to triage and prioritize cases, because it can see wider, look at similarities with other cases, and propose actions to the analyst, though probably not execute them. Then for type three, we might really think about the agent as a planner: an agent that is able to find a good path, or try things, to identify whether an incident is a threat or not, and why. In that case, we might actually have dedicated
agents, maybe one agent per particular tactic, technique, or procedure, and an orchestrator able to decide when to use each one and start trying things. That is how these systems work. And in this respect, we are seeing an ecosystem being generated, not only of tools but also of agents, open source and commercial, that we can start interacting with. We may have agents, commercial or not, that do one particular thing well, say threat analysis on email or on something else; so we can interact with them, use them, and consolidate the results.
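The IOC-correlation idea mentioned earlier can start as something very simple, before any embeddings are involved: score how much a new incident's indicator set overlaps with past cases. A toy sketch (the case data and the 0.3 threshold are made-up examples, not from the speaker's prototype):

```python
# Toy IOC correlation: rank past cases by Jaccard overlap of indicator sets.
def jaccard(a: set[str], b: set[str]) -> float:
    """Similarity of two IOC sets: |intersection| / |union|."""
    return len(a & b) / len(a | b) if a | b else 0.0

cases = {
    "case-101": {"203.0.113.7", "evil.example", "abc123hash"},
    "case-102": {"203.0.113.7", "evil.example", "def456hash"},
    "case-103": {"198.51.100.2", "other.example"},
}

new_case = {"203.0.113.7", "evil.example", "xyz789hash"}

# Keep only past cases above an (arbitrary) similarity threshold.
related = {cid: round(jaccard(new_case, iocs), 2)
           for cid, iocs in cases.items()
           if jaccard(new_case, iocs) >= 0.3}
print(related)  # {'case-101': 0.5, 'case-102': 0.5}
```

Embedding-based similarity generalizes this to fuzzy matches (related domains, near-duplicate log lines), but exact-set overlap like this is already useful for the "find other impacted users" step shown later in the demo.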
If you think about basic orchestration and automation, we have the static SOAR workflow: we may have some variants, but basically we have hardcoded, in code or configuration, what we are going to do; we are going to analyze the file, or whatever it is, but it's predefined. In the agentic world, we might have the same type of tools and the same capabilities, but we let the agent decide. The flexibility we gain is that the agent is able to decide, when it has the parameters available, whether to actually call a tool; this is what agents do really well. But neither approach is complete
and useful for all cases. I don't think everything can be executed as an autonomous deterministic workflow, and not everything should be handled by an agent. So why not think hybrid? Those two things actually work well together: I have my automation that handles some particular tactics really well, so I can still use that, but that is just part of what the agent needs to consider, so it can see wider and look at other things. Much has been said about MCP; I will not go into details here, but I should mention that several other specifications have also been generated besides MCP, which is one
of the more widespread ones. MCP mainly covers the tools side, so that an agent can communicate with tools and have a shared context between the agent and the tools. But we also have A2A, the agent-to-agent protocol, which is for agents communicating with other agents: communication between different agents, each specialized in something, to make better decisions. It is not the only way to handle that, but it is the way we are finding works well. The two have some similarities, and they are complementary too, but we can think of it this way: if there is a client that needs to access an agent, A2A is probably
the better way to do that, whether an agent communicates with another agent or a client communicates with an agent; when an agent needs to communicate with a tool, MCP is probably the way to do it. That doesn't mean you cannot call an agent through MCP; you can adapt it to do that, but the specification was built for a different purpose. So, putting things together, with this idea that not everything works in the agentic world and not everything is best handled manually, how can we think about dealing with it? You may already have automation that appropriately handles a
good percentage of the incidents that you have; continue using that, for sure. But then, for the percentage of cases that have not been appropriately resolved, we can apply agentic autonomous execution on top, to try to get more results. This is what we are exploring, and we are seeing a good percentage of what automation misses being handled by the agentic layer. Is that complete and enough? Again, no; it's better, of course, but then you need to go to manual execution and decide what to pass to the analyst. Probably a summarization of what the agent analyzed before, so the humans
don't have to start from scratch. We may provide an assistant agent, probably the same or a variant of the other agents, that provides the context and allows the analyst to interact with it: to propose ideas, to work together, to speak with the data, to say, hey, give me whatever data you find about this. And is that enough? Of course not; sometimes you will go fully manual. And across all of this there is a cycle of improvement: improving the content, improving the instructions, feedback to the agent, or self-learning by the agent, a cycle that will surely continue as the process matures. Of course, not everything
is good; there are many challenges in the agentic world. There are the cognitive limitations of the models: if you have tried these models, you have surely seen that you ask something one time, ask again another time, and you get different results. There is ambiguity; there is nondeterminism in how things are handled. If you change the goal many times, ask many questions, or feed in too much data, the agent starts drifting, and you might want to reset and start over; that happens. So these things need to be considered: you might need user confirmation for really important actions, you might need validation,
resets, and summarization of what's going on. Operation and cost are also key. LLMs are not cheap; they are getting cheaper, but they are not cheap, and everything is counted in tokens; a word is roughly one and a half tokens or so. So the bigger the case, the bigger the bill; and not only the case itself: the forensics may include a big amount of logs, and if you put all of that into an LLM, the cost will increase. Do we want to do that? Sometimes yes, sometimes no, so we need to decide what actually makes sense; we probably need to summarize or trim first, and then pass it on. You might also see, when you connect agents
to agents, requests start spreading from agent to agent, and it might become an infinite loop. That needs to be considered: they need to be stopped in some way, trimmed, used only when necessary, and stopped when we think it's enough. On security, there have been many talks about this already, but of course prompt injection, and with it the misuse of tools, can lead to data exfiltration and many other things that need to be considered. It is also about auditing: who made the change, was it the agent or the analyst? That needs to be clear, and security protections and validation need
to be applied: not only guardrails and permission checks, but also tracing of what's going on. And how do we make an agent behave properly? We want it to actually help us. You might start with some set of instructions, improve them with the data that you have, and, as time passes and the system is used, gather feedback from the different users, applied manually or automatically. An autonomous system might take these learnings, store them, consolidate them, resolve conflicts, confirm them, and start using them on following cases. Key factors for good efficacy, again: the quality of the data, the quality of its structure, the selection of the model and of the model version (maybe different types of model, multimodal, images, whatever you are using), and the tools; I keep stressing the need for a good set of quality tools to be able to act, because otherwise it is just an LLM. With that, I will go to a demo. This is a prototype that we have been building. You will see a chatbot with an analyst, some cases, and, on the right, the execution pane, where you will see the different
executions of the tools that are being run. This is on a particular framework. Given the time, I will probably go quickly, but let's start it. In this case, the analyst says: okay, agent, help me prioritize my next case, the next incident I need to resolve. The agent is going to be assisted, so it will ask for confirmation for every step; that is not required, it's a choice. When it gets the details, it summarizes with the LLM what the details are; it might point to the actual IOCs and URLs that need to be considered, and say: hey, we might actually want to scan this, do you want to? Of course, as the first thing, you might
have an agent that actually analyzes it. The agent will take the URL, and this is what the scan does: it extracts the elements, passes them as parameters to the tool, and gets a result about the different elements; and then, okay, analyst, do you want to do this? But the analyst might actually switch direction: no, I want to talk with the data; I saw there was an email, can you show me the email? This is in the memory: it was analyzed, it is in the memory of the agent, so it can retrieve all or part of it. That is what an agent can do. So then it can go a different way. You
can say: okay, do you want to run a risk analysis? Were other users impacted by this? So it looks around, starts looking at more details, tries to spread out and find similarities. In this case, it finds other users from other cases; it's just a tool that searches for the same elements. It's not that complex, but it's really powerful. And then, if you have integration with tools like an EDR or whatever, you can start resetting passwords, changing firewall rules, or whatever actions the analyst needs to perform; just implementing a tool behind an API and teaching the agent to use it enables these kinds of things.
And it continues: you can actually start closing the case; you might submit the IOCs to a group to create new content, and so on. At some point, the case will be closed, and the agent is able to summarize what happened: you know, this is what we did, we closed this, we analyzed all of this; it makes a summary and stores it in memory. So it helps in many different ways: not only with the analysis, but also with planning and direction, being able to adapt and propose, but always with the analyst in the loop, able to decide.
Okay, I think we are good on time. Last slide: some final conclusions and remarks. If you are going to start adopting this, I would propose going short: leverage the tools you already have, your automation, your tools; use them, start with type one or type two, and start putting them together. Inform the agent: I have this tool that does that; that is probably enough for the agent to start using it. You don't need much more to start with; if you need to productize, that is a different story, and you need to start thinking about all the considerations. Think about the human-agent collaboration: you might not want to let the agent execute
critical tasks. You might want it to do the analysis, you might want it to look around, but you might not want to let it alone decide that it's going to reset a password. In that case, you need to be strict: the agentic work needs to stop there; the decision is made programmatically, or the agent's decision is taken as a proposal, enabled and confirmed by the analyst. Consider the tech momentum: you know this is changing every day. We see improvements in models, in reasoning, in the number of parameters; the capabilities are growing every day. So we need to keep an open eye, but also, when
we adopt the standards, check that there is some backward compatibility; we use MCP, and tomorrow there is a new version, and things change. So we are on an immature stack; this is a new technology, and immaturity is an issue. Of course, security: we need to make sure that validation is happening here. We need to handle prompt-injection checks, we don't want leaks, we might need validation, and we might need to apply actual security measures and permissions, not only for the analyst but now for the agent: the agent, as an entity, needs to have its permissions clearly defined; it might share the same permissions as the analyst, or potentially much fewer.
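The point about the agent being an entity with its own clearly defined permissions, at most the analyst's, can be sketched as a simple set intersection: the agent's effective permissions are whatever it was granted, capped by the permissions of the human driving it. The permission names below are illustrative assumptions:

```python
# Sketch: the agent is its own principal. Its effective permissions are the
# intersection of what the agent was granted and what the driving analyst
# holds, so it can never exceed the human's scope.
ANALYST_PERMS = {"read_case", "scan_url", "reset_password", "close_case"}
AGENT_GRANTS = {"read_case", "scan_url", "change_firewall_rule"}

def effective_perms(agent: set[str], analyst: set[str]) -> set[str]:
    return agent & analyst

def can(action: str, agent=AGENT_GRANTS, analyst=ANALYST_PERMS) -> bool:
    return action in effective_perms(agent, analyst)

print(can("scan_url"))              # True: both hold it
print(can("reset_password"))        # False: analyst has it, agent was never granted it
print(can("change_firewall_rule"))  # False: granted to the agent but outside analyst scope
```

This also helps the auditing concern raised earlier: if the agent acts under its own principal rather than impersonating the analyst, the logs can say which of the two made each change.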
As for cost: if you start playing with this, a single case can cost from cents to probably dollars, depending on the data. So we can think about the future: might we have an autonomous agentic analyst? Probably yes; I think we can do that. We have seen one; you can create your own. But think carefully when you do, and when you start using all this ecosystem of tools, keep control. We want autonomous systems, yes, but the controls always need to be on our side. And with that, thank you very much.
[Applause] Questions?
>> What kind of strategy are you using for data retention?
>> For retention, we are using the same strategy that we use for other systems. For instance, in the example that I showed, we are storing the result of the agent to be maintained, and we maintain the same retention policy that we have for the system. That also requires trimming things that are happening on the agentic side, because logs are sent, probably to something like Langfuse, to trace all the information. You might need to trim in the different places for the agentic part, not only on the data. Yeah. Mhm.
>> Yeah. Which part did you use the A2A for?
>> The A2A for communicating with the agent?
>> Which two agents are communicating that you used?
>> Oh, in the example you mean? Yeah. Okay. So...
>> Or was it all MCP that we just saw in that chat?
>> So, it's a mix. From the client side we use A2A: from the client, the actual user.
>> Oh, so client to all those other things. Is there a reason why you did that?
>> It really fits, right? It has a different model; you can poll for results, and you can also have streaming. It has
really good interfaces to interact with the client. So that's one reason; it doesn't mean you can't do it with MCP.
>> Okay.
>> You could, for sure. But we are also using it for delegation: when an agent needs to use another agent, that agent talks A2A to it.
>> I was wondering if the agent is actually calling back into the client, which is why you need the bidirectionality.
>> You can do that; A2A has a way to actually queue.
>> That's what I was wondering; it wasn't obvious that that was going on.
>> A2A enables both paradigms: you can do
the polling or the push. Yeah. Mhm.
>> Hi. The demo you showed us, is it available for download somewhere?
>> It's actually internal; currently it's an internal prototype, so not yet.
>> Okay.
>> Yeah. Mhm.
But, for instance, there are many options that might not be specific to incident handling. If you get Magentic-One, which I really recommend downloading and trying, you will have a really good starting point; you can then start extending it and adding your tools on top of it.
>> Magentic... the magnet?
>> Yeah, Magentic-One. This is the one that actually makes decisions and writes code, tries it, writes another version; it is really surprising and scary. Right.
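The propose/run/inspect loop described for Magentic-One-style agents can be reduced to a few lines. Everything here is mocked for illustration (no LLM is involved; `propose` is a stand-in whose third attempt happens to succeed):

```python
import contextlib
import io

# Toy sketch of the iterate loop: generate code, execute it, check the
# result, and retry until the goal is met or attempts run out.
def propose(attempt: int) -> str:
    # Stand-in for a model writing code; the third attempt happens to work.
    return "print(6 * 7)" if attempt == 3 else "print('not yet')"

def run(snippet: str) -> str:
    # Execute the proposed snippet and capture its stdout.
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(snippet, {})
    return buf.getvalue().strip()

goal, result = "42", None
for attempt in range(1, 6):          # bounded retries, as any sane agent loop
    result = run(propose(attempt))
    if result == goal:
        break
print(attempt, result)  # converges on attempt 3 with output "42"
```

The `exec` call is exactly why this pattern is "surprising and scary": a real type-three agent executes code it just wrote, so sandboxing and attempt limits are part of the design, not an afterthought.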
>> Thank you.
>> Okay. Thank you very much. Thanks.