
All right, guys. Next up, we have The Next Generation of Web Exploits: From Cache Poisoning to Multi-Layer Fingerprinting, Why Complexity Itself Is the Vulnerability, with Steven Sprecher. >> Hi everyone. >> I appreciate that. I also appreciate you coming. I know it's after lunch and we're, like, ramping down. Is that a phrase? So hopefully we can have some fun here and learn a few things about some complex web systems problems. And please feel free to stop me at any point and ask questions. I come from an academic background, so I'm very used to students just interrupting, and I actually like it better that way. So if something's confusing, let's stop and clarify. And then
let's have some fun. So what I wanted to talk about first is system complexity. Has anybody heard of the Opte Project before? This is internet mapping. They're old images, but essentially it's a visualization of the complexity of the web: basically just IP mapping and connections between network-level things. Hopefully you can see from the images here that just in those seven years, even all the way back in 2010, we're seeing this exponential boom in the complexity of internet systems at the networking level. But let's go up in the networking stack. When we think about web developers and their choices, they
have so many choices to build a system that does maybe the same thing. You get your choice of framework, you can choose what CDNs you use, whether you use load balancers, how you develop your CI/CD pipelines. Each one of these components is adding complexity to the web ecosystem in general. If you're familiar, and I don't know if you are, and we'll talk about it in a second, getting a web request to you is a miracle of system coordination. What we're going to talk about is that complexity and how it exposes a new attack surface beyond the traditional bugs versus no bugs. So given that, clearly this is a
little bit outdated, it's a couple months old on the 2025 numbers, but hopefully what we see here is that as this complexity is growing, we're getting more and more CVEs. And if you notice, the top 90% of those are web-based CVEs, so things like cross-site scripting and SQL injection. They're just getting worse, and I want to argue that it's the added system complexity that is continuing to grow. And, cough cough, if we think about AI being plugged into everything as well, think about how many more things are getting complex and getting shoveled around places. So it's just kind of going to get worse. >> Yeah? >> Are we lucky, or is that the data source?
>> This is a data problem. So, this was pulled in April of 2025, so it's incomplete data in general, but I like that it still shows the trend up until 2024, because otherwise we're having the best year ever. Okay. So again, my thesis here, my argument, is that this system complexity is an attack surface. And we'll go through a couple of bugs where you'll see that not a single system is vulnerable, not a single node in the system is vulnerable, but because you put them together, you now have a vulnerability. And specifically, we're going to talk about, like I just said... yeah, there we go. And they're really, really hard to
identify. If you think about it, you can't scan one technology and get a vulnerability report. You need context-aware systems to be able to figure out: well, you have Cloudflare, Varnish, and NGINX to deliver your web content, so how do they interact with each other? And there aren't really tools to do this for us as defenders. And who's going to fix this? It's not a problem with any one of these servers. We'll get into the discrepancies in understanding between technologies, but who's going to take responsibility and make some sort of change? Unfortunately, enterprises or businesses that use these systems don't have a solution and can't blame anybody and
tell them to fix it. So we'll see what we can explore and what kind of tools we can use to poke at things and try to help. And we're going to talk about web specifically because it's easy to talk about complex systems on the web; as we saw, the web is very complex. Specifically, we'll talk about desync attacks, cache attacks, and a whole bunch of other cool things. Was anybody at Black Hat last month? If you were, there's this guy named James Kettle from PortSwigger, the people who make Burp Suite, who gave a talk on how HTTP/1.1 must die. He's done this five-to-ten-year saga on desync attacks on the web, so we'll talk a
little bit about those and break them down as well. They're confusing, so let's see if we can make sense of them. And then I have to talk a little bit about academia, just because we'll see that we can actually leverage academia a little in some of these situations to help us defend things. And then I have to talk about AI just a little bit, because, duh. But also, if you've noticed, a lot of hackers are using AI now to get better capabilities and do some scripting, and I get really worried about where this is going to go: basically speeding up the attackers to pull from
let's say academia or cutting-edge research, where us defenders are lacking. So, yeah, I guess I haven't even introduced myself, so I'll do that next. We'll talk about who I am for a second. We'll do a quick web refresher just so we have a shared context and language to talk about these systems. And then we'll go through a bunch of different attacks, explain them, and make them make sense. I'll show some examples from our bug bounties and some from academia, where we got some CVEs and some cool [ __ ]. Excuse me. And yeah, we'll talk about all of these: request smuggling, the Expect header.
T-Reqs is a tool we built to help find some of this stuff, et cetera, and then the implications: what tools you can use right now to actually start poking at some of this, and what do we do next? So, yeah, I said all that and I didn't tell you who I am. I apologize. My goal was to hook you a little bit before I tell you who I am. I've been doing cybersecurity stuff since about 2015. I kind of started by working at NSA for a year. It was an internship during college. Don't hate me. And I got to do a lot of cool stuff and see some cool
things, and it kind of got me started on cyber, and since then I've been doing research and teaching in academia for quite some time. I did my undergrad and master's at the University of Michigan, and I ran the big security course there for a couple of years. And then I came here to Boston, not here at East, Connecticut, but I came to Boston to do my PhD, and I just graduated in April. We do some cool system security stuff, but I've run the gamut: malware research, ransomware research, network measurement, which was very heavy at Michigan, censorship, and a lot of system security and web stuff, as we're gonna talk
about today. I'm a new founder at a company, so that's my job right now. I won't talk about that because this isn't a sales pitch or anything; I'm just here to talk about cool stuff. >> See, if I had known we had polarized glasses, I would have been so cool. We're technically in stealth also, so I won't even tell you the company name, but yeah. >> And then, I've seen this sometimes, but: statements are my own, not those of any institution. And then this is my dog. I'm pretty obsessed with my dog. She'll make an appearance a little later to help illustrate some things on the web. And
she's an 80-pound Great Pyrenees and she's the best dog ever. >> Okay, happy to show you the 30,000 photos I have of this dog after the talk. Okay. Any questions before we get in? Awesome. Sorry if this is a little bit like an academic lecture. That's my training. So I do love to have this be interactive. All right. So, very basic web stuff, right? You have a user. They want to see a dog. They want to post a picture of their dog. Yeah, I'm going to talk about dogs. And so they're going to send some sort of request to this amorphous internet blob and say, "Give me dogs." And this internet blob is going to say,
"Here are your dogs." Can we agree that's kind of the general flow of the internet if we super-simplify things? What really happens is you're probably talking to something like a CDN, where the front end that the user is interfacing with is Cloudflare, Akamai, Fastly, Google, whatever you want. They're either serving you some sort of cached version of this or whatever, but you as a user have no idea what's going on beyond the CDN. It's just this black box that gets you dog pictures. Cool. So let's unmask a little bit of that complexity, because that's where we're going to talk about a lot of these complex systems issues. So,
in reality, it's not just one user, obviously, right? You have two users talking to the CDN at the same time. It's only two. It's never more. And what really happens is the CDN has to go get these dogs from somewhere, and we'll call this the origin server. So the CDN says, "Hey, if I don't have a dog in my cache, I need to go ask the origin server, where's this dog?" Right? And for performance reasons, most, if not all, CDNs actually use a shared upstream connection between the CDN and the origin server, which means all these requests coming into the CDN share one connection going to the origin. So if you and I both request the same dog or
some other request to dogs.com, that origin server, we're going to share the same connection for our requests and our responses on this upstream. So we're already starting to get a little complicated, and I'm not even going to bring it up today, but it's never just two. It's like five pieces here in the middle. It's a CDN, then you have a load balancer, then you have some cache layer, then you have something else, and it throws it over to the origin server. Oh, and by the way, five different origin servers. It's a complicated mess, but even with this we can dig into some of these complex issues. But what ends up happening here, just to
demonstrate how this queuing process works: request one comes in and gets queued up. Request two comes in and gets queued up. Then the origin reads them off of this one shared connection, basically bytes off the line: it sees one request, responds to one request, sees the next one, responds, and then the CDN, in order, spits them back out from the response queue. That's what this part is called. Okay, cool. I kind of talked through all that very quickly. Does this make sense? Because once we have this, we can dig into using it. >> This thing here? No, actually, so if we have three users trying to hit the same origin server, a CDN, at least for their
performance, it's actually going to open up only one TLS connection to the origin server, or TCP in the sad case, and send everything over that one. And then the CDN has its logic to make sure the right person gets the right response. So yeah, complicated as hell, right? >> Then a fourth person shows up asking for picture two? >> We might add a cache, or let's say maybe it's a POST request. So how does that work? We're going to get into all this. I love this. You guys are ready to go. Excellent. So let's talk about our first attack. It's called HTTP request smuggling. It's kind of this category of
attack. What ends up happening is an attacker crafts some sort of request, and it's one request but it looks like two, and there's something in there that causes confusion between the systems in this chain. The CDN, for this specific request, will see one request. It'll handle it as one through this whole system. But what ends up happening is the origin server sees these bytes and says, "I'm confused here. These are actually two requests." So now it's going to produce two responses to one incoming request, and we have all sorts of fun we can get up to. Cool. So I went back. No, I went forward. Okay. So this is where we're talking abstract, and
I'll show you some actual bugs that cause this discrepancy between different server types in this chain, and then I'll release you to go attack people. Please don't. So what happens here? Let's just run through the same sequence. The attacker sends this one request; it's bytes on the line. Now the origin server will eat this, split it because it sees two requests, and respond to both requests on the queue. So we added one person on the input and now we have two responses sitting there. The CDN is not going to close this connection, because everybody's trying to get pictures of dogs. So we're going to leave this open for performance reasons.
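To make the "one request that looks like two" idea concrete, here is a minimal sketch in Python. It is a toy model, not any real server: `count_requests` is a hypothetical helper, and the "back end ignores Content-Length" behavior is an assumed quirk chosen only to illustrate a framing disagreement. The host `dogs.example` and the paths are placeholders.

```python
# Toy model only: two deliberately simplified HTTP/1.1 parsers that
# disagree about where a request ends. Neither models a real server.

def count_requests(stream: bytes, honor_content_length: bool) -> int:
    """Count how many requests a parser believes are in the byte stream."""
    count = 0
    while stream:
        head, sep, rest = stream.partition(b"\r\n\r\n")
        if not sep:
            break  # incomplete header block; stop parsing
        count += 1
        body_len = 0
        if honor_content_length:
            # Skip the request line, then look for a Content-Length header.
            for line in head.split(b"\r\n")[1:]:
                name, _, value = line.partition(b":")
                if name.strip().lower() == b"content-length":
                    body_len = int(value.strip())
        # Whatever follows the body is treated as the next request.
        stream = rest[body_len:]
    return count

# The "second request" rides inside the body of the first one.
smuggled = b"GET /admin HTTP/1.1\r\nHost: dogs.example\r\n\r\n"
outer = (
    b"POST /dogs HTTP/1.1\r\n"
    b"Host: dogs.example\r\n"
    b"Content-Length: " + str(len(smuggled)).encode() + b"\r\n"
    b"\r\n" + smuggled
)

# Front end (CDN-like): trusts Content-Length, so the body is opaque bytes.
front_end_view = count_requests(outer, honor_content_length=True)
# Back end (origin-like, assumed quirk): decides the POST has no body.
back_end_view = count_requests(outer, honor_content_length=False)
print(front_end_view, back_end_view)
```

The same bytes are counted as one request by the front-end model and two by the back-end model, which is the whole precondition for smuggling; real discrepancies come from much subtler header handling than simply ignoring a length.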
And what happens is your attacker gets the response for that first part, and the second part just sits here in the response queue. And what we have now is response queue poisoning. So the second somebody else comes on and says, "Hey, I want my dog photos." Let me know if it's getting too much. And we're going to send my request to get a different dog, so it has to go all the way upstream. It's going to send here. The origin is going to produce a response, but the CDN responds with the response to the smuggled request instead. We can kind of see how this could be really, really bad in certain circumstances. Let's say there's maybe a
session key involved that gets sent back; or rather, this is the response here that's sitting in the queue. So the third person that comes along gets that response, and if there was a session cookie or a token or something sensitive there, that third person, probably the attacker, is now going to grab it. And here's the crazy part. The CDN's not at fault for this. They're processing things how they do. They're just forwarding a request. The origin server's also not at fault, because this is a weird amorphous part of the specification where they can do something or nothing and it's fine. So who fixes this? Well, sometimes you yell hard enough and the origin server will say, "Oh, okay. We'll
fix that." Or the CDN will maybe try to do some sort of firewalling to make sure it doesn't happen. But it's a complex system. You update the origin server: new bugs. Obviously not bugs in the origin server, because that would defeat this whole talk, but bugs in the system, where they now communicate a little differently. Okay, so let's talk real examples. This is from James Kettle's talk last month. He published a white paper on this. As you see on the left there, the top request is in blue and it says it's a POST request. So it's going to go through the CDN and get to the origin server. And what happens here
is the origin server and the CDN disagree on what to do with this Expect header. The CDN says, "That's cool. It's all one request. I see a Content-Length of 64. I'm just going to put it in the pipe." And then the origin server says, "Oh, I'm expecting something. That's awesome. I'm done reading this request." And so it'll then pull the next bytes off the line, see a second request, and respond. Kettle did things like this, especially with the Expect header and a few other tiny tweaks, bugs, or different headers. And I think he was talking in his white paper about a couple hundred thousand dollars' worth of bugs just
on stuff like this alone, because again, who's going to fix this? But it's also a huge problem. One of the cool parts is I think he got this off-by-one problem on, what was it, Netlify, something like that, and basically was able to receive requests from thousands of users with tons of sensitive data, just continuously. So, I don't know. Hopefully I've impressed on you that what you can do here is a little scary. And this was last month. That's awesome. And here's my pitch for academia. Me and Bahruz here, we found this five years ago and published a paper about it. And boom, right here in one of our tables: we found the Expect
header. This is a problem. And James Kettle didn't write about it until 22 and said, "I didn't find anything about it," and now it's the big thing in the white paper and in his talk. So, this sounds like "I found it first," and I don't mean it that way. In fact, I made AI make me a meme about that. That's not what I mean. I mean, I think we should maybe look towards academia in some situations, where we can pull from it and get better as defenders. And that's where this AI argument comes in: I'm kind of scared of AI being able to go
do this. It could read our paper. It could find this bug, and we could have attackers that are five years ahead of defenders on certain things. Academia is also way behind on some other stuff. I'm not a huge fan of either, so don't worry. But specifically, I wanted to talk just a little bit about that paper and that table there. Actually, there are three tables, and they refer to experiments that we ran, basically systematically looking at every single part of an HTTP request and asking what can cause these discrepancies. And that's where incentives are misaligned, or differently aligned, between academia and industry. In academia, I can afford to go spend four
months building this systematic search tool that either gives us nothing or something. But in industry, I hear the incentives don't align with doing two quarters' worth of work that finds no bugs. So I think this is where we can maybe play with each other and make things better. But yeah, here's an HTTP request. I'm going to post this picture of my dog. Yes, she's still adorable. We have a request line there that says what method I'm using: this is a POST method to the /dogs endpoint, and I'm going to use HTTP/1.1. Then you have a header section where you specify different key-value pairs. Here I'm just saying I'm going to send this
to example.com and I'm including 100 bytes' worth of content in this body, and then you have the 100 bytes' worth of content. That picture is not 100 bytes; it is huge, because it is very detailed. So what we did is run massive fuzzing experiments on each of these parts to see what little bugs we can find that cause discrepancies between certain servers, and this is what we found. We found different things where you mangle the method up here in the request line. Messing with return characters and things like that is always a big win. Different embedded request lines within request lines. These are things where some servers will either do something
different with it, or respond, or decide things. And we did this for every single part. So, headers: here, that was the Expect header. Lots of things with different transfer encodings. Are we kind of familiar with Transfer-Encoding and Content-Length? These are ways for you to specify how much data you're sending and how you're going to send it to a fellow HTTP communicator. So if I say I'm going to send you 100 bytes, that's a Content-Length. I could also say I'm going to transfer it to you in chunks. So I'm going to send you a chunk at a time in a specific format, where I say, "Here's five bytes," send you
the five bytes, and then send you the next chunk after a little bit. So obviously messing with that can change where the boundaries are for these requests if you have differing opinions on it. And then we even messed with the body of a message, and we found some discrepancies there, often around these terminal characters as well, or changing the chunk size and stuff like that. But we basically have tooling that you can go and play with immediately to replicate some of this, or pull some of these and mess around. Okay. But the key thing here is we've caused discrepancies, and specifically with the smuggling instance I showed you a discrepancy where one says it sees one thing, and
the other says, "I see two things," and now we have a problem. It's not just that. We can actually force this to happen with something in the middle that does a transformation that then causes the discrepancy. Here's one for HAProxy in 2023, and we can talk about it because it's published. So that's awesome. And this is the literal attack payload that we had. Essentially it's an OPTIONS request here, and the confusion comes where we omit the name. These are key-value pairs for headers, so we just don't tell you what the header name is. And HAProxy says, "Cool, man, I'm going to clean this up for you because I know you made a mistake," and it just drops both headers, the
Content-Length and the other thing. So obviously a pretty big bug, and this was definitely a problem with their parsing. Basically, they were being nice though, which I appreciate. But now you have an OPTIONS request with no Content-Length, and OPTIONS doesn't need a body. So you're going to have other servers and the origin server say, "Cool, I got an OPTIONS. That's awesome. I'm done. Now I see a second request." So the attacker didn't even necessarily send something that just went through as-is. They sent something that gets transformed into something that's misunderstood. Which is terrifying. So just to link all of the language together, this is the same format of one
and two, and that happens, right? So, cool. Make sense so far? I want to take one second to pause. Awesome. These are complex systems issues, so I want to make sure we understand the system. So in this situation, this is where we left off with response queue poisoning, right? Where we talked about this second benign user who asked for something and got a response to a request they never sent. But what happens if they were asking for something secure, like we were talking about, with some sort of cookie involved or something? Well, that's left on the response queue. So what does the hacker have to do? They come, they send any
good old request, and now they're served the next response on the response queue. They now have your session cookie. I can now session hijack. I can do all these terrible things.
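The response queue poisoning sequence described above can be sketched as a small simulation. This is a toy model under stated assumptions: `SharedUpstream` is a hypothetical class standing in for one reused CDN-to-origin connection, responses are plain strings, and the CDN is assumed to read exactly one response per request it forwarded.

```python
from collections import deque

class SharedUpstream:
    """Toy model of one CDN-to-origin connection reused for every client."""

    def __init__(self) -> None:
        # Responses written by the origin, in the order it produced them.
        self.responses: deque[str] = deque()

    def forward(self, origin_responses: list[str]) -> str:
        # The origin may write more responses than the CDN expects,
        # for example two responses to what the CDN saw as one request.
        self.responses.extend(origin_responses)
        # The CDN forwarded one request, so it reads exactly one response.
        return self.responses.popleft()

conn = SharedUpstream()

# 1. The attacker's single request is parsed by the origin as two requests.
attacker_first = conn.forward(["200 /dogs", "200 smuggled response"])

# 2. A benign user logs in. They receive the leftover smuggled response,
#    and their own response (with a session cookie) is now stuck in the queue.
victim_got = conn.forward(["200 Set-Cookie: session=VICTIM"])

# 3. The attacker sends any request and is served the victim's response.
attacker_second = conn.forward(["200 /anything"])

print(attacker_first)
print(victim_got)
print(attacker_second)
```

Once the queue is off by one, every later client on that connection receives the previous client's response until the connection is closed, which is why a single desync can keep leaking data.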
Um, the next thing I just wanted to touch on as well is fingerprinting, firewall bypass, and access control bypass. Eight minutes left. So if you think about this number two: this WAF sees one request, right? So we're doing filtering, let's say, on the request line or headers or something. Number two doesn't get looked at that way. It gets looked at like it's a body. But the origin server is going to process it like it's a request. So now if there's something that you need to sneak in, let's say a SQL injection in a header field that you have a firewall rule deployed for to look for a signature, well, you're not looking down here. So
now we've just fully bypassed the million-dollar firewall that you have installed in your enterprise, and your origin server now sees a second request. Boom. SQL injection. What is it? Johnny Drop Tables or something like that, and things happen. >> Yeah. What is it? >> Bobby. >> Bobby. >> Either I'm showing my age or I'm tired. But yeah, so it can go all of these different places. The last one is bypassing access control. That's kind of fun, scary stuff you can throw at things. And this is just because of this complex system. It's not necessarily a problem here or here. They just kind of disagree on what should happen with this thing.
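A minimal sketch of that firewall bypass, under loud assumptions: `waf_allows`, `origin_header_blocks`, the `X-Filter` header, and the `drop table` signature are all hypothetical, and the origin's "ignore Content-Length" quirk is invented for illustration. The point is only that a rule applied to the headers of request one never sees the headers of the smuggled request two.

```python
import re

# Hypothetical WAF signature, for illustration only.
SIGNATURE = re.compile(rb"drop\s+table", re.IGNORECASE)

def waf_allows(raw: bytes) -> bool:
    """Toy WAF: scans only the request line and headers of the one
    request it believes it is looking at; the body is not inspected."""
    head, _, _body = raw.partition(b"\r\n\r\n")
    return SIGNATURE.search(head) is None

def origin_header_blocks(raw: bytes) -> list[bytes]:
    """Toy origin with an assumed quirk: it ignores Content-Length, so
    every header block in the stream starts a new request."""
    return [block for block in raw.split(b"\r\n\r\n") if block]

# Sent directly, the injection sits in a header and the rule catches it.
malicious = (
    b"GET /search HTTP/1.1\r\n"
    b"Host: dogs.example\r\n"
    b"X-Filter: 1; DROP TABLE students\r\n"
    b"\r\n"
)

# Wrapped as the body of a harmless-looking POST, it is just opaque bytes
# to the WAF, but the origin parses it as a second request.
wrapped = (
    b"POST /dogs HTTP/1.1\r\n"
    b"Host: dogs.example\r\n"
    b"Content-Length: " + str(len(malicious)).encode() + b"\r\n"
    b"\r\n" + malicious
)

blocked_directly = not waf_allows(malicious)
passed_when_wrapped = waf_allows(wrapped)
origin_view = origin_header_blocks(wrapped)
reached_origin = any(SIGNATURE.search(block) for block in origin_view[1:])
print(blocked_directly, passed_when_wrapped, len(origin_view), reached_origin)
```

Scanning bodies as well would catch this particular toy payload, but the underlying issue remains: the filter and the origin are making decisions about different request boundaries.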
Okay. And so we don't just need to smuggle. We can do other things. We don't need to smuggle a new request in; let's use some of these misunderstandings to cause other problems. So here's just a normal request. It's a GET for the homepage with a Host header. The CDN says, "Cool. That's a request for the homepage. Totally valid." The origin says, "This looks great. Here's the homepage." That sounds great. And this one, we actually got a couple thousand dollars for submitting bugs, and this is the exact thing. It's just this header. It just has to be the right setup, and we test safely so we don't take things down. But essentially the CDN said, that's
cool, that's a request for home, I'm going to cache this response as the homepage. The origin server says, "I don't have whatever this type one-two-through-nine is, so I'm going to give you a 412." And guess what? The cache server caches the 412. So now we have a single-request denial of service on your entire homepage. So, quite literally, here's what ends up happening. We were talking about caches, so I'm going to assume a little bit of knowledge about how caches work. Let's say the homepage is uncached so far. So the attacker is going to slip right in before it's cached and send this problem request. The origin server responds with a 412. We
cache that 412, and then things get served from the cache. A user comes in: "I'd like my homepage, please." And they also get this error. And we've taken down a full website. One request to each cache edge. Okay, I'm scared, too. That's good. And so it is that simple. Like, that exact bug. Here are some of the other ones that we got from bug bounties on HackerOne. If-None-Match: star, so it's not necessarily the value; on this one company that resulted in a cached 304. Transfer-Encoding looks all good; we just added an extra dash, and the origin server would not deal with that. And so it said, "400, I can't do anything," but the CDN cached
this. I'm not saying which CDN, but I think it's not too important. Are we familiar with X-HTTP-Method-Override? It's meant to say, "Please override the method as this, not that." And so when you say HEAD, the server says, "Oh, if I respect this, I'll treat this request as a HEAD request and only respond with headers, no body." And so when you do this with a GET request for a homepage, guess what? You might get a response back with an empty homepage, and it's all good. And so that gets cached as well. We've now gotten rid of your content. And I think that one I can talk about: that was on Mozilla. It was only on a
static file, so it wasn't the full homepage. But if you think of static files as also the JavaScript files that make your site work, or image assets and things like that, well, they're now empty, which I would argue breaks your site as well. And yeah, that's essentially the whole cache issue. Fixing this is so not easy. You have to flush your cache, and you have to make sure you write a rule so the attacker doesn't put that right back in the cache. And what we would end up doing, at least to demonstrate, is we would actually send this, but say: hey, here are your five cache edge nodes. This is public information. I
know this, and here's your cache duration. I'm just going to send a couple of requests right around when I know the thing expires, and I'm going to try to get in that cache, and now everything behind that cache doesn't have access anymore. So I guess that argument actually escalated one of our bugs, like, I think 10x the payout or something, once they understood, "Oh, this is actually very easy to pull off." Cool. Bug bounties, real-world stuff. I think we're doing good. I had this in the title, so I'd be remiss if we didn't talk about multi-layer fingerprinting. Just to set the tone again, if you remember, all you as the
client see is the CDN. You actually don't get to see this upstream. You don't see the technologies that are back there. You just think steve.com is Cloudflare. You don't know what's going on behind it. So if we did know, then we could apply even more of these issues, because now we know: oh, we have Cloudflare, we have Varnish, we have Squid. I know that there are differences here, so I can send specific requests, and you can do it smartly, not just spray and pray. So I just wanted to talk about the method that we used for this paper, because I thought it was cool. This is the setup where the client
is sending a request and it's going through Cloudflare, then Squid for, I think, load balancing, and then Tomcat's responding with the actual content. And from this perspective, the client knows nothing. It doesn't know what's up. What we do is find a request that causes an error on all servers. That means we know the first node's going to produce an error, so we send that along and check that error page. Oftentimes error pages will tell you what it is: it's a Cloudflare error, it's this server type's error. If you have enough of those, you can actually cluster them and decide that error pages that look like this are this technology. And guess what? In academia, they
let us do massive crazy clustering studies and grab, like, millions of web pages and do stuff like that. So we did that. So you send a request, you get an error from Cloudflare. I now know that first node is Cloudflare. Then I find a request that we know Cloudflare is okay with, but that causes an error on everything else. And we rinse and repeat. That kind of makes sense? So now we know it's going to get through Cloudflare just fine, but we're going to get an error page from the next node. Oh, that's Squid. We do it again, and then we know the last one's Tomcat. Now we have the actual technology stack. And even for
normal attacks, now you know something more, and maybe there's a bug in the specific version of Tomcat that's being used that you want to take advantage of. And so now you know, versus having to use Shodan and Censys and all these scanning tools to try to figure out what's going on. And yeah, any questions about this, I'm also happy to talk, and a lot of these tools are open source from our academic days. So the only thing you might be lacking is data, and I can send that. Us academics, by the way, are so giddy if you want to talk to us about anything we've ever done. So I can put you in touch with the first author here
and he'll, like, explode and be like, "Yes, here's all the data. Here's the code. Let me help you do this." And that stands true for any academic thing you read, pretty much across the board. You send an email to a grad student about their work and that's the best month of their life, and you're going to get a lot of feedback and they will help you do anything, and they're cheap. Sounds good. Okay, are we good? We kind of see how some of these complex systems lead to some problems. The system itself is maybe fine. These nodes don't have bugs, but because we put them together in a certain order and they kind of disagree
on stuff, we can cause attacks. So I'll argue that this is kind of a different attack surface than what we traditionally look at as security practitioners. Okay. So what is it? Yeah. Question? >> Not a question. I love >> I've seen instances where good setups, like, leak. >> Exactly. So somebody just leaves a Server header in something and now you say, "Oh, it's Tomcat. Oh, I didn't know that." So you can pull all of this stuff together. Exactly. And the dangerous part is now you know it's there. It actually doesn't matter where it is, as long as you know the front and back and you know they disagree on something. As long as the middles don't affect it in any way,
you can still take advantage. So yeah, very, very interesting hard bugs... I keep saying bugs; a hard surface to get your head around. And let's talk implications real fast. How are we doing on time? I'll wrap up quick. Sound good? >> Oh yeah, we can hang out. Okay, cool. So it's not just web, and this is where maybe we explode the fear just a little bit. I don't like to fear-monger, but I'm kind of scared about this stuff, so now you get to be scared. We talked about web and how this is a problem. Some recent academic work is looking at
this and finding very similar problems within SMTP, the email protocol that everybody uses, right? There are a couple of papers here, posters in academia, that are exploring it much like we did with the headers and things like that. Anything that's parsing data, right? Zip parsers have this problem when you have, let's say, two zip parsers, where you zip something here and unzip it with gzip here and 7-Zip there. They have slightly different implementations. And so there's been some academic work exploring some of these problems. You can go all the way to DNS, because there's a bunch of systems involved there. Anytime you think of something that needs to communicate and
coordinate with a bunch of different systems, like AI when you plug it into everything, now you have this new attack surface that we as defenders don't have visibility on and don't know what to do about, and we need better tooling to deal with it, essentially. Oh yeah, another data paper. Oh, scary. Very scary. And then just to drive home the point about how AI is assisting hackers now, and what scares me about it: that's eventually going to start helping them get ahead of the curve. They can pull from our research from five years ago and start doing things that the defenders don't have signatures for and don't understand. Obviously everybody and
their sister has written an article about AI and AI-assisted hacking. Right now it's a lot of automation assistance, or simpler, repetitive tasks for AI. It will definitely be moving into, "Hey, do some vulnerability research for me. I need this and this," and that's going to scare me a little bit. On that note, actually, one of our friends just started a company. They raised about $9 million to do agentic threat hunting. So, in your enterprise, plug in this box and it'll do threat hunting 24/7 as an agent in your network. And that's awesome. I think it's super cool. I'm also really
terrified, because if it can start learning or pulling in new techniques and things like that, now you're getting somebody more advanced than we have the capability to defend against. Yeah. Anybody else kind of scared about AI, just a little bit? At least in the future. Right now it's so dumb. I spent like three hours the other night debugging a certificate issue with C or something in a loop, and it's like, "Do this." Cool, I did it, here's the error. "Oh, actually, do this." It's just undoing itself constantly. It's got a way to go, but it'll probably get there, which is not fun. Okay, so what tools can you use to get maybe some insight
right now in open source stuff? How many of you use PortSwigger, or sorry, Burp? Awesome. So, there are some extensions written by James Kettle to help with some of these things: HTTP Request Smuggler, and Param Miner to find things so you can test safely. There's a thing I skipped over because we didn't have time. To test safely, you need something called a cache buster to make sure you're not actually overwriting the cache for, like, a homepage. You're writing a new entry so you can see the behavior. So Param Miner is really helpful for stuff like that. I will warn you, it's a fixed, small subset of these bugs that it actively looks for.
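
To make the cache buster idea concrete, here is a minimal Python sketch. It is not how Param Miner implements it; the function name `with_cache_buster` and the parameter name `cb` are made up for illustration, and the whole thing assumes the cache under test includes the query string in its cache key. If the cache ignores query parameters, this gives no isolation at all.

```python
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_cache_buster(url, param="cb", token=None):
    """Return `url` with a throwaway query parameter appended.

    Assumption: the cache keys on the full query string, so each unique
    token maps to its own cache entry instead of the real page's entry.
    `param` and `token` are hypothetical names chosen for this sketch.
    """
    token = token or uuid.uuid4().hex  # random value, unique per test request
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, token))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Each call yields a distinct URL, so a poisoning test only affects
    # the entry for that URL and not the one real visitors are served.
    print(with_cache_buster("https://example.com/"))
    print(with_cache_buster("https://example.com/?lang=en"))
```

You would send the test request and the follow-up check request to the same busted URL, then confirm the unmodified URL still returns the normal response.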
It's not, "Oh, it looks for all the things in our paper." It's, "Here are the things that they found some success with," so you're going to scan for just these. Actually, this request smuggler was built in conjunction... this is defparam, if anybody's heard of defparam. I think this is the one tool on GitHub by defparam, but it helps you test for HTTP smuggling a little bit. Again, the same thing: a fixed number of things. You can see exactly which, maybe X- override or HTTP method override, h2c. So we were talking HTTP/1. I think it's just easier to depict in HTTP/1. Some of these problems exist in HTTP/2, but some kind of go away, so some
of these desync problems disappear, but not necessarily the confusion and the system disagreement on different things. So, we still have that. And for research tooling, if you want to look at ours, those three tables I showed you came from this paper, T-Reqs. The logo was drawn by Bahruz's daughter, and we're academics, so that's the logo forever. I love it. And it's a tool. You can run this fuzzer to find these new things, and it'll do this continually. We basically specify a grammar and just say go. And we did that for a couple of other things. FrameShifter does this for HTTP/2, not just HTTP/1, and then we messed around with trying guided fuzzing. Is anybody
familiar with guided fuzzing? It's just feedback, like, "I sent something and a segfault happened. Cool, let's do more of that." But we do that for HTTP requests. Honestly, just use FrameShifter. The guided part is not as effective. That's something we get to do in academia, right? We say, "Oh, this would be cool. Oh, it sucks. All right, cool. Moving on." Okay, awesome. So, those are the tools that I know are available to even start poking at this surface. But again, it's scary to me. So, what do we do next? This seems to me like the next-generation or new attack surface. As things are still getting more and more complex,
we're having more and more issues arise and less and less visibility. We kind of need a new approach to tools, or to thinking about these problems, that is aware of this complexity, aware of your whole system. What's the word? Context-aware. Like, GitHub Copilot is context-aware for your whole project. Well, okay, we kind of need that for the entire infrastructure, because you have this and this and this and this plugged in. Oh, your configuration changed on layer three? Well, that causes this whole different thing. We need to know how this is going to affect things, for detection and prevention. And the data is out
there, right? Everybody's logging every request that's going in and coming out. We just need to be able to ingest this and say, "Oh, here's how this server behaves on these kinds of requests." And the small part of academia that we could probably pull from: since we have the incentives to do systematic explorations, pull some of those techniques and look systematically, with complex reasoning, to find all instances, not just the one that we heard about and had a cool thing about and got some... And measurements. I'm your guy, by the way. Come talk to me about big measurements and data stuff. I'll
help out with that. Honestly, does anybody know what the heck's happening here? This has been happening to me for three weeks. It's just like, "Hey, insert your contact." And then I click it and it just goes away. Anyway, cool. I'm hacked by the NSA. That's fine. Whatever. So, this is what we talked about. I pretty much have a non-existent digital footprint beyond my website and LinkedIn, fairly intentionally. I don't know. So if you need to get in touch with me, my email's on there. LinkedIn is fine, too. And I'm always happy to talk. I'm also one of those nerds who, if you're vaguely interested in what I talk about, will talk about it with you for three
years until you kick me out.