Show Me The Honey: Creating Elasticsearch Honeypots Powered By LLMs

Name: Show Me The Honey: Creating Elasticsearch Honeypots Powered By LLMs
Uploaded: 2025-12-12
Duration: 21 min 39 s
Description: A honeypot is a fake instance of a software platform, designed to intentionally attract attackers to log and monitor their behaviour. This is usually done by threat researchers, with the hope of finding zero-days or attack patterns for which they can then develop defences. This session will take th

BSides Belfast21:3947 viewsPublished 2025-12Watch on YouTube ↗

Speakers

Claire Dickson

About this talk

A honeypot is a fake instance of a software platform, designed to intentionally attract attackers to log and monitor their behaviour. This is usually done by threat researchers, with the hope of finding zero-days or attack patterns for which they can then develop defences. This session will take the audience through how to create a honeypot using Python (which is actually an LLM trained to look like a very attractive Elasticsearch instance of a banking site). I was involved in this project during a hackathon week at Elastic. We will look at the importance and ethics of honeypots, the subtleties of logging and monitoring attacker probes, the ins-and-outs of training the LLM and its latency, and some examples of the in-the-wild data that we received from potential attackers during this project. #bsidesbelfast25 #securitybsides #bsidesbelfast #bsides

Show transcript [en]

[applause] So, hi everyone. Uh, hopefully you can see me over this podium here and hopefully you can hear me. Um, I my name is Claire Burn or Claire Dixon as my married name is. I haven't changed all my names yet, so whatever you want to call me. Um, I'm a senior security data engineer at at Elastic, the people who make Elastic Search. Uh, and I'm a bit of a nerd when it comes to security and threat intelligence. So today we're going to talk about something sweet. Honeypotss. Not the Winnie the Pooh kind. The kind that attracts more than just cute bears. The kind that catches hackers. We're going to explore how you can use AI and a bit of clever trickery to build

realistic digital observatories that attackers can't resist. So first of all, what exactly is a honeypot? At its core, a honeypot is a digital decoy. It's a system that looks like a vulnerable high-v value target. So, this could be uh an exposed database or a seemingly insecure login page or even an entire application. But its real purpose isn't to serve customers or to provide int to to provide functionality. Uh it's there to observe attackers and to help gather threat intelligence so that we can better defend our systems against real life threats. Uh, honeypotss are designed to draw attackers in, let them think they found something juicy and valuable, and then log everything they do when they don't

know they're being observed. So, what's this project that I'm going to talk to you about today, then? Uh, this is a project that we did as part of a hack week in Elastic. Uh, and it was done over, it was a proof of concept that was done just for fun, effectively over three days, um, in early 2025. and the honeypot has been running ever since uh open to the public internet. Um you might be wondering why I feel like I'm qualified to stand up here and ramble about honeypotss to you. Um and that's because it's been a subject I've been really passionate about since I graduated from Queens with my cyber security masters and my thesis on

applied honeypotss under my belt. Um back then in 2016 when we didn't even have color TV yet, we had to create honey nets and the data for them in a really painstaking and manual way. uh designing, implementing, deploying each layer, uh doing all the JSON ourselves, but now uh AI might actually be able to do something really useful and help us out. And that's what excited me about collaborating with my colleague Miguel Garson on this hack week project to find what we find on the public internet. So before we dive into the juicy stuff, let's have a talk about ethics. Uh, honeypotss can be incredibly powerful, but they also come with responsibilities obviously. So, first and foremost, with

in a honeypot, we never use real customer data or real user data. No real customers, no real accounts, nothing. Second, we don't retaliate or engage with attackers. Uh, our goal is to observe from a distance like like David Atenburgh. Uh, not to entrap or provoke. We're researchers. We're not vigilantes. We want to learn from attackers and not really not become part of the problem. Uh so yeah, basically think of it like building a camera and not a trap. Um yeah, so you might also be thinking, or maybe you're not, I don't know, but what's the difference between a honeypot and a fishing attack where users are made to think they're using a real software? Well, in a fishing attack, uh

the information is usually extracted under duress or urgency. like think think oh you you've won a prize but this link expires in 30 seconds. Honey pots don't do this. Uh they don't provide duress and they don't take action against the users of the system. And that means there's a lot of pressure to design a system that looks real and feels real and responds in a way that seems real. And that's where this project comes in. So why why go do all this effort? Uh because honeypotss give us what traditional defenses often can't, and that's clarity. In a normal environment, you're flooded with alerts and noise and false positives, and you can't really separate a legitimate user error from a malicious

actor. With a Honeybot, there are no legitimate users. So any activity is inherently suspicious. And that's what fascinates me about it. It means it's pure signal. So, I'm just going to ramble a little bit more before I show you any data. Um, by their nature, most honeypotss are facads uh and don't usually go too deep. Uh, so we're probably observing attackers in the reconnaissance phase of their attack. But it really depends how invested you are and how deeply you want to you want to do the research. Um, so when we talk about honeypots, there are two main categories, low interaction and high interaction. low interaction. They're the quick and easy ones. They're mainly based on one

service. Uh they simulate just enough of a service to look enticing. So think like a little bit of fake SSH, a little bit of fake FTP or maybe an elastic search instance. Um they're mainly for catching scans and bots and just seeing what's out there. No real like intelligence or uh persistence or like deep threat research. But with high interaction honeybots, uh these are like the full stage play, not just the facade. They simulate entire systems. Uh and attackers believe they've hit the jackpot, which means they stay longer and reveal more. But the thing about these is they are very hard to maintain and you need lots of guard rails to make sure your system

isn't actually hacked. Um [snorts] and that nearly happened during my mast's project. Ask me about that at the afterparty. So yeah, both are useful. Um, but the honeypot I'm going to talk to you about today is a low to medium interaction with a view to becoming high interaction. So basically we were hoping to see uh who finds these systems on the public internet. Are they bots? Are they humans? Uh how deeply do these people interact and how far will they go to extract data? And does realism change anything? So let's get into the technical details then. Um we created what looks like a legitimate elastic search cluster uh available at the classic elastic searchport 9200.

It was made to look like a misconfigured elastic search instance for a banking website. Uh [snorts] so obviously a known target for attackers. Um here's the twist though. There's no actual data behind it and it's not a real elastic search node at all. It's a Python app behind the scenes. Uh so the back end's a Python web app uh based on AIO HTTP AIO HTTP. That's really hard to say. Um that talks to a large language model that provides all the responses. So even if an attacker runs a script that would normally compromise an elastic search instance, it can't it it's Python. It doesn't understand. Um, so the LLM will make up customer data, transaction logs, uh,

whatever the attacker is looking for all on the fly and respond to the request that the attacker makes in a legitimate looking way. So the architecture was just as I've said uh an AIO a I'm not going to say that a Python app which takes incoming web requests and we basically instructed the LLM behind the scenes uh to that you're a vulnerable elastic search instance please act like one. Uh the whole thing was hosted on a cloud provider which I probably shouldn't name in case I get into trouble. Um, and to make things efficient, we also used a little SQL like database that works as a cache. Uh, because a lot of the interactions we were seeing were from

bots. So, it means that anytime an attacker sends in the same query twice, as is going to happen with bots, they're going to send a load of the same requests. Um, we we don't have to keep bothering the LLM with that, and it doesn't it doesn't clog it up or like take ages to respond, which could be a tell for Honeypot. Um instead we hash the same requests, stick them in a table and serve it up instantly if it ever comes up again. Uh so for logging then we every request and every response gets written out to a log file. So from there we use the ancient elastic search tech of filebe to ship those logs over to an actual real

elastic cloud cluster. uh that gives us a clean centralized place to to sort of uh analyze all attacker behavior and crunch the numbers on the data like so we can see what queries they tried, what tools they used, how long they stuck around and just what they did. Um and we can we can build dashboards out of that and get pretty graphs which is what I'm all about.

So to make the data that we were returning to attackers believable, we trained the LLM to respond as if it were a compromised elastic search instance, we used Azure OpenAI um as the API and gave it a very detailed very long prompt um to that included some valid customer accounts and passwords and uh valid indices that they could query on the elastic search cluster. So like tables in a SQL database if you're not familiar. Uh we also set the LLM temperature to zero so that it generated responses that were predictable uh and deterministic and giving it minimal creative license because it could end really badly if we didn't. So uh basically if an attacker sent a real elastic search

query it would respond in a way that uh looked real and that's the point. So here is one of the tells that we probably one of the tells that it was a honeypot that we can work on but uh this is an example of the fixed response we gave it for when an attacker tries to run a cat query on our instance. So for those of you who aren't too familiar with elastic search, a cat query normally returns information about the indices on a cluster. Um although you can tailor it to to return more specific information in the parameter path. Um so can anyone tell me what might have been a tell that that uh this was a honeypot? What data

is invalid here? Yep. [laughter] And I I literally only noticed this when I was making these slides. So yeah, lesson learned. Um, also it returned the exact same data every time. So um, that's another tell for a honeypot. Uh, so don't do that basically. So here's a visualization of how much that cat endpoint was queried. 615 times at the time of writing this presentation. And obviously given more time, I would have loved to, you know, flesh that out a bit. and um like add different responses for the different uh cat queries. But um as we did this in three days, I think this is quite good. Okay, so I bet you're all waiting for what did we actually see?

So within hours of going live, minutes even, uh we were seeing scans against the system. Uh and as of the time of writing this, we've seen nearly 11,000 interactions against the honeypot. uh most of which have been bots, but we've definitely had a few humans poking around too. We also had census, who are basically a good authority for internet intelligence insights, uh scan us several times, and I'll show you what happened with that later, but let's just say I think we came across quite convincingly. So, some of the data then

Um so I'll just talk through this then. Uh so the part on the left is that left. Yeah. Um so we have the the body of the request that the the attackers sent uh B 64 encoded on the left there. And on the this side we have uh the B 64 decoded uh version of that. So the interesting bits are the first one which is that's just a normal elastic search query. So we would expect that if we were seeing maybe humans poking around at this. Um that's just a query that matches everything in the cluster and returns it. Um the next one of note is probably the third one down. Um, [clears throat] so

they're basically trying to execute a command within a script within our elastic search instance. And that would probably have worked if it was a real elastic search instance, but it wasn't. So, um, I'm actually interested to see what that returned uh, because I actually can't remember what the LLM will return in that case. So, interesting. Um, and then the other interesting one was, oh, you can't really see that. Um, [laughter] there is one up from the bottom and it just basically says that a guy called George was poking around on a Windows laptop. Uh, giving us his uh his hardware serial number as well. So nice. Thanks George. Uh, I don't know if you can see this.

Okay, you can't really see this. I'll talk through it. Um, this is another bit of data where uh we were basically tracking the authorization that attackers tried to use with this. So, again on that side is the B 64 encoded version. On this side is uh the decoded version. So, you can see that attackers are trying you know general bot things like 1231 123 admin. Um, there's some NLM handshakes which are a Windows authentication thing. Probably George again to be honest. Um, the other interesting one is uh the username elastic and the password northbank elastic. This was a username and password combo that we were using to test. Uh, we were actually using this for testing the LLM.

However, um this came up a lot more than the number of tests that we did. So, I'm wondering if the LLM like spat it out as a as a possible password given the fact that we tested on this so much. Um which is really fascinating um and something for me to look into. Um, and then the next interesting one here is the next one down, which is username elastic and password chang. Uh, so this tells me that maybe somebody plugged this into a an automated tool and typoed it uh, and then sent a ton of requests to our instance, which again interesting. don't quite know what to make of it but interesting. And then the the next thing that that

interested me was the path that uh attackers used in the query. So um with elastic search you essentially you essentially send a a curl request. Uh so you send you send it to a URL with a path. Um, so this was the last part of the path part that said part a lot there. Um, that the attackers used. All of these are genuine lookingish elastic search things. Um, apart from the the AAB and the fab icon.ico which indicates faults,

but I want to zoom in on the the search path here because what are what are people searching for here? That's the interesting bit. So again, I don't know if you can see that very well, but um we got some very interesting stuff in the search query. Um and a lot of scripts coming from certain countries. Uh and yeah, um the one on the top right here, um is definitely an attempt at compromise. And I'm I passed this along to our threat research team to see what they make of it. Um, and yeah, this this one on the top the far corner um was done a lot like multiple times a week. Um, so again

interesting and I took the liberty of uh translating it. Again, I don't know if you can see that, but very interesting. Um just yeah and then uh yeah so this is the this is the dashboards I was getting very excited about again I don't think you can see this apologies um but we were most scanned by the USA which is really interesting um and with a a region of New Jersey um and yeah you can sort of see the you can see the user agents that people used or maybe you can't. Um, come see me afterwards if you're interested in this. Um, and you can see also that we got scanned a lot by census, uh, Amazon, um,

and a lot of Chinese companies. Um, again interesting. So, I mentioned census. Uh, so this this was our report on census. Um, and as you can see, it lists us as an elastic search cluster with ports 9200 and 9300 open, which is correct. We asked the LLM specifically to act as an elastic search cluster with with these ports open. The other interesting bit is that it correctly or incorrectly identified it as an elastic search 8.5 cluster which is exactly the exactly the the version that we asked it to emulate. Uh so that's a that's a major win in my books. Um we managed to trick one of the one of the like biggest scanning systems

on the internet. Um yeah so takeaways then attackers move very really really fast you'll get scanned almost immediately obviously if you put anything on the internet secondly realism does matter as we saw with the cat endpoint like people kind of abandoned that uh but uh like yeah if if the honeypot does look valuable attackers will try things and third LLMs are incredibly effective at crafting believable fake data. They scale deception and they do it really really well. So yeah, whenever I did my cyber security masters, uh we had to spend hours making the data, but uh now we have AI tools at our disposal and that can help us out and we can iterate a lot

more quickly. So looking ahead, this technique has huge potential. Uh, imagine honeypotss that adapt in real time based on the attacker's input or LLMs that simulate like internal enterprise networks or changing error messages or even fake sock activity. This project was a proof of concept that we cobbled together in about three days. Imagine what else we can do with this given more time. I think there's a lot of research to be done here and a lot more fun to be had in a safe and legal manner. So, thanks so much for listening. Uh, if you're interested in collaborating, building honeypot systems, we're just chatting. Uh, let's connect after this and or at the afterparty. Um, remember,

best way to understand an adversary is to invite them in with a little bit of honey. [applause]

Show Me The Honey: Creating Elasticsearch Honeypots Powered By LLMs

Related talks