Collecting Threat Data using Distributed Deception

Name: Collecting Threat Data using Distributed Deception
Uploaded: 2022-10-04
Duration: 48 min 3 s

BSides Augusta · 2022 · 48:03 · 124 views · Published 2022-10

Speakers

Michael Edie

Tags

Category: Technical

Topic: Malware Analysis, Threat Intel, Threat Modeling

Research: Case Studies and Incidents Analysis Empirical Research

Style: Talk

Mentioned in this talk

Tools used

Cowrie, Splunk

Malware

XMRig

About this talk

Michael Edie describes the Fakelabs Project, a multi-year effort deploying honeypots across geographically distributed locations to collect and analyze threat actor behavior. The talk covers the project's architecture, automation strategies, observed attack patterns including malware families and exploitation techniques, and actionable defensive recommendations derived from four years of primary research data.

Original YouTube description

What happens when you deploy honeypots in different geographical locations then monitor, collect, and analyze the threat data for several years? The Fakelabs Project is a practitioner's materialization of this idea. This talk will discuss the project's architecture, observations, automation, derived products, and lessons learned. In addition, there will be demos and suggestions for how defenders can apply the information presented.

Transcript [en]

all right mic check audio good all right awesome all right good afternoon welcome to BSides I know this is like the end of day talk so hopefully I keep you entertained and uh don't bore you too much so I'm a senior security engineer working in the DoD um this talk like uh was introduced as collecting threat data using distributed deception so who am I I've got a background in computer science uh spent some time in the military doing uh maintenance specifically on tank vehicles so then I transitioned to cyber once the Army said hey we have this thing called cyber and we're looking for people that are interested in that so because I have a

non-traditional path my nickname became the mechanic so you'll hear some people refer to me as the mechanic and that's that's the reason why um a few years ago I started this thing called SmashTheStack with a couple other people and I still run that to this day if you're familiar with OverTheWire it's kind of like that but a little more advanced so you can Google that and look that up and then I dabble in a little bit of cryptocurrency now when I'm not messing with anything technical I want to take a break from everything cyber I spend time with my family I'm riding my motorcycle or I am watching some kind of movie or fiction right specifically

anime if you recognize any of those up there then you know kudos to you all right so we're collecting threat data using distributed deception what is threat data well threat data is anything like an IP address a URL malware right these are things that we call IOCs that's basically threat data another way to put this right if there's non-technical people in the audience like a spouse or or a friend or whatever right if you're not into tech think of it as like someone's trying to kill you that's threat data right you can't really do much with it to make it actionable you need to put some context around that so I'm collecting all this threat data

and I'm storing it somewhere so we're going to talk about how am I collecting that the next part is this distributed deception piece now deception is I'm using a honeypot essentially is what I'm doing right so there's a honeypot called Cowrie and I'm using that as the deception technology so threat actor logs into the system they think it's a real Linux system all the commands that they would do on another Linux system would work and I'm capturing all that information so they log in uh they're they're looking at files I see all that they're trying to exfil things I see that they pull in their tools I'm capturing all of that right so they're

being deceived into thinking it's an actual legitimate system and the distributed portion of that is I'm putting these systems um all over the place so I have systems in Asia in the United States European Union um specifically like Finland Germany California for the United States and then India right so they're distributed all across the globe and they're collecting threat data which I explained earlier so here's the agenda for today right we're going to look at what's what is this architecture how am I collecting the threat data how is it distributed we're going to take a look at that we're going to look at the automations so if I spin up a honeypot I'm not doing all

these manually right think about having 200 honeypots out there and having to manually configure install Linux configure the honeypot whatever I'm using to extract data I'm configuring all that manually doing that 200 times that's you know 2022 that's not what we want to be doing so we're going to look at how do I automate all of this and then what are my observations this is probably the piece that most of you are interested in right what am I observing I've been doing this for about four years right so I started this a long time ago as like an individual research project I'm like hey I want to see what's happening out there yes you

can go to you know Malware Archaeology all these different sites and download malware um you can read reports but there's nothing better than like that primary research right it's like I know this occurs because I saw it right so that's kind of why I did this so it's been about four years and I said hey you know I can turn this into a talk and share the information with people that are interested so some of you may be interested in architecture and that's all you may be interested in how am I automating it and some of you might just care about hey what is he seeing and then we're going to look at some takeaways

all right so it starts with the honeypot right the deception technology every single honeypot uses Cowrie which is a medium interaction honeypot you have low medium and high the higher you go that means the more realistic it looks think of BSides you find a laptop laying around somewhere you pick it up you open it up oh it's Windows 11 you start typing commands and they work the attacker's seeing the same thing on the honeypot Splunk is how I extract the data so there's a Splunk Universal Forwarder those of you that are familiar with Splunk it's the equivalent like Winlogbeat or Filebeat it's the same thing right extracts the data from some

endpoint and feeds that data to some centralized SIEM so I'm using Splunk here because I have a lot of experience with Splunk and that's just what I use in this environment and then every single honeypot is with a different provider so it's not like I spun up an AWS EC2 instance and they're all in AWS and they're in different regions it's a separate provider that's in for instance India so provider in India it's hosted there if it's in California it's a VPS provider in California within the United States um and of course like I said they're all running Linux either Ubuntu or Debian or some flavor of Linux so none of these are Windows uh honeypots

all right here's another one that's in the European Union exactly the same and another one in the United States so that's what I call layer one layer two is the SIEM level so all the data gets pulled from all those honeypots that are geographically dispersed or distributed and they get fed to a SIEM that's in AWS then we have layer three which is the Splunk search head I use this to interrogate the data that's in the Splunk SIEM and then that data gets fed back into visualizations or any kind of data that I can manipulate and then from that Splunk search head I have a lot of the automations that push out the products right so one of the

products is a CSV file with various things that we'll cover later on and then more recently AT&T has this thing called OTX or Open Threat Exchange essentially what it is is if you found like malware or you found an IP address or some kind of IOC you can publish it to their site in the form of a pulse and the great thing about that is is if I found something and you found it that can get published um basically anyone that goes there can say hey three or four or five people saw the exact same indicator so that gives you a little bit more confidence that it is probably malicious all right so that was the architecture
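As a rough illustration of the layer-one data, the sketch below pulls successful logins out of Cowrie-style JSON log lines and counts how many sensors each source IP reached. It is not the project's actual Splunk search, which the talk does not show. The field names (`eventid`, `src_ip`, `sensor`, `timestamp`) follow Cowrie's JSON log output as commonly documented and should be treated as an assumption, and the sample events and sensor names are made up.

```python
import json
from collections import defaultdict


def successful_logins(lines):
    """Yield (src_ip, sensor, timestamp) for every successful login event."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt log lines
        # Scans and failed attempts are ignored; only real logins count.
        if event.get("eventid") == "cowrie.login.success":
            yield event["src_ip"], event.get("sensor", "unknown"), event["timestamp"]


def sensors_per_ip(lines):
    """Map each source IP to the number of distinct sensors it logged into."""
    seen = defaultdict(set)
    for src_ip, sensor, _timestamp in successful_logins(lines):
        seen[src_ip].add(sensor)
    return {ip: len(sensors) for ip, sensors in seen.items()}


# Hypothetical sample data using documentation-only IP ranges.
SAMPLE = [
    '{"eventid": "cowrie.login.failed", "src_ip": "198.51.100.20", "sensor": "fi-01", "timestamp": "2022-06-01T01:00:00Z"}',
    '{"eventid": "cowrie.login.success", "src_ip": "203.0.113.7", "sensor": "fi-01", "timestamp": "2022-06-01T01:05:00Z"}',
    '{"eventid": "cowrie.login.success", "src_ip": "203.0.113.7", "sensor": "in-01", "timestamp": "2022-06-02T09:30:00Z"}',
    "not json",
]

if __name__ == "__main__":
    print(sensors_per_ip(SAMPLE))
```

Counting distinct sensors rather than raw events keeps one noisy honeypot from inflating an indicator's apparent reach.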

right you can take a look at that and generate any questions you might have later and then so how do I automate that infrastructure as well there's three forms of automations the first one is Splunk right so Splunk has this thing where you can write a query let's say the query is hey how many domain controllers do I have in my environment you're going to get some metric back that says eight nine whatever that number is right let's say every morning you come in and you're tasked with running that query well that's that's kind of uh monotonous right how do we automate that Splunk lets you write that query you can save it and you can schedule it so every

morning you come in it's already ran and you get your metric of course you can create dashboards and all kinds of things like that right so that's how that's the first type of automation that I'm using the second automation is is something near and dear to my heart it's called Ansible right some of you may know it Puppet Chef or some other tools out there that automate things I use this to deploy the honeypot it also deploys the Universal Forwarder and any other configurations that I have so I basically run a playbook and I have a honeypot that's provisioned in a few minutes and then I have feed generations right we talked about this a little bit

earlier from the Splunk search head I have a git repo that has the CSVs for two of the products and then we have the AlienVault OTX pulse that's created so don't worry about uh uh trying to figure out what this means or if you can't see it essentially what it does is I extract all the IP addresses from all the data from all the honeypots right these aren't IP addresses that that have scanned the honeypot that have like pinged the honeypot or some kind of interaction these are IP addresses that have gained unauthorized access to any one of the honeypots right so that's that's the criteria for making the CSV list the other thing that I do is I enhance the

data by adding like first time I saw this last time I saw it the geolocation um and I also looked at how many sensors have seen this particular IP address right so if the IP address is in there and it says and there's a there's a column that says number of sensors and it's three that means hey three honeypots have seen this particular IP address and it's gained unauthorized access to that system all right so I'm going to recap it's not scanning right it's not pinging it's actually landed on that box and performed some kind of execution of code um and then the other thing that I kind of did that I thought people would have

find value in is I took the Tor exit node list and I said hey how many of these IP addresses that have done this bad thing are also a Tor exit node and I put that as a field and you can see that here I'm not sure if you can read it or not but that's what it looks like the part that I have outlined is the time that it runs which is 2 am every single day over the last few years right and it generates that and then if you want to view it you can go into Splunk and also view that report and this looks at all the data this is the other product that's

generated which is a URL feed so when someone logs into that system right whether it's automated or it's someone manually logging into one of those honeypots they reach out to pull code down well I capture that URL and that's what's that's what's generated in the second CSV product so that guy logs in they say hey I'm going to go out to my infrastructure that's in Russia China wherever it may be right could be U.S. infrastructure as well um and let's say it's their you know cool C2 tool that's gzipped well I capture that URL string that they that they use I put some of the same metadata around it that I do for the IP address field so

first time last time number of sensors but I do one additional thing I look at the data the domain names that are in there and I say hey you know what there's an Alexa top 1 million domains let's pull that out and the reason I do that is a lot of people take these lists right blacklist blocklist whatever you want to call them and they just put them in their IDS they don't filter them they don't sanitize them they don't look to see they said hey someone already did the work for me it should be good and you may find things like 8.8.8.8 in there which you don't want to block right those of you that know that's

basically Google so I pull it out just as kind of like a due diligence for people that are going to use this blindly um it's not foolproof though because a lot of what I've seen is not domain-based they're not going out to www.badguy.com right slash cooltool.zip a lot of times it's IP based so the Alexa top 1 million is domains not IPs so I'm not going to capture the IP addresses that are in there that's potentially like Google domain or Microsoft infrastructure and this is what it looks like
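The enrichment and filtering described for the feeds can be sketched roughly as follows. This is a minimal illustration under assumptions, not the project's Splunk search: the function names are hypothetical, the popular-domain set stands in for the Alexa top 1 million list, and timestamps are assumed to be ISO 8601 strings in one timezone so that string comparison orders them correctly.

```python
import ipaddress
from urllib.parse import urlsplit


def is_ip(host):
    """Return True when the host part of a URL is a literal IP address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def on_allowlist(url, popular_domains):
    """True if the URL's host, or any parent domain of it, is a popular domain."""
    host = (urlsplit(url).hostname or "").lower()
    if not host or is_ip(host):
        # A domain list cannot vouch for bare IPs, so they are never filtered.
        return False
    labels = host.split(".")
    return any(".".join(labels[i:]) in popular_domains for i in range(len(labels)))


def build_url_feed(observations, popular_domains):
    """Aggregate (url, sensor, timestamp) tuples into one row per URL."""
    rows = {}
    for url, sensor, ts in observations:
        if on_allowlist(url, popular_domains):
            continue  # drop well-known domains so consumers do not block them
        row = rows.setdefault(url, {"first_seen": ts, "last_seen": ts, "sensors": set()})
        row["first_seen"] = min(row["first_seen"], ts)
        row["last_seen"] = max(row["last_seen"], ts)
        row["sensors"].add(sensor)
    return {url: {**row, "sensors": len(row["sensors"])} for url, row in rows.items()}


if __name__ == "__main__":
    # Hypothetical observations using documentation-only addresses.
    observations = [
        ("http://198.51.100.9/bins/x.sh", "fi-01", "2022-01-01T00:00:00Z"),
        ("http://198.51.100.9/bins/x.sh", "in-01", "2022-03-01T00:00:00Z"),
        ("http://dl.example.com/a.tgz", "fi-01", "2022-02-01T00:00:00Z"),
    ]
    print(build_url_feed(observations, {"example.com"}))
```

The early return for literal IPs mirrors the limitation mentioned in the talk: a domain-based list never matches IP-based URLs, so infrastructure such as 8.8.8.8 would need a separate IP allowlist.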

um and then now for the honey for the honeypot deployments and the Splunk UF right I use Ansible and Ansible has something called a playbook so you can do individual tasks or you can run a whole playbook that has all the tasks that you want to do and there's a thing called roles that containerizes all all the things for a particular uh action that you want to do so I have a Cowrie honeypot and I've customized it right so this is not a default install of Cowrie this is there's things like for instance when you install Cowrie it used to be richard as a default user so I see attackers logging in and they're grepping for the richard user to see if

it's a honeypot uh the current version of Cowrie has phil as a default user so of course attackers are going to adapt and they're going to grep for the phil user right so I don't have any of those users in there I have about 10 or 20 users that I just create and I throw those into the honeypot so attacker logs in they grep they see a bunch of random users and they keep doing what they're doing so coming back to the the Ansible playbook there's a lot more code right written in YAML behind this but it doesn't make sense to show it if you want to see it after I'm going to do a

demo of it of deploying a honeypot so you can see what it looks like but this is essentially the code in the playbook that runs a bunch of other code behind the scenes all right so we're going to go into a quick demo um of what it looks like this is deploying uh one honeypot so it's only doing the Cowrie deployment right this is not the full deployment so you can kind of get an idea of what it looks like if you've never seen Ansible execute before this is what it looks like there's one particular host that is being deployed so where it says beer that's one host if it was 200 hosts there'd be 200 hosts

lined up across there um those of you in the audience that are like hey well you wouldn't do this with 200 hosts there's a thing called Ansible Tower and that's what you'd essentially use to deploy at scale uh if you go to Cowrie's website let's say you Google Cowrie right now and look at the installation steps there's about seven steps that they set up so what I did was I went through each step and I basically created an Ansible task or role that completed that and I added about one or two additional things so like at the end you can see it's checking for the SSH port the default port is quad twos I'm checking to see if

that port is active and that lets me know like hey the honeypot got installed successfully all right so now for the the GitHub data feeds right so I talked about the CSVs that are generated you have an IP CSV and you have a URL feed well those sit on the the uh the Splunk indexer or Splunk search head right how do I get those out so that you know other defenders can use them because it does it does you no good if I have them and you can't get access to them let's say you want to use them well what I do is I run a uh I have two bash scripts and it basically compares

the CSV that gets generated by Splunk with whatever's in the git repo and if there's a difference it it commits the change and pushes it to GitHub so every single day um around you know sometime after two o'clock in the morning a new file is going to get pushed uh so you can pick that up and then the AlienVault OTX pulse same thing here if you can't read the URL obviously I can give you the slides after this or you can just Google it it's called Fakelabs honeynet project right the great thing about the pulse on AlienVault is it allows you to convert that CSV into other formats this is what it looks like on AlienVault
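The compare-then-push step might look roughly like the sketch below. The talk describes scripts whose contents are not shown, so this Python version is a stand-in: the function names, file names, and commit message are hypothetical, and it assumes `repo_dir` is an existing git clone with push credentials already configured.

```python
import hashlib
import shutil
import subprocess
from pathlib import Path


def digest(path):
    """Return the SHA-256 of a file, or None when the file does not exist."""
    path = Path(path)
    return hashlib.sha256(path.read_bytes()).hexdigest() if path.exists() else None


def needs_update(generated, tracked):
    """True when the freshly generated CSV differs from the copy in the repo."""
    return digest(generated) != digest(tracked)


def publish(generated, repo_dir, name):
    """Copy the CSV into the repo, then commit and push only if it changed."""
    tracked = Path(repo_dir) / name
    if not needs_update(generated, tracked):
        return False  # nothing new today, so no empty commit
    shutil.copyfile(generated, tracked)
    for command in (
        ["git", "add", name],
        ["git", "commit", "-m", f"Update {name}"],
        ["git", "push"],
    ):
        subprocess.run(command, cwd=repo_dir, check=True)
    return True
```

Run from a scheduler shortly after the 2 am report, a call such as `publish("/tmp/ip_feed.csv", "/srv/feeds", "ip_feed.csv")` (paths are placeholders) would produce at most one commit per day per feed.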

and remember I talked about being able to correlate that with other organizations or individuals that have the same findings so this is my actual pulse that I've submitted right based on the data we were talking about there's at least three indicators on here that have over 50 other people that have identified the same indicator right so I'm seeing this in my honeypot and you have 1978 57 and 33 other people that have seen different IP addresses doing different things and like I talked about right if you're if the CSV format isn't something for you uh we have OpenIOC and we also have STIX and TAXII so if you're using MISP you can convert it to

one of those formats and and import it into whatever platform it is that you're using observations okay so this is the part that most of you are probably interested in right what have I seen I set all this infrastructure up I did all the automations you don't care about the automations like what have I seen okay so the number one thing right we talked about this there was a talk this morning talking about like passwords right this thing is is beat to death at all the security conferences any webinars you've watched they always talk about it and I think it's relevant because it's true right we talked about IoT XIoT all of them have default

passwords you buy you buy uh you know any off-the-shelf device and it says hey you need to configure it it's going to have like whatever brand and then password as a password or admin admin right so out of all the data right we have eight million events what's like the top 10 passwords that I've seen well the number one password's one two three four five six all right no surprise there yeah so I did some research and I was like hmm I remember there was like some list somewhere published right about like the top 10 whatever passwords and it turns out that Forbes did a study of the 100 worst passwords in the world

and the number one password in that list was one two three four five right one two three four five six which is what I have here was number two so it's so it's still important right we had we had a ton of talks that talked about you know admin admin password password and we'll see like when I when I go on and we start looking at some of the malware that I was looking at they have these uh like basic passwords hard-coded so they're not they're not looking for complex passwords they're not they're not having like you know we talk about a billion password list or whatever they're not they're not looking at like a 10 000 different passwords they have

like 10 passwords in the list and they're searching for those and they're gaining access to systems right so I thought it was important to to communicate that here to you and if you like pie charts right there you go you know almost half of the passwords that I see are one two three four five six okay all right so so what else have I seen IoT malware right or XIoT if you will is basically what I see a good eighty percent of it is that and they're Mirai variants right Mirai what is that in 2016 it was discovered basically a large botnet that took down a bunch of uh different organizations KrebsOnSecurity is one of them uh OVH which is

a provider over in Germany and a lot of the a lot of the attacks that happen were really collateral damage right so you have this huge IP space let's say they're targeting like the Sony network and some other company happened to fall within an IP space it was collateral damage it wasn't necessarily a target so I see a lot of Mirai variants that Mirai code was leaked and a lot of other threat actors said hey there's this very nice code base that's out there I don't have to do any research right I don't have a security team I just take this code and repackage it and that's what they did and you have a bunch of different variants

now what I noticed is there's no industry standard for naming the variants for Mirai right so let's say you know you decide to hey I'm going to take Mirai and I'm going to make it my own because there's some vulnerability that came out there and I'm just going to adjust the code and you have some string in there right let's say you put like Sora in there and somebody discovers that variant out there they name the variant Sora it's really just based on the binary name some particular string in there or maybe some output that's generated from the binary that's the only like way that's the only rhyme or reason how they name these things

and I picked an example to show you this happened uh in June of this year right it was discovered June of this year it's called the RapperBot um and the only reason it was named the RapperBot is because it drops a YouTube link to a rap video when it executes right does that look like any kind of like IEEE standard for naming malware no all right it uses parts of the Mirai code base right it's one of the variants um the research out there says it's just acquiring access right they don't see it doing anything malicious other than accumulating access and then there's a unique identifier if you run strings on the binary it's going

to pop this out and what it does is as it scans the internet connects to SSH servers it puts the string in there right so that's that's an IOC if you will I would use more than this as an IOC but that's one of them some other observations right as you're running strings or you're throwing things into Binary Ninja or Ghidra whatever thing you're doing if you're doing that I see things like this right cha cha sliding across SSH RN is like right now um so I thought that was interesting to share like they do have a sense of humor and they do put some things in here and then this is another thing I see

this a lot right so there's a ton of Perl scripts they're they're basically taking the milw0rm there's a site called um uh Packet Storm and there's a ton of exploit code there they download this code and they just change the attribution right so this was actually written by a group called milw0rm I can't remember the exact name but they take that same code they'll either cut out the header or they'll put their own name in there and all it does is it comes up it executes on your server connects to an IRC server and then from that IRC server they can say hey attack you know Sony network or attack this other IP or let's say you're playing you

know you're playing on some game server Quake or whatever it is and they know the IP address this is one of the reasons why people say hide your IP address they can have their 50 000 different zombie computers basically attack your computer and that's that's essentially what this does um and then this is kind of like indicative of uh Mirai and the different variants um essentially what this does if you can't read it don't worry about it it's trying to find a directory that it can enter and then it downloads every single architecture so it's saying hey um as I'm going to compromise these IP cameras you know the refrigerators the thermostats the whatever it is that's

out there right that should or shouldn't be connected to the public internet I don't know what architecture that is I don't know if it's a Windows laptop right like as I look across this room like I don't know if that's a Mac laptop I don't know if that's a Windows laptop that you converted to a Linux laptop like I don't know what the OS is so if I have one piece of of software that I'm trying to execute it may or may not work but if I want to compromise everybody in this room I'm going to say okay what's what's what architectures are out there you have ARM you have Intel right so x86 so what I'm going to

do is I'm going to take every single architecture I can and on the screen it basically says PowerPC ARM if you have a Raspberry Pi that's going to be ARM architecture the M1 the new M1 Macs those are ARM right and I'm going to execute I'm going to download all the different versions that I have and execute them till one works and that's essentially what they're doing then we have another malware called Dota malware anybody take a guess at why they call it Dota because it's called Dota the the tar gz is it's dota.tar.gz that's the only reason why they call it the Dota malware right uh so when you extract this it's going

to have I actually wrote a blog post on this so if you go to my blog later you can look at it and I dived a little deeper but there's two binaries in there there's a 32-bit and a 64-bit binary called tsm that's responsible for going out and scanning other systems and compromising them then you have two other binaries called intercon and and they change the names up so as you get these different um tar.gz there'll be different names in there but they essentially do the same thing there's going to be two that go out and scan and capture other machines and there's going to be two that are responsible for Monero mining all right

Monero is basically it's supposed to be designed to not be traceable and then shell scripts there's a ton of shell scripts that basically do the housekeeping responsible for executing the different binaries and setting up the crypto mining and then I found like this random obfuscated file that was on the system it was basically base64 encoded and then once you decoded that it was Perl obfuscated if you're not familiar with obfuscation uh think of like a string of code that says print hello world but instead of saying print hello world there's like 50 lines of code that essentially does the same thing right that's what they're doing is they're making it really hard to read what the

code does and then uh here's just an example of a Huawei exploit right so they took the Mirai code base took the exploit for the Huawei for Huawei uh device and they packaged it up so what they're doing now is instead of using the password to get into a system they can say hey let's see if we can exploit a system using this exploit and gain access that way so if you have a really good password right you said hey I'm going to change the default because I heard I shouldn't be doing that and you put in a good password well they can still exploit your device this way and gain access and then scan the other

systems here's another one I had no idea what TOTOLINK was at first but then when I I Googled it I saw it was another like you know basically Wi-Fi router you can buy um what was interesting about this and kind of some feedback for defenders out there that are looking at strings within binaries um so this is all plain text but if you XOR the binary strings using the default Mirai code which well there's a default key so XOR I'm not going to get into details of that but if you're if you're familiar with that basically don't just look at a binary that reveals strings as okay this is not encrypted there may be

encrypted strings in there which there is this particular one revealed the payload for the TOTOLINK exploit and then here's another demo right um so people think that a lot of the attacks that happen on the internet are all automated well there's actual individual threat actors that land on systems and don't realize that it's a honeypot and here's here's an example so the great thing about Cowrie is every single uh attack that happens on the system once they log in it records what they're doing and there's a tool that allows you to play back all of those attacks in real time or you can speed it up so just take a look at this and I'll

kind of narrate as it goes through
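Before the replay narration, here is a small sketch of the XOR point made a moment ago about plain-text strings hiding encoded ones. It assumes the commonly documented behavior of the leaked Mirai source, where strings are XORed byte by byte with each byte of the 32-bit key 0xDEADBEEF, which collapses to a single-byte key of 0x22. Variants can change the key, and the sample string here is made up, so treat this as an illustration rather than a universal decoder.

```python
import re


def fold_key(table_key: int) -> int:
    """XOR the four bytes of a 32-bit key together into one byte."""
    folded = 0
    for shift in (0, 8, 16, 24):
        folded ^= (table_key >> shift) & 0xFF
    return folded


def xor_bytes(data: bytes, key: int) -> bytes:
    """Apply a single-byte XOR key to every byte (the operation is its own inverse)."""
    return bytes(b ^ key for b in data)


def printable_runs(data: bytes, min_len: int = 4):
    """Return runs of printable ASCII, similar in spirit to the strings utility."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.group().decode("ascii") for match in re.finditer(pattern, data)]


if __name__ == "__main__":
    key = fold_key(0xDEADBEEF)  # assumed default key from the leaked source
    # Hypothetical encoded blob: padding bytes around one hidden string.
    blob = b"\x22\x23" + xor_bytes(b"/bin/busybox", key) + b"\x22"
    print(printable_runs(blob))                  # what plain strings would show
    print(printable_runs(xor_bytes(blob, key)))  # what appears after decoding
```

Running the printable-string pass both before and after decoding is the practical takeaway: a binary that already shows readable strings can still carry an encoded table next to them.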

you'll you'll notice the uh command line when it comes up it'll say AWS that's that's arbitrary so I can make that whatever I want and I kind of made it look like AWS so they think that hey this is probably an EC2 instance or whatever notice that the there's no default uh phil or richard user they're trying to basically set the system up so it doesn't record their history they're checking the architecture okay what kernel version is it vulnerable some basic reconnaissance right nothing nothing about what they're doing right now is is you know advanced essentially they're checking to see you know how many CPUs do we have maybe I can mine crypto on here right there's only two

cores on there so nothing nothing too fancy uh looking at temp directory to see if there's other files maybe they can mingle their files there and then they decide to go to /dev/shm and this is where they're going to pull down their files so what's great about this is I now have a copy of that that package right whatever payload that is whatever tools they have in there I now have a copy of that that I can analyze I can build detections around and pass and share that with the rest of the community uh the other good thing about it is they can't use that to attack other people right so let's say they they pull down

whatever nation state tool or whatever tool it is that they have they can't now take that and attack other systems so the great thing about running a honeypot if you're talking about what are the risks um that's one of the things that by design you cannot attack other systems so you'll see they'll try to execute um different code and they'll get frustrated like why isn't this working it's executable they'll try to download it again it doesn't work and then something that's pretty great is they download another set of tools right it's called fresh.tgz so I now have two copies of different uh different payloads if you will that this particular attacker has right and while

we were going through this you notice that it's it's not automated right this is actually someone typing these commands um there's where they download the other tool and I said thank you very much all right they try to extract it extracts now if you're not familiar with Cowrie I'm just going to explain it real quick if you try to cat like a text file or any plain text file it'll actually show you the output so the attacker is thinking okay this works but why doesn't the binary work if they had a binary there's like a standard like output that it shows just by design which is another signature for Cowrie or a honeypot but it won't execute it and so that's

that's the frustration I see a lot of times is it won't work they try try and they give up and leave right the reason he tries putty or she or whoever it is uh they're trying putty to see if okay is this a honeypot is it going to run any command I type it doesn't work so they they still believe it's a real system there you go they're getting output from their plain text files so they're like okay I can see the output so it's probably not a honeypot and then they just eventually give up because they can't get anything to work cool so H2Miner this is not a Mirai variant this is a different kind of

malware it essentially scans for vulnerable Redis servers right um and then what I've noticed with a lot of different malware is they use masscan to scan other systems now what's really cool about this is everyone else is trying to scan the rest of the internet to find vulnerable things but if you're not familiar with those IP addresses those are basically what's called RFC 1918 or basically your internal non-routable IP addresses right meaning you can't connect to these on the internet or you shouldn't be able to connect to these on the internet so what they're doing is they're saying hey I got access to this organization what other internal systems are also vulnerable
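The RFC 1918 point above can be checked mechanically when reviewing captured scan commands. The following is a minimal sketch, not the speaker's tooling: it uses Python's standard `ipaddress` module, and the example command line and the helper names `is_rfc1918` and `internal_scan_targets` are hypothetical, chosen only for illustration.

```python
import ipaddress

# The three private IPv4 blocks reserved by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(target):
    """Return True if an address or CIDR range lies entirely inside an RFC 1918 block."""
    net = ipaddress.ip_network(target, strict=False)
    if net.version != 4:
        return False
    return any(net.subnet_of(block) for block in RFC1918_NETWORKS)


def internal_scan_targets(command_line):
    """Return the tokens of a logged command line that parse as RFC 1918 targets."""
    hits = []
    for token in command_line.split():
        try:
            if is_rfc1918(token):
                hits.append(token)
        except ValueError:
            # Token is not an IP address or network (for example a flag or tool name).
            continue
    return hits


# Hypothetical command line of the kind a honeypot session log might contain.
example = "masscan 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 -p6379 --rate 10000"
internal_targets = internal_scan_targets(example)
```

A command whose targets are all internal ranges is a hint that the tool is meant for lateral movement after initial access rather than internet-wide scanning.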

right because most people I had a conversation with a system administrator um and he was asking me hey why do I need to secure my server when we have a firewall IDS all this stuff on the outside and it's protected right basically you have like six dead bolts on your front door why do you need to put your jewelry and your money and all that stuff in a safe and all the other stuff why can't I just leave it out on the kitchen counter and I said to him because attackers once they do get in or if they're already in or insider attacks right we talked about insider attacks and some of the other

talks they're going to scan inside because they're like hey your security posture within your organization um is probably not as fortified as the perimeter so just another example of why it's important to either practice zero trust since that's the buzzword we're using nowadays or assume breach because once they get in how protected are you there right and then this is another example of the passwords that they're using so this was hard-coded in a shell script for that particular exploit right if you can't read it it's basically root redis oracle password password with the ampersand in there and a zero for the o um and it was only about 10 passwords in

there right not 10,000 not 10 million right they weren't reaching out to some server and pulling down a password list it was only a handful and they're getting access this way so that's telling you something and where are they scanning internal servers other observations so like I said a lot of the binary strings if you're looking at malware samples and they're Mirai variants a lot of the attackers aren't changing the default key which is basically 0xDEADBEEF right or when you're looking at disassembly in Binary Ninja or IDA or whatever you're looking at it comes up as 0x22 basically right so you'll see 0x22 if you can read C code that's basically

the function for decrypting it encrypting it and then here's what it looks like when you see it in the strings right so the first one uh PMMV that usually decrypts to root so in about four or five lines of Python you can essentially brute force this key space to figure out what the key is for it another observation is basically crypto mining right I saw this particular config in one of the binaries and I was trying to figure out like what does this user thing look like so I went out to the GitHub and I saw it was the wallet address and this is what it looks like

in the actual binary so I was like okay I got the wallet address I found about three unique wallet addresses but like I said Monero is designed not to be traceable so if you take this wallet address and you plug it into like the Monero explorer like for those of you into crypto there's different explorers right so you have an ETH explorer you have a Bitcoin explorer you can plug a wallet address in and you can see like how much funds are in that particular wallet well with Monero when you do it it says hey you're trying to see how much funds this dude has in his wallet sorry you can't do that right it has some quirky message

like that um but the other thing that you can take away from this is you know by default TLS is disabled they deploy it the same way TLS is disabled so if you're trying to look for it I believe it's a 95 character string so you can write some Snort or Suricata rules to kind of detect this on the wire and then a lot of the pools if you do want to find out how much funds they've accumulated over time some of the pools have a website tied to them you can go to that website and then plug in the wallet address and it'll say hey they've made like you know ten thousand twenty

thousand since they've started using that particular pool it's not how much they have in the wallet in aggregate just how much they've mined on that particular pool and then another observation is hard-coded user agent strings right I've been a part of organizations where they say hey let's baseline the user agent strings and just look for this or Cobalt Strike has this user agent string let's look for that right the danger with this is let's say your organization baselines your user agent strings but your user agent strings are in this list then this is going to fly under the radar right so when you're doing things like this just have a lot of other things that enhance the confidence in

whatever detection you're building so don't just look for a naked user agent string as as like that's good this particular malware what it does is every time it goes out and scans and tries to connect to another system it randomizes which user agent string it uses all right so what's the so what what's the what's the takeaway right like what's the point of all of this um if you're a non-tech person you should be able to get something from this the whole point of all this is there's a lot of noise on the internet right you plug a system a server let's say you spin up a VPS um you take your Raspberry Pi and you

plug it into some public facing internet thing there's gonna be a ton of scans most of you know this all right so that's the first part a ton of Internet noise what are they trying to do they're trying to build a botnet they're trying to take over your computer to to basically leverage it as a part of their army your computer can be an IP camera uh your smart your smart thermostat it can be you know your VPS in the cloud it could be a Honeypot whatever it is right their goal is to capture as many computers as possible and turn them into zombie computers that they can basically press a button and say hey all these

computers that I own I want them to attack XYZ system the other component to this is they want to be able to lease that out right so right now you're not using your vehicle what if while you're here at BSides you could say hey I can make money with my car because while I'm at BSides for three four hours you can drive my car and I get paid you know a thousand dollars for the time that you have my car right that's essentially what they're doing here they're saying hey I'm not using this botnet I don't need it for anything right now but if you want to use it for a week pay me 10

grand and you can have it I believe the business of the talk a couple talks before this they talked about it was like $30 an hour to rent a different malware or whatever so they're basically building the botnet and then making money off the botnet the other thing is mining there's a ton of like mining like most of the malware you see have a component to attack other systems they have a component to exploit other systems a ton of things but the like second thing that I see the most is um kill every single other Bitcoin or crypto miner that exists on that system by name I believe uh there's one malware I saw it had like probably

50 different names for other competing uh malware things or miners that could be in a system and it kills all of those processes before it runs its own so that's the second thing that I see in there specifically Monero I have not seen any other types I'm not saying that there aren't but I haven't particularly seen that I mostly see Monero and XMRig which is a public uh it's an open source miner and then the last thing that you can take away from this right is opportunistic threat actors you saw when I showed you the example of like the log of someone logging in because they know there's a lot of noise out there

and there's a ton of scans most people just ignore it they're like hey it's just noise everyone gets scanned it's Port 22 they're going to scan it 23 they're going to scan it 80 they're going to scan it like what's the point well attackers know this and so what they're going to do is they're going to fly in under the radar and they're going to say hey within this noise I'm going to try to like use the same password right admin password or whatever the default passwords are and I'm gonna see if I can compromise a system but they're typically going to be after things like intellectual property right they're going to be after whatever thing that's

valuable there maybe they're looking for you know maybe right now you're connected to Wi-Fi right everybody has the same password but you have open share on your Windows computer and you have a bunch of pictures on there you probably shouldn't have on there well anyone in this room that's a part of that Network can see that share and connect and download those pictures right that's a true story happen to me during a deployment I had a soldier he didn't know anything about security I wasn't in security at the time I was still a mechanic but I was deployed and yeah someone basically got his pictures of his wife that he had on his computer and he

didn't know how he just said hey someone hacked my computer because the pictures ended up somewhere right so these are things that are happening out there when you go back home or when you're conversing with your colleagues this is essentially what I want you to kind of take away is these three things here's my contact information if you want to reach out if you want to collaborate those Splunk queries that I showed if there's any Splunk experts in here if you want to you know tell me how I can optimize that a little bit better or um tell me how jacked up I am because you could do it in like one line hey by

all means you know reach out to me on Twitter or come down after uh because I'm always willing to learn and grow all right are there any questions
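The remark in the talk that a few lines of Python are enough to brute force the Mirai string key can be sketched as follows. This is an illustrative reconstruction, not the speaker's script. It assumes the strings are obfuscated with a single-byte XOR, which is how the default 0xDEADBEEF table key effectively behaves once its four bytes are XORed together, and the helper names are hypothetical.

```python
def xor_bytes(data, key):
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def brute_force_xor(ciphertext, known_plaintext):
    """Try all 256 single-byte keys and return those that yield the known plaintext."""
    return [k for k in range(256) if xor_bytes(ciphertext, k) == known_plaintext]


def fold_key(key32):
    """Collapse a 32-bit key into one byte by XORing its four bytes together."""
    folded = 0
    for shift in (0, 8, 16, 24):
        folded ^= (key32 >> shift) & 0xFF
    return folded


# "PMMV" is the obfuscated string mentioned in the talk; "root" is the expected plaintext.
keys = brute_force_xor(b"PMMV", b"root")
default_key = fold_key(0xDEADBEEF)
```

With the key recovered from one known pair, `xor_bytes` can be applied to the remaining obfuscated strings in a sample. If an attacker has changed the default key, the same search still works as long as one plaintext, such as a common username, can be guessed.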

okay so the question is um there's things like VM escape right where they can exit the VM so with the honeypot can they escape from that and land on a system sure there probably is I'm not aware of any right now and there's not been any published how I mitigate that is there's really nothing of value they're going to get from basically doing that right so if they escape they'll land on the VM and they can maybe wipe all the logs but none of this infrastructure is there's no attribution right so it's not attributed to my company it's not attributed to um you know any particular organization and I'll be able to detect it rather

quickly just because of the detections that I have in place right so the answer to your question is anything is possible right Tavis Ormandy who's the guy that's like hacked a lot of things that people thought weren't hackable and there's a ton of other folks that go to Pwn2Own in these competitions and you know Apple releases a new product and then the next day it's owned right so we know this happens does that answer your question okay there was another hand

yes so there's a ton of stuff that Cowrie does by default that's signaturable one of them is the process list if you don't change anything it looks like it's I believe it's June uh 2021 or something like that so you can edit some of the files you can basically edit the source code it's all in Python and I change that to a different date and you just have to kind of go through the source code and look at it you can log in yourself and see what some things are that are static and I believe some people have written some blogs about some things out there the file system by default doesn't show a lot so you

have to I would look at the docs and I can talk to you after this if you want but there's a ton of stuff that's signaturable but some of it's not really worth the effort because a lot of people that log in they're not looking for that they're doing the easy things does it have the richard user does it have the phil user is the date uh you know June because when you do the process list it's exactly the same every single time I don't know why most people don't see that but it's pretty buddy and then of course like the um when you do uname if you don't change the

default Cowrie has a default um like kernel version that's in there so I changed a lot of that and I automate it so I don't have to go through and like fix it so right now if I wanted to change the versions on all my honeypots I would just go into the YAML change that particular config push it out and it's done
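One way to picture the "change the config and push it out" step is a small generator that renders a Cowrie override file from a single settings dictionary. This is only a sketch under stated assumptions: the speaker's automation is driven from YAML and is not shown in the talk, the values below are hypothetical, and the section and option names (`[honeypot] hostname`, `[shell] kernel_version`, `kernel_build_string`, `hardware_platform`) are given as they appear in Cowrie's sample configuration to the best of my knowledge and should be verified against the Cowrie documentation for the version in use.

```python
import configparser
import io

# Hypothetical fleet-wide settings; in practice these might be loaded from a YAML file.
FLEET_SETTINGS = {
    "honeypot": {"hostname": "web01"},
    "shell": {
        "kernel_version": "5.10.0-21-amd64",
        "kernel_build_string": "#1 SMP Debian 5.10.162-1",
        "hardware_platform": "x86_64",
    },
}


def render_cowrie_cfg(settings):
    """Render an INI-style override file from a nested dict of section -> options."""
    # Interpolation is disabled so that literal '%' characters in values are safe.
    parser = configparser.ConfigParser(interpolation=None)
    for section, options in settings.items():
        parser[section] = options
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


rendered = render_cowrie_cfg(FLEET_SETTINGS)
```

Writing `rendered` to each honeypot's override file and restarting the service would change what `uname` reports everywhere at once, which is the kind of fingerprint rotation described above.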

um so that's done automatically yeah if someone logs in I mean you can have concurrent users logged in so it's not a one at a time type thing right so if they log in they download like a gigabyte file and they come back again 10 hours later two days later it's hands off I don't have to do anything um uh no I don't refresh it other than the normal updates to Cowrie so if Cowrie has an update I'll do a git pull I'll look at the change log to see what's changed and then make sure there's no disruption and I'll do a pull and so I don't know how many people know about git but basically you can pull

down the code it's like Download update in it so other than updates no I don't change anything now if I see there's new things that are signaturable I will change it right so if I see a trend of like every time they connect they do a grep or something and then they log off right away I'll look and see what that thing is and then I'll adjust website and two questions what are you spending on the architecture so this is one of the questions I predicted would be asked it's less than a hundred dollars right the biggest cost for me for this you can there's a company called SkySilk

you can run a VPS as a part of the beta program it's like four bucks for one VPS right and that could be a lot of money for some people um but the biggest cost is storage and so one of the things I'm trying to do is move to something called SmartStore Splunk has a way for you to push the data within Splunk to S3 um and that's really optimized for things that you don't want to search long term if you're doing a lot of uh searches across a long length of time like a month or something like that you don't want to do that but if it's only like 24 hours 48 hours then that's fine

yeah definitely um in different like in the Asian region there's a lot there's a lot the attacks that happen over there in malware that I see there I don't see in other places like U.S or or the EU right I don't have a large sample size that's just what they've attacked and what I have obviously cost is a limited factor for me I'd love to have more but you know no one's funding the you know my personal resource this is something I'm doing on my own time

ah oh um so that's easy so Splunk allows you to get two types of license you can get a developer license that's I believe 50 gigs and then one that's 10 gigs the 50 gig one has like this big banner across that says like you're emailing all that stuff and then there's one that's 10 gigs so I basically applied to Splunk and I said hey I'm a developer because I create apps for the platform and that's how I get the license the amount of data that I see doesn't exceed the license that I have

not has been using to cleaning up their action downloading all these things uh no Cowrie by design doesn't allow that to happen yeah so the whole point of it is to capture those artifacts and it wouldn't really be good to do that because then if you created commands to prevent that then they could probably bypass that so yeah by design once an artifact gets on the box it's stuck there's even mechanisms to push it to S3 and these different platforms if you look at the config there's like a ton of outputs you can send it to MISP you can send it to VirusTotal you can send it to the Honeynet Project I mean there's

just a ton I just use a Splunk UF to extract the data because I'm doing some manipulations with it right like I'm basically normalizing it and creating different event types and all these things tell the box that would maybe try to bypass ING with notice that you're sitting stuff to support or notice you know maybe that's just the process uh no I have not personally seen anyone um escape the honeypot onto the host system um I mean if they did I just didn't notice it but I haven't seen anything suspicious with any of that obviously there's no login generated if they land on the system but I log everything and there's you know access controls

in place so that only certain IP addresses can access them right so for instance my home address is the only thing that can touch the AWS the Splunk indexer where all the data goes and the same thing with the with the honey pots the only Port they can do is the basically the index Port right which which is uh whatever I configured it to be in the back

uh so what kind of complexity are you suggesting

yeah that's a good idea I haven't done that so some of my next steps is to add like web-based honeypots to this right so there's SNARE and TANNER as a product that's out there that I'm kind of evaluating um but that's a good idea as well um I just haven't done that yet right any other questions

they don't know so essentially all the honey pots have Port 22 exposed so all they know is that Port 22 is open and I used admin and password and I got in or I used Root and root and I got in and then most of it's automated or if they're doing it manually they're they're doing their reconnaissance to figure out what it is right um there was there was someone I met last night that talked about adding honey tokens inside the honey pots right so attacker lands on the box yeah that was your Chris basically putting like AWS tokens in there or putting like brocade configs or anything that would kind of entice the attacker as they're looking around to

take those credentials and maybe you can you know figure out what they're doing with that and kind of do some uh analysis there so that was a pretty good idea I think we talked about collaborating on seeing what we could do there good question any others it's end of the day everybody's ready to go home parking lot dash cool all right so my name is Michael Edie once again thank you uh oh yeah swag swag let's see okay uh the question I have pre-prepared uh why do I filter out the Alexa top 1 million domains from the URL list I saw that hand black shirt go up first no no not really no uh

oh is that Daniel who's who's talking oh

no let's go back in the back yeah you um

yes there you go I mentioned that in the talk so essentially people take the list they'll dump it in their IDS they won't check it they won't look at it and so I'm kind of helping them like avoid those mistakes right so I filter out the Alexa top 1 million which is a list of like known good or supposed to be known good domains right so Google is going to be in there Microsoft's going to be in there you know whatever you can uh download that site and see what it looks like all right another question um who remembers the RFC I mentioned about those internal IP addresses the 10 dots the 192. are you

all part of this okay go ahead yes all right good all right uh I just want to say thanks to BSides for allowing me to speak and thank you to the audience for coming and listening uh have a good night
