Now You C(&C), Now You Don't

Name: Now You C(&C), Now You Don't
Uploaded: 2022-07-19
Duration: 42 min 21 s

BSides TLV · 2022 · 42:21 · 321 views · Published 2022-07

Speakers

Amichai Shulman, Stav Shulman

Tags

Category: Research Technical

Topic: Malware Analysis Network Security Threat Intel

Research: Case Studies and Incidents Analysis Technical Deep-dives

Style: Talk

Mentioned in this talk

Tools used

ConnectWise Control Google Docs

Platforms

Discord OneDrive

Service

Amazon S3 Dropbox LinkedIn

About this talk

Amichai and Stav Shulman demonstrate how threat actors build resilient, cost-effective command-and-control infrastructure using legitimate platforms like Spotify, Discord, and others. The talk explores botnet construction techniques across difficulty levels, showcases proof-of-concept abuses of cloud services and social platforms, and argues that traditional IOC-based defenses are inadequate against this evolving threat landscape.


Hello everyone, welcome to our session, "Now You C(&C), Now You Don't". My name is Amichai Shulman, this is Stav Shulman; you can see the resemblance, we're related. We'll be talking today about how to build a botnet infrastructure that is very resilient yet at the same time very cost effective and affordable. We'll discuss a little bit what brought us to do this research and what the motivation behind it was, and then we'll dive into describing how botnet infrastructure is built today and how researchers are tearing it down. Once we understand that, we'll show you a few examples that we were able to pull together through our research of botnet infrastructures that are very resilient, very cost

effective, very cheap, and we'll end by discussing further research and some conclusions that we drew from this research. So, without further ado: why did we pick on botnets? Because every real large-scale cybercrime operation today relies on a stable, functional botnet. If you want to do carding, credential stuffing, brute forcing, denial of service, or even if you want to pull off a large-scale SQL injection attack, you need a functioning botnet. There are organizations and personas that grow and cultivate botnets and sell pieces of them to individual organizations to carry out their cybercrime operations. There are organizations who like to build

their own proprietary infrastructure and use it for their cybercrime operation. In any case, this is the first step, the cornerstone of the cybercrime operation. So when researchers, companies, and law enforcement tear down or neutralize that botnet, they are destroying the cybercriminal operation. So the questions we asked ourselves were: can we build a botnet infrastructure that could survive the current practices of neutralizing botnets? Can we build such an infrastructure that is cost effective enough to be used by your everyday hacker? And of course the biggest question of all: would we survive this exercise? Take it from here, Stav. So I would like to begin with actually

describing the methods and approaches that are currently available to threat actors for building and maintaining the infrastructure for these botnets. I think we can divide them into three main difficulty levels, the first obviously being the most basic one: an actor manually maintains a pool of domain names. This means the actor must acquire dedicated domain names, or use a set of compromised servers on which they place malicious content so that the servers operate as part of a C2 infrastructure. With this method the actor must somehow deliver the domain name to the victim up front, meaning we will find it embedded inside the malware binary or inside a dedicated

configuration file that will obviously sit close to the malware. Now, an actor that would really like to go the extra mile with this very basic approach can maybe have a list of backup domains, but these as well have to be delivered to the victim somehow, meaning they will be in a configuration file on the machine or they will have to be delivered via an already live and functioning C2 channel. Moving to some mid-level practices: this approach, I would say, is for actors that really want to feel like they're using some sort of cutting-edge technology but are actually still relying on domain names. That's when we come across binaries containing DGAs. So instead of

one domain name per sample, we will find a domain generation algorithm in it. Based on a random seed, or a set of parameters and a prefix, it will generate many, many domain names that will be registered as required. So instead of one server per session we will see many per session, which might seem confusing at first. But these algorithms are embedded inside the binary, so once I have the binary I can pretty much predict what the domains are going to look like and what's going to come next. Now, some actors are very creative, think outside the box, and leverage platforms, tools, and infrastructures that were not, per se, meant for communication

and use them as C2s. So we can see the usage of profiles on social media platforms, things like Facebook, Instagram, and Twitter. We see that cloud-based file-sharing services are gaining great popularity right now, things like OneDrive and Dropbox. And then we can see some actors that take whole pre-made utilities that have a legitimate purpose in life and attempt to use them for malicious traffic. A great example I would like to give for this one is the Iran-affiliated APT MuddyWater, which ran a huge campaign relying entirely on a legitimate IT utility called ScreenConnect, so that the malicious traffic looked like IT traffic in the network. I have here some more examples of the

usage of cloud-based file-sharing services. We have Dropbox and we have OneDrive, and we can see in the highlighted parts that for these methods the actor has a limited set of API keys that has to be embedded in the code, and a set of pre-known folder and file names that are in the code and are then exposed in the URL of malicious requests. So these are the resources that are available to threat actors. Moving on to our side, the defenders, security researchers, security vendors and whatnot: what can we do against these resources? I think that we have a pretty predictable workflow that has four main steps. We first must identify this

malicious resource. We will then analyze some samples that are in our possession, we will enrich the data to make sure we didn't miss anything, and we will then have some sort of response and remediation process. How do we even identify such malicious network resources? Well, we have the classics of network anomalies, such as high or non-standard ports. We have the use of strange domain names, which can be anything on the spectrum from a funny-looking name that doesn't match normal traffic all the way to ridiculous mimics of legitimate services. And we can sometimes even identify a high volume of traffic between endpoints that we know are not supposed to have so much data

transferred between them. We have IDS alerts, anything that is based on Snort rules, and then we have the more advanced EDRs that help us dynamically detect any malicious or strange communication attempts from processes in the network. So we now have a bunch of network identifiers in hand; what can we do with them? We can hunt for malware. We can opt for publicly available repositories such as VirusTotal or Hybrid Analysis, but we have these identifiers from alerts in the network, and obviously they didn't trigger themselves, so we probably have binaries of our own. We can now begin with the most fun part, malware analysis, which can be anything from the simplest sandbox execution and

can escalate all the way to full reverse engineering. But it doesn't matter how difficult the analysis was, because the outcome will usually be the same: more network identifiers. Because we're now such suspicious beings, we will go ahead and try to enrich all the data we've collected. We can do things like look at unique registrar data: if I have an actor that is using unique name servers, I will find additional infrastructure that uses the same ones. If I have an actor that is using abused, hacked servers, it means the actor has some paths of its own on these servers and some malicious content on them, so we can identify the naming conventions and

file names the actor likes to use, and then enrich based on the content itself and find additional servers like that. We can even look at digital certificates; these can be certificates forged by the actor or ones that were stolen and then reused. Now that we've really exhausted all possible leads of investigation, we should have some sort of response. Sadly, this phase today is still mainly based on IOCs, and specifically IOCs related to outbound communication. It's pretty much the convention these days: alerts from the Israeli CERT, US-CERT, and all the big vendors use IOCs, meaning that everything we've collected in the research is translated to one. So

hash values for files, domain names, URI paths, and parameters. We would expect organizations to take these IOCs, put them in a deny list, and block. We can take response a step further and sinkhole known infrastructure: we can register malicious domain names ourselves and make sure that all live infections now turn to us. And if we are familiar with the back office of this actor and the commands that are supported by its malware, then we can go ahead and run a pseudo server of our own that sends a disable command or a kill switch to all infected machines. When it comes to remediation, we can try to team up with hosting providers

to take down malicious infrastructure, or we can team up with social media platforms and have them close all threat-actor-owned profiles. And we can really try to clean infected machines of all persistence mechanisms and malicious binaries; we can sure try that. So if we look at the resources available to threat actors compared to our defense workflows, which even though they're not perfect are not so bad, you might think that we have a shot at winning this war against crime. Domain registration is expensive: even if a domain is as cheap as ten dollars, that is cheap if I have ten, not two thousand. And we've mentioned that we can track some

registrations. When it comes to social media platforms and cloud-based file-sharing services, actually creating the accounts for these profiles is very laborious, because they require things like verified email accounts and active phone numbers for two-factor authentication, which really limits our actors. These are also platforms that are trying really hard to fight fake accounts and bot accounts, so again, not in the actors' favor. We have great EDR technologies, so we can identify malicious network patterns. We've got great personnel capable of malware analysis who can analyze captured samples and identify patterns of C2 registration to block further infections. And of course we can block and take over known infrastructure.
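Editor's sketch: the defender advantage described in this part of the talk (a DGA recovered from a binary is predictable, so its domains can be blocked before they are used) can be illustrated roughly as follows. The algorithm here is a toy written for this note, not one taken from any real malware family, and the seed string, the `.example` TLD, and all function names are assumptions.

```python
import hashlib
from datetime import date, timedelta


def toy_dga(seed: str, day: date, count: int = 5, tld: str = ".example") -> list[str]:
    """Hypothetical DGA: derive `count` hostnames from a seed and a date."""
    domains = []
    for index in range(count):
        material = f"{seed}|{day.isoformat()}|{index}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits to the letters a..p to form a label.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:12])
        domains.append(label + tld)
    return domains


def build_deny_list(seed: str, start: date, days: int, count: int = 5) -> set[str]:
    """Precompute every domain the toy DGA would emit over a date window."""
    deny: set[str] = set()
    for offset in range(days):
        deny.update(toy_dga(seed, start + timedelta(days=offset), count))
    return deny


def is_blocked(hostname: str, deny: set[str]) -> bool:
    """Case-insensitive lookup that tolerates a trailing dot."""
    return hostname.lower().rstrip(".") in deny


# A seed recovered from a captured sample (assumed value) covers a week ahead.
deny = build_deny_list("recovered-seed", date(2022, 7, 19), days=7)
```

The rest of the talk argues that this kind of deny list stops helping once the traffic goes to a hostname that legitimate users also need, since that hostname can never be placed on the list.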

So when do we lose, though? Usually when we're up against nation-state-sponsored actors. These guys have great resources, an abundance of headcount and budget; they can have as many Dropbox accounts as they can dream of and as many phone numbers as they like, and generally can do anything at a gigantic scale. This brings me back to our original research question and to our motivation: can we create a botnet infrastructure that is mega robust and resilient and that is available to our everyday friendly hackers? We now know that it must be based on publicly available infrastructure; it must be indistinguishable from normal traffic in the network; it cannot be IOC'd; it must be one where, if someone were

to catch one of my malware samples or one of my requests, it will not affect the rest of my infections and my bots; and it has to be cheap and cost effective. So we were experimenting with a few ideas. We'd start by saying, yeah, this is going to work, and then somebody would say, no, I know how to tear it down, and then we'd take another one and tear it down again. But eventually we came up with a scheme that we were able to reconstruct on a number of infrastructures. The first example I'll talk about is Spotify, and the reason we picked on Spotify is because "Spotnet", a Spotify-based botnet, seems like a cool thing to do, but

also because Spotify is very common today, and you would see a lot of normal Spotify traffic within organizations as a regular thing. The other advantage, of course, is that Spotify has a very nice API that can be used to interact with the service. So we took Spotify, and now the questions we have to answer are: how do we encode the data using Spotify, how do we make the botnet's Spotify traffic look like regular Spotify traffic, how do we ensure that the individual bots are resilient, and, talking a little bit about registration, how do we make the infection and registration process resilient as well? So, first question: how do I encode botnet

data in Spotify? We found the easiest way to put data into Spotify was using podcasts. Everyone can put a new podcast with episodes on Spotify. We're doing that through a platform called Castos; it cost us 19 dollars a month, cheap enough. If you want to go cheaper than that, you can do it using an Amazon S3 bucket, for example. You put all your content (audio, images, descriptions, and the XML feed) into your repository, Castos in this instance, and then you go to podcasters.spotify.com and just put in a link to your XML feed. Now everything that you upload to your source, Castos for example, is updated in Spotify and published within

20 to 30 minutes. So we have a way to put in data; how do we encode our commands in that data? One obvious idea is to encode our messages within the audio files or the images that accompany the podcast and the episodes. Of course there is a lot of room there, and you can put in binary data. The problem is that when Spotify takes the information from the original platform it transforms it, so you cannot just embed a binary within an audio file. There are ways to do that: you could do audio modulation; I think there are some people here who remember the sound of a fax machine or an old modem.

So you could use that, or you could use images that are read back with OCR to encode data; not too complex, not that simple. But it turns out that if we want to deliver short messages, which can be text encoded, for example using Base64, we can use the description of a podcast episode. It's long enough to include short commands and instructions, it is long enough to include the URL for a further download of a binary or more data, and it can include the identifier of the next message. One other thing it may include, which I'll mention here and talk about later, is a digital signature of the message. So we've decided that we are going to

encode botnet messages inside the descriptions of individual podcast episodes, and each individual episode in the podcast would actually contain the identifier of the next message, which is again a podcast episode. It could be of the same podcast or of a different podcast, from the same publisher or from a very different publisher. The way a bot accesses the information is by accessing the URL that you see there, and as you can see it's a generic Spotify URL with a unique identifier for each message. So clearly, if you try to use this identifier as your IOC, you are going into a race you cannot win, because each message now has a different

identifier, and the episodes contain the identifiers of the next messages. So what we have here is a scheme that makes it very hard to put IOCs in place. In fact, even if you are fast enough to block a few message identifiers, the bot can go back and ask for older messages using their identifiers; the botmaster, at the same time, goes to the source of the data and changes the description of older episodes, and the bot is now up and running again. So it's very, very difficult to take this down in terms of IOCs. And later, even when you have found out that you're infected and you now want to identify all infected machines within

the organization, you'd have to go to all the machines that have Spotify on them, look back at all their internet access history, and try to find out whether they were accessing Spotify with specific identifiers that you know are part of the botnet messages. So this is nice, but then you have these dedicated researchers who say: well, we were able to discover all the identifiers, and then we took down the podcast itself, and we actually made Spotify close the account that was used to publish that podcast. What do you do in that case? Very simple. What we're going to do in that case, when the bot feels that communication is down, is that we're

using the search functionality of Spotify, and we search for some specific keywords that we can change from time to time. We can choose whether we want very strict search keywords or loose keywords. If we choose a strict one, like "google" here, we get a relatively small set of results, so we don't need to go through many results; by the way, most of them are legitimate. If we do a loose search, then we get a lot of results, and it cannot be distinguished from normal queries in the network. What the bot does now is go through the results, looking at the descriptions of individual podcasts and finding a specific pattern there. Again, this pattern can be changed from time to time.
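Editor's sketch: seen from the defender's side, "finding a specific pattern in a description" is a content check on text that is already visible in responses. The snippet below is one hedged way to approximate such a check offline, assuming (as the talk suggests) that messages are Base64 text inside episode descriptions. The 24-character minimum, the function names, and the sample data are all assumptions made for this note; the talk does not disclose the actual pattern.

```python
import base64
import binascii
import re

# Runs of Base64-alphabet characters at least 24 long, with optional padding.
# The minimum length is an arbitrary choice meant to skip ordinary words.
BLOB = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def decoded_blobs(description: str) -> list[bytes]:
    """Return the decoded bytes of every valid Base64 run in the text."""
    found = []
    for match in BLOB.finditer(description):
        try:
            found.append(base64.b64decode(match.group(0), validate=True))
        except (binascii.Error, ValueError):
            # Not valid Base64 (wrong length or padding); ignore it.
            continue
    return found


def flag_descriptions(descriptions: dict[str, str]) -> dict[str, list[bytes]]:
    """Map episode id -> decoded blobs, keeping only episodes with a hit."""
    flagged = {}
    for episode_id, text in descriptions.items():
        blobs = decoded_blobs(text)
        if blobs:
            flagged[episode_id] = blobs
    return flagged
```

A check like this would produce false positives on legitimate descriptions that happen to contain long tokens, so it is better read as a triage filter than as a detector.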

And once the pattern is found, we know that we have the new channel for the botnet. In order to prevent researchers from sinkholing this, we're going to use a digital signature within the message, so the bot knows that this is an original podcast that is part of the botnet and not just a researcher trying to tear down the network. And again, these patterns can change from time to time, so it is very hard for someone to get to the point of saying, well, this bot has no communication anymore. So probably the only question now is how do we bootstrap all that: how do we register and find the first message in the botnet channel? And clearly we're going to use the same

technology. We're going to use the search functionality with some search terms that are embedded within the malware distribution; we can change them whenever we change the malware distribution. We get a list of podcasts, we search for podcasts with a specific pattern in their description, we check the digital signature, and then we know that this is a channel we can start working with. Notice that even if someone captured our sample and knows the keywords that we're using, they cannot block infections from this same sample, as long as the search terms are not super unique. So we're very resilient here. And quite frankly, what I've shown you is that

we have a great scheme. You cannot use your normal IOCs, inspecting requests, to disrupt the communications of existing bots or to deny the addition of new bots to the network. Removing accounts and working with Spotify to clean everything up does not help you tear down the network, because it would be up and running again in seconds. And finding infected machines is very difficult, because you have to go through all the machines that have Spotify on them and look very closely at all their Spotify activity in order to understand whether or not they are part of the Spotify botnet. And if we had actually piggybacked on existing Spotify accounts on the infected machines, that would have been even harder for a researcher to find.
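Editor's sketch: the retrospective hunt described here (go through each machine's access history and look for episode identifiers already known to belong to the botnet) might look roughly like this against a web-proxy log. The three-field log format, the two URL shapes, and the 22-character identifier length are assumptions for illustration; the talk only says that each message is fetched through a generic Spotify URL carrying a unique identifier.

```python
import re
from collections import defaultdict

# Assumed URL shapes for an episode lookup, followed by a 22-character id.
EPISODE_URL = re.compile(
    r"https://(?:open\.spotify\.com/episode|api\.spotify\.com/v1/episodes)/"
    r"([A-Za-z0-9]{22})(?![A-Za-z0-9])"
)


def hosts_touching(log_lines, known_ids):
    """Return {client_host: set of known episode ids that host requested}.

    Each log line is assumed to be '<timestamp> <client_host> <url>'.
    """
    hits = defaultdict(set)
    for line in log_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip lines that do not fit the assumed format
        _, host, url = parts
        match = EPISODE_URL.match(url)
        if match and match.group(1) in known_ids:
            hits[host].add(match.group(1))
    return dict(hits)
```

As the speakers point out, this only works for identifiers the defender already knows, and it requires request logs with full URLs, which in turn assumes TLS is being inspected somewhere on the path.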

So it works perfectly, and we could stop there. But I realized that some of you here today would say: ah, but that's Spotify, and we will not allow Spotify inside our networks; that's social networking, that's not for business people. So we looked for a business application, and we chose Discord, because Discord is becoming very popular, because it's a business application, and because it has an even simpler API. The questions we have to answer are the same questions. How do we deliver data? In Discord this is very simple. Discord uses WebSocket to get data into the clients from the server: each client opens a WebSocket to discord.gg

and text messages are just poured into this channel from the server. If you want to convey binary data, no problem: you attach a file to a message, and the message then contains a URL to the file, which is actually stored in Google Cloud Storage with a unique URL for each and every file. So I can send text messages, I can send binary data, and I can of course use digital signatures to make sure that no one is sinkholing me. And just to make it even prettier, the data going down to the client is compressed using zlib. So the client is constantly polling the channel as part of its normal functionality, asking: do I have messages, do I have messages, do

I have messages? Anything from any source gets poured into the same channel, and what is sent back is compressed. You can see, well, you cannot quite see it here, that the actual polling request is plain text and the message itself is compressed, so even the same message doesn't look the same in two different instances. The request of course contains no identifier for the source of messages. The response does contain the identifiers, but you need to decompress the traffic in order to understand it, which is exactly what we did with our own small and simple botnet client here. The thing with Discord is that you cannot work anonymously, so you need some sort of account context

in order to work, so we have to work a little bit harder to create a botnet on top of that. The scheme that we came up with is that every malware distribution has what we call a bootstrap account credential in it, and it's the same for everyone who is infected by the same malware distribution; we change it when we change the distribution. When infection starts, the client connects to Discord with the bootstrap account. In our back office we made sure that the bootstrap account becomes a friend, in the Discord network, of the botmaster account, so when the client is up and running it starts receiving messages from the botmaster account. And the first message

would be the credentials for a newly created, dedicated account for that bot. This account is created in our back office, so there is no problem overcoming issues like bot protections, reCAPTCHA, and whatnot. From that moment on, we're connecting to the Discord service using this dedicated account for each and every bot, so there is no single IOC you can use to identify bot accounts. From that moment on these bots are holding on for dear life, because there is no IOC to detect the individual accounts, and certainly not to tear down all the accounts. If someone, for some reason, was able to identify the bootstrap account and put in an IOC for that,

existing infections are not affected, and I will have to issue a new malware distribution; okay, that's a lot of work. If for some reason the botmaster account was taken down by working with Discord, I create a new botmaster account in the back office, I make it a friend of all the individual bot accounts, and in seconds they are up and running again, receiving messages. Now, I know that some of you here are very dedicated to this, so you would probably work with Discord and find each and every individual bot account and take them all out. In that case, what we'll have to do is create two accounts for every new infection: one that is used currently and is a friend

of the botmaster account, and one that is just sitting there. When the network is taken down, again at the back office, I will create a new botmaster account and make it a friend of all the backup accounts that were unused so far, and then the network is up and running again. And what would be the first message I send to each and every bot? A new backup account, of course. So you can see this is very pesky; you cannot tear it down easily. And of course, to identify infected machines you need to go into all the machines running Discord in your organization and hope that there is some log somewhere that would give you a clue as to what

messages they were receiving. Okay: very simple, very resilient, extremely cheap. So now, what is next for us? First and foremost, we have to produce a full, functioning PoC for the entire thing rather than just small pieces of it. We would like to experiment with abusing additional platforms; currently we have LinkedIn and Google Docs, and that's mainly in order to prove that this is not a one-time thing, that this is actually a very broad phenomenon, and that dozens of platforms can be abused. We are currently testing the possibility of having botnets that are based on ad services, so we have this kind of infection network that actually finds you rather than you registering to it.

And then we would also like to test the possibility of piggybacking on existing accounts for these abused platforms on infected machines that we own. What do we have so far? We've proved that multiple publicly available platforms can actually be used as C2 infrastructure, and can be a very resilient and robust botnet infrastructure. These platforms all have very easy-to-use, well-documented APIs. And mainly, the cost for threat actors to have these mega-resilient networks is decreasing. What did we learn from this research? We've learned that the toolbox currently used by defenders must change, and that the classic defense mechanisms that are basically based on IOCs are failing to protect us

from these types of malware and infrastructure; that the cost for the threat actor to have this kind of infrastructure is very low and getting lower, while it's becoming increasingly expensive for us, because it opens us up to new infections, and as an organization being infected is very costly, and changing our defense mechanisms and replacing our tools is also expensive. And we've mainly realized that we would like to see a new breed of tools, ones that are based on content, for both requests and responses, rather than on addresses, URIs, parameters, and such things. These tools must be platform agnostic, because we've now shown Discord and Spotify, and tomorrow someone will abuse something

else; there are dozens and dozens of options, and it is super costly, and I think impossible, to change your entire defenses every time someone comes up with something new. Now, I know that this is not something we're currently seeing in the wild yet, but I think that this is a great opportunity for us to prepare ahead and actually be one step ahead of threat actors rather than being reactive later. This would be it from us, and we're open to any questions later on. Thank you.

question from the audience

So this is what I said earlier: it's actually very easy to create a Discord account. No, not for Discord; you don't need a phone number, and actually not even an email address. The question is how we are able to create this many Discord accounts; don't we need a special identifier for creating an account? And the answer is no. Discord actually also has this concept of transient accounts, for which you don't even need an email address. But again, because we're doing that in the back office, we can create non-transient accounts, those that are based on email, and we can base them on an email server that we're running. We can

actually use manual labor, cheap manual labor, to create the new accounts in the back office. So it's totally viable; we tested it. Other questions?

So the question was: why try to use something new when you already have a by-design anonymous, distributed botnet based on blockchain or whatever crypto mechanism we have? This is true; it is a good infrastructure to look at. My personal opinion is that it's not a very efficient one, just because of the way the network is designed, and certainly not a cost-effective one; it's much, much more expensive than the examples we've seen. But you've got a really good point there. I think the main reason is that it's not as cost effective as the alternatives.

questions

So the question is why didn't we use Slack. Just scoping things; write it down, we'll do it. Again, we picked on Spotify, as I said, because it's cool; we picked on Discord because it was very easy to create accounts. Slack, I think, has a very similar account mechanism to Discord, so it could be a candidate as well.

why do you need

Please repeat the question. So the question is: if DNS is going to go over SSL and be encrypted end to end, why do we need all these schemes? I think you could ask the question with respect to any other protocol that is SSL based: once it's SSL encrypted, who would know that it's a botnet? And the reality is that most organizations today terminate SSL at the gateway.

Again, unless an organization decides to override that and pin their own, which, again, I don't know; there are things I won't predict there. The reality is that, as of today, most organizations do open up SSL and inspect the traffic. It could well be that DNS over SSL, while bringing a lot of security value, at the same time creates a new set of threats. So it's a complex world. I think they are waiting for us to go off the stage, so we'll be extremely happy to take more questions outside later. Thank you.
