Into the Vo1d: Hunting a Botnet Hidden in TV Boxes

BSides Lisbon · 2025 · 30:56 · 288 views · Published 2025-12
About this talk
André Tavares recounts his team's investigation into Vo1d, a massive botnet infecting 1.6 million Android TV set-top boxes worldwide. The talk traces the technical journey of reverse-engineering the malware's domain generation algorithm, hunting for samples across multiple data sources, and leveraging DNS intelligence and passive feeds to achieve visibility into infected infrastructure. It demonstrates how malware analysis, threat hunting, and collaborative research reveal the botnet's role in powering residential proxy services, ad fraud, and potential DDoS attacks.
Original YouTube description
In late 2024, Russian antivirus vendor Dr.Web uncovered a massive botnet targeting Android TV set-top boxes, which they called Vo1d. The malware was found on roughly 1.3 million devices worldwide and acted as a hidden backdoor, allowing attackers to silently download and install apps or updates in the background, often pre-installed on off-brand devices! By early 2025, follow-up research by XLab revealed the botnet had grown to about 1.6 million infected devices, and uncovered some of its techniques and capabilities like the domain generation algorithm, the potential anonymous proxy services, ad fraud, and DDoS attack. This talk follows our team’s investigation into Vo1d’s botnets, picking up from XLab’s findings and breaking the botnet apart. In this presentation, we will share the technical journey alongside the investigative mindset that drove it: curiosity, persistence, and the ability to connect malware analysis, DNS intelligence, sinkhole telemetry, and shared research into a single coherent picture of the threat. We show how DNS intelligence can complement malware analysis, how collaboration between researchers can break investigative deadlocks, and how storytelling helps make complex security research accessible. “Into the Vo1d” is both a deep dive into a resilient IoT botnet and a broader lesson in threat hunting. About the Speaker: As a Senior Threat Researcher on the Threat Research team and a researcher at BitSight since 2018, André has been specializing in tracking malware botnets, by employing a combination of open-source intelligence gathering, malware analysis, and reverse engineering techniques to shed light on threat actors’ tactics, techniques, and procedures. Committed to sharing knowledge, he has been contributing to the infosec community through informative blog posts, providing key insights, indicators of compromise, and detection signatures to support the defense and threat hunting efforts of fellow professionals. Blog: https://tavares.re/
Transcript [en]

Hello, good afternoon. This will be a short talk about a botnet that comes pre-installed on TV boxes. It follows the journey I took, from reading the existing research on the malware until we got full visibility over the infected bots. I'm a senior threat researcher at Bitsight, and I have been tracking malware botnets since 2018. In my daily job I do malware analysis and reverse engineering, and one of my main goals on my team is to collect malware IOCs,

which is usually done by domain sinkholing, C2 domain sinkholing, and malware emulators. I also sometimes do some data mining on various data sets. In today's talk about Vo1d, I'll start by introducing what Vo1d is and reviewing some previous research on this malware. Then I'll briefly show some key parts of how the malware works, at least the parts that we'll discuss in this presentation. Then I'll show some techniques to hunt for samples and for infected devices. Then I'll mention some malware families that are related to this

botnet, then some takeaways and, if there's time, some questions. So let's introduce Vo1d. This botnet was first seen by Dr.Web, a Russian AV vendor, in September 2024. They saw it pre-installed on at least three TV boxes running old versions of the Android Open Source Project. These are uncertified devices, which are different from Android TV devices, which are certified by Google. Dr.Web saw 1.3 million devices infected with this malware in 200 countries, and 30% of them were coming from Brazil. They mentioned that the main purpose of the malware is to drop

other payloads onto the devices, other components or modules, to perform whatever tasks the threat actors want. Last February, QiAnXin XLab, a Chinese company, released some in-depth research on this malware, and they saw growth in the number of infections: 33% growth, to 1.6 million devices. They saw that the malware was being used to drop payloads with some specific functionality. The most interesting functionality was turning the device into a proxy, basically allowing it to be part of an anonymous proxy network, to power anonymous

proxy services. They also saw some evidence that the malware was being used to perform advertising fraud. The researchers also learned that there are other possible uses for this botnet. One is to perform denial-of-service attacks, since this is a very large botnet that can have a great impact on its targets. They also briefly mentioned that this malware can be used to broadcast misleading content, because the device has access to the TV, so it can broadcast whatever the actors want. These researchers found lots of infrastructure, which makes this botnet more

resilient, including 27 command and control servers. They also mentioned that the malware uses a domain generation algorithm (DGA) to find C2 servers, and this got our attention. But let's quickly define what a domain generation algorithm is and how it is used. It's basically code that malware uses to generate pseudo-random domain names. This helps the bots locate their command and control server so they can get tasks, payloads, whatever. This technique helps the malware avoid detection because the domains rotate frequently. For instance, if the malware authors use

the current date as the seed, the list of domains will change every day, so it will be harder to detect, take down, and so on. Since this technique generates a lot of noise and a lot of domains that will be queried but are not registered, one can register available domains and get requests from compromised systems. Bitsight has been doing that since 2014: it registers domains from many different families that use this technique to get infection telemetry. And XLab did that. They sinkholed Vo1d by registering its DGA domains. So this research,

it was basically my journey from reversing the DGA to sinkholing Vo1d. That's what we are going to do today, or at least I will show you how I did it. So let's reverse Vo1d. The main goal here is to find the DGA code so we can implement it and generate the domains. For that we need to understand Vo1d's execution flow. The DGA part is in, let's say, the third stage of the malware. I will not go into detail on the other stages; we will go straight to the binary that we want. In this case the malware authors

called this part of the malware, this stage, vo1d, which is the name the researchers then gave to the malware itself. This binary decrypts and runs an embedded payload, which in fact has the DGA logic and functionality. Very briefly, the DGA is used to find a C2 through DNS, of course. This is not the actual C2; it is a redirector C2 for the real C2. That IP is contacted using a custom TCP protocol, RSA-encrypted, in order to get the real C2. And after

the malware gets that real C2, it starts communicating with it, also using a custom, encrypted TCP protocol, in order to get tasks, download additional payloads, and so on. So basically our goal here is to have a look at this sample. When we tried to find this sample on VirusTotal, it was not there. This is the sample that the XLab researchers reversed, but it was not on VirusTotal. The reason is probably that this is pre-installed malware, so it's hard to get public samples easily. So

we basically asked the XLab guys for a Vo1d sample, and they promptly shared one. So thanks, Alex, for that. Once we get the sample, we still have some work to do, of course. As I mentioned, the sample has an encrypted payload, so we need to unpack it. Thankfully the XLab guys had already researched that part, so I just needed to apply their learnings. Very quickly: if you open the sample in Detect It Easy, the .data section has a very high entropy, above 7.8. That usually hints that this is encrypted

data. If we open this sample in Ghidra, the disassembler, for instance, go to that data section, and search for the biggest blob of data with a cross-reference, we can extract those bytes and feed them into the algorithm they use. In this case it's a modified version of the XXTEA algorithm, and the key is hardcoded in the sample, so we just decrypt that and get the actual main part of the malware. So we are now in the main part of the malware, but we still need to find

the DGA code, right? Usually we start with the strings: just look for interesting strings that are related to domains or whatever. But there were not many strings, which is weird, and it usually means that many of the strings are encrypted. That was the case. If you go into the main function, one of the first functions that is called has 32 cross-references, and its first argument is a global variable which looks like random bytes. By quickly checking this against the research the XLab guys did, this is definitely a custom string decryption algorithm that they had already

reversed, so we just need to use that. It's not that easy to understand, but we just use what they had already reversed. So we started decrypting all of these 32 strings, and eventually we found strings that are related to the DGA: the TLDs, .top, .com, which all have cross-references to the same function. So we probably found the DGA function, we think. One important thing to mention here is that close to these decrypted strings there was also a C2 domain, also encrypted, but it was already registered,

so we couldn't register it. So let's go into the function that has those three TLDs, and yes, we find the DGA code. Here on the left is the pseudocode from Ghidra, decompiled. It's just a matter of implementing this in Python, for example, which I find easier. There's one thing left in order to generate domains: the DGA needs a seed. Usually threat actors use, for instance, the current day or the current month as the seed to generate domains. But in this case

the threat actors used encrypted strings in the binary as seeds. So the domains generated will always be the same for a single sample, basically. But we still needed to extract them. It was not hard. They were encrypted the same way the strings were, so you just extract them, decrypt them, and use them to generate the domains. We did that: generated a bunch of domains, checked which ones were available, registered a couple of them, and we got infection telemetry from about 250K IPs daily, which is not bad, but it's far from 1.6 million, the

values that the XLab guys were seeing. So we thought maybe we need more seeds. So let's hunt for samples, as is usually the case. Let's hunt Vo1d. Our first approach was to look on VirusTotal, specifically using a YARA rule based on the string decryption loop. I could also have used some strings in the binary, but I chose to do this because usually these encryption algorithms don't change that much, so I find this a very good way to create good YARA rules. However, we were only able to find

15 samples on VirusTotal, which is a bit short. Still, we downloaded those samples, unpacked them, and got the DGA seeds from them. It's a manual process, as I explained. And we were able to find a few more infected IPs, 350K daily, which is still good but far from 1.6 million. We were running out of ideas. But then we thought: wait, we know how Vo1d downloads payloads, because the XLab guys reversed the protocol and showed how the malware does that. So maybe we can

get Vo1d samples directly from the C2 server. So that's what we did. It basically means understanding the communication protocol, let's say the loader protocol, of the malware, which is not very hard to understand. The slide is a bit crowded, but it's not that hard. One key aspect of the protocol is that the bot receives a task. It's like a JSON, and in that JSON there is a file name that it will download from the C2. So in the protocol we need to pass a file name to the C2 so we can get that payload. And it's

usually a short file name, around five characters or less. So we implemented this protocol and tried to brute-force the file names until we got some Vo1d samples. And it worked. We eventually found a couple of samples from which we could extract more seeds. So, as a first approach, combining these two techniques, the VirusTotal retrohunting and the loader protocol emulation, we were able to find 11 Vo1d samples, that is, unpacked samples which contain the DGA logic, decrypted. And in these 11 samples we found

seven different groups of seeds, which may correspond to seven botnets; we are not sure. By registering domains for all these seeds, we were able to get an increase in infections, 400K IPs daily, but again it was still far from 1.6 million. So basically we need more seeds. That is clear. We should probably continue brute-forcing the file names to get more seeds. But in the meantime we noted that the XLab researchers mentioned that they saw Vo1d domains in lists of top domains, such as the Tranco top 1 million list. And this got us thinking

that maybe we can find more domains in lists of top domains, and maybe also in DNS logs, because we have access to DNS logs, basically passive DNS feeds. That's what we went to do, because it is a tiring process to implement all of this, unpack all the samples, and extract the seeds. It's a lot of work. So let's try hunting on passive DNS and make our lives easier. Most of the Vo1d domains are not registered. The DGA generates a bunch of domains, and most of them will return NXDOMAIN when you do a

DNS record query. Fortunately, Bitsight has access to passive DNS feeds with the NXDOMAIN queries. One key factor here is that all these Vo1d domains match a very specific regex pattern. All the domains start with five lowercase letters, followed by 11 characters of the seed. Usually the seed is a hex string, so it's a to f and 0 to 9. And they finish with one of three TLDs: .top, .net, or .com. So this is very specific, we thought, and maybe we can just use this regex against our data sources, our passive data

sources, and see what we get. The pros of this technique are that it will be way faster than hunting for samples, as I showed, and it may be future-proof, unless the threat actors decide to change the DGA algorithm or the seed format, for example. So these are the findings from, let's say, two sources. One is free, which is the Tranco list. It's a list of top domains that combines many different lists of top domains. It's mostly registered domains, but still we found 113 domains that match this regex. And actually you can see that the third domain

there is the actual domain that we saw hardcoded in the sample. So this is a good hint that we are on the right path. You can also notice that they are all .com TLDs, with no .net or .top, and maybe we can leverage this to find available domains. Because the domains use three different TLDs, maybe we can find available domains under .top or .net. That was the case: there were available domains, so we can find Vo1d domains to register that way.
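
As a rough illustration of the regex filter described in the talk (not the speaker's actual tooling), the pattern could be applied to any list of domain names, such as a top-domains list or NXDOMAIN query logs. The sketch assumes the seed part is lowercase hex, which the talk says is usually, not always, the case. The function name `find_dga_candidates` and the sample domains are made up for this example.

```python
import re

# Pattern as described in the talk: five lowercase letters, then an
# 11-character seed (ASSUMED here to be lowercase hex), then one of
# the three TLDs .top, .net, .com.
VO1D_DGA_RE = re.compile(r"^[a-z]{5}[0-9a-f]{11}\.(?:top|net|com)$")


def find_dga_candidates(domains):
    """Return the normalised domains that match the described pattern."""
    hits = []
    for name in domains:
        # DNS names are case-insensitive and logs may keep the trailing dot.
        normalised = name.strip().rstrip(".").lower()
        if VO1D_DGA_RE.match(normalised):
            hits.append(normalised)
    return hits


# Hypothetical input for illustration only; none of these are real Vo1d domains.
sample = [
    "abcde0123456789a.com",
    "example.com",
    "ABCDE0123456789A.TOP.",
    "abcd0123456789ab.com",  # only four leading letters, so no match
    "abcde0123456789a.org",  # TLD outside the three described
]
print(find_dga_candidates(sample))
```

A match is only a candidate: a 16-character label with this shape can occur by chance, so hits would still need to be checked against known seeds or sinkhole telemetry.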

Also, on the paid side, the NXDOMAIN passive DNS sources, which are mostly unregistered domains, we found a bunch of them, 328K in a single day, for instance. So, great results so far. Let's register a bunch of the domains and see what we get from this. And we finally reached the counts that we were hoping to reach. Currently we are seeing 1.4 million IPs contacting our sinkholes, infected with Vo1d, with peaks up to 1.6 million, as the XLab guys saw. And this didn't require any malware reversing. It was just filtering on

some passive DNS sources and top domain lists, basically. If we combine all the results: we registered a bunch of DGA domains, but we also caught a lot of domains that were hardcoded in the samples and that were available for some reason; maybe the threat actors forgot to renew them. Combining all of these domains, we are currently seeing two million IP addresses contacting our sinkholes daily. This actually includes the domain that we saw previously, because that domain was registered at the time that we first saw it, but we added it to a watch list and

eventually it became available and we snatched it quickly. So we started seeing a huge chunk of infected devices that way too. If we map these to sectors, it's mostly the telecommunications sector, although there are some interesting sectors there, like government, but it will be mostly telecom, residential IPs. And by the way, in a week it's 4.5 million IPs, which is a great result, I guess. So now some relations to Vo1d. The biggest relation I could find was the BADBOX relation. It's the largest known botnet of

internet-connected TVs. Google mentioned that they saw 10 million devices infected with this, and they range from TVs to TV boxes, digital projectors, vehicle infotainment units, picture frames, and tablets, all of these devices running versions, not necessarily old ones, of the Android Open Source Project. So all uncertified devices. All of these devices have the malware pre-installed, and Vo1d was actually seen as one of the backdoors used by this BADBOX operation. So the connection is there, and it was made by HUMAN Security. And actually Google took

legal action last May, also in coordination with the FBI, and the FBI released an announcement to warn the public about this botnet. They sinkholed part of this botnet, and they used the Shadowserver Foundation to do that. The geographic distribution is very similar to what we are seeing with Vo1d. On Vo1d we see mostly Brazil, Turkey, South Africa, India, China, and Indonesia, although it's widespread. On the BADBOX side it's also widespread, but it's mostly Brazil, India, Indonesia, and South Africa. And I also see Russia there. So,

one of the things that the XLab guys saw evidence of was that this malware was dropping payloads related to proxying web traffic. We recently started doing some investigations on residential proxy services, and we have evidence that Vo1d IPs are showing up as exit nodes of several residential proxy services. We are not sure if this botnet is being rented. We're not sure if some of these services are resellers of others. This is all currently under investigation by many parties. Finally, some takeaways. IoT botnets are clearly on the rise. In this specific case

it's devices that are uncertified, cheap devices that are running the Android Open Source Project, not the certified Android TV software. Botnets of millions are powering criminal ecosystems such as anonymous proxy services, probably also denial-of-service attacks, and clearly also ad fraud. DGAs are still very much in use today. They increase botnet resilience to takedowns and to many different types of attacks. Also, collaboration is key here. It was very nice of the guys from XLab to share some intelligence. It was crucial for this research. So all the

findings, all the samples, the analysis are very important. Also coordinated action, that is, disruptions and so on, is vital to counter these resilient botnets. Another thing is to explore different angles. In this case we did some reversing in the beginning to show that we could get visibility that way, but it was not the only way to get that visibility. We pivoted to a simpler process and we got great results. So sometimes you have to take a step back and try other approaches, and it will pay off. And most of all, be careful with what you connect to the internet, especially if it's low-cost,

off-brand, uncertified devices. Thank you.
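
The unpacking step earlier in the talk relies on the .data section showing an entropy above 7.8 as a hint that it holds encrypted data. A minimal sketch of that heuristic, assuming plain Shannon entropy measured in bits per byte (tools such as Detect It Easy may compute or window it differently), might look like this. The function names and the default threshold parameter are illustrative, not taken from the talk's tooling.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum p * log2(1/p) over every byte value that occurs in the data.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def looks_encrypted(data: bytes, threshold: float = 7.8) -> bool:
    """Heuristic only: high entropy suggests encrypted or compressed data."""
    return shannon_entropy(data) > threshold


# Uniformly distributed bytes reach the maximum; repetitive data scores low.
print(shannon_entropy(bytes(range(256)) * 4))
print(shannon_entropy(b"A" * 1024))
```

High entropy alone does not distinguish encryption from compression, so in practice it is a pointer to where to look rather than proof of an encrypted payload.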

Great talk. Thank you. Here. Yeah, I have a question. Were you able to fingerprint the devices infected with the pre-installed malware, so you can link them to a manufacturer that may be compromised? Or was it different hardware, or is it software? Were you able to find the root cause for this? >> I didn't dig into that. Most of the devices were Chinese-manufactured. And it's actually hard to get information about the actual devices through the DGA, because the way it is implemented it will not give any indication about what is behind the IP,

basically. So I didn't get into that, but it's mostly Chinese devices that you can find on AliExpress and other stores of the like. >> Have you asked Bitsight to go to Brazil and investigate yourself, or no? >> What? Sorry? >> Have you asked Bitsight to pay for a trip to Brazil so you can investigate yourself? >> That would be a nice trip. >> Yeah. >> Yeah. So devices that are cheap will be used by people in developing countries who do not have access to, or do not have the knowledge to do, better. So that's probably why it's affecting a

lot of those countries. Yeah. >> André, great talk. I know I'm biased, but it was a really great presentation. That's many, many millions of devices infected. When the attackers have a foothold so high in the supply chain, do you have an opinion on how the average Joe can protect himself when he's buying some piece of equipment? There are probably companies putting TV boxes or TVs inside their facilities, providing an attacker a foothold into the internal network. Probably half of them will crash in 2038. But if you want to protect yourself before 2030, what advice do you have >> to protect against this?

>> Yes. When you're purchasing stuff like >> Um >> Do we stand a chance as consumers? That's my question. >> I'm not sure I know how to answer that question. I would advise to be careful about what you are buying and where you are buying it from. If it looks cheap, if it looks suspicious, maybe have a Pi-hole in your house if you buy stuff like that, and have a look at the logs. But I would advise doing that just for research purposes. Do not buy stuff from AliExpress that you connect in your home

and to the internet, at least not from brands that are unknown. You can of course buy from big brands, but not from smaller ones that you don't even know about. I would advise against that. That's my advice. >> Great. Thank you. If there are further questions, of course, I'm sure André will be around for you to reach out. Let's also continue; we are almost on schedule, so it's great. Thank you, André, once more. We'll have