
Bitsquatting dot gov.au domains – exploring network data bitflips in DNS traffic

BSides Canberra · 30:52 · Published 2025-12
Category: Technical
Style: Talk
About this talk
BSides Canberra 2025

I'd like you to welcome Matt Belvadier, who's talking today about bitsquatting .gov.au domains, exploring network data bit flips in DNS traffic. Thank you, Matt. >> Test, test. Can you hear me? Does this work? Is this better? Okay, awesome. Cool. So the AV setup works, I think; let's start then. We'll begin with the obligatory who-am-I, then a preamble on the decisions we made about our testing approach. We'll do enough 101 to explain how DNS works, I'll go over our setup and what that looked like, we'll go through analysis of what the data looked like, some of the offensive applications, and we'll wrap up with a handful of conclusions.

So, the obligatory about-me slide. My name is Matt Belvadier, no relation to the popular vodka brand. I've been a penetration tester for over 10 years, and I can be described as a professional cheeky individual, which roughly translates to: I get paid to be a pest. I'm a co-founder of a small penetration testing shop named Practive Labs, based in Canberra. Unsurprisingly, we do pentests. This talk isn't about penetration testing, but if you could get this put in scope for a gig, I think it'd be pretty funny.

Before we go anywhere, it's important to know that we didn't exploit anything in this case. There are no 0-days. We did as much as we could with as little interaction as possible, other than observing responses to DNS. This was a key decision: don't mess with the end machinery. Given the domains involved, I think that's a defensible position, and an understandable one perhaps. In short, the too-long-didn't-read edition: bitsquatting was disclosed in 2011 by a guy named Artem Dinaburg. It's kind of like typosquatting, except the machine is the source of the error, not a human. The end result is that a machine makes an error when trying to resolve a domain and instead resolves a different domain. The error rate is higher than what you'd expect, and you see flips often enough that it's actually worth noting at some sort of scale. Our goals for this were fairly simple. We wanted to recreate, or at least validate, part of the prior work from Dinaburg. We didn't want to catch connections to content or interact with the machinery at all. That means we have some limitations, which I'll go over later, but ultimately we wanted to get some real data and see what would come out of the process.

Prior work. Again, we didn't discover this; it's been in the public literature for a while now. Bitsquatting was initially coined by Artem Dinaburg at DEF CON 19, so that's 14 years ago. In his initial proof of concept he was catching all kinds of goodies: things like core dumps from Windows machines, cookies, creds, and a handful of other interesting bits. Schultz had a follow-up talk in 2013 with a heavy emphasis on HTTP, doing things like taking advantage of flips in URLs, like an 'o' flipping to a forward slash. This was before SSL was ubiquitous; Let's Encrypt wasn't until slightly later, although SSL doesn't really matter for these cases anyway. Ed Miles in 2018 had about 30 domains and was getting 25,000 requests a week, just to give an idea of scale. He noted a lot of the requests appeared to be from Google, and he suspected that was from the quad resolvers. Makes sense. Then more recently, two individuals, STÖK and Joona, at Black Hat 2024, did quite a lot more offensive stuff: they registered domains for a bunch of common SaaS platforms and captured a lot of creds. We didn't do that. We wanted the more passive approach. Don't add mess.

Cool. So we'll go over some brief technical 101 in this talk: things like the internet, or at least enough to understand how the architecture works, DNS, bitsquatting, and a general overview of bit flips.

The internet, abridged. The internet as a whole is a bit too big to cover, especially before lunch, but the key points are: it's a lot of interconnected networks, there are a lot of different tiers and providers, and most end consumers are on tier-three providers in most cases. Those providers have DNS servers which their customers use. Thanks to Cloudflare for the diagram, because it shows this pretty easily. So practically speaking, you've got a customer using an Australian provider such as Aussie Broadband. There's some public documentation on what their name servers are, ending in 142 and 242, and we can verify these against our testing setup by reading the applied configuration; practically, that's me SSHing to a router and reading /etc/resolv.conf. So given a network of a given provider, and we know what DNS servers it uses, and we have logs for upstream servers, and we know what our bit flips are, and we know that we've queried at least once manually to verify our logging works, we should be able to see this in our logs, right? Well, no. Which is interesting. Why is that? To understand that, we have to understand how DNS works.

In the early days, things were designed and documented in Requests for Comments, or RFCs. The relevant RFC in this case was made in 1987. There are amendments and further documents, but they're mostly still relevant. In short, DNS initially had records, resolvers, and name servers. Records are what you'd understand pretty typically: things like A records and MX records. Name servers are the things that serve your domain; that's the thing with the zone file, it's BIND, etc. There's a lot more nuance to the server part than was described initially. Practically though, an application on a machine tries to resolve an IP address, so that's something like making a call to getaddrinfo. That usually talks to a local resolver on the client; most of the time that's going to be systemd-resolved on, like, a modern Fedora box. That's sent on to a recursive resolver, usually on your network, often your router. In most cases that's forwarded upstream to a series of recursive resolvers, usually provided by your ISP. That will query a DNS root server, then a top-level domain server, and then that'll query the authoritative name server for your domain. So there's a handful of little bits and pieces going on.

A lot of this doesn't really matter for what we care about. Just know that the thing you point to points to something else, and eventually it'll talk to a root server, a top-level domain server, and then our controlled server. Again, we know that our resolver is using other resolvers, so we want to figure out which resolver is the last hop, so to speak; we want some level of attribution. Privacy folks have been onto this sort of stuff for a while; if you're familiar with DNS leaks, it's the same sort of thing. There are a lot of websites we can use to figure this out. In the picture there, we don't see any of our router's upstream resolvers, but we do see some other upstreams' upstreams, if that makes sense. There are three IP addresses of note that we care about, the ones starting with 119, 1019 and 180 at least; ignore the IPv6 ones, and ignore the Google-provided ones, because that's systemd-resolved falling back to Google DNS by default. Double checking, one of those IP addresses looks like it's owned by Aussie Broadband according to reverse DNS; that's sensible. Do we see this in our data? Practically, yes. So that's a good sign. The reason I'm covering this is that we want to understand more about what's actually making the bit flips, but given that we're only capturing the last thing that made the request, you might not be able to tell what the actual origin was. You might get a rough idea of what the provider is in some cases, but if it's behind CGNAT, you might be out of luck. Again, attribution kind of matters here: we want an idea of what the requesting software was, or you'd want it if you wanted to serve a different result or process data differently in some manner. If you were doing this against an organisation and didn't want to, say, steal creds from third parties, you'd be out of luck; practically speaking, the origin is hard to determine from the requests you're getting.

Cool. So, moving forward and recapping: bitsquatting is just passive DNS hijacking, and the machine is at fault, not really an end user. How does this happen? Well, the major culprits are mostly memory errors, usually bit flips in RAM. Single-bit errors are probably the most common ones we care about; changing a zero to a one gives you a different output. The issues come down to heat, power, radiation, etc. We've got some data that showed correlations between solar activity and bit flips, which I thought was kind of cool, and other events can cause bit flips too, radiation in general. If anyone remembers Kiwicon, maybe 2018-ish, what's his name, Peter Goodman, Dr. Goodman I guess, had a demo where he put a source of radiation next to an Android phone; the thing crashed after a couple of minutes. Other things we care about: hardware age has a strong correlation with faults. This came from a study called "DRAM Errors in the Wild", a large field study from Google. The image on the side is a bit cheeky, but the takeaway is that the age of a machine generally correlates with how many faults it has.

If you're using F-Droid, I believe they had an issue where the newer build system couldn't build software anymore because the build hardware lacked a certain CPU instruction set extension; the machines were made in something like 2006. So those are probably old enough to maybe be a problem. Voltage also causes memory issues in a lot of cases. This is another cheeky slide, but in Adelaide, mains goes up above 260 volts, and it seems to go a bit north of that sometimes as well. The maximum accepted voltage should be about 253 volts, which gives you an idea of how bad that might be.

A quick primer on bit flips. Let's say we've got a given string ending in "gov", and a single-bit-flip error occurs, perhaps in the byte that's holding the 'o', changing its least significant bit from a one to a zero. It can happen in any bit of any byte, obviously, but in this case that little red text there, an 'o' becoming an 'n', is what we care about. Applying that to our use case: the string ends in gov.au, that memory is then used, the lookup happens, but it's for the wrong domain. This is the majority of errors that we care about, at least for the purposes of this talk.

In 2022, .au direct registrations were made available, so that meant you could register domains directly under .au. It also means you can register domains such as gnv.au or go6.au. So that's what we did: we registered two domains. There are a couple of other possible flips I've listed up there that you could use, but they were already taken. We didn't report that, but I don't think anyone cared; I guess it's not recorded for a reason.

In our approach, we can't really determine where the errors happen, but earlier approaches, such as Dinaburg's in 2011, put forward that there were two paths: a DNS path or a content path. That's either an intermediate resolver misbehaving, or a content issue where it was most likely a DRAM error on an end-user device. Practically speaking, the way they compared them was whether the Host header matched the original domain. Given an inbound HTTP request, if the Host header matched the original domain, the corruption happened on the DNS path; if the Host header matched a bitsquat domain, the corruption occurred on the content path, which is the majority of these things. Practically speaking, if the Host header is also bit-flipped, then the call to verify the SSL cert is also for the bit-flipped domain, which could be a valid cert that you, the attacker, own. That might be handy. Where do bit flips go wrong? Basically anywhere, but it's most likely going to be DRAM on your end client. It can also happen on things like caching intermediate servers, so CDNs; there was a really good incident of this happening, I think for a Facebook CDN, which ended up serving the wrong content. I think that was part of Dinaburg's research as well. The point being: DRAM on devices, and intermediate caching things too, kind of like a good cache-poisoning attack, except you don't really control when it happens. So what does it end up looking like?
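As an aside, the single-bit-flip enumeration described in the primer can be sketched in a few lines. This is my own illustration, not the speaker's code: it flips each bit of each byte of a label and keeps only candidates that are still valid DNS label characters.

```python
# Sketch (not the speaker's tooling): enumerate single-bit flips of a
# DNS label and keep only results that are still valid label characters.
# Note 'o' (0x6F) with its low bit cleared is 'n' (0x6E), as in the talk,
# and 'v' (0x76) with bit 6 flipped is '6' (0x36), giving go6.au.
import string

VALID = set(string.ascii_lowercase + string.digits + "-")

def bitflip_candidates(label: str) -> set[str]:
    out = set()
    for i, ch in enumerate(label):
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit)).lower()
            # DNS is case-insensitive, so case-only flips are not new names.
            if flipped != ch and flipped in VALID:
                out.add(label[:i] + flipped + label[i + 1:])
    return out

candidates = bitflip_candidates("gov")
```

Filtering out the candidates that are already registered, as the speaker describes, would then be a matter of whois/RDAP lookups over this set.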

From a defender's point of view, or at least if you're in Elasticsearch or logs or EDR perhaps: your client application needs to connect to foo.gov.au. The bit flip happens in DRAM. The client performs a DNS request for the flipped name, say foo.gnv.au rather than foo.gov.au. The request goes out for the flipped domain, the attacker has set an A record for it, the victim parses the answer, has an IP address, and then connects to that thing, depending on what the application is actually attempting to do. In this case the attacker could serve content or just catch things, depending. The client apps may or may not be aware that they're connecting to the wrong endpoint; without cert pinning or a few other little things, they might not even be aware. Again, that's kind of interesting depending on what your app is, because you might now have creds for something interesting, or other effects.

From an attacker's point of view: you find a domain you care about, and you flip it, so you sit there flipping ones and zeros. You filter out all the non-usable candidates, so that's things that are either already registered, aren't printable ASCII, or aren't valid for DNS. You set up a name server, you handle requests, you serve responses, and depending on what you want to catch, you set up appropriately for that. If you think you're going to get web-ish traffic, you'd set up a web server of some form, or you just blatantly run Responder, depending. It's important to note here that you don't get to pick the clients; you just get connections, and you probably don't know what the client application is or how the data will be used. You can make some guesses, but you don't really know. And again, TLS might not matter here, because if the flip happens early enough, SSL validation passes too.

Our setup was pretty simple though. As the attackers, we didn't really need anything too special; we didn't want to catch creds or anything like that, we were just trying to understand how often this happens. So we used Amazon Route 53 as the name server, nothing custom, given we just wanted the logs, and we pumped the logs straight into S3. We were trying to do very little with that part. The logs out of Route 53, if you've seen these before, look pretty simple: you get a date in ISO 8601, a query name, a type, a resolver IP, and the EDNS Client Subnet. The EDNS Client Subnet is kind of a performance feature that practically gives you a better idea of where the end client might have come from; you end up getting a subnet for what made the request. We did parse these when we saw them, and you get some pretty good ideas of where things are coming from in some cases. But practically speaking, we just have regular logs, much like what a normal resolver would have, because this is a normal resolver, and the bit flips are the interesting part we care about. Thanks, Jupyter notebooks.

So here's some real data. This is what we end up having: we filtered out a few fields, but practically speaking it's just a big table of requests, responses, and ASNs. You may recognise some of those domains; you can probably figure out what they are. But again, note that the ASN in this case isn't particularly useful; that's probably an end user trying to connect to a website that does something tax related. No comment. Eyeballing the data, the first results we had were in April 2024, which makes sense given that's when we set up the data collection. The last record was on the 24th of July 2025. We got about 226,000 requests, and before February 2025 this was at 139,000 requests; why it's going up, I don't quite know. Monthly breakdown: April was a very short month, since we started collection on the 23rd. The early months seem kind of uniform-ish, then the later months have a bit more variance. But eyeballing it, you're looking at about 10,000-ish requests a month of just flips, for two domains. One of the busier months was October 2024: there's a large spike roughly near the middle to end of the month. Even the lead-up to this was a bit larger than normal, about 400 requests a day, then climbing to about 1,600 requests, which is kind of funky.
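As another aside, the Route 53 log lines described a moment ago are simple enough to parse with a few lines of Python. This is a sketch of mine, not the speaker's notebook: the field layout follows AWS's documented public DNS query-log format (version, timestamp, hosted-zone ID, query name, query type, response code, protocol, edge location, resolver IP, EDNS client subnet), and the sample line is invented.

```python
# Sketch: parse Route 53 public DNS query-log lines (field order per AWS
# docs) and count query types, like the talk's request-type breakdown.
# The sample line below is invented for illustration.
from collections import Counter

FIELDS = ["version", "timestamp", "zone_id", "qname", "qtype",
          "rcode", "protocol", "edge", "resolver_ip", "ecs"]

def parse_line(line: str) -> dict:
    # Fields are space-separated; zip pairs them with their names.
    return dict(zip(FIELDS, line.split()))

sample = ("1.0 2024-10-24T10:00:00Z Z0000EXAMPLE foo.gnv.au A "
          "NOERROR UDP SYD1 203.0.113.5 198.51.100.0/24")
rec = parse_line(sample)

# Query-type histogram over a batch of lines (here just the one sample).
types = Counter(parse_line(l)["qtype"] for l in [sample])
```

From here, loading the parsed records into a pandas DataFrame with a timezone-aware timestamp column is the natural next step for the monthly and hourly breakdowns.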

So it turns out that on the 24th of October there was a solar flare, an X3 class. The way the classes work, X is the scariest one, and an X2 is twice as energetic as an X1, an X3 twice an X2, to give an idea of the scale. You end up with maybe eight of these X-series events every 11 years, 11 years being the solar cycle. Other interesting dates: June 2025 had a large spike as well. During this period there was a rough average of about 400 requests a day again, but then it peaks at about 7,000, which is kind of funky. Again, possibly unsurprising: that was a weaker flare, but I guess we were sitting in the right position, as opposed to being on the other side of the world. I don't know. I think at this point in my analysis the time zone was incorrect, because I think it actually says the 17th rather than the 16th. If you go back a bit, the notebook actually throws a massive angry error saying that I've dropped time zone information. Don't do that.

Request types. We've got a pretty simple table here, broken into three segments, otherwise it's one long vertical thing. We observed 89,000 A record requests, so name to IP; that was the most popular kind of request we observed. The second request type we saw was MX, so that's mail server entries, and after that it was AAAA (IPv6), DNSKEY, and TXT. DNSKEY was kind of unexpected; I don't quite know what the deal with that is. For those with a network defence background, this might look a bit odd, mostly because that's not the distribution you're expecting DNS to have. There are public DNS operators that publish yearly summaries of what their requests look like; they put forward that over half their requests were A record requests, but ours was less than half, and they see AAAA at 19%, but ours is at 10, and so on and so forth. I suspect this might be because the zones we're interested in have slower uptake of IPv6 at their edges, maybe. But I don't think this really indicates that certain types of requests are more or less susceptible to flips; I think it just happens to be whatever flipped.

A quick hourly breakdown. This time I've got the time zones right, which is good, but I made the graph as if the rest of Australia doesn't exist; I've just treated it all as AEST. We see a nice little peak at 10 a.m. and a sort of second hump, which is a bit weird. If we start filtering out things we think are from Australia — again, if you remember the part earlier where I described finding where the clients originate from, it's actually kind of crap, so this may not be terribly useful — we do see a lot less data coming from Australia versus from overseas, kind of. We also know that a lot of exits for Australian infrastructure geolocate to the wrong country, which is kind of funky. That's stuff like Zscaler, which seems to geolocate to the US, but, you know, the latency you're looking at is milliseconds to Sydney, so I think it's here. Origins again: breaking it down by country, the US is the highest, followed by Hong Kong, Ireland, and Australia. I don't know what the Ireland one's about, but the geolocation is kind of flaky, so I wasn't too interested in figuring out what the actual deal was there. By ASN: grouping by ASN is a bit better. We find a lot of data from Google, Amazon, and Cloudflare. Cloudflare is kind of weird

because we couldn't tell if it's due to Cloudflare's DNS, or, perhaps more interesting, if it's origin servers trying to connect back to an origin. If it's the latter, that's kind of interesting, but it's probably the former. Weird origins: Zscaler gets used quite a lot in Australia. I only recently learned that they're a sponsor for the conference, so I'm not trying to bash them; they just get used, that's all this is. So we were trying to look for queries that have a strong chance of being internal, stuff that comes from what we think is a client on a network. The way I do that in this case: if you're familiar with Active Directory, you might have seen this kind of DNS request internally if you're logging domain controllers. External clients don't usually make these requests; this is usually service discovery, things like LDAP or Kerberos. So we're seeing machines on domains ending in a bit-flipped .gov looking for Active Directory related records. We think this is service discovery from internally malfunctioning boxes trying to resolve authentication systems. If you're a pentester, you probably have some ideas of what this could mean.

Breaking it down a bit more, this is what the AD-related traffic looks like. You notice the underscore-prefixed names, things like _kerberos and _ldap, and the _sites records, which give you the name of a given site. Those names are intended to be human readable, and are often locations. You also see the victim domain names here, which I won't read out, but you can probably read them. So, yeah. What I care about is the origins again. Let's say you want to exploit this thing: you want to have an effect on the right system. If you're operating under a tightly scoped model of testing, attacking end users, so third parties, will probably get you banned. But even if you're not operating under a tightly scoped model, you probably still want to affect only the machines you care about, which might be internal or external ones depending. I think if we break this down, we end up with traffic from either general internet users, so ATO-calculator people and general scanning stuff, which probably explains why we see some non-resolving hosts, or internal Active Directory systems doing service discovery. And I think this might not have been covered as much in earlier works: applying bit flips to coercion scenarios, or system-to-system traffic, or defensive apps.

So, good candidates for things. Getting creds: again, STÖK and Joona in 2024 were catching quite a lot of creds. I think system credentials are probably better to catch, because there's probably no MFA or things like that in the way. Dinaburg was catching crash dumps; STÖK was catching OAuth tokens. I think logging systems, metrics, telemetry, all sorts of stuff might be of interest if we can start getting flips of those. Or things that might get you RCE, so unsigned update services; again, TLS doesn't matter, and delivering the actual blob is existing tradecraft. This certainly exists: in 2024 they got quite a lot of success, 6.2 million requests over 4 months, and they reckon they caught about 15,000 emails, which is kind of handy. We didn't go down that path, again, because that'd be bad, but obviously this works for flips. The non-obvious applications, I think, are things like turning these leaks into enumeration, such as determining internal service layouts from Active Directory sites, etc. I think there are a lot of generally unsafe internal protocols which would be kind of interesting: catching all the LDAP bind creds from shitty appliances, like Wi-Fi RADIUS boxes, etc. Those embedded Wi-Fi access points aren't using ECC, so they're a bit more susceptible. Printers, etc., all that sort of embedded crap.

Awesome. So from a recon angle again: you see a host, a domain... sorry, where are we? Data centre. Yeah. So, a site named "Data Center". Not a fantastic example, but it gives you an idea of what's going on. In this case the ASN is kind of useful for knowing where it came from; I think that gives higher confidence the data actually came from that site. Recon again: this is probably a better example, a request for "Vity Lakes", which looks like it might be related.
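The site names above come out of standard AD domain-controller-locator SRV queries of the shape _service._proto.Site._sites.dc._msdcs.domain. A sketch of extracting them (my own illustration; the sample query names are invented, not victim data from the talk):

```python
# Sketch: pull AD site names out of DC-locator SRV query names of the
# shape _ldap._tcp.<Site>._sites.dc._msdcs.<domain>. Samples invented.
import re

SITE_RE = re.compile(
    r"^_(?:ldap|kerberos)\._(?:tcp|udp)\."
    r"(?P<site>[^.]+)\._sites\.dc\._msdcs\.(?P<domain>.+)$",
    re.IGNORECASE,
)

def extract_site(qname: str):
    # Strip any trailing root dot before matching.
    m = SITE_RE.match(qname.rstrip("."))
    return (m.group("site"), m.group("domain")) if m else None

queries = [
    "_ldap._tcp.Data-Center._sites.dc._msdcs.agency.gnv.au.",
    "_kerberos._tcp.Head-Office._sites.dc._msdcs.agency.go6.au.",
    "www.agency.gnv.au.",  # ordinary lookup, not service discovery
]
sites = [s for q in queries if (s := extract_site(q))]
```

Since site names are often human-readable locations, even a crude pass like this turns flipped service-discovery traffic into a rough map of a victim's internal topology.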

Who said that? So, I think learning of product affiliations is probably also a good angle here, perhaps giving stronger evidence to correlate things. In this graph, again, we're looking for AD-related comms, for LDAP, and we see a lot of it coming out of Zscaler, which is kind of interesting. Making it more practical: for Windows authentication, if you're a pentester you've heard of coerced authentication, NTLM relay, etc. I think there are a lot of angles where you could probably make a slightly better Responder for bit flips. That might be interesting; I don't think it's been done before. We didn't do it, again, because that would be bad, but it might be some interesting future work. You could also make SPF pass, or other things like that that might be of interest, but I'm getting a bit too close to the end of this, so I'll hurry it up a little. We did actually tell some places. Nah, that's right. Again, other questions that might be of interest: are there common platforms or tools that still get deployed on shitty hardware, things that aren't using ECC, where bits could bite? Replicating TLS-enabled services, like the STÖK work from 2024?

Are multiple-bit flips more interesting? That's probably more research. How many places are using Zscaler? And what if we started handling requests more gracefully? That'd be interesting too, I think. But frankly speaking, these happen often enough to be exploitable. You only really need one interaction to make a cheap domain worth it, and I think there's a lot of room for new and weird attacks.

We did have one other quick observation that was funny. If you're familiar with how getaddrinfo works, you may note that that says ENOTFOUND. As we were doing this research, I got pinged on Slack with an alert saying that our primary domain, the company website, was down. I didn't really feel the need to redact this, because that's exactly how I felt when it happened; so there's a picture of me, and in the background it's just Slack, but fight me. So you do a quick DNS request for our own stuff, and okay, we get results; but then we use our upstream DNS, as opposed to the servers hosting our own name servers, and it doesn't work. What we noticed eventually was that our domains, not just our bit-flip domains but our corporate domains, got seized, and the only thing they had in common was the ABN. We weren't trying any non-attribution thing here; these are real domains, registered to an ABN with the same contact details. But practically speaking, if you do this sort of research, you might get your domains yanked, every domain your ABN owns, which isn't great, especially given that ABNs are public. So I wouldn't say to do this. But yeah, exactly: if one was to register some domains with someone else's ABN and start violating the terms...

Thanks.