Plight at the End of the Tunnel

BSides Charm · 2018 · 22:41 · 13 views · Published 2021-05
About this talk
DNS tunneling is a covert technique for sneaking traffic through restricted networks by embedding data in DNS queries and responses, avoiding direct connections to attackers. This talk presents methods to detect DNS tunnels by analyzing subdomain length, resource record size, and query patterns, and introduces an open-source implementation for scanning network traffic.
Original YouTube description
Plight at the End of the Tunnel. DNS is one of the most ubiquitous and yet least analyzed network protocols. DNS tunnels are frequently employed to sneak traffic in and out of restricted environments, without ever making a direct connection to the attacker's remote endpoint. This talk discusses a holistic approach to detect DNS tunnels, and provides an open source implementation of these techniques to scan network traffic. Presenter: Anjum Ahuja (@jack8daniels2). Anjum is a Threat Researcher at Endgame, working on problems related to network security, malware, and behavioral analysis. He has a background in computer networks, routing, and IoT security, and holds multiple patents in these fields. Anjum holds a Master's in Computer Science from Johns Hopkins University.
Transcript [en]

Yep.

Hello, hello. How's that?

Better? There you go. All right, thanks for coming in to the talk, "Plight at the End of the Tunnel." We'll be talking about DNS tunneling, but hopefully a lot of the things we learn here can be applied to any malicious back channel. I'm Anjum. I'm a threat researcher at Endgame, where I work on problems related to network security. Generally that means I work with large data sets of logs, pcaps, and such to build anomaly detection models and so on. In the same vein, I've been working on DNS tunneling. So what is a DNS tunnel? Let's take this toy example. Oh man, the reds are quite dull right now. Imagine browsing to

www.test.com. Your browser would make a DNS query for an A record for www.test.com, and over a recursive DNS query you end up talking to the name server for test.com, which will inform you: hey, www.test.com is actually an alias for test.com, and that's the IP address it resolves to. That's the end of the DNS transaction. At that point the browser goes on and makes its own IP connection to that address, and that's it. In the case of a DNS tunnel... can you guys see any of this? It's a knock-knock joke. That's my son's favorite knock-knock joke. But in the case of a DNS tunnel, the domain names don't have any intention of resolving to

an IP address; they can just keep going back and forth. Could we increase the red somehow? Reds? OK, it's bright red here. So the domain names have no intention of ever resolving to an IP address. They can just keep exchanging arbitrary data back and forth, and that particular knock-knock joke is actually a valid DNS tunnel. What happens in real life, though, is that you can end up exchanging binary data in much larger volumes if you want to. So let's define it more formally. A DNS tunnel is an overlay network on top of DNS. It allows you to send and receive arbitrary data between the client and a name server for that subdomain.

It basically means stuffing the data into a DNS packet such that it still looks like a valid DNS packet and gets routed through all the middleboxes. DNS runs on top of UDP port 53, so it's unreliable and connectionless; we'll soon see how that matters. You can use DNS tunneling for command and control and for exfiltration. There have been a bunch of malware families that use it as a C2 channel. The original DNS protocol, which came out well before I was born, had a packet size limit of 512 bytes, and that gives you around 800 kbps. Different applications get different bandwidth, some more than that, but you get around 800 kbps.

In 2013, EDNS was released, which is extensions to DNS, and that removes the limitation of 512 bytes: it allows you to negotiate, hop by hop, how big the packet size can be. So in theory that gives you a much bigger bandwidth. 800 kbps isn't a lot to talk about, even though the FCC might decide to call that high-speed broadband. So what is it good for, compared to any other tunneling, say ICMP, or the few other lower-level L2 and L3 tunneling options you could use? So let's look at some of

the things that DNS tunnels do that stand out. Firstly, the picture we saw earlier, which you barely saw earlier, had a dotted line for the connection between the client and the name server. That dotted line is actually the recursive DNS query: your recursive resolver makes a bunch of iterative queries to the name servers in the hierarchy before it ends up talking to that name server. What's interesting in this picture is that there is never a direct connection between the client and the name server, which is the attacker's server in this case. So if you had anything that did IP-level monitoring, NetFlow or something, it would miss this channel altogether

if all that you were interested in was upstream traffic. In such a scenario the server can actually just let the queries fail. It can return a format error, it can return SERVFAIL, or whatever, just fail the queries, so that if there is any DNS-based analytics running, it will completely ignore them because they're failed queries. So you can just let them fail if you want to. And all the traffic doesn't have to go over a single domain. You can spread the traffic over multiple domains as long as they have the same name server. So in this case example.com, a second example.com, and so on: you can

have as many domains as you like as long as they have the same name server. Taking this idea further, you can actually spread the data over multiple name servers over the same connection. So that makes DNS tunneling quite sneaky. It's sort of low and slow, but it goes under the radar quite easily. Setting up a DNS tunnel is quite easy; it took me about 15 minutes. You get a domain, you have its name server pointed to your attacker IP, and you run one of these tools on UDP port 53. There are at least dozens of DNS tunneling tools out there that vary in

how they encode the traffic, how they encrypt the traffic, and what level of encapsulation they provide. Some do TCP, some do IP-level encapsulation, some use EDNS. So there's a wide gamut of applications to choose from based on what you want to do with the tunnel. Now, DNS has been around for many, many years. I'm not going to do a DNS 101, but I'll quickly highlight a few things that are relevant to this talk. So that's the DNS packet format. The question section carries the query, and the response duplicates the query and gives the answers in the answer, authority, and additional sections. The one thing I want to highlight here

is that even though the packet format allows you to have a number of questions, the protocol doesn't actually support more than one question in a single packet. The reason is this: imagine you had two questions in a single DNS query. The response would contain two sets of answers, but they would still have to share the same flags, and that can lead to ambiguity in how you interpret the results. So the protocol limits it to a single question per packet. The downstream, the responses, can use any of these three sections. The upstream data has to go through a DNS query, which is a domain

name. The arbitrary data that we want to encode goes into the subdomain section of the query, and there are some limitations on what you can encode in that field. The domain name cannot be longer than 255 characters, it must be made up of labels that are each 63 characters or less, and most importantly it has to use a character set of a to z, 0 to 9, and a hyphen. Domain names are case insensitive, so capital GOOGLE.COM is the same as lowercase google.com. So given that we only have 37 characters to work with, in order to encode the binary data we

have to use some encoding that converts it into a character set of these 37 characters. Each application has a different way of encoding it, but Base32 is quite common. The pseudocode down there is how you would go about encoding your binary data. You would Base32-encode the data; you may as well add a step of encryption at this point. Then you split the data into chunks of 255 minus the base URL's length, take each chunk and split it into 63-byte-long labels, join them together, and that's your query to make. It looks something like this. That's from dns2tcp, from a packet capture of dns2tcp.

The downstream has a lot more options. It has three sections, the answer, additional, and authority sections, to choose from. Each of those sections can have multiple resource records, or RRs, and that's how a resource record looks. Our arbitrary data goes into the RDATA field, and how we encode the data depends on the query type. There are easily 50 or so query types in DNS; these are the ones that are most commonly used. The choice is based on two factors: what's the length limitation of that field, and what character set can we use to encode the data. NULL and TXT are the most flexible.

They let you use 2 to the power of 16 characters if you want to. TXT is actually case sensitive, so you have a lot more characters to work with; you can actually use Base64 encoding here and get more bang for your buck. CNAME, MX, and NS all add a new domain name as a reference, so they have the same limitations as the domain name we saw in the question section. These are only the ones used most commonly by the applications. I'm sure if you look there are a lot more query types that you can abuse to send data across. And the response looks something like this. That's the RDATA part of the question we sent earlier.
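The upstream encoding steps described in the talk (Base32-encode, split to fit the name budget, split again into 63-character labels, join with dots) can be sketched roughly as follows. This is an illustrative reconstruction, not the slide's pseudocode or any tool's actual wire format: the function names `encode_queries` and `decode_queries` and the domain `t.example.com` are hypothetical, the optional encryption step is left out, and the sketch budgets against 253 presentation-format characters and counts the separating dots, which the spoken "255 minus the base length" glosses over. Real tunneling tools also add session and sequence fields that are omitted here.

```python
import base64

MAX_NAME_LEN = 253   # assumption: text-form limit; the talk rounds this to 255
MAX_LABEL_LEN = 63   # per-label limit mentioned in the talk


def encode_queries(data: bytes, base_domain: str) -> list[str]:
    """Turn arbitrary bytes into a list of query names under base_domain."""
    # Base32 only needs letters and digits, so it survives DNS case folding.
    text = base64.b32encode(data).decode("ascii").rstrip("=").lower()
    # Characters left for the subdomain once ".<base_domain>" is appended.
    budget = MAX_NAME_LEN - len(base_domain) - 1
    # Each full label costs 63 characters plus one separating dot.
    labels_per_name = (budget + 1) // (MAX_LABEL_LEN + 1)
    if labels_per_name <= 0:
        raise ValueError("base domain leaves no room for a full label")
    chunk_len = labels_per_name * MAX_LABEL_LEN
    names = []
    for i in range(0, len(text), chunk_len):
        chunk = text[i:i + chunk_len]
        labels = [chunk[j:j + MAX_LABEL_LEN]
                  for j in range(0, len(chunk), MAX_LABEL_LEN)]
        names.append(".".join(labels + [base_domain]))
    return names


def decode_queries(names: list[str], base_domain: str) -> bytes:
    """Inverse of encode_queries: strip the base domain and the dots."""
    text = "".join(name[:-len(base_domain) - 1].replace(".", "")
                   for name in names)
    # Restore the Base32 padding that was stripped before sending.
    padded = text.upper() + "=" * (-len(text) % 8)
    return base64.b32decode(padded)
```

Calling `encode_queries(b"knock knock", "t.example.com")` yields a single short query name, while larger payloads spill over into several names, each with up to three full 63-character labels for this particular base domain.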

Let's quickly review the prior work in this field and see why I had to revisit this topic. Back in 2012, 2013, DNS tunneling was a hot topic. There were a bunch of malware families that were using it for C2 channels, a keynote speaker at RSA spoke about it, and there was a lot of research at that point about DNS tunneling. They all narrowed down to using these features to detect it: the entropy of the domain, how many subdomains are under a single domain, and the total data transfer per domain. Unfortunately or fortunately, depending on how you look at it, the internet architecture has changed

quite a bit since then. Content delivery networks manage their DNS not with a static zone file but dynamically. They create machine-generated subdomains for their customers on the go and kill them on the go, so they generate a large number of high-entropy domains under their domain names. Then there are non-traditional use cases of DNS that have been growing since then. McAfee uses it to call home and look up file reputation. Spamhaus can look up the reputation of domain names, whether they're sending spam or not. gstatic and DoubleClick use all sorts of tracking mechanisms. So there are a lot more use cases, and

their traffic looks quite weird. Lastly, the size of DNS packets has ballooned over the last few years, and that's due to content replication and high availability. You'll find a DNS packet with five name server records and ten IPv6 and IPv4 records, so the total size of a packet has really gone up, and total data transfer per domain doesn't really work anymore. And if you thought you could whitelist these CDNs and so on, we have some more trouble there. Normally, when a CDN, in this case Akamai, takes over managing content for a particular website, it creates a CNAME record for the base URL and then takes over everything else from that

domain onwards. That's the typical use case. There are some other use cases out there. There's this company called Instart Logic. I do not completely understand what they do, but they create these random-looking subdomains under the customer's domain, and not just a few of them, tons of them. I think they're tracking every HTML element under the customer's domain, maybe for ad tracking or whatever. It's all over the place; if you look at traffic, there are so many such subdomains under completely valid-looking domains. Look at the size of these labels in there. So whitelisting doesn't really scale very well anymore. That led me to revisit the

problem and see what else we can do. Firstly, filtering by query type still works. There is no place for NULL and private queries in your DNS traffic; that's a complete no-no. TXT, on the other hand, has a lot more use cases, and it's a free-for-all for domain-specific uses. Originally SPF started using it; now DocuSign uses it, domain verification uses it, so TXT queries are all over the place. But what still works is looking at the number of TXT queries under a single domain and the size of the record. Secondly, the upstream traffic has nowhere to hide. It still has to go

through the domain name that's being queried, so looking at the subdomain length for a single domain name is still a good indicator. If you look at this graph, on the x-axis we have the maximum subdomain length for each domain, and on the y-axis is the frequency of seeing that value, on a log scale. Blue is benign, red is DNS tunneling traffic. There's a very clear boundary at around 80, where everything beyond that gets into the suspicious zone. If you add another feature to that, which is how many labels within a subdomain have a length greater than 52, that reinforces the idea that it's probably suspicious.
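The two upstream features just described, the maximum subdomain length per domain and the number of labels longer than 52 characters, could be computed along these lines. This is a rough sketch and not the speaker's released tool: the function names are hypothetical, the cutoffs are simply the values read off the graph in the talk, and treating the last two labels as the base domain is a simplifying assumption (a real implementation would consult the public suffix list so that names such as example.co.uk are handled correctly).

```python
from collections import defaultdict

MAX_SUBDOMAIN_THRESHOLD = 80  # boundary the talk reads off the graph
LONG_LABEL_THRESHOLD = 52     # label length the talk counts as "long"


def split_name(qname: str, base_labels: int = 2) -> tuple[str, str]:
    """Split a query name into (subdomain, base domain).

    Assumption: the base domain is the last `base_labels` labels.
    """
    labels = qname.rstrip(".").lower().split(".")
    return ".".join(labels[:-base_labels]), ".".join(labels[-base_labels:])


def subdomain_features(qnames):
    """Aggregate per-domain features over an iterable of query names."""
    feats = defaultdict(lambda: {"max_subdomain_len": 0, "long_labels": 0})
    for qname in qnames:
        sub, base = split_name(qname)
        entry = feats[base]
        entry["max_subdomain_len"] = max(entry["max_subdomain_len"], len(sub))
        entry["long_labels"] += sum(
            1 for label in sub.split(".") if len(label) > LONG_LABEL_THRESHOLD
        )
    return dict(feats)


def suspicious_domains(qnames):
    """Domains whose longest observed subdomain exceeds the threshold."""
    return sorted(
        base
        for base, entry in subdomain_features(qnames).items()
        if entry["max_subdomain_len"] > MAX_SUBDOMAIN_THRESHOLD
    )
```

Here the long-label count is reported as a supporting feature rather than used for the verdict, mirroring the way the talk presents it as reinforcement for the length boundary.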

One thing to highlight here: if you look at the red in the graph, there are some DNS queries for small subdomains in the malicious traffic as well. That's because in a command-and-control session over a DNS tunnel you still see a lot of small packets, control-flow packets that are generally keepalives. So they do show up, but when the data transfer happens you see these very large subdomains in the traffic. For the response, looking at the packet size doesn't work anymore, but if you dig a little deeper and look at the RDATA length for each record in the response, the story is similar. In the graph, on the x-axis we have the maximum RR length,

the resource record length per domain, and on the y-axis we have the frequency of seeing that value, on a log scale. 100 and above starts to get into the range where we suspect the traffic to be tunneling. Look at the red in the graph again: there are some values above a thousand. That's when EDNS was used, and you can have much bigger packet sizes there. These two features, I'll be the first to admit, are thresholds, and if somebody wanted to, they could sacrifice bandwidth and use smaller packet sizes to get under the radar. So we need to add another layer of detection. Before we talk about one-time queries,

let's take a step back and think from the perspective of a DNS tunneling author. DNS is connectionless and unreliable. So imagine you have two clients behind the same NAT. What you would end up seeing is packets coming from the same IP address with a different port each time, and it's impossible to differentiate who sent which traffic. So a DNS tunneling protocol has to add some sort of homebrew protocol on top to track a connection. Similarly, to be able to reassemble the packets, it has to keep track of a packet ID or offset or some sort of control flow to reassemble what comes in. So there's some connection tracking

and control flow that's added to the DNS tunnel data. Secondly, a DNS tunneling channel doesn't want to use any stale responses, so many times they put some sort of random field in the query to make sure it doesn't get cached in the resolver. Due to these two features, none of the DNS tunneling queries are ever made again. The ratio of the number of unique queries per domain divided by the number of queries you see is really high; in my experience it was 0.98, and for benign traffic it was much lower. There's a really wide gap in the number of

unique queries divided by the number of queries, so that's a really good indicator. If you're getting your data from an internal DNS resolver, you may also have access to which IP made the query for which domain, so the uniqueness of a domain in your own network is another very good indicator in this particular scenario. And this last feature sort of knocks it out of the park: the lack of glue records. We saw this earlier with the CNAME example. When a domain sets up a CNAME record, it says this domain is an alias for that domain, and then, either proactively or actively, the client queries for

that alias domain again and resolves it. It never happens that a client says, well, I'm done here, I just wanted to know it's an alias, and that's the end of the transaction. So you always have these subdomains getting resolved one way or another, but that doesn't happen in a DNS tunnel. If a DNS tunnel had to check this box, it would have to tie up all the loose ends and sacrifice almost half of its bandwidth per packet, so that's something you never see. If you track all the CNAME records, all the records with a domain name in there, and see how many

were abandoned in the middle, that becomes a good indicator. It works with mail server records, it works with SRV records, any such record type that has a domain name in it. It doesn't work really well with name servers, because name server records go in the additional section, where they are proactively added and not always resolved, but for everything else it works really well. There are a few other ideas that I'm pursuing. They're sort of half-baked, and I just wanted to share them with you and see if someone wants to take this further and make it better. In general, I don't like

blacklisting something. I like to whitelist, preferably based on the customer's environment, so I don't like the idea of looking for specific record types and saying, well, these are bad to see in your network. To look at it differently, I started segmenting the domains by their distribution of query types. What I mean is, if you look at this class, half the queries are A records, 0.4 are name server, 0.1 are CNAME. That's one segment, and I would group all the domains that have those characteristics. If you start segmenting your domains in this manner, the malicious stuff stands out. Their signature is so different from

anything else. That way you're not specifically blacklisting a particular type; you're just understanding what's normal for your network and what's odd in your network. That works really well in catching good stuff, but it still has a long tail of domains that get classified individually, so I'm working on that to see how we can merge some classes. Secondly, if you think about it, in DNS tunnels only the DNS client can initiate a query. So if the server has something to send to a client, there is no way to push it; it just has to wait for a query to come in before it can respond. So normally DNS tunneling applications do one or

both of these tricks. Either they have the client poll frequently, or they hold on to the last query for as long as they can so that they can piggyback any incoming data from the OS onto the response. These two features of a DNS tunnel make it prime for timing analysis. The polling is straightforward: you can easily run a DFT on the query times, and it shows up really nicely. The delayed response shows up as a high standard deviation of the latency, specifically for the last packet in a sliding-window chunk. These two features are looking very promising. I'm still continuing to work

on that and will hopefully add those to the application soon. The future isn't pretty. There are a lot of DNS privacy protocols on the horizon. There's DNS over HTTPS and DNS over TLS, which provide encryption from the get-go. So in the future it's going to be hard to detect DNS tunnels if they are using any of these encrypted DNS tools; we'll probably have to rely on the providers of those services to do it for us. Final thoughts: I'll be releasing an open source application that you can throw a pcap at, and it'll show all the weird DNS tunneling stuff in there. It should be coming out next week.

I'll work on that on the way back. The two lessons that reinforced my ideas about actively looking for bad stuff: first, it's really easy to hide in the bytes, but it's really hard to look normal in a behavioral sense, to blend into the benign traffic. And secondly, take a layered approach. One single way of detection never works. When you have multiple layers, even if someone tries to evade one, like we saw in the thresholds case, there's something else that catches the bad guy. And that's it. Questions? All right.

Thank you.
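The layered idea from the closing remarks can be illustrated with a small sketch that combines the subdomain-length threshold with the unique-query ratio, so that a tunnel which shrinks its queries to slip under the threshold is still caught by the second layer. Everything named here is an assumption made for illustration: `layered_verdict` is a hypothetical function, the 0.9 ratio cutoff and the 20-query minimum are placeholder values (the talk only reports an observed ratio of about 0.98 for tunnels and "much lower" for benign traffic), and the base domain is naively taken to be the last two labels.

```python
from collections import defaultdict


def layered_verdict(qnames, max_sub_len=80, min_ratio=0.9,
                    min_queries=20, base_labels=2):
    """Return {base_domain: [names of the layers that fired]}.

    Layer 1: longest subdomain exceeds max_sub_len.
    Layer 2: unique queries / total queries is at least min_ratio, checked
             only once a domain has min_queries queries so that a single
             lookup does not trivially score 1.0.
    """
    stats = defaultdict(lambda: {"total": 0, "unique": set(), "max_sub": 0})
    for qname in qnames:
        labels = qname.rstrip(".").lower().split(".")
        base = ".".join(labels[-base_labels:])
        sub = ".".join(labels[:-base_labels])
        entry = stats[base]
        entry["total"] += 1
        entry["unique"].add(".".join(labels))
        entry["max_sub"] = max(entry["max_sub"], len(sub))

    verdicts = {}
    for base, entry in stats.items():
        fired = []
        if entry["max_sub"] > max_sub_len:
            fired.append("long_subdomain")
        ratio = len(entry["unique"]) / entry["total"]
        if entry["total"] >= min_queries and ratio >= min_ratio:
            fired.append("unique_queries")
        if fired:
            verdicts[base] = fired
    return verdicts
```

In this sketch a domain that receives many short, never-repeated queries is flagged by the ratio layer alone, while a handful of very long queries is flagged by the length layer alone; repeated lookups of the same few names trip neither.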