Pay $2 shipping to receive your free iPhone!

BSides Sydney · 22:37 · 121 views · Published 2023-08
About this talk
"After discovering a subdomain takeover that was used to serve obfuscated JavaScript, I went down the rabbit hole to figure out what it was doing... A typical response to a subdomain takeover is to remove the dangling DNS records; however, few teams bother to investigate further, report it to SaaS/Cloud providers to coordinate takedown of these malicious accounts and infra, share IoCs, etc. JavaScript especially seems to scare many and is rarely talked about in the InfoSec community, yet the web is an often-used avenue to deliver malvertising or malware to users, and one that we all use every day. By telling my own journey of discovery, I hope to encourage others to do the same and educate attendees in various techniques used to obfuscate, deobfuscate, or hinder deobfuscation of JavaScript files."

Andy Vermeulen joined Rokt, an e-commerce marketing technology start-up, as a JavaScript developer almost 10 years ago. His role evolved quickly into full-stack web app development, and as the company grew he started getting involved with Cloud infrastructure and Dev(Sec)Ops pipelines. For the last 6 years 'Sec' has taken the forefront, and Andy now manages the Security team, where he is responsible for application, infrastructure and endpoint security controls.
Transcript [en]

[Music] If you think you're going to get a free iPhone, you're probably out of luck, but thanks everyone for joining the talk today. I'm Andy. I'm [inaudible] on social media, so on Twitter or Mastodon you can find me. I'm the security lead at Rokt, and we're happy to be sponsoring today because we are hiring, so come and have a chat to us at the booth during the break, and you can win some Lego as well. I started at Rokt several years ago as a software engineer, and that's kind of my background: I do full-stack web development, front end, back end, and play with JavaScript, which as you'll see soon enough will be relevant for today's talk. I have two little monkeys who take up most of my time, but if I'm lucky enough I get to do things like a bit of bug bounties or other explorations, and that took me to crt.sh back in December last year.

If you're not familiar with crt.sh, it's a website to search certificate transparency logs, so you can see every TLS certificate that was created by an organization in the last few years, since certificate transparency became a thing. So I looked for a company, let's say example.com, and I got back a list of all the certificates that were created, and one in particular stood out to me, which was johndoe.dev.example.com, or .org. It's kind of weird for two reasons. One, John Doe: you don't really see someone's name in a TLS certificate on a subdomain. Two, it had "dev" in the name, and dev is usually really interesting from a security perspective, because you can get stack traces or other details that may not make it into production past the security team. So I want to go and have a look at what's happening and see what they're doing, and I was presented with this, which is a little bit unusual for a Western company. I have no idea what it says, but thankfully Google said this is related to the Saudi market, where example.com does not do any business
and then it also talks about "best practical laptop" and so on, lots of computer parts and things, which again is not the vertical that example.com is in. So I'm very sus, and I almost missed it, but I saw this link, number two, and if you click it, it takes you to a very similar page that again has random content on a different topic in a different language, and a link to page three, which has random content, and so on. It's got about 10,000 of those.

So being a developer, I go and have a look at what's under the hood, and I spot this particular script being loaded from a Russian domain. So I go: Western company, Arabic content, Russian JavaScript domain, something's fishy. Go and have a look at where this is hosted. So we do a DNS lookup, and it tells me this is a CNAME to GitHub. It gives me an IP; a reverse DNS lookup tells me, yep, looks like GitHub for real; and looking up the IP address confirms that it belongs to GitHub and their AS number. Now I do a curl, which gives me a little bit more information when I curl that CNAME, and it tells me this belongs to GitHub Pages. That is good news, because now we know GitHub Pages as a product: we can go look up the documentation, see how you set up a website, see how you set up a custom domain name, and so on. But it's even better news, because everything that's on GitHub Pages is also on GitHub, so we can go and search for some of that Arabic content.

That brings me to steppard, this user who's slightly suspicious in that he has a lot of repositories with very similar-looking descriptions and machine-generated-looking names, and he's got about 157 of those that all have, as I said, 10,000 pages of HTML content in random languages. So I want to figure out what's going on, start poking at it, and I figure out that each of these repositories actually includes the domain name of a particular website. They're all different subdomains, and they're all pointing to GitHub. So I write some bash and
discover that, out of the 157, I think about 87 were still active subdomain takeovers leading to pages like this. So we report that to GitHub. They kind of agree this is not really what GitHub is for, so they go and shut that down. Happy days. They also now have domain validation, so those subdomains are now safe and can't be taken over again.

But I'm still wondering: why? What's the point? What's this steppard guy up to, what is he trying to achieve with this? That takes me back to the JavaScript that we discovered, and it's just a single line, as you can see, which is kind of normal: most scripts are minified these days and will just show up like this. But luckily we get the curly braces at the bottom there, so we get a bit more white space injected into that to make it a bit more readable. This isn't exactly readable, if you agree with me, but we can get a bit of a feel for what's actually going on. We see a bunch of scripts, or strings rather, at the top of the script here that are obfuscated. We can see there's one function here that's referring to that array of strings, and it's pulling out one element by index. We can also see what looks like the Base64 character set being used there, and we can see a bunch of character manipulation. So this seems to be Base64-encoded strings. Probably, if they're using Base64, it's some sort of encryption, so they end up with characters that are unsafe in a JavaScript context. But I can't really work out what the rest of the script does. There's about 16 kilobytes of this, so I need to find a way to actually deobfuscate this and learn what's going on.

Plan number one is the lazy way: I'm just going to debug it. We're going to set a whole bunch of breakpoints in the Chrome debug tools, maybe add a few debugger statements, maybe add some console.log statements inside the code so it prints out what's happening as the code goes through. It looks something like this, you
know, as many breakpoints as Chrome will let me set. I step through that for a while and start getting a bit of a feel for what's going on, but what I'm seeing does not warrant 16 kilobytes of JavaScript, so I'm not really confident that I'm seeing what's really going on. I feel like it's just sent me down a path somewhere where it knows I'm debugging, and therefore it's just kind of getting rid of me and not really showing me what the real code does. So I want a better plan.

Plan two we kind of saw before: we used Chrome DevTools to pretty-print and make the script more readable. We're going to follow that same approach, but we're going to try a few other tools, like amplify.js or JSNice, that will actually try to rename variables for you, using statistical inference to do that. The goal is again to simplify the code and make it more readable, so we can make sense of what's actually going on and get confidence that we're really seeing the attacker's behavior. So we set up a local dev environment, we do exactly that, pop it in Chrome, and we get this, which is not really what I expected: Chrome just shuts down the tab. No idea why. I assume I made a mistake, I try this 10 times over, try different tools, same outcome every time. But if I run the original obfuscated single-line JavaScript, it loads just fine. So I can run it locally; I just can't mess with it.

It took me a while, but I managed to pause Chrome as it's doing this, and it dropped me in this function, which, as a for loop typically does, starts by setting some kind of counter to say how many times you've gone through this for loop. We see it being set to some random number there; I'm going to assume that's zero. So while that's less than a target value, we're going to keep looping, increase the counter by one, and just go through that for loop a couple of times. We can see it's going to do that until we hit the length of
something there, which is this variable 30C. Length means it's an array or a string, and then it's going to break out of this loop, and that's it. But what's unusual is that length is actually being reset at the end of this loop as well, and even more unusual is that we can see it's pushing an extra element into something. Push means that this is an array, so on every iteration of this loop we're going to make the array one size bigger, which means we can never actually hit that stop condition, because our count is increasing by one and the array size is also increasing by one. So effectively what's happening is the array keeps growing, Chrome runs out of memory, and Chrome kills the browser tab.

The easy answer is I try to get rid of this function. It just does the same thing: it puts me somewhere else in the code with similar behavior. I spend an hour or two trying to work around this, and I'm hacking away at this code, and I kind of lose confidence again that I know what's happening. I'm changing the code, and I don't know if what I'm doing is impacting the actual behavior of the script or not. So I want a new plan.

Plan three is that we're going to do it the hard way, the slow way, and we're just going to have to spend hours and hours trying to debug this. So I'm going to start with the original raw script, and I'm going to make small surgical changes, and after every change test that it doesn't crash my browser and that it still behaves the same way it did before. I'm going to use regular expressions to do that. I won't share the regular expressions today, because you'll just be staring at the screen for a long time; I'll post some links at the end if you want to go deeper into that specifically. But I'll walk you through the general process. A very simple example is that the raw string "constructor" just got split up into two here, so we're just going to write a regular expression to get rid of that and make it back into one string. Along a similar vein, the
script was using hex for all the numbers. My brain just doesn't deal with hex all that well, especially when you're doing maths operations with it, so I just converted this all back to decimal. But ultimately we know what this is: there are no variables in here, there's nothing dynamic here, so we can pre-calculate it. So again, write a regular expression, just run eval over that, and now we come back with a number, 327.

A similar change that I did: JavaScript supports two ways of calling functions on objects. You can have this bracket notation or this dot notation. Part of it is just again easier on the eye, but if you change to dot notation, your IDE will actually recognize that as a function call, as opposed to just a string being used to index an array. It also reduces some of the noise of the brackets and the quotes and so on. We also go and rename a few things. Very simple, but we know this is an array of strings, and we know this seems to be some decoding or decryption function, so it just helps us understand the code a little bit better as we leaf through it.

As we do that, we start seeing behaviors like this, where we have two different functions but both of them are doing exactly the same thing. Even more, there are four arguments here to these functions, but only two of them are actually used, so two of these arguments are just bloat. They're dead, so we can just get rid of them. And finally, this is just calling decode, and it's just adjusting that first argument with a very simple operation, subtracting or adding 866. So everywhere that we find a call to one of these two functions, like 5D-whatever with four arguments, we're just going to simplify that down: change it straight into a call to decode, drop two of those arguments, and adjust one of the arguments by that value that we saw earlier. But as we do that, we also see the same behavior in the decode function. Now
we've got two arguments there, and we've also got the first argument being offset by a fixed value. Take my word for it, the second argument's not used. So we can apply exactly the same logic to this: everywhere that we find a call to decode with two arguments, we drop one of them and adjust the other one. But now decode 71 is pretty straightforward. We know that's looking at that array of strings, getting out the 71st element, and then doing the deobfuscation. So we can just go and run that in a JavaScript REPL, pre-calculate all that, and get all the strings back out. We effectively substitute all the calls to decode with whatever the resulting string ends up being.

After doing that, that 16 kilobytes of JavaScript comes down a lot in size. All that obfuscation slowly starts disappearing, and we can start seeing what's going on. If we look back to plan one, we tried doing console.log, and console.log didn't work because the script was actually replacing the browser's built-in console object with a mock that had the same log, warn, info and so on functions, but replacing them all with empty functions that didn't do anything. So you could still call console.log and it would be successful; it just wouldn't print anything, so you can't see anything. So we just get rid of that entire block of code altogether. Nice and easy.

Similarly, we can now figure out why plan two didn't work, where we had some trouble with the out-of-memory "Aw, Snap!" Chrome page. What's happening in JavaScript, and it's pretty cool, is that when you have a function, like the simple function here that returns this, you can call toString on it, and when you call toString on it, it'll actually give you the code that represents that function. Not only that, it's actually the original code that was written, as it was defined in the JavaScript file. So the first example is what the obfuscated script was expecting: it's minified, there's no
white space, so it just returns "function return this", with no spaces or new lines or anything like that in it, whereas all the automated tools that I used would include extra white space and new lines to make the code more readable. As a result of that, the string representation changes, and what the script was doing was simply checking the length of this string. If the length of the string is not that bare minimum, 12 or 13 characters, then somebody's messing with this, and if somebody's messing with it, then just go down a completely different code path and stop doing the malicious stuff that we're trying to do.

At the end of all that, a beautiful two days of time, it turns out that all this does is redirect you. So 16 kilobytes of JavaScript and two days of time just to say: did you come from a search engine? It had about 20 search engines listed. If you did, set a cookie and then go and redirect the user to whatever that domain is there. If you haven't come from a search engine, it also checks whether you have that cookie set, so if you try to press back in the browser or something like that, it'll redirect you anyway because that cookie is there. So as a real user, you're never supposed to see that Arabic content that I discovered by accident.

So the obvious question is: what is it redirecting us to? Which is this. You've probably seen very similar pages from Cloudflare or something else to say, hey, are you a real user, are you a bot? It sits here for about five seconds before it redirects you again. I'll save you the pain, but it's similar JavaScript obfuscation to what I went through. Luckily most of the regular expressions from the earlier exercise worked on this one as well, so this only took about an hour. But we discovered that it's taking 91 fingerprints of your browser, your graphics card, your CPU, whatever they can get their hands on, your screen size and so on. It takes all those, puts them all in a JSON
object, stringifies them, encrypts the whole lot, sends it to a server, and then the server decides where to redirect you. Again, I don't have time today to talk you through the 91 attributes that it's fingerprinting and how it's doing it, but I'll share some posts at the end that you can go and have a look at if you want to dig deeper into that.

To give you a flavor, here are two unusual ones, I guess, compared with what you would expect a normal or enterprise script to do. One is that it's going to look for signs of browser automation. Any browser automation tool like Selenium or WebDriver will leave certain fingerprints in the DOM that you can detect, so callPhantom, _phantom and so on. Then they can actually change their behavior: if this is a bot, if this is an automated script, they're going to behave differently compared to if you're actually a real user. More suspiciously, they're also looking for web antivirus. If you're running Kaspersky or Avira or Symantec or something else, they can detect that, again because those products leave some fingerprint inside the actual DOM, inside the web page that they've scanned.

At the end of all that, it redirects me and it takes me here. Maybe this is TikTok, maybe they have some new customer acquisition strategy here that's a bit unusual, but I doubt it. So I try this on Safari and Firefox, which gives me kind of what I expected, which is: hey, congrats, you've won a free iPhone. It's that easy. All you have to do is pay two dollars shipping and we will happily deliver that iPhone to you. Even better, it's all COVID-safe, no personal contact, we're just going to leave it at your door, so you can be really confident that we have delivered your iPhone.

This was about a year ago. I tried to get some more slides; unfortunately they've changed tactics. If you're in Sydney right now, you get a $750 Amazon gift card instead, but when you click through on that, it actually takes you to a different
website altogether again, and this is being run by a company, Rewards Giant, or a group. Once you start, it's saying you're not winning a gift card anymore, or an iPhone anymore. Now you have to complete a whole bunch of purchase requirements, terms and conditions, and all kinds of other stuff, and fill in tons and tons of forms of personal details and questionnaires about how often you go shopping and where you go shopping and so on. So that free iPhone's probably not going to come anytime soon.

To recap, the whole point of this actor was poisoning search engine results. They have 10,000 garbage pages in different languages with different content to try to pollute what actually shows up in Google and Yahoo and all these search engines. An unsuspecting user would get these results in their top 10 and would click on one. That JavaScript that we saw at the beginning would detect that you've come from one of these search engines and redirect you to that loading page, which fingerprints your browser and posts that to a server, which redirects you again depending on who you are or what you are. If they suspect that you're messing with the traffic, or if it's traffic that they're not interested in, they'll send you to TikTok or something similar. Most of the time that I've seen it, it sends me to that survey page: congrats, you've won an iPhone, and so on.

But I ran this on a cron job for several months just to see what else it would send me to. 99% of the time it sends me to a Dutch-hosted IP address for the congrats page, but every now and then it sends me to a Russian IP address and does something completely different. I haven't gotten there yet, but if VirusTotal is to be believed, they're actually dumping Agent Tesla malware from that IP. So my research continues. I'm trying to figure out what else they're doing, and I'm trying to figure out how much bigger this is. The domain that this is hosted on
is changing every 30 minu