
Passive-ish Recon Techniques

BSides Leeds · 2018 · 22:47 · 15K views · Published 2018-02
Tags
Category: Technical
Difficulty: Intermediary
Team: Red
Style: Talk
About this talk
Abstract: A run-down of (mostly) passive reconnaissance techniques; some well-known, some not-so-well-known. We'll look at Google Dorking, scanning GitHub profiles for dotfiles, building website maps with Archive.org, and introduce breadth-first scanning to avoid detection. Speaker Bio: TomNomNom is a software engineer from Bradford. He likes to fix things, explain things, write code, and dabble in bug bounties.
Transcript [en]

Thank you for waiting, and thank you for coming all the way up to the second floor. We had slight technical difficulties: it turns out my laptop is not very good and the external display is not working. I'm TomNomNom, and I'm going to talk to you about passive-ish recon techniques. So first, about me: I'm just a guy who hunts bugs for fun sometimes. I don't work in security professionally. I'm not a pen tester, and I'm not a security researcher, which seems to be on everyone's Twitter bio these days. I'm a technical consultant at Sky Betting & Gaming, one of the platinum sponsors. I do a bit of training and teaching. On Twitter I'm TomNomNom, and I'm TomNomNom on

GitHub. So first off, "passive-ish": I think we should define that up front. Really, I'm going to talk a little bit about a couple of things I've come across for reconnaissance where you either don't connect to the target directly, so they're passive from the target's perspective, or at least they look a little bit like normal browsing traffic. In particular, if you're looking at traffic graphs, I don't want to put big spikes in them, and I don't want to trigger intrusion detection systems, that kind of thing. Often when we're talking about reconnaissance for a target, people are talking about asset identification: what's their online footprint? Domains they've got, subdomains, IP ranges,

third-party accounts, that kind of thing: bug trackers, GitHub accounts, maybe what employees they've got, depending on what mission you've got for your target. Really, for me, it can be anything. Some wise dude once said knowledge is power, and I absolutely agree. And partly, to be honest, I just enjoy it. I like to pretend I'm a private detective with a noticeboard full of bits of information, with bits of string on pins joining all the things together. I really like to imagine myself that way. It's not the way it really is, though. So one of the things people often leap to when it comes to asset identification is finding all the subdomains

like, what hosts has this company, this target, got online? What can I find? There's a bunch of different ways to go about that. There are some fairly obvious ways. Brute-forcing: there's a bunch of tools for that. I quite like recon-ng myself, and you can use whatever word list you like. DNSdumpster, I don't know if anyone's familiar with that. ThreatCrowd is fairly good as well, although it often includes things that are not part of that company, just loosely affiliated. Censys too, and Shodan as well, which is not listed there. If you've got a fair amount of disk space, Rapid7 have some forward DNS logs that they publish on scans.io, and what they actually release is basically all of the

domains and subdomains, all the DNS entries that they know about. If you've got enough disk space, you can download that and grep through it. It'll take a long time, but actually that can be fairly effective. Certificate transparency logs as well, very useful; I actually found something using those the other day. Google: I think people are fairly comfortable with Google these days. Also Bing. I know, it's shocking, but sometimes you get different results. I nearly put DuckDuckGo up there as well, but actually most of those results come from Bing, in my experience. And one thing I have not seen people use is CSP headers. You get these Content-Security-Policy headers on web applications these days that say what things can be included in the web

page and what things can run, and sometimes you find little extra bits of information. This one is just HackerOne's; I'm a bit of a walking HackerOne billboard because they seem to like giving out free clothes. Sometimes you'll find things in there, so it's worth looking; just pay attention. Who's ever done any Google dorking? You go to Google and say: I'm looking at yahoo.com, and I want to find URLs ending in .php, because PHP things are obviously insecure, right? Everyone knows that. That kind of thing is fairly common. What I don't see so much of is this: why look at the target directly? Employees are terrible for putting things on Pastebin, so have a look at Pastebin instead. If you

know, in this case this is a search for a target I was looking at as part of a bug bounty program. I found out they had all their internal stuff under a .net domain, which I think is pretty common. Search for that, and here's an employee's SSH config, just helpfully pasted into Pastebin for me. Google indexed it. Lovely. Also the production config for some Java application. Very helpful, very kind of them, thank you very much. The obvious thing: source code you can read. Black-boxing stuff is hard. I'm not a good black-box hacker or tester or anything like that; I find it really difficult. I like to white-box things. So if they've got any, look at the source code

on their GitHub page, and don't forget about the likes of GitLab. Everyone kind of goes, "Oh, there's nothing on GitHub, so they mustn't have anything." They might have things hosted elsewhere. And it's not just code that they've written: look at their forked repos as well. That'll tell you maybe what things they're using internally. If you've found, for example, an SSRF, a server-side request forgery vulnerability, that might give you information about the kinds of things you should target. We're just looking for information. Client-side JavaScript: we've been given this incredible gift that, pretty much by definition, the source code for a web application is sent to your browser and you can read it. Back before people wrote web

applications it was sort of like, "I don't know how anything works," whereas now no one wants to write server-side applications. They write an API and they just shove all of their application logic down to your browser for you, and you can read that and find out how it works. You can find client-side vulnerabilities like DOM XSS, DOM open redirects, that kind of thing. But also, a lot of the time they'll include extra information, because most web applications have got admin panels, and who really wants to split their JavaScript bundle in two and send one half for customers and one half for admins? "We'll just bundle it all together; no one reads all that stuff anyway." I do, and often I'll find your

admin API endpoints, and sometimes maybe notice that there's no cross-site request forgery protection on them, and maybe I can send you a link that will do horrible things in your application. Also, for this, grep is your friend. Some code bases are really, really big. Think: what would I like to find in this? Then: what keywords could I use in order to find those things? Things like "password", "API key", "authorization"; more on that one later. Another thing that can help you is truffleHog. Anyone heard of truffleHog? Cool. truffleHog is great; it's very, very clever, it's fantastic. truffleHog will take any repository, look through all of the source code, all

of the commits, all the history, deep dive into it, and look for what it calls high-entropy strings, which are often secrets: SSH keys, passwords, encryption keys, all those kinds of things. It goes and snuffles them out for you and finds them. This is a target a friend and I were looking at a couple of weeks back, and we found, just out on GitHub, a password buried several directories deep, where maybe not many people would think to look. Android applications: this is actually one of my favourite things. I don't talk to that many people who bother doing this. Does anyone here write Android applications? No? I've talked to a couple of people who do, and

it's not really at the forefront of their mind that when they build an Android app and upload it to... what's it called? The Play Store, yeah... that people might be able to go and decompile that and read their source code. They're not really thinking like you do when you're writing client-side JavaScript. Obviously you would never put a password in client-side JavaScript, but apparently some people will put them in Android applications. This is a target I was looking at a while back, and it turns out they were doing bug reporting... and my slides are going ever so slightly askew. Never mind. They were using their own JIRA instance for bug reporting, and they'd just baked the

Basic authorization header into the APK. That's kind of a bad idea. This is a screenshot of their JIRA instance, which was public-facing on the internet, and there's a little bit highlighted down here. It's a little bit difficult to read, so I'm going to go a bit CSI on you and enhance: this was a JIRA administrator account for their bug tracker, which would presumably include all of their other bugs: security bugs, things that weren't fixed yet, issues with customers, customer data. It's an administrator account, so I just created my own bug to say what a horrible, horrible thing it was. I've never been so disappointed that a program didn't pay bounties as when I

found this, although I did get quite a lot of chocolate and a T-shirt that was two sizes too small. One of the really useful things I've found when trying to map out what URLs there are on an application: archive.org have this wonderful, wonderful thing, the Wayback Machine. Who's ever used it to go, "Let's take a trip down memory lane and look at the website I made in 1999 that I deleted," and then cringed really hard at the emo blogs you posted? Particularly with bigger targets and things that have been around a long time, you can find some really interesting stuff. So this is a little tool I wrote called waybackurls.
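A tool of this kind comes down to one query against the Wayback Machine's CDX index plus a filter over the results. What follows is a minimal Python sketch of that idea, not the code of waybackurls itself. The function names (`cdx_query_url`, `fetch_archived_urls`, `has_path_like_param`, `interesting`) and the example.com URLs are made up for illustration, and the filter is one possible reading of "any query string parameter that has a slash in it".

```python
"""Sketch: list archived URLs for a domain and flag path-like query values."""
import json
import urllib.parse
import urllib.request

# Wayback Machine CDX index endpoint.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def cdx_query_url(domain):
    """Build a CDX query covering a domain, its subdomains, and all paths."""
    params = {
        "url": f"*.{domain}/*",  # wildcard host prefix and wildcard path
        "output": "json",
        "fl": "original",        # only return the originally captured URL
        "collapse": "urlkey",    # collapse repeated captures of the same URL
    }
    return CDX_ENDPOINT + "?" + urllib.parse.urlencode(params)


def fetch_archived_urls(domain, timeout=30):
    """Fetch archived URLs. This talks to archive.org, not to the target."""
    with urllib.request.urlopen(cdx_query_url(domain), timeout=timeout) as resp:
        rows = json.load(resp)
    # With output=json the first row is a header row, so skip it.
    return [row[0] for row in rows[1:]]


def has_path_like_param(url):
    """True if any query-string value contains a slash, raw or as %2F."""
    query = urllib.parse.urlsplit(url).query
    pairs = urllib.parse.parse_qsl(query, keep_blank_values=True)
    return any("/" in value for _, value in pairs)


def interesting(urls):
    """Keep only URLs whose parameters look like they carry a file path."""
    return [u for u in urls if has_path_like_param(u)]
```

Calling `interesting(fetch_archived_urls("example.com"))` would give a shortlist to review by hand. Only archive.org sees the request, which is what keeps this step passive from the target's perspective.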

The address is at the bottom, and I'll share the slides later so you don't have to write it down or anything. It basically just goes to the Wayback Machine API and pulls down all the URLs it knows about for a particular domain and any subdomains of that domain. Often you'll get a lot, but then you can grep them and look for interesting things. With this grep pattern I've got down here, I'm looking for basically any query string parameter that has a slash in it, because my hypothesis is that maybe there's something where I'm passing in a file path. And that's exactly what I found. If you see this URL down at the bottom, can people read that? story_get.cgi, so it's old: story

name equals something with slashes in it, something something .html. I don't know about you, but if I saw that I'd be straight in there, thinking maybe I'm going to look at /etc/passwd instead. Oh, not that; I'm going to look at /etc/hosts, because I'm a nice guy and I'm not going to steal people's password files just for the sake of it. Honest. Something I've been looking at more recently, which kind of sometimes works, is link shortener archives. There's a URL just off the bottom of the screen; I'll fix that. URLTeam are a group of people who basically brute-force link shortener URLs, all the random digits kind of thing, and see where they go, and then

they keep track of them. They're doing it because they're worried that if, say, TinyURL goes bust, half the links on the internet are going to be dead, and I think that's a worthwhile cause. They put them up on archive.org, which also means I can download them. Particularly in the case of a very big target, where people are quite likely to link to them, I can use that to map out the application and find interesting things. It's the kind of thing that tools like DirBuster are never going to find, because they're not looking for query string parameters. It's just, "Yes, this file exists," but that doesn't actually tell me that much about what it

might do or what parameters it might have. The US, their ministry of defense, have a bounty program, so there's actually a pretty good use case for this. Basically anything on .mil is what I'm looking for here. And similar again: marines.mil, slash pages, some details .aspx page, with an item URL parameter that equals some URL. I don't know about you, but I'm looking at that thinking I'll have a look and maybe put in my own image. At a minimum, maybe I can make my own images display in the page. Maybe there's going to be an SSRF there, and maybe I can fetch something internal instead, which would be particularly bad in the case of the US military. At a minimum,

it turns out people will paste all kinds of things into URL shorteners, and for building password lists, that kind of thing, it's really useful. We've already mentioned GitHub a little bit, around finding source code and that kind of thing, and it's kind of useful. But humans tend to be the weak point for corporations these days, and very helpfully, companies trying to get their tech brand on point have a nice GitHub account, and a whole bunch of the people who work for them will be listed there. You can click on their profiles and see what they have. Work-life balance is maybe not so great in some places, so people share configs

between home and work the whole time. One of the things that people tend to do these days is store their dotfiles, their configuration files, in a GitHub repo, because they've got a machine at home and a machine at work, and they share them between the two so they don't have to maintain two separate sets of config files. And that's fine, unless your .gitignore discipline is not so great. This is a commit from someone removing their Bash history from their dotfiles, which is kind of a problem if at any point you've, say, exported an API token into an environment variable, as in the case of this person. Kind of a problem. And if you're really

clutching at straws, trying to figure out what tech they even use, they've not got any server headers, there are no file extensions, you've not got any information about them, then have a look at their careers page and see what they're hiring for. "We're using Node.js, React, PHP, MySQL, RabbitMQ, Chef and Docker." Actually, I know a fair bit about your infrastructure now. There are things I can assume: you're using RabbitMQ, OK, so you're doing some message-passing type stuff. If I'm looking at blind payloads in particular, those are the kinds of things I might try and target. Or they might even just be host names I might

try and guess: rabbitmq.yourdomain.com or something like that. I know you've got these kinds of things internally. So that's kind of it for the passive stuff. Now, not really a great story, but a little story: DirBuster makes me sad, and I wanted to set out and try and do something about that. Instead of attacking one target and going, "I'm just going to flood this with requests, I'm going to send hundreds and hundreds of thousands of requests," and they're going to notice it in their graphs and they're going to ban my IP... I don't want to do that. So instead I'm going to try and look for one thing in lots and lots

of places. To test this idea, I decided I was going to enter HackerOne's Hack the World competition this year and set myself the challenge of using these breadth-first techniques only. I wanted to place in the top 30. If you place in the top 100 you get a T-shirt, and, spoiler alert, I placed at least in the top 100. I needed some targets. Thankfully, HackerOne provide a GraphQL endpoint, for which you can write a query that looks like this, which will return all of the in-scope domains and URLs for all of the programs I have access to on HackerOne. And then I wrote a tool called meg. It's on GitHub, and what it does is it

takes a path, or a group of paths, and a group of hosts, and it will request that path for every host in a row, with a bit of protection so it doesn't call one host more than once in five minutes. It does it very, very quickly, about 300 requests a second, but each server is only seeing a request every now and again, so it's not going to show up in the graphs. It does have a proper user agent, because I think that's a good idea. You could override that to, like, a Chrome user agent if you want to be really stealthy; I don't recommend that kind of behaviour. A little bit of a problem with that: often you'll get

responses to things where it says it's a 200 OK response but actually it's not. So what I've taken to doing is using this little Vim config at the bottom, so I can open up lots and lots of things in lots and lots of Vim buffers and just go bang, bang, bang, bang and cycle through the buffers really, really quickly. You probably can't read that text, but you can see the shape of it, right? Does that file look interesting? Kind of. Can anyone guess what file that is? package.json. You don't have to be able to read the text; I can see it a mile off. And from zooming in, I can see that they

use Grunt, they use Karma, they're going to use PhantomJS, so I'm going to be looking for those files now. Some stats for you: I pulled about 40,000 hosts and looked for a whole bunch of files on them. package.json showed up 0.15 percent [unclear] percent of the time, gulpfiles 0.84. These are small percentages, but I had 45,000 hosts or something like that to go at, so some of them just cropped up. The one at the bottom is particularly interesting, meaning I was trying to do reconnaissance with this tool, but it turned out I could actually find real problems that I could submit on a bug

bounty and get cash for. For example, a Slack API token in a .travis.yml file. .travis.yml is a file you use to control the continuous integration service Travis, and this particular company wanted a message firing into their Slack instance when their build was finished. And I could also send them messages, to do things like, I don't know, very convincing phishing attacks, or maybe CSRF against the internal services, should I happen to know what they are from reconnaissance. /.git/config, so the Git config file: a really useful thing you can do with Git repos, if you don't want to put private SSH keys everywhere on all of your servers, is you can just put a nice API

token in your Git config and it will pull things down from a private repo. The problem is that if you then expose that on your web server, then so can I. Yeah. It turns out one of the worst things you can do is have your document root configured to be the same as someone's home directory, because all of their files will be available, including their SSH private key. I'm just... yeah, we'll just leave that one. Also, requesting slash can be useful. I was doing this just to sort of go, well, I'll get a home page for everything and then grep for some interesting things and see what's around. It turns out one of the things

you can find is dangling CNAMEs, for example pointing at unregistered S3 buckets, which I can then take over. So I have a little shell script that runs a bunch of greps for the kinds of error messages you get for those unregistered accounts. I found a few of those. Found another one: another $600 for [unclear]. It also turns out you can look for vulnerabilities directly with a tool like this, in lots and lots of places at once. I'll just pick kind of an obvious payload: I'm looking for path-based cross-site scripting, where the path is reflected in the page. I'm just going to pass in %3C, which is the

less than, yeah, less than, and then maybe a double quote. I'm just going to grep for it in all the results in the files I've saved, and chances are I'll find one thing out of maybe 40,000. A similar thing: CRLF injection, or line feed injection. For anyone who isn't familiar with line feed injection, the gist is you insert a line feed character, it's reflected into the response as part of the headers, and now you can insert your own headers, like Set-Cookie for session fixation attacks, or maybe a Location header to do a redirect or something. You can find straightforward open redirects as well; that's kind of the same. There's a bunch

of fairly standard, generic payloads you can fire anywhere, and again, maybe 0.1% of the time you'll find something. These are the kinds of things where, if you're looking at one target by itself, you might not bother; if you're looking at 40,000, you should totally bother. CORS config errors are, I think, actually pretty common these days. Are people familiar with CORS? It's basically a mechanism where you specify headers so you can say "this site can access my site, and also send credentials," that kind of thing. You can send headers with meg, so you can send Origin: https://evil.com and see what returns an Access-Control-Allow-Origin: https://evil.com header. That might be vulnerable to me stealing your customers' details. So, the

results are in: I placed 17th, so it kind of works. I made maybe a few thousand dollars in the process. Nice. This has been really, really fast, so well done for keeping up. Fire away with your ideas and your questions. Silence. So, I just used HackerOne in this case. I've been really, really lucky and got invited to a whole bunch of private programs on HackerOne, which gives me a real big incentive to use HackerOne, and puts me far higher up the leaderboards than I actually deserve to be. So yes, just HackerOne in this case. Thank you very much. [Applause]
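The breadth-first idea from the talk (one path across every host before moving to the next path, with a cool-down per host) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how meg is implemented: it runs sequentially rather than concurrently, the names `breadth_first_jobs`, `HostLimiter` and `run` are made up for the example, the caller supplies the `fetch` callable, and the 5-second default interval is an arbitrary placeholder.

```python
"""Sketch: breadth-first request ordering with a per-host minimum interval."""
import time


def breadth_first_jobs(paths, hosts):
    """Yield (host, path) so every host sees one path before any sees the next."""
    for path in paths:
        for host in hosts:
            yield host, path


class HostLimiter:
    """Block until at least min_interval seconds have passed for a given host."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock          # injectable so the limiter can be tested
        self.sleep = sleep
        self.last_seen = {}         # host -> time of its most recent request

    def wait(self, host):
        now = self.clock()
        last = self.last_seen.get(host)
        if last is not None:
            remaining = self.min_interval - (now - last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_seen[host] = now


def run(paths, hosts, fetch, min_interval=5.0, limiter=None):
    """Call fetch(host, path) for every pair, breadth-first and rate-limited.

    min_interval is a placeholder default, not a value taken from the talk.
    """
    limiter = limiter or HostLimiter(min_interval)
    results = []
    for host, path in breadth_first_jobs(paths, hosts):
        limiter.wait(host)
        results.append((host, path, fetch(host, path)))
    return results
```

With a long host list the limiter rarely has to sleep, because by the time the loop returns to a host for the next path, more than the minimum interval has usually passed. That is the property that keeps any single target's traffic graph flat.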