
looking for a job, find those people. Okay, presenting Kevin Bottomley, give him a round of applause. [Applause] Hi. Okay, can you hear me? Does that work okay?
Closer, closer, closer. Sorry, there we go. Good. Okay, hi, I'm Kevin Bottomley. I'll be presenting about fighting phishers' fake fronts today. We'll be looking at a couple different methods for doing this, so if you're ready, let's go. Some topics I'll be covering are what phishing is, phishing techniques, commonly spoofed companies that you see, how these sites are made, fake website deployment, detection methods, and protections that you can employ to help prevent these attacks. A little bit about me: I studied computer networking and information security at City College of San Francisco, with an emphasis on network security. I'm currently a security analyst here at OpenDNS, where I monitor and track
various traffic patterns and other malicious attacks that go on across the internet. I'm also an avid rock climber and the proud parent of a 10-year-old black Labrador. So what is phishing? It's a form of social engineering where the sender tries to garner the trust of the receiver of a malicious email. It's an attempt to gather usernames, passwords, and other credentials used to log in to various websites, and also an attempt to get credit cards or money and various other items. It's also for gathering personally identifiable information such as your address, your birth date, Social Security number, anything else that
they can use to emulate who you are. A recent estimate by RSA at the beginning of the year put the estimated loss in December of 2014 alone at 453 million dollars, which is quite a substantial amount of money. So a couple types of phishing that we see: there's spear phishing, which is where the attacker will learn as much information as they can about the person or the company that they're aiming to phish, figure out who works there and what they do, and send a targeted email that is usually spoofed so it appears to come from inside the company, to add more realism to the look of the email. There's clone phishing, which is
actually what we're going to be taking a more extensive look at here in a little bit. It's basically where the attacker will take a real website and clone it, along with the emails that go with it, send that to the receiver, and emulate a perfectly realistic-looking website. And then whaling, where they target the CSOs, CEOs, and CFOs of companies, because they usually have more access to the monetary aspects of the company, as well as insider information and other intelligence that the attacker would like to gather. You basically have two types of servers that are used in these attacks. The first is a dedicated server that's
like this, where the domain name is actually meant to spoof something. So this is meant to look like (oh, sorry) this is meant to look like the Apple website. You can see in the URL that the domain is "UK hyphen Apple ID verify.co", clearly not an actual Apple website, but it looks very real. And then the other type of website is usually a compromised domain, where they will use various vulnerabilities in the website to place malicious pages at the end of the URL. Usually it's a little bit more hidden than this one. This is actually a website for some calligraphy ink place in the Philippines. Both of
these domains are actually still active, by the way. I just pulled them out of phishtank.com yesterday; apparently the site owner of this one has not been notified yet. Recently there was the big Anthem breach. I'm not sure of the attack vector for the breach itself, but it led to a mass emailing of fake Anthem emails playing on the breach, basically using it as a scare tactic to get people to click on a link that would either send them to a spoofed website or download malicious content. And then more recently you had the breach of the State Department that led to a phishing attack on the White House, where it divulged some, while not
classified, information that was very sensitive in that it contained the president's itinerary, which is somewhat public, but it had an exact time for every place that he would be, and foreign intelligence agencies tend to really enjoy this kind of information. So, the anatomy of a fish: for those of you interested in biology, we can talk about that later. And then we have the anatomy of the phish that we're covering today. It basically starts out with the sender crafting a malicious email to appear to be a valid email from a legitimate company such as PayPal or Bank of America or any one of a number of legitimate corporations. The user clicks on the link
and is directed to the phishing page, and then the user is asked for their credentials: usernames and passwords. Now here's where it kind of splits off. Depending on the site design, the user can either be redirected, usually through PHP scripting, so that the credentials are cached and then they're logged in to the legitimate site, or the user will be redirected to other pages that have also been crafted by the attackers to try and garner more information, such as addresses, birth dates, Social Security numbers, phone numbers, and all that, which we'll see an example of here in a little bit. And then
sometimes you even see things where they ask for more information, such as a copy of a passport, driver's license, or state ID. Then the criminals can use this either for their own personal gain or, more than likely, you'll see it bundled as a package for sale somewhere on the deep web for extremely cheap; like, $5 gets you hundreds of people's information. And then you get redirected to a legitimate website again, that little example being PayPal down in the little picture. Does anyone here work for PayPal? Okay, because I'm gonna be using you a lot and I don't want you to block my stuff before I'm
done. So how many people here have gotten a phishing email? Yeah. How many of you clicked on that link? Yeah. Wow, that's way more than I expected to answer that, actually. For research, of course; it's all for research. We'll get into that here in a minute too. So some of the more commonly spoofed companies that you see are PayPal, Apple, Google, Gmail, Yahoo, Dropbox, and any number of financial institutions: Wells Fargo, Bank of America, HSBC; the list can go on. So, some of the commonly spoofed companies again. I should clarify the numbers for this data. What I did was, if you go to phishtank.com,
which is a community repository where people can upload phishes that they find and verify them, you can actually pull down a CSV of all of the verified phishes. There were about 28,000 sitting there that were online and verified when I pulled the CSV. A big chunk of that has no attribution, meaning it didn't have a company name that went along with it, so I just stripped those out because they didn't add much value. That left me with roughly 9,000 domains with attribution. So I took the top five, which are PayPal, AOL, Poste Italiane, which I believe is
the Italian postal office, and then eBay and Google. So the 51% for PayPal is 51% of the top five, and the total of those five is 3,138, with PayPal accounting for roughly 1,600 of those. That's just to give you an idea of where I got the numbers from. But either way, yes, amazingly enough, at least on the PhishTank side of it, there weren't a lot of online verified phishes. Now that's not to say that there aren't some that have been taken down, but that's just what was currently online as of four or five days ago, I believe. They are in the list, but the numbers are very
small. So how are these pages developed? The costs are relatively low. A server costs you about $5, and to buy a domain name, if you choose to set up your own, costs you roughly another 10 bucks, so it's about $15. Usually it's not actually the phisher's money, because he's already phished someone previous to you, so he's actually using your money to pay for his new stuff. And then a couple of the commonly used tools: there's the Social-Engineer Toolkit, where the name is kind of self-explanatory; it's a toolkit for social engineering. It has a whole swath of stuff in it, from targeted phishing attacks to mass mailers; the list goes on. I've
put the link up there; it'll be in the slide deck that I put out later. And then how many of you have seen HTTrack in action? A few of you, yeah. It's a pretty amazing tool. It's meant for offline web browsing, as they like to say, but it will actually copy a web page completely. It'll mirror all of the links, all the URLs, all of the CSS, and everything else that goes along with it, which we're going to see in action here shortly. So what's the deployment? Well, it's easy enough that I can do it, which means that pretty much anyone can. It's just a few clicks; well, not clicks of
a button; there's a Windows version of HTTrack, but who uses Windows? How long does it take? Well, we're going to see here in a minute; it doesn't take very long at all to deploy. So, our detection methods. For those of us that remember this
game. So basically we employ three main detection methods here, or three aspects of the detection method. The first is the ASN of the company, the second is the IP address or the net block that the company uses, and the third is using what's called natural language processing. The ASN is the autonomous system number; an autonomous system is basically a collection of routing prefixes for IPs that tells your computer how to get from point A to point B on the internet. And as you can see, if you go to the bgp.he.net website and just put in PayPal, or any company, it'll give you a list back
of basically every ASN and every net block that deals with that company, so it's a really good source of information for building up the data set that we're going to look at here in a little bit. The Internet Protocol address kind of speaks for itself: it's basically a phone number for the name on the internet. So basically you don't have to remember that 67.215.92.211 is OpenDNS.
So the question is: do we track the ASN of the fake website?
Sorry, we'll get to that actually shortly, but basically what you do is you back-trace from the initial domain and grab its IP, and then you go and get the ASN from the IP, so you trace all the way back from there. Well, we'll see here in a second; I should have that slide in here. And then the main aspect of the model is actually using natural language processing. What natural language processing does here is extract patterns from text data using stemming, for which we'll employ the Natural Language Toolkit, which is a Python library, and also something called edit
distance. Stemming is basically when you take a word and break it down into its root form. In this example you see update, updater, updates, updated, and updating; when you stem them down to the root form, each becomes "updat". So you can use this to start breaking up your strings into one common word. And using the Natural Language Toolkit you can see this in the example: you import the library, you create the stemmer, you put the words in the list, and then when you iterate over the list you end up with "updat" for each one of those five
words in the words list: update, updates, updater, updated, and updating. So you can easily figure out what the stem part of the string that you're looking for is, for the data set that we're using here in a little bit. The second thing that we're going to employ, I guess it's actually the third, is edit distance. It's how similar two strings are, and you can think of it the same way that spell check and autocorrect work: when you start to type in a word and you mistype one of the letters, it'll be like, "do you want to make it this word instead?", because it assigns it a penalty score,
which is basically the minimum number of editing operations it takes to change the string that you typed into the string that it believes you wanted to type. There are actually two kinds of numerical penalties that you can assign. The one that we usually employ is this one, where you have insertion, deletion, and substitution, each with a cost of one point. And then there's a Levenshtein variant that assigns substitution a cost of two instead of one, but we use a really low threshold, so we're not going to cover that Levenshtein model right now. So, an example of the penalty
assignment: it's assigned based on the number of operations needed to change one string into the other. We'll take OpenDNS as an example. When you try to change "opendns" to "opendms", as you can see, nothing happens for the O, the P, the E, the N, and the D, but when you get to the second N, an M is substituted for it, and then the S is the same. So by substituting the M for the N, you assign it a penalty score of one, because it took one operation to change the string. Going a little bit deeper (oops, ahead of myself), when trying to turn "intention" into "execution", you basically delete
the I, you substitute the N with an E, you substitute the T with an X, the E doesn't change, you insert the C where the asterisk is up there, and then you substitute the N with a U. T, I, O, N are all the same, no change. So you have a penalty of five, so the penalty is greater. So, as I said before, we'll have a little demonstration here of what it takes to actually deploy one of these phishes. I kind of just made a video, because it was a little bit
easier. So what I've done here is I've created a fake PayPal email address, who's going to act as my victim, and then this is my malicious web server that I've created. So you just type in httrack, you give it a project name (call it PayPal), the path where it should store things, so basically what file directory you want it stored in, and the URL that you want it to mirror. We want to mirror basically the entire PayPal website, and you don't have to go all that deep into it, but I'm going to anyway, so you pick number four. So basically it'll traverse the entire
PayPal website from A to Z. You can use a proxy; I didn't use a proxy, that's why I have a VPS. You can wildcard it here so it looks for only certain things and skips other things, so it can take everything that starts with www or skip anything that starts with, like, zip. You can give it some options; the list is long. You can always RTFM if you really want to figure out everything it does. For those of you that don't know what RTFM means, I'll let you Google it. Oh, I'm sorry, yes? Yeah, hold on, there is a way
to... oh, sorry, yeah, he asked if I could... sorry, do you want the font increased, or the picture itself? Yes. Um, see, this might be a little much. Oh, one more. Better? Sorry, it was a lot darker in my basement when I was doing
this. So, continuing on. Basically what happened back there was I started the crawler to go and crawl everything, and now I've switched over to the Social-Engineer Toolkit to craft the phish itself. You give it a send-to email, so we're going to send it to my fake PayPal victim here, if I can type
correctly. Basically what's going to happen is it's going to ask for a few options for how you want to craft your email, because you can send it out with a mass mailer or with a few different options, and since I didn't want to spam everyone in the world, I just selected one. Oh, and then I also have a fake PayPal sender email that I was using for the demo. You can actually run a mail server too, so if you have your own mail server going on your server, you can spoof it even more than just coming up with this
horribly crafted PayPal Gmail account here. We're going to make it seem like it's from verify@paypal.com, even though it's not really from verify@paypal.com. You have to pass it your password, of course. You want it to be high priority, because it's very important that the receiver gets it. Give it your subject line, something along the lines of "account information's been changed", "your account has been compromised", something like that. And then of course no phishing email is correct if it doesn't have some type of grammatical error in it, so we put "information". We're going to send this as HTML instead of plain text, because the email is crafted with
HTML. I have this handy pre-made HTML email, which will spoof the real domain in the email, as we'll see here shortly, and that's just a little href and document stuff, pretty simple, straightforward. Paste it, hit return again, leave it a blank line. Now, somewhere here... oh yeah, so as you can see, it says fake PayPal update.com; that's the website that I bought to use for my spoof. But then if you look down (sorry, it's kind of tough to see), I'm basically spoofing that it's from verification paypal.com, just to add a little bit more
legitimacy to the email. Send it, and off it goes. Now, yeah, we're done there. The other thing that you have to do in the meantime, which I probably should have done first, but it takes a while for HTTrack to go and get all the stuff that you want: now you have to go to the directory where you stored the files that have been downloaded by HTTrack and start copying them over to where the landing page will be for the link in the
email. Takes a second. Usually what the attacker will also do is have a crafted PHP page already ready to go, so that they'll take the main landing pages and have kind of a man-in-the-middle aspect, so that when you get to the page that you're about to see here in a minute and you click to log in, it'll basically man-in-the-middle your credentials, and it'll either send you to more web pages that they've crafted for more information or just ship you off to the legitimate site. And somewhere... there it is, verify paypal.com, and it's kind of... yeah, you can see it. So as you can see, the
link actually says verification-paypal.com, even though that's not where we actually end up, because we end up at fake PayPal update.com, home.html. But it only took, oh, about six minutes from inception to deployment. And then of course there are a couple other steps that go along with it, like the PHP pages I was talking about: basically you would change where the document sends you in the actual PayPal source code, and point it to your own pages that are also stored in the same directory down the line. I believe that is the end of that
video. So it's really simple to make one of these web pages. Yeah, sorry, uh-oh, there we go. So this is what we've usually seen in the past, and this is not actually a copy of the page that I deployed; it's one that we found using some of our detection methods. As you can see in the top left corner, it's "security PayPal center.com", hyphenated, and this is one of the pages that I found when I was going through some pages a while ago. After you entered your credentials, it took you here, so it's asking for your address, the city you live in, all that stuff, your phone number, and your Social
Security number, no less. So here's where they start trying to harvest your credentials, in the first part, and then after you click next it... not sure why, sorry, what's going on here? Sorry, technical difficulty here.
What is going on? Sorry, my slide doesn't seem to be forwarding for some reason. Come on, really, what's going on here? Oh, are we fixed? What is going on? Um, maybe... sorry, this always happens. Oh, huh, yeah, it's right... next option here. Sorry, my new laptop seems to be... Put it... start over. Fail. Fail.
fail all right I think we're fixing okay sorry about that technical difficulty so social security number gathered the next page that we went to uh was a credit card information so it asked for your card holder name the card number expiration date the card security code and then the thing that stuck out the most was asking for the pin number as well which most websites won't ask you for the pin number as that's the point of asking for the CVV on the back of the card and this particular website actually validated uh your credit card to make sure you weren't just putting in some type of BS card so it was using the the lon algorithm to make sure that it was
actually valid, so that you weren't tricking the trickers. And then the last page of this website asked you to verify your information even more, so they were like, send us your passport, your driver's license, an image of your credit card, and proof of address, and then you could nicely submit it to them all in one fell swoop. So that was pretty nice of them, just to make sure that you were verifying who you were as they were stealing your information. So some of the things that stuck out from the email I sent, that you can look for in your emails, would be this: even though it says verify@paypal.com here,
you can see in the actual sender part of the email where it shows the fake PayPal sender address at gmail.com. That sticks out. Then there's the fact that the fake web page doesn't have any HTTPS involved here, so none of the information that you would pass to it would be encrypted, and PayPal obviously has their SSL cert and will have encryption going on. So what else... This model can also find things that have to do with, like, the DarkHotel domains, so Microsoft XP update.com, Java update flash.net, Adobe Microsoft update.info (or info.com), Firefox updata.com, and Gmail boxes.com. I actually have... So you might be wondering what you can do
to help catch this stuff yourself with the model. What we do here at OpenDNS is we have the same model deployed, because we see so much traffic, like 50 billion requests a day or somewhere in there, so we're able to actually sit there and watch for these one-off domains. So what I did in the process, and hopefully I don't break my machine again, was I also deployed my own DNS server that I pointed a bunch of traffic to, so when that link in the email was clicked, it actually resolved through the DNS server that I had
deployed. So using the model that we use at OpenDNS, looking at the edit distance and the stemming, you can actually go through... so I have this log file
somewhere. So this is basically everything that my DNS server has been catching since I pointed stuff to it a couple days ago. I think there's somewhere in the realm of 5,000 queries in it. So, using the handy-dandy... there. If we run the script, what this is going to do is go through my entire query log and find out of all that... oh, if I get into the right thing. So what it's doing right now is going through the entire query log, taking all the domains in it, converting them to IP addresses, testing each IP address against the ASN, and then, if the ASN is in a whitelist
that we've created in our data set, it's basically going to take that domain and just keep passing it through, because it's obviously not malicious, since we've already whitelisted it. So as you can see, we start seeing... is that kind of hard to
see? Sorry.
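The whitelist pass described above (domain to IP, IP to ASN, skip anything on a trusted ASN) can be sketched roughly as follows. This is not the script from the demo: the lookup tables are hypothetical stand-ins for a real DNS resolution and a real BGP-table lookup, and the domains, addresses, and AS numbers are documentation-only placeholders.

```python
# Hypothetical data: documentation-range ASNs and IPs, ".test" domains.
WHITELISTED_ASNS = {64500, 64501}  # ASNs already attributed to known brands

# Stand-ins for a DNS lookup and an IP-to-ASN lookup.
DOMAIN_TO_IP = {
    "www.examplebrand.test": "192.0.2.10",
    "examplebrand-update.test": "203.0.113.77",
}
IP_TO_ASN = {
    "192.0.2.10": 64500,
    "203.0.113.77": 64511,
}


def needs_review(query_log):
    """Yield (domain, asn) for every domain whose ASN is not whitelisted."""
    for domain in query_log:
        ip = DOMAIN_TO_IP.get(domain)   # None if the name did not resolve
        asn = IP_TO_ASN.get(ip)         # None if no ASN is known for the IP
        if asn in WHITELISTED_ASNS:
            continue  # trusted network: pass it through without more checks
        yield domain, asn
```

Filtering on the ASN first keeps the more expensive text checks for the small set of domains that are not hosted on a known brand's own network.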
White on black or black on white? It doesn't matter; I'll take anyone's suggestion. Let's see. So you can tell, well, you can kind of tell, that I actually ran this quite a few times. So out of the 5,000 domains that are in the query log, it's basically just parsing out what... this. Right
on, thanks. Sometimes I play with computers. So while I was making all this happen I actually... that's why you see so many of them. But it's only catching certain ones; so this one, it's flagging on the "updat" part that I was showing you earlier from the stemming, then comparing that to the brand name, and then taking the ASN and seeing that the ASN is different, and then being like, this is high priority, someone clicked on this domain, you should probably go investigate, because someone just got
phished. So the question was: am I checking to make sure that this domain doesn't match the same ASN as PayPal's ASN? Yes, that's exactly what is going on. It's figuring out the exact IP address for where this domain sits, running that against the pyasn library, which converts it using the BGP tables to figure out the actual ASN, and then testing that ASN against a known bucket list that we have set up. So basically you go through and make a table for each company, so Google has its set of ASNs,
and on down the list, and once you compile all of those ASNs you can just bucket them, and if a domain's ASN is in that list, you can automatically push it through, and you're not using the resources it takes to even check it, because the odds of a fake PayPal domain being on PayPal's ASN are pretty much next to none, so we don't even need to query that. So if the ASN is in the whitelist, we just keep going to the next one. With this one, the stemming happened, the edit distance
happened, so it was penalized, and the ASN was wrong; that's how it gets flagged. So that's how people on a network can actually make it a little bit... since we have so much data, it's easy for us to see this, but usually smaller companies don't see the amount of data we have. But it's really easy to deploy; I think I built this DNS server for five bucks. Depending on how much you need it to scale, you can give it more RAM, more hard drive space, and all of that, but definitely make sure that if you're going to build one for your own company you set up the access control list, because next thing you know
you'll be participating in a DNS amplification attack if you don't. Just a word to the wise. So let's see if I can not kill PowerPoint again. Hm, come on. It doesn't work very efficiently, does it? I'm just going to kill this.
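To tie the pieces together, here is a minimal, dependency-free sketch of the flagging logic as described: stem the parts of a domain, measure edit distance to a brand name with unit costs for insertion, deletion, and substitution, and flag when the text looks brand-like but the ASN is not one of the brand's. The talk uses NLTK for stemming; `crude_stem` below is a deliberately simple suffix stripper standing in for it, and `looks_like_spoof`, its threshold, and the sample domain and AS numbers are all hypothetical.

```python
def crude_stem(word):
    """Strip one common English suffix (toy stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "er", "es", "e", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def edit_distance(a, b):
    """Minimum number of single-character insertions, deletions, and
    substitutions (each costing 1) needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = curr
    return prev[-1]


def looks_like_spoof(domain, brand, domain_asn, brand_asns, max_distance=1):
    """Flag a domain that resembles the brand, carries a bait word such as
    'update', and is hosted on an ASN outside the brand's known set."""
    labels = domain.lower().replace("-", ".").split(".")
    near_brand = any(edit_distance(l, brand) <= max_distance for l in labels)
    has_bait = any(crude_stem(l) == "updat" for l in labels)
    return near_brand and has_bait and domain_asn not in brand_asns
```

With these definitions, the two worked examples from the talk come out as expected: "opendns" to "opendms" costs 1, and "intention" to "execution" costs 5.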
Microsoft. Sweet. Okay, so the same model that I used to find the fake PayPal domain in the query log there is the exact same thing that we use to find other threats. So like I was saying earlier, the DarkHotel domains, APT1 domains; this works all across the board and not just for phishing, so you can use it to get a lot of insight into pretty much any attack or any malicious URL that's on your network. So, back to the phishing. Some of the takedown processes for one of these phishes: you would notify the ISP where the domain is sitting, the registrar
that the domain was registered through; you can contact the site owner, but if it's a dedicated server they're probably going to ignore you; and you can also contact Microsoft, Google, and Mozilla, because they put it into their browsers, so when you go and visit, it sends you that big red "this is a phishing page" warning. Some things you can do to help protect yourself even more: you can enable two-factor authentication on every single thing that allows it, check for HTTPS in the URL bar, and use password keepers like LastPass and KeePass. This allows you to use different passwords across every
different site you go to, so even if one of your sites is compromised, the odds of them compromising another one are next to none. And then compare the ASN, the IP, the WHOIS, and the natural language processing for site validation. And lastly, don't click on random crap, because that's the first vector. Questions? I know... go
ahead. So the question was: do we see phishing sites with valid HTTPS? And while I have not actually seen one, I was actually going to make mine one, because it's not really that hard, but I think it's a couple extra steps for the attacker. That would probably be something for a really deep-down targeted attack against someone, just because they don't want to spend the extra money usually, and if they can just mass-email it out to people, you know, you send it out to enough people, you're going to get a few clicks. But that's a good
question, because I actually thought about doing it, just to disprove myself on the HTTPS
thing. So that's part of building up the training set for the model: sometimes you will get a random false positive in there, it's bound to happen, and so you have to go and investigate a little bit more. That's where a little human interaction comes in. But it does happen, and then you just add it to the whitelist, and figure out if it's actually a legitimate PayPal domain before you do
that. Anything else? No? Yes, I'll put the slide deck out there somewhere. So there's my little Twitter handle; that's a three and a zero, leetness. Actually, every rendition of my name I tried to get was somehow amazingly taken, which I don't really know how, but yeah, if you go and look at this, I'll probably put it out sometime today. That's it. Sweet. Just want to say special thanks to Jeremiah O'Connor over there; he actually helped with a lot of the codebase that I was using, so, really awesome stuff. And if you happen to go to SOURCE Boston, he'll actually be talking about the
natural language processing and stuff even more, and maybe, if he's really lucky, at RSA this week. And then of course the OpenDNS research team and BSides SF itself. And for those that haven't eaten lunch
yet