
Okay, hey guys. So my talk, "Breach Offering: A Culinary Exploration of Data Breaches", sounds a bit silly, but it's good for getting into conferences. It's basically having a look at data breaches. So this is me: Frank Allenby. I work at Telspace, I'm a security analyst, and that's my email address and Twitter handle, if that interests you at all.

I'm gonna give a foreword here. There is some credit to SensePost, where I used to work, because much of this research was done there, though I do currently work at Telspace. So, you know, just giving a bit of credit to the time that I spent while I was there. The methods here I used on my own; they're probably imperfect, so anything you guys might come up with is probably better than what I've done. Just keep that in mind. I use fake data for this talk. There's not much data, but it's fake, it's not real. I don't have any access to breached data at this point in time, and I wouldn't be able to help you guys get breached data or source it or anything like that, so please don't ask me. And then, yeah, much love.

So, data breaches. Basically a breach is data obtained in an unauthorized manner, and it happens to be sensitive quite a lot of the time, since if it wasn't sensitive there would
be no point in really wanting to steal it, right? Okay, so we'll go through and have a look at some of the top ones. So we've got Yahoo; they're at three billion. They started off with a couple billion, or a couple hundred million, and it turns out it was three billion: actually their entire user base. Then it goes down to Adult Friend Finder [unclear] (so it's like Charlie and him; I use you as the dummy of my jokes), and then Court Ventures and Deep Root Analytics, two kind-of data analysis firms, MySpace, eBay, and Equifax, which is very fun because it's basically like half of America. So, you know, half of America is now screwed,
and that's one of the three big credit firms. So if you got hacked in that breach and now someone's impersonating you, and you put a freeze on your credit account, it's only with these guys. The other two credit firms don't care about what these guys are doing, so even if you've been hacked, you're still hacked in two other places. And then LinkedIn, which most people use; that's from 2012, so it's a bit old.

The South African one that we had recently here was the Master Deeds breach, and, you know, it hit headlines all over the place. It affected about 60 million South Africans in total. That's how many individual people were affected, individual ID numbers, which is basically like everyone, because there are about 56 million of us here. So it's probably not everyone, and there's a bunch of dead dudes in there and stuff, but you're probably in there. So that's names, addresses, ID numbers, you know, all that fancy stuff that people like, especially the deeds people: where you live, what you've done, the history of your stuff, all of that fun stuff. The original source is murky; nobody really knows where it comes from. There's a whole lot of "you did this" and "I did that", this guy saying this and that guy saying that, and people's reputations being damaged, so I'm not mentioning
anyone. I'm just saying some kind of deeds database sales corporation; that's where it came from. And it seemed to be widely spread, because effectively it was just a thing sitting on a web server somewhere that you could crawl for, and download a 30-gigabyte file with South Africa's population.

So this is effectively how a breach happens, right? The breach happens: it gets hacked in whatever way it gets hacked. Then one of two things starts to happen. It either gets leaked directly, so someone sends it to someone, someone makes a torrent of the thing, someone releases it online, and then any of us can go and download it. Or it gets sold, which is usually on the dark web, you know, with Bitcoin or all of those fancy things that people use to buy drugs and stuff as well. Sorry, whoops. And then usually once it's been sold, more people have it, it gets leaked eventually, and then we get to have it too, because we're cheap bastards. So yep, that's how I used to get my breaches.

So a breach is forced data liberation. We've got to liberate the data, man; it's trapped in those databases, it's gotta be free. Okay, so these are ways it can be breached. System configuration issues, like open databases: some dude that happens to
leave a MySQL port open with no password. You know, "here, have my data". That's fun. Public files: like the Master Deeds breach, someone just dropped a SQL file on a server and hoped nobody hits it. But that's not gonna happen; someone's gonna find it. Unprotected displayed data: this is when, for example, you go to a website, you type in an ID number, and it dumps out the details for the dude, right? So what happens if you take your friend's ID number, or your other friend's ID number? Or, if the ID is one, two, three, four, you can just go from one to the maximum and get everyone. And that's effectively a data breach, because you're still stealing the data; it's just not directly from the database. And then internal theft is often the case where some dude doesn't like the company. Edward Snowden, I guess you could say, is one of them: he stole all that stuff that caused, you know, the NSA a much fun time.

Breaches might not surface anytime soon. So something might get breached in 2012 and released in 2015-ish, like the LinkedIn breach. So you never really know what has been breached or what hasn't been breached until people actually know about it. What we know about now is probably like this much of what's actually been breached, because people like to keep their data for themselves. And there's no
reason for companies not to collect all of your data. So you sign up to LinkedIn and they collect all of your stuff, but they're collecting peripheral stuff as well: metadata, what you visited, when you visited the website, who you talked to, all of this stuff. And they just shove it in databases and databases and databases, and you can imagine this happens for everything. Takealot does this: they analyze what you want to buy and what you're buying so that they can match it with, you know, maybe your friends or whatever it might be. Which means there is so much data out there to steal, and it's easy to collect it, so companies just kind of throw data
around like it's nothing, like it doesn't mean much. And there happens to be this kind of attitude of "eh, whatever, we got hacked", like, you know, it happens. People don't take it seriously, right? They're like "I got hacked", and then the American banks, or whatever they might be, just eat the fine, because it's this much money and they don't really care. It's being helped a little bit by different regulations and stuff, but most of the time, especially for banks, they kind of weigh up how much it's gonna cost them to fix a thing versus how much it's gonna cost them to pay the fine, and most of the time the fine is cheaper, so they just happen to choose to get hacked instead. So they don't love you guys very much.

And there's also "they got hacked, but we won't get hacked, right? Our stuff is always secure, we never get hacked." Nope. You're gonna get hacked someday, and it's gonna hurt, or you won't even know about it, and then you're gonna get hacked again because they use the original thing to hack you. So yeah, one kind of rule about the Internet is: don't put anything on there that you don't expect to get hacked, because it's probably gonna get hacked. If you want a fun experiment, take a server, put it on the web, and watch the SSH access logs.
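A rough sketch of that experiment, for anyone who wants to try it: the snippet below tallies failed SSH password attempts per source address and per username. It assumes OpenSSH-style "Failed password for ... from ..." log lines; the function name `count_failed_logins` and the sample lines (which use documentation-only IP ranges) are made up for illustration.

```python
import re
from collections import Counter

# Matches OpenSSH-style failure lines, for example:
# "Failed password for invalid user admin from 203.0.113.5 port 51240 ssh2"
FAILED = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+)")

# Fake sample lines; on a real box these would come from the sshd log.
SAMPLE_LOG = [
    "Jan 10 10:00:01 host sshd[101]: Failed password for root from 203.0.113.5 port 51234 ssh2",
    "Jan 10 10:00:02 host sshd[102]: Failed password for invalid user admin from 203.0.113.5 port 51240 ssh2",
    "Jan 10 10:00:03 host sshd[103]: Failed password for invalid user test from 198.51.100.7 port 40022 ssh2",
    "Jan 10 10:00:04 host sshd[104]: Accepted publickey for deploy from 192.0.2.10 port 50000 ssh2",
]


def count_failed_logins(lines):
    """Return (attempts per source address, attempts per username)."""
    by_source, by_user = Counter(), Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            user, source = match.groups()
            by_source[source] += 1
            by_user[user] += 1
    return by_source, by_user


if __name__ == "__main__":
    sources, users = count_failed_logins(SAMPLE_LOG)
    for source, attempts in sources.most_common():
        print(source, attempts)
```

Feeding it the real log file instead of `SAMPLE_LOG` gives a quick picture of who is knocking and which usernames they try first.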
Within seconds you're gonna have bots from China just hitting that thing with the common credentials. And actually, at one point I read reports of dudes that bring servers online and they get hacked before they can even configure the damn thing to use it for themselves.

And what has been breached cannot be unbreached, right? It's out there. You can't unsee something. Like, you're there and you're there, and I can't unsee you guys. Which is not meant in an offensive way; it was an example. No, I don't mean it like that. [unclear] It's okay. You cannot take
this data back. It's like opening a fire hose: it's gone. You can't stuff the water back in the hose unless you're extremely pro with chopsticks.

And we actually give a lot of it away ourselves. We don't even think about this, but we've got all the Facebooks and Instagrams and Snippy Chats and Tweety Tweets and all the flippy flops, and all of these things we just freely spread to the world. Like, "hey, look, I'm in Cape Town right here, right now", and then three seconds later you post another thing. So dudes can watch your Instagrams and see you're walking down the bloody street in Cape Town, because it's amazing and wonderful for you, but everyone else can track you, and if they don't like you very much then they can say hi and, you know, murder you. But what it means is this is all open, right? People can passively collect data about you that you're giving them. So if you give them something, then, you know, what do you expect, really? Which is why I don't have social media, but that makes life difficult in other ways.

And you cannot really authenticate with what's public. So if the Master Deeds breach is out there and it's got your first name, last name, and ID number, how can a bank ask you for your ID number to verify that it's you? So how many phone
calls do you get from a bank that says, "oh, it's [unclear], we want to verify this thing, please give me your ID number"? Like, sweet, bro, but that guy knows my ID number as well. And with a lot of companies you just phone in and they say, "hey, what's your ID number?" Like, okay, here's this guy's ID number; I'm him now, kind of thing.

So I'm gonna go through a little bit of terminology for the people that are not initiated as much. A database is basically a bunch of stuff that's collected into individual little compartments. The records are the individual things: you and you and you and you are each a record in a data breach. And a table is a collection of us, so a table would be the room, and we are all individual people inside the room at this point in time. SQL: Structured Query Language, "sequel", S-Q-L, "squirrel", whatever the hell you want to pronounce it. So we've got there in the beginning SELECT whatever: ID, name, and surname FROM users. Users is the room; ID, name, and surname are individual attributes of a record; WHERE the ID is "leet", or whatever you want to specify. So I could select Charlie from the room and then I would get Charlie, and we could have
happy fun times later. So this is kind of what they tend to look like, a bit of a demo record. This is the fake data I spoke about. You get the ID, "leet"; username, haxor [unclear]; name, Harold; surname, O'Neil, for SQL injection and other such things; email, blah-dee-blah; phone number, blah-dee-blah; and password is "myhotcousin69", which exists and has been found in undisclosed [breaches]. Okay, I'm gonna start picking on Kenny. [unclear]

So what do we want to do, right? We've got all of this cool stuff. We've got millions and millions of records, as you saw back
there, at 3 billion Yahoo records and hundreds of millions of other records. Not all of it is out there. We know the scope of the breaches, but the Yahoo one I don't have; I don't know where you can get the 3 billion records. But basically this is freaking useful, man. We have billions and billions of records and we want to index them. Why? We kind of want it to be like this. That's a little quest icon from World of Warcraft, if you've played the game. So the quest is to make it look like that, kind of: you can just type in an email address or a domain name, like gmail.com or fnb.co.za or whatever, MTN, or Telspace, and then get a list of Telspace people and their passwords, or their first names, or their surnames, or their phone numbers, or whatever.

So why? Well, we have a scenario here, right? We're on an assessment as a pen tester for Evil Corp, and there's one website and one API, both behind authentication. So you need to log in to this somehow, and we don't have any other luck: there's no open port or thing there somewhere that we can just pop. So now we're kind of stuck, because we don't have creds and we can't pop the thing in any other way. So what do we do? We kind of
end up having to brute-force this thing. Now where do you actually get stuff to use to brute-force? How do you know what Evil Corp emails look like? How do you know who works there? How do you know how many people work there? So we can use it for recon, to actually find how many people work there. You know, is it 50? Is it a big company, is it a small company? Good luck brute-forcing small ones, because they're actually quite a lot harder than the big ones. And, as part of recon, user discovery: who works for Evil Corp? We can get a list. For the South African banks that's often 50,000-plus users. You can brute that thing; it'll take a while, but you'll probably get someone with something stupid like "password123". Often executives as well, by the way: you can find their passwords. So you want to know who works there and what their passwords are. If you have this website for Evil Corp and you want to try to log in, now you can get Evil Corp people and their passwords and then try to log in to the website.

We can also use it to construct wordlists. So many passwords, very crack, wow. And we can use it for analysis, for stuff like how many breached email addresses there are per domain, so we can actually
watch, you know, Telspace: it has, I guess, five or ten breached email addresses, as a small company. Bigger companies, you know, more than that. You can also watch it over time as they get pwned, if that makes sense. Whoopsie. All right. What are the most common passwords? You know, "123"; oftentimes it's like "summer" or "spring" or "Arsenal" or "football", teams and that sort of thing. And something kind of interesting to think about is the strength of a hashing algorithm versus the actual organization's security. So if an organization uses bcrypt, is it gonna be more secure than an organization that uses MD5 to hash their passwords, or is there no correlation there? I mean, that's a pretty interesting thing to have a look at. In my experience, yeah, that kind of happens to be the case. LinkedIn used SHA-1, plain old SHA-1 without any salt, so you can basically just run through that thing with a wordlist and crack a shitload of them. I think about 99% of them are cracked at this point in time. So if you have a LinkedIn account and you haven't touched it, change your password.

So these are real email addresses. We have real data breached from real databases: real people, real email addresses. We know they exist, we know they're in use, and oftentimes someone has gone there and clicked that "verify this email address" button. And they're real passwords, because someone has to type the password in, so we know they're being used by these people; they're not just random. I mean, there are gonna be random people that just type in anything and press enter to get into the website, because they don't feel like having to, I don't know, sacrifice their child to the demo gods. Real people as well: real addresses, phone numbers, emails, whatever, ID numbers. So this stuff is actually useful, because we know it's realistic. It's not some fake thing, it's not a model, it's not a guess, it's not anything else; it's real.
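The analysis questions raised above (how many breached addresses there are per domain, and what the most common passwords are) come down to simple counting once the records are parsed. Here is a minimal sketch, assuming each record is a dict with hypothetical `email` and `password` keys; the records are fake and the function names are made up for illustration.

```python
from collections import Counter

# Fake records for illustration only.
RECORDS = [
    {"email": "alice@example.com", "password": "summer2017"},
    {"email": "bob@Example.com", "password": "password123"},
    {"email": "carol@example.org", "password": "password123"},
    {"email": "not-an-email", "password": "arsenal"},
]


def addresses_per_domain(records):
    """Count records per email domain, skipping malformed addresses."""
    domains = Counter()
    for record in records:
        local, sep, domain = record.get("email", "").rpartition("@")
        if sep and local and domain:
            domains[domain.lower()] += 1
    return domains


def most_common_passwords(records, top=3):
    """Return the `top` most frequent passwords as (password, count) pairs."""
    passwords = Counter(r["password"] for r in records if r.get("password"))
    return passwords.most_common(top)


if __name__ == "__main__":
    print(addresses_per_domain(RECORDS))
    print(most_common_passwords(RECORDS, top=1))
```

Running the same counts on snapshots taken at different dates would give the "watch it over time" view for a single domain.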
My first try at getting that thing working was to have a Python front end, using the Flask microframework, and Postgres at the back end as the database. It was just these two things talking to each other on a very, very, very constrained hardware machine with like a hundred million other things doing stuff on there. So it was kind of a pain in the ass, but it worked. I don't have screenshots of the thing, because it's not my IP, it's SensePost's IP, so I can't really show you the stuff, but this is what it was, if it makes sense to you. Cool.

So did it work? Well, I learned a few things, and one of them is that breach schemas can differ wildly. In one schema you might just have an email address and username; in the next one you might have a username, email address, password, and first name; and then in the next one you might not even have the email address or the password, you might only have personal details like first names, last names, all that sort of thing. So the stuff is all over the place and every single breach is different, right? You would store my data differently to the way he would store it, and him, and him, and him. And trying to normalize that in the Postgres database kind of made my life a bit difficult, because you have
to squash it all into the same model.

Another thing I learned is that breach files are broken. When a hacker is sitting in some corporate network hacking a thing, they're not gonna be there like, "yeah, I want this to look nice". No, they're gonna be like, "I want to get the data out and leave". They don't care. So that makes my life difficult, because I've now got to go through tens of gigabytes of data where it's a comma-separated thing without quotes, but there are commas in passwords as well. So now you have some lines with ten columns and some lines with three, and, you know, it makes your life a pain in the ass. So you have to actually write custom parsers, which I had to do. And I mean, it's fun, actually, to get to dig into the stuff, up until the point at which you realize that it's getting really tedious, because there are hundreds and hundreds of breaches that you need to get through, and some of them are not even worth the effort if you think about it.

Manual management of these things is a pain in the ass, because you've got to actually manage the stuff actively and manually. You've got to make sure that you've curated the data breaches, that the data breaches are where you think they
came from, and you've got to see what's in every single one. You can't just shove a hundred of them into one folder, click a button, and have them all go. Well, you can, if you spend lots and lots of time writing really highly specialized custom tools to do this for you, which will take a lot of time, and time isn't something that I typically have when I'm building one of these, because I really need to get data and I don't really want to spend weeks building a massive tool around it. Tools exist for that, though, like Kibana and whatnot. I did try them; they didn't really suit my needs.

And lots of data means lots of labor. If I have a hundred breaches, it means I've got to go through a hundred files, and sometimes, because the files are broken, it's a hundred times looking through them, writing scripts, and making sure they actually parse the data correctly. If it doesn't parse, then I've got to go back and try again, and do that a hundred times, for maybe an hour or three or half a day each. And it's not a hundred; there are hundreds. I had about 500 individual breaches, and that's not even close to what's really out there. So it was about a hundred and fifty gigabytes once I had indexed everything, and I'm not sure if this is compressed or not, but
yep, which is about 2.3 billion individual records that I had indexed to be searched. That's 50,000 songs' worth-ish, at 3 megabytes per song, or 100 HD movies' worth. Wink wink, Kenny, what kind of movies are you watching, man?

So what do we actually want? With this solution we want to be able to query differing schemas. We want to be able to query kinds of data that look different, at the same time, without having to finagle stuff too much. So I want to be able to get results from one breach and a different breach at the same time without too much work. We want it to be easy to import the data, because it doesn't make sense if it's easy to query but so hard to get data in that it just takes too long to make any sense. We want a familiar query interface, like SQL, which happens to be the most widely used kind of query language, slash interface, slash whatever you want to call it, out there. We want it to be flexible enough to do something now and then, when the requirements change in the future, do something else. We want it to be scalable as well. Right now it might just be me using it, which is fine when I'm running it myself, but add 10 people onto the
thing at once and we'll see how that performs. It doesn't perform very well; at least it did not with the Postgres solution I had. Which is not to blame Postgres: it's a combination of Postgres and the hardware and, you know, all sorts of stuff. And we want it to be managed, because we don't want this to be our problem. It's nicer when it's someone else's problem, and we don't have to worry about micromanaging individual pieces of software to make sure they're using the correct number of processes and memory per process and all of that sort of stuff. And I guess lastly we want it to be cheap, because cheap is nice, and it means that we get more budget to do more interesting stuff.

So some possible solutions that I researched. Key-value structures in a relational database, where you would use Postgres or a traditional relational database, but you'd have multiple columns that join together with inner joins and all sorts of stuff like that. That seems like a great idea until you start implementing it and realize that you're gonna have to inner join across millions and millions of records. For 10 it's quick, it's like instant, but for a million that takes a really, really long time, and then you have to start spending time optimizing your database, and that really gets kind of annoying, especially when your hardware is constrained. We can
maybe use something like NoSQL, which includes stuff like MongoDB and Amazon DynamoDB, where you basically take a thing and you shove it in a hole and you say "that's my thing" and it's all good. I don't like them very much, by the way. NoSQL stuff I really don't like, because it's such a great idea now, but then you come back in a year, and I've taken Kobus and shoved him into the thing, but now Kobus is different, and I've taken Kenny and shoved him into a thing, and I have all of these different things. But our requirements have changed a year later, and now I need to go back into this NoSQL database and try to mangle through this blob of stuff that I have. I'm not sure if any one of you has experienced that before, but if you work with legacy systems using MongoDB from when it kind of first came out, you'll have some fun. Or a key-value datastore, something like Redis, an in-memory database, where you just shove first name, last name, and so on, and then values next to each other, and kind of just query from that.

So what I kind of settled on was Google BigQuery. Why? Well, okay, Google's BigQuery is basically Google's serverless, highly scalable, low-cost enterprise data
warehousing thing. Lots of big buzzwords in there. It basically means it can store a shitload of stuff very easily and very quickly, and you don't have to worry too much about it. Whoopsie. Okay, sorry, I think I skipped ahead a bit. It ticks the biggest boxes that I had, at least for a database management sort of thing, in that it's managed and Google knows what they're doing, right? Google does big data. Basically Google's business is big data: they collect everything they can about you so that they can sell it to advertisers. It's not valuable to them if they only know your name when they can know your name, who you visit, who you talk to, blah blah blah. I mean, your Android devices feed into the Google machine; basically everything you do feeds into Google, and that's ridiculous amounts of data. It's like petabytes upon petabytes that they can query like this. So I kind of trust them to know what to do with data when I'm storing it with them.

BigQuery supports SQL, which is nice, because I don't need to change tooling, I don't need to learn new methods, I don't need to do anything strange. You can query different schemas, schemas being different models of data. So you can say SELECT id, name, surname FROM breaches.*, so it'll
get all of the breaches and get those fields from all of the ones where it matches. So you don't have to, you know, faff about with a single column in a single table in a Postgres database, or MySQL, or whatever you choose. You can just add a new table; if it has those fields in there, it will just pull the data out for you and you won't have to worry too much.

And it's pretty cheap, right? With the free tier they have, you get one terabyte of queries per month, which is actually not much when you have a lot of data, but it's quite a lot when you don't, and it's also quite a lot when you've managed to optimize your database or the way you store it. So it's only cheap if you use it properly. And it integrates deeply with the other Google Cloud offerings and stuff like that. Because it's with Google, it's all on the same network, it's all managed by Google, they know what's going on, the teams can talk to each other, the software talks to each other, and you don't have any big problems in dealing with multi-cloud stuff, or managing your own boxes, or "will this thing support that thing if it's this version?"

So what does it mean if it's actually used properly? All right, so you get charged by the amount of data
processed. When you do a search, they churn through the data to find what you want: they grab everything in a big bag, and then they shove their hand in there, find what they want, and give it to you. Which means that the more data you ask for, so if you query more tables or more columns, the more you're gonna increase the cost. Which means that to optimize for cost you're gonna want to query fewer tables and fewer columns, because that means less data processed. And yeah, it can be very expensive very quickly if you don't do this properly. So don't query everything at once, unless you don't like eating, because really, you can run up bills pretty quick. Where I was, you get a terabyte free, but each query ended up being about 150 gigabytes' worth of processed data, which is a lot. So it's not even 10 queries, and then it's $5 per terabyte after the first terabyte. I think one month it was about $200 or so, which is not much for a big company, but it's enough to kind of make you be like, "well, that happened".

So, to optimize for using BigQuery, I learned that you should split and shard your tables. BigQuery only supports date sharding, which means it will take
records that were inserted on specific days and then, in the background, archive them in different ways, so that you can query per day and not the entire data set at once. But you want to split it by breach and domain. So for the LinkedIn breach you'd split it by LinkedIn and then gmail.com or yahoo.com or telspace.co.za or whatever it might be. For that you'd also need more specific queries, but that works well with the search model we have, because if we only want Telspace we can just say *-telspace.co.za and we get only the tables for Telspace, which is a total of as many tables as we have breaches. So if we have 100 breaches, that's 100 tables, which sounds like a lot, but it's not really, because each table is going to be very small. It won't be gigabytes; it might be fifty megabytes' worth of data.

So, Google Cloud Platform all the things, because it's all under the same roof, and it makes sense, right? It's all in the same network. The Google data centers are all connected by these massive multi-gigabit links, the transfer speeds between them are extremely fast, and the engineers working on this stuff are very familiar with what happens between the teams and between the tools. Plus, the data transmitted between data centers is encrypted by default, so anything sent
from one data center to another data center is encrypted, which means that the NSA cannot really look at your stuff, unless they happen to ask Google very nicely or something like that.

Okay, this is gonna sound a bit like a Google fanboy rant. I'm not advertising Google; I just chose this because it happens to be really good for this specific solution. I don't work for Google and I'm not promoting them or anything like that. I ended up using Firebase, which is a mobile and web development platform. It sounds more fancy than it really is, well, it's actually pretty cool, but I used it as a content delivery network, which means that I used it to host static files. So I just took my HTML, JavaScript, and CSS, dumped it on a server, and let it spread throughout the world. I didn't have to care about hosting a server in Europe if there were mostly European customers; it was everywhere, I didn't have to care. I also used Google Cloud Functions, which is basically AWS Lambda, if you've ever used it: tiny little snippets of code that you run in the cloud. So, on-demand, serverless code spin-up and execution, which means that when something happens it will spin up a new instance of your code, do whatever it needs to do, and then kill itself.
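To make the "spin up, do one thing, exit" idea concrete, here is a minimal sketch of a stateless search handler in plain Python. It is not the code from the talk and it uses no real cloud SDK: the name `handle_search`, the request shape, and the in-memory `FAKE_TABLE` standing in for the database are all assumptions for illustration.

```python
import json

# Fake stand-in for the breach store; a real deployment would query a database.
FAKE_TABLE = [
    {"email": "alice@example.com", "breach": "demo-breach-1"},
    {"email": "bob@example.org", "breach": "demo-breach-2"},
]


def handle_search(request):
    """Handle one search request and return (status_code, json_body).

    The function keeps no state between calls, so every invocation could run
    in its own freshly started instance and be thrown away afterwards.
    """
    domain = str(request.get("domain", "")).strip().lower()
    if not domain:
        return 400, json.dumps({"error": "missing 'domain' parameter"})
    hits = [
        row for row in FAKE_TABLE
        if row["email"].lower().endswith("@" + domain)
    ]
    return 200, json.dumps({"domain": domain, "results": hits})


if __name__ == "__main__":
    print(handle_search({"domain": "example.com"}))
```

Because nothing is shared between calls, running one copy or many copies side by side changes nothing about how a single request behaves.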
You know, it's pretty cool. I used this to implement the API functionality, which meant that every query spawned a new function, which queried the database and then went back to the user. I'll show that in a diagram later. I used Google Datastore as a NoSQL document database sort of thing for managing user accounts, just a little silly thing to dump a bunch of data in so that people could actually log in and use the thing. And then finally I used BigQuery, which we've already kind of gone through, to store the actual data.

So this is kind of how it looked. You'd have three users coming in. They'd hit Firebase, which would hit the local instance or the local server, which might be in South Africa. It would give them the HTML, CSS, and JavaScript, and then that would talk to Google Cloud Functions. Every time a user did a search it would start a new function, so now we've got three users doing three searches with three individual functions started. They've got their own individual containerized code running; the code is not being shared between users in these individual functions. Each user has their own instance, which means that if I have a hundred thousand users or one user, it's gonna be using the same amount of time, because user one is going to have his
own thing. Which means it scales extremely well, it matches basically one-to-one, and the performance is pretty consistent. It doesn't matter too much how many people you have, because you'll have about the same response time for anything. It talks to Datastore at the top there to authenticate, and then it comes down to BigQuery, which does the fancy querying stuff. BigQuery's built for its own scaling and whatnot, so it can handle extreme amounts of load, and because it's on the same network, the talking between the functions and the database is extremely fast. Once all that's done, the function just grabs the data and sends it back to the user, and then my JavaScript and that stuff would show it to the dude, or the female, or the whatever.

So, the wins. It's Google: Google kind of knows what they're doing, right? They're pretty well versed in big data and that sort of thing. It's managed, so I don't really have to care about what happens; it just happens and it's not my problem. It's serverless, so nowhere in this process do I have a server of my own running. I don't have a virtual machine, I have nothing to patch, nothing to worry about. I don't care about firewalls and I don't care
about any other stuff. Google does it all for me; it's Google's problem, and it's all part of the whole pricing structure. So I don't have to care if I've patched my thing properly, I don't have to care if I left an SSH port open or if I left password authentication on, or whatever it might have been. That's Google's problem. It's scalable: as I demonstrated, it doesn't matter how many users you have, the performance is consistent. And it's pretty simple. It looks complicated, but when you get down to it, it's basically calling a thing and then it returns the data. It's not very complicated, and fewer moving parts means less maintenance. If
something goes wrong, you know it's either there or there. You don't have to go through a massive tree of stuff and debugging and all sorts of fancy stuff, because it's all just there. It's also maintainable, as part of the simplicity argument: it's easy to maintain if you know what's going on, because there are so few moving parts. And then I guess it's transparent, because you only have one interface that you can look at, so you can see exactly what's there and how it's working and how it's being used. You've got analytics built in with the whole Google Cloud thing, so you can see how much is being queried, by whom, when, and all
that sort of stuff. But there are also losses to this solution, and one of them is also that it's Google. If you ever try to contact Google's support for anything that's not trivial, you get hit with automated system upon automated system upon automated system, and if you want human support you've got to pay stupid amounts of money, and it's not really worth it for most of the stuff. Luckily it's automated enough that you can kind of get by, but still, it's really a pain in the butt. You lose direct control of the servers as well, so you can't really optimize your stuff, but it's not your
problem either. But if something goes wrong, you're sitting there staring at a screen that Google presents you, and you've got no means of support, so you're kind of having to sit there and wait for Google to fix this thing for you. Meanwhile your users are screaming and shouting at you. But Google's pretty good at keeping their stuff online, probably better than I would be. So did this actually work? Whoopsie, sorry. Did it work? Yeah, it pretty much worked as well as it could have. It did what we needed it to do. There was a typical kind of Google
interface where you could type in [unclear], where it starts here, that's it, and it would give you five or ten people's names and their passwords. If you wanted more, you could go query for individual records and it would give you everything for that. So it worked pretty much as I wanted it to work. It worked for everyone that wanted to use it, I didn't have any complaints around it, and people only had compliments for the thing, so I consider it a win. But I did not optimize it properly, because I only had individual breaches as tables rather than sharding it per individual domain to make it cheaper. But other than that it worked
pretty well, and it was effectively free, because the one terabyte thing gets you pretty far: the one terabyte of querying from BigQuery. You get two million function executions from Cloud Functions, so that's two million individual queries from users. If you have ten users, you're gonna struggle to hit that unless you're really hacking a lot of stuff. Datastore is basically free [unclear]. Thinking of the future, the blockchain kind of seems like an interesting solution, because they tend to be distributed. Imagine you were to take the records and shove them into a blockchain: now they're immutable and accessible by everyone, which means that this data is there for
good. It doesn't matter what's happened to it or where it's gone. So that's a thought for the future. I haven't even tried to investigate that, because blockchains are another whole thing on their own. You end up running down these rabbit holes and you only have so much time. So what happens if you've been pwned, or have you been pwned? Just head over to Troy Hunt's haveibeenpwned.com to check. It's pretty well known. It'll basically tell you where you've been pwned, if you have been pwned: whether an account of yours has been hacked or not. Obviously not everything that's ever been hacked is in there, so just
keep that in mind, but it's a pretty good indication. Well, what do you do if you've been hacked, if you find something there? Firstly, you've got to change your passwords, because if someone has your password you've got to change it to stop them from using it. And you've got to do it now, otherwise that someone can still use it. Really, like, now. I know it's a pain in the ass, and good luck if you have the one password to rule them all and in the darkness bind them. So to help with this, use a password manager, because these things store the passwords for you. You remember one
password, but you've got to make that thing strong. They have this super password, which is used to encrypt all of your other passwords, so pick like a twenty-word phrase from your favorite book or your favorite song, you know, 'we all live in a yellow submarine', whatever. That's a pretty good one: it includes spaces and apostrophes and all that stuff. Good luck brute-forcing that; you're not gonna get it. So use a password manager, and make the passwords strong and unique. It doesn't help if they're only strong: you can have the strongest password in the world, but if you reuse it and someone gets that one password, you're still hacked everywhere, because it's just one
password. So you've also got to be wary: look out for spam. Now your email address or whatever is available to the world, which means you're going to be getting more spam emails from dudes that actually have your email address, and phishers are gonna go for you a lot. I've been hacked a bunch of times. Well, not me, but companies with my information have been hacked, and you get this like five times a day: people from like Nigeria wanting to sell you their brides and stuff like that. It gets a bit much, so you've just gotta watch out for that. Some of them are
sneaky as hell too. It's like fnb.co.za: it looks legit, but it's actually like [unclear] or something like that. So just be vigilant for suspicious activity as well. If you see login attempts from China, then it's probably not you, unless you're visiting China. So if you see anything like that, then change your password. Also try to lock down your accounts as much as possible; I don't know, it depends on the service. And if you've not been pwned, then you're really a lucky bastard, because, yeah, we'll see how long that lasts. Eventually you're probably gonna get hacked. Yeah, I actually have 15 minutes left, but I have some thoughts to
add there. Sorry, yeah, is it [unclear]? I'm done. 45, right on time, that's good.
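The search flow described earlier in the talk (one function invocation per search, an authentication lookup, then the breach query, then the results go back to the page) can be sketched roughly as below. This is only an illustration under stated assumptions: `API_KEYS` and `BREACH_ROWS` are hypothetical in-memory stand-ins for Datastore and BigQuery, the records are fake, and the names `authenticate`, `run_query` and `handle_search` are made up for this sketch rather than taken from the talk or from any Google library.

```python
# Hypothetical stand-in for the Datastore lookup used for authentication.
API_KEYS = {"demo-key": "analyst@example.com"}

# Hypothetical stand-in for a BigQuery breach table. All records are fake.
BREACH_ROWS = [
    {"breach": "example_breach", "email": "alice@example.com", "password": "hunter2"},
    {"breach": "example_breach", "email": "bob@example.org", "password": "letmein"},
    {"breach": "other_breach", "email": "carol@example.com", "password": "qwerty"},
]


def authenticate(api_key):
    """Return the caller's identity, or None if the key is unknown."""
    return API_KEYS.get(api_key)


def run_query(term, limit=10):
    """Return up to `limit` rows whose email contains the search term."""
    term = term.lower()
    matches = [row for row in BREACH_ROWS if term in row["email"].lower()]
    return matches[:limit]


def handle_search(request):
    """Handle one search. Each call is independent and keeps no shared state,
    mirroring the 'one function per search' idea from the talk."""
    user = authenticate(request.get("api_key", ""))
    if user is None:
        return {"status": 401, "results": []}
    term = request.get("q", "").strip()
    if not term:
        return {"status": 400, "results": []}
    return {"status": 200, "results": run_query(term)}
```

Calling `handle_search({"api_key": "demo-key", "q": "example.com"})` returns the two fake `example.com` rows. Because the handler keeps nothing between calls, running many copies side by side does not change what any single caller sees, which is the scaling property the talk leans on.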
Any questions, perhaps, if there's time for questions? It's out there now, so, yeah, you can't be unbreached. I can't help you with that. It's out there; you should assume that it's been compromised. It was exposed for at least seven months before they took it down. That's long enough to assume that basically anyone that's interested has it. So I can't help you. Yeah, maybe, but I can't help you get it, as I said in the foreword. You can maybe look in the dark corners of the internet to see if you can find it, but that's as
much as I can help you with that. Anyone else? Oh, if you build the infrastructure up front it's not too bad, because you'd have a script, I guess, that would just break them out by the domain name, and that's pretty quick and easy. Yeah, for each breach you need to write the manual parsers and then parse them into a way that's consistent. It wouldn't be too difficult. It would have taken probably a week's worth of work to redo everything and get it fancy, but it's not too bad. It's basically a little Python script that changes the name of the table. Yeah, there
are ways to do that, some more reliable than others. Regular expressions are pretty well used for email addresses and such. Surnames and first names are harder to guess, because it can be basically any random string that you could think of. How do you know? I met a guy yesterday, his name was Lovemore. I wouldn't have put that in a database of names that I thought would exist. So it's complicated. It depends on the situation, but email addresses are easier than others, if that makes sense. Okay, thanks guys.
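The two ideas from the answers above, pulling email addresses out of raw lines with a regular expression and breaking records out by domain name, might look something like this. It is a minimal sketch, not the script from the talk: the pattern is a deliberately loose approximation of an email address, and the `breach_<domain>` table naming and the function names are assumptions made for illustration. The sample data is fake.

```python
import re
from collections import defaultdict

# Loose email pattern: a local part, '@', then a dotted domain (captured).
# Good enough for sorting dump lines, not a full address validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")


def table_for_domain(domain):
    """Map a domain to a hypothetical per-domain table name."""
    return "breach_" + re.sub(r"[^a-z0-9]", "_", domain.lower())


def shard_by_domain(lines):
    """Group raw dump lines by the domain of the first email found on each."""
    shards = defaultdict(list)
    for line in lines:
        match = EMAIL_RE.search(line)
        if match is None:
            continue  # no recognisable email address, so skip the line
        shards[table_for_domain(match.group(1))].append(line.strip())
    return dict(shards)
```

With per-domain tables, a search for one domain only needs to scan that domain's table rather than a whole breach, which is the cost saving mentioned in the talk. Names are the harder case: there is no pattern that separates "Lovemore" from any other string, so this approach only covers fields with a recognisable shape.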
[Applause]