G1234! - Abusing Password Reuse at Scale: Bcrypt and Beyond - Sam Croley

Name: G1234! - Abusing Password Reuse at Scale: Bcrypt and Beyond - Sam Croley
Uploaded: 2018-09-20
Duration: 50 min 41 s
Description: Abusing Password Reuse at Scale: Bcrypt and Beyond - Sam Croley Ground1234! BSidesLV 2018 - Tuscany Hotel - Aug 08, 2018


Mentioned in this talk

Tools used

hashcat, John the Ripper

Transcript

All right, everyone can hear me? All good. So my talk is Abusing Password Reuse at Scale: Bcrypt and Beyond. You probably don't know me as Sam; if you know me at all, you probably know me as Chick3nman, but we'll get to that. In this talk I'm going to be assuming you have some password cracking knowledge. I'm assuming you have cracked a hash at some point, or thought about it, or done enough research that I can say things and not have to explain every single word I say. But I will cover some simple stuff; I'll kind of go over it a little bit fast, but I'll go through it. We'll be talking about mega breaches, the news's favorite

new term, and database dumps and stolen hashes, things like that. We'll be talking about the concept of password herd immunity in salted hash lists, and then efficiently reducing that herd immunity; that would be the attack. I'll be doing a demo of the attack itself, and then talking about what can be done in the future to improve the attack, and then questions. So first, a little bit about me. Again, I am Chick3nman to almost anyone who knows me. I currently work at Terahash, formerly known as Sagitta; you probably know them for their Brutalis, their cracking machines. I'm a password enthusiast, I'm a hash cracker, I'm part of Team Hashcat, it basically consumes my life, and I'm a part-time chicken

enthusiast. So, chickens. So yeah, let's go over some of the basics real quick. What is a hash? You should know this by now, but just in case you don't: password hashes are the way of storing passwords in a database so you don't have plaintext passwords. Hash functions are one-way, deterministic functions: for every input you have a single output, and with that output you shouldn't be able to work backwards to the input; you shouldn't be able to know it without having it first. Never, ever store passwords in plain text if you can help it. Please always hash your passwords. This thing is annoying. Okay, so how do we crack passwords? Because password hashes are one-way, you can't go backward and

you can't decrypt like encryption, so the best way to figure out what the password was is to generate tons and tons and tons of hashes until you get the same hash, in which case you either have a collision or you have the correct input password. The tools we use to do this: the free tools are hashcat, John the Ripper, MDXfind, Hash Manager, InsidePro; there's a bunch of free tools. And then there's commercial tools like Hashstack; I believe Passware has tools, AccessData, several other companies have commercial software for this. The hardware we typically use is mostly GPUs, although some algorithms, as you'll see in a moment, are better on CPU. There are

FPGAs and ASICs and other fancy specialized hardware that can generate hashes very quickly, but those are typically pretty rare. You've probably heard of Bitcoin ASICs; they do SHA-256 times two, I believe it's two rounds of SHA-256, and they do it very quickly, but they're not useful for breaking SHA-256 hashes; they can only do that one type of math. The attack methodologies that you should probably be familiar with would be brute force or mask attacks, where you start with, you know, aaa, and then aab, and so on and so forth. Those are usually very inefficient; the key space is very large. They're not very good for complicated passwords or long

passwords; things where there's lots of symbols, things like that, do not fall easily to brute force. Dictionary and rules is probably the most used attack, I would say, past brute force. With a dictionary, you take known passwords, English words, things like that, and you run those against the list. And then with rules, because you're not likely to have every modification of every password in your list, rules allow you to modify each candidate by adding a number, replacing a's with an @ symbol, s's with fives, doing very basic stuff like that to try and mimic human behavior and how people create stronger passwords. Stronger these days tends to be longer, not necessarily more complex, because most of those patterns are very

predictable; people add ones and zeros to the ends of passwords all the time. And there are other attack methods: combinator, hybrid, PRINCE. I'm sure there's more that I'm missing, but most of those tend to be kind of on the edge. I know a lot of people spend a lot of time on them; I tend not to, so I'm kind of just going to gloss over those. So how do we make hashes stronger against this cracking? The first methodology would be salting, where you take a random string, usually unique to each hash, and you append it or you prepend it or you add it somehow to the input plaintext, so that two identical plaintexts generate two unique hashes. This solves the

problem of rainbow tables, but this does not solve on-the-fly cracking of just a few hashes. It does create a higher workload, as we'll see in the talk, but it's not really what's suggested anymore. Everything should be salted, but that's not all you need. We have iterations: if you take a hash and you hash it again a couple times, you can create, you know, a thousand MD5 hashes layered over each other, which is a thousand times harder to crack in theory. That being said, if it's a fast hash, a thousand may not be very much; you could be up in the trillions of hashes per second on your system, and a thousand's not going to be noticeable.
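A minimal sketch of the two ideas just described, salting and iterating a fast hash. The function name `salted_md5`, the 16-byte salt length, and the example password are assumptions made for illustration; the talk does not specify any of them, and MD5 is used only because it is the fast hash named in the example.

```python
import hashlib
import os


def salted_md5(password: str, salt: bytes, iterations: int = 1) -> str:
    """Hash salt + password with MD5, then re-hash the digest (iterations - 1) more times."""
    digest = hashlib.md5(salt + password.encode("utf-8")).digest()
    for _ in range(iterations - 1):
        digest = hashlib.md5(digest).digest()
    return digest.hex()


password = "password123"   # illustrative plaintext, not from any real data set
salt_a = os.urandom(16)    # one random salt per stored hash
salt_b = os.urandom(16)

hash_a = salted_md5(password, salt_a)
hash_b = salted_md5(password, salt_b)

# Same plaintext, different salts: the stored hashes differ, so a
# precomputed table for the unsalted hash no longer applies.
print(hash_a != hash_b)

# 1,000 layered iterations cost an attacker 1,000 MD5 calls per guess,
# which is still small for a hash this fast.
print(salted_md5(password, salt_a, iterations=1000))
```

The attacker's cost per guess grows linearly with `iterations`, which is why a thousand rounds of a fast hash is described above as barely noticeable.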

Iterations are good; they're just, again, not really what you need to be focused on anymore. These days the suggested hashing algorithms would be slow hashes: bcrypt, scrypt, Argon2i and 2d, I believe. The slow hashes are not just high-iteration and not just salted, but typically they are memory-hard, which means, because password cracking is a parallel operation, if you have an algorithm that requires a certain large amount of memory, you can't start that many threads. Your device will be limited by its local memory size, and you'll be limited, even if you're not consuming the whole processor, in how many things you can do at once. And so that makes it much harder to run these algorithms, and plus

they are very high-iteration. With bcrypt, the iteration count, the cost value, is actually 2 to that number of rounds, so if you have cost-8 bcrypt, as you'll see later, that's 2 to the 8 rounds, a lot of rounds. So it makes it a lot harder to run these hashes, and you'll see that they're much slower. All right, so let's talk about mega breaches and database dumps. The news went crazy about mega breaches sort of recently, but they've been around for a while. A lot of the popular ones were dumped in 2011, 2012, 2013 and only came to light really in the past two or three years, so they've been out there for a while if you know

where to look. MySpace was a very popular one; it was one of the largest for a long time, and it's still one of the larger ones. There were 360 million accounts that got dumped, and those were accounts with passwords. Now, they hashed their passwords with SHA-1, but they made some mistakes: the passwords that they were hashing were truncated and forced to lowercase, so we were able to complete the key space on the 10-character passwords in, you know, a couple weeks, no big deal. There were other passwords in there that were full passwords; some account lines had a truncated hash and a second hash that had the full password as well, in which case you could just take the first hash and take

that password, and you basically have the first ten characters of the password behind the second hash. LinkedIn was a very popular one; it got very famous because small portions of it came out right as it got dumped, and then later on the whole thing came out. LinkedIn was 164 million accounts, but not 164 million password hashes. LinkedIn had closer to 62 to 63 million password hashes, because lots of people like to log into their LinkedIn account with Facebook and Google and other accounts that use OAuth, I believe, so they didn't have passwords to LinkedIn. Dropbox is actually, so we're going to be focusing on LinkedIn and Dropbox. Dropbox was interesting because it was dumped around

the same time as these other databases, but they had already upgraded to bcrypt; they were already using cost-8 bcrypt. It's a plenty good enough hash for the time, and it's still pretty decently strong. About half the database of the 68 million was in bcrypt; the other half was a 40-character hex hash of some kind. It's believed to be SHA-1, but if it is, it's either got a pepper or an unknown salt that makes them impossible to crack right now; no one's ever successfully cracked any of the SHA-1-formatted hashes. The bcrypt ones, however, are possible to crack. There's about 32 million of them. They're a lot harder than cracking these other databases, as you'll see, and so that's

sort of where our attack kind of plays in. According to Have I Been Pwned, Troy Hunt's website, there are 300 pwned websites listed that total up to about 5.3 billion accounts. Now, clearly there are more than 300 websites getting hacked, so 5.3 billion is probably pretty conservative; we're probably looking at, you know, close to 10 billion accounts floating around in dumps and underground forums and anywhere else that would show up. So we can make use of that data. LinkedIn, again: 164 million accounts, 61.8 million SHA-1 password hashes, and according to hashes.org, who hosts some of these password lists, just the hashes usually, there were 60.5 million of them cracked. So that's a

97.92% recovery rate for LinkedIn passwords. Now, SHA-1 is a very fast hash, as you'll see with a benchmark: the GTX 1080, an NVIDIA graphics card, will do about 8.5 gigahashes, or 8,500 megahashes, per second. That is 8.5 billion hashing operations every second. So if your password is, you know, a little stronger than most people's, it's still probably in reach of someone with enough dedicated hardware and enough know-how to run a solid attack. Dropbox, on the other hand, has been around just about as long. It had 31.8 million bcrypt hashes; currently only 6 million have been cracked according to hashes.org, and that gives us an

18.98% recovery rate, much worse than the LinkedIn recovery rate, and that's because bcrypt is so much slower and so much harder to run. This was around the same time; I bet, you know, the passwords are roughly similar in strength and format, but running them and actually doing the cracking is just so intensive that no one has the power or the time to make it feasible. As you'll see there, the GTX 1080, same card as before, is only doing about 2,000 hashes per second. So we went from 8.5 billion hashes per second to about 2,000. That's 4.5 million times slower, which makes it 4.5 million times harder to crack this list.
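The arithmetic behind these figures can be checked with a short sketch. All inputs are the rounded numbers quoted in the talk; the variable names are illustrative. With these rounded rates the slowdown works out to 4.25 million, in line with the rough "4.5 million" given on stage, and the recovery rates come out near the quoted 97.92% and 18.98% (which were presumably computed from exact counts).

```python
# Benchmark rates quoted in the talk for a single GTX 1080.
SHA1_RATE = 8_500_000_000      # SHA-1 hashes per second
BCRYPT_COST8_RATE = 2_000      # cost-8 bcrypt hashes per second

slowdown = SHA1_RATE / BCRYPT_COST8_RATE
print(f"bcrypt is about {slowdown:,.0f}x slower per guess")

# bcrypt's cost parameter is an exponent: cost 8 means 2**8 rounds.
rounds_at_cost_8 = 2 ** 8
print(f"cost 8 -> {rounds_at_cost_8} rounds")

# Recovery rates from the rounded hashes.org counts quoted in the talk.
linkedin_recovery = 60.5e6 / 61.8e6
dropbox_recovery = 6.0e6 / 31.8e6
print(f"LinkedIn: {linkedin_recovery:.1%}, Dropbox: {dropbox_recovery:.1%}")
```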

And they're salted, so it gets even harder, as you'll see here with the concept of password herd immunity. When loading and cracking a large password list like LinkedIn SHA-1, if it is an unsalted list, in theory, to do the math, you still have one salt. It's null, it doesn't exist, but you have one salt that you're generating against. For each new salt in the list, you have to do all of the hashing again and again and again as you add more salts. So the workload for attacking LinkedIn SHA-1, for a basic dictionary attack, would just be the number of lines in the dictionary times one. You only have to hash it against that one salt, and then you can

just, the rest of it's comparing to all of the hashes in your list, regardless of how many. With uniquely salted lists, like large bcrypt lists, or I believe vBulletin forums actually have a very unique salt, it's 30-some-odd characters and random, for each individual unique salt you must do all of that hashing again for every candidate in your dictionary. For bcrypt's 32 million salts, or Dropbox's 32 million salts, that same very basic dictionary attack just got 32 million times larger in its amount of work. So if we have the very basic RockYou 10,000-password dictionary, I have 320 billion hashing operations to do for the same attack that would have taken 10,000 operations on LinkedIn, and

that's on the slow hash. So we have 320 billion to do at 2,000 a second at best, so that's far from feasible. It would take years; it's just not the same, whereas with LinkedIn SHA-1 it would take, you know, a fraction of a second. The cost of attacking the whole list together because of these salts is what makes salting good; it's what makes it useful and makes the actual password hashes stronger. But it really only stands up in that list, and that is where the herd immunity concept sort of comes from: if you have a list with lots and lots of accounts and each one has a unique salt, the more accounts you add, the more work it is for an attacker to

crack the whole list together. Single accounts still have one salt, so if you're targeting, that's not a big deal, but if you are an opportunist and you are trying to crack as many as possible as fast as possible, more salts will make it much harder. If you're a targeted attacker and you want to attack one person at a time, then you probably know a good bit about your target. You probably have one or two hashes to attack, depending on how many accounts they have, and that really plays out to, like, one or two salts. You probably have quite a bit of time and money to invest in attacking this person. Whereas an opportunist, someone who steals large

dumps of accounts and then uses them to just scrape as many as they possibly can out of it and then, I don't know, crack Netflix accounts and sell them online, someone who's doing this maliciously but isn't really dedicated to attacking any single person: these attackers are typically looking for the low-hanging fruit, the easy passwords, passwords that are heavily reused, you know, 123456. They don't care who it is; they just care about getting as many accounts as they possibly can. They're not going to spend much time on a strong account or a strong password hash, because it's just not profitable for them. They only want as many as possible as fast as

possible, and anything that they miss they just throw away. Because a large salted list is so much harder for an opportunist, they tend not to go for them. The Dropbox list has seen very little attacking; no one's really been running it because it's just so difficult. Those 6 million cracks, I believe most of them actually came from a cluster of FPGAs owned by TychoTithonus. He's a friend of mine; he cracked most of those over the course of, I believe, a month. They took a very long time, and that was FPGAs that were specialized for bcrypt, so he was moving pretty quickly. With large lists of salted hashes like this,

the time spent on them is just not feasible; you can't crack them classically. So the way to solve that is to remove as many of the easy passwords as you possibly can from the list, to reduce that herd immunity, to try and remove as many of the salts as you possibly can, because the more salts you remove, the more you can do to the remaining hashes. They get weaker the less there is in the list. So that's where the attack comes in. I'm calling it offline credential stuffing, because it's very akin to the idea of just, you know, correlate a password with a username, shove it into, I don't know, the Netflix login page, and if it works, great, sell the

account online for some Bitcoin: the classic credential stuffing malicious activity that people do. The way this works is you have source data: databases with fast hashes, usually, things like LinkedIn. We've cracked 97% of it, so you have plenty of accounts that have a username or email or otherwise on the line, and then you have the actual plaintext password. You have target data: typically this is databases that no one else is attacking. This could be something private to you, this could be something that you're doing for an audit, it could be anything, but these are going to be slow, uniquely salted lists, just like the Dropbox bcrypt. The larger the list, the better this actually works, surprisingly. And what you do is

you take data from the already-cracked lists and you correlate it with the uncracked list, the target data. When you do that, you take a plaintext password and a slow password hash, you stick them together, and then you run just that candidate on just that hash. As for the reduction of workload from doing just one candidate per just one hash: say you had a perfect scenario, you had 10,000 lines in your source data set and 10,000 lines that you were targeting. If you just took that source data set as a dictionary, threw it in hashcat and said target these 10,000 hashes, your total amount of work would be about a hundred thousand, or sorry, a hundred

million hashes. That's a lot of work, especially because each one's uniquely salted, so you have to do that 10,000 lines against all of the 10,000 targets. If you correlate data between two data sets, a source and a target, you bring that down to maybe two or three candidates per hash, max. But in a perfect world where 10,000 lines equals 10,000 lines, you can bring that down to one candidate per hash, and that's per salt as well, so your total workload goes down to about 10,000 from a hundred million. It brings it down incredibly, and it does so very quickly. And as you crack these hashes through this correlation, the ones that you miss you can now spend more time on, because

there's not a whole lot in the way. They're not getting blocked up by other hashes with other salts that are causing more work. They're easier; it makes the other hashes in the list weaker. So what I did is I took LinkedIn, I took all of the cracks that I had; it actually came out to 112 million cracked lines. I believe what happened was the version of the data I got had been mixed with something else, not sure what. And then I took the Dropbox bcrypt. There are 32 million, but they were split into two files, and one of the files was a little messed up, so I skipped those. So I ended up with

24.8 million email and bcrypt hash pairs, and I correlated the emails in each list, and I found that in LinkedIn and in Dropbox there are 3.5 million perfect email matches. So 3.5 million accounts exist across both LinkedIn and Dropbox. I then took all of the accounts that had cracked passwords in LinkedIn; that came out to 3.2 million. I took those accounts, took those passwords, and put those passwords next to my bcrypt hashes as my targets, and as you'll see,

I can crack lots and lots of bcrypt hashes on my laptop CPU, a lot faster than you typically would be able to on GPUs, on large clusters, et cetera. It'll speed up in a moment here as it catches up on its own workload, and it'll go much more quickly than I expected when I first did this. When I first did this, I thought I broke it; I thought it was just spitting it back to me. Yeah, so what this is doing is each one of these hashes is only tested against the one candidate. It's only doing one operation per salt, and those candidates tend to be right, because people reuse passwords like crazy. The

number of people reusing their own passwords, or variations thereof, is surprising. And so we took a list that had 32 million, we found 3.2 million matching accounts that had cracked passwords that were already known, and we threw them in here. And it will very quickly, I actually know how many it will find, it will find about 2.2 million to be correct. So in minutes, not hours, days, or years, on a CPU in a laptop, you can grab 2.2 million bcrypt hashes without really doing anything, and then you have 30 million left, not 32 million hashes. So you have effectively reduced the amount of work you have to do on the rest of them. So if you take the rest of the list

and then attack it classically, it's easier. You can crack more, and the more you crack, the easier it gets. So by reducing it in this way, you've made everybody else weaker. I mean, these people reused their passwords; maybe you didn't, but your account is now a bigger target for me. And this is cost-effective for an opportunist: if I don't want to attack a whole list and I don't really care to target anybody specific, I've just made it cost-effective to actually keep attacking that list and do a lot more to it than I could previously. And this is one database: my source database was LinkedIn, my target was Dropbox. If I had five, six, seven billion lines of

source database, I could have a match on every email in a list, and then if 70% of the people are reusing their password, I've cracked 70% of that list without doing much at all. So it's, you know, it's not the most serious thing; it does require there to be commonalities, but it does crack a lot of passwords, and it doesn't really take a whole lot of work to do it, and it reduces, again, the herd immunity of the rest of the list. We're going to stop that because my laptop is quite hot.
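The correlation step described above can be sketched roughly as follows. This is not the tool shown in the demo: the record layout, the example accounts, and the helper names (`slow_hash`, `make_target`, `correlate_and_test`) are hypothetical, and PBKDF2 from the Python standard library stands in for bcrypt purely so the sketch runs without third-party packages.

```python
import hashlib
import os

ITERATIONS = 1_000  # stand-in work factor; NOT equivalent to bcrypt's cost parameter


def slow_hash(password: str, salt: bytes) -> bytes:
    """Stand-in for a slow, uniquely salted hash such as bcrypt."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def make_target(password: str) -> tuple[bytes, bytes]:
    """Build a (salt, digest) pair the way a target site might store it."""
    salt = os.urandom(16)
    return salt, slow_hash(password, salt)


# Source data: email -> plaintext already recovered from a fast-hash dump.
source = {
    "alice@example.com": "hotdog22",
    "bob@example.com": "stephanie123",
    "carol@example.com": "correct horse",
}

# Target data: email -> (salt, slow hash) from an uncracked, salted dump.
target = {
    "alice@example.com": make_target("hotdog22"),      # reused password
    "bob@example.com": make_target("something-else"),  # not reused
    "dave@example.com": make_target("unknown"),        # no match in source
}


def correlate_and_test(source, target):
    """Test exactly one candidate per matching account: one hash operation per salt."""
    cracked, attempts = {}, 0
    for email, (salt, digest) in target.items():
        candidate = source.get(email)
        if candidate is None:
            continue  # no correlated account, leave it for a classic attack
        attempts += 1
        if slow_hash(candidate, salt) == digest:
            cracked[email] = candidate
    return cracked, attempts


cracked, attempts = correlate_and_test(source, target)
classic_attempts = len(source) * len(target)  # every candidate against every salt
print(cracked, attempts, classic_attempts)
```

With the talk's idealized numbers, the same comparison is 10,000 candidates times 10,000 salts (100 million slow hashes) for the classic dictionary run against 10,000 for the correlated run.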

Nope, step all the way back through. Right, so again, of 24 million lines that I was targeting, I pulled 3.5 million out that had commonalities with my source data, and I cracked 2.2 million of those in, you know, a couple minutes, not on the laptop but on a beefier CPU, and that was a couple minutes of just running it straight through. This is very akin to, if you're familiar with it, John the Ripper's single mode, but instead of applying rules or anything else, all this does is run the hashes straight. This can be improved surprisingly well, actually. Of course, adding more source data: if you've got billions of lines, you're suddenly Troy Hunt and you want to be a bad guy, you

can very easily source more and more and more data and use that data to correlate more lines, and then, if more lines come with more cracks, reduce the security of the list even further. You can correlate across different columns; this actually works pretty well, I've already tried this some. That was just emails; not everybody uses the same email everywhere. Sometimes they use the same username or the same phone number, so if you correlate across other columns in the data, depending on what data you have, you can find, you know, collisions between lists that you maybe wouldn't have initially found, and then of course reduce the password list even more. You can stem emails. I actually thought about

this and didn't really play with it much. I talked to Ryan C the other day and he said, hey, you know, why don't we just stem the emails? You could do that, and I guess I haven't really personally done this, but a lot of people, especially more technically inclined people, will use custom emails per account, so they'll have, like, test+linkedin on their LinkedIn account. Now, Gmail, or Google, ignores everything after the plus, so that's technically the same as test@gmail. Same with adding a bunch of periods. So in theory those are the same email address, so you can stem emails and correlate that way: more collisions. You can reduce the total key space. Say you have a ton of data and

you've correlated accounts like crazy, and now your workload's too big again: reduce it. You can throw out site-specific passwords. Say you go through the password list and everybody's using, what, why does it do that, say everybody's using, you know, "linkedin" as part of their password, and you're trying to crack accounts on MySpace or something else. There's no reason to have "linkedin" in there; it's not going to be useful for you. They're not going to reuse linkedin123 on their, you know, Google account. You're not going to do that. So you can throw out accounts that have that problem; you can throw out candidates. You can add rules. This is something I've already explored, but it's bcrypt, it's

slow, it's not very easy. If you have a large list of candidates and you run through and, you know, 30% of the correlated candidates crack, it's probably just because you're only doing exact identical matches. If you have someone who has, you know, password123, maybe they're using password1234 or password12345. So if you add rules to the individual candidates, you can still reduce the total amount of work to 5, 10, 15 candidates per hash, per salt, and still reduce it massively from trying to run a normal classic attack against the list, and your recovery rates will go up. I did explore this, but again, very slow; it didn't bring my rates up enough to be

economical, but it does work. So yeah, there's a few ways you can make this better. Of course, more source data will always work better: if you have more source data, you'll always have more collisions and you'll always be able to reduce the total list further. And then once you reduce the list, of course, you can run it classically; you can put it in hashcat and just let it run, you can do whatever it is you want to do with it. So yeah, that's the attack. I can actually make it faster, if you'd like to see the demo again. If you give me a moment; I forgot a flag. So that's how fast it normally runs.
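The email-stemming idea mentioned among the improvements can be sketched as a small normalization function. The speaker says he had not implemented it, so this is purely illustrative: `stem_email` is a hypothetical helper, and applying the plus-tag and period rules only to `gmail.com` and `googlemail.com` is an assumption made here to stay close to the Gmail behaviour described in the talk.

```python
GMAIL_DOMAINS = {"gmail.com", "googlemail.com"}  # assumption: only these get stemmed


def stem_email(address: str) -> str:
    """Reduce an address to a canonical form so variants correlate across lists."""
    address = address.strip().lower()
    if "@" not in address:
        return address  # not an email; leave usernames and other columns untouched
    local, _, domain = address.rpartition("@")
    if domain in GMAIL_DOMAINS:
        local = local.split("+", 1)[0]  # drop everything after the plus
        local = local.replace(".", "")  # periods in the local part are ignored
    return f"{local}@{domain}"


# Variants of one mailbox collapse to the same correlation key.
print(stem_email("Test+LinkedIn@gmail.com"))
print(stem_email("t.e.s.t@gmail.com"))
```

Keying both the source and target lists on the stemmed address, rather than the raw one, is what would produce the extra collisions described above.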

Those are real-time cracks on my laptop CPU for cost-8 bcrypt hashes. So yeah, that should actually complete in a couple minutes; I believe it should be about four or five minutes, and it will have found 2.2 million of the 3.5 million hashes in the list. Whereas if you took this to hashcat and took those passwords and put them in a dictionary, it would take you something close to 35 years, so nowhere near as quick as this. It's still getting too hot, so I'm still going to stop it; probably not a good idea to run this on a laptop. It works, it's just, ah, you know. So let's go all the way back here. All right, questions. Here we go, who's

got a question? Do we need a mic for the question? So, you mentioned filtering the list based on site-specific passwords, but have you thought about saying, okay, this LinkedIn password is linkedin123, how would I change that to dropbox123? So that would be a little hard to do from the technical standpoint, because what you're talking about there would be stemming the passwords themselves and saying, you know, this is site name modified by 123, and so on and so forth. Your rules are the second portion of that; they are the "add 123". But the actual dictionary that you're pulling and adding to, in

that case, would be pretty easy. If it's just site name, that's not a big deal, but in other cases where, you know, maybe their password is a child's name, whatever, and you only know one of maybe five children, you really can't figure out what the other password candidates would be. I have thought about that, you know, yes. It's just that technically you can't go backward: you can stem the rule modification off, but stemming the actual source is a lot harder. Anybody else? Over there. So, kind of along the lines of, you know, Dropbox versus, you know, Facebook or something like that, what about character substitution intelligence? Did you ever look at that, or see any correlation where people

would use slight variations? So what you're talking about is where they have, you know, just a normal password and it slightly varies between sites. Yeah, that does come up a fair amount. If you're reusing a password and we know you're reusing it, you're very likely reusing a portion of it elsewhere as well. The problem, again, with that is how many modifications is too many. If I start getting, you know, thousands of rules in, I haven't solved my problem, I've just recreated it. So the idea that you want to reduce the number of candidates really means you have to assume it's either going to be identical or very close, or ignore it. And you can do

that again later when you run the classic attacks, but for this it really doesn't help too much to modify and add rules and things like that. It's better to just sweep as many as you can out now and then fall back to classic attacks.
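The trade-off in the last two answers, a handful of variations per account versus thousands of rules, can be illustrated with a deliberately capped candidate generator. Everything here is hypothetical: the function name `bounded_candidates`, the three suffixes, and the cap of five are illustrative choices, not rules from hashcat, John the Ripper, or the speaker's tool.

```python
def bounded_candidates(password: str, source_site: str, target_site: str,
                       max_candidates: int = 5) -> list[str]:
    """Return the exact password plus a few close variants, never more than the cap."""
    candidates = [password]

    # Site-name swap, e.g. linkedin123 -> dropbox123 (case of the site name is not preserved).
    idx = password.lower().find(source_site.lower())
    if idx != -1:
        candidates.append(password[:idx] + target_site + password[idx + len(source_site):])

    # A few very common "one more character" modifications.
    for suffix in ("1", "!", "123"):
        candidates.append(password + suffix)

    # De-duplicate while keeping order, then enforce the per-hash budget.
    return list(dict.fromkeys(candidates))[:max_candidates]


print(bounded_candidates("linkedin123", "linkedin", "dropbox"))
print(bounded_candidates("hotdog22", "linkedin", "dropbox"))
```

Because every candidate costs one slow hash per salt, the cap is the whole point: five candidates per account multiplies the correlated workload by five, while an unbounded rule set recreates the original problem.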

This might be a side note, but I was just curious about the FPGAs; you mentioned that they cracked some of it. Do you know what kind of performance, like hashes per second, you could get there these days, and are there commercially available ones? So I do know how fast the FPGAs that he was running were. They are ZTEX 1.15y boards; they have four Spartan-6 chips on them, and I believe they do anywhere between 43 and 45 thousand attempts per second. That was across a cluster of 15 boards, I believe, so 15 times four chips. Now, these are older FPGAs; they're not new, they're not fancy, they don't have fast memory. Newer FPGAs could be much, much

faster. The major problem is cost, and getting someone to write the software for them. The 1.15y boards, the old ZTEX boards, are supported by John the Ripper. They have bcrypt support, DES I believe, and a couple of SHA-256 Unix ones, I believe; they're just supported out of the box by that tool. There are commercially available FPGA units, or ASICs or otherwise; I think the only one I can think of right now is the Tableau accelerator. It's a forensic accelerator; it works with Passware and a couple other pieces of software. It does not do bcrypt, I don't believe, so I don't know how fast it would be, and it's also, again, kind of outdated. And making a new one costs a

whole lot. They're not very cost-effective, so most people tend not to. They could be fast; if you have the money, they could be fast.

So, to any extent, have you checked whether these reused passwords were of the 123456 variety, or were these some actual solid passwords? So actually, these were solid passwords. I was very surprised by the strength of these reused passwords. We can go ahead and, well, I'll just start it back up and then we'll stop it and look at some of them. So if we stop it, these are decent passwords. I mean, stephanie123, maybe not the world's strongest, but some of these are just absolute gibberish. hotdog22 is definitely a very easy password. There are definitely passwords in here, though, that I was very surprised I was cracking. Some of these, actually most of these, look pretty simple; there's a

couple hard ones here and there. Again, for this list, maybe not, but there were times when I was testing this. I also tested this on the Edmodo data set, which was a recent, very large bcrypt list at cost 12. Very effective there. However, the passwords there tended to be site-specific; that specific website was mostly students and teachers, I believe, and most of the passwords we ran across had "edmodo" in them. So instead of the source being site-specific, the target was, so the ones we were getting were very, very strong. But that was abnormal; it's because they were using a password manager, generating one and then reusing it everywhere, or something like that. That's not typical. Typically they're on the

little easier side, but I have seen it where, you know, only the strong ones get reused, because everybody else uses something so trash that it's not come up. I believe you had a question? Okay. Anyone else? We've got one in the back. So, recognizing the ways that things can be broken, other than using absolutely random, totally different passwords that have no basis in reality, is there anything where there can be some level of reuse that isn't going to be easily cracked, or does it have to be absolutely, totally random to get what you're trying for? So in theory, the best option is always going to be complete randomness, longer than 13 characters, or actually roughly exactly 13 characters.

You can do things to make it easier. You can do formats like Diceware, where you have four or five words that build a passphrase that's very long, very hard to attack, very hard to brute force. They're very strong, Diceware passwords, and being easy to remember is the goal. They're very easy: you know, four or five random words that have no correlation, read it four or five times, and you can probably recite it fast enough to type your password. For things like that, you can have a couple words be the same and maybe change the last one, but really you shouldn't be reusing any part of passwords across websites. Websites should have unique passwords that are

long and random, and that's because you don't know if the website is hashing your password. If tomorrow the website gets breached and everything's plaintext, then as strong as your password was, everybody knows it. So you're trusting the site to be secure and sort of taking that security into your own hands the best you can. Anybody else? I have no idea how much time it's been because my timer got reset, but I think we're good on time. Okay, well, if no one has any questions we can stream the passwords by and... okay.
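The Diceware-style passphrases described in the answer above can be sketched in a few lines of Python. This is a minimal illustration, not anything shown in the talk: the short `DEMO_WORDS` list and both function names are hypothetical, and a real Diceware list has 7776 words (five six-sided dice per word).

```python
import math
import secrets

# Hypothetical stand-in wordlist for illustration only.
# A real Diceware list has 7776 entries (6**5).
DEMO_WORDS = ["apple", "brick", "cloud", "delta", "ember", "frost", "grain", "harbor"]


def make_passphrase(words, count=5, sep=" "):
    """Pick `count` words independently and uniformly with a CSPRNG."""
    return sep.join(secrets.choice(words) for _ in range(count))


def passphrase_entropy_bits(list_size, count):
    """Entropy in bits of `count` independent uniform picks from `list_size` words."""
    return count * math.log2(list_size)


if __name__ == "__main__":
    print(make_passphrase(DEMO_WORDS))
    # Entropy of five words drawn from a full-size 7776-word list.
    print(round(passphrase_entropy_bits(7776, 5), 2))
```

The entropy figure only holds if the words really are chosen at random; a phrase the user picks by hand does not get the same guarantee.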

Now, you mentioned that there are these lists, your strong bcrypt, where it's not easily, entry-level crackable by just common people. Have you heard any inference that there are some attackers that are playing the long game and continuing to attack some of these more difficult patterns, or is it really something that only someone like you would have the endeavor to go into? So there was one list that I have seen come up time and time again that's very difficult. It's very slow hashes, and that was a Bitcoin-related forum that got breached. What I've seen is mostly people that are

malicious. They're looking to break into accounts, they're trying to crack these hashes, and they do it on a one-by-one kind of basis. They will pick a target, pull their hash, try to crack it, and move on. So it's still not quite the massive-list, opportunistic style; they're still targeting. But they are coming back to the same lists over and over and over again, and with something like this, if they could mass crack that list, it would be much more devastating, because it would take them much less time to attack many more people. So it hasn't happened. Most people have left Dropbox alone; that's why I picked it. Edmodo came out and I don't think really

anyone noticed. The hashes have sat untouched on the websites for a while now. I don't think hashes.org has even loaded it, because it's just so difficult to crack. I mean, if you're doing it for sport, great, you can spend weeks doing it and spend a bunch of money for nothing. If you're doing it maliciously, there's not a whole lot you can do. And anyone else, I don't know why you're doing it. You know the eight-card rigs for hashcat, about how much are those? If you wanted to build those at retail right now, how much do those go for? You would probably want to talk to sales at my company about that,

because I know I don't know the right number, so I'm not gonna guess. I mean, if I built one, I have no idea. But yeah, the eight-card and ten-card and however many cards there are now are great. There are just things like this, where bcrypt comes into play, and you can only throw so much power at it at a time before you have to find something new, and so that's what we did. Thank you. Sure. Anyone else? No? Okay, we're gonna watch a bunch of passwords stream, but I'll make it a little slower so it's easier to read them. There are usually some good ones. I hope that's... not slow enough.

There we go. So these are roughly the level of password that we're seeing in the Dropbox list that came from the LinkedIn list. These were all cracked in LinkedIn, then correlated over, and then re-cracked in Dropbox as the bcrypt list. Some are stronger than others; some are quite weak. I imagine maybe some of these accounts are bots. There's a surprising amount of cross-person password reuse for passwords that don't make sense. Every now and then there will be a random password where there are 50, 100, 200 accounts with that same random password, so I have to imagine that's bots. But it does show up, and it does crack pretty well. Yeah, most of these are pretty weak.
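The "cracked in LinkedIn, correlated over, re-cracked in Dropbox" flow can be sketched roughly as a join on a per-user key. This is a minimal sketch under stated assumptions, not the speaker's tooling: `email_key`, `correlate`, the SHA-1 email key, and the tuple layouts are all hypothetical choices made for illustration.

```python
import hashlib


def email_key(email):
    """Hash a normalized email so raw addresses need not be kept (assumed scheme)."""
    return hashlib.sha1(email.strip().lower().encode("utf-8")).hexdigest()


def correlate(cracked_source, target_rows):
    """Pair each slow target hash with the plaintext cracked for the same user.

    cracked_source: iterable of (email_key, plaintext) from the fast-hash list,
                    in the order the plaintexts were cracked.
    target_rows:    iterable of (email_key, slow_hash) from the slow-hash list.
    Returns a list of (slow_hash, candidate) pairs in source crack order.
    """
    targets = {}
    for key, slow_hash in target_rows:
        targets.setdefault(key, []).append(slow_hash)

    pairs = []
    for key, plain in cracked_source:
        # Users absent from the target list simply produce no pair.
        for slow_hash in targets.get(key, []):
            pairs.append((slow_hash, plain))
    return pairs
```

The point of the join is that each expensive bcrypt hash is tested against one likely candidate instead of a whole wordlist, which is what makes a slow hash list tractable at scale.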

I would show Edmodo, but that one is layered and has a number of issues. Edmodo chose to do cost-12 bcrypt probably a couple of years after they chose to do MD5, and to fix their database they upgraded by just hashing the MD5 with bcrypt. So whenever you log in, your password is hashed with MD5 and then again with bcrypt. So when they come out, you can either do that layer internally or you can just crack them with a list of MD5 hashes; you don't even have to have the password. That usually just looks like gibberish when I run it, because this tool will automatically parse that out for me. This tool was written by the creator of MDX

find, Waffle. I believe it was edited a little bit here and there by Hopps, one of the other developers. It is custom to this, and they actually used it in the Ashley Madison talk that they gave. This is a new, revised version of something they call bcval; it's a bcrypt validator. Right now I've just slowed it down. What I've done is have it do case twiddles: it is modifying the beginning and end character cases up and down, and it is doing 25 different variations throughout the password before skipping and moving on. So that creates enough work that you don't see just a blur on the screen. You have a question? Is this

tool open source? bcval is not. I don't believe MDXfind is either; you'll have to talk to Waffle about that. But I believe both are posted. At least MDXfind is posted on hashes.org. You can find that; it's pretty easy to use, it's free. It's just, like hashcat was shipped, there is no open source to it right now. I think he's talked about it, I just don't know. Yes? So to do the correlation you have to have the usernames, so do the breaches include the user IDs associated with these hashes? I'm not sure I understand. [inaudible] Like you mentioned, you have

like, these various breaches on the internet with the hashes. Do they include the usernames associated with those hashes? Okay. Because otherwise you end up saying, okay, someone in the world used "Pepsi7" as a password. Yeah. And then what? So typically, for research and the things that I do, I strip all of that information out and I throw it away. There is some kind of gray area, kind of a legal gray area, on whether or not you can have this data. It is still technically stolen data even if it's been posted publicly. So typically information like that, especially among the more legitimate researchers, is stripped out and thrown away, and what you are left with is just hashes or just

passwords or, if you're Troy Hunt, just emails. As far as I know he doesn't really save anything else from the databases. But you can find them, typically, wherever the dump came from, before it got parsed by as many people as wanted to play with it, all the researchers and stuff. The original versions, or at least nearly original versions, will have all of the data in them, and you'll have emails, usernames, IDs, phone numbers, anything that got dumped. Yeah? Sure. Thanks. Looking at the list of the passwords as it's scrolling across, I see clumps of very similar passwords. Can you explain? So that's an artifact of when I cracked the LinkedIn list. These passwords, as they

appear in the correlation list, are sorted by that column, not the hashes, and it was sorted in the order that they cracked in hashcat. So I cracked the SHA-1 from LinkedIn and left it in that format, and then correlated across using the hash:plain and the email hash, and then the email hash for Dropbox. So what you're seeing is a little bit of leftover clumping, because that's how hashcat broke them the first time, and now these hashes are being broken by those same passwords in the same order. It is interesting, though; I probably wouldn't have noticed that. Yes, over here.
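Going back to the Edmodo layering and the case twiddles described a few answers earlier, the candidate side of that process can be sketched as follows. This is an illustrative guess at the idea, not the bcval tool: the variant ordering, the `limit` of 25 applied this way, and the assumption that the inner layer is the lowercase hex MD5 digest are all hypothetical. The bcrypt check itself is left out, since it would need a third-party library.

```python
import hashlib


def toggle(s, i):
    """Return s with the case of the character at index i flipped."""
    return s[:i] + s[i].swapcase() + s[i + 1:]


def case_twiddles(password, limit=25):
    """Yield up to `limit` distinct case variants, the original first.

    Assumed order: original, first char toggled, last char toggled,
    both toggled, then each single interior position toggled.
    """
    if not password:
        yield password
        return
    last = len(password) - 1
    candidates = [
        password,
        toggle(password, 0),
        toggle(password, last),
        toggle(toggle(password, 0), last),
    ]
    candidates += [toggle(password, i) for i in range(1, last)]

    seen = set()
    for cand in candidates:
        if cand not in seen:
            seen.add(cand)
            yield cand
            if len(seen) >= limit:
                return


def md5_prehash(candidate):
    """Inner layer of an MD5-then-bcrypt scheme (assumed to be the hex digest)."""
    return hashlib.md5(candidate.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    for variant in case_twiddles("pepsi7"):
        # Each line is what would be handed to bcrypt for verification.
        print(variant, md5_prehash(variant))
```

This also shows why a plain list of MD5 hashes works as a wordlist against such a layered scheme: the string passed to bcrypt is the digest, so the original password is never needed.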

So obviously I'm seeing that a lot of the passwords are a lot shorter. Yeah. And one of the passwords I saw rolling by that caught my attention was just the Pokémon Pikachu, and I was thinking that if that wasn't in the dictionary used as a word, it might as well just be a random array of characters. Obviously it's still too short to be a strong password, but would you say that something like that is generally better than just a straight, normal English dictionary word followed by numbers? So in theory, if you have a rare word, however you might go about classifying a word as rare, a normal dictionary attack and maybe even dictionary attacks

with rules won't find them. If they're short it doesn't matter, though, because then you fall under masks and brute force and everything else. The other thing is, sometimes words look a little more random or a little more rare than they are statistically. There are many attacks, like some of the cut attacks, where you take two halves of the same word, split them across the whole list, and then start shifting that list, and you'll find words that are valid that you would never have thought of. So there are attacks that may find it here and there. Of course, if your word is a word, it's probably in a dictionary, but again, if it's not, then yes, you

should avoid most of the attacks that will attack your password. All right, thank you. Over here? Over there. So you mentioned that one of the lists was not able to be cracked because there's a pepper in it. Yes. Based on your experience going through all these passwords and how they break down, do you think, in your personal opinion, that developers should, even if in theory it's not very imposing, have a hard-coded pepper in their code that maybe won't be dumped in a database dump? I don't personally believe in hard-coded, because if I'm an attacker and I've

broken enough to steal your database, I've probably broken enough to steal the pepper. Do you see that, I guess, and that might be outside of your range, do you see that in practice? So I have actually seen a few databases where a hard-coded salt or pepper was used, and in almost all of them it was either short enough that we eventually broke it, or it was found by the hackers and then dumped with the database. And now, can you speak to how you found the pepper? I think we're a little out of time, but I'll talk to you outside about it. And yeah, they're typically pretty easy to find, and the dumpers have

found them for us. Thank you. Of course. So I think that's it.
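As a footnote to the pepper exchange above, the remark that a pepper "short enough" eventually gets broken can be illustrated with a small sketch. Everything here is hypothetical: the `sha256(pepper + password)` construction, the lowercase alphabet, and the three-character limit are assumptions for the demo, and the method the speaker's team actually used is not described in the talk. The sketch assumes the attacker knows the plaintext of at least one account, for example one they registered themselves.

```python
import hashlib
import itertools
import string


def peppered_hash(password, pepper):
    """Hypothetical scheme for illustration: SHA-256 over pepper + password."""
    return hashlib.sha256((pepper + password).encode("utf-8")).hexdigest()


def recover_pepper(known_password, known_hash,
                   alphabet=string.ascii_lowercase, max_len=3):
    """Brute-force a short pepper using one account whose password is known.

    Tries every pepper of length 0..max_len over `alphabet` and returns the
    first one that reproduces `known_hash`, or None if none matches.
    """
    for length in range(max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            pepper = "".join(combo)
            if peppered_hash(known_password, pepper) == known_hash:
                return pepper
    return None


if __name__ == "__main__":
    leaked = peppered_hash("hunter2", "xy")
    print(recover_pepper("hunter2", leaked))
```

The search space grows exponentially with pepper length, so a long random pepper resists this; that does not help if, as noted above, the attacker can also steal the code or configuration that holds it.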
