Cracking 936 Million Passwords

Name: Cracking 936 Million Passwords
Uploaded: 2025-12-08
Duration: 43 min 3 s
Description: Jeff Deifik shares insights from cracking 936 million passwords sourced from public breaches, exploring the practical hardware, software, and algorithmic challenges at scale. He compares CPU versus GPU efficiency using John the Ripper and Hashcat, details custom rule development and wordlist curatio

BSides Las Vegas43:0319 viewsPublished 2025-12Watch on YouTube ↗

Speakers

Jeff Deifik

Tags

CategoryTechnical

TopicCryptography

StyleTalk

Mentioned in this talk

Tools used

hashcat John the Ripper

Service

Have I Been Pwned

About this talk

Jeff Deifik shares insights from cracking 936 million passwords sourced from public breaches, exploring the practical hardware, software, and algorithmic challenges at scale. He compares CPU versus GPU efficiency using John the Ripper and Hashcat, details custom rule development and wordlist curation, and presents statistical analysis of cracked passwords—including 92% recovery rate and breakdowns by length, character class, and structure—to inform effective password-cracking strategies.

Show original YouTube description

Identifier: 7PHURF Description: - “Cracking 936 Million Passwords” - Shares personal experience cracking a massive dataset of 936 million passwords. - Discusses challenges of password cracking at scale. - Covers hardware setups, tools, wordlists, and custom rules used. - Explains CPU vs GPU tradeoffs in cracking efficiency. - Presents statistics on cracked passwords and defenses against password cracking. - Notes success rate of 92% passwords found. Location & Metadata: - Location: PasswordsCon, Tuscany - Date/Time: Tuesday, 17:00–17:45 - Speaker: Jeff Deifik

Show transcript [en]

Okay. So, we like to welcome everyone to Passwords Con to the next uh talk that we're having. Just a reminder, um we'd like to thank our diamond sponsors, Prismacloud Inventor, and our gold sponsors Adobe and uh Simgrip. A reminder about um your phones. Please put them on silent so that the speaker is uh not interrupted or anyone else in the audience. If you want to take photos of the speaker or his presentation, are you okay with that? >> Sure. >> He's fine with that. Please remember the policy not to take anyone else that is attending without their consent both in here and and outside. So cracking 936 million passwords, how is that going to work? Well, we're going to have

Jeff Dick take us through that. So, Jeff, you're up. Thank you. Can everybody hear me? >> Excellent. So, I'm Jeff. This is my talk. Um, if you go to jdify.com or dy.com, you can see a PDF of all these slides. So, a little bit about me. Nothing very exciting. Read it at your leisure. If anybody has any questions during the talk and they're short questions, ask me. If they're long questions, save it to the end. So, there's a thing called Have I Been Poned? You probably all heard about it. They have a way of downloading all of their passwords. You can't download them in plain text, but you can download them hashed. So, I got these in NT landman

format and 936 million. Um, so it turns out you need a lot of RAM to crack passwords. So you can either use a CPU or a GPU. I found that a CPU worked much better. So I started off with 128 GB of RAM. I upgraded to 256 GB of RAM. If I had deep pockets or somebody wants to pay me $1,000, I can update to 512 gig of RAM, but that hasn't happened yet. Um, yeah. So if you want to do this in a timeefficient fashion, buy yourself some RAM. Um, and I found 92% of them so far. So the tools are John the Ripper and Hashcat. Hashcat is primarily a GPU accelerated way of breaking passwords.

Turns out it didn't work especially well for me. Um, it it's a common tool. It does it does work. It just didn't provide a lot of passwords per second. So, John the Ripper, um, they they don't make a lot of official updates. Neither does Hashcap, but they make a lot of unofficial point releases. Um, John the Ripper doesn't really work too well with GPUs. I don't know why, it just doesn't. Um, the good news is is that the way to break passwords is to have good dictionaries and custommade rules or good rules. That is by far the most efficient way of finding passwords. And the other thing is they have extremely great support. Um, hashcat I'm using 626 which is the

latest version three years old now. Um, I was using a 3060Ti GPU. If I used a faster GPU it might be five or 10 times faster. It still wouldn't have made much of a difference. Um, the rule syntax. So imagine you want to insert an alphabetic character in the first position. With John the Ripper, that's one line. With Hashcat, you have to have one line for each character. So now imagine that you want to insert a character in the first position, the second position all the way through to let's say the 30th position. Let's say you want to insert 100 characters. So that's 30* 100 characters. That's 3,000 lines of rules. So they say they need to

do this to run quickly. Maybe it's true, but I'd rather have one line and debug one line and show you guys one line of a rule than 3,000 lines of a rule. Um, and the dictionary attacks for hashcat I found were taking up a good amount of memory. My hashcat machine had only 16 gigabytes of memory. I tried it on another machine with 120 gigabytes of memory. It just uses a lot of memory to do what it does. So, word list, I don't know if you guys download word lists, most of them are pretty crappy. Some of them are pretty good. I found that um Rocky, either 2021 or the previous version of Rockyu is very good. Rocky

2025 I downloaded, it's very big. The quality is not so good. Um, so when you download these things, they're full of junk. Most of them except for Rocky. Some of them have lines that are hundreds of thousands of lines of characters long. Some of them have random junk in them. Trying to curate something that's 10 gigabytes or 30 gigabytes is very difficult or impossible. So good quality word list is mission critical to breaking passwords in a timely fashion. So I wrote a bunch of tools to help me with this task. I wrote a program called short which will truncate a file and only pro produce the first x characters whatever that might be. So if you have a

100,000 character line you can knock down to 50 or something. I I presume nobody's got 100,000 character passwords. I've never heard of one. Um there's m sort which is a merge sort routine. It needs to have more physical memory than you're trying to sort but other than that it works very well. What else do we have here? We have multi merge will merge a bunch of things so it runs quicker. Um I have sample which will sample however many every 10,000 every 10,000 lines or things of that nature. Uh I have things that produce line length statistics counts of asky characters. I have some password utilities that I wrote and I have some sometimes when the password is

found it's got unprintable characters and so they encode it in a hexodimal format. I have a thing to turn it into binary stuff. It's called password unhex. I also wrote some password statistics programs to give me statistics on the password I found which I'll be showing you later. So there's standard tools. GNU sort is an excellent tool. unique com. Um, I use Emacs as my text editor. I know not everybody likes it, but it'll handle gigabyte files. It will handle binary data. It'll handle pretty much anything. Is somebody have a question? Okay, so relative hashing speed, if you look at NT Land Man, it's the fastest hashing algorithm. That means if you want to break it, it takes the least

amount of resources. If you look at DScript, which is a tool I used the passwordized breaking previously, it's much slower. DSCrypt, I think it came around in 1978 or something like that. So why nand man is such a fast algorithm, which means it's basically weak, I don't know. If modern Linuxes, they tend to use things like brypt, which is substantially slower. So if these passwords were encrypted with brypt, I probably would have found very very few of them. So you should be using a hashing algorithm if you want your password to be secure that's slow and if you want to make it easy to break, you should use one that's fast. So salt, does everybody know what salt

is? I'm going to Does anybody not know what salt is? Come on, somebody's got to not know. Everybody knows it. Yeah. Okay. Very good. So in 1979, people figured out if he's got a password that's password one 123 and he's got a password that's one two three, the hash is going to be the same. So he said, we're going to add what was it? Uh 14 bits of random data to each person's password. So the chance that his password and his password when hashed are going to be the same is one in 4,000. So the reason you do this is it makes it very difficult to do a rainbow table attack. So again, they figured this out in 1979. I'm guessing

that's older than half the people in the audience, but Windows doesn't use salting. I don't know why. I've never heard anybody give me an explanation why it doesn't use salting. So that's very good for me breaking passwords and very bad from a security standpoint. So in the 80s, Unix went up to 48 bits of salt, which is a huge number. In 1996, BCrypt used 128 bits of salt. So the chance of having a collision is 1 and 2 to the 128 which is a very very big number. So good for me. Anti- land man no salt. Those things run real quickly. So the tools I used dictionary attacks is the most efficient way to find

passwords. You just encrypt the dictionary word. You compare it to the hashed word. If they're the same, you found your thing. So the brute force attack. Let's say you want to break all of the eight character upper lower case and number passwords. So that's number of combinations. I'm not going to read it out. It took me 229 days on my 3060Ti. I think I got like four million passwords that way, which is actually not a good number in terms of passwords per second. Um, and the the problem is even if you can do that really quickly, let's assume you can do it in a day. When you go to something longer like 12 characters, things scale up very very quickly. And

so it just doesn't matter how fast your GPU is. If you add one character your password, it's going to run 30 to 80 or 100 times slower. So it does not scale. um 16 character upper lowercase number that's number and that's how many years it would take me to crack it with my GPU which is pretty much forever if that's a cryptographically strong password. So the efficient way is use a rulebased attack which I'm going to come to next. So I'm sorry rainbow tables people say use rainbow tables rainbow tables are awesome. So, I downloaded what was the thing? I think it was nine characters and upper lower numbers and symbols. It was very, very slow. I've got the

statistics here. Oh, if you want to do this, Defcon's got the data duplication village. You bring them a six terabyte hard drive, they'll fill it up with rainbow tables. So, the numbers are I'm sorry. Do I I don't have um It took a long time. I thought I'd put the numbers here. One second. H Well, I'll tell you the numbers. It took I think it was like 500 seconds to crack one password with rainbow tables. That just randomly accessing a file was stored on a solid state hard drive. So if you have 100 million passwords or 900 million passwords, it's going to take forever. So I don't know why Rainbow Tables took as long as it did, but it was very slow.

Um, okay. So here's Rainbow Crack. So I downloaded the nine character lowercase and and alpha and spaces thing. It's 43 GB. It's a good amount of space. Um, one of the tools I was unable to get working, but eventually I found a tool. Where is it? Oh, I sent in a bug report for Project Rainbow Crack. Never got a reply. Um, oh, I also also the memory used I figured out that if I wanted to do it the 936 million passwords, I'd need 150 terabytes of RAM. So that wasn't going to work. I don't have that much RAM. So I used with RCrack I MT. That actually worked. I was impressed. So it took 6 seconds

per file. There was 84 files. 504 seconds is what it took to break one password. So I used the password I knew was in that nine character set. So if you scale that up, that's 16,000 years to try all the nine character passwords. So I don't know why I was as slow as it was. I used a SATA SSD. If I would have used an NVME SSD, it might have been five times faster, but that still wouldn't have helped me. So if you have one or two passwords, go with Rainbow Crack. If you have 100 million or 900 million, it's not going to work. So, hashcat my machine for hashcat had 16 gig of RAM. Um, I decided to try it on a moving

passwords and it took 660 megabytes of space. So, assuming it scales up linearly with that, it would take 624 gigabytes of RAM to run that. So, I don't have that much RAM. So, I decided not to use hashcat with dictionaries. So if if you have small dictionaries or small numbers of passwords to crack, it's probably fine. I mean a million passwords is a lot of passwords. So I can't really fault the hashtag guys if using 600 megabytes of RAM. But if somebody wants to break 900 megabytes of passwords, not going to work. I'm sorry, 900 million passwords, not going to work. So now we get to the rules. John the Ripper has the best rules.

So rules they take a standard word from a dictionary and they modify it in some way. Um they have a very very verbose set of rules and you can make custom rules which I did. Um so I upgraded my machine from 128 gig to 256 gig. And when you run this it takes a fair amount of memory and you can run multiple forks which will multiply the amount of memory. So one fork would take I don't know what say 20 megabytes of memory. So six forks sorry gigabytes. So six forks would take six times that. So as the number of unfound passwords decreases the hashing speed stays the same but you can use more things in parallel at a

time and so you can increase the effective throughput of things. So, when I first started this using the default John the Ripper rules, I managed to find 487 million passwords in 12 days, which is like half the passwords. That was pretty cool. And also, since the number of unfound passwords decreased by a factor two, I could double both the number of forks. So, that was a really good thing to do. Um, that that was the lowhanging fruit. So, John the Ripper has a very good default dictionary. It's very very small. And using that I found 154 million passwords. That was really good. So there's an incremental attack which will never ever terminate and it starts off pretty quick and it slows

down unfortunately pretty quickly. So I found 320 million passwords. So these numbers are not all exclusive. These are like inclusive numbers. So they would add up to more than the individual numbers. Um so then I started using the rock word list. So I found 156 million passwords that way and then I bought the 256 GB of memory so I could double the number of threads I was using or number of forks. Um so yes I started doing attacks using the default word list and the default rules which is all the rules and I found another 15 million passwords that way. So just for context using hashcat and I think all the nine or 10 character

passwords I found four million passwords in half a year of time. So this is really really fast compared to a brute force attack. It's going to miss things. It's not going to get everything but the speed is so fast it's better to reduce the number of unound passage you have and then crank the number of forks up. So my computer has 64 cores and 128 threads. So I can scale this up quite a bit. So I started off with seven forks and then I said let's apply the rules twice. So if one of the rules is uppercase the first character, the second rule might be append a number at the end. So it would take a long time and probably

never terminate. But I found another 36 million passwords that way. So then I decided to not use the standard word list but to use the Rockyu 21 2021 word list. I found 265 million passwords in less than an hour. That was super awesome. Um so then I was able to increase number of forks up to 18 after I found all those passwords and I was able to run things like more than twice as fast as before. So I found another 11 million passwords in 3 days with that. So the thing with finding passwords, you find the easy passwords first and you find them quickly and then things slow down and they slow down at about an

exponential rate which is really unfortunate. But the good news is the more password you find the less that are unfound and the more forks you can use. So when I was done with this I was running about 31 or 32 forks in parallel. That was the limit with the amount of RAM I had. So, I did a brute force attack of all the characters up to nine. Um, so I found 16, I'm sorry, 3.6 million passwords halfway and took four days. And then I had 811 million found passwords, which is a good number. So, I started using that dictionary of found passwords. All all dictionaries have different qualities, but these are all real passwords that I really found. So this

is like the highest quality dictionary you can get. And using that dictionary and the standard John the Ripper rules, I found another 16 million passwords in seven days. And you basically just wait for the stuff to slow down. And once it gets too slow, you say, "I'm going to finish. I'm going to stop this because it would take years to finish." So the good news is it finds things quicker quickly than slowly. And at some threshold that you get to, which for me was like a character a second, a password a second. I'm just giving up on this. So then I started to apply the rules twice on my 811 million found word dictionary. And

yeah, it would take it would take years to finish. So with the simple things like just doing a brute force attack on a dictionary, it would take like minutes or an hour to finish. But this applying the rules or applying the rules twice, it the the CPU usage goes up exponentially. So now we can run with 10 forks and let's see, I was using the Rockq dictionary and I was down to 129 unmillion 129 million unfound passwords. So I use a special rule that I wrote to replace a character with a control character. So if a password is password 1 2 3, I say let's replace the first character with a control character. I ran through all the control characters.

It's okay. Let's replace the second character with all the control characters. And I found this when I was breaking my DS crypt passwords that some passwords have control characters and the standard dictionaries and the standard rules don't really look at that. So the good news I have statistics on how many control character passwords I found and there's a good number of them. So the second thing I did was instead of replacing a character was to insert a control character using the Rocku 2021 dictionary and I found another 8 million passwords. pretty quickly without it. So these are the control character rules. So if you look at the optimized version that's one line long and that

will do an over strike of a control character in any position no matter how long the password is and it includes uh AD which I believe is delete fire call correctly and includes all the standard control characters. So it's not very readable, but it's only one line long. And the rule to do the insert is the same thing. It's the last line, the optimized version. The unoptimized version still works. It's just not as fast. So I can deal with a oneline rule. That that's something I can show you and put on a slide. I can't put 3,000 line or bigger than that rules there. So here's hashcat and hashcat performance. So lower, upper, number, and special

length seven took 3.7 days. That's not so bad. Then you go up to eight characters and it's 10 days. That That's not so good. Then you have lower number passwords of length nine. That's 5.3 days. But the last one I ran was a lower number of length 10. 180 days. I found 2.4 million passwords. That's really not a cost-effective thing to do there. Okay. Um, and and you know, if you increase the length to 11 or something like that, you're you're pretty much hosed. And even if you have 10, 50, 90 GPUs, it's still not going to be a good day for you. So, this is just not an efficient way of finding passwords. If you only have a few

passwords, this might be a good way to do business, but when you have a lot of passwords, it it's just not a scalable thing. So, I pretty much gave up on hashcat after the thing that took 180 days. I said, I'm done with that. I turned the machine off. So here's some password statistics. This is the length of passwords. So for example, passwords of length one 0% overall 275. This this includes duplicated passwords and it goes all the way up to 30 plus. So this is really interesting because if you look at it, most of the passwords are like eight or nine characters long. And this doesn't show the password I couldn't find. But the statistics on 869

million passwords. This is pretty good statistics. I've not seen this anywhere else at a passer talk. So I wrote my own program to generate these statistics. So this is for example like all lowercase was 18.9%. But then I have like all uppercase or all special characters. I just made these these rules up. But the control characters will include Unicode stuff. So, I got 269,000 passwords that had control characters. Um, 8bit ASKI is stuff where the most significant bits turned on had 128,000 of those. But if you want to break passwords, statistically speaking, the lower digit stuff is 43%. So, that's that's where you want to go to break your passwords if you want to get as much as you can, as quickly as

you can. Just statistically speaking, I know your people's password, but I'm guessing most of them are going to be lowercase or lowercase and numbers and again based on 869 million found passwords. So now we have string classes. So all alpha is just alphabetic characters. So alpha number is alpha characters and then numbers after that. So that is 35% of all the passwords. So a password like password 1 2 3. But then we have like numbers followed by alphas which would be one two3 password. That's that's a lot less common. And I just made up more things. Numbers, alphas, numbers and all kinds of stuff. And I think this is very interesting statistic because this shows

how people construct passwords. Obviously these are not cryptographically strong random passwords but the these numbers don't add up to 100% but I got a lot of passwords this way. So if you wanted, you could make a rule that says, let's go find, you know, six alphabetics, then three numbers, for example. And that's a whole lot faster than saying all lowercase and all numbers together. And again, statistically speaking, you'll do pretty well that way. Um, yeah. So here's the control characters. This is the fun stuff. I got eight new lines. I'm sorry. Eight uh line feeds. Character turns, 20 29,000 of those. Uh horizontal tab, that's a tab character, 134,000 of those. Delete characters, 1,045. So these these are all in principal

things. And yet there's a good number of these in passwords. I don't know how they put them in. I don't know how they got through whatever password screening process they have there. There's a fair number of control characters and these are these all came from my customwritten rules to find non-printable characters and passwords. Do you have a question back there? >> Speak up a little bit more. >> Speak up. I can't hear you. >> Yes.

So that is a possibility that so the question was is the password I found could have been a collision with a real password. So hashes can have collisions. And so if if the real password is happy Tuesday, but password 123 happens to have the same hashes that there would be a collision and I could find an incorrect thing. So that would be unfortunate. I wouldn't get the true password, but I would have a password that would generate the same hash. And so if I was breaking into computers, I could use my collided password just as effectively. So the statistics would not be good, but the potential to cause damage with it would be 100%. With a

collided password hash does that make sense? >> Very good. So defenses don't use nandman. Why Windows uses nand man which is the fastest algorithm and doesn't have salt? I don't know. 1979 people invented salt. It's a long time ago. It's a it's a long time ago. Why why are they not using a slower algorithm than Nland man? I don't know. Even MD5 is slower. You know, not that MD5 is cryptographically strong, but you Linux is using BCrypt. BCrypt is very very very slow. So two-factor authentication always a good thing. There's this the stupid phone text message which is crappy two-factor authentication. There's hardware security modules such as a Yuba key or Google's Titan key or a

PIV card that has built-in secure thing there. There's a lot of two-factor authentications. So if you're doing something involving money, you probably want to have a two-factor authentication. Somebody didn't steal your money. um use cryptographically strong random numbers or random strings. If you use a password manager and it generates a 20 character cryptographically strong password, it's not going to be a dictionary attack. It's not going to be found by a brute force attack. You saw how long it took to do a brute force attack. You know, pick pick your favorite length that you think is good, but make it more than eight characters long. So, I wrote a cryptographically strong random number random string generator in Python. These

are some examples. that shows the number of bits of entropy. So I don't use any of these passwords, but I can generate these things and I can make them whatever length I want and I can tell you only use numbers, only use uppercase or only use whatever. And I can make 30 character passwords if I want. So is this better than than a password manager? Absolutely not. But I wanted something that was simple and portable and quick and I wrote it because it's only, you know, like 50 lines or something like that. But none of these passwords are going to be in any dictionary. You're not going to be able to remember these. So you need to use a

password manager. I like local password managers. I don't want a password manager in the cloud because I don't want to worry about data breaches. Key pass is an excellent open source password manager. You can write your own. You can there there's a whole bunch of good open- source password managers. You can even host your own. I forgot what it was called. There there are things that you can host yourself that that run in your private cloud to do this, but I can't memorize these passwords. Maybe some of you folks memorize one or two, but I use a separate password for every different system. I probably have two or 30 hunds. I can't remember all that stuff. And

what's better is it you can't torture me and I can't tell you the password because I don't know. So, if you're using a GPU, you're going to want to undervolt it and underclock it to reduce its power consumption. Less power, less heat, thing will last longer. Some some underclock better than others do. I could get like 10% power savings, which is still good because the thing ran a couple degrees cooler. There's some random links here. Um, the open wall guys are the John the Ripper there. There's lots of good sources of password information. Here are the dictionaries that I was using and how big they are. Some of these are good and some of these are

bad. But like week pass 37 gigabytes. Rocky 2021 96 98 gigabytes. These are not small files. If if you want to throw them into an editor, you better have a lot of RAM or you better split them up in little pieces. U some of these are good, some of these are bad. I like Rocku. It it works well for me. It It's shown high quality stuff, but get yourself a fair amount of storage if you're doing this. Is that the last slide? Maybe that's the last slide. Any questions? I believe we're going to get a microphone for people to ask questions.

Anybody? >> Yeah.

Hi. Since you noticed such a large improvement using that uh specific word list, do you uh I imagine you were curious and went and investigated how they had come up with it in a way that was different enough to get that large of a a difference. What did you find out? >> So, I don't know where the the providence of it, but it's allegedly from a data beach of Rocku in 2021. So they they they claim that it's an actual leaked list of passwords and statistically speaking I was getting a lot of good data from that as opposed to other stuff. The very best dictionary I was using was the one that I found my

900 odd million password thing. I found all those passwords. They were real passwords. And when you start applying the rules to already found passwords, you get pretty good performance. But I don't know where Rocky 21 came from. I I they say it's from a data breach. Well, >> I guess what I'm wondering is do you believe that that's just uh unmodified um results of that breach or do you think that there are transforms applied to that to produce the given the size? >> I I don't know. I'm I'm guessing that it's a raw breach, but but I don't know. But the thing is applying transforms you're gonna I mean if you just like uppercase the third character you're going to

double >> right >> the the file size. If you say uppercase one character you know in each position you're going to multiply the thing by like a factor of 10 or 20. So >> so so anything that would be like a reasonably straightforward rule you would expect to have someone just implement >> and you would increase the size of your file. So instead of uppercasing what you subtract you substitute a number for a a letter. So you have 10 characters that's gonna each each character position is going to multiply the size of your file by 10. So if you have you know length 10 passwords that's going to be a 100 times bigger file. So I doubt that people

augment the the raw results with different rules because that would just increase your file size like the last example would be by a factor of 100 which really sucks. Nobody wants to download a file 100 times bigger than they have to. >> That seems to make sense. Thank you.

Thank you for your talk. Uh I have a small question. Would you find that uh mutating the passwords and generating a new dictionary is faster than mutating the passwords in memory using rules? >> So I believe that they do things more or less sequentially or possibly in a batch mode. I don't think they there is a way with John the Ripper to apply a rule and write the whole file out, but I think that'd actually be slower. I think you'd be limited by IO speeds as opposed to reading the thing in and just doing it in memory, I I presume since each thread takes like 10 GB of memory. Now what what I meant is

if you had like a pre-processing stage where you took the rules that you found and you created a new dictionary before running JTR would that have helped running things more efficiently or is it more efficient to run those rules in memory on existing inputs? >> I think it's much more efficient to run them in memory on existing inputs. I think you'll simply multiply the size of your dictionary by pre-processing it with the rules by a factor of 100 or a thousand and I don't have that much this space. Thank you. We got we got a whole bunch of questions. He's got some there. Couple back there.

Uh you gave a lot of examples of the formats of the passwords. I was curious if you did any analysis or had any data on the I say longer passwords like 12 or 16 characters or longer if those were just combinations of words or like what was the complexity of the longer passwords that you found? >> I didn't look at that. Um but the the problem when you've got like 12 character passwords. So I've got what is it here? Five 50 million of them. I can't look at 50 million passwords and get statistics from I could have written a program to see what they look like, but I didn't bother with that level of detail. The

thing is there's very little people that do statistics on the password sound. I could write something. So for characters, you know, longer than 10 characters long, try to get some statistics on how they're formatted, but I haven't done that yet. It's certainly interesting future work.

So um just a question you know you've got a you got a pretty good corpus of data here. Yes. That you that you've built up. >> Is there any opportunity to use machine learning or or AI as we call it now um to use that to inform creation of better rules and better mangling uh to in essence speed things up. I would guess the answer would be yes, but if you have to feed in 869 million passwords into a LLM, well, no, you would create a new model at that point. you wouldn't use an LLM. But I I guess what I'm saying is is I I wonder if those patterns if run through you know

several uh you know models and layers and and the tensors if you would be able to come up with some call it groundbreaking rules for that mangling. >> I am certain you could find some interesting rules. I don't know how groundbreaking they would be and I don't have the resources to do this AI processing on it but I think that's excellent work and now that LLMs are becoming more of a thing I think it's going to be the future of password cracking because obviously you want good rules and right now we say okay let let's insert a character or let's insert two characters and you know you want to know which characters you want to insert

I mean maybe vowels are good to insert maybe numbers are good to insert maybe tildas are good to insert and I don't know I don't know what the good characters are but presumably an LLM could figure that out. >> So a follow-up question uh do you have a GitHub where you posted all these goodies? >> I do not. The these are relatively large files as you would imagine. >> Well more your rules and process >> the the rules. So the one custom rule I wrote I have on my slides. The other rules are John the Ripper rules. The only one that's a unusual rule, it's in it's in their list of rules, but they have something they call OI2, which is

insert or over strike two characters. >> They have an OI1, but the OI2 found a bunch of passwords for me, but it's in a standard John the Ripper distribution. So, the rule, let's see, do I have the rule here? I'm sure I do.

Yeah. So, these these two rules are the the ones that I wrote myself. The good news is the guy in charge of John the Ripper is a guy named Solar Designer who's a Linux security engineer, and he optimized these for me. I don't really understand the syntax of this stuff, but they work. And it's two lines, so anybody could write down those two lines and get the insert or overstrike of control characters. And I asked them to put in delete in there. So that's what the H0 is that that's hex for delete asky127 I believe. Um but it certainly an interesting question. >> Yeah. No, thank you. Anybody else? >> We got a microphone. As a quick one, did you uh explore uh

since you you looked into GPU acceleration, did you explore alternative hardware, FPGA hosted stuff like on a big XYlink part or something like that? >> So, John the Ripper has some acceleration for a really, really, really old FPGA, things like eight or 10 years old. And these things are hard to come by and they're not nearly state-of-the-art. I don't know of any current password cracking people that are using modern FPGAAS, but in terms of price performance or performance per watt, an FPGA is going to beat a GPU. The thing about GPUs is they're made at huge scales. The price are driven down by volume and gamers love of, you know, 4K 100 frames per second video games.

So, it's good good for people cracking passwords like me, but but it's never going to be as efficient as an FPGA. Thank you.

I have an assumption that you know a lot about hashing in every flavor. What do you like? You like Argon Blake uh for a modern hash? >> So, BCryp or Argon are state-of-the-art algorithms. So, they're both nice and slow. They're both cryptographically very strong. Let's go back here. I don't think I have argon listed here, but it it's a modern encryption or modern hashing algorithm. So the the thing is if you look at nand man, it is so weak and and you know Microsoft came out with Windows 7, they came out with Windows 10, they came out with Windows 11. Surely they could have picked a better hashing algorithm. Surely they could have implemented 12 bits of of salt or 36 bits of salt,

some reasonable amount of salt, and they haven't done that. So why their hashing is so fast and they don't use salt, I have no answer for. Maybe Bill Gates or the current guy running Microsoft could tell you that, but I certainly can't. But but I look at BCrypt. BCrypt, it's 13,000 hashes a second. And and look at nan man. It's 41 million hashes a second. I'm sorry. 41 billion hashes a second. Why why why make it so easy for people like me? Why why make the hashing algorithm so insecure? I don't have an answer for that. It it it's certainly inexcusable. It's certainly that they knew better, you know. Oops. 1979 is is when Salt came about. 1979.

It's 46 years ago. It's a long time. Even if it takes them a decade to implement this, they should have done it, you know, well before 2000. I I don't know why they're not doing it. It's not a new idea. >> Sure. Um on your slide there, I saw WPA2. Have you played with uh cracking WPA3? I mean, it's got some defenses, but uh >> I have not. >> Okay. So, so the reason I used the NT landman version of of the have I been pawned? It's the fastest algorithm. Why? Why? So, I think they also had I don't know what it was. I think it was it was maybe it was Shaw one. I think I could

get the password in Shaw one. But why should I pick Shaw one which is so much slower when I can pick Landman which is super duper fast? I I want to crack passwords. I didn't want to melt CPUs and you know use gigawatts of power. So, I do this all my my main computer that's sitting in my house right now. It's getting like 15 or 20 passwords a second and it's going to slow down, but that's not so bad. That's with 10 threads. When I get home, I'll probably take all the found passwords and make my unfound password substantially smaller. Probably increase number of threads by five or six. I mean forks, not threads. So, they're actually separate processes

forked. Thank you all for coming.

Cracking 936 Million Passwords

Related talks