
PW - Unspeakable Passwords: Pronounceable or Diceware - Jeffrey Goldberg

BSides Las Vegas, 47:15, published 2016-12
About this talk
PW - Unspeakable Passwords: Pronounceable or Diceware - Jeffrey Goldberg Passwords BSidesLV 2015 - Tuscany Hotel - August 04, 2015

I'm not giving the talk that was scheduled; unfortunately they couldn't make it. I actually would have loved to have heard that talk, but this is good stuff that I've been thinking about and working on, so I had something I could actually talk about, which is unspeakable passwords, or issues surrounding pronounceable passwords. We've all seen pronounceable password generators, right? Is there anybody who hasn't seen a pronounceable password generator? Oh, great. Now, pronounceable passwords are particularly interesting for me. A little background: I work for AgileBits, the makers of 1Password, a password manager, and one of our big issues is making sure that people have good master passwords. A password manager is great for having a

whole bunch of passwords that you never have to memorize or know or even see, and they will be very strong. But a lot depends on your own master password, and we're trying to figure out how to help people come up with better master passwords, or use better master passwords. In general, since we derive keys directly from the password, a person's master password is going to be the weak point in the security: if somebody gets ahold of the user's data, then it can be attacked by password crackers. These things need to be memorized, they need to be things that you type in, and they need to be able to be

typed on multiple devices, that is, not only on a full keyboard but also on a mobile keyboard with a finger or thumb. And of course we want these things to be as strong as possible. Now, one thing that of course we use is key derivation: we're using PBKDF2 with HMAC-SHA512 to slow down automated password guessing. But we're pleased to see that a successor to PBKDF2 is on the way. The Password Hashing Competition has announced Argon2 as the winner, and it is going through a lot of tweaks and revisions at the moment, but it should be finalized relatively

soon, and then there should actually be a successor to PBKDF2, scrypt, and bcrypt that the community can universally recommend to web developers and others. So that's a good thing. Basically, these things slow down the rate at which hashcat and similar tools can guess. But in our usage, we're using it solely for key derivation. Your master password is never used to authenticate to a server; there's no communication with any server out there. And therefore one of the optional features of some password hashing schemes, sometimes called pepper, which is a secret salt, something that is known only by the server and is not stored with the salt

or the password hash (it might be in a hardware device or something like that), so pepper, which Argon2 calls K, is not an option for us. All of the actual secrets must come from the user in our case, and so we have no K mixing in our Argon. And this allows me to make an extremely obscure joke at this moment, which is "older than the Earth itself". I'll explain that afterwards if people actually want to know; it's not really very funny, but I enjoyed it. OK, the big limit with these types of password hashing schemes is that the work added to the

defender, or rather the work added to the attacker, is proportional to the work that is added to the defender. In normal crypto, if you double your key length, you're adding an enormous amount more work for the attacker. The defender may need to double the amount of work that they do, or maybe even square it, depending on the key length, but the work the attacker has to do in going from, say, a 56-bit key to a 112-bit key grows by an absolutely astronomical amount. With something like slow hashing schemes, the amount of work that the attacker has to do is proportional to the amount of work that the defender

has to do. That's not normal crypto, and it actually means that as computing power increases, the arms race between the attacker and the defender biases towards the attacker. So these things are useful, they work against very real threats, but slow hashing, as much as I love Argon2 and what the Password Hashing Competition has done, does not solve our problem. OK. Diceware: you probably don't know it under that name. This is the "correct horse battery staple" approach, picking words absolutely at random from a word list. This notion goes back a long time. S/Key, probably in the early eighties, is probably the earliest instance that I know of. It was reinvented in 1995 by Arnold Reinhold as Diceware. I kind of helped popularize it four years ago, and

that got picked up by XKCD. But with these schemes, where you've got a fixed word list and pick absolutely randomly from it, the general idea is that these things will be memorable. They're long, but they're going to be easier to type because you're typing familiar language, and you can learn to memorize them. Because they're automatically generated, not generated by a human process, they're going to have nice randomness properties. And then pronounceable passwords: they just create strings that kind of look like your target language. These are things that could be used if you have to read a password over the phone. So for example, for all of my security questions, when the bank asks what's your

dog's name, or what was your first pet's name, I usually use a pronounceable generator for those, thinking that there might be some instance when I need to read them over the phone. Obviously I don't use my first pet's name for this; that's not really all that secret. OK. I'm talking about a target language. I'm an American, therefore I'm exceedingly provincial and I have no awareness that the rest of the world exists, and as such I'm going to be talking about English. Actually, that's going to be a good thing, because English helps illustrate some of the problems that we wouldn't have in pronounceable generators if we were, say, looking at Japanese, with its really nice, simple syllable structure. OK. Now, for Diceware

schemes, the word lists: S/Key has 2048 words in its list; the original Diceware scheme has 7776. Neither of those is arbitrary. The word list that we've been putting together over the last couple of months is sitting at around 18,000 English words, all eight or fewer characters, and trying to be relatively familiar, with not too many obscure words showing up in there. But the average length of each word is quite a bit longer than we find with S/Key or Diceware. I'll talk about that in a bit, but now going back to syllables, because I didn't put my slides in a suitable order. Going back to syllables, what we've got for difficulties: some are more pronounced in English than others.
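As a rough illustration of why those list sizes matter, the sketch below computes how many bits each uniformly chosen word contributes for the three list sizes mentioned. The function names are illustrative only, 18,000 is the approximate figure given in the talk rather than an exact count, and the 40-bit target is just an example threshold.

```python
import math

def bits_per_word(list_size: int) -> float:
    """Entropy in bits of one word picked uniformly at random from a list."""
    return math.log2(list_size)

def words_needed(list_size: int, target_bits: float) -> int:
    """Smallest number of uniformly chosen words reaching at least target_bits."""
    return math.ceil(target_bits / bits_per_word(list_size))

# List sizes mentioned in the talk; 18_000 is approximate.
for name, size in [("S/Key", 2048), ("Diceware", 7776), ("new list", 18_000)]:
    print(f"{name:9s} {size:6d} words  {bits_per_word(size):5.2f} bits/word  "
          f"{words_needed(size, 40)} words for 40 bits")
```

The first two sizes are not arbitrary in the sense that 2048 is 2^11, so each word carries exactly 11 bits, and 7776 is 6^5, so each word corresponds to five rolls of a six-sided die.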

Spelling and pronunciation in English are, you know, not all that cohesive. Some syllables are far more common than others: a syllable like "teen" will show up in words far more often than a syllable like "strengths". "Strengths" is a single syllable; it's possible in English, but you don't find lots of syllables of that form. I'll skip a bit about the other complications of syllable structure. And what goes at the end of a syllable can depend on what precedes it. So now, just sticking with writing: "spool" is certainly a possible written syllable in English and an actual word; "spoll" is a possible word but isn't an actual word; but you would never see anything like S-P-

O-O-L-L. It's just not the kind of thing you see in written English. So we have these sorts of dependencies. Now, we can avoid the first problem, the difference between spelling and pronunciation, by only caring about spelling, simply because we're presenting people with written words. We don't actually have to care what the underlying phonology is; we can just look at the structures of written words and base our system off of that. And because I'm no longer a student in linguistics, I don't actually have to worry about getting everything right. It's fine if we approximate and miss certain special cases of syllable

structure, or get some things that actually aren't possible, because I'm not trying to provide an exact description of what is a possible syllable in English, which kind of gets us off the hook for a pronounceable password generator. Those are the difficult problems. Now, what most pronounceable generators have done in the past, when dealing with the issue of some syllables being more likely than others, is that they've simply made the generator mimic that: they produce more syllables like "teen" than syllables like "strengths". They simply bias their generators to mimic the frequencies of the language. And to deal with things like that S-P-O-O-L-L case, sometimes they will generate and then filter out things that don't work. So they've got a system of generating,

possibly through a finite state machine, possibly through phrase structure rules, possibly just a mixture of these things added in a hodgepodge, and also some filters on the results. So that's how most have done it so far. FIPS 181 is the NIST recommendation, and that is extremely commonly used all over the place. pwgen is a bit less biased; they're focusing more on the S-P-O-O-L-L case, the context dependency issues, than on the frequency issues. A bunch of others use FIPS 181, and one uses a Markov model based on spellings. So all of these deliberately, as a matter of design, produce non-uniform output: some generated passwords are going to be more likely than other

generated passwords from the system. And as I argued in a talk that I'm sure nobody wants me to repeat here, when you have a password creation scheme that is not uniform, you should be looking at the min-entropy: you should be looking at the probability of getting the most likely generated password when judging how strong that scheme is. So these systems that are deliberately non-uniform are weak. The problem is that they're a pain to analyze. They're typically a hodgepodge of different rules stuck together, and so it's actually very difficult to even figure out how strong they are. And I seem to have a slide for this. So yes, we should be using min-entropy; that is, we should be

looking at the likelihood of getting the most common password, and analyzing them that way. Now, creating a uniform scheme is really easy. You just take the kinds of consonants that can be at the beginning of a syllable, by looking at a whole bunch of words; take the kinds of strings of vowels that can be in the middle; and the kinds of consonants run together, consonant clusters, that can be at the end; and you just pick from each column and stick them together. The stuff at the beginning is called an onset. The nucleus of a syllable is the vowel stuff. And again, I'm talking about English; I'm not talking about Serbo-Croatian, where

an R can be a syllable nucleus, or even the second syllable of "model" in English, which is just an L. But we're talking about spelling here anyway. So, back on track: you've got these three parts, and you stick them together. That first part, the onset, and also the coda can be null; these can be empty strings. But we need to treat them as we would treat any other consonant cluster at the beginning or the end of a syllable, because even though a large portion of actual syllables in English have either an empty onset or an empty coda, we must use "str" as

an onset as often as we'd use the empty string in order to get a uniform distribution. And likewise for codas: we have to use "st" as often as what we find in something like "teen". And so although English naturally biases toward lighter syllables, our system, in its uniformity, will actually produce mostly very heavy syllables. So with this uniform generator, just picking common English onsets and nuclei and codas, I've got something that produces about 15 and a half bits per syllable. But the results really aren't very English-like. Here I've used a space as a separator, and these are not cherry-picked examples for this talk: I simply ran the thing a few times for different lengths and pasted what I got.
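The uniform scheme described above can be sketched in a few lines. This is a minimal illustration, not the speaker's generator: the onset, nucleus, and coda inventories here are tiny made-up lists, and the function names are hypothetical. The sketch also includes a min-entropy helper to show why a deliberately biased generator scores worse than a uniform one over the same outcomes.

```python
import math
import secrets

# Tiny illustrative inventories (assumptions, not the speaker's lists).
# The empty string is an ordinary entry: it is picked exactly as often
# as any other onset or coda.
ONSETS = ["", "b", "d", "k", "m", "s", "sp", "st", "str", "t", "th", "wh"]
NUCLEI = ["a", "e", "i", "o", "u", "ee", "oo", "ou"]
CODAS = ["", "d", "ks", "ll", "m", "n", "ngths", "ss", "st", "t"]

def random_syllable() -> str:
    """Pick onset, nucleus, and coda independently and uniformly."""
    return (secrets.choice(ONSETS) + secrets.choice(NUCLEI)
            + secrets.choice(CODAS))

def bits_per_syllable() -> float:
    """log2 of the number of equally likely (onset, nucleus, coda) triples."""
    return math.log2(len(ONSETS) * len(NUCLEI) * len(CODAS))

def min_entropy(probabilities) -> float:
    """Min-entropy in bits: set by the single most likely outcome."""
    return -math.log2(max(probabilities))

def password(n_syllables: int, separator: str = " ") -> str:
    """Join independently generated syllables with a separator."""
    return separator.join(random_syllable() for _ in range(n_syllables))

if __name__ == "__main__":
    print(password(3), f"({3 * bits_per_syllable():.1f} bits)")
    # A hypothetical biased generator over four outcomes versus a uniform one.
    print(min_entropy([0.7, 0.1, 0.1, 0.1]), min_entropy([0.25] * 4))
```

With these toy lists there are only 960 possible syllables, a little under 10 bits each. The figure of about 15.5 bits per syllable quoted in the talk would correspond to roughly 46,000 equally likely syllables. The bit count is only valid if every triple yields a distinct string, which holds here because onsets and codas contain only consonants and nuclei only vowels.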

So for example, in the first one, S-P-O-O-S-S is one of these things that actually violates the typical rules; you would not find a word like that. The same with the second word in that first example, where it's not even clear how to pronounce it. In some cases we get things that actually are words, like "whom" and "creeks", and in some cases we get things that are close to words that maybe not everybody is happy to have in their passwords, and maybe they are. And I really, really hope that if anybody is just listening to the audio, they actually look at the slide and realize these were non-cherry-picked examples. Now, with word lists, here all of

these things actually are words of English, eight characters or fewer, and you see the kinds of things that we get. Some actually come out like they could almost be meaningful expressions, even that six-word thing at the end, which was seized matter Rick wintry unarmed John fourfold. I don't know. So one thing we've got to be concerned about is nasty meanings, because of course our usage would be to recommend master passwords to users. We're trying to get away from users making up their own master passwords, which are going to be inherently weak, and so we run the risk of insulting people, offending people.

We've already removed taboo words from the list. Nonsense words, in some sense: with the syllable generator, nonsense words can sometimes be words, and sometimes that will still be a problem. Now, everybody in this room will understand, when you're presented with random stuff, that it's random; the thing isn't trying to tell you something. But again, our audience is not the people in this room; the system is for a lot of people who don't understand randomness. OK, so in comparing the word list and that particular pronounceable generator, we get a few more bits per character and a few more bits per unit from the pronounceable generator. And so, including the separator, if you want a password

that gets you to at least 40 bits, you need to use at least three units in both, but it's going to be on average two characters longer with the words than with the syllable generator, and that's going to increase as you increase the length. And again, one of the limiting factors we are concerned with is what somebody is willing to type in as a master password on a mobile device. So neither of these is great in terms of typing on a mobile device; the syllable one is a little bit better. On other things, we expect that the meaningfulness of the word list has got to make these things easier to memorize and learn and recall over

the longer term. I suspect that typing familiar words on a full-length keyboard is going to be quicker than typing nonsense words on a full-length keyboard. Typing the shorter things is going to be easier on a mobile keyboard. And we're more likely to encounter nasty meanings with the real word list. Now, all of that is speculation. What I would love to see is actual solid research, a usability study on exactly this kind of stuff. And I talked ahead of my slides. So yeah, we just need to start experimenting; that's the empirical thing to do. It's easy for us to speculate about these things, but we need

to follow it up with actual data. And we also need to remember that we're pushing on one end of the process: we are working to make the password that somebody has to keep in their head stronger, while something like slow hashing, like Argon2 or PBKDF2, is designed to make guesses more computationally expensive, which is kind of pulling from the other end. But I don't think we're going to get those to meet. And I suspect, from reading the abstract of a talk later this afternoon, that it's going to be talking about that, but that's just a guess. I certainly look forward to that. And this

is me. I'm Jeff Goldberg. I'll be here through Thursday, leaving Friday, so I'm not going to DEF CON. I'm jeff at agilebits.com, and I post a lot about security matters and cryptography on the AgileBits blog. Any questions? Oh, over there. What is the middle length, or, sorry, the minimum? OK, so the question was, what is the minimum length of the dictionary? You mean the minimal length of items in the dictionary? In all of them, it's one. Are you concerned at all? Let's say you pick a, you know, let's say a four-word

password and they're all one-letter words, so you're getting four plus the separators: you're getting a seven-character password. Yeah, that's a concern. That's bad. I don't know what to do about it at this point. I mean, there are things that I can do about it, but I've left that out here, like just removing anything shorter than some minimum, but that will up the length of the passwords that people end up having to type in a lot. Is it going after length, or is it going after strength? I mean, length: you know, people just typing twenty ones, that's not really a very strong password.
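The questioner's worry can be put in numbers. The sketch below assumes a four-word passphrase from a list of about 18,000 words (the size mentioned earlier), a single-character separator, and an attacker who ignores the scheme and simply brute-forces strings over 26 lowercase letters plus the separator. The alphabet size and the function names are illustrative assumptions, not figures from the talk.

```python
import math

LIST_SIZE = 18_000     # approximate word-list size from the talk
ALPHABET = 26 + 1      # assumption: lowercase letters plus one separator symbol

def scheme_bits(n_words: int, list_size: int = LIST_SIZE) -> float:
    """Entropy of n_words uniform picks, for an attacker who knows the scheme."""
    return n_words * math.log2(list_size)

def passphrase_length(word_lengths) -> int:
    """Characters typed: the words plus one separator between neighbours."""
    return sum(word_lengths) + len(word_lengths) - 1

def brute_force_bits(length: int, alphabet: int = ALPHABET) -> float:
    """log2 of the number of strings of exactly this length over the alphabet."""
    return length * math.log2(alphabet)

shortest = passphrase_length([1, 1, 1, 1])   # four one-letter words
print(shortest, round(scheme_bits(4), 1), round(brute_force_bits(shortest), 1))
```

Under these assumptions the scheme offers about 56.5 bits on average, but the unlucky seven-character result sits in a brute-force space of only about 33 bits. How often that happens depends on how many very short words the list contains, which the talk does not state.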

And as I've argued before, the strength of a password from a scheme depends on the scheme that it is generated from, and we should always assume the attacker knows how the user generated the password. In particular, if we are going to generate passwords for people to use, we have to assume that the attacker knows how we're generating those passwords. So length isn't the only thing. Kind of adding to that, it was kind of cool: when we had the LinkedIn breach in 2012, after that, I think a paper was released by Carnegie Mellon in 2013 or around then; they actually created a passphrase generator that created grammatically correct English sentences

as candidates for passphrases, and using that, Carnegie Mellon actually were able to crack passphrases from the LinkedIn breach that nobody else, including Jeremi Gosney, had been able to crack. And they were just grammatically correct English sentences, which to me was really impressive. So if you are slipping in the structure of English grammar, then again, you assume that's known by the attacker. So the idea of a scheme like this is actually picking each element at random, uniformly. I've got an idea that may improve the usability: given that users likely remember the pronunciation of the passphrases

rather than the spelling, maybe make it simpler for them by accepting variations in the spelling, say one L or two Ls and so on, so the verifier can check whether one of maybe 25 possible passwords based on what was entered is the correct one. OK, so I'm going to try to repeat the question. The idea is that since people, particularly if you're using nonsense words, will be remembering the pronunciation and not necessarily the spelling, your verifier should look at equivalences. So if you've got a password or a nonsense word that ends in the letter C, you might allow them to also put in the letter K. It's kind of like

the scheme that Facebook uses with Caps Lock: if you type in your password wrong and they get a failure, they will then go and try it again as if you had the Caps Lock key on. It doesn't actually weaken the password much; it does make that login attempt a bit longer for you. And so a scheme like that might be useful. Feel free to use it. Thanks. So, OK, I guess you're trying to optimize two competing things: one, you're trying to have a passphrase that a user will remember and can easily type on a mobile device or laptop; but you're also trying to pack as much information into as few characters as possible. So those two

things conflict. I was going to suggest abandoning one of them, so that you're not trying to pack it into the smallest possible number of characters. Instead of stringing together words or syllables in some random way, you can generate long sentences, where you have a 10,000-word list of nouns, a thousand-word list of verbs and adjectives, and so on, that you string together with actual punctuation and capitalization rules and all that. You can still measure the entropy that you're getting: it's log base 2 of 10,000 per noun, and log base 2 of whatever for your

other ones, and you can say with certainty that your relatively long sentence has 64 bits, or 56, whatever your target entropy is, but you're no longer trying to pack all that into just 30 or 40 characters. And I would like to see a scheme like that, because that's perfect for people who can memorize it and type it. And it is sort of against the notion of... I mean, if we were not constrained by people entering these things on mobile devices, something like that... Was anybody able to hear that question? Good. OK. Then something like that makes sense. There are a couple of difficulties in that. For example, you would have to, you'd

have to make either all your verbs transitive verbs or all your verbs intransitive verbs; otherwise you'll get an uneven distribution, because you'd be using a more complex phrase structure grammar. And so you can end up shortening some of your lists that way. yeah if I don't have to be special if you want to ensure that there's an absolute automatically corrects tentative probably you'll choose a random then I'll have to program your choices a little bit maybe I'll work in the process of creating your twist a little cube it so that you can you know over generate decision or instituted password or the National reduction is so across like a vacation grab an apple

introduces us a fold it you can get your cardio similar. Well, yeah, if you know that you're only reducing it by some fixed amount. But it's nice if you do something that will generate uniformly, which means, in a sense, you have to not have state in your generator; that is, what you do after a verb shouldn't depend on whether it was a transitive or intransitive verb. I mean, there are ways of doing this, but that's just a grammatical aside. I love that. I just don't foresee us... I mean, for our usage, the huge limitation is typeability on mobile. You guys are all waiting for

lunch. yep this stuff can be swung east memphis if somebody could just do your friggin kick stop for unicode 7 you will ice somebody please. Other questions? Thank you. A question: we've primarily been talking about English words or gibberish words. Have you thought about passphrases that incorporate words from different languages? Maybe my passphrase has an Arabic word, an Italian word, an English word, and so on. How would that factor in, in terms of entropy? That's something that I looked at a couple of years ago, when I was first looking at how to expand our word list. One of the difficulties is that doubling

the word list, you know, 18,000 plus another 18,000 from another language, is adding just one bit of strength per word. OK. So the question is exactly how much you get from that, and whether it comes out to be worth it. And just back to Per's comment about Unicode: not only do you need to have everyone using the same encoding, you need to use exactly the same normalization. An E with an acute accent can be represented by different byte sequences in UTF-8, so you need to normalize. oh but occupy crisis in movie Jess you can try to guess heart. Yes, this might be up your alley as well.
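The normalization point can be shown concretely. In the sketch below, the same visible character is built two ways, and only after normalization do the two inputs produce identical bytes for hashing or key derivation. Choosing NFC is an assumption made for illustration; what matters for the argument is that every client applies the same form. The function name is hypothetical.

```python
import unicodedata

def normalize_passphrase(passphrase: str) -> bytes:
    """Normalize to NFC, then encode as UTF-8, before hashing or key derivation."""
    return unicodedata.normalize("NFC", passphrase).encode("utf-8")

composed = "\u00e9"       # e with acute accent as a single code point
decomposed = "e\u0301"    # plain e followed by a combining acute accent

# Without normalization the byte strings differ, so derived keys would differ.
print(composed.encode("utf-8") == decomposed.encode("utf-8"))
# After normalization both inputs yield the same bytes.
print(normalize_passphrase(composed) == normalize_passphrase(decomposed))
```

Which form a given keyboard or operating system emits is outside the user's control, which is why the normalization step has to happen in the software before the passphrase is used.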

Since you were saying that one of the major problems you're pointing out is on the mobile device, I'm curious what your position is now on technologies like Touch ID, and whether you see that as a reasonable way around this on the mobile device. you know I provided this similar technology to users in. I'm skeptical; I don't know how well it'll work. you know it's it's like here's the rub bag or some such. Curious. yeah obviously I was if we jump with a lot of last night that I don't call it as well, and one of the things we discussed is

Apple now moving from a minimum of four to six digits in your PIN codes with iOS 9. And my initial thought about that is it's a good thing, of course. My initial thought is that, well, I have tons of people coming up to me like, "Ha ha, sucker, try to get access to my phone, because I'm using Touch ID." but i just know you three times then I'm fella. Well, I have a Markov table of, you know, the most popular four-digit PIN codes, and so my chance of guessing your four-digit PIN code is actually pretty good. So I think

my theory is that Apple are moving to six digits basically because they have seen that as well. Do you still use a four-digit PIN code even though you are using Touch ID the entire time? Because if you are using Touch ID all the time, I can't really see the point of just having a four-digit PIN code. Then I would strongly recommend you have a much longer PIN code, or eventually a password, set, since you only have to type that in once every time you turn off and turn back on your iOS device. Except then, Apple, I mean, as far as I understand, and I think this

is actually pretty clear from the public announcements, they introduced Touch ID to get people to use PIN codes at all. They had a huge problem of people not using any PIN codes, and that, as far as I understand, was the primary motivation for Touch ID. Yes, using something like Touch ID, or some second-factor kind of thing, definitely reduces how often somebody needs to type in a master password. And it's a tough thing for us. We actually do make use, on iOS, of storing a master password equivalent that people can unlock with Touch ID, but we've got this whole set of basically guesses at what parameters to set: how often

should it be cleared, how often to force people to use the master password so that they don't forget it, and so that this thing isn't sitting around in dangerous places for too long, while at the same time using it to encourage them to use stronger master passwords. So yeah, this is definitely part of it. And if we could have some kind of second-factor mechanism for decryption that would be easy for users to use across all of the devices on which they use 1Password, that would just be fantastic. But there's nothing that is sufficiently easy for everyone to use across all their devices. Last question. Oh, OK, what is it

Larry I come to measure entropy or dictionary dictionary and ginger increased versus, and how accurate these entropy metrics are, and whether it is possible some will drop it dictionary you're interviewing. Because we're picking each word from the dictionary with equal probability to any other word, in that circumstance the calculation of the entropy measure is really simple: it is simply the base-2 logarithm of the number of items in that dictionary. So because it works by picking from a uniform distribution, that calculation is easy. But you may have some collisions. You mean... We may have a large number of different sequences that end up

producing the same thing? Oh, you mean prefixing issues. Yeah, this is something that we looked at. So for example, we could have the word "blue" in our dictionary, and we have the word "bird" in our dictionary, and we can also have the word "bluebird" in the dictionary, and thus there would be multiple ways of getting the string "bluebird". Yes, that's a problem. I've simply taken the simple expediency of saying that we would enforce a separator between units, because when I tried to pare down the list to avoid that problem, it was becoming too much of a pain. So yes, that's a very interesting and real problem, and I sidestepped it by simply saying we're

going to have separators between these things. Thank you. master home from court judges have no one's ever heard a white haploid importance allegiance but other one one interesting thing about five years ago on Steven the influence of second number of offices in frozen north marker basically won't be difficult i / actor / the most unique remembers the ventral conclusively a kind of seem consume that much you know. So for those who couldn't hear, the question is that if you give people a second factor, a true second-factor system, they tend to be less careful about their primary form of authentication, their passwords. And yeah, this is actually something that we worry about. If we

introduce a second factor, it has to work very, very well. With Apple's Touch ID, Touch ID is not a second factor; it's an alternative. And so sometimes you use your passcode, sometimes you use your finger, but we're never in a situation where we're using both in combination. And so since it's an alternative, that allows you to use the passcode less frequently, or in our case, since it's an alternative, that allows you to use your master password less frequently. We don't have any data from our users other than what they explicitly come and report to us, but a lot of people have said, hey, now that I've got this Touch ID unlock for many cases, I can use a

stronger, better master password. But of course those are the users we hear from; we have no data beyond that. Time's up. Now it's time for lunch. We'll be back here in the room afterwards. next big rock is kolonko four months of research to your. Enjoy lunch. So, welcome back.