
Thank you, thank you. So: breaking historical ciphertexts with modern means. My name is Elonka, I am a game developer, a crypto authority and a book author, and my website is elonka.com. And hello, I'm Klaus Schmeh. I'm a German computer scientist and crypto specialist, and I'm interested in the history of cryptography. Together with Elonka I've written a couple of books, such as this one, Codebreaking: A Practical Guide. So, introducing this: there are many, many thousands of old ciphertexts. Some have been broken and some have not, and the question is, can they be broken with modern means? Here's an example of one: this is a text encrypted by Emperor Ferdinand III from the 17th
century, and here is one from North America, also from the 17th century. Here is the oldest encrypted thing we know about. This is from 1300 BC and it's encrypted cuneiform: someone was encrypting the recipe for a dye. Okay, so we are talking about cryptanalysis today, and I'm sure that at least some of you are familiar with cryptanalysis. Maybe you even have some expertise in breaking modern algorithms such as RSA, AES or DES. But cryptanalysis of modern algorithms is completely different from what we are covering today, because today we are talking about classical ciphers. So let's look at some of the
differences. In modern cryptanalysis the algorithm is usually known and the goal is to determine the key. We are dealing with sophisticated ciphers, the plaintext is known or can even be chosen, we have an infinite amount of ciphertext, and the ciphertext can be easily read. All this holds when you are dealing with modern cryptanalysis, but today we're talking about the classical case, and here everything is different: usually the algorithm is not known, the goal is not to determine the key but the plaintext, and so on. So we have a completely different situation. Or, to summarize this table:
modern cryptanalysis and historical codebreaking are two very different things.

When we're talking about cryptanalysis of old ciphers, let's start with the simplest case, the monoalphabetic substitution cipher. Monoalphabetic substitution simply means that you replace every letter of the alphabet with something else, using a fixed table like the one you can see here. In the simplest case there isn't even an alphabet change: every letter of the alphabet is replaced with another letter of the alphabet, and this is how the encryption works. There's also the slightly more complicated case where the alphabet changes. Again every letter of the alphabet is replaced with a symbol, but
in this case the symbol might be something that is not an ordinary letter or an ordinary number, or it can even be something you have never seen before. It's still some kind of symbol or glyph. From a codebreaker's view there's no big difference between these two cases, because you can always replace the symbols you don't know with symbols you know, and this doesn't change the encryption as such. Okay, here we have a Freemason document from the 19th century. In this case we have no alphabet change, because
as you see here, these are all ordinary letters. So it's a monoalphabetic substitution without alphabet change. On the next slide we see another Freemason document, also from the 19th century, but this time we have an alphabet change, because as you see here we have all these strange symbols. It's a so-called Freemason cipher, and these Freemason cipher letters look different from the letters we know. But as I said, it's a substitution in both cases, so from a codebreaker's view it's not much of a difference. Now the question is, how can we break such a simple monoalphabetic substitution cipher? Most of you certainly know how it works. Of course there are several
methods, but the most obvious one is frequency analysis. If you know the language that was used, you know the letter frequencies. For example, in English, as you can see here, E is the most frequent letter with a frequency of almost 13%, followed by T, A, O and so on, and this helps when we want to break a message. I can show you an example here. This is an encrypted text from the 17th century, the so-called Winthrop cryptogram. It was created by an alchemist, so it certainly makes sense to break it, because maybe it will tell us how to make gold. But before that we need to know how to break it.
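To make the object of the attack concrete, here is a minimal Python sketch of the monoalphabetic substitution described above: one fixed table, applied letter by letter. The function names and the key `CIPHER_ALPHABET` are hypothetical and chosen only for illustration; they do not come from the talk or from any historical document. For a cipher with an alphabet change, the table values would simply be other symbols instead of letters.

```python
import string

ALPHABET = string.ascii_uppercase

def make_table(cipher_alphabet):
    """Map each plaintext letter to exactly one fixed ciphertext letter."""
    if sorted(cipher_alphabet) != list(ALPHABET):
        raise ValueError("cipher alphabet must be a permutation of A-Z")
    return dict(zip(ALPHABET, cipher_alphabet))

def substitute(text, table):
    """Apply the table letter by letter; characters not in the table are kept."""
    return "".join(table.get(ch, ch) for ch in text.upper())

def invert(table):
    """Swap the roles of plaintext and ciphertext letters to get the decryption table."""
    return {cipher: plain for plain, cipher in table.items()}

# Hypothetical key, used only for this illustration.
CIPHER_ALPHABET = "QWERTYUIOPASDFGHJKLZXCVBNM"

table = make_table(CIPHER_ALPHABET)
ciphertext = substitute("TO BE OR NOT TO BE", table)
recovered = substitute(ciphertext, invert(table))
print(ciphertext)  # ZG WT GK FGZ ZG WT
print(recovered)   # TO BE OR NOT TO BE
```

Because every plaintext letter always maps to the same ciphertext letter, the frequency profile of the language survives encryption with relabeled letters, which is what frequency analysis exploits.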
Let's start with frequency analysis. We see that this letter that looks a little bit like an eight is the most frequent one, so it could be an E. The second most frequent one is this one that looks a little bit like a J, and so on. If we use the frequencies and do some trial and error, we can pretty soon solve this mystery. It turns out that the letter that looks like an eight is an E, and the second most frequent one is in this case an I. This is not always the case, so as I said it requires some trial and error, but within 15 minutes or
so it's usually possible to break such a system without computer support. Now let's look at what is written in here. It's an alchemical text: "take excellently well purified Bas Sal, dissolve or molted" and so on. Usually the content of such a message is not really interesting for the codebreaker. The codebreaker is interested in how the encryption works and not in what is encrypted, and I'm pretty sure that it won't help us to produce gold. Okay, as I mentioned, something like this can be done by hand, but of course today it makes sense to use a computer program for this purpose.
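The frequency analysis just described is easy to sketch in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the ciphertext is assumed to be already written in ordinary letters, and `ENGLISH_BY_FREQUENCY` is one commonly quoted ordering of English letters (orderings differ slightly between corpora). The resulting mapping is a first guess to be refined by trial and error, exactly as in the manual method.

```python
from collections import Counter

# English letters from most to least frequent (approximate, corpus-dependent).
ENGLISH_BY_FREQUENCY = "ETAOINSHRDLCUMWFGYPBVKJXQZ"

def letter_frequencies(ciphertext):
    """Return (symbol, relative frequency) pairs, most frequent first."""
    letters = [ch for ch in ciphertext.upper() if ch.isalpha()]
    total = len(letters)
    return [(sym, count / total) for sym, count in Counter(letters).most_common()]

def first_guess(ciphertext):
    """Pair the n-th most frequent ciphertext letter with the n-th most frequent English letter."""
    ranked = [sym for sym, _ in letter_frequencies(ciphertext)]
    return dict(zip(ranked, ENGLISH_BY_FREQUENCY))

sample = "XXXYYZ"
print(letter_frequencies(sample)[0])  # ('X', 0.5)
print(first_guess(sample))            # {'X': 'E', 'Y': 'T', 'Z': 'A'}
```

On short texts the ranking is unreliable, which is why the second most frequent symbol in the example above turned out to be an I rather than a T.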
In fact, frequency analysis with a computer is not very difficult, because there are quite a few programs around, for example Multi-Dec by Christian Baumann from Austria, or the Rumkin Cipher Tools by Tyler Akins, or CryptoCrack, created by an anonymous developer group. All of this is available for free on the internet. In my view the best tool for cryptanalysis of old ciphers is this one here, CrypTool. It's open-source software created in Germany. It has been around for 25 years, and by now it's a very good tool with a lot of crypto and cryptanalysis functions, and I use it a lot
when I want to do codebreaking work. CrypTool was developed by a team of over 50 people. Some of them are very active in the scene, so I know them well, and as I said, it's a tool I use quite often. Now let's look at another example of a monoalphabetic substitution cipher. This one is the so-called Baring-Gould cryptogram: a British man named Sabine Baring-Gould wrote a book in the 19th century, and this book contains a ciphertext that is usually referred to as the Baring-Gould cryptogram. This is what it looks like. It's pretty short, only two lines, and now
let's try to break it. First of all, you see that we have an alphabet change here, because all these are symbols or numbers and not ordinary letters. So the first step is usually to create a transcript. A transcript simply means we replace every symbol from the ciphertext with a letter from the ordinary alphabet. How this is done doesn't matter, it just needs to be consistent. For example, you can see there are two fours in this picture, and both of them should be replaced by the same letter, otherwise you get in trouble when you do text statistics. In this case we use a C to replace the
four. Then we go on: there are two plus signs, and again we should use the same symbol to replace these, the H in this case, and so on. What we have now is the so-called transcript. It's the same encrypted text, but now instead of an alphabet change we have ordinary letters. Now we could do a frequency analysis, but this probably won't help because the text is too short, so we need to apply another method. Today this is no problem anymore, because in the age of computers we have a very good method named hill climbing.

Well said. So now that we've
converted it into letters, which is something that our brain can more easily munch, we do the following. We generate a random key, which gives us a substitution table, and then we decrypt with it, even though the result is going to come out somewhat random. Then we rate the correctness; I'll come back to that later. Now we create a new key that is slightly changed, again at random, but we keep a copy of the old key. We decrypt with the new key and again rate the correctness, and we see: has the
correctness increased? If so, we keep the new key and get rid of the old key. If not, we restore the old key. Then we repeat the steps. So what we get here is that we climb a hill, where we're steadily looking for better
correctness, and often when we get to the top, the best correctness, we will have the solution. Not always: sometimes we get to something called a local maximum, where it looks like we've gotten to the top but we haven't gotten to the real top. There are ways of dealing with this. There is a method called simulated annealing. Annealing is a term from metallurgy, where they heat metal and then cool it in order to affect the properties of the metal. With simulated annealing you have a program where you're adjusting what's called the heat, and
maybe you're going to have a fast or a heavy heat, or maybe you're going to make it cooler, and what you're trying to do is get it off of that local maximum and all the way to the real top. We could talk a lot longer about simulated annealing, but that's outside the scope of this talk. So, correctness: how are we rating the correctness? In this particular case we check the letter frequencies. We look at the candidate (remember, we had that random candidate) and at what kind of letters are in it. If it has frequent English letters that are
frequent in the candidate, then that would be a high score, and if rare letters are rare in the candidate, that's also a high score. But if in the plaintext candidate the rare letters are common and the common letters are rare, then that would be a low score. Okay, so looking at the Baring-Gould cryptogram, this is one step: we have our substitution table, our plaintext candidate, the resulting score and the step number. We're only looking at successful steps in this case. Then we go to the next step, and so forth. All right, so here at the penultimate step
we have a plaintext candidate of "a MD in the hand is worth two in the mush". A human brain kind of jumps on that, but the computer hasn't quite gotten it. Then it gets to the next step, "a bird in the hand is worth two in the bush", which is the correct plaintext for this particular cryptogram. Many other monoalphabetic substitutions, and more complex substitutions as well, have been solved this way. All right, now we're going to go to a different system.

Yes, now let's look at the so-called turning grille encryption. I think most of you have seen this before. These are original turning grilles from the 19th or even the
18th century. I can show on this slide how this kind of encryption works. First of all we need a plaintext, "to be or not to be", and because this one is too short I have added three X's for padding. Now we write this plaintext into the stencil, starting with the first four letters. Then we turn the stencil by 90° and the next four letters are inserted, and so on. I think this is a well-known kind of encryption. It's a transposition cipher, which means nothing is replaced, only the order of the letters is changed. What you can see here is the ciphertext. I hope you can
see it even from the back row. This kind of encryption was quite popular, especially in the 19th century or even before. Here you see a few examples from the Netherlands, from England and from Italy. Now the question is, can an encryption of this kind be broken? The answer is yes. There are manual methods, so already in the 19th century it was possible to break this kind of encryption, but it's very laborious. So the question is, can we use a computer-based method for this task? The answer is yes, we can use hill climbing again. It's clear that hill climbing
can be used, but it wasn't really clear how well this works, at least not until 2017, or at least I didn't find anything in the literature. So what I did is I created a challenge. I took a text, encrypted it with a turning grille, a pretty large one, 20 by 20, published it on my blog and told people to solve it. I was quite excited to see how long this would take, and the answer is, it was pretty fast. Armin Krauß, a German codebreaker, only needed a few hours to break
this encryption with hill climbing. So apparently, if you have the right tools, and of course if you're as good a codebreaker as Armin, it's quite easy to break. The algorithm here is exactly the same as the one Elonka explained. The only difference is that we need a different scoring function, because letter frequencies are not helpful here. As I said, we are dealing with a transposition cipher: the order of the letters changes but nothing else, so the frequency of the letters stays the same. What we can use instead is letter pairs, so-called digraphs, because the digraphs change
of course, and the digraph frequencies are characteristic of every language. For example, in English something like TH or EN or ER is quite frequent, while rare letter pairs are things like QR or CX or PF. So a frequent letter pair gets a high score, otherwise a low score, and this is exactly how Armin Krauß, this German codebreaker, broke the turning grille challenge. As I said, it took him only a few hours, including all the programming. This is the solution. The plaintext is not relevant here, because it was just a challenge. The bottom line is that even messages encrypted with a very
large turning grille can be broken with hill climbing today. So it's certainly not a good idea to use this method for messages that should be secure, and there are better methods available today anyway. Okay, so much about turning grille encryption. The next chapter is covered by Elonka again.

Nomenclators. First I wanted to say I appreciate that everyone is in here, because I know how hot it is, especially with the masks. I see the sweat, and it's like everyone's kind of melting out of their chairs, so thank you for sticking with this. So, nomenclators. Klaus and I actually had quite the argument,
for our book, on whether to include nomenclators. He convinced me; I have religion now. So what's a nomenclator? It's sort of a monoalphabetic substitution that also includes complete words, often proper names. The term nomenclator comes from the person who, say at an event, would call out the names of people as they were arriving, and it's been around a long time. Here is an example, not an actual nomenclator. Say we're going to encrypt the phrase "we'll come from London to Berlin" using this table on the left-hand side. Some of the words we're just going to encrypt letter by
letter, and sometimes, for example for the word "from", we just put two digits, because that's in the table. Then London, same thing; "to" we do letter by letter, and Berlin is just two digits. Now here's an actual nomenclator table from the 17th century, and there are several different parts to it. Part of it is just letters, and you can see there are actually three different choices for each letter in this table, so it also has what's called a homophonic option. There are also these places where you take each letter followed by each of the possible vowels, and that can be one of a couple of different numbers. And then you have the
actual names in this table, and for those you've got three digits. So how do you solve a nomenclator message? The simplest method is to find the table. However, that's not always possible, so there are different ways of doing it. You can derive the table from other messages, maybe ones that have been solved.

I'm going to talk very briefly about the Zodiac Killer. This is a serial killer in Northern California who sent encrypted messages to the press, saying, if you can solve these it'll give you information about who I am. One of the messages was solved very quickly by a husband-and-wife team, Donald and Betty
Harden. He had an interest in ciphers, but she was, I think, the more intuitive of the two, and between the two of them they solved one of the messages, called the Z408. We call it that because there were 408 symbols in it. There were other messages as well. You can see here from the 408 that it was also what we call homophonic, because each letter could potentially have multiple different symbols. Then there was the Z340, which became one of the most famous unsolved codes in the world. It remained unsolved for over 50 years, and then it was finally solved by a three-man team
using modern means, computers, including hill climbing. They published their solution in December 2020, and it was really interesting because they were on different continents: one from the United States, one from Australia and one from Belgium, Jarl Van Eycke. I actually just met him a few weeks ago. Klaus and I refer to this as one of the greatest successes in the history of non-military cryptanalysis. So I'm going to talk very briefly about how the 340 came apart. If you take the first 20 letters of it and come down diagonally, sort of like a knight's move in chess, and then you make a substitution table, again a homophonic table, you get the
plaintext, and this is "I hope you are having a lot...". I'm not going to read you the whole plaintext, because this guy was obviously not all there, but they did clearly solve the message. That leaves two messages, the Z32 and the Z13, which are not solved yet. Some people say they will never be solved because they're too short. Others say, well, maybe the solution is going to involve some combination of systems from the others. And some say that maybe the potential solution of the Z13 is Alfred E. Neuman, a character you'll know if you're a fan of Mad Magazine. So let's go over to a few other unsolved
ciphertexts.

Yes, so this is the best-known unsolved ciphertext of all, the so-called Voynich manuscript. You might have heard of it. It's an encrypted book from the 15th century, handwritten and hand-drawn, and it has never been solved, so it's not possible to read it. It's written in a script that is otherwise unknown. There are many pictures in it, and the pictures usually can't be identified: there are a lot of plants in it, and it's not really clear what kind of plants they are supposed to depict. This is clearly one, or maybe
in my view it's the most important unsolved crypto mystery in the world. But there are others, and there's especially one that Elonka is an expert in.

So my favorite is one that's called Kryptos. This is at the center of CIA headquarters in Langley, Virginia, and I've been toying with this one for decades at this point. Some people say that this is one of the most famous unsolved codes in the world. I'm not going to go into a great deal of detail on it, but if you look at the plates that are on it, we have the ciphers that we call one, two,
three and four, and it's four which is still unsolved. It's 97 characters, there at the very bottom. Now the artist, Jim Sanborn, has actually given us some clues towards solving part four. In 2010 he said, okay, at this location, at the 64th character, we have the plaintext word BERLIN. Then, four years later, he gave us the word right after it, CLOCK. In January 2020 he gave us the word NORTHEAST, and then the pandemic hit and he kind of wanted to stir things up, so he gave another clue, which is the word EAST. So
here we have a sizable chunk of the plaintext from K4, and we still don't know what the whole thing says. There are a lot of theories on K4. All right, then we've got another thing that we've been working on. This is an encrypted postcard from 1873. It was sent to us by a man who found it in his family documents. He said it was from his great-great-grandfather, George Furlong, who was the owner of a football club in Luton, and it was a postcard that he sent to his sister. So we figured that it can't be that difficult, and we
have these things that are underlined, but again it's something that we've never been able to solve. And there are many, many more unsolved ciphertexts out there. A lot of computer systems have been used on these, but they remain unsolved, so research is ongoing. So, conclusion, very briefly: breaking historical ciphertexts is an active field of research. It's different from cryptanalysis of modern methods, because we don't have an infinite amount of ciphertext to work with, and we have to try and figure out the algorithm. The hottest technique out there right now is definitely hill climbing, and there are still many old ciphertexts left to solve. So, any
questions? All right, he's going to hand the microphone right there, I saw a hand go up.

I was going to ask if hill climbing works if multiple rounds of encryption have occurred.

Okay, so you're talking about superencryption. Will hill climbing work? It can. What have you heard?

Yes, well, this happens for example when one tries to break Enigma messages. An Enigma has a plugboard and then it has the rotors, and doing hill climbing on the whole system usually doesn't work, because there are too many variants. But something you can do is
do two hill climbing steps: first do hill climbing for the rotors, and then a second time for the plugboard, and this works. The difficulty here is the scoring function. If you have two different encryption steps, you need to tell whether a text is good or bad based on a text that is still encrypted, and this is really difficult. But depending on the system you're working on, it works, or at least there have been examples where this worked quite well.

Another way that superencryption and hill climbing might go together is that you might have something that doesn't have a typical graph, a
frequency analysis graph, for a language. So you go through hill climbing trying to find something that matches that graph, where you've got peaks and lows instead of everything just kind of even. Good question. More questions?
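For reference, the basic single-stage loop that both answers build on (random key, slightly changed key, keep the change only if the score improves) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the speakers' implementation: all names are hypothetical, the frequency values are rounded approximations for English, and the letter-frequency score is written here as a sum of log-frequencies. The `score` parameter is deliberately swappable, since the answers above single out the scoring function as the hard part.

```python
import math
import random
import string

ALPHABET = string.ascii_uppercase

# Approximate English letter frequencies in percent (rounded; an assumption).
ENGLISH_PERCENT = {
    "E": 12.7, "T": 9.1, "A": 8.2, "O": 7.5, "I": 7.0, "N": 6.7, "S": 6.3,
    "H": 6.1, "R": 6.0, "D": 4.3, "L": 4.0, "C": 2.8, "U": 2.8, "M": 2.4,
    "W": 2.4, "F": 2.2, "G": 2.0, "Y": 2.0, "P": 1.9, "B": 1.5, "V": 1.0,
    "K": 0.8, "J": 0.15, "X": 0.15, "Q": 0.1, "Z": 0.07,
}

def decrypt(ciphertext, key):
    """key[i] is the plaintext letter assumed for ciphertext letter ALPHABET[i]."""
    table = dict(zip(ALPHABET, key))
    return "".join(table.get(ch, ch) for ch in ciphertext)

def frequency_score(candidate):
    """Common letters raise the score, rare letters lower it."""
    return sum(math.log(ENGLISH_PERCENT[ch]) for ch in candidate if ch in ENGLISH_PERCENT)

def hill_climb(ciphertext, score=frequency_score, steps=2000, seed=0):
    rng = random.Random(seed)
    key = list(ALPHABET)
    rng.shuffle(key)                          # random starting key
    best = score(decrypt(ciphertext, key))
    for _ in range(steps):
        i, j = rng.sample(range(26), 2)       # slightly changed key: swap two entries
        key[i], key[j] = key[j], key[i]
        new = score(decrypt(ciphertext, key))
        if new > best:
            best = new                        # correctness increased: keep the new key
        else:
            key[i], key[j] = key[j], key[i]   # otherwise restore the old key
    return "".join(key), best

key, best = hill_climb("ZG WT GK FGZ ZG WT")
print(decrypt("ZG WT GK FGZ ZG WT", key), round(best, 2))
```

Single-letter scoring alone will rarely recover a real plaintext; it only pushes the candidate towards an English-like letter distribution. Practical solvers usually score with letter pairs or longer n-grams, and may add restarts or simulated annealing to escape local maxima.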
Are the techniques, such as frequency analysis, still effective if the languages have changed over time, on Old English for example?

Yeah, I mean, it helps if you know which language you're dealing with, like if you know you're dealing with English or French or German or Latin. And sometimes, again, you've got that kind of trial and error where you're looking for which kind of pattern it is.

What about
languages where you don't know the plaintext frequencies?

That would be really hard. You have to make a frequency table yourself; you're building it as you go.

Out of curiosity: if you want to learn more about this, how do you get into doing this kind of work? Any suggestions on where to start, any YouTube videos, websites or blogs?

There's a book up front there by Elonka, so that's a way into cryptanalysis. Simon Singh
also wrote a wonderful book called The Code Book. There is a website called MysteryTwister, where people upload either classical ciphers or ciphers they've created on their own, and you can see for each one how many people have solved it. So if you want something easy, go to something that's got thousands of solves, and if you want a challenge, pick one with zero solves or one solve. More questions?

Do you know of any initiatives using generative AI to help solve some of those?

Has AI, have large language models, been able to help solve these? In my opinion, no.
When OpenAI came out with ChatGPT I was like, oh, okay: part four of Kryptos, I know it's 97 letters, I know we've got BERLIN at the 64th character. And what I rapidly found out is that ChatGPT cannot count. People say, oh, that's not true. I'm like, yeah, it is true. I said, just give me a sentence that's 97 letters long. It can't do it. You ask it five times, you'll get five different sentences of different lengths. So, not yet, I think, is my answer so far.

Maybe I can add something. The answer is completely correct, not yet, but I know of research projects that try to
change this. At least in theory it is very well possible to do all this frequency analysis and all these other statistical tests with artificial intelligence. I guess this will work well in a couple of years, because that's usually the first step: you perform frequency analysis and a few other statistical tests, and then you draw conclusions, such as that it could be a substitution cipher or a transposition cipher. This is certainly work that can be done by artificial intelligence.

Do any of the techniques you've been mentioning work on one-time pad ciphers?

Well, if a one-time pad is used properly, no method at all helps, because it can't be
broken; it's 100% secure. One-time pads have been broken before, though, when the random sequence that is used is not really random, for example if it's repeated. I guess that things like these can be solved with the computer. It's probably difficult to do it with the methods we introduced, such as simulated annealing or hill climbing, because you don't really know what you're searching for; that's basically the problem. Maybe AI could help here, because you could analyze a stream and then draw conclusions. Hill climbing probably not, would be my guess. Or perhaps just the famous xkcd strip where
one person says to the other, "Blast, they are using a 4096-bit key, we're screwed", and then in the next picture it says, well, we'll just take this $5 wrench and hit somebody on the head, and then we will get the one-time pad. That's also a kind of cryptanalysis, but not the one we cover in our talk.

Oh yeah, you don't have anything about physical violence in your book.

Yes, maybe we should include that in the next edition. It's called rubber-hose cryptanalysis. Or just interrogation techniques; they are very efficient sometimes. Any more
questions? None? Okay, yeah.

Do you know if any of your cryptanalysis methods have been used to actually understand languages that are not necessarily encrypted, but that we don't know, sort of like the hieroglyphics that have been solved already, but some of the others?

Well, I have to admit that I don't know much about old languages. I know that there are certain relations between cryptanalysis and trying to read old languages, but I don't know anything about it. I'm a computer scientist, not a linguist, so I'm afraid I can't say much about this.

The main thing is
that with modern means we have computers that can handle large databases and share this information across multiple continents, and that's definitely helpful. In terms of hill climbing, I haven't heard of anything. But for example with nomenclator tables, often we have an encrypted message: someone's gone into an archive, they have found an encrypted message, and we don't know what system it is, we don't know if it's a nomenclator or what. So the next step is, well, what do we do? We look at other messages near it. Okay, so those are nomenclators, maybe this is a nomenclator. Does this one use the tables that these other ones did? Maybe, maybe
not. There was a big discovery recently about messages written by Mary, Queen of Scots, where they didn't have the table, but they derived it from the messages that they had.

Okay, well, thank you Elonka and Klaus for coming along to PasswordsCon, and thank you to you. There are a couple of books up here for those interested; I'm pretty sure you can ask about them. We'll be back at 5:00 with Dwayne McDaniel doing the talk "Long Live Short-Lived Credentials: Auto-Rotating Secrets at Scale", so see you back at 5. Thank you.

For anyone who's going to DEF CON, we'll have two talks there as well.