← All talks

Elliptic Curve Cryptography for those who are afraid of maths

BSides London28:3755K viewsPublished 2015-07Watch on YouTube ↗
Speakers
Tags
CategoryTechnical
DifficultyIntro
StyleTalk
About this talk
Elliptic Curve Cryptography (ECC) is hot. Far better scalable than traditional encryption, more and more data and networks are being protected using ECC. Not many people know the gory details of ECC though, which given its increasing prevalence is a very bad thing. In this presentation I will turn all members of the audience into ECC experts who will be able to implement the relevant algorithms and also audit existing implementations to find weaknesses or backdoors. Actually, I won't. To fully understand ECC to a point where you could use it in practice, you would need to spend years inside university lecture rooms to study number theory, geometry and software engineering. And then you can probably still be fooled by a backdoored implementation. What I will do, however, is explain the basics of ECC. I'll skip over the gory maths (it will help if you can add up, but that's about the extent of it) and explain how this funny thing referred to as "point addition on curves" can be used to exchange a secret code between two entities over a public connection. I will also explain how the infamous backdoor in Dual_EC_DRGB (a random number generator that uses the same kind of maths) worked. At the end of the presentation, you'll still not be able to find such backdoors yourselves and you probably realise you never will. But you will be able to understand articles about ECC a little better. And, hopefully, you will be convinced it is important that we educate more people to become ECC-experts.
Show transcript [en]

Hello. Oh. Oh, hi. We're we started. Thanks. Um, okay. That's me. Um, 15 years ago when most of you would probably probably were already quite skilled hackers or security researchers, my dream was to become a mathematician. Um I was studying mathematics at a university in the Netherlands and following lectures on among other things elliptic curves which I thought then and still think is one of the most beautiful areas in mathematics. Now life doesn't always go as you plan and in 2015 I'm not a mathematician. I work in IT security, but these elliptic curves, they do play a role in security, namely in elliptic curve cryptography, which means that I'm really excited to be talking about this because it does

feel like like meeting a long-lost friend. Uh, and I'm also very honored that people voted for this talk. So, thanks for that. Um, before I start, a few disclaimers. Uh, firstly, you're not going to learn anything in this talk that you can actually use. Um, cryptography is hard. Elliptical cryptography is not necessarily harder because that's subjective, but it requires one or two years extra of mathematics, which you can't squeeze into a year. You definitely can't squeeze it into a 45minute lecture. If you were expecting to learn anything useful, you probably should go to one of the other rooms. I also don't want to stand here pretending I'm a cryptographer. Um, I I understand the math because I used to do

that. I could implement the algorithms but I'm a sloppy programmer would probably make a few mistakes. I wouldn't be able to find fault in other people's mistakes and I don't know all the details of the research um what to look for in in in uh implementations. And finally I'm really aim this talk is really aimed at people who don't have a much understanding of mathematics. Uh so some I skip over some of the gory details. Um, if if you happen to know a bit about lipic curves, you notice this and I apologize for that. I think it's it's more important that everyone just understands what's going on. Whenever you see or read something an read an introduction into elliptic

curves, they always start by saying, well, it's something that satisfies an equation y^2 = xq + a * x + b for sum a and b. And that's also a prime number because this is crypto and they're always prime numbers. I think this is confusing. This is completely irrelevant for just to have a basic understanding. But what is important to know is that there is a choice to be made. Different A, different B, different P give different elliptic curves. Some of which are better than others. So there's not just one elliptic curves. There are many an infinite number of elliptic curves. And one of the problems is which one to choose. What is more important is to know that

an elliptic curve is something that looks kind of like this. And you see the uh see the curve, you see the x and the y axis at the background and on this curve there are a very large number of points and it's these points that play a role. Now computers can easily represent these points by their coordinates or just by numbers and and in this talk I will assume that points are numbers and for this this is more or less true. That's one of these details matter a lot because that's where you could make mistakes but I'll just assume that a point is represented by and represents a number. The most important property or one of

the most important properties of elliptic curves is if you take two points on the curve P and Q and you take the line through P and Q there's always one line the one such line then there is a unique third point on the curve on this line and a second important property is that the curve is as you can see symmetric in the horizontal axis the x-axis so you can take the vertical line through this and um you get another point on the curve and this point we call P plus Q. I can't emphasize enough that this is not supposed to make any sense. This is just a definition of you define P plus Q this way. If you're studying

mathematics and you learn about elliptic curves for the first time which would make you a minority because it's within mathematics it is a bit of a niche subject. If you would learn this it wouldn't make sense either. There is a bigger theory behind this but this is like even more advanced math and this is what we call point addition on elliptic curve which means for every two points on the curve for every two p and q wherever they are we can define a third point which we call p plus q that's another point on the curve and the second nice property is if you take a point p on the curve um well You may you may wonder some of you may

wonder what if you want to define a point P plus P. Well, you take there's a unique line through P that touches the curve. It doesn't go through it. It's called a tangent line uh to the curve at P. This line has a is also a unique extra point that goes through the curve uh that lies on the curve and on the line. And again you can take the vertical line through it and you find a point P plus P. And we actually call this 2 * P. And this is called point doubling. You can double a point. And you can you can combine it. You can get 2 P + P. Again take the line through it. Take the

vertical line. You get a point 2 P + P. And as you may have guessed, this is called 3 * P. And you're going to get four times P and fifth time five times P and six times P etc. This is called multiplication or integer multiplication. You take a point and a number and you take the number n you take n times that point for whatever n is. And this is the algorithm computers can implement it and computers are very good at that. So yeah. So um we can add points to each other. Take points P and Q. We define we define the point P plus Q. We can multiply points by an integer. And it happens and

you don't have to worry about this but this all satisfies nice properties. If you add P uh P plus Q is Q plus P etc. Um this this means that the points form an ailion group. You don't have to know what it means. You don't even have to remember this. But if you're a mathematician or a cryptographer, this is extremely exciting and mathematicians don't get excited very much. Now, it happens that multiplication by integers is very fast. So, if you get a point P and you want to get the point 100 times P, then the algorithm I just showed which you can implement in a computer, uh you just add P 99 times to this point

P. you get P 2 P 3 P 4 P until you get 100 times P. However, computers can do this much faster and as follows you take P, you double it, you get 2 * P. Next step is you add P to it 3 * P. Double it again, six * P, double it again, 12 * P, double it again, 24 * P. Then you add P to it to get 25 P * P. Then you double it again 50 times P and double it again 100 times P. So these are eight steps rather than 99. If you start to multiply by integers of 50 digits or more and this is what happens if you do elliptical

cryptography the this is extremely fast. I mean eight or 99 from computer doesn't make a difference but we're really talking about many many orders of magnitude. However, the other way around which I call but isn't officially called division u is very slow. So imagine you have two points on the curve p and q and you know that q is n * p for some number n. It could be 100, could be a million, could be 17 quintilion. You don't know really the best way almost the best way to find this number n is to start with p then get two * p three * p four * p until you reach you reach the point

that's q and then you know the number n this is an extremely slow process and as I mentioned if n has 50 digits um this is just undoable for a computer even for a fast computer even for the kind of computers that I have in a data and renew and this is called the discrete logarithm problem for elliptic curves and it's this that is the basic basis of of elliptic curve cryptography the fact that multiplication is fast but division or taking a logarithm is it's officially called is very slow so the most important or probably the most important uh kind of elliptic curve cryptography is the elliptic curve diffy helman algorithm ECDH and sometimes elliptic ECDH E where the E is for

ephemeral but that's irrelevant for now. It's used in the following way or or to to get to solve the following problem. Two entities, let's call them Alice and Bob because we're doing crypto. They want to agree on a secret key over public channel. For example, Alice is a web server. is a web browser and they want to agree on a secret key so that they can encrypt their connection. There are very fast algorithms like AES where as long as you have agreed on a secret key then you can encrypt the whole connection very quickly very fast in real time. However, how do you agree? Uh because before you start encrypting nothing is encrypted and anyone can read anything.

So the algorithm is as follows. First they agree on elliptic curve to use and as I said there are many different curves and the base point P on this curve. Alice then chooses a large random number A. She doesn't tell anyone about that. She just keeps it to herself. And Bob chooses a large random number B, lowerase B. Now Ellis computes a * b and even if a is very large we've seen or I've shown or at least I've told you that this is something a computer her computer can do very fast and she shares this number sorry this point which is kind of a number a * b with Bob over a public channel she

doesn't care if anyone can read it Bob does the same with his secret number he computes b * p b he keeps secret but he shares the resulting point SL number B * B with Alice over a public channel. Anyone can read it. Now Alice has given B has been given B * B. She has the secret number A and she computes another point A times this point B * B which is another point on the curve. Bob computes B * A * B. So B times the point that Alice has given him over the public channel using his secret number B. And they get a secret key because A * B * B is B * A * B. So they

have agreed on a shared secret number and the the discrete logarithm problem means that no one can crack this even if they can read even if they know what curve is used if they know the point B if they have seen a * b and have seen b * b you can actually see this in wireshark um this is a wireshark session from firefox on debian linux connecting to the website of uh Bside London and over HTTPS and why actually shows it. So the browser tells the server well I know a number of cyber suits. Cyber suits is is a combin is a series of algorithms because in TLS or in HTTPS there are you don't you need multiple

encryption algorithms for different things. Um and the first six that are listed they say ECDH or ECD which is the algorithm I just explained. So this the client the browser tells the server well I support these 11 algorithms in this order of preference and the client also says these three curves that are publicly known curves that defined in standards these are the curves that I uh that I support. So if we're going to do elliptic curve defy helman use one of these three curves and the browser responds sorry the the server web server responds and they say okay one of these algorithms indeed using elliptic curve defy helman uh and a number of other things uh let's go and

use that and then the server also um shares um uh uh Yeah, the the server sorry um I should have pointed out here when the browser tells the server three curves that it supports they have numbers and one of them is hex 0017 and the server tells the client okay 0017 that's the curve we're going to use that corresponds to NIST piece 256 which is a curve and most of the other blue stuff what you see here these are the bits of the this is a number is actually the point on the curve that they're going to use. No, sorry, this is not this is the uh the the B * B uh that's shared in a

public the public point and likewise um this is a * B for what the client tells the server. Um I guess this is a bit confusing to which I apologize for. But my my main point I'm making here is you can actually see this happen in Wireshark. Uh Wireshark more or less tells you what what's happening. You can actually follow it. You don't have to understand the crypto the math behind it or you don't even know all the standards. But wireshock tells you all this. So okay, here's my here's my point. And now they have agreed on a public a public on on a on a shared number. And you can you can't see this in Wireshark

because that's the whole point. I mean wireshark can see what wireshark does a man in the middle but the whole point of ec is that you can agree on a secret number secret key while someone is listening on the middle in the middle. So what could go wrong and there are a number of things that can go wrong um by by choosing a wrong curve. Um, as I said, there's a choice to be made for the curve. And there are good curves and bad curves and and good points and bad points. Um, this is not a realistic example. This is something you can easily avoid. But just to give you an idea what what could go

wrong in theory, what if there's a loop? What if it happens that if you take this point P and you you add it to itself, etc., and 101 times P is again the point P. I mean, it could happen. This means that a th0002 * p is 2 * p and 103 * p is three * p and a million and 10 and1 * p is p again there's only a thousand different values of p. The algorithm still works, but someone trying to crack it has only to try only has to try a thousand different values for the uh points that I have to solve, which which is trivial for a computer. And again, these kind of loops, they're

easy to avoid. Uh they don't happen, but this is the kind of thing that could at least in theory go wrong. And that's why you have to be careful. So leaps loops can be avoided but there are a large there's a large number of known weaknesses curves you should avoid and possibly there are unknown weaknesses things that no one knows curves that no one knows are weak because no one knows the weakness or curves that some secret agency knows are weak but the rest of us don't. So I showed in this why session that we were using a curve called mist p256. It's defined as follows why by by this equation with a very large number. This

is the kind of large numbers you see in in in crypto. That's nothing strange. So this what it doesn't refer to the fact that this is a large number or number that looks random. That's the whole point. But there is no good explanation of why NIST chose this number this curve which is a bit worrying. I don't think there's a back door or there's a secret um algorithm, secret um weakness that NIST or their buddies at the National Security Agency know about. But it's possible and I think we should avoid this kind of curves and it's kind of worrying that this happens in in between most elliptic curves. uh crypto when it's used uh to secure HTTPS uses this

kind of curves and it's kind of worrying that no one wonders about this but people aren't apart from some cryp people on on ITF mailing list people aren't really saying hold on we should choose curves that uh where we can explain all the where we so we we should use curves where we can explain all the numbers okay um there's another application of elliptic curve crypto that I think is interesting and random number generators. Um, as you've I've mentioned the word random a few times. Alice and Bob needed to choose random numbers. People need to choose or computers need to choose random numbers all the time and they need to be unpredictable and truly random. And that's hard because by

nature computers are good at doing predictable things. Thankfully computers have a bit of randomness. um things like how long it takes to write things to a a hard disk, how long uh movements of the mouse, but that's never enough. Computers need much more randomness than that. So they use something called a random number generator, which is something that takes as input a truly random seat. So something generated by moving a mouse movement, hard disk, read times, etc. Then it inputs it into some algorithm and some shaking up happens and it gives an output a number. Um and sometimes there's a white blue and and uh a red sometimes part of the output is thrown away and in this case the the blue

output is is the output of the random number generator whereas what was the internal state is shaken up again and it's another output and another output and I really imagine this where you input a number and then there's some kind of churning and you churn out random numbers all the time and um this is what happens uh all the time in in random numbers generators. This is how they work at a very basic level. Um I mentioned the discrete logarithm problem which means that if you uh have a a point n * p that you can't find the number n. So that's kind of useful for random numbers because you take the number n which is say your random seat.

You take n * p which is a new point slashn number. I said we could identify points and numbers which is which looks really random. So you can you can make an algorithm where you start with a seed n0 you take n0 * p this is a number n1 you take n1 * p you get the number n2 etc. And the output of the random number generator is it's just a series of n1 n2 etc. So you start with a random seat n0 using a point p on a on an elliptic curve and you get a nice random number generator. Well there's one problem here. It is reasonable to assume that an adversary can read read the output. Say

they can read n1. Uh that's not a bug. That's a feature of many random number generators. Someone uh random numbers are often shared. The problem is if someone can read N1, they shouldn't be able to predict N2, N3, etc. However, of course, if they can read N1, assuming this is a public algorithm, which algorithm should be in crypto because we want things to be open, they can easily find the num uh the number n2, N3, etc. So, this is not a good random number generator. But you can modify this a bit. And we take an elliptic curve with two points um on it P and Q two different points. Again we take a random seat

uh this is a real algorithm. So it's it's defined as a 32 byt seat some random number a truly random number called s0 in this this standard for some reason. And in again you take S0 * P which is another number SL point S1. And what you then do is you this number SL point S1 you multiply the point Q by this. You get another number point 32 bytes. You take the last 30 bytes. You ignore the first two bytes. And this is an output of the random number generator. And again with this s1 you multiply it this times uh you multiply p with s1 get s2 you get s2 * q you get another

random um another random output etc. And because um knowing s1 * q doesn't give you s1 that is the the discrete logarithm problem. This is a good algorithm. This is a good random number generator. Uh and I should point out that this and the next slide ideas the idea of this or or the expo the way to explain this is uh copied from or borrowed from Dan Bernstein Nadia Hangar and Tanya Langa at a NCSC press conference last year. Now we had this two points P and Q on on the same curve and there's a for a fact um this is a fact you should just this is a known fact about this this curve

about many curves in general there is a large number D so that P equals D * Q um but you can't compute it because that's a that's the discrete logarithm problem now let's let's show the algorithm And uh this is just the same as the previous slide. And also note that there are only two to the^ 16 possibilities. So if if someone sees the blue uh the blue bit the the 30 bytes as output and again that's a feature of random number generator just sometimes see the output then there are only two two 65,000 possibilities to get the whole uh output of S1* Q and that's actually very easy to compute because this needs to be a point on the curve

and it's it's trivial to given the blue bit it's trivial to find the red bit and and let's call this this number R1 this this number which again can be easily computed. Now imagine someone knows this number D. Imagine someone knows the secret number D. Someone somehow had access to it. And again pointing out you can't compute it. It's impossible to compute. Now my claim is if you take R1 which is publicly known and this D which secret but except for to this person or entity D * R1 is the same as S2. You can believe me that it's true but I'll I can even show it to you. um S1 * P which is S2 is the same which is by

definition S2 is the same as S1 * D * Q combining the two yellow uh yellow bits because P is equals D * Q because they work nice you can change the order of D and S1 so it's D * S1 * Q is D * R1 so I'll just show you that the the red arrow indeed works so if someone knows this number secret number D they can crack the algorithm. This is a yeah this is a big problem. So the question the million dollar question or or the $10 million question that's an in joke is does anyone know this number D? Well, this is a as I said, this is a standard. This is defined, you can read

it. It's defined by NIST. And helpfully on the first page of the standard, it says the following acknowledgement. NIST great gratefully acknowledges that uh and appreciates contributions by Mike Bole and Mary Bash from NSA. Oops. As many of you will have re realized by now, this is the the infamous backdoor in July the RBG. the random number generator and and and this is just not just a theoretical problem. This random number generator was through some social engineering and $10 million uh made a standard in um uh RSA uh crypto libraries RSA as in the company not RSA as in the algorithm. So that's um that's a problem. uh and and I think we all assume that

the NSA does indeed know the number D. Right? Conclusion. Um I haven't really explained this but elliptic curve cryptography is is important because we can do it much smaller keys and that that's the biggest the main advantage. um keys of 256 bits give more or less the same security using elliptic curve cryptography as 372 bits RSA and um this also um they don't grow as fast which means um you can do shorter keys algorithms are faster this does make a difference as cryptos being widely deployed there's one big weakness in elliptic curve cryptography I think and I don't think people realize this enough it uses complicated math. Um it's great it's fascinating it's cool but the fact that it uses complicated

math doesn't make it stronger. Ideally you would have an algorithm that uses very simple math that anyone who could do a bit of programming could understand and not something where we rely on on a very small group of people who really understand this hoping that they're on on on our side etc. Um and and again I'm not one of the people who understand this well enough to find algorithms to find back doors etc. Um I don't think the group of people who understand is is too small. I think there are enough people but it would be great if there were more. So yeah um tell your little sister or brother to study math and and

learn about elliptic curve crypto. Um that's it. That's that's me. That's me on Twitter. If you didn't like it you can unfollow me.