
How Crypto Libraries Affect DoS Attacks via Diffie-Hellman Key Exchange

BSides Budapest · 2022 · 37:38 · 44 views · Published 2023-06 · Watch on YouTube ↗
About this talk
Szilárd Pfeiffer demonstrates how the D(HE)at vulnerability (CVE-2002-20001) exploits CPU-intensive Diffie-Hellman key exchange to overwhelm servers via denial-of-service attacks. The effectiveness varies dramatically across cryptographic library implementations—OpenSSL, GnuTLS, NSS, and others—and depends on key sizes, protocol versions (TLS 1.2/1.3, SSH, IPsec), and whether clients can force larger key negotiation. The talk includes live demos on TLS, SSH, and IPsec servers, revealing surprising anomalies in OpenSSL 1.1 behavior and practical mitigation strategies.
Original YouTube description
This presentation was held at the #BSidesBUD2022 IT security conference on 26th May 2022. Szilárd Pfeiffer - How Do Crypto Libraries Affect a DoS Attack? The Diffie-Hellman key exchange is affected by the D(HE)at vulnerability (CVE-2002-20001), a DoS attack forcing the server to compute the CPU-intensive part of the mechanism, overloading it seriously. Of course, the effectiveness of the attack depends on the key sizes, the cryptographic protocol used, and the server application, but it also highly depends on the cryptographic library implementation. There are significant differences between the crypto libraries in what bandwidth is sufficient to consume a whole CPU core on a server. I will demo how a server that implements the TLS, SSH, or IPsec protocol can be overloaded and how the peculiarities of the crypto implementation influence the vulnerability. https://bsidesbud.com All rights reserved. #BSidesBUD2022 #BSides #Crypto
Transcript [en]

Thank you. My name is Szilárd Pfeiffer. I graduated as an electrical engineer almost 20 years ago, and since then I have been working in IT security, so I am not a typical electrical engineer but an IT guy. I started my career as a software developer; now I am a security evangelist. As you can see on the slide, I work at Balasys, one of the few Hungarian IT security vendors. Earlier it was called Balabit and was the vendor of syslog-ng; now we are the vendor of the Zorp application-level firewall, which also has a GPL, open-source version. As you may know, this story started with my pet project,

a cryptographic settings analyzer called CryptoLyzer. Its functionality is similar to testssl.sh, sslscan, or SSLyze. During the implementation I realized that I would have to implement the cryptographic part of the handshake to get some information from the servers, but I wanted to bypass the cryptographic part somehow, because I am a lazy engineer and did not want to care about the cryptography. I also realized that it is possible to send seemingly correct messages to the servers, which are accepted and answered by them. The question arises: how did this lead to a denial-of-service attack? The answer is relatively simple. The

messages I mentioned are simple, short, and most of the time pre-computable, so they are suitable for causing significant load on the server side: the client does not need to calculate the messages connection by connection; they can be pre-computed once and sent repeatedly to the servers. In this presentation I would like to demonstrate how seriously the cryptographic library affects a denial-of-service attack like this. First of all, let's see what a key exchange is, what kinds of key exchange exist, why they are so important, and why they are vulnerable to a denial-of-service attack. All cryptographic protocols have a step during the handshake called the key exchange.

What is it? If parties want to communicate with each other in a secure manner, they need a key: a shared secret, which is practically some kind of random data that will be used during the communication to encrypt and decrypt messages in a symmetric way. This shared secret has to be exchanged without risking that a third party compromises it; this is the key exchange part, and there are key exchange algorithms that can do that work. Why are these key exchange algorithms vulnerable to a denial-of-service attack? The reason is the fact that they are CPU-intensive, depending on the specific key exchange algorithm. They are also unauthenticated, meaning that in the cryptographic handshake the key exchange happens before the

authentication, so an unauthenticated client can force a server to do CPU-intensive operations. And the third thing is that the messages are most of the time pre-computable, so a client can pre-compute all the messages that need to be sent to the server to force it to do the CPU-intensive part of the handshake. What kinds of algorithms exist? First of all I have to mention the elliptic-curve version of the Diffie-Hellman key exchange, which is considered the most secure and most efficient key exchange nowadays. It is still possible to use it to perform a denial-of-service attack, but it requires significant throughput, so it is not worth the cost.

The second mentionable key exchange is RSA, which is an unbalanced key exchange algorithm, meaning that it requires much more CPU computation on the server side than on the client side. That makes it ideal for performing a denial-of-service attack, because the client needs to perform less CPU computation than the server. But I have to mention that the RSA key exchange is more efficient and less CPU-intensive than the original Diffie-Hellman key exchange. The last one is the original Diffie-Hellman key exchange, which requires much more CPU computation than all the earlier-mentioned key exchange algorithms. This algorithm is balanced,

meaning that it requires the same amount of CPU computation on the client and the server side. That would make it ineffective for a denial-of-service attack, unless you can somehow decrease the necessary CPU computation on the client side. How can that be done? To understand how we can decrease the CPU computation requirement on the client side, I have to explain the Diffie-Hellman key exchange, without going into the details. The algorithm is balanced, as I mentioned, so the parties do the same operations, but on different numbers, as you can see on the slide. The only CPU-intensive operation is the modular exponentiation, and both parties do it twice: when they compute

their own public key (this is the second step) and when they compute the shared secret (this is the fifth step). In this way the computational capacity requirement is the same on the client and the server side, which is not ideal if you want to perform a denial-of-service attack. In an ideal situation an attacker can force the victim to do a significant amount of CPU computation without doing the same on the client side, so the attacker needs to cheat this cryptographic protocol somehow. That is possible, because an attacker does not really want to do the key exchange; it just wants to force the server to do the CPU-intensive part of it.

The attacker can send whatever number to the server, and the server will accept it; that is the trick. First of all the attacker should analyze the server to get its Diffie-Hellman parameters, especially the prime number. Why is that important? Because the attacker should pick a number that is less than the prime, since the result of the modular exponentiation is guaranteed to be less than the prime, so such a number looks like a valid public key. After picking a number, the attacker can send it to the server, and the server will accept it, because it cannot distinguish it from a computed number. It is a

cheated number, just a randomly picked one, but the server cannot distinguish it from a computed number, so it will do the CPU-intensive part on its side. After that the attacker should receive the message from the server, to be sure that the server has already done the CPU-intensive computation. With this mechanism a malicious client can force the server to do the CPU-intensive computation on each connection without doing the same on the attacker's side. This is the D(HE)at attack, named after the fact that it can heat the CPU by forcing the Diffie-Hellman key exchange. It got the CVE number CVE-2002-20001.
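The trick described above can be sketched in a few lines of Python. This is an illustrative sketch, not the speaker's dheater tool: the prime is a small stand-in (real servers negotiate 2048-bit or larger primes), and the server is reduced to its two modular exponentiations.

```python
import secrets

# Illustrative stand-in parameters; a real server would use a 2048-bit+
# prime (e.g. from RFC 3526), which the attacker learns from the handshake.
p = 2**127 - 1   # a Mersenne prime acting as the server's DH prime
g = 2

def server_respond(client_public: int) -> int:
    """The CPU-intensive part the attacker triggers: the server performs
    modular exponentiation on an unvalidated client value."""
    b = secrets.randbelow(p - 2) + 1          # server's ephemeral secret
    server_public = pow(g, b, p)              # modexp #1: own public key
    shared_secret = pow(client_public, b, p)  # modexp #2: "shared" secret
    return server_public

# An honest client would compute pow(g, a, p) itself. The attacker just
# picks any fixed number below p once and replays it on every connection,
# skipping all client-side modular exponentiation.
bogus_public = 123456789 % p
response = server_respond(bogus_public)
# The server cannot distinguish the bogus value from a real public key,
# so it has already paid the CPU cost by the time it answers.
assert 0 < response < p
```

In the real attack the same pre-computed handshake bytes are replayed over TCP connection after connection; here the replay is reduced to a function call.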

The first part of the CVE number is a year, in this case 2002. This is not the year when the vulnerability was exploited for the first time, but the year when the vulnerability was first published. Usually this year number matches the report itself, but in this case there was a publication about the theoretical vulnerability back in 2002; anyway, I discovered the vulnerability independently of that publication. There is a scoring mechanism to prioritize vulnerabilities, called the Common Vulnerability Scoring System, or CVSS. The D(HE)at attack got 7.5 as a score, which is a relatively high number, but the question arises: why

is it that high? The attack complexity is low (it is very simple to force the server to do the CPU-intensive operation), and the attack requires no privileges at all and no user interaction. On the other hand it does not affect confidentiality or integrity, so you cannot break the encryption this way, and it does not affect the scope, so you cannot do privilege escalation using this attack. It is a typical denial-of-service attack, and practically a denial of service cannot get a higher score than 7.5. Anyway, a CVSS score is a way to measure a vulnerability's impact on a piece of software or a cryptographic protocol.

But it cannot measure the real-world impact, because that highly depends on how widespread the vulnerable software or cryptographic protocol is. So I decided to do some research to find out the prevalence of the Diffie-Hellman key exchange in the case of web servers. First of all I have to mention that the Diffie-Hellman key exchange is considered completely secure: OpenSSL offers it even at its highest security level, and Mozilla suggests it as part of its TLS configuration generator. As you can see on the slide, in the top 100 domains the support of the Diffie-Hellman key exchange is extremely low. The reason may be the fact that there is a

performance issue with the Diffie-Hellman key exchange, and the fact that it is possible to provide backward compatibility with older clients by using the RSA key exchange, which has no such performance issue. In the case of the top 10,000 domains the ratio of support is 25%, which is already significant, and in the case of the top 1 million domains the ratio is 47%, which can be considered high. In general we can say that there are more than 55 million HTTPS servers on the internet according to the Shodan statistics, and if we assume that the ratio is the same as in the case of the top 1

million domains, we can say that more than 10 million servers use the Diffie-Hellman key exchange nowadays. After the theoretical explanation, let's see a demo. I will use a DigitalOcean instance with four CPUs and eight gigabytes of memory, and a 2-kbit Diffie-Hellman key size.
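The demo setup itself is not shown in the transcript; as a hedged sketch of a comparable environment, 2048-bit ("2-kbit") Diffie-Hellman parameters for an Apache server could be generated like this (the file name and directive placement are my assumptions, not the speaker's exact configuration):

```shell
# Generate 2048-bit Diffie-Hellman parameters (this can take a while).
openssl dhparam -out dhparam2048.pem 2048

# Inspect them; the prime is exactly what an attacker needs to learn.
openssl dhparam -in dhparam2048.pem -text -noout | head -n 5

# Apache mod_ssl can then be pointed at the parameters, e.g.:
#   SSLOpenSSLConfCmd DHParameters /etc/ssl/dhparam2048.pem
```
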

I will use a Docker container which runs an Apache web server with the mentioned 2-kbit key size. I implemented an application called dheater, named after the attack, which is a proof-of-concept implementation of the vulnerability. Let's start it; not in this terminal, but in the other one.

Demo effect... okay, it has started, and as you can see there is 100% CPU usage. Let's measure the attack. I start the tcpdump, or I hope so. Okay.

Okay, it's starting; let's wait just a few seconds. One, two, three, four, five. Okay, let's see the result, if I can stop it. Okay, I stopped the attack, and let's see the result.

Hmm, yes, okay. It ran for 17, almost 18 seconds. You can see the incoming and outgoing data; the number of ClientHello messages was more than five thousand; all the public keys were unique; and, as you can see, 77 requests per second was enough to cause 100% CPU usage on one CPU core.
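A rough sanity check of the numbers above (my calculation, not the speaker's): if about 77 handshakes per second saturate one core, each handshake costs roughly 13 ms of CPU, most of it in the server's two 2048-bit modular exponentiations. Timing Python's generic bignum pow() with a stand-in 2048-bit modulus gives a machine-dependent ballpark:

```python
import secrets
import time

# Stand-in 2048-bit odd modulus; primality does not matter for the cost,
# only the operand size does.
p = (1 << 2048) - 1
base = secrets.randbits(2048) % p
exponent = secrets.randbits(2048)

rounds = 20
t0 = time.perf_counter()
for _ in range(rounds):
    pow(base, exponent, p)   # one server-side modular exponentiation
ms = (time.perf_counter() - t0) / rounds * 1000
print(f"~{ms:.2f} ms per 2048-bit modexp (two of these per handshake)")
```

Optimized C implementations in crypto libraries are considerably faster than Python's pow(), which is exactly why the measured requests-per-second figures differ so much between libraries.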

So, as you saw in the demo, a relatively low throughput was enough to cause 100% CPU usage. But we can question whether that is a high or a low throughput. I have to say it depends. If we consider the fact that Cloudflare mitigated a 15 million requests per second denial-of-service attack last month, we can say it is a relatively low throughput. But let's go back to reality and the demo environment, and compare the values when we use different versions of OpenSSL and TLS and different key sizes. As you can see on the slide, with

OpenSSL version 1.1 and TLS version 1.2 there are significant differences in bandwidth and throughput if we use an 8-kbit key size instead of a 2-kbit key size. That is not a surprise: the larger the key size, the better the effectiveness, as we would expect. If we use OpenSSL version 3.0, the latest version of OpenSSL, and TLS version 1.3, the result is significantly different from the case of TLS 1.2 and OpenSSL 1.1: with the 8-kbit key size, only 1.2 requests per second was enough to cause 100% usage of a CPU core. The reason could be a performance issue in

OpenSSL 3.0, but that is just speculation at this point. To investigate, I decided to measure the attack's effectiveness with different cryptographic libraries and different key sizes, to see how crypto libraries affect the denial-of-service attack. Before I started to measure, I wanted to know how important the key size is: is it true that the most popular key size is 2 kbit? The answer is: it depends. I measured the key sizes of the web servers in the top 1 million domains which supported the Diffie-Hellman key exchange. As you can see on the slide, the 2-kbit key size is definitely the most popular one; the vast majority of the servers use that key size.

The 1-kbit key size is still surprisingly common, considering the fact that the Logjam attack showed it is within the reach of a nation-state. The 4-kbit key size is relatively strong, but it was a surprise for me how low its share is. So we could say that I have to focus on the 2-kbit key size, but I have to mention something. In the TLS 1.2 protocol it was originally only possible to use one key size, but there is an extension to TLS 1.2 which makes it possible for a client to choose between key sizes in the case of the Diffie-Hellman key exchange, and this extension is part of the

TLS 1.3 protocol definition. However, unlike the less popular cryptographic libraries, OpenSSL 1.1 does not support the extension in TLS 1.2 and does not support the Diffie-Hellman key exchange at all in TLS 1.3. That is the case if we are talking about TLS, but D(HE)at is a protocol-independent attack: you can use it against whatever protocol. So let's see what we can measure in the case of SSH, for instance. At first glance it may seem that the situation is the same as in the case of TLS: according to the Shodan statistics, the 2-kbit key size is the most popular one, but the 4- and 8-kbit key sizes are supported by more than 60%

of the servers, which is quite high, so we can say that a larger key size can indeed be forced on 60% of the servers. And there is an important thing I have to mention: the SSH protocol, just like TLS 1.2 with the extension and TLS 1.3, allows the client to choose between the key sizes, so an attacker can force the server to use the largest key size enabled in the server configuration. This is what the chart shows: presently more than 90% of the servers can be forced to use the 8-kbit key size, as it is enabled in the default configuration of OpenSSH, for instance.

So the key size matters, especially because the attack effectiveness highly depends on the key size, and not just linearly. However, it should be noted that some other components also matter, such as the application server implementation and the protocol. So I decided to perform some measurements. I wanted to focus on the cryptographic libraries, so I wanted to exclude as many other factors as possible. I used the same environment that I used in the demo, and, to avoid measuring the details of the application servers instead of the details of the cryptographic library, I had to find an application server which supports more than one cryptographic

library. That is lighttpd, because it supports several cryptographic libraries: not only OpenSSL, the most popular cryptographic library, but also GnuTLS, another popular one; NSS, the cryptographic library of Firefox; and mbedTLS, a cryptographic library for embedded systems. Let's see the result of the measurement. What we can see on the slide is how much throughput is necessary to cause 100% CPU usage on the server; this is the vertical axis, and the horizontal axis is the key size. Not surprisingly, the larger the key size, the lower the throughput necessary to cause 100% CPU usage, as the computational capacity requirement is

higher with larger key sizes, so a server can respond to fewer requests. The chart demonstrates that there are significant differences between the cryptographic libraries: in the case of NSS about 150 requests per second was enough to cause 100% CPU usage with a 2-kbit key; in the case of mbedTLS slightly more than 50 requests per second was enough to cause the same result; and in the case of GnuTLS less than 100 requests per second was enough. What is the conclusion? The shape of the curve is similar independently of the cryptographic library; however, the exact values demonstrate that there are significant differences in effectiveness between the cryptographic libraries.

The next measurement shows the differences between TLS version 1.2 and TLS version 1.3; solid lines are TLS 1.2 and dashed lines are TLS 1.3. As you can see, in the case of TLS 1.3 significantly lower throughput was enough to cause the 100% CPU usage than in the case of TLS 1.2. The reason is the fact that in TLS 1.3 the client is required to send its public key in the initial message of the handshake. The result is that after the server receives this initial message, it can compute its own public key

and also the shared secret, so it does the CPU-intensive operation, the modular exponentiation, twice, meaning that fewer requests are enough to cause 100% CPU usage. At this point I decided to use a different unit of measurement, to make the different throughput values comparable with each other, even when they come from different environments and use different application servers. This chart displays the same data as the last one, but with a different unit of measurement: the starting point is always one, and the lines are increasing, because the points show how many times less throughput was enough to cause the 100% CPU

usage with a larger key size than with a smaller one. The vertical axis is not linear but logarithmic, as the values themselves grow that quickly. The curves demonstrate the attack effectiveness, not the throughput. As you can see on the chart, the attack effectiveness can be twice as high using a 3-kbit key size compared to a 2-kbit key size; it can be 5 times higher using a 4-kbit key size compared to the 2-kbit key size; and it can be up to 40 times higher if you use an 8-kbit key size instead of a 2-kbit key size. So, as you can see, increasing the key size increases the attack effectiveness dramatically.

You might notice that the most popular cryptographic library, OpenSSL, has not appeared yet; that is not a coincidence. The red line demonstrates the attack effectiveness in the case of OpenSSL version 1.1 with TLS version 1.2, and as you can see there are some oddities on the chart. In the case of the 3- and 4-kbit key sizes the shape of the curve is the same as in the case of the other cryptographic libraries, but in the case of the 6-kbit key size the attack effectiveness is much lower than in the case of the 3-kbit key size; it is even lower than

in the case of the 2-kbit key size, which is quite surprising. In the case of the 8-kbit key size the attack effectiveness is almost the same as in the case of the 3-kbit key size, which was quite strange to me. At that point I decided to measure OpenSSL by changing only one component at a time, to find out the reason for the anomaly. The result was even more mysterious than the original measurement. I started the investigation with OpenSSL version 1.1 and TLS version 1.2, but with different application servers. The red lines show the result, and the result is the same in the case

of the different application servers, so the anomaly is independent of the application server. I then continued the investigation using TLS version 1.3 with OpenSSL version 3.0; only OpenSSL version 3.0 supports the Diffie-Hellman key exchange in TLS 1.3, which is why I changed the OpenSSL version. As the blue lines show, the result is the same as in the case of the other cryptographic libraries, and there is no anomaly. So I continued with OpenSSL version 1.1 and TLS version 1.2, but I changed the Diffie-Hellman parameters: I used a parameter which comes from another RFC, and then a random one, and as the

gray line shows, the result was similar to the other cryptographic libraries again; there is no anomaly. So I continued the investigation with the OpenSSL speed command, which measures just the Diffie-Hellman key generation, not the whole key exchange, and as the solid and dashed white lines show, the result is similar to the other cryptographic libraries; there is no anomaly. So the anomaly exists only with OpenSSL version 1.1 and TLS version 1.2.

It is not fully clear to me what the reason for the anomaly is, but I am going to continue the investigation. Anyway, the developers of OpenSSL know about the anomaly, and they also said it is strange, so let me know if you have an answer for what the reason of this anomaly could be. To sum up my presentation, let's see what properties make a cryptographic protocol more exposed to the D(HE)at attack. First of all I have to mention the zero round trip: if it is enough to send only one message to a server to force it to do the CPU-intensive operations, the cryptographic protocol is more vulnerable. This is the case if we

are talking about TLS and IKE; in the case of SSH the client is required to send at least two or three messages to force the server. Another important aspect is the client's key-size selection possibility; as I mentioned, this is the case for TLS 1.2 with an extension, for TLS 1.3, for SSH, and for IKE. The third thing is pre-computability: if the client can pre-compute the messages, the attack effectiveness is a little bit higher, and if it is not possible, the attack effectiveness is a little bit lower, but that cannot mitigate the attack itself. You can pre-compute the messages in the case

of TLS and IKE, but in the case of OpenSSH you have to compute the messages connection by connection. In some cryptographic protocols there is a mechanism to protect against denial-of-service attacks, called a cookie. A cookie is just a random number: the server sends this random number to the client, and the client should repeat it and send it back to the server. In this way you can decrease the effectiveness of the attack, as the attacker has to compute the messages connection by connection, but it cannot mitigate the attack entirely; it can decrease the effectiveness a little bit, but it cannot mitigate it.

So, what methods can a user use to mitigate the D(HE)at attack? If we are talking about private services, I would suggest disabling the Diffie-Hellman key exchange, and that is the solution: the elliptic-curve version of the Diffie-Hellman key exchange is the most efficient one nowadays, and also the most secure one, so in the case of private services disabling is feasible. If backward compatibility is a must, I would suggest controlling the number of unauthenticated connections. Unfortunately only a few applications support that; a very good example is OpenSSH, where it is possible to control the number of unauthenticated

connections: you can set a maximum number of unauthenticated connections, and that can decrease the effectiveness of the attack. Another possibility is to use a third-party tool such as fail2ban, which can parse the IP address of a malicious client from the log messages and then create firewall rules to ban that client. It is a very effective way to mitigate this attack, but unfortunately the default log level is not sufficient in several cases, so you may have to increase the log level, which may cause a significant amount of log messages, so you have to care about that. If we are talking about public services, I would say you have to consider

compatibility with five-to-ten-year-old clients: most browsers have supported the elliptic-curve version of the Diffie-Hellman key exchange for five to ten years, and other kinds of clients have also supported the elliptic-curve version for five years. In the case of public services you can also use rate limiting, either through the application-server-specific configuration or through the mentioned fail2ban. On the GitHub page of the application, dheater, you can find configuration snippets showing how to disable the Diffie-Hellman key exchange in different application servers and how to rate-limit the number of concurrent connections. This QR code contains the URL of that GitHub page.

On that GitHub page you can find the mentioned configurations. Thank you for your attention. [Applause] Thank you, Szilárd. Do we have any comments or questions for Szilárd at this time? Take a photograph of the QR code for future reference if you need it later. Here's a question.

Thank you. Have you done any measurements on the elliptic-curve DHE, and how much better is the performance there? Yes, I have. There are significant differences: with the elliptic-curve version of the Diffie-Hellman key exchange you need up to ten times the throughput to cause the same result. So in my humble opinion the elliptic-curve version of Diffie-Hellman is not so good if you want to perform a denial-of-service attack, but it highly depends on the curve. There are several different curves, and if you use a curve which has a

larger key size, it can be suitable for causing a significant load on the server. And there is another important aspect, post-quantum cryptography, which could also be useful if you want to perform a denial-of-service attack, but I have to measure it.

Anybody with any question for Szilárd? Looks not. So once again, Szilárd, thank you very much. Thank you. [Applause]
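Editor's note on the mitigation advice in the talk: the OpenSSH knobs the speaker refers to are the connection-throttling and key-exchange settings in sshd_config. The values below are an illustrative sketch, not recommendations from the talk:

```
# /etc/ssh/sshd_config (illustrative values)

# Throttle unauthenticated connections: start randomly dropping at 10
# pending unauthenticated connections, drop 30% of new ones, and refuse
# everything beyond 100.
MaxStartups 10:30:100
LoginGraceTime 30

# Prefer the elliptic-curve exchange over diffie-hellman-group* methods:
KexAlgorithms curve25519-sha256,ecdh-sha2-nistp256,ecdh-sha2-nistp384
```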