
Automating cryptanalysis of HTTPS

BSides Athens · 2016 · 30:38 · 330 views · Published 2016-07

Good morning. We are three researchers from the University of Athens, and we will present to you Rupture, which is our framework for attacking HTTPS. Specifically, this framework is designed to be a compression side channel against HTTPS; it implements a popular compression side-channel attack method called BREACH, which we have extended significantly. This is our lab at the university. So HTTPS is currently broken; we can actively decrypt it right now. Some history of how it broke: at Black Hat 2013, Angelo Prado et al. introduced BREACH, which was an extension of an existing attack called CRIME, and in BREACH, Angelo Prado and

his colleagues attacked stream ciphers like RC4. After that attack people upgraded to AES, so people thought that this fixes the compression side-channel attack, but that's actually not true, and we will show you that even with AES it's still broken, and we hope to also show you a demo today. The overview of our talk is going to be a bit like that, but since the talk is a bit short I want to ask: who has seen the BREACH attack before, or who has heard about the BREACH attack before? Okay, so since several people have heard about it, we will skip the introductions and maybe go show you a

little bit of what we have worked on, and specifically our contributions and what we have done in terms of statistical methods and our optimization techniques, and specifically how block ciphers work. Okay, so our contributions: we have extended the existing BREACH attack that was introduced in 2013. This is our implementation, also in our open source framework. The specific extensions that we do: first, we can attack noisy endpoints. That means that we are basically lifting the assumption of Prado et al. that required that the website always returns the same response to the same request. The second one is that they

had an assumption for stream ciphers; we are extending it to block ciphers. We will not show any mitigation techniques now, but you can read our paper on our website; we will provide a link. And the third contribution is that we also have a few optimization techniques; we will show you three optimization techniques today. So let's go through that. The first thing I want to talk about briefly is how exactly we deal with noise, and maybe, yeah, let's talk about what noise is first. Maybe I should go over the attack real quick. We have

some... All right, so what is the noise? I will remind you real quick what the BREACH attack does. We have an attacker and a victim that is on some network that the attacker is controlling, so the attacker has access to the network, and the victim is visiting a target website which is using HTTPS and HTTPS security. The attacker can make the victim make several requests to the target website; this is a given with the existing web standards and web security. Then the attacker can capture the ciphertext from the network. So this is a chosen-plaintext attack in which the attacker can see the ciphertexts, and the basic assumption is that there's

some sort of reflection, so the URL of the target website contains some parameter that is reflected in the HTML body, let's say, and this is all properly escaped and everything is done correctly by the implementor. So what the noise is, is the rest of the portions of the website: not the reflection, not the secrets that we're targeting to decrypt from HTTPS, but anything else, and that anything else can be changing from request to request. Since we're doing a chosen-plaintext attack, that means that we will do several requests, and it's convenient if everything else doesn't change except for the portions that we choose. But

noise makes this difficult; so this is the response part that changes per request. Now here are some examples of noise. If there are timestamps in the responses, that's going to be some portion of the noise. If there's some randomness or some random tokens, it's going to be a portion of the noise. If we're attacking, let's say, Twitter, they have some suggestions of people to follow that change per request. So that's the kind of thing that makes the attack difficult. We also have some small details like the Huffman header encoding, which is specific to compression; we don't have to go into that today. Some HTTP headers

also, like the connection, which can be close or keep-alive; this makes some difference in what we're doing. And also the content encoding, which can introduce chunking. So these are some of the problems, and the way we deal with this is quite simple really: we do some repeated attacks, and we repeat them a specific number of times; today we're going to show like 16 or 64. Okay, so we do some repetitions of the experiment and we basically extract the mean. If you recall BREACH, the basic idea is that we only extract the length of the ciphertext, and this is what makes it difficult in the case of

noise. So noise changes the length in a way, but it's a predictable way. So if we extract the statistical mean, we only have to do square root of N times the requests that were needed in the original BREACH attack without noise, if we have a noise portion of size N, and eventually the length that we measure on the network will converge to the correct length after we take repeated means. Okay, so how do we do it for block ciphers? Instead of just doing one request per candidate, we do 16 requests, and in each one of these we add some padding, and the padding will result in

sometimes a block boundary being crossed or not. So we have some artificial noise, and these are some symbols that are shown in the reflection value, and this will result in sometimes crossing the block boundaries and sometimes not, and this way we can tell if we have the correct symbol or not. Here's an example: let's say that we already know a prefix of the secret and we want to see what the next correct letter is, and we test if it's t or v. As you can see, we also have the padding XY, but with this padding the first one goes to 15 bytes, the other

too, but because we use AES these are padded back to 16. But you see that if we put the correct padding, which is XYZ, then the correct answer will be compressed to 16 bytes but the wrong ones will create one new block; the difference isn't just one byte, because of the block cipher we'll see 16 bytes. So we can tell which is the right one and which is not. Okay, so let's see how our request is structured. First of all we have the target endpoint, which in this case is just a PHP page that contains some text and a parameter that we pass as an argument, and

this parameter, the value of this parameter, is structured as follows. The first part is a known secret. This is necessary in order to bootstrap the attack, so we need to know some characters of the secret. It doesn't matter if they are at the start or the end or the middle, but we have to know at least two known characters of the secret. After that we concatenate it with the testing character, which in this case is a, so in this iteration of the attack we try the letter a to see if it is correct. And in order to bypass Huffman noise, as said before, we have to add the Huffman part, so

what this does is balance the effects of Huffman coding. After that we have the alignment alphabet that we mentioned, and in this case you can see that for different requests we add another character from the block alignment alphabet. So for each of these requests it is possible that, if a is the correct character, a block will differ compared to the other characters that we test; the other characters are the characters in [inaudible]. And finally we have an anti-caching parameter, which is actually a dummy parameter, and what this allows us to do is try the same reflection M times per iteration without the browser caching the request, so we will see the packets over the network. Okay, so let's see some of the

optimizations that we have developed, and the first one is divide and conquer. Up to this point, what we did was, if we have an alphabet, let's say of digits, we try each one of these candidates separately, for zero, one, two, etc., and we see which one presents the block differentiation, and then we assume that this one is the correct one. But it doesn't need to be like that. In this case what we do is split the alphabet in two portions that should contain the same number of candidates, and try each one like that. So we assume that the part of the alphabet that contains the correct character will result in a

length difference over the network. So if we measure, for example, that the first part of the alphabet results in smaller lengths over the network, we assume that the correct character is in the first part, and we go that way down the tree until we find the leaf of this tree, and we assume that this one is the correct one. [inaudible] is a technique that we developed, and it's quite simple. As I said before, because of block alignment we have to do 16 requests per candidate, and that's why we need this optimization. These 16 samples do not depend on each other, so we can do multiple requests all together. So the attacker can say to the

victim: okay, this is the candidate I want you to try, this is the alphabet for it, and let's say run it 10 times, so we want to perform 160 requests. We really don't care if the responses don't come pipelined; they may come out of order, but we don't care, because what we want to take is the mean length. So we tell the client to run all these requests together, and the browser has specific behavior for that. After that we take the whole length and we divide it by the total number of requests that we made, and we get the length that we wanted.
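The two optimizations described so far can be sketched together: the alphabet is halved at each step, and each half is scored by the mean response length over a batch of requests whose arrival order does not matter. This is only a minimal sketch under stated assumptions, not Rupture's code. The names `find_symbol`, `mean_length` and `toy_oracle`, the batch size and the byte counts are all hypothetical, and the oracle is a stand-in for the victim's browser plus the sniffer.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical type: given a set of candidate symbols and a batch size,
# return the TOTAL ciphertext bytes observed for that batch of requests.
Measure = Callable[[Sequence[str], int], int]


def mean_length(total_bytes: int, num_requests: int) -> float:
    """Responses may arrive out of order; only the total is needed."""
    return total_bytes / num_requests


def find_symbol(alphabet: Sequence[str], measure: Measure, batch: int = 16) -> str:
    """Halve the candidate set until a single symbol remains."""
    candidates: List[str] = list(alphabet)
    while len(candidates) > 1:
        mid = len(candidates) // 2
        left, right = candidates[:mid], candidates[mid:]
        left_mean = mean_length(measure(left, batch), batch)
        right_mean = mean_length(measure(right, batch), batch)
        # Assumption: the half that contains the correct symbol compresses
        # better, so its mean length is smaller.
        candidates = left if left_mean < right_mean else right
    return candidates[0]


def toy_oracle(secret: str, seed: int = 0) -> Measure:
    """Toy stand-in for victim + sniffer; not a model of real TLS records."""
    rng = random.Random(seed)

    def measure(candidates: Sequence[str], batch: int) -> int:
        total = 0
        for _ in range(batch):
            length = 1040 + rng.randrange(8)  # bounded per-request noise
            if secret in candidates:
                length -= 16  # assumed saving of one block on a match
            total += length
        return total

    return measure


print(find_symbol("0123456789", toy_oracle("7")))  # prints 7
```

With ten digits this needs four halving steps (eight batches) instead of ten separate per-candidate batches. In the toy oracle the noise is smaller than the saving, so the result is deterministic; with real noise the batch size would have to grow.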

But if we do this without considering the parallelization the browser uses, it's pipelined, so it's one request after another. But we can notice that the browser allows six parallel requests at a time, so as shown here, one time we have six requests and next time we have six more requests and so on, and we really don't need to differentiate between one another. We run all of these together, and in the end we divide by the number of requests that we did. That's why we call it [inaudible], because it's all the requests together, or all the responses together, and we extract the mean by dividing by the number of

requests. Okay, so our framework is called Rupture. It's open source; this is the website where you can read about it and download it. It's also where our white paper is, which has an extensive description of this attack and how it works. Our source code is on GitHub; it's licensed, so you can go there and fork it and make some pull requests. We have many open issues and many bugs and many features that we want to introduce, especially this summer. We're working on making the attack extract the whole response, not just chosen secrets; this is the backtracking. We are making it work against IPv6. What else? We are introducing a very easy to use UI, a web UI, to control

this attack, because right now the configuration is all [inaudible]. And yeah, we're adding support for SPDY, because this only works against HTTPS but not SPDY. So you can help with any of these; we'll be doing them until this September, so you're welcome to join us. Okay, so I want to tell you a few things about Rupture, because it's a useful tool for security researchers like you. It's not only useful for compression side-channel attacks but also for other attacks against TLS. It's a modular attack framework and it's quite extensible. The way it works is that first, given

control of the victim's network, it injects JavaScript into all simple HTTP connections that are unauthenticated, and this JavaScript opens up a control channel to the attacker, to the adversary, that remains open from all tabs of the victim and speaks a command-and-control protocol. All these tabs remain dormant until the attacker chooses to wake something up and ask for work to be done, and in our case the work is to make requests to the HTTPS website. Due to the same-origin policy we cannot read them from these infected tabs, so we have to sniff the network. But you can imagine that this can be used as a framework to mount different sorts of attacks. It's

also quite robust, so if the user closes a tab, a different tab will wake up, so this is all detected. If the victim reboots their machine and opens it up again, the attack is going to resume from where it left off. Also, we have a client-server architecture that allows for persistent storage of the data that was collected, for future analysis as new algorithms are developed. So now we are going to show you a short demo of Rupture and how it works against HTTPS. First of all we need to set up the environment. Okay, here we set up the environment, and we are going to configure

the target and the victim. Here are the targets that are created, they are all endpoints, and here are the victims, which have the target, and we'll choose which one we want to attack. It should also have the IP of the victim's computer; in this case it's my computer because I'm attacking myself, but you need to know the IP in order to attack it. So it's all set, and we'll just launch each module separately in order to be more specific about what we do, but you can also run the whole framework at once. So,

and for the sniffer we need [inaudible] because it accesses my computer. So here the page is deployed, and we know that we have the known alphabet, these are the candidates, and the alignment alphabet, and it starts sniffing the network, and we can see the length and how many records have been passed. Okay, so let's open the browser at the same endpoint. Okay, the site actually has just text etc., and what we want to do is extract some part of this text without knowing it. What you see here is HTTPS, so it's supposed to be secure. Okay, so consider if we didn't

know anything about this endpoint: we would start attacking, and the victim needs to open the client. And this one you can see here would be the injected file; if we're in the network and we have a victim, we would inject the JavaScript code. Here it is just an HTML page with the client

code. Okay, so here the backend starts listening to the requests. Actually the client will send the work request to realtime, then it connects to the backend, which is responsible for managing it, and we start issuing the requests, and here we see the different candidates and the lengths [inaudible]. So we iterate over the alphabet of English letters, and sometimes we might catch some errors, because either the client disconnects or the network is kind of slow. In this case we have to catch these errors and mitigate them accordingly, and if there's an error for a candidate we reissue the attack for this candidate. So in this case you see that

after finishing the alphabet we start again from those that were unsuccessful, and when each candidate has a successful capture and we have a length to compare with the rest of them, we'll see that... okay, yes, here we are: the first letter is D and it's shorter than the others, so we have a decision that D is the right one, with some confidence. It's quite good, but you can see that there are other letters that also have smaller lengths than the rest of them; this could be because they match some other content of the text. But here we have D, which is the winner. Yeah, also in this case you see a

practical example of noise, because for example T may have a length like that because the noise was optimal for T compared to the rest of them, but in this case we assume that the correct one should result in the minimum length. So what do we see here? We knew that part of the secret was these five letters, and we assume that the following one is D. Let's see if we were correct. Okay, so you see here that the data continues, and this is the whole text; we knew this, and we see that the next letter is indeed D. So we have extracted a part of the page that is supposed to be secured, and we

have done this with pretty good confidence, and if confidence is above one, the framework assumes that this is indeed the correct one and uses this in order to decrypt the rest of the page. Yeah, and
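The decision step from the demo can be sketched as follows: average the captured lengths per candidate, pick the shortest, and accept it only when it beats the runner-up by more than a threshold. The talk does not define how the confidence value is computed, so this sketch assumes it is the gap in bytes between the two best mean lengths. The function name `decide`, the threshold default and the sample lengths are hypothetical and are not taken from Rupture.

```python
from statistics import mean
from typing import Dict, List, Tuple


def decide(samples: Dict[str, List[int]], threshold: float = 1.0) -> Tuple[str, float, bool]:
    """Return (best candidate, confidence, accepted).

    Assumption: confidence is the gap in bytes between the shortest and
    the second-shortest mean length. Needs at least two candidates.
    """
    means = {cand: float(mean(lengths)) for cand, lengths in samples.items()}
    ranked = sorted(means, key=lambda cand: means[cand])
    best, runner_up = ranked[0], ranked[1]
    confidence = means[runner_up] - means[best]
    return best, confidence, confidence > threshold


# Made-up lengths: "d" is clearly shortest, "t" is a bit shorter than the
# rest because of noise, as in the demo.
captured = {
    "d": [1024, 1024, 1024],
    "t": [1030, 1032, 1034],
    "a": [1040, 1040, 1040],
}
best, confidence, accepted = decide(captured)
print(best, confidence, accepted)
```

When the gap is too small, a caller would collect more samples for the same position instead of moving on to the next character.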

Okay, I think we have questions. Oh yeah, do you have any questions? So do you need the user to stay on the same page forever while the attack is happening, or can you move around? No, we just need any web page. Say you're on CNN, which is HTTP; if you close it, the attacker waits for another HTTP connection, and when you have another connection the attack follows from where it was stopped. We have implemented an injection technique in order to inject the JavaScript client, which is the module of the framework that communicates with [inaudible]. So any time you open an HTTP connection we inject the JavaScript there and we resume, or if you visit the end

point that we control, we have it in there and we continue the attack. Any other questions? Haven't you thought about connecting it with BeEF, which does pretty much the same, I mean on the injection part of the JavaScript, so to use the BeEF framework in order to inject and then use the injection part of it in order to continue your attack? So we used an existing framework called BetterCAP for the injection portion. Rupture's injection module is basically a specific deployment of BetterCAP, so I don't see why we should use something else, but yeah, of course we have not developed our own injection framework; it's something open source that

we use, and that's not where we go. On the AES attack, how have you related that to some of the other AES attacks that have been described over the past couple of years? I'm thinking of the CBC attacks. So this specific attack is not specific to AES's features, so we don't manipulate the inner workings of AES to do the attack. It's a side-channel attack, and it would work against really any cipher. The reason why we are showing it against AES is because AES happens to be the most popular cipher, but we're not leveraging any features of AES specifically. If you can

do some of the attacks that you mentioned that are cryptographic in nature, you can certainly combine them with ours. We also work towards implementing other compression-based attacks like CRIME, which is mitigated but is useful for educational purposes, so we will release some implementations for CRIME or other compression side-channel attacks that work similarly. Any other questions? Okay, and because some people did not know how exactly BREACH works, I assume it was really hard for you to follow, since we showed new features only. Please go to our website; we have an easy description in the paper, the introduction of the paper is pretty detailed, and you can understand how it works. So here we propose some

future work. As mentioned, during the summer we will work on issuing a kind of release of the framework that enables users to attack endpoints easily and with more confidence. Our aim is to attack a popular site like Gmail or Facebook, which have some known vulnerabilities, but we haven't up to this point managed to leverage them. And our aim is to create a framework that has a UI that is user-friendly and that is able to run against most protocols that are used today, like IPv6 or SPDY, as mentioned. In order to do this we need some help; we

have many bugs and we need the community's support in order to enhance it. So if you have some suggestions, if you're interested in this, please follow us on [inaudible], ask some questions, and offer some advice. What you should take away from this presentation is actually pretty simple: HTTPS is broken. We should not consider HTTPS safe, because we can do this at this point, to some extent, and we assume that people smarter than us can or may have already done this better. So HTTPS in composition with compression is vulnerable, we should address this issue, and Rupture actually allows people to attack HTTPS and our applications that

use it. Thank you very [Applause] much. There are questions in the audience. You said HTTPS is broken; SSL/TLS is broken? Yeah, I think that's probably more

accurate. No, no, so this is incorrect, because what they dropped is the compression in the TLS headers, but what we're attacking here is the compression inside the payload of TLS, and this is not fixed in TLS. Yeah, so what this is, is that all websites use gzip to compress their contents, and this is encapsulated by TLS. So if you look at Gmail, Facebook, Twitter, all these use gzip to speed up the connection, but it doesn't [Music] have... no, it doesn't. Right, we'll be here all day, so yeah, take