BSides DC 2016 - Tales of Fails and Tools for Message Integrity

Name: BSides DC 2016 - Tales of Fails and Tools for Message Integrity
Uploaded: 2016-11-13
Duration: 49 min 24 s
Description: Cryptographically secure data integrity checking may not receive the same level of attention as encryption, but lies at the core of many security technologies from code signatures to passwords to cryptocurrencies. The problem of verifying that a message has not been modified may seem straightforward

BSides DC · 2016 · 49:24 · 92 views · Published 2016-11

Speakers

Jacob Thompson

Tags

Category: Technical

Topic: Cryptography

Research: Technical Deep-dives

Style: Talk

Mentioned in this talk

Protocols

WEP

Concepts

AES CRC HMAC MD5 RC4 SHA-256

About this talk

Cryptographically secure data integrity checking may not receive the same level of attention as encryption, but lies at the core of many security technologies from code signatures to passwords to cryptocurrencies. The problem of verifying that a message has not been modified may seem straightforward, but there is a variety of algorithms used to solve it (some good, some not so good, and some homemade and broken) and even standard approaches can fail if not implemented correctly. In this talk I demonstrate a number of catastrophic failures with notorious past examples, explaining how each one worked and the mistakes that the designers made. Then I briefly review the primitives used in message integrity, such as one-way functions, message authentication codes, authenticated encryption, and digital signatures, showing the capabilities and limitations of each.

Jacob Thompson (Senior Security Analyst at Independent Security Evaluators): Jacob Thompson is a Senior Security Analyst for Independent Security Evaluators, where he specializes in high-end, custom security assessments of computer hardware and software products. With 10+ years' experience, a propensity toward hands-on security assessment, and proficiencies in reverse engineering, DRM systems, cryptography, system and application security, and secure system design. Through his 3 years' work with ISE, Mr. Thompson has partaken in multiple major vulnerabilities and assessments, customer visits, and progress presentations. He has presented his research at DEFCON 21, BSides DC 2013 & 2014, DERBYCON 4.0, and ToorCon 2014.

Thanks to our video sponsors: Antietam Technologies http://antietamtechnologies.com, ClearedJobs.Net http://www.clearedjobs.net, CyberSecJobs.Com http://www.cybersecjobs.com

Transcript [en]

The BSides DC 2016 videos are brought to you by ClearedJobs.Net and CyberSecJobs.com, tools for your next career move, and Antietam Technologies, focusing on advanced cyber detection, analysis, and mitigation. Okay, so this is Tales of Fails and Tools for Message Integrity. I'm Jacob Thompson. I work for Independent Security Evaluators, a security evaluation and consulting company located in Baltimore, so we're local. A quick commercial break: our company is composed of computer science academics and ethical hackers. We do all sorts of white-box-style security evaluations for our customers, so very much in an attacker's mindset and so forth, trying to look at a product and tell them where the problems are and how to

fix them and so forth. Back to the talk. Thinking about message integrity in terms of cryptography: personally I think there's much more of a focus on confidentiality, so AES and so forth, in terms of, say, enthusiasm, but integrity is actually just as important. By integrity I'm talking about two things. One would be modification of data while it's at rest, sitting on a disk somewhere, or in transit, like over a network, and passwords would be the second example. It's important to note, though, that in many cryptographic protocols the ability to modify or tamper with a message can also lead to the loss of confidentiality. So I've often, especially earlier in my

progression as a security person, in the scoping and attack surface recognition part of an assessment, they talk about "we use such-and-such AES for this," and they didn't have anything in place for integrity, and I had a habit of saying, "Well, how likely is it that anybody will tamper with that anyway?" But it can be very important, and I think it's also more nuanced and interesting than confidentiality sometimes, just because a lot of the problems are not as close to being one hundred percent solved as they might be in symmetric encryption and so forth, or at least to a degree. So the approach I'm going to take is, I'm going to go

on a tour of a lot of past vulnerabilities, past exploits in various projects, and the attack methodology behind them, so that you can see the ways in which integrity checking often fails, and then I'll close out with a review of best practices, so if you're a security analyst also, you'll have some things to think about as you do assessments. So we'll take a look at fails: common ways in which programmers or architects fail when they're designing a system. I would say one of the most common is assuming that encryption of data also protects it from being modified, or if not, well, then good enough. So, first example, and this is the worst case:

stream ciphers and malleability. So, remembering what a stream cipher is: a one-time pad, right, or RC4. Basically you have a keystream and you have plaintext, and you XOR them together and you get ciphertext, just as I show on here. So "hello world": take the ASCII values, you have a keystream, you get data. So just like a one-time pad, right, or it could be something like a PRNG generating the keystream. Decryption is just the opposite. So the biggest thing to notice: encryption and decryption are the same operation, and they cancel out. So if you encrypt it twice with the same key, it's not twice as secure. Looking at how decryption happens, if we've tampered with this

message in some way, the important thing to note is: flip a bit in the ciphertext, and the same exact bit gets flipped in the plaintext after it gets decrypted. This is called malleability. It affects every stream cipher. So in a way, the one-time pad might be the best for confidentiality, but it's the worst for integrity unless you have something else in place to protect that. Seeing that, let's take a look at some ways in which this might actually fail in a real system. So I made an example here. Everybody knows about client-side session management kind of being all the rage right now. You have these RESTful APIs and load balancers, and you get a

different server with every request and so forth, so a lot of developers have this idea: let's just store the session data in the cookie rather than having the cookie refer to it. So you do something like that. I made a small web application. You have a session cookie; it's some encrypted and base64-encoded string. I formatted it so that the layout of that cookie is an initialization vector concatenated with ciphertext that was encrypted in AES counter mode. Counter mode, just recalling, turns a block cipher into a stream cipher, so it is also affected by these problems. Looking at the plaintext format of that cookie, it's JSON data: username, is_admin, and a timestamp. So the

question is, using what we just looked at about malleability, could we flip is_admin from 0 to 1 without knowing the key? Looking more closely at the actual positions and layout of the cookie: you may not know what the key is, but if you have known plaintext, you can determine the position of the bit where the ASCII value for 0 could be flipped to become a 1, and we do just that. So look down at that is_admin 0 line: we can flip the ciphertext byte from 40 to 41, knowing from the ASCII values of 0 and 1 that that causes it to become a 1. Does that work? Of course. So taking a normal request to our fake web

application, we get "you are not an admin," and presumably it won't let you do administrative things. If we tamper with that cookie and then re-base64-encode it and send it to the server again, we get the exact same request, except now we are an administrator. So encryption doesn't protect integrity. You have a chance to quickly copy down the URL if you're interested, basically demo.securityevaluators.com/integrity/stream, and then we'll take a look at another example. Everybody knows that about stream ciphers in many cases, but block ciphers can also be affected by tampering. CBC mode, cipher block chaining, is one of the more common and easy-to-do-correctly modes of encryption.
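
As an illustration of the cookie bit-flip described above, here is a minimal Python sketch. It is not the speaker's demo code: the keystream below is a SHA-256-based toy that stands in for AES counter mode (so the example needs only the standard library), and the field names, the `guest` user, and `SERVER_KEY` are assumptions made for the example. The point it shows is that the attacker needs only the known plaintext layout, not the key.

```python
import base64
import hashlib
import json
import os


def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Toy stand-in for AES-CTR: hash (key, iv, counter) to produce keystream."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_cookie(key: bytes, session: dict) -> str:
    """Cookie layout from the talk: base64(IV || ciphertext)."""
    iv = os.urandom(16)
    plaintext = json.dumps(session, separators=(",", ":")).encode()
    return base64.b64encode(iv + xor(plaintext, keystream(key, iv, len(plaintext)))).decode()


def decrypt_cookie(key: bytes, cookie: str) -> dict:
    raw = base64.b64decode(cookie)
    iv, ciphertext = raw[:16], raw[16:]
    return json.loads(xor(ciphertext, keystream(key, iv, len(ciphertext))))


def tamper(cookie: str, known_plaintext: bytes) -> str:
    """Attacker side: no key, only knowledge of where the '0' sits."""
    raw = bytearray(base64.b64decode(cookie))
    field = b'"is_admin":'
    offset = known_plaintext.index(field + b"0") + len(field)
    raw[16 + offset] ^= ord("0") ^ ord("1")  # 0x30 XOR 0x31 flips one bit
    return base64.b64encode(bytes(raw)).decode()


SERVER_KEY = os.urandom(32)  # hypothetical server secret, never seen by the attacker
session = {"username": "guest", "is_admin": 0, "timestamp": 1478000000}
cookie = encrypt_cookie(SERVER_KEY, session)
known = json.dumps(session, separators=(",", ":")).encode()
forged = tamper(cookie, known)

print(decrypt_cookie(SERVER_KEY, cookie)["is_admin"])  # 0
print(decrypt_cookie(SERVER_KEY, forged)["is_admin"])  # 1
```

Swapping the toy keystream for real AES-CTR would not change the outcome, since the XOR relationship between ciphertext and plaintext is the same in any stream cipher.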

So it's often recommended: if you have to implement it yourself, maybe do this. In terms of a formula, you have a special first block called the IV, and then you have encrypted blocks of plaintext, where during the encryption process you add this kind of noise term, which is the previous ciphertext block, into the encryption. And then because it's a block cipher, you can only encrypt in multiples of the block size, so you might have padding at the end, which I've added in this case as well. So this might be a typical block cipher encryption of the string "hello world" three times in the plaintext. Decryption is the opposite process, except your IV and padding will

be thrown away. Now, in stream ciphers, if we make some change to the ciphertext, it made the same change in the plaintext, right? It was very easy to make modifications. With a block cipher, oftentimes, because of what is called the avalanche property, if within one unit of ciphertext, a block, you flip a bit, you have zero control over what will happen to that block when it decrypts. But take a look at what happens in CBC mode. Let's flip that 6a byte on the second line to 7a. When that block decrypts, you get noise, just some random data, because of that avalanche property. However, because of how CBC mode operates, that ciphertext

will be XORed into the next block, and notice we have a very controllable modification to the next block, where we changed some ASCII value so that instead of an h we got an x. So that could be a problem. Now, if whatever application you're working with is very rigid about its data format, this is probably not going to work; if it's some special case, maybe it will. Another thing we can do in CBC mode is a cut-and-paste attack, where a block can be moved around, and as long as the block before it is correct, it will still decrypt to the same thing. So let's look at another made-up example to

demonstrate that. So here I've made a Python program that is a very boiled-down demonstration of something that might be doing that. It takes one command-line argument pointing to some directory, it does some kind of sanity check, and if that check passes it does something dangerous: maybe rm -rf / for real, or maybe it does a backup or something, or upgrades a package. This is actually made in a very specific format for convenience, so that all the boundaries between CBC blocks will line up. But the core of this is this if statement at the bottom. If we focus in on that, then for demonstration purposes

the sanity check is always true, so: if true, don't actually do the operation; else, do it. So our question is, knowing what we've learned about CBC, given only an encrypted copy of the script and no key, is there a way to modify it so that the else block executes instead of the if? So you know what the script has inside it and you have an encrypted copy. You don't have the key, so you can't do the straightforward thing: modify it and re-encrypt. Well, if we actually break that script into blocks, then we notice what is a triple-quoted string in Python, a multi-line string literal. So this is laid out in such a way that, what if we could

take this multi-line string literal, have it not end where it does, and move the end of that string to somewhere below the code we don't want to run? I have abbreviated it some, but you see that else block has an if above it. If that if is no longer in the syntax of the program, the else will instead be paired with the one at the beginning and will run, because the command-line check would succeed. So what about CBC mode? What can we do? We can move blocks around. So let's take a look at what happens to the plaintext when we do that; there's an actual working implementation of that that I'll show briefly afterward. So suppose we look at

the script and we do three things. In that usage statement it used to say "if a security error is detected, it terminates with a nonzero exit status," or something like that. What has happened here is that the block before the block that contained the triple quote, I just removed and put somewhere aside. What did that do? The next block that came after it is now corrupt, but it recovers after that. So one block was lost and one block became corrupt. Take that prior block and a second copy of the triple-quote block and reinsert those in front of the else, so just after the if, and exactly what happens is that triple quote gets moved below the if

statement, which is now no longer code. The else will run. How would the code actually look to do this attack? Well, it's easy to do with dd, with all the skip and seek and count options. A block in AES is 16 bytes long, 128 bits, so if you have an input file and an output file, you can copy the beginning and end verbatim and mess around in between, and that's basically what this does. This also is on that demo site. Can we come up with a scenario where this might actually work? Possibly. Maybe we have some system administrator or developer with large numbers of cloud VMs or something, and they want a

scenario where it reaches out during the boot process to run some script that has some routine system administration tasks that they want to be able to change. They put it on plain HTTP; they recognize that as dangerous, so they decide, let's encrypt it so nobody can modify it, right? So they implement something in the boot process where we download an encrypted script over HTTP, decrypt it, and run it. One thing that people can be misled by is that OpenSSL has a command-line utility that will help you encrypt and decrypt things, and like most command-line utilities it returns an exit status for success or failure. The exit status is based only on the padding being

correct, so if someone has learned just enough to be dangerous, it could be very misleading: "oh, it's still the same, because the padding succeeded." Let's take a look at that actually happening. Suppose there's a man-in-the-middle attack where they substitute out the original script for this modified version that we just showed. Say it happens normally: this boot process script runs as that script was written, the if true gets hit, and it prints out "detected a security error, you lose." Substitute the modified version instead; now it runs the else, so no security problem is detected and the dangerous operation happens. Like the other one, the sample code is in this directory, so if

you back up, there's actually a directory listing if you want to see all the examples. So, another problem: say you know, okay, I need integrity protection, but you implement some kind of integrity check that is not cryptographically secure. And now I'm getting into examples where I have some actual historical exploits where these were big problems. I would say the cyclic redundancy check is probably the most misused integrity check, in terms of not defending against the right attack scenario. CRCs are quick to compute, they're easy to implement in hardware, so, okay, we'll use that as an integrity check, right? The problem is that the system model, or attack model, of

CRCs is called the binary symmetric channel, and this has to do with radios and wires and stray alpha particles and whatever else you might call it that might randomly flip a bit as it's being sent down a wire. So with some high probability, you send a zero and the other side sees a zero, you send a one and the other side sees a one; with some small probability, independent and identically distributed and all of those things, it gets flipped. Well, the problem is, if there's a malicious adversary sitting in between, this is not random chance anymore. They can flip bits maliciously and whenever they want, so they can sit there and wait: oh, here's the checksum, or here is the permission

level, and just flip it. More details about CRCs, where they're actually used, and things to know about them: they're commonly 16 or 32 bits long. When you're in cryptography and you see something important that is 16 or 32 bits long, what does that mean? Brute force, right? Too short. So, brute force attacks: say you have some system that takes messages and verifies the CRC and does something. You just blast all 2 to the 16th possible values and you'll get it, if that's a security mechanism in that application. When someone says CRC, just "the CRC," it's not "the CRC" as there might be "the SHA-1" or "the MD5." CRC is kind of

a building block that needs more parameters. A polynomial actually goes into there, and like AES and other cryptographic and error-correcting code theory and so forth, there's Galois fields and polynomial reduction and so forth going on behind the scenes. In addition to that, a certain polynomial needs to be picked. Implementations will commonly do some tweaks to try to prevent things like the CRC of zero being zero and other bad things like that, so they will often, say, bitwise-NOT the input before processing it, or add a post-processing step like that. They have to decide whether the bits will be processed little-endian or big-endian and so forth.

So I'm just mentioning those because, with the CRC, you might be misled: oh, the attack doesn't work, or what I thought isn't true in this case. They may have just done something like this. The actual appropriate place for using it is as a defense against some kind of random corruption, so Ethernet, say, zip files, and hard drives are all places where it's actually used, but not in this attacker-modification scenario. So say you ignore this advice and use CRC for some security-relevant purpose. Well, let's look at some of the problems with its properties. Starting out, I'm using a Python implementation, built in, just for ease of use. We take the CRC of the string "hello world" and we get a 32-bit value. The AND constant there is to force it to

be unsigned, so that's not important. So the CRC of "hello world" is some value; the bitwise NOT of that CRC is exactly what you would expect. First problematic property of CRC: what if we take the CRC of "hello world" concatenated with the bitwise NOT of its own CRC? Well, in the Python implementation we get all one bits. Not something that's going to happen by random chance, right? If you saw a SHA-1 or MD5 of zero or all ones, that's not something that happens by random chance. So this is a big problem with CRC: as it cycles through these bits, there's a state inside that has the current CRC of what it's looked at, and if it takes the CRC

of itself, it kind of resets the internal state to 0 or 1, depending on how it's implemented. So, not good for cryptography. Another problem: given this ability to reset the CRC whenever we want, say we take "hello world," the CRC of "hello world," and "hello world" again; we get a certain CRC. Now, knowing that we can reset it whenever we want, we can do that as many times as we want, and that allows us to produce as many messages as we want that all have the same CRC value. So this could come in handy if you're modifying something, right? How about some real-world examples where CRC was misused? Everybody uses SSH, and there's SSH 2.0, and then there's

this ancient SSH 1 that nobody's supposed to use and is disabled and has all these problems. Well, one of those in SSH version 1.5 was that when it came time to verify packets were not modified, they settled on a CRC. This is actually an excerpt from the draft RFC document for SSH v1.5. It's basically saying you take some plaintext message, you add some padding, you add an IV and so forth, and you encrypt it, but before the encryption you take a CRC of the plaintext and put it in that packet, so if somebody flips a bit the CRC will be invalid. Well, when they redesigned it as SSH 2, they recognized that is a

big problem. This is from the OpenSSH website: among the things they wanted to do was switch from CRC to a proper HMAC, or "real HMAC algorithm" as they call it here, due to an insertion attack. That sounds weird: an insertion attack in secure shell, where you're doing sudo this, do that; that could be a problem. So let's look into the details of that attack. It's actually, I guess, 17 years old. Here's a Bugtraq posting for that. Someone was looking at SSH 1.5, I guess four years or so after it came out, and they found that by combining some of these block cipher cut-and-paste things I showed before, as well

as the ability to flip a bit, corrupt that block, but control the value of the next block when it was decrypted, they could synthetically construct a packet containing some command by modifying an existing one, and still have the CRC check out in the end, by flipping bits and controlling what the CRC would be. So you can imagine you could inject sudo rm -rf / or something like that into the SSH command stream. Not a good thing. So that's why only SSH 2 should be used; among other things, it uses real integrity checks that are cryptographically based. The other thing, in addition to the collision problem, is modifying data that

is protected with a CRC and being able to predict the effect that will have on the new CRC of the modified data. So here is something a stream cipher might do: we have some message XORed with some other message of the same length. I have hexadecimal and ASCII, so all B's XORed with all a's produces pound signs in this scenario. We take the CRC of both of those incoming messages, and then we get the new message. Now the question is, without actually taking the CRC of all the pound signs, could we predict what that value would be, given only the CRC of X, the first message, and the value of Y that we're XORing

against it? We actually could do that. The CRC, if we compute it explicitly, is this 7f95 and so forth, and then in the Python script underneath you can see the CRC can actually be computed without looking at X: you take the CRC of X, XORed with a bitwise-NOT term that resets the state, and the CRC of the all-a's message. In other words, take a message, tamper with it by flipping some bit, and as long as you do the correct thing to the CRC, it will still check out. Well, what if that X value were actually unknown to the attacker, or only partially known, but they still know the CRC? That could be a problem. Or

alternatively, what if both that original X message and its CRC are encrypted using a stream cipher? Well, with a stream cipher we can still flip bits wherever we want, and that brings us to an infamous security problem, which would be WEP. So remember WEP: 802.11, 40-bit keys, IVs too short, injection, and so forth. How the WEP frames actually looked: they sent a frame containing an IV and then RC4-encrypted data composed of a message and its CRC. Some researchers, basically from like 2000 to like 2010, and I guess nobody cares about WEP anymore, but basically it was one attack after another, each one improving on the last. One of those earliest papers was recognizing

the fact that, looking at these two properties, XOR on ciphertext does the same thing to the plaintext, and we can fix up the CRC after modifying it. They gave a much more precise and theoretical description of exactly that, basically saying: we have a message and a delta, or changes to that message; if we take the CRC of the message and the CRC of the delta and do everything exactly right, we can produce a new message based on an old one, having never known the key. That led to many things, the one most closely matching this particular vulnerability being the Caffe Latte attack, which someone actually presented at ToorCon like six years ago or something. But basically they

recognized that rather than doing the typical approach to breaking WEP, which is, I'm standing here, the access point is there, let's interact with it and get lots of IVs, they took the opposite approach, which is: a corporate Windows laptop leaves the company, like somebody goes on vacation or to a conference. They noticed a property of the versions of Windows at the time, where there was this remembered network list, and it would reach out actively: where is this network, where is this network? You could pretend to be the access point. And then they noticed a couple of things, like it would send out an ARP request upon connection, which you couldn't answer because you didn't know the key. Or maybe

you could, because using this malleability property and the ability to compute the correct CRC, they could flip around the correct fields in the message to send a message to that device, and every time it would respond with an ARP response. And one of the properties of WEP is, if you have 2 to the 24th packets, because of the IV being 24 bits long, then with increasing probability you've seen all of the IVs and all of the keystreams, and can then take advantage of various biases in RC4 and recover the WEP key. So this is like the nail in the coffin for WEP, if the other 50 attacks before it were not.
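
The CRC properties walked through above can be checked with Python's built-in `zlib.crc32`. The sketch below is an illustration, not the speaker's script: the `seal`, `open_frame`, and `forge` helpers, the "pay alice" message, and the random keystream (standing in for RC4 output) are all assumptions made for the example. It shows the all-ones result when a message is followed by the bitwise NOT of its own CRC, the XOR predictability for equal-length messages, and a WEP-style forgery where the attacker flips message bits and fixes up the encrypted CRC without the key.

```python
import os
import struct
import zlib


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def crc_le(data: bytes) -> bytes:
    """CRC-32 packed little-endian, as a WEP-style frame appends it."""
    return struct.pack("<I", zlib.crc32(data))


# Property 1: a message followed by the bitwise NOT of its own CRC
# (little-endian) drives the internal register to zero, so the output is all ones.
msg = b"hello world"
inverted = ~zlib.crc32(msg) & 0xFFFFFFFF
print(hex(zlib.crc32(msg + struct.pack("<I", inverted))))  # 0xffffffff

# Property 2: for equal-length messages, the CRC of x XOR y is predictable
# from crc(x), crc(y), and the CRC of an all-zero message of that length.
x = b"B" * 8
y = b"a" * 8  # 0x42 XOR 0x61 = 0x23, so x XOR y is b"########"
predicted = zlib.crc32(x) ^ zlib.crc32(y) ^ zlib.crc32(bytes(len(x)))
print(predicted == zlib.crc32(xor(x, y)))  # True


def seal(message: bytes, keystream: bytes) -> bytes:
    """Stream-cipher encryption of message || CRC(message)."""
    return xor(message + crc_le(message), keystream)


def open_frame(frame: bytes, keystream: bytes) -> bytes:
    plain = xor(frame, keystream)
    message, icv = plain[:-4], plain[-4:]
    if crc_le(message) != icv:
        raise ValueError("CRC mismatch")
    return message


def forge(frame: bytes, delta: bytes) -> bytes:
    """Attacker side: apply delta to the message and repair the CRC, no key needed."""
    fix = zlib.crc32(delta) ^ zlib.crc32(bytes(len(delta)))
    return xor(frame, delta + struct.pack("<I", fix))


old = b"pay alice $0100"
new = b"pay alice $9100"
secret_keystream = os.urandom(len(old) + 4)  # stands in for RC4 keystream
frame = seal(old, secret_keystream)
forged = forge(frame, xor(old, new))
print(open_frame(forged, secret_keystream))  # b'pay alice $9100'
```

The forgery only needs the difference between the old and new message, which is why partial knowledge of the plaintext is enough.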

So that's a problem with CRCs. Now say your developer knows, okay, cryptographically secure one-way functions, cryptographically secure hashes, wonderful, let's use that. Well, what if they forget about the fact that these typical functions are unkeyed? Have a message; just like a CRC, anyone can compute the hash of it as well. Well, one of the problems, if we take our CRC approach, is the guarantees and properties of even unkeyed hash functions, which are the famous three here that are always confusing when trying to remember what the distinction is: collision, preimage, and second preimage, which basically boil down to: you can't find two messages with the same hash value, for one, and also, given a

hash value, it's computationally infeasible to generate some message that hashes to that value, or at least when the hash function comes out it meets all of its properties. But moral of the story: this whole idea of flip a bit and somehow fix up the hash value is not going to work unless it's catastrophically broken in some way. So we have to find some other attack, or look at even worse programs than before, and for this one I'll actually look at something behavioral. So who has ever downloaded a Linux distribution? All right. Maybe using wget or something in the background, so it doesn't crash or you accidentally close the tab or whatever, and you might use HTTP or,

if you're old-fashioned, FTP or something else. And once it finishes, you don't want a coaster, so you want to verify that the image downloaded correctly. So you take a look at a file that looks like this, SHA256SUMS; you compute it locally, see if they match, and if they match you're good to go, right? Well, if your attack model, or threat model as you call it, was about random corruption to that image as it was downloaded, fine. But one of the other reasons why they produce these is that, if you think about it, there are large numbers of mirror sites that they just kind of transparently send you to. If you think about certificate authorities

having like 50 people that are all trusted to do the same thing, maybe not too good of an idea. So if you think about a compromised or malicious mirror site, this may not be good enough, because if they can modify the ISO, they can modify the hash file as well. Well, this is not something that they overlooked at Ubuntu and Debian or Red Hat or whoever. What Ubuntu and Debian actually do is put the signature in a separate file that looks like this. So of the people that download Linux distributions and compare the hash, who actually remembers to verify the signature of the hash? Not so many. One. Okay. So that's important. This was just a

behavioral example, but anybody that implements in their software some kind of auto-update feature: I've often seen it's a file and a hash of the file and you're good to go. They don't think about, well, if you're in Burp Suite or whatever, I can just put whatever hash I want. So that's important: you need to think about not only is the hash correct, but who's capable of generating it. Say they do recognize that we want to produce a keyed hash, or keyed one-way function, which is a message authentication code, but some people think of that and decide, let's make our own: let's take a key and a message, combine them in some way, and hash them. Well, quickly, what a message

authentication code is supposed to do is just like a hash, except there's a key, so only someone in possession of the key can generate that hash. But don't forget, only someone in possession of that key can verify it either, so it's like symmetric cryptography as opposed to asymmetric cryptography. What a MAC does is protect a message from being modified. A lot of people see MAC and just immediately think of HMAC. That's probably the most popular way, because of various RFCs and IETF standards, but it's not the only way to implement a MAC. In fact, it's kind of a passing thing if you look at some of the newer cipher suites in TLS and so forth, which I'll get to later. But a hash is

probably, in isolation, one of the fastest ways, so it's often the one you think of first. Don't confuse them with digital signatures. Just like the previous example, it's symmetric, so that means software distribution would not be a place to use an HMAC, because you would have to have the key to verify it, and if you have the key you can modify it. Don't try to make your own. So how can you fail? Well, it turns out that all three of the immediately obvious approaches for constructing a MAC are insecure, some worse than others. So, basic problem: you have a key, you have a message, and you want to hash them so that you cannot generate that hash without

the key. Well, the first one: maybe you take the message and you concatenate the key on at the end. The problem there, if you actually want it to be as secure as an HMAC or any other form of MAC: suppose the hashing algorithm has a collision in it; that approach would be broken, because of the layout of the message coming first. But it's not so bad necessarily. The one that is worse is the second example, key concatenated with message, and I'll get to why in a second. And then, actually, the people that really care about these things say: if you have two keys and you do a key, the message, and another key, that can be shown to be no more secure

than one key. So all three of these are not something a cryptanalyst would think very highly of, but this middle one is catastrophically broken. The reason is a length extension attack. When you have a hash value, that hash value represents the internal state of that hashing algorithm when it was done processing the message. If you can take that state and put it back into the hashing algorithm and feed more data through it, the key's already been through, so you'll compute a correct hash. In a real-world scenario the key is unknown, but you might know its length: six bytes in this case. Hopefully it's actually bigger than that.
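
The length extension idea just described can be sketched in pure Python. This is an illustrative reimplementation, not the speaker's OpenSSL-based demo: it carries its own small MD5 compression function so that a published digest can be loaded back in as the internal state, and it uses `hashlib` only to play the role of the server. The key `b"secret"`, the `extend` helper, and the `b"earth mars"` suffix are assumptions made for the example; the attacker code uses only the digest, the key length, and the original message.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF
# Per-round shift amounts and sine-derived constants from the MD5 specification.
SHIFTS = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
SINES = [int(abs(math.sin(i + 1)) * 2**32) & MASK for i in range(64)]


def rotl(x: int, n: int) -> int:
    return ((x << n) | (x >> (32 - n))) & MASK


def compress(state, block: bytes):
    """Run one 64-byte block through the MD5 compression function."""
    a, b, c, d = state
    words = struct.unpack("<16I", block)
    for i in range(64):
        if i < 16:
            f, g = (b & c) | (~b & d), i
        elif i < 32:
            f, g = (d & b) | (~d & c), (5 * i + 1) % 16
        elif i < 48:
            f, g = b ^ c ^ d, (3 * i + 5) % 16
        else:
            f, g = c ^ (b | ~d), (7 * i) % 16
        f = (f + a + SINES[i] + words[g]) & MASK
        a, d, c, b = d, c, b, (b + rotl(f, SHIFTS[i])) & MASK
    return [(s + v) & MASK for s, v in zip(state, (a, b, c, d))]


def md5_padding(length: int) -> bytes:
    """Padding MD5 appends to a message of `length` bytes: 0x80, zeros, bit length."""
    return b"\x80" + b"\x00" * ((55 - length) % 64) + struct.pack("<Q", length * 8)


def extend(digest: bytes, covered_len: int, suffix: bytes):
    """Forge MD5(key || message || glue || suffix) from MD5(key || message) alone."""
    state = list(struct.unpack("<4I", digest))  # the digest is the internal state
    glue = md5_padding(covered_len)  # padding the server already hashed
    tail = suffix + md5_padding(covered_len + len(glue) + len(suffix))
    for i in range(0, len(tail), 64):
        state = compress(state, tail[i:i + 64])
    return glue, struct.pack("<4I", *state)


# Server side (hypothetical): naive MAC computed as MD5(key || message).
key = b"secret"  # unknown to the attacker; only its length, 6, is assumed known
message = b"hello world"
mac = hashlib.md5(key + message).digest()

# Attacker side: uses only mac, len(key), and message.
glue, forged_mac = extend(mac, len(key) + len(message), b"earth mars")
forged_message = message + glue + b"earth mars"

print(hashlib.md5(key + forged_message).digest() == forged_mac)  # True
```

The forged message contains the original padding bytes in the middle, so whether this matters in practice depends on how the receiving application parses the data.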

And let's observe that we're interacting with some system where we are able to convince it to do the MAC of the string "hello world," and it returns this value, and it uses MD5; MD5 for HMAC is not considered broken, unlike regular MD5, just an important point. Now, can we use a length extension attack to somehow feed "earth mars" onto the end of "hello world" and still be able to compute the MAC without knowing the key? Almost. As stated, you have a hash, you can put that back in and keep going. The problem is that MD5 and SHA-1 and SHA-2 share a property where they, like a block cipher, actually have a block size;

they don't just go bit by bit or character by character. Knowing the properties of MD5: you have a message; before it hashes it, it concatenates a 1 bit, or a 0x80 byte, fills it in with zero bytes to meet a boundary on the block size, or just before it, and then adds to that the length of the message in bits as a 64-bit little-endian integer. The length part is important; otherwise, with the padding added, different strings could end up with the same post-padding representation. Notice, though, that this is a consequence of MD5 only operating on a certain block size of 64 bytes,

But knowing this, going back to our problem: we don't know the secret, but we know it's six bytes long. And if you compute this value on hello world, you end up knowing that the internal state contained this as the hash was generated. The first 0x80 is that one bit; the second 0x80 is the length in bits of what was hashed, the secret plus the string hello world. How do you reconstruct the state? Well, this is one of those corner cases that I always find where scripting languages make it more difficult than C. In OpenSSL you can actually take that hash and just force it into the context struct that it keeps around for maintaining the internal state of MD5. So we take this

128-bit-long hash, and after looking at the RFC or the Wikipedia article on how MD5 works, you know where the various values go, and you know how to reverse the bits or bytes into the correct order. The Nl field is just keeping track of how many bits of message have been seen. So we've put that state back into MD5 and we can feed more things through it. Let's do exactly that: run that Earth Mars through there, and MD5_Final will add the padding again and release the hash. We run that program and we get this 16 15 and so on hash. What this told us is the hash of this synthetically constructed string underneath, of hello world, padding, Earth

Mars, and in reality padding again, is that value. We can check that by manually taking the MD5 of this string and seeing if we get the same value, and we do. So that's a length extension attack. As with the other problems, there's some sample code there if you want to play around with it. The important thing to note is it's not just MD5; SHA-1 and SHA-2 have the same property, with different block sizes and maybe different byte order, depending on who designed it. But this is something that NIST and others actually took into consideration as they designed SHA-3, and it's not susceptible. The only distinction is if you don't reveal the entire state of the hash, you can't like

rewind and go back into that state and continue. So this seems like something, who would ever make this mistake? Well, Flickr did, who may have been more important when they made the mistake than they are today; I think they're associated with Yahoo or something. So they had an API, which, if you think about Amazon Web Services, they have this idea of a signed URL, which in reality should be called a MAC'd URL because it's not asymmetric. They had some URL you call in an API, and you have a bunch of parameters saying, like, user ID equals this, action equals that, maybe photo ID equals something else, and then a hash

value, so that as a web application you can hand this URL out to a user's browser, they could reach out to the Flickr servers, and those servers could in turn verify that the user and application, everybody, is okay with this, and it's not just some random URL that you're brute forcing. So they had an API that looks like this. I have the paper linked at the bottom, but the moral of the story was they were able to do the length extension attack and combine this with some properties of HTTP query strings, as well as the way in which they computed the hash, to have a URL that looks like this. And you can't really see

because of this projector, but where I have the MD5 padding, it's very similar to what I just showed, with an 0x80, 00, 00, 00 and then a length. So they were able to have essentially the original request, padding, and then more data added on the end that did something completely different than the original one, and the server accepted it. So this is a big problem, but if you had just used a standard algorithm like HMAC you would have been fine, so kind of embarrassing for them. Another problem, as these systems we're thinking about get more and more sophisticated: say you are indeed doing encryption and authentication. Well, the problem you have is what order to do them in, right? In the

two systems that you've seen, it was integrity check and then encrypt. Well, it turns out that if you think about the three ways to do that, modern practice is you do the encryption first, and whether the others are insecure or not depends on the specific protocol you're looking at. They're called encrypt-then-MAC, which is as it sounds, you have the message and its hash or other integrity check and you encrypt the whole thing... or, that's MAC-then-encrypt. Encrypt-then-MAC is, you would have the... okay, this is very confusing, as you can see. So: encrypt-and-MAC versus MAC-then-encrypt versus encrypt-then-MAC. Do you encrypt the hash value or not, and

do you do it before or after encryption, is what it boils down to. Encrypt-then-MAC is what is recommended today. The problem with the others is the issue of padding. If you hash or integrity check the plaintext and then encrypt it, that plaintext has actually changed a little bit in the process of encrypting, by adding padding; block ciphers have to do this. Only in encrypt-then-MAC, because the encryption happens first, is the value of the padding covered by the authentication check. In all the others you can tamper with it and the check will still pass, unless the padding itself fails. But what if the padding

itself is not very well defined, and it's not covered by the HMAC, and you can flip bits or cut and paste blocks around? If this starts to sound familiar, this is actually POODLE, the kind of stake through the heart for SSL 3.0. The actual root of the problem is CBC encryption. Unlike the actual RFCs that have come out since that protocol existed, the padding block, which is 16 bytes, was defined such that just one single byte in there had to have the size of the padding. Well, what they noticed in that paper is, by taking advantage of these CBC attacks and kind of controlling values earlier in the process, and just cutting those blocks

away and putting them later, you have the single byte that has to be correct, and by random chance you have a one in 256 chance of matching. And if you match that, they were able to go byte by byte and reconstruct values of session cookies and so forth. Once you do that, despite it being over SSL, now you can just go to the application directly and be that user. However, if they had authenticated that padding by using encrypt-then-MAC, they would have been okay. So a minor little detail like that is why I say this often seems a little more interesting than just encryption alone. And the secondary thing: you thought an integrity check was

for integrity, but in this case it actually helps you out with confidentiality as well. Looking at the other purpose of integrity checks, as I get ready to close out here: the other thing that kind of falls out of the design of an integrity check, that's now used almost universally, is you want to verify that a message you receive today and a message you receive a year from now are the same, without storing it, which is exactly the problem you have with passwords, right? But all the immediately intuitive ways of doing that are insecure, pretty much. So you might, by looking at a hash function, say, okay, one-way transform, great, you can go one

way, you can't go the other. So you take a password, password1, you hash it, you get a hash value, and you throw that in a database. Because of preimage resistance, this wonderful property that we studied of this algorithm, we know that taking that hash and figuring out what the password was is infeasible, right? Well, if the user used a weak password, every common password is already out there; its hash has been precomputed in a dictionary. Disk sizes are large enough to store very common words or even less common words. You Google an MD5 of something that's not cryptographically secure, or not a good password, and it's going to come up in one of

these sites. So you'd think this would be a solved problem, but if you look at the kind of niche-market web applications that we see often, this has not been realized and fixed even today. And I actually want to mention the LinkedIn breach, like two or three years ago; I think they had plain hashes. I'll get to the correct way a little later on. So given all those failures, what are the actual tools that we covered, and what are the guarantees they provide, in a little more detail? I kind of covered one-way functions, or hashes, though they're not the only way. Once again, you take a message, you get an integrity

value. Message authentication codes: a keyed version of that. Authenticated encryption I'll get to in a little bit, and digital signatures. So, one-way functions: you take a message, you get some transformation of it that can't be reversed. Hashes are not the only way to do that; another way is you can take a CBC-mode cipher and kind of manipulate it in a certain way to get what you might call a hashing algorithm. Remember those properties; I don't want to reiterate them again because they just confuse me, and you most likely, maybe not everybody. But I think a notable property here is, unlike symmetric encryption, where we still have plenty of things from the 70s and 80s that are failing

because of key length rather than other properties, like Triple DES or something, or even RC4, which has kind of just in the past few years become unusable, hashes tend to fail catastrophically not very long after they come out, I've noticed. So MD4 still has some residual uses in some NT password things, I believe, and then MD5, and then SHA-1; all three of those are like "do not use" at this point because of collision problems. So if you think about it, SHA-1 is something coming out in the mid-90s and twenty years later considered completely unusable, and it's reputable people that came up with these things. It's obviously much more difficult than just a plain cipher. So

recommended today are SHA-2 and SHA-3. SHA-3 is more recent and may not be in all the libraries that you would want to use. SHA-2 can be a little confusing, because SHA-384 is a 384-bit-long version of SHA-2, and not SHA-3. If you need to implement message authentication codes, or use one, remember that it's not just adding a key to a one-way function. HMAC is probably the most popular; it's not the only way. Don't build your own. If you look at the implementation of HMAC, it's actually a hash inside of a hash, with these weird constants added in there for various reasons that theoretically give you

certain guarantees, so don't tamper with that. And an interesting point, as I mentioned before: because HMAC is so well designed, even if a collision is found in the hash, the HMAC might not itself be broken. This is something that's a little important to me because I write recommendations and reports. Say they're using HMAC-MD5: well, this is not like critical, place is on fire. There are RFCs out there that are kind of halfway on the point, like updated security considerations for MD5, and if you search for that you'll see it, but it says, like, HMAC-MD5 is not necessarily broken, but in the spirit of deprecating algorithms when they're weak,

it's a very weak statement like that. HMAC-SHA-1 is considered good to go still. So as you look at that problem: what if you could encrypt something and authenticate it in one step, so that it's difficult to do incorrectly? That's the idea behind authenticated encryption, and part of it is there's less opportunity for mistakes if the encryption process is a black-box API to the developer. This is the mindset behind AES-GCM, or Galois/Counter Mode, in addition to some others that are less popular. The reason this is important is particularly for best practices in TLS and SSH: both of those offer GCM in modern versions, and that's the best way to go, because the whole historical order

of MAC and encrypt just goes away if you use that. It's also faster than doing the two separate steps, so that's important when you think about AES-NI and other performance improvements, and making sure you get one hundred percent of the benefit of those. Final review, on digital signatures: it's basically a MAC, but asymmetric, so anyone can verify it but only one person can create it. The important thing, though, is that these use one-way functions like hashes in their raw forms internally as part of that process. So an RSA certificate has a SHA-256 hash of things inside it, and if a collision is found, unlike with HMACs, the whole system goes away.
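Since HMAC keeps coming up as the construction that holds up better than a raw hash, here is a minimal Python sketch of the "hash inside of a hash" structure described a little earlier, following the RFC 2104 definition. The function name `hmac_manual` is an assumption, and the sketch exists only to show the nesting; the standard library's `hmac` module is what real code should call.

```python
import hashlib
import hmac


def hmac_manual(key: bytes, msg: bytes, hash_name: str = "sha256") -> bytes:
    """Illustrative HMAC built by hand; prefer hmac.new() in practice."""
    block_size = hashlib.new(hash_name).block_size
    # Keys longer than one block are hashed first, then zero-padded to a block.
    if len(key) > block_size:
        key = hashlib.new(hash_name, key).digest()
    key = key.ljust(block_size, b"\x00")
    # The two constants from the specification: 0x36 (inner), 0x5c (outer).
    inner_key = bytes(b ^ 0x36 for b in key)
    outer_key = bytes(b ^ 0x5C for b in key)
    inner = hashlib.new(hash_name, inner_key + msg).digest()
    return hashlib.new(hash_name, outer_key + inner).digest()


# The hand-built version should agree with the library implementation.
tag = hmac_manual(b"secret", b"hello world")
print(hmac.compare_digest(tag, hmac.new(b"secret", b"hello world", "sha256").digest()))
```

Because the outer hash only ever sees the fixed-size output of the inner hash, an attacker who extends the inner computation still cannot produce the outer value without the key.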

So MD5, this is like 10 years ago at this point: you can modify a certificate and reconstruct the hash and break TLS or another system in that way. When you think of digital signatures, think of RSA and ECDSA, the two most common implementations. So closing out: as a developer or security person, what should you know about these things if nothing else? That will be the recommendations. If you are developing an application, or just as an end user you run into a situation where you're doing something security relevant that requires message integrity, if you're making some system that has multiple network nodes, or IoT, since I'm from the IoT Village, just use TLS if you can, or

maybe use SSH if that's more appropriate for what you're doing. Don't design your own. And if you do insist on designing your own as a blank slate today, really look at AES-GCM mode, for performance and because it's stronger and more difficult to mess up. On the other side, if you're doing passwords, don't say, well, we'll MD5 this and our salt is this; just use one of these built-in algorithms like PBKDF2, bcrypt, scrypt, and so on. They're actually designed for deriving a cryptographic key from a password, but you can use them for verification as well. Current recommendations are to enforce a minimum length and perform a dictionary check. This is actually something going on with

the NIST SP process right now, where a lot of the old wisdom, like password aging and so many exclamation points and security questions, all that, it's actually not just security people frustrated with that; the actual government standards community is starting to realize that and move away from it as well. So, closing out at 4957. Any questions?
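The password recommendations above can be sketched with Python's standard library, using PBKDF2 as one of the built-in algorithms named in the talk. The iteration count, the minimum length of 8, and the function names are assumptions for illustration, not values given by the speaker.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed work factor; tune for your own hardware
MIN_LENGTH = 8        # assumed minimum length


def is_acceptable(password: str, common_passwords: set) -> bool:
    # Minimum length plus a dictionary check, as recommended in the talk.
    return len(password) >= MIN_LENGTH and password.lower() not in common_passwords


def hash_password(password: str, iterations: int = ITERATIONS):
    # A fresh random salt per password defeats precomputed dictionaries.
    salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = ITERATIONS) -> bool:
    derived = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    # Constant-time comparison of the stored and recomputed values.
    return hmac.compare_digest(derived, expected)
```

A caller would store the salt, the iteration count, and the derived value together, then pass the first two back into `verify_password` at login.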

Okay, so I will have these; they're not on our ISE website immediately, but I'll get them there sometime tomorrow. In the meantime, if you go to demo.securityevaluators.com/integrity you can play with some of those examples if you so desire. All right, thank you.
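For anyone who would rather experiment locally, here is a minimal Python sketch of the MD5 length extension attack from the talk. The talk's demonstration forces the state into OpenSSL's MD5 context in C; Python's `hashlib` does not expose that state, so this sketch carries its own small MD5 compression function instead. The key `b"secret"`, the exact spelling of the two strings, and all function names are assumptions.

```python
import hashlib
import math
import struct

SHIFTS = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
SINES = [int(abs(math.sin(i + 1)) * 2 ** 32) & 0xFFFFFFFF for i in range(64)]
MASK = 0xFFFFFFFF


def _rotl(x: int, n: int) -> int:
    return ((x << n) | (x >> (32 - n))) & MASK


def _compress(state, block: bytes):
    """Run one 64-byte block through the MD5 compression function."""
    a, b, c, d = state
    words = struct.unpack("<16I", block)
    for i in range(64):
        if i < 16:
            f, g = (b & c) | (~b & d), i
        elif i < 32:
            f, g = (d & b) | (~d & c), (5 * i + 1) % 16
        elif i < 48:
            f, g = b ^ c ^ d, (3 * i + 5) % 16
        else:
            f, g = c ^ (b | ~d), (7 * i) % 16
        f = (f + a + SINES[i] + words[g]) & MASK
        a, d, c, b = d, c, b, (b + _rotl(f, SHIFTS[i])) & MASK
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d))]


def md5_padding(total_len: int) -> bytes:
    # 0x80, zero fill, then the length in bits as 64-bit little-endian.
    zeros = (55 - total_len) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack("<Q", (total_len * 8) & (2 ** 64 - 1))


def extend(digest: bytes, original_len: int, extra: bytes):
    """Forge MD5(key || msg || glue || extra) from MD5(key || msg).

    original_len is len(key) + len(msg); the key itself is never needed.
    Returns (glue, forged_digest).
    """
    glue = md5_padding(original_len)
    # The digest is the internal state, stored as four little-endian words.
    state = list(struct.unpack("<4I", digest))
    forged_len = original_len + len(glue) + len(extra)
    tail = extra + md5_padding(forged_len)
    for i in range(0, len(tail), 64):
        state = _compress(state, tail[i:i + 64])
    return glue, struct.pack("<4I", *state)


# The "server" side knows the key; the attacker only sees mac and the length.
key, msg, extra = b"secret", b"helloworld", b"EarthMars"
mac = hashlib.md5(key + msg).digest()
glue, forged = extend(mac, len(key) + len(msg), extra)
print(forged == hashlib.md5(key + msg + glue + extra).digest())
```

The attacker sends `msg + glue + extra` along with `forged`; a server that checks MD5(key || message) would accept it. Swapping the server's check for HMAC makes the same trick fail.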
