Securing Protocols With (Not So) Formal Methods - Jean Snyman

BSides Bristol · 202536:2639 viewsPublished 2024-01Watch on YouTube ↗

Speakers

Jean Snyman

Tags

CategoryTechnical

StyleTalk

Show transcript [en]

hi everyone uh my name is John um I'm giving this talk on uh well it's titled securing protocols with uh Noto formal methods so the idea here is to present a particular approach to looking at and thinking about protocols whether you are um you know building a protocol uh for some purpose or maybe you are U trying to find flaws in an existing protocol cuz uh Protocols are everywhere um so a bit about me um I'm currently working as a security uh research engineer at H Packard um I did my Ms in information security um and chronic procrastination um and I was uh for that I received an award um so yeah yeah doing pretty all right I think um so

formal methods uh what exactly are they um simply um we use them to analyze programs or systems uh using maths uh please don't leave um the uh but actually there's going to be very little math actually no maths in this in this talk so you're all safe um but actually formal math are pretty cool because they are what gives give us confidence in our cryptographic Primitives um that you know the algorithms we use are secure so what do I mean by Prim Primitives um these are your sort of standard cryptographic algorithms things like AES RSA Shot 2 256 Etc um and we call them Primitives simply of course because they're not terribly useful on their own we use them

to construct bigger systems uh for example protocols um so you may have heard that designing cryptographic Primitives uh correctly it's uh kind of hard um well good news um using these Primitives correctly is uh super easy it's a total walk in the park uh oh no wait that's completely wrong um so luckily for us we can apply for methods to things like protocols also and many people have um but first um just to be clear what do we mean by protocol so for the purposes of this talk today uh we're going to say that a protocol is simply uh a defined way of communicating uh between devices in service of some goal right um and on the slide we have an

example of a pretty bad authentication uh protocol um and uh but you know using formal methods we could take this protocol and prove that it is secure or maybe that it's not so secure as we might have thought um and we you know we have proofs for many different protocols um TLS uh Kerberos uh even like you know secure messaging apps like signal and Matrix you know people have analyzed those protocols and uh found flaws in them uh oof is another one um so yeah so if we look at formal verification for security um this is roughly how it works okay so we have a system um and that you know in our case we're talking about a

protocol it could also be a program perhaps that we're looking at um and this system will have different elements right uh different parties uh different algorithm sub routines uh or you know if we're talking about protocols individual me messages right and we can take these elements and model them using mathematical constructs right so we translate uh you know something in the real world into some you know conceptual model basically um and we write all of this down in excruciating detail so then we take our model um that we've written out and uh we write based on that model a set of security definitions so what does it mean for something to be secure you know that's a quite

open-ended question and it's going to depend on your particular use case is is confidentiality important for you how about non-repudiation maybe you're looking for forward secrecy well you need to capture those in formal definitions um and then finally um you produce a set of proofs uh based on uh you know proving those security definitions in the context of your model right um and uh yeah I just clarify though I have a picture of dice on the screen there I'm not suggesting that you should take Dice and you know do a roll of the dice to figure out whether your protocol is secure or not I'm I'm actually referencing a particular type of uh proof there called

a game base proof but uh that's neither here or there so uh there is a problem though unfortunately with uh formal verification so recently I uh use formal methods in a research paper um which I'm still working on actually um so here are the pages that give the model and the proofs but okay that's a lot but we're done oh no wait I lied there's more um so it's a lot but at least at the end of it we have fairly good assurance that this protocol that's been analyzed as secure but that's about 18 Pages give or take um and there are ways in which you you know there are methods of doing this which

don't require you to do it by hand like I've done there uh but it's still it's still you know very time consuming uh it's it's a lot of work um so in summary if we look at at formal methods or formal verification um we can kind of look at it as like a function um a function that you know takes as arguments a protocol Blood Sweat and Tears and returns either a proof of security or failing that gives you a flaw in the protocol now now I have a theory and that's that you can take this and modify it slightly that you can simplify things and get rid of the blood s and tears that goes into formal analysis now if

you do this you're not going to get a full proof of security but you can get some general assurance and sometimes that's good enough because you can still use this to find flaws in in a protocol so in the remainder of this talk I'm going to be talking about sort of lessons that that we can learn from formal analysis and hopefully we can apply these lessons uh whenever we approach a protocol not just when we need the rigor of formal proofs so to do this we're going to work through a real world example of a protocol to together all right so I'm going to present to you a problem um or a given scenario really so we're going to be

working with this open- source project that uh is for you know it does remote attestation and verification so what does that mean so remote attestation is uh a technique that allows a device uh to sort of report its state to uh another system right um and then that other system will verify the state so it will make a determination about whether the reported state is first of all an accurate representation of the system state but also whether that state is sort of as is expected um and we'll talk a bit more about what that means in a second um so the project that I'm talking about is called key liim um and it can be found at that URL

um but yeah so so okay if we consider this system of ours this this remote attestation verification system we have some device right and this can be a server it could be a network switch it could be you know potentially a workstation um and this device could be in a secure data center or maybe it's somewhere without good physical Access Control maybe it's in a retail uh environment right so anyone can maybe walk up to that device plug something in maybe swap out a PCI card for a different PCI card maybe change the the boot order um of the of the system um install software that is not authorized Etc so we want to be able to

detect if that happens um so this device uh of ours it uh contains inside it a TPM so if you don't know what a TPM is it's basically a a sort of a a cryptographic processor um it does cryptographic operations but more importantly it um has sort of Secure Storage or storage that is supposed to be prevented from tempering so it if you store a key in the TPM you're pretty sure that it's not uh going to leak hopefully um and the TPM sort of receives uh sort of measures of system State and we'll talk more about that in a second Additionally the device will have an software agent running on it and this agent is what will gather the

system State and report it to that uh Central server So speaking of central servers we have two of them we have a registar uh which simply keeps track of all our devices in our uh Network or deployed in the field and we have a verifier and that's what actually receives the state and makes a determination of whether it uh verifies according to a given policy or not so at a high level let's quickly think about our trust assumptions here um first of all we obviously trust our TPM uh without our TPM uh none of this works um so we we trust that um the TPM is is built according to spec and it's built by a trusted

manufacturer and behaves as we expected to similarly we trust the regist and the verifier obviously because well in the case of the verifier um we need to trust what it tells us about that the state of that device but here's but we don't actually trust the agent or the device itself or or perhaps more accurate way of putting this is that we might trust these devices conditionally so once uh if the verifier says that the state is as we expect then we might uh trust that device and and that is exactly what happens in uh workload authentication if anyone is familiar with spiy or Spire that's exactly what happens um spiffy or Spire will issue a credential to a

device but conditionally based on what whether or not that uh attestation has verified successfully all right so to achieve all this we have sort of different phases um that take place we have uh first of all a registration protocol at a high level how this works is that the agent when it starts up for the first time it will retrieve um something called an EK or a endorsement key uh from the TPM um and also a ekert and that that's a just the x509 certificate which contains the the public portion of the EK and it is signed by the manufacturer of the TPM all right um and then we also uh get a attestation key or we request a new

attestation key be created and we retrieve that um so the agent receives these keys and it reports these to the registar now when the registar receives them it checks to make sure that certificate is issued by a trusted TPM manufacturer it also will check to make sure that the EK and the AK it receives are both resident in the same TPM and there's a specific mechanism defined in the TPM spec by which you do that all right so that's that's registration now for the actual attestation when this when the device reports it state to the verifier how does that work so first of all um periodically loog various logs from the device will get sort of

read in a sense by the by the TPM um but the TPM doesn't store the logs because it doesn't have the the capacity to do so it simply stores a hash of those logs and by the way I'm talking about logs here these can be um things like ufi logs which will contain uh you know your PCI devices that could connected the the version of the firmware that's running Etc um there's various things that it could be there's also IMA which is a Linux uh technology which reports on what files um are you know the contents of files when they are sort of read into memory um so there's these logs can be various measures of

system State um what the TPM basically stores a hash of of those logs um so the verifier will periodically request um an attestation from the agent and uh the agent um will then request what is called a TPM quote from the TPM and this is a rough handwavy approximation here because the details aren't super important to us um but basically the TPM will certify that you know it has this particular hash which again is a hash of our logs um in you know the in its uh registers registers somewhere right um and then and it sort of provides that quote to the agent which the agent of course oh well okay the the agent receives a quote and combines this with

the actual logs so it has a hash of effectively a hash of the logs and the logs themselves and then it reports both of those things together to the verifier um and that hash is sort of signed by the TPM if you would um so now that we understand rough L how this whole system works let's build a model together all right so we're going to do this very sort of back of the napkin approach we're not going to do it according to the the the the painstaking process of uh that you usually would when you're doing a formal uh model um and we're not going to use precise mathematical terms um and our focus is

going to be on what the attacker can actually accomplish uh this is what is known as an adversarial model all right so very quickly we have parties a verifier register or tenant oh um the forget about the tenant uh we have a verifier or registar um and uh we have sort of many many different nodes or attested devices right um each node has a TPM and an agent as we discussed before and you can view these as parties themselves right um because again uh the TPM has a different has different trust assumptions from our agent um even though they're form part of the same node um and each node also has a representation of its state um which

will sort of abstract away as sort of a particular value we don't really care what the value is you can think of it like a Json file maybe if you wanted to um you know that has say your PCI cards in it or whatever um and then each TPM contains a hash of its no nodes State value um and so you'll notice here that we're being we're not being very precise we're sort of approximating things and that's sort of how you would do a formal model as well that's how formal modeling works as well um is that you know you don't you don't get every detail you just get you know you get a close enough approxim

um so we also have some keys um our register our verifier and our registar both have a TLS certificate they're both issued by the same certificate Authority um and the they possess the private key obviously um and they all trust that that CA and the agents in our system also trust the ca the the ca is sort of preloaded in the agents um and we have rtpm has that that EK and AK that we talked about previously so policies now the our verifier keeps a mapping of nodes to policies so a policy simply describes what the node State value is sort of expected to contain uh and you can think of this as sort of like an

authentication policy or authorization policy I should say um so you have a set of rules about uh what you expect to see in that state value um all right so now we're going to talk about our adversary um here we have what is known as a network based attacker so the attacker controls the network and it is the network um which means it can intercept it can block it can modify any messages which transverse the network um and but and that's quite standard you know we have these network-based attackers in you know in most protocol analysis will have this but we're also going to add some additional capabilities to the adversary um so the adversary May additionally corrupt a

node at its discretion so this means that it can access or modify any and all data stored by the agent um including any private Keys access and modifies the node State Val Val um intercept the agent reading of the state value um and exercise full control over the channel between the TPM and the agent so why are we doing this right because we are trying to detect modifications of that end system so we have to allow attacks in which there is where in which the attacker is resident on that attested system right so that's that's why we're doing this all right um but notably the adversary cannot corrupt atpm and it cannot break any cryptographic

Primitives all right so now our security uh definition um often it's useful to think what's the purpose of this particular protocol that we're looking at so in this case you know we're trying to get this verification result at the end of the day either yes or no it matches policy it doesn't match policy and we want that result to actually reflect the state of the device right um so we'll call our security property verification security so a successful attack against verification security is we'll say that it's when there's a mismatch between two things a verification outcome as determined by the verifier and the correct expected verification outcome for that particular node so this means that an attack in which

verification passes when it shouldn't pass is a valid attack and the inverse of that is also a valid attack but attacks which exploit poorly defined verification policies um so your policy doesn't you know capture that particular modification of the system you know that's out of scope also you know these measures of System state are just measures they're not a complete picture of the system state right so attacks which involve modifying those ports Parts which aren't reported to the verifier also of course out of scope um all right so now using this let's actually look at some protocols together so here's an example registration protocol for key lime in fact this is the current registration protocol that exists today

so we have three parties rtpm an agent and a registar or the registar and notably this protocol doesn't use TLS at all it's all in clear text all right so our agent sends that EK Pub uh that ekert and the AK Pub as previously discussed to the registar and the registar verifies that CT um against a TPM manufacturer certificate and if that passes it generates a new random key which we'll call k um and now the the the regist is going to use that K value to check if there's a binding between the AK and the EK so the way it does this roughly speaking this is not actually how it works in practice I am simplifying things but

basically um the uh the registar encrypts well first of all it concatenates the AK with the that random K value um and it encrypts the the result of that uh with the EK Pub and the agent just forwards that to the TPM now and this is a defined process in the TPM spec again so the TPM will decrypt the EK or sorry will decrypt that uh package that it's received using the the EK or the the private portion of the EK right and then it's going to check to see if that AK that's contained within it if it's the owner of that AK if it has that AK stored in uh in its uh registers um and if so only if that

check passes does it release the K value to the agent then the the agent performs a hmac uh of its ID uh using that K as the key um and sends that to the registar and now the registar is can be certain that that AK and EK are resident within the same TPM okay but there are some issues here so let's sort of look at think of some of the things the adversary can do so what if the adversary uh steals the EK and the AK remember that all of this is in clear text and the adversary has full control over their Network so it can pull out that EK Pub that certificate that AK Pub

right and this is in a run between agent one tpm1 right okay so the confidential confidentiality of these keys aren't protected right but now so it takes those and injects them into a different run of the protocol so now we have a registration protocol happening between with a different agent agent two and a different TPM tpm2 right so instead it swaps in those EK that EK and the AK from the previous run because again it can do that and then as a result of that um the the sort of value that the registar sends back is actually um encrypted with the other agents EK Pub and it contains the other agents AK and now we don't want that message to go

through because if this TPM receives that you know it's not going to be able to do anything with that that would be pointless so we just go ahead and block that request but we what but the adversary saves that for future and so if we return now to our first one of the protocol we can take that that encrypted um message from the other run and inject it here in place of the the encrypted message that the registar sends back so the registar sends back that encrypted message and our adversary swaps it out for the other encrypted message and that always obviously carries all the way through the agent forwards that to the TPM but actually

note now that this EK Pub is is the EK Pub for this tpm1 right because we swapped it in in the other run of the protocol so this TPM decrypts K Prime which is the K that from the other R of the protocol successfully now because of our adversaries additional capabilities remember we've said that the adversary can corrupt an agent right so we can actually retrieve that K Prime from the agent's memory um and now if we return to our second run of the protocol we don't even have to do the rest of this protocol in fact right because we have knowledge of that K Prime now so using that knowledge of the K Prime we can just do this hmac

ourselves as the adversary right and now as a result of this we have the registar associating um the wrong TPM or the is associating a TPM with the wrong agent if that makes sense um so agent one's TPM is associated with agent two effectively so that breaks of course our security uh proc property right because that would cause could cause uh verification to fail um because it's expecting it for one system or the verifier is expecting state from one system and it's actually getting state from another system now I've run out of time here there are more attacks that can be done against this protocol and uh I look at we've look at them you know by

upgrading uh the protocol uh with TLS on the registar side so server authenticated PLS uh turns out even if you do that that still doesn't fix the issue um we also try to fix the issue by um implementing Mutual uh TLS between the uh agent and the registar but that also doesn't fix the issue because again the adversary can read the the file system of the node so even though there's Mutual authentication it can retrieve that private key from the agent and so it can still sort of impersonate the agent um so yeah here's the example with mutually authenticated TLS um and again our adversary can take control of that agent and uh impersonated going forward so suffice to

say that this protocol is uh quite broken um now I I wish I could I had more time to go into this more um I've been working with these keyline protocols for the last I don't know 6 months or so um and there it's not just the registration protocol it sort of goes beyond that um and we're working on fixing this actually um there is a a a a proposal that I've written um you know to to sort of redesign these protocols um the community is working on so good news there um so quick summary um by sort of doing this sort of informal protocol analysis um we've actually managed to discover attacks on these

protocols we have two attacks in fact uh we didn't get into attack two really um but both of these attacks result in a verification policy being applied to the wrong node and so we achieve this by sort of very coarsely modeling our system outlining the attacker's capabilities and defining certain security properties so I would encourage you that next time you're looking at a sequence diagram for some protocol that you think try and think in terms of the adversary who is our adversary what and what can they actually do do because if you do that then you are sort of developing this mindset which helps you to identify weaknesses um and helps you to think through the implications of your design

decisions so that's it that's my talk um anybody have any questions yes question I have a question so it looks like we

verifying the manufacturer of that right so I'm not very familiar on with how TP work but is there any way that besides verifying the manufacturer of the tpn I can have sort of personalization of PPS because let's say that we use brand a and you have physical access to my device you swap the tpn and is yours because it will pass the manufactur verification but what happens if I have a TPM sign with exclusive key or identify for my company even if you swap the TPM you need to swap my myy material whatever personally my own so it's just an idea but I I don't know if that somehow mitigates the threat of physical access

well so interestingly so you're not supposed to be able to swap TPMS they're sort of soldered onto the motherboard typically um but you know you can sort of in effect um and that's kind of what we do in one of these attacks um the um basically the the attacker in this attack you know has its own TPM right and it uses that TPM to do bad things because you're right we're just trusting any TPM right because a TPM doesn't identify a device in fact in fact the spec expressly says that it's not supposed to do that so if you're using a TPM for purposes of identifying a device authenticating a particular device then that's not an appropriate use case and

this is I think the Trap that the keyme people sort of fell into is thinking that oh we have these Keys we have these certificates we have authentication we don't actually have authentication of that device in effect does that make sense does that sort of answer your question standard

stand yes that's a very good point yeah actually one of our proposed fixes is through the use of I I or Dev IDs and uh yeah yeah yeah then you bind the TPM to a particular device with a particular serial number and

that any other questions all right well thanks so much thanks for your patience everyone

Securing Protocols With (Not So) Formal Methods - Jean Snyman

Related talks