
um so I work in open source I've been working in open source for 20 uh years one of the things that I love about open source is that folk have a problem they want to solve come together write code release the code and it's available to everyone so this talk is uh about how some developers with inside Amazon uh had a problem they wanted to tackle which was the problem of authoriz ation uh and then created a project called Ceda which we open source last year so um that is the project that's sorry that's that's the talk um now I'm going to start off so first of all I make no assumptions whether you know
authorization or authentication actually that's the previous slide right so I miss that one out previous slide you know people get them mixed up but authorization is the process by when we build applications as developers how we make sure that princi users of our applications multi-user applications only have access to the resources they're allowed to have access to and interestingly enough in the last 30 years I've been working in Tech whilst the authentication problem has been pretty much solved the authorization problem remains a big problem and actually last week um someone sent me this it's now the number one item on OS web application developer um issues um so I guess kind of begs the question as to you know if we've cracked
authentication why is authorization so hard and the reality is that the actual theory of auth authorization is actually well known and we've developed over the last 30 years different ways of doing authorization so we have uh arbac roles-based Access Control we have ABAC attribute based access control and we have reback relationship based access controls and a few others which I didn't put on here and uh the difference between these are things such as role base being hierarchical so you can think of a directory structure active directory if you use that um attribute is where you assign attributes such as for example I'm the author of something and therefore I can provide authorization against that and
relationship kind of combines this and other things so the theory is well known but the reality is that actually implementing authorization is hard so we go back 20 years uh to when everything was monolith a piece of code that you could run there were fewer programming languages fewer approaches to doing things this was still hard right because we had to create authorization systems that allow us to create policies that would allow us to make sure that only the people had access to the resources we had to um figure out how to provide third party audit access to make sure that we could provide confidence in those authorizations and we need to make sure that we built this in a way that it was
easy to maintain as both when we needed to update the code but also when we needed to update security and um so this was 20 years ago and it was hard then to do this but if we go to where we are today we have sort of microservices written in different languages if you're building an application that you know combines Java rust go and you have to make sure maintain authorization request across all those different languages that makes this problem so so much harder and and what developers have tended to do okay you don't have to put your hand up if you you know uh recognize this but developers do this a lot now this is a very simple example in
Python but effectively what they do is they iterate in their code the different levels of access required for particular application resources and some of these can get very very complex and I've seen some absolute Horrors in my time and as Developers people put their sort of like you know heads together think you know I can build this I'm a good developer I'll build my own authorization system and um I I've done it myself right and it's super hard it's super hard for a number of different reasons the first is when you want to try and model authorization a lot of inv a lot of complex mathematics is involved um and logic uh and unless you are you know a master of
that you are going to make mistakes um a key characteristics of authorization is that it's got to be fast and it's got to be reliable because if you have an application that's being used by thousands or hundreds of thousands of people and it either goes slow every time you try and do something uh or just is unreliable in crashes that's not going to uh be well received by your customers and then how do you actually make sure that people trust your authorization engine because I know that when I built mine I would not trust you know my my house my life on it right I just wasn't sure it would actually always work and so if you're a company
how do you make sure you trust the authorization that uh you're using and so this is why these are the problems these are the challenges that these developers with inside Amazon decided that they wanted to tackle and they created something called Ceda so bit confusing cuz it's like a tree but um I don't know the origins of the name uh we're still trying to figure out what that is and CED that is three things the first thing is it's actually a policy language it's a policy language written in quite simple language that allows you to uh as a I guess application owner Define policies of who you want to access resources in your application the second thing is it takes
it's an authorization authorization engine that takes that policy plus artifacts from your application to proably provide correct authorization decisions and then one of the key characteristics of K is it provides analysis tools so you can start to reason about the policies you've got and start to ask questions such as you know uh do I have too much um authorization for this particular um principle or uh do I have duplicates within my authorization policies so I'm going to go through all these these three things and the first one is the authorization language so this is policies so policy defines what users principles can do what actions against your application resources it's pretty straightforward it's it's a very
simple example there and what it allows us to do is if we use the same code example from before it changes and simplifies the code from all that boiler plate code to just one function call which is is is authorized uh and so this is how effectively this can make your applications much simpler and irrespective of the language you're in it's always doing the same thing now when you write these policies in Ceda there are kind of three things within the policy the first is the effect uh which is allow or deny the second one is the scope so kind of based on what you want to do so a combination of the users the actions and the
resources what what you um want to provide authorization for and then the final one is uh optional which is attribute based so condition so I might guarant I might want to give uh say John Doe access to uh manage photos um but I only want to do it if they are logged in with MFA okay and so this is what they look like okay when you actually write these so permit um is your actual policy we have the optional in purple the conditional which is the uh when clause and one of the things that um I always get asked okay is why do we separate the two things out why do we do it this way uh
and the way that we do this Amazon is that we create something and then we get some customers and we say look this is what we're thinking about um you know what you think and they told us because we gave them a number of different approaches and they said they wanted this because it was simpler to read and this is important because one of the things we did before we created this is we looked at what other open source projects out there existed that we could contribute or work with to do this and the of the main one is Opa open policy agent which uses R now we looked at this and it's very flexible it's a very flexible policy
language uh which you can use for infrastructure you can use for application authorization however what we found and what we what we found speaking to customers is it was just too complex you can compare these two policies that do the same thing um and you know trying to when you've got lots of different policies is actually makes it significantly harder um so so effectively we decided that whilst Opa is interesting and we we support Opa and other things um it wasn't going to do what we needed so that's that was the kind of catalyst for using uh or creating Ceda but of course um when you create policies your policy need to reflect what your application uh
resources so in one of the key parts of when you're using Ceda is you have to model your application so in in the examples that I'm going to be showing you'll see action equals something uh in this case I've got uh the demo I'm going to show is um photos sharing application now if you think about the characteristics of a photos sharing application they're going to be things like add photo create photo manage photo there might be some other things such as site admin that kind of thing so every application is different so you define your application characteristics in a in a schema and the schema is what allow your policies to um or you can Define stuff
in your policies about your application so in this example here I've got view photo Ceda knows about view photo because it's defined in this in the schema so I mentioned that there's two effects permit and deny or forbid and the scope is effectively the kind of resources that you want to access now you can either be explicit in the resource or about any of these this applies to principles resources and actions so in the top one we've just got action which means that within your schema all actions are permitted in the in the second one we've got an explicit which is deny photo so effectively what this means is that unless you meet the criteria for deny a delete photo you'll
be able to do all actions apart from delete photo because delete trumps uh permit now in the examples I'm showing including in the code I use very simple easy to recognize names but you wouldn't do this if you were doing this in reality you would use uh unique references um because there's always especially if you got a user ID you know us users have this habit of changing name um and you don't want to hard code that so basically you create an abstraction an entity ID U ID that references your user and then you do that in your application logic now the conditional Clauses are very flexible um but they're not super flexible and this
is again one of the reasons why we create a Ceder is that there's always a tradeoff okay in performance characteristics and reliability characteristics um versus what you know developers want and so there are a few operators you can use um so in that respect it's much less flexible than things like uh Opa and um you can group things together so if you want for example Define one action or one principle um you can do that uh through lists for multi tented use cases you can actually Define name spaces as well where you can actually Implement uh you know multiple policies against different M main spaces so that's very quick overview of actually what policies look
like and the main characteristics of policy icies so how do how does this actually work so on the left hand side we have our policy we have our application information so the what the schema and then the entities within our application so the users things like that um and then we have context so this is the session so it might be things like you know I'm logged in my IP address my location that kind of thing so this is input into the authorization engine um and then the authorization engine will take that input and give you one of three outputs I know it says only two there but there's three outputs one will be allow second one will be deny
the third one will be an error the way that the evaluation engine works is through this model so if there's an explicit policy match you will get access if there's no policy match or there's an explicit deny or there's a combination of both you will not get access to your resources so we default in no access um which can be a pain when you're writing these things because you realize you every time you're writing your applications you don't get access and then you realize why but one of the questions you may be asking yourself is how can you actually trust I talked about building your own okay and how it's really hard so how can
you trust something like Ceda right versus your own code and this is where we look at how uh within Amazon we've got a research team that um is uh looking at how we as an organization can use automated reasoning people here familiar with automated reasoning put your hand up okay I I'll do I mean I'm not an expert in this right so I'll give a very very simple example but at school um I remember still uh you know working on Pythagoras theorems and calculus where you had to do for example proofs and all that kind of stuff and this kind of works in a very similar way um in that you define from a spec you start off with a
specification um so in our case you know what are what are the characteristics or what's the specification for an authorization system you then model these uh uh we use uh uh modeling tools such as daffany and lean and uh these are very expressive languages that allow you to create digital proofs in the same way as you used to do in mathematics with Pythagoras um but they also provide tools that allow you to test it to make sure uh you've got characters like correctness and then once we've got that we then Implement that so as an example okay this is what one of the specifications for an authorization system uh uh looks like so this is
actually any explicit deny in any policy logic overrides any allows right very simple and so taking a language like Deanie okay you can start writing some code that says okay well this is what we want to do but when you when when you think about this this is actually not um uh low enough detail right because what we need to do is when we're using um formal reasoning we have to look at the chain okay so we want evaluate logic evaluate logic needs a match deny match deny requires match statement match statement requires this Etc we get down to the lowest level and so what we end up having to do is create again in in
the tool logic that we can use to prove a string will always ignore case and so rather when when you're doing when Pythagoras you know did a a s = B2 + C2 we have to do that for every single part of the code so this thing gets really really big very very large this is the difference in when you're applying formal reasoning in code so it takes a long time um and it you know so so it took three years for example to to to create all of the the stuff that we did so the question is if it takes such a long time and it's so complex why do it um so the reason why we do it um is
that first of all it's the only way we can um provide proof of correctness in the systems we build okay so in this instance it's authorization and you know it's a really important characteristic to be able to rely and depend upon the fact that if you do an authorization request it's always going to be right and correct based on the policies you create but I think the more interesting things about why this is um uh useful is that it allows you to do things you can't ordinary do and I think that um I've got those uh coming up actually so one of you know the the things we want to do is we want to make sure we evolve
authorization code over time right so we want to make changes we want to improve it we want to add new features if we don't have a way of proving the characteristics or the correctness of security outcomes that process is very hard so think of it as like unit tests okay but once we've got proofs that allow us to check that the authorization results are always true we can go faster we can make changes we can do optimizations and so we use it for example when we when we built the new I am uh authorization engine uh which processes 1 billion requests per second of authorization requests so at this scale okay if we want to make changes we
need to make sure that we make the right changes and so this is where um effectively formal reasoning is really the only way that you can combine uh that proof of correctness with the ability to then move faster and so this is how Ceda was developed using a tool called or a methodology called verific verification guided development and effectively what this means is we took and we created using uh lean this is not not dhan we created from specifications some working code okay now we can't on that code in production it just doesn't work at scale right so we then built it again in Rust memory safe language very performant and we built that we built it again but we
don't have the the same Tools in lean that allow us to prove the correctness of the code in Rust so we know that the code will always give authorization requests in the lean implementation but we cannot say the same thing about rust and so this is where the the differenti testing comes in so we take basically inputs into both of the code bases that are created so we take a policy um we test it now if the rust code is right then it will give us the same output so allowed or denied or error as the lean version all right so we can we can test the code but what we want to really do
is test it literally hundreds of millions of times using fuzzers to create random uh uh policy inputs and as the output of that we get you know uh confidence that the rust code is equivalent to the the code and actually what happens is that every time we do a release so new feature gets added we go through this process and we were surprised how many bugs are rust code generated and what I mean by bug is the output from the lean versus the output from the rust was different and if that's different then you know you've got a problem because we know the Baseline is the lean implementation we know that's provably correct but if our
rust code differs we've done something wrong in our rust code um and so we do that until we get no errors and that means we can then do a release uh if you interested in that there's a blog post um I've got I've got a QR code at the end which you which will have all the links and everything so I've talked about what policy is I've talked about the policy authorization um engine and the next one is actually you know as an organization you want to now basically uh reason about the policies you have uh and typically this is because you want to check for things like equivalence Assumption of uh access within the users
you have and so there are tools CLI tools that allow you to do that so you can basically ask questions okay does this principle have this access and you can do it through automation um which means you can now start to um Implement modern application development techniques to test if your users got too much access or if you can simplify your policies um so I want to show you actually now how this works the code is available on repo it's python based um even though the Cod the actual engine is only and will only ever be deployed in Rust uh python has a thing called uh P30 I think it's called um which allows you
to have bindings very good very high performance bindings with rust um uh so yeah so that's so that's this code um and I'm going to take a very simple application it's a photo uh sharing application which you might typically have bunch of users groups uh that want to have access to uh a photo album you might have an admin that kind of thing and I might want to model some kind of typical authorization requests like this um so I talked about modeling your application uh I'll show you what it looks like but this is the schema of my application okay so I've got users I've got groups I've got photos I've got photo albums it's a very deliberately
very simple uh example and what I will do is I will take the stuff I've got in my schema I'll take the entities which are things like my usernames um the album names I will put that together with my policy into the authorization engine and it will then give me whether uh this resource or this request is allowed to proceed or not so let's now I'm going to sit down down and show you this stuff which you can run yourself so now so I've already I've actually already started the python application I'm not going to bother showing you that because it's pretty boring and I but I do want to show you what the code looks like so um a look
manage photo now this is looks okay up there so um this is deliberately verose the code um because I want it to to make it readable you wouldn't typically do it this way if you're coding the application yourself um but so and also this is flask appliation so if you're not familiar with flask it's a very simple uh tool you define a rout uh and then in this instance I've got login required I then Define the function of actually what executes when you access that rout and here I've got um my action my resources and my user and the request and then effectively I read in my policy and then here is the authorization request so I do authorization result um
the Diagnostics basically just outputs um the policy that was uh invoked you wouldn't again necessarily put that in there and then the actual logic is if it's deny then generate a 403 um and if not then show the page and that code is the same for every single function we have in there so if we had our own application like I've done this for Java as well in a spring boot application you could have a really simplified um application where you know you might have uh you know a front end react application that hits you know Java and uh python it's both using the same policies um so one policy you have to manage but the authorization logic is
still the same it's still still using SE the covers um so this is what my um schema looks like it's very difficult to see on this uh but you can kind of see the kind of thing that you know that you're effectively um showing how your application looks like and at the moment this is my these are my ques and they're all at the moment grayed out they're all disabled which means if I access my application so this is the application let's hope it works so this is the application very simple if I log in I'm want to log in as J
do um so again I can access the homepage but in theory I shouldn't be able to access any resources and as expected I can't let lock out so how do I change that how do I let Jane have access to resources well I can define a policy now there is one policy here um that I want to specify so where is it is it this [Music] one it's this one okay
so this is a very broad policy and I wouldn't ordinarily give this one to Jan but I do I want to show how how it works with regards to the conditional so this allows basically um permission access to any user against any action so those are the actions against all resources where the owner uh or the resource owner is the principal and in my entities I have to find this user Jane Doe and she is the owner of um one of the photo albums so she should have access from this policy to access now photos so let's now so I can access public photos and I can do man photos but I will not be able to access those now we
got John do who isn't the owner I don't think he's the owner anyway I've got that wrong let's see what he can do so he can't see anything so even though he's in the same group because he's not the owner of the resource he can't do that now I can log in as admin as an admin policy uh where is it
there we
go and this allows basically admin to have access to
everything there we go now in the in the in the example policies here I've actually um provide other examples how for example you can tie this into things like if you've got F MFA and to provide additional ways of enriching policy um stuff with the um request you've got now this is written in Python but what if you don't uh want to use Python there's this um thing called seed agent it runs as a its own process and um I can basically now I've actually got all these um where are they yeah I think they're here so I can actually now uh show you the so it's all all done through Swagger I can actually things do things like get
policies execute and it will show me the policy I've only got one policy at the moment um uh so you can use this and then use just standard uh uh API calls so using you know obviously I'm doing uh curl on the command line here um let's have a look if this one works
so allow access uh say allow and if I go back to the actual thing you'll see that basically outcome has succeeded so this is where for example if you don't want to use or if you're using a like esoteric language that's not supported by um the the bindings then this is how you would uh you would use that so that's the demo very quickly just I just want to show you how uh how it actually looks like now um there's we do a the research team does a lot of um papers uh and I mentioned Opa before um some of the the design decisions we took uh around uh Ceda and the specifically design around policies has actually
turned out to be very very uh wise because we can see that tail latencies as you scale up for other authorization systems start to go up but Ceda remains flat which means when you're doing like billions of requests per second then the performance of cedo is always going to to be uh uh good so effectively Ceda allows you to be performant it guarantees correctness and you can actually review um the policies there's a command line tool that allows you to automate this through the command line so what we see is we see people now starting to implement authorization through their um cicd pipelines so now you can do it and treat authorization as code which is something
that you've not traditionally been able to do and use the validation C to guarantee the policies are meet your test cases and you're not accidentally giving access to people you don't want to access um so there's a number of different ways of getting started this is a really good repo uh awesome Cedar it basically shows you both the official bindings language bindings to Java think Ruby um as well as the Community Driven ones the ones I showed you are Community Driven ones the CED one sorry the python one uh but provides also resources of how to get started and I showed the C the agent running so you can see that there now one of the questions I asked
okay is that everything I've showed you is running locally and you can do yourself but a lot of people start off doing it themselves and then realize that actually you know what I want someone to do it for me so there are companies like permit permit. strong DM or ourselves Amazon that provide effec a managed service of Ceda um and that's pretty much my talk really so if you want to learn more there's an online Workshop which you don't have to do anything you don't have to install anything you can just run it locally and um it's got policy playground so you can play around with it um there's a GitHub repo that allows you to get all the code um and there's
also a very active uh community on slack um that allows you to uh you know get involved with that because it's still a new a new project uh there are still rfc's being created that are changing what features this this seed is going to have going forward so I think we're on 3.2 at the moment but there's already about four or five rfc's these rfc's are how open source projects work right so as a community people are using it they say you know it wouldn't be great if we had this so hot topics are typically what operators they want to be seen in policy so that allows a much richer experience um when creating policies if
you want the slides um there is a survey um to do um you'll get it's a small small amount of questions I ask and then you get in return the SL I think for the first five you get a $25 Al credit C code as well so with that thank you very much and I've got time for I got time for questions yeah yeah I got time for questions a great talk um I noticed you said that the lean implementation not used production correct youate so the characteristics of what you want to run in production okay if you think about it it's got to be performant and it's got to be robust okay those are not characteristics that
the output of lean or defy or tlaa give you what they do is they provide tools to guarantee correctness or prove so they they create resolvers um and they have provide tools to to effectively guarantee that you are or you meet the the requirements of your specification so um you wouldn't you wouldn't use that in production right so the question is then you've got to then Implement something else right so for for um uh for for for ad us we we are now moving we're about 80% Java right but we are now moving more to rust especially critical systems around security are being built in Rust and so effectively what that means is that we'll build it
in Rust but rust doesn't have those capabilities right so how do you make sure that you you you you can have confidence in your rust code and there's no way right there's no there's no formal reasoning resolves which is why we came up with the differential verification methodology so it's not I mean you can you can run the the lean implementation you know Standalone on your on your on your laptop but you would not want to run it in production I mean there's no I mean uh lean or definitely don't have for example memory uh uh they're not type or memory safe right so who knows who could potentially compromise that I was just wondering did
you have something that would take lean as it is that into so so those tools do have that so I'm not sure about rust but they there is a um transpilation into Java right the problem is um what we I mean I I don't have it here but when you look at the code it generates Java code it's just it's it's dodgy it's not dodgy it's just it's correct it's well no it's not that it's it's it's hard to read right typical translation is that it doesn't think about an end user so it actually works okay but the way they Define for example variable names like like a sort of two pages long that kind
of thing so it it uh so there is a presentation actually we gave about four or five months ago that actually walks you through the code uh and what it looks like when it went through that translation process I don't think at the moment it supports rust as a output only Java um a lot of these tools actually tend to favor Java which is fine for us because we we we tend to write that but because we're now going to rust we now then Implement that through rust thank you that's right any other questions no I can't I've I've got the wrong glasses I'm not sure if youve if any questions up there okay so uh just a quick one because I noticed
that when you were doing the example uh you didn't actually have to reload flask uh when you updated the configuration is this just like a testing or is that actually like yeah so what I did for for the route if you saw I I'll show you the code
so in the code so the root okay you can see that I actually load I've got a function that reads the policy you wouldn't you wouldn't normally do that right I am only doing that because um to make it the application easy to to show how it works I mean you could if I mean I if you were doing system production bearing in mind the this is typically done for systems at scale you would profile this and to see what the performance impact of that was um but typically it will be a either a background reload task that maybe happens once an hour once a day or if you wanted to um because your application load is not that high you
could carry on doing something like this so yeah so this was a deliberate a deliberate reason just for the purpos of demoing yeah in in the same way that this code here um is very verose um it's it's designed really to show the the the way you would do it I mean if you were implementing it you would do it a lot better than this cool all right right well thank you very much then [Applause]