Jonathan Magen - SPNDL: Security Policy Notation and Description Language

Name: Jonathan Magen - SPNDL: Security Policy Notation and Description Language
Uploaded: 2020-12-04
Duration: 27 min 32 s

BSides Philly · 2020 · 27:32 · 222 views · Published 2020-12

Speakers

Jonathan Magen

Tags

Category: Technical

ResearchMethodology Technical Deep-dives

Style: Talk

Mentioned in this talk

Tools used

Graphviz InSpec Prometheus

Platforms

AWS GCP Kubernetes Microsoft Azure OpenShift

Frameworks

OpenTelemetry

Concepts

Sigma rules

About this talk

How can you specify security policies so that computers can analyze and enforce them? SPNDL began as an eccentric idea for building a policy domain-specific language (DSL), and evolved into one of the most in-depth research projects I've ever undertaken. Before its conclusion, the effort yielded not only syntax and semantics required to formally (and unambiguously) specify system-level security policies, but also an entire family of programs for working with them. This talk will begin by introducing SPNDL, the Security Policy Notation and Description Language. The presentation will detail the goals leading to SPNDL's inception and development, while also providing a theory of operation. It will feature real examples of SPNDL policies as well as delve into the architecture of its surrounding toolset. After detailing pros and cons, this talk will conclude with a brief enumeration of future work opportunities.

Transcript [en]

[Music]

hi i'm jonathan magen thank you so much for coming to bsides philadelphia and thank you so much for coming to my talk uh today we're going to talk about spindle the security policy notation and description language uh so this is sort of us bringing spindle to the world uh i'm super grateful that you were able to come and uh let's get started in case you were wondering who i am i am a principal i'm a computer scientist at cigna uh or rather at a large healthcare company that rhymes with cigna um i work on a little team called team tea tea stands for technology empowerment and advancement which is like a super corporate little joke our logo is a teapot you can see i

got my team shirt on here uh i've been at cigna for about uh six years coming up on my sixth signiversary i did about five years in startups before that before that i was in the public sector and i'm really fascinated by security as a computer science problem i think that there's a lot there i'm really very excited to talk to you today um and share some of the research that we've been doing uh over the past few years so the premise of this talk is something that i think most of us would agree on which is that for a lot of companies and a lot of organizations in general application security reviews are really

hard they're really very hard everyone seems to struggle you talk to someone and you hear about humans needing to balance safety with delivery pressures you hear how labor-intensive and error-prone things are you hear about how the process itself might not be ideal because it's after the fact it's sometimes even too late to institute design changes right you know cue that meme what i checked all the boxes we're finally secure right it's hard so let me tell you a story once upon a time there was a company and this company emphasized security process over security practice and that doesn't mean when i say that that doesn't mean that paperwork supplanted people actually rolling out controls it just means that some people focused

on the process more heavily than the practice and as the company fixed this problem an imbalance reared its head and this imbalance was significant the number of security reviews is always going to exceed the number of humans available to perform them and that can be expressed with an inequality specifically that s the number of security reviews will always be greater than h the number of humans available to perform them that's pretty significant if there will always be a backlog then there will always be pressure to hurry and if there's always pressure to hurry that's how mistakes get made in an already error-prone process nothing can be completely perfect but trying to get it as good as we can

is something that's really important around this time a frustrated practitioner and a security specialist partnered to look through available research academic and industrial research didn't have too much to offer there were things like openscap which came out around that time but truth be told there wasn't much there so we sort of started to wonder i mean they sort of started to wonder could automation have an answer and this is where things got very interesting the pair were unable to find any real prior art that covered all their bases and so they built something themselves quite obviously the practitioner was me and the security specialist was actually my colleague spelled it wrong and my friend baird

keiki that is spelled right uh baird is a wonderful wonderful security uh uh individual great guy and we set out some goals for this project transparency and consistency for all there should be one set of clear rules for everyone continuous enforcement and compliance security must be carried from design past review through implementation and user ergonomics we want to build stuff that we want to use and so we made spindle the security policy notation and description language spindle is a domain specific language or a dsl for specifying security policies and an ecosystem of tools for working with them so we're gonna break this down because that says a lot all right so let's go over it one more time

spindle is a domain specific language a dsl for specifying security policies and an ecosystem of tools for working with them so we said that spindle is a domain specific language dsl so what's a dsl well it's a computer language of limited generality useful for solving problems within a specific domain and this is adapted from martin fowler's definition the point is that you want to move your tool closer to your problem so that you can solve it better so what is a dsl in real life ordering coffee all right ordering coffee i'll have a grande double shot nonfat caramel macchiato with soy and two splenda um i don't know what i just said but is it of limited generality yes it

is we don't speak like that in regular conversation is it useful only within a specific domain to solve a particular problem yes that's right is it a computer language no so technically it doesn't fit the definition completely but two out of three ain't bad for specifying security policies it's a dsl for specifying security policies well how do you model a security policy simple make the design a property graph so you make the security design that you're describing a property graph what's a property graph so it's a graph which is a mathematical structure g which consists of a set of two sets so you have v the set of vertices or nodes you have e which is the set of all edges

or connections between them and that graph that structure is labeled with arbitrary properties usually key value pairs so think of a network structure that's annotated with data a network structure that's annotated with data the properties on the graph make it a property graph the network structure is the graph itself and it's an ecosystem of tools for working with those policies and this is where it gets really interesting okay so i'm very excited to show you this the tools in the spindle ecosystem uh there were four main components there was spaceship which was a policy authorship dsl for writing policies speed racer was a policy visualizer spot was an automated policy checker and spider was a compliance robot all

right with spindle you could write a policy with spaceship syntax visualize the diagram with speed racer check that policy with spot and then monitor your implementation with spider so you write the policy you visualize it you check it and then you monitor right so let's go through these one by one because there's a lot here one write a policy with spaceship syntax the syntax is a subset of elixir syntax elixir is a programming language it's usually used for pretty high reliability high availability software because of its lineage where it comes from um it used this wonderful function in the code module called string_to_quoted that parsed code to an abstract syntax

tree and then you could serialize that to a json graph so compilation and the reason that we chose elixir syntax was because if you create a new dsl from scratch with new syntax then no editors support it but almost every editor out there vs code emacs even vim they all have modes for elixir already so we sort of got a lot for free there this was the original spaceship syntax which is non-elixir you can see that there was a policy declaration you know description whatever author baird author jonathan you know system microservice a maybe a host name a kind sort of all different kinds of metadata um it was okay right it was okay the

problems with the first conception of the domain-specific language were that an externally hosted syntax requires a custom parser and when parsing this particular language there are certain linguistic ambiguities that we ran into and also from an ergonomics perspective there were lots of special characters or sequences and lots of familiar characters or sequences that were used in unfamiliar ways so this was kind of tough all right the second generation elixir syntax which was much less typing used uh well it looked like this so take a second and study it you can see that there's a policy with a title it has a description it has a list of authors it has a system just like everything else

except it's a little bit more readable and a little bit more familiar perhaps the biggest change is really the equal sign which is less typing than the two character arrow you can see the connects to declarations on the last two lines of the system stanza so the takeaway is that the string_to_quoted function in the code module of elixir is money all right why is it so great because it takes elixir syntax and parses it to an abstract syntax tree which provides excellent syntax errors okay so if there's a problem if there's a mistake if you have an illegal piece of syntax the string_to_quoted function will actually spit out that error for you
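The parse-without-executing idea can be sketched outside Elixir too. The block below is a minimal illustration, not the Spaceship implementation: it uses Python's standard `ast.parse` as a stand-in for Elixir's `Code.string_to_quoted`, and the policy text, the declaration names (`policy`, `system`, `connects`), and the JSON layout are all hypothetical, since the real syntax is only shown on slides.

```python
import ast
import json

# Hypothetical policy text written in host-language (here Python) syntax.
# The declaration names and fields are invented for illustration.
POLICY_SOURCE = '''
policy(title="demo", authors=["baird", "jonathan"])
system("microservice_a", kind="jvm")
system("orders_db", kind="mysql")
connects("microservice_a", "orders_db", protocol="tls")
'''


def parse_policy(source: str) -> dict:
    """Parse policy text into a JSON-ready dict without executing any of it."""
    tree = ast.parse(source)  # raises SyntaxError on malformed input
    statements = []
    for node in tree.body:
        # Accept only bare calls to a plain name, e.g. system(...).
        is_declaration = (
            isinstance(node, ast.Expr)
            and isinstance(node.value, ast.Call)
            and isinstance(node.value.func, ast.Name)
        )
        if not is_declaration:
            raise ValueError(f"line {node.lineno}: only declarations are allowed")
        call = node.value
        statements.append({
            "decl": call.func.id,
            # literal_eval accepts only literals, so nothing is ever run.
            "args": [ast.literal_eval(arg) for arg in call.args],
            "props": {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords},
        })
    return {"statements": statements}


if __name__ == "__main__":
    print(json.dumps(parse_policy(POLICY_SOURCE), indent=2))
```

Because the host parser does the work, a malformed policy fails with the parser's own `SyntaxError`, and anything that is not a plain declaration with literal arguments is rejected rather than evaluated.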

that ast that abstract syntax tree can then be transformed to other structures which is how we compiled it to a json document and it doesn't interpret or execute code at all so you don't run into possibilities of trouble like you do with a proper turing complete language it just parses the syntax and then gives you the tree it doesn't actually interpret anything or execute anything next you could visualize a diagram with speed racer so once you had this compiled json policy you could actually convert it either to dot which could be visualized with the venerable graphviz toolkit or it could be displayed interactively in a browser and that was really important for us we wanted both printable and web

quality diagrams but what we also wanted was we wanted the ability to play with the diagram and move nodes around and say okay well what if we move this system here and what if we took this piece and we stretched it out and put this in between how would that change our design so that interactivity was really key for us this is what an early version of it looked like um you can see there's sort of microservice a b and c with ridiculous host names and they're servers and they're running an old version of red hat and they're connecting over http it's i mean all kinds of stuff right this was rendered using graphviz from an

actual but totally bogus policy three you can check the policy with spot so when spot does a policy spot check get it spot splat you feel it all right okay it does a three layer analysis so the first one is it's a structural or a schematic analysis the second one is a rule-based analysis and the third one is a graph analysis so for the structural or the schematic analysis it would answer questions like is the policy well-formed is any crucial information missing can this policy pass audit do we have everything we need that sort of stuff for the rule based analysis did you violate any of the never should i evers good question right uh we sort of took

the rules that we wanted to implement and made sure that the policy passed all of them does the policy's basic structure comply with standards right those kinds of questions but for the graph analysis we were able to get really really really interesting what choke points are there in this network in the security design can we identify any unprotected paths which could be threat vectors and we got pathfinding algorithms for graphs that have been around for a long time a star dijkstra's algorithm the bellman ford algorithm stuff like that so by picking a graph and modeling things as a graph we got a lot of different algorithms for free more or less out of the box
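As a rough sketch of what the graph layer buys you, the block below models a design as a small property graph and reuses one Dijkstra routine for two of the questions raised above: is there an unprotected path, and which nodes are choke points. The vertex names, the `protocol` property, and the rule that `http` counts as unprotected are assumptions made up for this example; they are not taken from Spot.

```python
import heapq

# Hypothetical property graph: vertices and directed edges both carry
# key/value properties. All names and values are illustrative.
VERTICES = {
    "internet": {"kind": "external"},
    "gateway": {"kind": "proxy"},
    "microservice_a": {"kind": "jvm"},
    "orders_db": {"kind": "mysql"},
}
EDGES = [
    ("internet", "gateway", {"protocol": "https"}),
    ("gateway", "microservice_a", {"protocol": "http"}),
    ("microservice_a", "orders_db", {"protocol": "http"}),
    ("gateway", "orders_db", {"protocol": "tls"}),
]
UNPROTECTED = {"http"}  # assumption: which protocols count as unprotected


def shortest_path(edges, source, target, keep=lambda props: True):
    """Dijkstra with unit edge weights, restricted to edges accepted by `keep`."""
    adjacency = {}
    for u, v, props in edges:
        if keep(props):
            adjacency.setdefault(u, []).append(v)
    queue = [(0, source, [source])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                heapq.heappush(queue, (cost + 1, nxt, path + [nxt]))
    return None


def unprotected_path(edges, source, target):
    """A path that uses only unprotected edges, or None if there is none."""
    return shortest_path(
        edges, source, target, keep=lambda props: props["protocol"] in UNPROTECTED
    )


def choke_points(vertices, edges, source, target):
    """Vertices whose removal disconnects source from target."""
    found = []
    for vertex in vertices:
        if vertex in (source, target):
            continue
        remaining = [e for e in edges if vertex not in (e[0], e[1])]
        if shortest_path(remaining, source, target) is None:
            found.append(vertex)
    return found


if __name__ == "__main__":
    print(unprotected_path(EDGES, "gateway", "orders_db"))
    print(choke_points(VERTICES, EDGES, "internet", "orders_db"))
```

The same routine answers both questions because the properties live on the edges: filtering edges by property before searching turns a general pathfinding algorithm into a policy check.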

and because of that there were no surprises everybody knows the rules teams know the rules reviewers know the rules auditors know the rules automation knows the rules everyone uses spot no surprises when you are seeing how your security review is going to be viewed by others they have the same tools that you do and you have the same tools that they do and that consistency was key four monitor your implementation with spider spiderbot a compliance robot which continuously checks your infrastructure against your policy is a cool concept and one of the most interesting things that we did with the whole spindle ecosystem this spiderbot was awesome if spindle is an os right if it's an

operating system which is at its core a resource manager then spider is the kernel the beating heart right it's actually inspired by microkernel and nanokernel architectures we called it the spock architecture the security process orchestration and compliance kernel because when you work at a big company you have to have acronyms so we called it the spock architecture the spock architecture is like an os it has a scheduler it has drivers it has connectivity services system information aggregation all the things that you would want and appreciate the scheduler operates on a preemptive deadline-free tick system which is a fancy way of saying that it kicks off it does its thing sleeps for a little

bit and then kicks off again and drivers are modular security profiles those are what actually communicate with systems and they can be composed and layered in to add functionality piecewise so as an example if a policy specifies that a jvm is connecting to mysql using tls then spider would say oh there's a jvm here we're going to layer in the java profile there's mysql we're going to layer in the mysql database profile and then tls we're going to make sure that that tls connection is ok right so it would layer in these functionalities uh sort of making sure that the baseline covered the technologies that were present
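The layering and the tick loop described here can be sketched in a few lines. This is an illustration under stated assumptions, not Spider's code: the profile registry, the check names, and the connection fields are hypothetical, and the loop shows only the run, sleep, repeat cadence, not preemption.

```python
import time

# Hypothetical registry mapping a technology to the checks its profile adds.
PROFILES = {
    "baseline": ["ssh_hardening", "patch_level"],
    "jvm": ["jvm_version_supported"],
    "mysql": ["mysql_no_anonymous_users"],
    "tls": ["tls_min_version"],
}


def technologies_in(connection):
    """Technologies named by one policy connection (field names are assumed)."""
    return [connection["source_kind"], connection["target_kind"], connection["protocol"]]


def layer_profiles(technologies, registry=PROFILES):
    """Compose the baseline with one profile per technology, without duplicates."""
    layered = []
    for name in ["baseline", *technologies]:
        for check in registry.get(name, []):
            if check not in layered:
                layered.append(check)
    return layered


def run_ticks(job, ticks, interval_seconds=60.0, sleep=time.sleep):
    """Run `job`, sleep, and repeat for a fixed number of ticks."""
    results = []
    for tick in range(ticks):
        results.append(job())
        if tick < ticks - 1:
            sleep(interval_seconds)
    return results


if __name__ == "__main__":
    connection = {"source_kind": "jvm", "target_kind": "mysql", "protocol": "tls"}
    checks = layer_profiles(technologies_in(connection))
    # A real driver would connect to the systems; here the job only reports its checks.
    print(run_ticks(lambda: checks, ticks=2, interval_seconds=0.1))
```

Passing `sleep` as a parameter keeps the loop testable without waiting, and an unknown technology simply adds nothing on top of the baseline.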

and connectivity services it had agentless operation to components anything in your system as long as it could be connected to via ssh or winrm there were no remote requirements other than a user account which could be unprivileged we used a for the time customized version of inspec by chef well the policies were customized we didn't actually touch inspec itself um the chef folks have done a really cool thing with inspec i strongly recommend that if policy and infrastructure and compliance as code is interesting to you check out inspec i'd love to hear what you think about it and also system information aggregation it would log to the siem we would have metrics go to

dashboards via uh prometheus metrics a whole bunch of stuff really interesting so this is sort of what the architecture looked like there are four boxes inside a big box so you had system information aggregation connectivity services drivers and a scheduler kind of like an operating system the workflow behind spider was really interesting and the workflow behind the whole spindle ecosystem was even more interesting so you would write your policy you would visualize it you would check it and then you could enforce it you would use spaceship to speed racer to spot to spider and this workflow is really cyclic i know it's linear on these slides but it's really cyclic because the policy will evolve

as the product changes and grows so as you iterate on the product you're also iterating on the policy that supports it as well and that's a really key piece right so when the policy lives with the code not in some i don't know other repository full of diagrams and a bunch of stuff they can actually grow in tandem so any time that you needed to uh give additional privilege or permissions to the product you would modify the policy accordingly so they would grow in tandem right and that's really important because when you have your security policies separate from your code i mean diagrams or whatever if you know let's say you were under attack by

a threat actor you would not pick up the red phone and go quickly get the diagrams diagrams don't keep you safe right but when you have your policy that lives with the code at least they stay in sync right we call that connascence or uh co-nascence some people call it but really they are connascent so we built all this cool stuff we had a lot of fun and we tested it with users and systems we had a lot of fun and we learned a lot i mean a lot right there were some things that went really well end users loved writing code instead of filling out forms teams loved it because they could always

consult the policy files they were sitting right there and reviewers loved it because they could work more efficiently and that is to say that automation is really great automation is great but augmentation is better than automation and for us this was key because it preserved the role of human judgment and centered the role of human judgment preserving this ability and it really really enabled these people who were stuck doing paperwork to use their security expertise and their knowledge and ask the hard questions work directly with teams right really elevate the level of practice preserve that role but at the same time it freed them from toil which is a really important thing we called that at the time continuous

compliance that's not a term we made up but that's what we adopted and if we had to do it over again we would do things differently one we would have static types right we would have a static type checker which would have type inference for the dsl and we think that that would yield better tool support and errors in advance maybe we couldn't use elixir anymore don't know but we think elixir is the right way to go we think that we would leverage elixir even more so we would write as much as possible in pure elixir largely because one language heterogeneity is really hard right we had a little bit of ruby a little bit of

elixir a little bit of shell a little bit of this a little bit of that okay enough right one language would have helped us and also elixir and otp they're really resilient it takes a licking and keeps on ticking and that matters when it's three a.m and there's a problem we would also leverage existing technologies some of which didn't even exist back then right so today we could use kubernetes to run workloads right primarily because no one wants to write a distributed scheduler right which isn't to say kubernetes is that great right i mean it has some things that it does really well it has some things that maybe not so well but we would probably

use that to run workloads we would also want to leverage ci cd a little bit more by integrating better and more tightly with ci cd processes probably so that we could validate policies and deployments maybe even on every change and for us that would look like yes you checked the policy on every change but you could also ask spider for ad hoc enforcement to determine if the new deployments match the latest version of the policy if they don't break the build trigger a rollback right and the last action in a successful pipeline would be to submit the policy to spider for continuous compliance and monitoring so a great example of what we're talking about with the blue green deployments

right you have two copies of your infrastructure your blue and your green you deploy you're serving from your blue you deploy to your green and then you gradually move over and if anything goes wrong you go back but you can't switch colors without spider's okay so if you're serving from blue and your new policy covers what's in green if what's in green doesn't match the policy right can't switch colors so we could also integrate better with external systems now we have things like the open telemetry standard for communicating results we would use that to make the results of spider more

digestible specifically by other systems we'd also make much better use of available apis there's now so much more than ssh and winrm for running remote commands you could inspect containers running on orchestration platforms like kubernetes and openshift you could use cloud management apis for aws azure google cloud platform check out an s3 bucket or a database monitor a serverless function whatever it is we could look at things in much greater depth than simply connecting to them and running some commands we'd also want to align spindle's goals to organization specific compliance and security requirements cigna is a big company we would want to make sure that we're covering our bases with this and let it help us with our

sox obligations our hipaa uh regulatory burden um pci dss all those kinds of things we would probably want to use frameworks like nist to help us categorize our controls again we learned a lot we really did this brings us to another question what's next for spindle so in 2017 we shared the spindle project at the cigna technical conference when it was just an early early early research project it was awesome we got some really great feedback a lot of hey did you think about this and we were like no we didn't but we should and a lot of i see a little problem if you're gonna do this and that you gotta do wow okay so that helped us to develop

the idea and the project even more this talk however at bsides philadelphia four is the first step in sharing more broadly it's the first step we need to start a conversation in this industry right security reviews are always going to outpace the humans available to perform them s is greater than h all the time that inequality isn't going away we need to start a conversation about technology solutions to security problems we need to make sure that we're centering and preserving the role of human judgment emphasizing it even more we should be talking about bringing cutting-edge computer science into the discussion research technique the experimental mindset it's important we're learning more about what it actually means to

shift left what defense in depth means when you look at it as a continuous practice and not just a one-time design decision we want to make it so that it's easy to be secure by design everywhere and verify uh especially iot though uh that's really important because when your toaster is a threat vector you have two problems yeah so we also think that we should be looking at developments that have come out in recent years uh i just learned about recently the catala language which is a domain-specific language for specifying legal documents one of the hard things about dsl design is that the problem is at the intersection the junction of the people piece and the

technology piece so the socio-technical aspect is really interesting so i study dsls that come out the catala language for legal documents um my wonderful wife who supports me in all things is a lawyer so uh that would be pretty cool if i could show her that there's also the lean theorem prover which if you get this joke it's funny it's a better coq than coq you can see the xena project the lean theorem prover is really interesting the xena project is an international effort organized by some folks out of the uk to formalize and prove the foundations of mathematics and make it available as a library that would mean that we could have some really exciting

abilities to statically analyze a security policy to make sure that it's internally consistent i think that there's something out there that has to do with a sat solver maybe boolean satisfiability or constraint solving but there's a lot out there i think that's really cool there's the koka language both lean and koka happen to be from microsoft um koka is a programming language with first class effects i think that that would be really cool for being able to model how components interact with one another i think there's something else there and you know please reach out because we love this stuff and there's a lot that we haven't thought of but

we'd like to hear from you about what you've thought of you can follow me on twitter i should also say you can follow cigna on twitter i'm super grateful to the organizers for giving me the opportunity to be here for accepting my goofy cfp or my cfp application and i just wanted to say thank you for watching i can't tell you how much it means to me

you
