
Security in Continuous Delivery Pipelines

BSides Newcastle · 2020 · 33:31 · 8 views · Published 2020-11 · Watch on YouTube ↗
About this talk
Sam Hogy explores how to embed security throughout CI/CD pipeline design and execution, moving beyond end-of-cycle security checks. The talk covers threat modeling of pipelines using STRIDE, automated security tooling (SAST, DAST, dependency scanning), and securing pipeline infrastructure including repository controls, secrets management, artifact storage, and service account permissions.
Original YouTube description
@samhogy: Security in Continuous Delivery Pipelines. This session details why security needs to be at the forefront of our thinking when building a CI/CD pipeline. We will cover tools, techniques and principles involved when developing a pipeline, from vulnerability scanning to the principle of least privilege. We'll also cover how to build security into the application that you're developing through testing practice. You will come out of this session with an understanding of existing approaches for securing CI/CD pipelines. Captured using OBS: Open Broadcaster Software® (obsproject.com). Edited using OpenShot Video Editor (www.openshot.org).
Transcript [en]

Okay, yeah, thank you very much. I'm in a room that is on fire, both in temperature and in virtual Zoom backgrounds, so let me just share my screen

and then get my content set up correctly. All right, yeah. So hello, everybody. Am I the first speaker actually from Newcastle? I think I might be, yeah. So I'm going to be talking today about security in CI/CD pipelines. This is mostly based off my experience in some previous jobs, things I've picked up along the way. Before I start, though, I do want to make clear that I am not an expert. I don't think there are any experts in the room; we are all learning in one way or another, and just because I'm on this side of the stream rather than on the other side of the stream

doesn't give me any kind of authority over this whatsoever. I've designed this talk primarily to be a jumping-off point: if you're looking to set up a new project and you're wanting to embed DevSecOps within that process, here are a couple of things that you can think about while designing both your application and the pipeline itself. So yeah, I'm Sam. I'm a Geordie; well, I was born in Gateshead, and I live in Whitley Bay now. I've been doing software development of all sorts for 10 years. A lot of it has been behind paywalls and NDAs, because I work for banks. I'm currently at Tesco Bank on Quorum Business Park in Newcastle. I've been there since May, and that was quite

interesting, because that was in the middle of lockdown: on day one, you walk into reception, they give you a laptop and tell you to go home. But it's all been fine; it's been good fun. And like I said, the vast majority of this talk comes from the previous project I was on, when I was doing some consulting gigs, where I was responsible for a small team of DevOps-y folks and we basically built the CI/CD pipeline and cloud infrastructure for a new product. This was a new kind of prototype, MVP-type application, but it was for a large UK financial institution who you will

know of. I'm going to avoid talking specifically about their stack or anything like that; I've genericised everything appropriately. So let's get the business buzzwords out of the way first. We have this concept of DevSecOps, and yeah, it sounds a bit buzzwordy, but I think it is useful as an industry term, because a lot of my experience has been that security is something that happens right at the end of the process. It's some bureaucrat, some securecrat, in a different team somewhere else, who will put a stamp on your project and say yes, signed off, good to go, or no, not good enough, and then you never hear from them again,

or maybe you hear from them every quarter, or half year, or year, to do the annual pen test, and it bears no resemblance to what actually goes on. The whole point of DevSecOps, just like DevOps, is embedding security throughout the process of building an application, rather than waiting until the end. You may have heard of this in the testing community as the concept of shift left, which is basically taking all that useful information that we generate and exposing it, making it available earlier in our development processes: giving it to the people who are actually empowered to make the changes, where it's cheap to remedy, and, in the context of security, before it goes out the door and you're

affected by it. What I want to do is talk about the threat-modelling thinking that goes into building a CI/CD pipeline. I'll be using concepts from STRIDE, that's spoofing, tampering, repudiation, information disclosure, denial of service and elevation of privilege (I think I got those right), all the way through in various examples, and I'm going to be pointing towards tools, free, commercial and open source, which can help you in each of the areas I identify. If you're not familiar with the concept of a CI/CD pipeline: you may have heard of continuous integration, CI. This is where you have the process of everybody checking in

code to a repository at least once a day, and then you'll typically have some automated build server which will compile your code, run your tests, produce an artifact and stick it somewhere. Continuous delivery takes that one step further: it's about using these automated processes and tools to get your application into developer environments, into QA and pre-production environments, and then out to production. It's about making the entire process automated, repeatable and scalable, so that you can deploy as often as you need to for your business, so that you can continuously deliver value to your users, whatever that business value may be. And we're all familiar with tools to do this: GitLab, Jenkins, the new GitHub Actions stuff, GoCD,

which kicked off a lot of this stuff. The pipeline is increasingly a very, very important tool in the software that we build. It's responsible not just for building the software; it's responsible for kicking off a lot of the quality checks that we perform, it will eventually handle our deployments, and in production it doesn't just stand there: it will very often perform ops monitoring, or be able to hook into that stuff, to know whether we need to roll back any changes that we put out. I like to describe it as the beating heart of the application; it's literally involved in everything. And when you draw these things out, what you end up with is

almost like a diagram of steps that are performed, where you can do some of them in sequence and some of them in parallel. But in order to actually do that job of building and checking the code and then deploying it, think of all the components it needs to touch: it needs to be able to pull in source code from a repository; it needs to be able to push built artifacts, whether they're executables, binaries, JAR files or whatever, to some kind of repository where they can later be accessed; and eventually it needs to actually control your infrastructure. I've put up a few cloud providers here

because that's where everybody's moving these days, but it could be on-prem kit; it could even be printing CDs or whatever. The idea is that it needs to touch every single piece of your infrastructure in order to do this job. And it's good that we do this in an automated way rather than a manual way, because if somebody's doing stuff manually there's very little accountability at stake. But all of this makes the pipeline itself, as well as your application and your production infrastructure, a prime candidate to be attacked. It can be attacked by external threats: these are clearly people who want to steal your

source code or your binaries, who want to find a way to exfiltrate your data, or just push malicious code into your production environment. This is a really good way of hijacking what you put into production: to run some crypto mining, for example, or to redirect bank transactions, or gather stats, or anything like that. If you can hijack the vehicle that you use to put this stuff into production, well, you can get everything. But it's not just external people we need to defend this system against; we need to defend it against internal actors too, both accidental and malicious. You have, in this case,

all the same threats as external actors, but internal ones tend to have a bit more knowledge about how things are strung together. You need to make sure that, if something could be done accidentally, or accidentally on purpose, there are controls in place to prevent it. Somebody, for example, bypassing all of your controls to deploy a code change that they've made locally, that hasn't been reviewed, straight into production; and it does happen. Or giving people direct database access, and them writing queries that they shouldn't be running. This stuff happens all the time, and people justify it as 'we have manual processes around it'. These are attack vectors, and they can get your organisation, and you personally, in

trouble, so they're really worth considering. I want to break this down into two components, because the pipeline is something that we should secure in and of itself, but I also want to talk about how the pipeline can be used as a way to secure your application. And obviously security isn't a binary thing; it isn't something you either have or don't. It's always going to be based on relative and absolute risks. So, like I said earlier, I don't want to claim that if you follow all these steps you will be secure; I think these definitely put you on the right track to having some confidence that you have the appropriate controls and measures in place.

So your pipeline is going to perform a lot of things. One of the things I'd be looking for in a decently built pipeline is secret detection. The last thing you want, and I've had to correct this before in the past, it's a right pain in the arse, is somebody committing some kind of secret information to your repository. You can, with great difficulty, go back and correct that, but by and large, once you push something to a repository, if somebody else pulls it, it's kind of gone. By secrets I mean API keys, access tokens, client IDs, private keys. There are tools out there like Gitleaks which can identify if a repository already

contains secrets, and it's worth running them, because you'll be surprised. The better tools out there, and Gitleaks also handles this, can run as pre-commit stages. This is before somebody even pushes something from their local machine to the shared repository, and the tool can at this point intervene and say: I'm not allowing that to happen; you're trying to commit a private key. This goes a long way to making sure that secrets are properly handled, that you can't just bodge them into the repository, and if you get the discipline right on this from the start, it's going to set you up very well for the future.
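
As a flavour of what that pre-commit stage does, here is a minimal Python sketch, not Gitleaks itself, that scans staged files for a couple of common secret patterns; the two patterns and the hook wiring are illustrative assumptions, and real tools ship far larger rule sets:

```python
#!/usr/bin/env python3
"""Minimal pre-commit secret scan. Illustrative only: real tools such as
Gitleaks ship far larger rule sets and scan history as well as the diff."""
import re
import subprocess
import sys

# Two common giveaway patterns: PEM private-key headers, AWS-style key IDs.
PATTERNS = [
    re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    re.compile(r"AKIA[0-9A-Z]{16}"),
]

def staged_files():
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.splitlines()

def main():
    blocked = False
    for path in staged_files():
        try:
            text = open(path, encoding="utf-8", errors="ignore").read()
        except OSError:
            continue
        for pattern in PATTERNS:
            if pattern.search(text):
                print(f"Possible secret in {path}: {pattern.pattern}", file=sys.stderr)
                blocked = True
    if blocked:
        sys.exit(1)  # a non-zero exit makes git abort the commit

if __name__ == "__main__":
    main()
```

Saved as `.git/hooks/pre-commit` and made executable, a hook like this blocks the commit before the secret ever leaves the developer's machine.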

Most of the software we build these days contains third-party libraries, third-party dependencies, open-source components, and in large organisations having oversight of that supply chain is really important, not just for security but also to make sure that you're fully compliant with any licences that you're signing up to. Most pipeline tools will give you this kind of stuff for free; GitHub, for example, has Dependabot built in. Dependency scanning is basically looking at all your third-party components, looking at the licences that they use and reporting on those, and looking to see if there are any security vulnerabilities. Most of the security vulnerabilities in these packages are already known and already patched, so a lot of the time what you actually need to do is just make sure

your software is up to date. So these tools will identify if you have unpatched dependencies, and Dependabot will actually fix that for you: every time there's a new update to a dependency, it'll put up a pull request in GitHub for that individual dependency. This is really good, because it allows you to get from a point of just having an audit log of which dependencies you have and at what versions, to later on actually enforcing compliance. If your tools can say 'we accept things that have the MIT licence, but not the Facebook bespoke clone of the MIT licence', then you can stop libraries that are not approved from even getting into your process in the first place. So dependency scanning is really important.
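
As a sketch of what licence enforcement can look like, here is a small Python check over an installed environment using the standard library's `importlib.metadata`; the allow-list is an assumption standing in for your organisation's policy, and dedicated tools do this far more robustly:

```python
"""Illustrative licence allow-list check over an installed Python environment.
The allow-list is an assumption standing in for your organisation's policy."""
import sys
from importlib import metadata

ALLOWED = {"MIT", "MIT License", "BSD-3-Clause", "Apache-2.0", "Apache Software License"}

def main() -> int:
    violations = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name", "unknown")
        licence = (dist.metadata.get("License") or "").strip()
        classifiers = dist.metadata.get_all("Classifier") or []
        trove = {c.rsplit("::", 1)[-1].strip() for c in classifiers if c.startswith("License ::")}
        # Accept if either the License field or a trove classifier is allow-listed.
        if licence not in ALLOWED and not (trove & ALLOWED):
            violations.append((name, licence or "unspecified"))
    for name, licence in violations:
        print(f"Licence not on allow-list: {name} ({licence})", file=sys.stderr)
    return 1 if violations else 0

if __name__ == "__main__":
    sys.exit(main())
```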

I'm going to be talking a lot about containers as well, because I think a lot of us are now using Docker images as a way of creating a box that some software can go in, and then you deal with the box in a standard way. Because it quite nicely encapsulates all of your dependencies and the operating environment in which your application runs, it's really cool. But containers in and of themselves can have security risks too, so there are specialist forms of dependency-scanning tools, like Clair and its Quay integration if you run a private Docker registry,

to detect known vulnerabilities in base images. So if you're deriving from a particular flavour of Linux, for example, it may have unpatched security vulnerabilities in some of the more fundamental libraries your application is built on, and again, from a licence-compliance and a security-audit point of view, you need to have visibility of that. This will allow you to put in place the appropriate guard rails to make sure that everything you're using is up to date and is appropriately secure, to the best of your knowledge.
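
Clair itself runs as a service alongside a registry, so as a pipeline-stage sketch this shells out to Trivy instead, a standalone CLI scanner (a deliberate swap from the tool named in the talk); the image name is a placeholder:

```python
"""Fail a pipeline stage on known vulnerabilities in a built image.
Shells out to Trivy (a standalone CLI scanner, swapped in here for the
Clair service named in the talk); the image name is a placeholder."""
import subprocess
import sys

IMAGE = "registry.example.com/myapp:1.2.3"  # hypothetical image reference

# --exit-code 1 makes Trivy return non-zero when HIGH/CRITICAL findings exist,
# so propagating its status is enough to fail the build.
result = subprocess.run(
    ["trivy", "image", "--severity", "HIGH,CRITICAL", "--exit-code", "1", IMAGE]
)
sys.exit(result.returncode)
```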

So that's the third-party stuff out of the way, and now we've got to talk about your own code. There are various automated checks that a CI/CD pipeline will use in order to provide you assurances that you've not done anything, that we know about, that's bad practice. The first of those is static analysis, sometimes called SAST. This is where you have a tool like SpotBugs, which will look through your source code: it knows how to understand Java, or understand C, and it can identify where you've made potential programming errors which could lead to security vulnerabilities. That can be as simple as detecting potential null pointers, out-of-bounds exceptions, memory leaks, all this kind of stuff. If you have that at the pull-request stage, if you have that information when you want to

merge a new change into your repo, you can say: right, okay, we've got a couple of actions to take here. Again, these are really cheap to fix. What you've got to be aware of with these static checks is that they know nothing about the context of your application, so you can get a lot of false positives from them, and they're not going to intelligently work out how to use your application and how to break it. But it's a useful first check to have.
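
The talk names SpotBugs for Java; to keep the examples here in one language, this sketch wires Bandit, a SAST tool for Python source, into a pipeline step (the `src/` path is a placeholder):

```python
"""A SAST step for a Python code base using Bandit (standing in for the
SpotBugs example from the talk); src/ is a placeholder source tree."""
import subprocess
import sys

# Bandit recurses the tree and exits non-zero when it reports issues,
# which is exactly what a pull-request gate wants.
result = subprocess.run(["bandit", "-r", "src/"])
sys.exit(result.returncode)
```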

Once you get past pull requests, and once you're starting to put together some kind of release, you can use some of the more interesting tools out there, which fall under the category of dynamic application security testing, or DAST: tools like OWASP ZAP. Because, surprisingly, the OWASP Top 10 is the top 10 for a reason: there's a huge amount of stuff out there that is still vulnerable to these basic cross-site scripting and SQL injection style attacks. DAST overcomes a lot of the shortcomings of SAST by actually running your application and then driving its behaviour dynamically, trying to inject security faults, and you can do this at various levels of tolerance with ZAP. You're going to get fewer false positives that way; you're probably going to have a fair few security issues to deal with, but again, knowledge is power, and it's better to have these surfaced and fixed.
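
As a sketch of wiring ZAP into a pipeline, this runs ZAP's packaged baseline scan against a deployed test environment; the image name and entry script follow ZAP's documented Docker packaging from around the time of this talk, and the target URL is a placeholder:

```python
"""Run OWASP ZAP's baseline scan against a deployed test environment.
Image and entry script follow ZAP's documented Docker packaging from
around the time of this talk; the target URL is a placeholder."""
import subprocess
import sys

TARGET = "https://staging.example.com"  # hypothetical staging deployment

result = subprocess.run([
    "docker", "run", "--rm", "-t",
    "owasp/zap2docker-stable",        # ZAP's packaged scanner image
    "zap-baseline.py", "-t", TARGET,  # passive baseline scan of the target
])
sys.exit(result.returncode)  # non-zero when the scan raises warnings/failures
```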

So that's the kind of stuff a pipeline can do for your application, but like I said earlier, the pipeline itself needs to be secured as well, so we need to look at how the pipeline operates mechanically, what it touches, and then what considerations we have to take into account. We've got about 10 minutes left, so let's go for this: securing the pipeline. Going back to this diagram: like I said, your pipeline touches your repository, it touches your artifact storage, it touches your production environments. So some of the things I'm going to talk about are about the pipeline itself, and some of them are actually about

these related components. The first one is the repository. Your repository should have the appropriate controls on it to make sure that only certain actions can be performed in certain contexts. What I have here is a typical git-flow-type branching model, where you have a main branch which represents what's currently on production; you may have a branch from that which is your active development branch; and then individual devs may branch from that for individual features. Now, it makes sense that a feature branch could potentially be deployed to a dev environment without a huge amount of checks in place, because that's going to allow devs to get fast feedback on their work. It makes sense that only the main

branch is allowed to go to production, and sufficiently advanced repository setups can enforce that only certain pipeline jobs can be triggered on the back of certain branches. And you can put in place controls to say: you can't force-push to main; you need to merge to main via a branch; it needs to have a pull request; that pull request needs to have a certain number of approvers. This can make sure that code can only get to the appropriate environment, whether that's production or a test environment, if it's gone through the appropriate hoops in your process.
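
As a sketch of codifying those controls, here is what setting branch protection on main looks like via GitHub's REST API ('update branch protection'); the owner, repo, status-check contexts and token handling are placeholders:

```python
"""Codify branch protection on main via GitHub's REST API ('update branch
protection'). Owner, repo and the status-check contexts are placeholders;
the token comes from the environment."""
import os
import requests

OWNER, REPO = "example-org", "example-app"  # hypothetical repository
url = f"https://api.github.com/repos/{OWNER}/{REPO}/branches/main/protection"

payload = {
    "required_status_checks": {"strict": True, "contexts": ["build", "sast"]},
    "enforce_admins": True,
    "required_pull_request_reviews": {"required_approving_review_count": 2},
    "restrictions": None,          # no extra user/team push restrictions
    "allow_force_pushes": False,   # nobody force-pushes to main
}

resp = requests.put(
    url,
    json=payload,
    headers={
        "Authorization": f"token {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github+json",
    },
    timeout=30,
)
resp.raise_for_status()
```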

The one thing that people tend to forget, though, is the whole idea of somebody forking a repository, and that may not be external: if you're running GitHub Enterprise, you can take the shared copy of a repository and fork it into your personal space. You want to make sure that the pipelines aren't inherited by that fork, because somebody could then push to main in their copy of the repository and then deploy to production. So you need to make sure that the provenance…

[audio dropped] …increasingly common, and it's worth taking a look into one, because you don't want the pipeline… So if your pipeline's running on a VM, for example, you don't want to have to run that as root. The other interesting thing about running Docker, or building Docker images, is that image tags themselves can be modified; they're just pointers to a particular image in your repository. So if you build something and you… Sorry, Ben, do you have an issue there? Yeah, the audio dropped out about five minutes ago. Oh. Well, about two and a half, three minutes ago, for a minute. I've just fixed it, so you might want to

recap the last slide, sorry. Okay, what I'll do then is I'll start from the top here. So yeah, what I was saying before the audio failed me was that your Docker daemon, if you're building Docker images, runs as root, and therefore there's the potential for malicious code inside a container, whether you put it there or somebody else did, to break out of that container and then run as root on your machine. There are tools out there, for example Google's Kaniko, and Buildah and Podman, which run rootless, which is awesome, because that allows you to have a bit of security when building those images.

It's common when you build a Docker image to then tag it. Beware, because tags are modifiable; they just point to a particular Docker image. What that means is that, potentially, if you build an image and then later on in the pipeline deploy it, it could have changed in between, which can be quite annoying. So Docker has the concept of a digest, like a SHA-256 fingerprint, and that gives you guarantees that the image hasn't been modified throughout your pipeline. So if you need that kind of reproducibility and auditability, then use digests.
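
A minimal sketch of pinning by digest, assuming the image has been pushed so that `docker inspect`'s RepoDigests field is populated (the image name is a placeholder):

```python
"""Resolve a mutable tag to its immutable digest so later stages deploy
exactly the image that was built. Assumes the image has been pushed, so
`docker inspect`'s RepoDigests field is populated; the name is a placeholder."""
import subprocess

IMAGE = "registry.example.com/myapp:1.2.3"  # hypothetical tag

out = subprocess.run(
    ["docker", "inspect", "--format", "{{index .RepoDigests 0}}", IMAGE],
    capture_output=True, text=True, check=True,
)
digest_ref = out.stdout.strip()  # e.g. registry.example.com/myapp@sha256:...
print(f"Deploy by digest: {digest_ref}")
```

Later stages then deploy `myapp@sha256:…` rather than the tag, so what you tested is exactly what ships.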

Twelve-factor apps advise strictly separating your build and run stages. If you're using Docker to both build the app and specify how it's going to be run, you can use multi-stage builds to separate out those two steps, and it means you're not bringing a whole load of extra dependencies into your runtime Docker container. Don't put any sensitive information into a container; don't put any passwords in; and don't mount sensitive directories either. Your container scanning should pick this up, but it's always worth double-checking. And if you're going to be deploying into something like Kubernetes, consider pod security policies, which can enforce minimum security standards on any Docker images that you want to run, and custom admission controllers, because they can

then check, for example, that the image you're wanting to deploy comes from somewhere known and trusted, that it's verifiable, and that you're not just trying to sneak something into production. The big one is secrets management. To speak to all of these different systems, a pipeline is going to have to control a whole load of secrets. Now, the twelve-factor app ideals advise storing configuration in the environment, so you'd think: all right, okay, this is simple, I can just pass in stuff as environment variables. And sure, you can, but you've got to be aware that that isn't foolproof. If you pass environment variables into a Docker image, they can be inspected quite easily, because with the layered file

system you can jump to the layer where the environment variable is being set and read it. You've also got to be aware that certain applications, when they crash, will just dump the stack and the context, and you risk dumping environment variables out in plain text. The only time, in my opinion, that environment variables, or secrets of any kind, should be stored in plain text is in memory, in the context of the process that needs them. Every other time, whether at rest or in transit, they should be encrypted. So there are tools out there like Google Secret Manager in GCP, AWS Secrets Manager in AWS, and HashiCorp Vault, which is kind of my favourite tool,

which allow you to dynamically retrieve secrets at runtime. So your pipeline can make a call out to Vault, grab the secret that it needs, and then inject it into an application, or use it to authenticate. Some of these can then rotate the secrets immediately after use, which means that for each request out to Vault or Google Secret Manager or whatever, you get basically a one-time password with a short expiry. It's almost like pulling the lever on a fruit machine and getting new, unique credentials. So you really need to think about how the secrets are managed in your pipeline, including how you pass secrets into running systems.
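
As a sketch of that runtime retrieval, assuming a KV v2 secrets engine at the default mount, a token in `VAULT_TOKEN`, and a placeholder secret path, the `hvac` Python client for Vault looks like this:

```python
"""Fetch a secret from HashiCorp Vault at runtime with the hvac client
(pip install hvac). Assumes a KV v2 secrets engine at the default mount,
a token in VAULT_TOKEN, and a placeholder secret path."""
import os
import hvac

client = hvac.Client(
    url=os.environ.get("VAULT_ADDR", "https://vault.example.com:8200"),
    token=os.environ["VAULT_TOKEN"],
)

# Reads secret/data/ci/deploy-credentials from the KV v2 engine.
response = client.secrets.kv.v2.read_secret_version(path="ci/deploy-credentials")
password = response["data"]["data"]["password"]  # keep only in process memory

# ... authenticate with `password`; never write it to disk or logs ...
```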

If you're running in Kubernetes, for example, what I like to do is pull files from Vault with a sidecar container, store them in an in-memory volume, and then mount that in-memory location into your other container, so that the secret can be read that way. It's a pain in the arse, I admit, but it's much better than just passing plain-text secrets around.
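
The application-side half of that pattern is tiny; here is a sketch, where the mount path is an assumption about how the sidecar has been configured:

```python
"""Application-side half of the sidecar pattern: read a secret that a Vault
sidecar has written into an in-memory (tmpfs) volume mounted into this
container. The mount path is an assumption about the sidecar's config."""
from pathlib import Path

SECRET_FILE = Path("/vault/secrets/db-password")  # hypothetical tmpfs mount

def load_db_password() -> str:
    # The volume is memory-backed, so the secret never touches disk;
    # read it once and hold it only in process memory.
    return SECRET_FILE.read_text(encoding="utf-8").strip()

db_password = load_db_password()
```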

Artifact repositories: we've already talked about Docker Hub, but where do you store your artifacts? There are public repositories available, and sure, you can push to those, but running your own has a lot of advantages. You can trust it; it'll be on a private IP, so the scope for attacking it via DNS attacks is significantly reduced; you can have access controls; you can use it to enforce your own licence compliance; and in the event of something like left-pad, where somebody just decides to delete or override a dependency in the public repository, you're not going to be bitten by that. It allows you to build up a golden source of approved, compliant and known-to-be-secure dependencies that people can then use, and again, from a compliance point of view, it gives you that surety that everything you're using is above board. I'm running out of time, so I'm going to go through this very quickly.

The way that these pipelines authenticate with all of these different systems is through service accounts. These are accounts in your systems that robots, non-human actors, use. In some of the major cloud providers, the default service accounts can have a huge amount of permissions that they don't actually need, so what I'd advise is making sure that every operation in your pipeline uses a different service account, and that each service account has only the roles that it needs.

But you also need to be aware that if a user triggers an action in your pipeline, such as a deploy to main, you need to be able, in your logs, to associate the actions the service account performs back to a user, because it's useless to say that service account B1237 did this deployment. It's more useful to be able to know that Sam was the one who asked it to do that. So this is your repudiation consideration.
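
As a sketch of that correlation in a GitHub Actions context, where `GITHUB_ACTOR` and `GITHUB_RUN_ID` are the environment variables Actions sets for the triggering user and run, and the service-account name and log sink are placeholders:

```python
"""Emit a structured audit record tying a service-account action back to the
human who triggered it. GITHUB_ACTOR and GITHUB_RUN_ID are the variables
GitHub Actions sets; the service-account name and log sink are placeholders."""
import json
import os
import sys
from datetime import datetime, timezone

def audit(action: str, target: str) -> None:
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "target": target,
        "service_account": "deploy-prod-sa",  # hypothetical service account
        "triggered_by": os.environ.get("GITHUB_ACTOR", "unknown"),
        "pipeline_run": os.environ.get("GITHUB_RUN_ID", "unknown"),
    }
    # Printed here; a real pipeline would forward this to a central log sink.
    json.dump(record, sys.stdout)
    sys.stdout.write("\n")

audit("deploy", "production")
```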

And then finally, I want to talk about the networking and compute side of things. Where are your CI/CD pipeline agents running? Are they running on VMs in your own network, are they running in the cloud, or are they running on the public machines that GitLab and GitHub offer? How secure those are differs massively; some are going to be your responsibility, some aren't. I like to have different runners for different purposes. So the runners that have permissions to deploy to production can only take on jobs that deploy to production; they don't do building, they don't do scanning, they don't do anything like that, and that allows me to lock those down in bespoke ways. You've got to be careful, when you have these pipelines that can deploy to dev, deploy to QA and deploy to prod, that you don't create accidental conduits between your environments. If you do something like VPC peering in a cloud, what you're actually saying is that there is a route between your different environments, so you have to be really careful and

make sure you use things like access controls and firewalls to really monitor those conduits. And again, keep your build servers separate from what is in production. It's really easy to just have one big Kubernetes cluster that runs your production, QA, UAT and dev environments, and all your builds, through different namespaces, but that makes you really susceptible to denial-of-service attacks, or, again, if something can break out of a container, it could then start to wreak havoc in your production environment. That's a lot to chew on in half an hour, I appreciate. Feel free to reach out to me, either via my website, where you'll get my email address

that way, or via Twitter, and I do hope that this has given you something to figure out. Do we have a few minutes for questions, or are you wanting to move on? Theoretically it should be a break just now. Okay. But it's a break for 15 minutes and then we're into Adam Anderson's presentation. I guess, if anyone does have any questions, do you want to jump into the Slido quickly and have a look? Here we go; there's nothing there at the moment, but I guess someone could have put a question in quickly at the end, thinking, oh, I can get one in. Feedback in the Slack as well: love

for your talk. The Slido is really quiet now. I'm terribly sorry about the audio dropping; what I'll do is I'll upload the slides somewhere else so that people can look over them again. To be fair, chances are, given the way things have gone today, it's probably us, not you, so don't worry. [Laughter] Maybe; let's not… Listen, Sam, that was awesome, thanks for that, thanks for giving it. I know it's a lot for people to digest in half an hour, and I don't mean any insult by that, but it's also a lot for you to try and condense into half an hour and

actually present in a meaningful way, so thanks for that. Yeah, I mean, when I built the pipeline that this talk was developed from, we had three months and I was a complete noob at this stuff when building it. So a lot of it was speaking to a security architect: they would say, have you thought of this? And we would go, no; let's quickly go and build something to sort that out. And yeah, it's really easy with these tools to do stuff that is not that secure by default, in really subtle and damaging ways, if somebody can get in. Like the fact that Docker

runs as root by default means that if something can break out of that container, and doing that is fairly trivial these days, there are known, documented methods to do it, then your entire infrastructure is effectively owned at that point, because you then have a build agent which by default will have a service account with permissions well beyond what it actually needs, and somebody can just go through and enumerate all your network topology, then get to your database and get access to it, especially if the environment variables passed into it contain your database credentials in plain text. It's terrifying that this can be done relatively easily.

Cool. Thanks very much for speaking, thanks very much for spending the time with us; we've had a blast. Please do hang around; if anybody does have any questions for Sam, I'm sure he will love to receive a million DMs from everybody, so do go and DM him. Do you have open DMs on Twitter? Just interesting questions. I will do in a few minutes. You absolutely shouldn't; it's the best thing ever, you get so much crap on there, it's hilarious. But no, listen, I think… so the other Sam