Secrets Management and the Software Supply Chain: A Maturity Model for Secure Development

Uploaded: 2023-08-28
Duration: 31 min

BSides Sydney · 2023 · 31:00 · 115 views · Published 2023-08

Speakers

Mackenzie

Tags

Category: Technical

Topic: DevSecOps, Supply Chain Security, Threat Intel

Difficulty: Intermediary

Team: Blue

Research: Case Studies and Incidents Analysis Methodology

Style: Talk

Mentioned in this talk

Platforms

Docker, GitHub

Service

Amazon S3, GitGuardian, Stripe

Vendors

Okta

About this talk

Hard-coded secrets and exposed credentials pose a critical risk to modern software supply chains. This talk examines how six million credentials leak onto public repositories annually, traces real-world breaches, and presents a practical maturity model—from detecting secrets locally to implementing dynamic secrets and continuous scanning—that organizations can adopt to move from reactive incident response to proactive security.

Original YouTube description

Software applications are no longer independent monoliths; instead, they are built up from thousands of different tools, components, and services. This creates both huge opportunities and security challenges. Exploring recent examples and monitoring hackers in real time, we explore why the internet is broken and how to fix it. Mackenzie is a developer advocate with a passion for DevOps and code security. As the co-founder and former CTO of an Australian health tech startup, Conpago, he learnt first-hand how critical it is to build secure applications with robust developer operations. Today, as the Developer Advocate at GitGuardian, Mackenzie is able to share his passion for code security with developers and works closely with research teams to show how malicious actors discover and exploit vulnerabilities in code.

Transcript [en]

[Music] And today I'm going to be presenting a model of what organizations should be aiming to achieve when securing the secrets within their environments. We're going to cover the problem of hard-coded secrets, and I'll quickly explain what I'm referring to when I talk about secrets. I'm going to show how these are used in some high-profile incidents, so how actual breaches in the real world have occurred because of poor secrets management. We're going to go through what insecure development practices look like, and then we'll go through what a maturity model looks like for managing these secrets. So let's start off with the problem of

hard-coded secrets. So what am I referring to when I'm talking about secrets? I'm referring to pretty much anything that is a secret, but specifically I'm talking about digital authentication credentials. These are things like your API keys, your security certificates, and your usernames and passwords to, for instance, your databases. What's specific about secrets is that these are made to be used programmatically, which means they're machine-to-machine encryption keys or machine-to-machine secrets. These give access to the innermost workings of our organizations today, so they're extremely sensitive. Oh, mood lighting. So we absolutely want to make sure that these remain secret, but it's actually really hard to do this because

everything that we use today runs on secrets. Our supply chain is full of third-party services and third-party tools, and we're all leveraging these. So if someone can exploit these, they not only can get into the inner workings of your systems, but they can also poison other applications, which I'll talk about. We know what we're talking about when we hear "supply chain" in reference to physical supply chains, in the world of manufacturing and other areas, but it's very similar for software. You have your open source packages and dependencies, you've got your IDEs, your development environments, and then once you've built it, it needs

to go into testing, and then we send it out through artifacts. So this is a very nice, pretty picture of what the software supply chain can look like, but what does it really look like in reality? Well, it gets a lot more complicated, because our dependencies are dependent on each other, so it's not just a linear line. We have lots of different areas, and in between all of these you'll see these little keys, and that's the secrets component: that's what gives access through all of these. So if you trace back all these lines in this very convoluted manner, you will find that you'll be able to poison huge amounts of different infrastructure,

systems, and data by obtaining these secrets. So it's really, really important that we manage these correctly and that these don't end up in the wild. I want to talk a little bit about how these secrets end up out there in the wild: how do they get exposed? We're probably familiar with this site. This is GitHub. It's the largest code sharing platform in the world; 80 million developers use GitHub to publish code. Sometimes it's open source, sometimes it's closed source, but there's a huge amount of data in there. Last year, in 2021, a billion commits, or, if you're not familiar with Git and that word, a billion code uploads, you can

think of it, were made on GitHub. Out of this billion commits that were made publicly, we decided to scan all of them to try and identify if there were any secrets in there, and we found a couple. In fact, we found 6 million different credentials and secrets publicly committed to github.com. Six million. So this is a huge amount of sensitive information, and these can be for anything: for instance, to cloud service providers, to databases, to lots of different areas. It's a huge amount of sensitive information that's just making it out into the public web space. You don't need any authentication to be able to view this code, and it's out there for

everyone to see. Other places where these secrets are being exposed are at the end of the supply chain, in that build stage, in areas like our Docker images. Once we've compiled an application, we might put it into what we call a Docker image, which is basically just a mini virtual machine. It contains lots of the different services that your application needs to run. The problem with Docker images is that they can be convoluted, they can be quite confusing, and often people don't really understand how they work. But one thing that's very clear is that if there are secrets in your code and you compile a Docker image, there's going to be

secrets in your Docker image. There are 8.8 million Docker images publicly available on Docker Hub. We decided to scan lots of those, and we found that about five percent of these Docker images contained a plaintext credential, a hard-coded secret. That's a huge amount. And if we want to talk about supply chain security, there was a very real threat that happened as a result of this with a product called Codecov. They had a plaintext secret in their Docker image, the official Docker image. Attackers found that and were able to turn Codecov malicious, and then 20,000 people that were using that product had their private infrastructure breached because Codecov had a plaintext secret in their Docker

image. So this exposes how integrated secrets are into that whole idea of the supply chain, and why it's so important that we work really hard at actually making sure that these remain secret and that we have good maturity in keeping them that way. Now, the last place I want to talk about finding these secrets is inside private repositories. We talked about public repositories, how we found six million secrets in public GitHub, but you might be thinking to yourself, "Well, all of our source code, all of our developers, we use just private repos, right? So I don't really have a problem." The problem with private repos is that

source code is a very leaky asset and it's really bad at staying private. So we at GitGuardian did some research into how many secrets we typically find in private source code repositories, in the private version control systems where we store our source code. If we take the average company of around 400 engineers, 400 developers, we would find about a thousand unique secrets in their private source code repositories. If we break that down, we also found about 13,000 occurrences of those, so the total number of secrets when you include duplicates is 13,000. Now, if you're looking at an average company, you've probably got four AppSec engineers for 400 developers. That's

3.4 thousand secrets that each AppSec engineer has to go through, sort through, every single year. That's an impossible job. That's 10 a day if you didn't take any holidays. So the problem is absolutely massive, and source code is a very leaky asset. You can think, "Oh well, this is private," but you're talking about the credentials that give access to all the inner workings of your organization: your databases, your payment infrastructure. And if you think about who in your organization you want to have access to those, it's a very small handful of people that should actually have access to them. But now if you think about who has access to your source code, the number of people in the

organization is huge, and there's no way of being able to distinguish that. So it's a massive problem having these secrets in private repositories. Let's talk about some high-profile incidents that involve these. I want to talk about Uber. Uber this year had a massive breach, and in my opinion it could have been so much worse, but I want to talk about exactly what happened. A bad actor bought some credentials on the dark web for Uber's VPN, to be able to access Uber's internal network. Now, luckily Uber did one thing correctly, in that they had multi-factor authentication on this, so it was another challenging step for the attacker. But through social engineering and

pretending to be the security team, they called up the person whose credentials they had access to and told them that it was an emergency, and he accepted the multi-factor authentication prompt and the attacker made it onto the network. So we already have the fact that these credentials made it to the dark web, which means they were exposed somewhere. But that's not where it stops. In a mature organization, just because attackers make it to your internal network doesn't necessarily mean that they should be able to disrupt services or access sensitive information. But once they had VPN access, they started scanning Uber's internal networks to try and find more sensitive information, and they found some secrets inside some PowerShell scripts.

PowerShell scripts were just being used to automate something. From what we've seen, what we can assume is that they had PowerShell scripts so that when a new person joined the company, they would put in some permissions and this PowerShell script would go through and create lots of accounts on different services for them. So how do they do that? Well, they had hard-coded the admin credentials to their PAM (privileged access management) system. A PAM system is like your secrets manager and your password manager rolled into one: it controls access to every part of your organization. If an attacker gains access to this system, not only do they basically have access to everything, they can create accounts for themselves, which means they're authentic

or properly authenticated, and really, really hard to detect. And that's exactly what happened. From there the attackers gained access to basically every single one of Uber's accounts: their cloud storage accounts, their email accounts, their threat detection, so they could have even shut down their threat detectors. Basically they had access to turn off the alarms in the building, so to speak, to persist their attack. And that's all because they had poor secrets management. Going back to private repositories, we can look back to last year, when Twitch's source code was leaked publicly. This happened because of a small misconfiguration, basically. They were self-managing their version control server, their Git server. They made a

mistake in the configuration, and all of a sudden Twitch's source code was publicly available. So you think, "All my source code is private, it doesn't matter." Well, that was obviously Twitch's mentality, but one very small mistake and the whole system comes down. The attackers did a very big favor to Twitch: they publicly announced that they had the source code and they leaked it online. Why was that a favor to Twitch? Because if they wanted to be more malicious, they could have kept it, because inside that source code were huge amounts of secrets, huge amounts of credentials. We found over 6,600 exposed secrets in here, including 194 AWS keys, so keys to

your cloud infrastructure. We found 69 Twilio keys, Google keys, and even Stripe keys, so keys for payment systems, inside Twitch's source code. Now, had the attacker been more malicious, they could have taken these keys and sold them. Luckily they decided not to, but here's an example of where the whole system can come down because of exposed secrets inside private source code. And source code very, very often gets made public. There have been endless source code leaks this year, from Microsoft, Nvidia. There was a group of teenagers called the Lapsus$ group that was leaking source code everywhere. So we can't rely on the fact that our source code is going to remain

private. And if we just take one of those: how does a group of teenagers gain access to massive companies' private source code? Well, here we have Nvidia. Nvidia had their source code accessed by the Lapsus$ group. How did that happen? Nvidia is a great company with great security. In their private source code they had signing keys that were then used to sign malware, so very critical keys. So how did they get access to them? This is Lapsus$'s Telegram channel. They were basically just paying people, insiders, to give them access. You don't need to have huge amounts of sophistication. You think about everyone in the organization that might have access to the source code,

right? And now all of a sudden you find an intern, someone that just got fired, someone that's planning on leaving. You offer them five thousand dollars to give you access to a Git repository. Well, it's probably pretty tempting for a lot of people. So we get a little bit technical now, but I'll go quickly over this. The problem with secrets, in particular with source code, which is kind of fundamental here, is that they're really hard to spot unless you know what you're looking for. Unless you're an attacker and you're looking for them, in general operations they're hard to spot. So this is kind of what happens

when we create code. You have your main branch, which is the code that's running in production, and then you have these development branches that branch off from it. Now, what happens with code is that we keep track of everything. You hard-code a secret and then you delete it: it's in your history. And if you're working in teams, deleting your Git history is not only a pain, it can actually be hugely disruptive, so people don't do it. And even when we review other people's code, we don't review the entire history of a feature. We review the latest version of it against the latest version in the main

branch, which means that if someone did commit a secret, it's gone into the history and it's forgotten about. Hard to find in day-to-day operations, easy for an attacker to find. So how come this all happened? Well, basically because we're moving so fast in development now. When we're creating applications and features, we're moving at the speed of DevOps, but we haven't integrated security into this process. We're deploying applications every day. We have all these third-party tools that enable us to move quicker: we're not building credit card processing, we're using Stripe; we're not building our own authentication, we're using Okta; all these different things. But it leaves the door wide open for

bad guys to find and exploit these secrets. We have secrets that are committed for testing purposes, private repositories that are accidentally made public, secrets that are in auto-generated files and logs, sensitive files that are accidentally included, and we can accidentally push code to the wrong repository. These are all ways that our secrets, these highly sensitive strings, end up in the wrong places, and what's really important is that we have no idea what happens when they get there. So let's quickly talk about what we can do to actually start solving this issue. What does it look like when an organization has a good level of

maturity? We've just talked about how secrets can end up sprawled everywhere, we've talked about how hackers find them and what can happen, but let's talk about what it looks like if we have good practices that enable our secrets to remain secure. What does that actually look like? So we have what we call our maturity model. Basically, if you take your organization or your industry and you look at how you're going, what you want is to be at the top, you want to be at level four, but most people are around level zero or level one. So we've created these steps, pragmatic steps, of what it looks like to

not only get to level four but actually to move there. So if you decide that you're a software organization and you want to be at level four in managing your secrets and having secure development processes, what does that actually look like? What does that actually mean? Well, it's going to be a process. You're not going to get there overnight, but there are steps that you can take to start moving towards it. What we'll do now is just go through these steps and what it looks like to be a mature organization managing your secrets in a mature way. So let's talk about what it looks like if you have no maturity. We've

mentioned secrets in these different areas. Basically what this means is that you've got your developers, your software engineers, and they're storing these secrets in plaintext. A developer needs an AWS key to connect to an S3 bucket or connect to a server, and you just send it to them. They store it on their computer in secrets.txt, and away we go. When we get to the actual private source code repositories, the version control systems, we have no checks and balances in there, so secrets are going to end up being committed and we have no idea. This is where most companies

start. We don't understand how much of a problem it is, and we're small enough to get away with it for a short period of time. Then we want to get to that level one area, where we're just starting to think about how we can securely manage these secrets. Here we still have our secrets in the developer environments. If we look at our secrets management, these are still being stored in unencrypted config files. Maybe they're not being purposely checked into our Git repositories, our version control systems, but they're still unencrypted on the developers' machines. That means that we still have no visibility over where these secrets are.

And when we look at our source code, okay, these are now grouped into areas, they're not sprawled everywhere, but they're still checked into source control, which means if an attacker gets into that source control they're going to be able to find these secrets. And if we go down here, we look at secrets detection, because of course we need to manage our secrets, but we also want to be able to detect them if they're there. We might have some small detection in place. Maybe we're using some regular expressions to detect some of the patterns of these secrets in some high-profile source repositories, but really we haven't got adequate detection in

there, so we still don't know that we actually have a problem. Now, once we get to level two, we're starting to actually achieve some maturity. Let's have a look at what we want to have in place for secret storage and secret detection at level two. Now we're actually using some correct tools to store our secrets. We're using a secrets manager. This might be a vault, could be something else; there are lots of tools out there that we can use to manage our secrets. Here we're first starting to see that these are actually stored correctly, and this is really important because we can have access control over

these and we can limit who actually has access to them. But we're also still seeing some secrets being stored in source control. In particular, at this stage people will say that they're encrypting their secret files and putting them in version control, because that's an easy way to distribute them throughout your developers, throughout your teams. This is still a really poor practice because it gives you a single point of weakness, but at least they're not in plaintext anymore. And then when we look at our detection, what are we actually detecting in our repositories? Here we have our critical repositories, so our production areas are being scanned at certain points in the process, at certain triggers. So when

we're making a pull request, we're scanning for these secrets; when we're merging into the master branch, we're scanning for these secrets. So we're starting to get some maturity in here. Now, this is where most big companies are. If you look at the breaches that I talked about, this is probably where Uber is, this is probably where Twitch was, this is probably where Codecov was. You have some processes in there, we're not just storing secrets in Git, but we know that they're still going to make their way out. We know that they're still going to be in the history somewhere. We know that they're still going to make it onto our infrastructure.
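A minimal sketch of the kind of pattern-based scan described here, as it might run against the diff of a pull request. The function name `scan_diff`, the detector names, and the generic assignment pattern are assumptions made for illustration; the `AKIA` prefix followed by 16 uppercase alphanumeric characters is the commonly documented shape of an AWS access key ID. Production scanners use far more detectors than this.

```python
import re

# Illustrative detectors only. The generic pattern is a hypothetical catch-all
# for quoted values assigned to suspicious-looking names.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_assignment": re.compile(
        r"(?i)\b(?:password|secret|api_key|token)\b\s*[:=]\s*['\"]([^'\"]{8,})['\"]"
    ),
}


def scan_diff(diff_text):
    """Return (line_number, detector_name) for each added line that matches."""
    findings = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Inspect only lines the change adds; skip the '+++ b/file' header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in PATTERNS.items():
            if pattern.search(line[1:]):
                findings.append((number, name))
    return findings
```

Only added lines are inspected, which matches scanning at a pull request or merge trigger. It also illustrates the limit of this level: anything already sitting in the history is never looked at.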

And that's because we don't have adequate fail-safes in there to prevent that. So we still need to go a little bit further than that. Now we get to level three. We're starting to get to a point where the risk of our secrets being exposed is starting to come down a little bit. So what do we have in our development environment? Yes, we're using vaults, we're securing these secrets in our managers just like before, but we also have rotation in place. We should be rotating our secrets very regularly, because then if one does get leaked, it should be invalidated soon thereafter just through regular rotation. So we've

implemented some secrets rotation based on our secrets management, and we've completely stopped storing secrets in Git, encrypted or not. We've removed that from our environments. It's a really important step to get to, but it's actually quite hard, because you're going to get pushback from your developers and software engineers, because it's very convenient to be able to do that. Now let's move over to secrets detection, and there's been one big change: we're actually starting to detect secrets locally on our developers' machines. We've talked about secrets being stored accidentally inside private repositories. Once the secret gets there, it's automatically compromised, because it's going to be cloned onto multiple people's machines.
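Local detection of this kind is typically wired into Git as a pre-commit hook, which Git runs before a commit is recorded and which blocks the commit when it exits with a non-zero status. The sketch below is one hedged way to do that: it reads the staged diff with `git diff --cached` and flags long, high-entropy tokens. The entropy heuristic, the 20-character minimum, and the 4.0 threshold are assumptions chosen for this example rather than anything the talk specifies, and the function names are hypothetical.

```python
import math
import re
import subprocess
import sys
from collections import Counter

# Candidate tokens: long runs of characters typical of keys (assumed minimum 20).
TOKEN = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")


def shannon_entropy(text):
    """Bits of entropy per character of the given string."""
    counts = Counter(text)
    return -sum((n / len(text)) * math.log2(n / len(text)) for n in counts.values())


def suspicious_tokens(diff_text, threshold=4.0):
    """Return high-entropy tokens found on lines the staged change adds."""
    hits = []
    for line in diff_text.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            hits += [t for t in TOKEN.findall(line[1:]) if shannon_entropy(t) >= threshold]
    return hits


def main():
    """Scan the staged diff; return 1 to block the commit, 0 to allow it."""
    staged = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    hits = suspicious_tokens(staged)
    for token in hits:
        # Print only a short prefix so the hook does not echo the secret itself.
        print(f"possible secret in staged changes: {token[:4]}...", file=sys.stderr)
    return 1 if hits else 0

# A hook script would finish with: sys.exit(main())
```

Saved as an executable `.git/hooks/pre-commit` script that ends with `sys.exit(main())`, this would stop the commit whenever a suspicious token is staged. Entropy checks produce false positives on hashes and long identifiers, so a real setup would likely pair them with specific detectors.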

It's going to be backed up, it's going to be pulled down, and we've got no visibility over it. It doesn't matter if it's been there for a year or a day: that secret is absolutely compromised once it hits your Git repositories. So now we've got continuous scanning, but we're also scanning before it gets to our Git repositories. When your developer commits code, we can implement a Git hook that will block that commit, block that upload, if it contains a secret. This is important because now we've not only identified the problem by scanning our Git repositories and identifying that we have secrets in there, but we've actually started to stop the bleeding, so we can move forward and

start solving the problem. And then once we get to the expert level, I won't go through all the stages; I'll just talk about the few things that have changed. One is that we're introducing the concept of dynamic secrets in the developer environment. Basically, instead of saying "I want to have a password or an API key for this service", we're creating one so that the machine can access the service and then destroying that secret immediately. That's what a dynamic secret is. It really reduces the risk of any secrets being exposed, because they're one-time-use secrets. That's quite a hard thing to manage; you

need teams to be able to do that, but that's why it's in the expert level. So that's really what we have, and then we also have our secret detection inside all of our environments, continuously, at every stage. All of our repositories and all of our developers have pre-commit hooks to detect these. So this is what it looks like to be able to sleep easily at night and not have secrets at risk of being exposed. But this is quite hard. If you're at level zero or level one, you're not going to get here overnight. You're not going to be able to find a security vendor, although I'm sure they

will promise you that they can, that will get you there instantly if you just sign a hundred-thousand-dollar check. It doesn't work like that. We have to move forward step by step, and there are other steps here that I've left out, because this was a half-hour talk and I didn't want to bore you all for two hours talking about all the things that we can do for secrets management. But the first two are probably the most applicable, and certainly the areas where the most secrets get leaked out into the wild. So now we have to look at building detection and remediation programs, because now that

we're not storing secrets in Git, and we're detecting them, and we have vaults, we actually come to an interesting area where we're still going to have incidents. The big difference is that now we're actually aware of those incidents. When a secret does enter the version control system or the Git repository, we're aware of it. That's one of the big differences between now and level zero, where we weren't even aware of the problem. But remediating incidents is really hard. At the start of this presentation I talked about the fact that in the average company there are going to be a thousand unique secrets and 13,000 occurrences of secrets in

these repositories that we're going to have to sift through and organize with only four AppSec engineers. That's a massive task. So remediation and solving these issues becomes really important, otherwise it's just going to keep getting ignored. We're monitoring our source control, we have lots of different types of detection, yes, and we're getting alerted to them, but then what? We're aware of the problem, then what? Remediation comes from being able to prioritize the incidents that are most severe. We need to start doing things like validating our secrets: if an AWS key leaks, do we know if it's a real one or not? We need to implement some of these factors so that we can start prioritizing the

incidents and acting on the ones that are really causing havoc at the moment. And how do we do all that? Well, we also need to integrate with the developers. We have this concept of DevSecOps at the moment: development, security, and operations teams, and merging them all together. Why is that important? Because developers are the ones who, if they've leaked a secret, are going to have all the context around what that secret does, why it's been leaked, and whether it's valid. We need to involve them in the process, otherwise we're just going to have isolated security and operations teams working on their own. So it's really important in

the remediation process that we actually involve all the parties, that we get the developers involved, and that we actually know where we're getting to. We also need analytics to be able to see the problems at a glance, because the reason this is such a massive problem in our security supply chain is that people aren't aware of it as a problem. If we scan the internal repositories of your organization, I can guarantee there are going to be secrets in there. And it doesn't matter if you're building cloud services or you're manufacturing paint: if you're a large company, you have software engineers working for you now. It's the way of the world. Every company is turning into a software organization.
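The prioritization idea above can be sketched in a few lines: collapse the many occurrences into unique incidents, then rank incidents whose secret is still valid first and widely spread ones next. The `Occurrence` fields, the `secret_id` fingerprint, and the ordering rule are all hypothetical choices for this example; how validity is checked depends on the provider and is not shown.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    secret_id: str     # hypothetical fingerprint of the secret, never the value itself
    repo: str          # repository where this occurrence was found
    still_valid: bool  # outcome of an assumed validity check against the provider


def prioritize(occurrences):
    """Collapse occurrences into unique incidents, most urgent first."""
    counts = Counter(o.secret_id for o in occurrences)
    valid = {o.secret_id for o in occurrences if o.still_valid}
    # Valid secrets sort first, then the ones that appear in the most places.
    return sorted(counts, key=lambda sid: (sid not in valid, -counts[sid], sid))
```

With a ranking like this, a small AppSec team would start from the handful of live, widely copied secrets instead of working through every occurrence in arrival order.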

We all need to be focused on this. And then we also need to be able to integrate detection into every area of the software development life cycle. So we have here, going through from the start, where we're creating our code, all the way to where we're deploying it. Secrets can enter our supply chain at any moment. Obviously when we're writing code they can be hard-coded in and then we commit them, but they can also be injected just before our CI/CD runs. If our CI/CD environments get compromised, attackers can dump out the credentials there. And then even in the deployment process, we might have credentials for package managers

running in those scripts. So we absolutely need to have secret detection at every stage of that. There are lots of tools for secret detection. There are some open source tools that you can use for free, like TruffleHog and git-secrets, and there are lots of tools that you can use as secrets managers. Again, some of them are open source, like HashiCorp Vault, and some of them are really expensive, like the commercial version of HashiCorp Vault. What's important here is not necessarily the tools that you're using, but that you are using them. I work for a company called GitGuardian, and we have a bunch of free

tools for secrets detection that help with that. But the important thing is to actually make sure that you're using the tools to effectively combat this problem. So I'm just going to finish up on one thing. If we want to have our supply chain and our source code free from secrets, which we know is a massive problem, there are really three areas that we need to focus on: our processes, our people, and our tools. We need to have the tools in place to be able to do this, but this is why I said vendors can't solve this problem alone, even though they're going to promise you that

they can. It's because we also need to train our developers and bring together DevSecOps to make everyone aware of it. And then we also need to have processes in place that actually deal with these: what happens when an incident happens, when do we revoke secrets, what are our policies on rotation? With this we can start to combat that problem and build up that maturity model of managing secrets. So that's the end of it. There's a full white paper on this, so if you missed anything in the presentation or you want to go deeper into different areas, this here is a white paper that you can download that will go

through all these elements in much more granular detail. So if you were a bit confused about some areas, you can definitely dive deeper into this. And if anyone has any questions, I'd be more than happy to take them here. So thank you all for listening. Thanks, thank you.
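As one small example of turning a rotation policy into a check, the sketch below flags secrets whose age exceeds a maximum. The inventory format, the function name `overdue_for_rotation`, and any specific age limit passed to it are assumptions for illustration, not figures from the talk.

```python
from datetime import datetime, timedelta, timezone


def overdue_for_rotation(created_at, max_age_days, now=None):
    """Map each secret name to True when it is older than the policy allows.

    created_at: dict of secret name -> timezone-aware creation datetime
    (a hypothetical inventory exported from a secrets manager).
    """
    now = now or datetime.now(timezone.utc)
    limit = timedelta(days=max_age_days)
    return {name: (now - created) > limit for name, created in created_at.items()}
```

Run on a schedule against a secrets manager's inventory, a check like this would surface which credentials have outlived the policy, so that a leaked secret does not stay usable indefinitely.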
