
Before I go into my slides, I want you to imagine a scenario. You are a developer working under a very tight deadline, with your reliable AI coding assistant by your side: autocompleting functions, generating code, suggesting dependencies. Have you heard the term vibe coding? This is where ideas turn into code almost instantly and effortlessly with the help of AI. It all sounds like magic, but if one piece of code in a dependency can affect the entire application, this can turn into a nightmare. So this is what I'm going to talk about.
It goes back to the topic of the software supply chain. You may already know about the software supply chain, but you probably wouldn't realize how much it has evolved over time into something quite new; the use of AI, for example, is changing that landscape.

Hi, I haven't introduced myself. My name is Emma. I'm a security architect at EPAM, where I lead a group of security architects and engineers for the UK and Ireland. I have 12 years in AppSec and cloud security, mainly in Azure, and I'm a passionate mentor. I also speak at international
conferences. I just came back from Las Vegas, where I delivered two talks at DEF CON, one of them about cloud security. So that's me. I love travel, hiking, cooking, everything; that's quite normal.

The software supply chain is no longer simple. I would like to introduce some of the recent attacks, but first I want to get across the idea that it is now a complex, global network of code contributors, open source projects, and third-party vendors. Open source software makes up about 70 to 90% of all the
modern code bases. We have seen all kinds of attacks. Sometimes an attack on dependencies becomes a way for attackers to deliver ransomware. And the current state is that we have all been using AI code assistants, so that is basically what I want to stand here and talk about.

But first let's go through some of the basics and the history of the software supply chain. We need to understand what the threats to the software supply chain are and how they affect systems where they are
vulnerable, because almost every piece of software nowadays contains open source libraries and dependencies. These dependencies and container images are the building blocks of any application, and any vulnerability that is deployed into an application can spread throughout the CI/CD pipeline.

The XZ Utils backdoor, which you have probably heard about, was a campaign that lasted three years before it was discovered. Malicious code was planted in multiple versions of the open source project, and the attacker was able to maintain persistence without being noticed. This is an example of using social
engineering to build and maintain trust within an open source project.

SolarWinds is a very famous one, so I don't need to explain too much about it. Basically it was a breach of the internal build system of the Orion platform: malicious code was injected into a software update for Orion. This attack impacted tens of thousands of downstream customers, which shows that even a third-party vendor that seems stable and trusted can still lead to disaster.

A dependency confusion attack is a type of attack that is actually quite
interesting because it abuses the prioritization mechanism used by certain package managers when there is a naming collision between private and public registries. For example, suppose your organization has a private package repository for its internal packages. If an attacker publishes a package with the same name to the public repository, your build may pull the public package instead of the internal one and install the malware. So that is one of the attack vectors.

The next type of attack is called
typosquatting. It's clear from the name: typosquatting exploits typos, the spelling mistakes in the names of packages or dependencies. An attacker can publish a malicious package in a public registry, such as a container registry or a dependency registry like npm, under a name that is slightly misspelled. You don't even notice, and developers just download it automatically from the public repository.

The last one I would like to explain is something that can happen to the artifact. When you package everything together, it becomes an artifact, such as a container image, and that artifact can be poisoned. So how can it
be poisoned? Here is an example: Codecov, a code coverage tool. In this case a vulnerability was exploited in the Docker image creation process, so the artifacts were affected, and so were a lot of downstream customers.

Now we come to slopsquatting. How many of you have heard this term? Okay, cool. It's not new. Basically, this type of attack gives the attacker an opportunity to publish a malicious package under the exact name that an AI assistant suggests, even though no legitimate package with that name exists. Take the scenario that I described at the start, where you, as a
developer, use an AI coding assistant such as GitHub Copilot or whatever you prefer; there are plenty of tools on the market. The problem arises when the AI recommends code with a non-existent or malicious dependency, or to use another word, a hallucinated dependency. An attacker can register that dependency name, with malicious code, under the same package name that the AI is referencing. As a result, a developer may install that package, and this can lead to compromise.

So you might ask me: okay, so what? If we have a malicious package in our system, what kind of consequences
can it cause? In most cases the attacker is able to open a command-and-control channel, in other words plant a backdoor, and retain access to your system. That is probably the most dangerous thing you can experience in your system.

Sorry about this very overwhelming slide, but it shows the attack surface of the CI/CD pipeline, which is much wider. Even if we consider only the dependencies, which sit at the beginning of the CI/CD pipeline where code gets into your system, there is
plenty more that happens downstream. A single vulnerability that travels through the pipeline can have a much greater impact. Exploiting a CI/CD pipeline could mean exploiting the CI runner, the CI server, the build server, something like that. I'm not going to go through each one of them, but one thing is really interesting: your CI/CD pipeline has so many secrets in it, maybe sitting in one of the config files. To be honest, in my projects it's a very
common practice for developers to write test code with embedded secrets and credentials. They like to do that because they want to automate everything in testing. This creates an opportunity if that piece of code is obtained by an attacker and the credential is still valid, and that is very dangerous.

So what is the impact? It could be anything: data exfiltration, serious damage to your company's reputation, loss of customer trust. I don't need to go into that; your business impact analysis will probably show everything about
this.

Now I would like to bring in zero trust. I don't need to introduce it, because I know everybody is using zero trust principles at the moment, but you have probably never thought, "I need to use zero trust in the supply chain." So why zero trust? There are three principles that I summarize here: least privilege access, verify explicitly, and assume breach. When zero trust is applied to the supply chain, it turns out to be about verification: how you verify
your supply chain in terms of your CI/CD pipeline, how you verify every access to the CI/CD pipeline, and how you verify artifact integrity.

So what are the mitigations for the scenario that I described earlier? These are the methods for applying zero trust, and I will explain all of them in a minute.

First of all, how do you choose dependencies and libraries safely? You might think it is obvious that you need to go for trustworthy dependencies and libraries, but the fact is that a lot of developers don't check. They just
go with whatever library works for them, because they want to ship efficiently. You can look for signs of legitimacy by checking publisher badges and verifying package sources, and you should always verify the existence and reputation of any package name suggested by the AI before installation. Community feedback can also be a very good way to check whether the package you are after is a valid one.

I'd like to talk about version pinning, because it is something that is mostly forgotten when we are deploying a large project. In
the project we have different version updates. Without pinned versions, tools like your package manager will pull the latest compatible versions. Imagine that multiple developers are contributing to the same repo: one of them pulls one version of a dependency, another pulls a different version, and each ends up with a different dependency tree, so the builds are inconsistent. If you are using lock files, as in the case of npm, then even if the package has released new
versions, you can ensure that the trusted version of the dependency, the one you have reviewed and tested, is downloaded instead of the newest one. Newest doesn't mean best. You have to test, review, and verify everything. That is the idea of version pinning.

Another way to verify your packages is to use digital signatures and hashes. So how do you apply digital signatures and hashes? You need to verify your
package after you have downloaded it and before installation and deployment, and use the digital signature to verify that it comes from the right source. So how does the check work? When you install a package, the package manager calculates a hash of the downloaded package and compares it with the integrity hash recorded, for example, in your lock file, if you are using lock files as I described on the previous slide. That is how you can verify the integrity of that file.

Let's move on to software composition analysis. Software composition analysis is also called
dependency analysis. Listed here are a few tools that we have used in the past. Most of them are open source. You're welcome to use commercial tools as well, but these are the ones I would recommend if you're going with open source. Sometimes your package manager, like npm, can provide a better scanner, if you are able to use it, because it focuses on the known vulnerabilities you need to consider for your dependencies.

So what is sandboxing? Sandboxing is not just related
to limiting the spread of a vulnerability. If you download your dependency into a sandbox, such as a disposable container or a lightweight virtual machine, you can also use it to test dependencies, or artifacts such as container images, before they are deployed into your downstream systems.

Okay, SBOM. I think probably all of us have heard about SBOMs before. An SBOM is a complete inventory of all the software components and dependencies in a software project and its artifacts, together with their metadata and the relationships between them. So when
you are using an SBOM, there are some red flags that you can check against. Some SBOM tools are very good because they can flag things like known malware, known vulnerabilities (CVEs), and even typosquatting: they compare package names against the valid names and call out the ones that look like typos. Sometimes they can also flag license risk. One of the best practices for SBOMs is to enforce them within the CI/CD pipeline. In reality this is really hard to enforce, because you don't want your builds to fail automatically at every pull
request, whatever gate you are setting. My recommendation, depending on your risk appetite, is to have a separate pipeline that runs the SBOM check at every pull request for your build. This way it won't affect the main build, but it still runs the SBOM check and identifies any vulnerability in your code.

On my screen here I have shared an example of an SBOM that we have been using. As you can see, in the SBOM metadata you will be able to see
the version of each dependency, its own dependencies, and the vulnerabilities associated with it. And this is another example of SBOM output. As you can see, it shows the details of the vulnerabilities, and it can also provide recommendations and suggested versions for upgrades. Another interesting feature of an SBOM is that it can show the relationships between components, such as the parent-child relationship between direct and transitive dependencies. That's a really good tool.

Okay, this is interesting. How can we train AI to handle everything safely? We could try to lower the temperature, or randomness, of the LLM, the large
language model, during code generation. That can reduce the likelihood of hallucination, because AI coding assistants rely on statistical patterns from their training data to predict package names. You can also use a reasoning-enhanced agent, one that simulates humanlike reasoning and attempts to understand the broader context, for example the version and the source of a dependency. This can lower the chance of getting hallucinated dependencies, but it doesn't eliminate them completely. So, as I mentioned before, we can use a sandbox environment to deploy our
code before it goes into the main deployment. You can also train your AI agent: if you are using an internal AI agent, you can train it to interact with a trusted package registry in real time. For that you will need a data science specialist in house.

Okay, I'm going to go through this quickly. I've talked about dependency signing, but another very important thing in the CI/CD pipeline is artifact signing. This means that you also need to sign your binaries, container images, and other forms of packaged code. The signature can be used to verify code or binary integrity
and so if something in the artifact is altered, the signature becomes invalid. That is the idea, and again you need to enforce it in the CI/CD pipeline.

In terms of protecting your CI/CD pipeline, start with the workflow. Your branch workflow, such as who is going to approve and who is going to do the code review, needs to be defined clearly for your company. Each company has its own branch policies. Also audit and monitor your CI/CD runners to detect any anomalies, and maintain an audit log and monitor
your pipeline.

A bit of guidance around secret management: we all know not to hard-code secrets. You can use a secrets manager such as AWS Secrets Manager, AWS KMS, or Azure Key Vault, and you can also use something like GitHub CI secrets to manage your CI/CD secrets. Depending on what you use, there are different tools that are native to your platform.

Real-time observability is also quite standard for managing and protecting your runtime environment. For cloud native observability you can use eBPF-based
networking and security observability tools to detect anomalies. Those are examples of the tools.

Another thing that can be integrated into your supply chain assessment program is to adopt one of the frameworks. There are two frameworks that are mainly used in most companies: one is the NIST Secure Software Development Framework (SSDF) and the other is SLSA. It's not mandatory to use them, but using them can help you comply with regulatory requirements like DORA or the Cyber Resilience Act. If you read those regulations, they do require you to understand what is in your supply
chain. This is a screenshot of the SLSA levels, the different levels at which you can assess your supply chain under the SLSA framework.

Okay, so this is the final checklist that I would give my developers, so they can check whether they have achieved all of the items. In real life it's hard to incorporate all of them into your processes, but we need to start somewhere.

That's all. Thank you. Any questions?
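
Two of the checks from the talk can be sketched in a few lines of Python: vetting a package name suggested by an AI assistant before installing it, and comparing a downloaded package against a pinned integrity hash. This is only an illustration under stated assumptions, not a tool shown in the talk. The allowlist `TRUSTED_PACKAGES`, the function names, and the similarity cutoff of 0.8 are all hypothetical, and the integrity string simply mirrors the `sha512-<base64>` form that npm lock files record.

```python
import base64
import difflib
import hashlib

# Hypothetical allowlist of package names the team has already reviewed.
TRUSTED_PACKAGES = {"requests", "numpy", "express", "lodash"}


def vet_package_name(name, trusted=TRUSTED_PACKAGES, cutoff=0.8):
    """Classify a suggested package name before anyone installs it.

    Returns "trusted" if the name is on the allowlist,
    "lookalike:<trusted name>" if it is suspiciously close to a trusted
    name (a possible typosquat), and "unknown" otherwise (possibly a
    hallucinated name that needs manual review).
    """
    if name in trusted:
        return "trusted"
    close = difflib.get_close_matches(name, sorted(trusted), n=1, cutoff=cutoff)
    if close:
        return f"lookalike:{close[0]}"
    return "unknown"


def sri_digest(data: bytes, algorithm: str = "sha512") -> str:
    """Build an integrity string of the form '<algorithm>-<base64 digest>'."""
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def verify_integrity(data: bytes, expected: str) -> bool:
    """Check downloaded bytes against a pinned integrity string."""
    algorithm, _, _ = expected.partition("-")
    if algorithm not in hashlib.algorithms_guaranteed:
        # Unknown or missing algorithm prefix: treat as a failed check.
        return False
    return sri_digest(data, algorithm) == expected


if __name__ == "__main__":
    # Assumption: in a real pipeline the bytes would be the downloaded
    # package archive and the pinned value would come from the lock file.
    archive = b"example package contents"
    pinned = sri_digest(archive)
    print(vet_package_name("reqeusts"))
    print(verify_integrity(archive, pinned))
    print(verify_integrity(archive + b"tampered", pinned))
```

A check like this fits best as a gate before installation, for example in the separate pull request pipeline mentioned earlier. Anything classified as a lookalike or unknown would go to a human for review rather than being blocked or installed automatically.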