
The talk is going to be about supply chain security, more specifically looking at attacks that happened and were publicly disclosed in news articles over recent years. My name is François Proulx. I'm a senior product security engineer at Boost Security. At Boost Security we build products and services to help software development teams build secure software securely. It's really about helping them not only find vulnerabilities in their code, but also make sure that the supply chain of that software development life cycle is secure.

On this slide you can see a timeline of news articles publicizing
different supply chain breaches. Of course, that's only a subset of those that were publicized, and certainly there are many more that were never disclosed publicly. We can see an acceleration of those in the news, and it's only accelerating. The slide deck was originally built last year, when I first gave this talk, and I updated it to add some more news articles, but it just keeps growing and growing. Just type "supply chain security" in Google News and you'll see it quite often. And most certainly that doesn't mean
that it started in 2019. You can imagine that 10 or 15 years ago very advanced, sophisticated threat actors, more at the nation-state level, were already using these techniques. It's just that now it's getting much more well known.

We're going to be looking at all of those through the lens of Supply-chain Levels for Software Artifacts, which actually just went 1.0 three days ago. The SLSA model, affectionately called "salsa", is a threat model that helps identify the weakest links in the supply chain that most software is built with nowadays. Typically you will have source control, most likely Git on GitHub or whatnot, then some CI
component. Back in the day it might have been Jenkins; nowadays it's most likely GitHub Actions, CircleCI, or whatever it may be. Then there's the distribution: nowadays we typically have Docker images that are then run on Kubernetes, but that's only a subset of the many ways that things can flow from code to consumption. The consumer side in SLSA is not just about Docker images; the software could just as easily be consumed in a car, in some kind of autonomous driving system, which can obviously have pretty bad consequences. So, using this model and the timeline, we're going to look at various examples.

The first example is kind of
looking at this side of the threat model, which is really about what's happening on the developer's workstation, where new code is being produced. At the end of 2022, Dropbox disclosed that about a hundred or so private GitHub repos were accessed and source code was exfiltrated. What happened after that is not quite clear. What was publicly disclosed is that the attacker did some phishing, pretending to be a well-known CI service called CircleCI. So the attacker knew their target and crafted the phishing campaign to appear to come from CircleCI, then redirected to a fake login page
capturing the username, password, and one-time password for GitHub. It was successful, and the attacker exfiltrated source code from Dropbox. Dropbox explains that it didn't go any further, but who knows. The attacker definitely has the source code now and might find vulnerabilities that would otherwise be extremely difficult to find.

So we just saw one example where it was not necessarily the developer's workstation that was targeted; it was phishing to get to use the developer's access. In this next case, it's more that the developer is malicious. Some academics demonstrated that they were able to convince Linux kernel maintainers into accepting
malicious patches. They actually succeeded in submitting a bunch of patches: some were caught in code review, some were just ignored, and others were actually merged into the main kernel code base. Thankfully those academics were working in good faith, and those changes were rolled back from the kernel tree. But it just goes to show that they achieved their end goal, and in another situation the malicious code could have shipped in the Linux kernel.

In the next one, it was not necessarily access to GitHub that was compromised. Well, in fact this one is a bit more along that line. So the PHP code base,
they don't use GitHub apparently, or at least in 2021 they were using their own Git server, which they maintained themselves, doing their own security due diligence on it. So they took more risk upon themselves, and clearly they didn't do enough, because someone was able to compromise that Git server in some way and then commit malicious code under the name of the main PHP author, making it appear that the commit was coming from that person. People trusted that because it came from that person it must be okay. Thankfully, a few hours later some people
noticed that it was kind of strange, probably including the legitimate owner of that account, and it was rolled back. What the commit did was add a backdoor that would execute code based on a specially crafted User-Agent string in the HTTP headers. So it's kind of a logic bomb that would only be triggered under certain conditions.

Webmin is an old-school way to manage servers that used to be popular. I don't know if many people use it anymore, but I think it's still used in cheaper web hosting environments. Some researchers found a remote code
execution vulnerability in the code base. At first people thought it was just a regular remote code execution bug that someone had committed by mistake, just bad code that happened to be vulnerable. But later, researchers found out that it had in fact been introduced maliciously, had ended up in the released code, and had gone unnoticed as a backdoor for several years, or at least it says here over a year, but I think it was even more. What actually happened behind the scenes in that case is
that it was not necessarily the source control itself that was modified. Instead, the build system was changed so that it no longer picked up the code from the legitimate, trusted source control. I guess the configuration of the CI environment was overridden, maybe through an environment variable or whatever it may be, to pull from another source. That affected the actual released software that was published on SourceForge.

SolarWinds: I'm sure that's where most of you first heard the term "supply chain" used in the context of software development. For those of you who are not familiar, SolarWinds is a big American company that provides network monitoring software
that's quite popular, I guess, in U.S. government agencies for monitoring and configuring network equipment. One of their products, called Orion, was compromised through the CI environment. I think Mandiant did the forensics on that, and there's a lot of information that's been published by Mandiant and others, I think Microsoft too, so if you want to know more there's definitely a lot out there. But if we just look at it from the perspective of SLSA, the attack focused directly on the CI environment. In that case, it's my understanding that the source code was modified in memory, in a way that was not obvious: it
would clone the correct version, and then the code would get modified right before it was built, in a way that was not necessarily obvious. If you just looked at the logs, you could get fooled into thinking that it was pulling from the right source and then building it, but in fact the code was modified midstream.

Visual Studio Code: I'm sure many of you know that IDE as a tool developers use to produce code. In that case, some academic researchers found a vulnerability in the GitHub Actions workflow that's used to build Visual Studio Code. That vulnerability in the CI workflow would basically allow someone to
open a bug using the GitHub Issues feature, and the description of the bug would be parsed to try to identify a Git commit hash, I guess to automate the bug tracking, so that if someone mentions a Git commit in their bug description it gets pulled out. But that value would actually end up as a command injection in the git command that then went to look it up in the repo, and some secrets were accessible at the moment the injected command was running. Thankfully it was reported
responsibly, and there's no evidence that it was maliciously exploited. But that vulnerability in the workflow had been there for quite a while, and when you know what to look for, it was actually fairly obvious to spot. So now there are tools out there to scan GitHub at scale and find those kinds of problems.

A password manager, in 2021: I didn't even know about this one before doing this research, but some enterprise password manager was compromised, and the way it was compromised was through an open source dependency. Apparently it used some little-known,
very small open source library that had, I think, just one contributor, one maintainer, and was sort of left to rot on GitHub. The details of the actual attack were not very clear, but the attackers managed to compromise that dependency itself, and they knew that it was a dependency of the password manager. Given that it had just one maintainer, I guess that person accepted a pull request without really checking, or the attackers found some other way to modify it. Then the password manager either updated the dependency or had a version constraint that was a little lax, and
it just got picked up as part of the build process. There are many more examples of this, where an open source library that is popular but doesn't have many people maintaining it, people who really care about supply chain security or even know about it, gets co-opted into an attack on a bigger target. You literally find the weakest link: the one open source library that no one really maintains but that happens to be part of a bigger, more important piece of software.

The same thing can apply to Docker images. If you find a Docker
image on Docker Hub and you see that it's very popular but there's only one developer publishing it, you can maybe do some phishing or whatnot and replace that Docker image with a malicious one, and then everyone who builds upon that image is affected.

Log4j is also an extremely well publicized problem. In that case, it's a legitimate library that had a legitimate "feature", quote unquote, that was basically remote code execution, more or less by design. It's a very popular Java logging library that's used in pretty much every piece of Java software, and it had a special, documented feature that very few people knew about or used, and that
went really against the assumptions most people make when they use a logging library. They would never think that passing arbitrary user data, like a string, to a logging library would lead to remote code execution, of all things. But once you know about it, it's extremely trivial to exploit, so much so that most cloud providers and other vendors have built product features to mitigate Log4j at scale. It's very difficult to be confident that you've eliminated every trace of the vulnerable version of Log4j, because it's embedded so deep in the stack that often you cannot patch it easily. So that just goes to show that
external dependencies that go into your supply chain are really a big thing to keep in mind.

Dependency confusion is basically this: you have your own internal dependency in your enterprise. For instance, you build your own shared library that you use across many pieces of software and many departments, and you might call it, say, "my custom string parsing library". That name gets embedded in the package manifest along with other public dependencies that might come from, say, npm, PyPI, or whatever. In many cases, the way those package managers used to work
in 2021, and still work if you're not careful about configuring them correctly, is that they will prefer to check the public registry first. So say you have this internal library that you depend on, and it never existed in the public registry. Someone, maybe an ex-employee who leaves the company and knows your internal naming scheme, or someone who looks at some public logs or documentation, might figure out your naming scheme and then put a malicious version of that library on the public registry with the exact same name. Then, when the package manager goes to pick up the dependency, it will first
check the public registry and get confused, hence the name dependency confusion. So if you use internal dependencies, you need to make sure you configure your package manager to fetch from the internal registry first and to fail hard if it doesn't find them there, instead of falling back to the public registry when you know the package is not supposed to come from there. Of course, you can have your own internal registry with everything in it and prevent pulling from public registries. But how do you make sure that also applies on developers' laptops? Maybe it does in the CI environment, but does it also in the local environment?
Maybe not, and then you go back to the beginning of the chain and the developer machine gets compromised.

Codecov is a very popular software-as-a-service product that helps teams do code coverage: you run your tests in your CI and you'd like to have a pretty display of the code coverage. The way they would tell people to install and configure it in their CI environment, back then in 2021, was basically your good old curl piped to bash, which didn't have any kind of code signature attached to it. What happened is that some threat actor
managed to modify the version of the Codecov script that was hosted on some CDN. It went unnoticed for quite a while, long enough that for some number of months the threat actor was able to exfiltrate sensitive secrets that were only visible at build time. That was a pretty big thing for Codecov. I think they're still in business; they were pretty transparent with the disclosure and managed not to completely lose the trust of their users. But I think it was a wake-up call for many people who use different dependencies in their CI environment.

Next, direct attacks on package
managers, or rather package registries of whatever kind. In fact, this one highlights another variant of that. In 2008, some academic researchers focused their attention on Linux distributions. Often you go to download an ISO, you look at the list of mirrors, and you pick your closest university, your favorite football team, whatever, because it's the fastest: it's closer to you, so you get a good speed to download a large file. But how do you know that this local university is really taking seriously the security of the file server that's hosting those
ISOs? You have to be extremely careful to check cryptographic signatures everywhere. And even if the files are signed, how do you get the public key in the first place, or the list of hashes? Are they also coming from that same mirror? If so, how can you trust them? This research was highlighting that issue. In effect, it's potentially just fine to download the ISO from a potentially compromised source, as long as you can trust the way that you then verify the software package through some signature or hash.

In this case, we're highlighting
more what happens inside those registries. There are tons of malicious packages that use so-called typosquatting: they're published under the name of a popular package, but with slight typographic modifications. For instance, a very popular Python package for doing HTTP requests is called requests, with an "s", but there is another one called request, without an "s", and then other variations that are very easy for someone typing fast to hit, like request with two "u"s or whatever. They all exist, or at least existed at some point in time, on the registry, and they keep being published: the registry takes them down,
people publish again. It's a cat-and-mouse game with pretty much all the package registries out there. There's evidence that typosquatting has been done for quite a while, easily all the way back to 2016, quite a number of years before it became something that researchers made news about. Some of the more recent interesting examples of typosquatting were, for instance, packages stealing AWS keys. Similarly, in the case of Docker Hub, there were thousands of malicious Docker images with small variations in the name, like Ubuntu with an extra "u" or whatever. So that's really something to look out for in
basically any kind of public registry.

Twilio is a software company that offers services especially in the realm of telephony, to send SMS messages, make phone calls, and things like that. Pretty cool thing. They have various libraries, some of which you can use in JavaScript directly in a web page. The file storage on S3 that was publishing one little JavaScript library was compromised, the library was modified, and people that were consuming the library were affected for some hours before it got fixed. The malicious version of the library contained some kind of, I think
it was a Magecart payload. I'm not very familiar with that, but it's a well-known piece of malware. So when you depend on JavaScript libraries loaded directly from a CDN, instead of copying them into your own static asset bundle, you can specify a cryptographic hash in the HTML that loads the script, to make sure that it is pinned to a known version of that library. If the file gets changed, the web browser will just refuse to load the JavaScript library. So that could be one way, or you can just copy that version into your bundle itself.
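
The hash pinning described here is what browsers implement as Subresource Integrity, through the `integrity` attribute on a script tag. As a rough sketch only, the value for that attribute could be computed like this in Python. The function names, the example URL, and the sample file contents are made up for illustration and do not come from the talk.

```python
import base64
import hashlib


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Return an integrity value of the form '<algorithm>-<base64 digest>'."""
    if algorithm not in {"sha256", "sha384", "sha512"}:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def script_tag(url: str, content: bytes) -> str:
    """Build a <script> tag pinned to the exact bytes in `content`."""
    return (
        f'<script src="{url}" '
        f'integrity="{sri_hash(content)}" crossorigin="anonymous"></script>'
    )


if __name__ == "__main__":
    # Hypothetical library body and URL, for illustration only.
    reviewed_copy = b"console.log('hello');\n"
    print(script_tag("https://cdn.example.com/lib.js", reviewed_copy))
```

The hash has to be computed from a copy of the file you have actually reviewed. If it is computed from whatever the CDN serves at build time, a file that was already tampered with gets pinned just as faithfully.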
And then it does get worse. So far we've focused on one area of the threat model at a time, but you can imagine second- and third-order attacks, where you start at one end of the threat model and actually affect other parts of it. For instance, in this case: an IDE like Visual Studio Code has plugins, or extensions. It was proven that some of those plugins could be compromised, that the supply chain of those plugins could be modified, again basically through a dependency. What's interesting in that case is that it's a dependency of a plugin, which then is a dependency of
an IDE, which is then used by developers to write new code. So you can see that it goes deeper and deeper down the rabbit hole. What happens then is that the developer's workstation is basically compromised, so code could be modified right at the source. It looks like it comes from the original, legitimate developer, but it could have been modified.

Another example of the same nature, an IDE that was compromised, was actually a worm scenario. Not only was the plug-in for the NetBeans IDE compromised in that case, but once you ran NetBeans and, let's say, shared your
project on GitHub, and people started to download it and open it in NetBeans, then the mere fact of opening the project in NetBeans would execute code, which would then spread virally, like a worm, through your other NetBeans projects. They themselves would get tainted with this malware and would replicate downstream. In fact, GitHub added some special rules to detect those NetBeans project files that contain the malicious strings, to prevent that from happening if someone is using an old version of NetBeans.

So what can we do about all that? It's really about thinking not just about application security, but about application security's security.
We've been getting better and better at building secure software. Hopefully you're doing threat modeling, static code analysis, dynamic testing, code review, all that good stuff, and getting better at it over the years. Ten years ago you would have people who just didn't know about SQL injection; now hopefully everybody knows about it, although that's still not necessarily the case. The threat actors know that it's getting harder and harder to find that low-hanging fruit, so they focus their attention on the weakest link, which is the supply chain, because there's so much there and it's so much richer:
there are so many more problems in the supply chain that, as a defender, you're kind of out of luck. When we start to think about the security of all those downstream dependencies, it becomes almost an impossible problem to fully address, unless you freeze all the code, review every single line, go in a bunker, and then what? Never touch your software again? That doesn't happen in real life.

So basically, you want to make sure that the developer's workstation is trustworthy, which is not easy. You want the CI environment to also be trustworthy: you harden
that environment so that there's no way to modify its behavior, and you can then trust that the CI environment is really the one that you expected to build the software. You want to code sign and then later verify the code signature. But not just that: you also want to verify the attestation, as they call it in SLSA. The CI environment will attest to the software bill of materials and to the configuration of the CI environment, so that someone downstream who depends on the software can verify that it actually was built in a legitimate CI environment, that it has all the expected dependencies, and that the code signature is all good. It's all about finding the
weakest link for threat actors, because there's just so much more to find.

So overall, what can we do more concretely? In terms of source control, you can add branch protection rules in GitHub. There are even tag protection rules, which are a little less well known but should not be underestimated, especially if you use Git tags for semantic versioning. Say you have version 1.2.3 as a Git tag, and downstream you depend on the fact that it's a specific tag. By default, any developer that has write access to a GitHub repo can actually force-push the tags. So unless you put a tag protection
rule in place, those tags are not trustworthy. You might have branch protection, which guarantees that the code that goes into the default branch has been code reviewed, but then the tag can still get overwritten.

Then: multi-factor authentication for access to source control. Make sure that every commit is signed. Do static code analysis and dynamic code analysis. Check your dependencies for known CVEs. Check your production environment for misconfigurations; hopefully you're using infrastructure as code. Check for secrets that might be in the source code, and for Docker images that have known CVEs. Check the quality of those downstream dependencies: again, can you trust the transitive
dependency three levels down, that tiny little library? You think you're depending on something trustworthy, but it in turn depends on something that is less and less trustworthy. So you want verification and attestation, at every step of the way, of the SBOM, the software bill of materials. That's quite a lot to tackle.

My team has been doing some research over the past year focusing on many of those environments. One article that we published was focused on GitHub. We published the actual attack tree, built using a tool called Deciduous; if you don't know it, it's a pretty cool tool. So we have the Deciduous tree, and you can actually
contribute improvements to it. It really tries to exhaustively list all the weaknesses, the things that could go wrong in the GitHub configuration that an attacker could leverage, and then all the mitigations to lock that down. For instance, the tag protection rule is one such attack-versus-mitigation pair that's in the graph. Same thing for GitHub Actions, and we've been doing the research for other CI systems, like CircleCI and whatnot.

There are several initiatives at various levels trying to tackle this supply chain security problem. The U.S. government has been very much interested, especially since the SolarWinds attack: the U.S. DoD published some guidelines, and NIST published a bunch of guidelines as well, as did CIS and many vendors. In this case I'll just highlight npm: just this week they embedded attestation as part of npm, so now you can build using SLSA, provide an attestation, and the npm registry will verify that the attestation is valid. The Open Source Security Foundation published this SLSA model, but they also have a bunch of other tools which are pretty nice, such as Scorecard: you point it to a GitHub repo, and it will tell you about the security posture of that repo and give you guidelines on how to
harden the configuration. They have another project called Allstar, which does basically the same thing but for a whole organization. They have many more things that you might want to look at.

If I have some extra minutes... good. We have, as I said, an extra little zero-day. It's been patched, as I said, but it's still an interesting new vector that, to my knowledge, had not been publicized up until today. So I guess you're the first to know publicly about this.

Terraform, for those of you who don't know it, is a tool that's used to do infrastructure as code. You have cloud providers like AWS, Google
Cloud, and Azure, and you can provision resources in those cloud environments using Terraform, as code. You have manifests in a GitHub repo that define how to configure virtual machines, firewalls, and whatnot, in a way that is very nice: not necessarily cloud-provider agnostic per se, but at least the language is. Once you have someone who's proficient in Terraform, they can turn around and replicate the same thing in Azure, Google Cloud, or AWS very easily. Of course, the specificities of each cloud provider make it so that it's not just copy-paste, but almost.

Terraform has this concept of providers, for which they actually provide
some kind of cryptographic verification. What providers are is the glue code between Terraform, which is a generic tool, and a given target, like a cloud provider for instance. In many cases the provider is maintained by, let's say, AWS or Google for their own environments, but people can publish new providers too. Whatever the case, there's a public registry, again, which contains providers but also modules.

What are modules? Modules are basically another level of abstraction. When you start to build complex infrastructure, instead of just defining one little virtual machine and one firewall, you might want to build something more complex, like a cluster with a firewall and then other
configuration with load balancers, etc. People define best practices and then publish them as modules, and other people just depend on those modules. That's much easier than creating everything from scratch, since most people end up doing something quite similar, especially in startups. With those modules, you just define the extra attributes and parameters that make the minor differences between those infrastructures. But those modules, unlike the providers, are not subject to cryptographic checks or hash verification. So someone can publish a module and, if the actual underlying Git commit changes... As I said, it comes back to this problem about tags: the problem
is that the module depends on the Git tag, not on the Git commit hash specifically. So someone can change the commit that a tag points to, and effectively the behavior of that module can change entirely. To my knowledge, this is not detected by the registry. I was able to prove at scale that it can be exploited: I found about a thousand modules that were exploitable through a so-called "pwn request".

GitHub has an actual blog article documenting a feature of GitHub Actions that is dangerous. They know about it and they tell people about it: this is a powerful feature, and you need to know what you're doing. Typically, when
you have pull requests, they're coming from God knows who, so this is potentially untrusted code, and you don't want your CI environment to pick up that code and behave differently. So what GitHub Actions does is not expose secrets to a workflow when it's triggered by a pull request from a fork. Say you have a public repo with forks, and you receive contributions from people who forked the code, people you maybe don't know. Typically, in the workflow you will use the pull_request trigger, and that will not expose the secrets to that workflow. But
if you want to expose the secrets, because you need them to verify the code, to run static code analysis or whatnot, you can use pull_request_target, which is actually going to expose the secrets. With great power comes great responsibility: if it's not done carefully, an attacker can exfiltrate secrets, and worse.

What happened with Terraform modules is that I was able to prove that thousands of modules, totaling hundreds of millions of downloads, were affected. I computed that some of those are downloaded more than once per second, so I can imagine that there are people out there who, at scale, modify their infrastructure every second,
and through this exploitation scenario, let's say the firewall configuration could end up not being what they expected. Even more so, if someone were to review the Terraform plan, the module could still be modified right after the change gets merged to the default branch, and the plan would then be executed directly using the malicious module.

This has been responsibly disclosed. They actually worked with us quite well: they fixed the modules that were vulnerable, and went further downstream based on our recommendations, so we were able to provide even more guidance to harden the pipeline for the future.

So what can we do about that? There's a GitHub issue on Terraform asking to actually
pin the hash, like it does for the providers, but this request has been there for several years. This issue is from 2021, but the first one is, I think, from 2017. So if you care about this kind of thing, maybe you can go add a thumbs up on that GitHub issue, and maybe one day it will be fixed. In the meantime, I guess the thing to do is maybe to fork those modules, because otherwise there's no way to be sure that the module will not be modified when you least expect it. Thank you.
Any questions?
OpenSSF Scorecard, yeah. So for that zero day, could you pull the module, review it and hash it, and then every time you do a build, just compare the hash against the one from the morning review? Would that work? So, potentially yes, I see what you mean. You could basically add an extra step which would clone it before Terraform would use it, yes. But then you would need to modify your Terraform to depend on that version, which in that case would be kind of a temporary folder, and you need to actually modify the Terraform to point to that. So in the end it basically comes down to the recommendation that I gave, that in
the meantime you kind of need to copy it and maintain it in your own source tree. Yeah, yes?
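The hash-and-compare idea from that question can be sketched roughly as follows. This is only an illustration, not something shown in the talk: the function names `tree_digest` and `module_matches_review` are made up for the example, and it assumes the module has already been cloned or copied into a local directory that Terraform is then pointed at.

```python
import hashlib
from pathlib import Path


def tree_digest(root):
    """Return a SHA-256 hex digest over the relative paths and contents of all files under root.

    Hypothetical helper for illustration; the .git directory is skipped so that
    only the module's own files contribute to the digest.
    """
    root = Path(root)
    files = {}
    for path in root.rglob("*"):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            if rel.split("/")[0] != ".git":
                files[rel] = path
    h = hashlib.sha256()
    # Sort by relative path so the digest does not depend on directory listing order.
    for rel in sorted(files):
        data = files[rel].read_bytes()
        h.update(rel.encode("utf-8") + b"\0")
        h.update(str(len(data)).encode("ascii") + b"\0")
        h.update(data)
    return h.hexdigest()


def module_matches_review(root, reviewed_digest):
    """True if the module directory still has the digest recorded at review time."""
    return tree_digest(root) == reviewed_digest
```

A pipeline step could record `tree_digest(...)` once after a manual review and refuse to continue whenever `module_matches_review(...)` returns False. As noted in the answer, this only helps if Terraform is actually pointed at the verified local copy.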
Just practically, that this, there's like a visibility clips kind of, you know what I mean, like after the person. And then the other question is: we have a tendency to think, oh, that happened in 2020, it happened in 2021, like it's in the past. Which of these, of this great comprehensive list that we looked at, did you see as either fresh and still a problem, or kind of the beginning of something that we're going to continue to see? All of the examples I gave, they're still actionable. All of them could be reproduced. A threat actor could look at the news
article and say, I'm going to go do that this weekend, just do the same thing. In many cases it's just like, ah, okay, now I know how to exploit a vulnerable GitHub Action, I'm just going to go scan GitHub at scale and find all of those very popular open source packages that are unmaintained, or gullible, easily phishable targets, whatever. So that's just one example, but it's like that at pretty much every step of the way. The one thing is the Open Source Security Foundation: they're trying to tackle that at scale. So they have two projects, called Alpha and Omega, so sister
projects. Alpha is targeting the top 1,000 most popular packages, like OpenSSL for instance, as an example. So they will actually work with those open source maintainers and make sure of the security posture of their repo and CI, and that they do attestation and all that, and provide training material to the people, all this advocacy. But that's focusing on a subset of the most important, juiciest targets. And the Omega project is the long tail, those thousands and thousands, millions of packages. For those they provide things like the Scorecard, for instance, and they actually do kind of mass scanning of GitHub and open pull requests on those packages that are not in the
top 1,000, but they will suggest a pull request that includes the Scorecard and say, hey, actually if you added branch protection and fixed that vulnerable workflow, you could improve the security posture. Right now it's like, maybe I'll merge it.
Yeah, that is just a subset. The Scorecard is really focusing on low-hanging fruit that is very easy to scan; it's like static code analysis.
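To give a feel for what such an easy-to-scan, low-hanging-fruit check could look like, here is a minimal sketch of a text heuristic for the pull_request_target pattern discussed earlier. It is an assumption-laden illustration, not how the Scorecard itself is implemented: the function name `looks_risky` is made up, the comment stripping is naive, and a real scanner would parse the YAML properly. It flags a workflow that both triggers on pull_request_target (so secrets are exposed) and references the pull request's head ref or SHA (so untrusted code may get checked out).

```python
import re

# Trigger that runs in the base repository's context, with secrets available.
TARGET_TRIGGER = re.compile(r"\bpull_request_target\b")

# Expressions that point at the untrusted pull request head.
PR_HEAD_REF = re.compile(
    r"github\.event\.pull_request\.head\.(sha|ref)|github\.head_ref"
)


def looks_risky(workflow_text):
    """Heuristic: True if the workflow uses pull_request_target and references the PR head.

    Hypothetical helper for illustration. Comments are stripped naively by
    cutting each line at the first '#', which can misfire on '#' inside strings.
    """
    body = "\n".join(line.split("#", 1)[0] for line in workflow_text.splitlines())
    uses_target = TARGET_TRIGGER.search(body) is not None
    references_head = PR_HEAD_REF.search(body) is not None
    return uses_target and references_head
```

Run over the files in a repository's `.github/workflows` directory, a check like this only produces candidates for manual review; a match does not prove the workflow is exploitable, and a miss does not prove it is safe.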
oh yeah this is more dominance
Yeah, exactly, in January. I didn't include that CircleCI, it