Sharks in the Water: Open Source Component Risk and Mitigation

Name: Sharks in the Water: Open Source Component Risk and Mitigation
Uploaded: 2020-03-09
Duration: 23 min 48 s
BSidesSF 2020 · 23:48 · 107 views · Published 2020-03

Speakers

Aaron Brown

Tags

Category: Technical

Topic: Supply Chain Security

Style: Talk

About this talk

Aaron Brown - Sharks in the Water: Open Source Component Risk and Mitigation. Navigating the Open Source Component (OSC) supply chain can be murky and unforgiving. Gain an understanding of how recent hacks could have been prevented by proper management of OSCs through education, awareness, and automated tooling.

Transcript [en]

So thank you for joining me here so late in the day, and so close to happy hour, for my talk, Sharks in the Water: Open Source Component Risk and Mitigation. Okay, off to a good start. So who is this Aaron guy? Prior to moving into security I was a full-stack engineer building products, and on that journey I was introduced to the security champion program. From there I grew a deep fascination and passion for security, so much so that I decided to jump in full-time and start working as a security engineer at Sisense, where I focus on hardening our cloud assets, manage our bug bounty program, and collaborate closely with our platform team to dig into the

nitty-gritty of our ever-increasing codebase, and build and integrate tooling that makes sense for us. But enough about me. This compass is our meta navigation table, which helps set a course for where we're going today. We'll start with the component risk landscape, diving into the numbers of the open source software supply chain (which I'm calling OS3C, mostly because I got tired of typing OSSSC), its attack surface, and why this is important. Then we'll make our way to manually navigating murky waters and how placing trust in our teams is paramount to success on this journey. Finally, we'll wrap it up with creating an autopilot system that will complement the process that we've already put in place. So by

the end of this talk you should have a solid understanding of the risk, how to navigate the murky waters of the open source supply chain, and how to provide the necessary context and knowledge to stakeholders of all kinds across your organization. Let me begin this section by saying that while these numbers may make it seem like I am bashing the open source community, I'm certainly not. I love open source. I use it constantly, especially in all of the products and tooling that I'm building. There's a lot of good in open source software: it increases engineering velocity, allows for near-limitless extensibility, and is generally well tested. However, not all components are

created equal, and as we progress through today's talk we'll see why education and awareness among engineering teams is in many cases more important than the automated tooling we love to build into our pipelines. To fully understand the OS3C landscape, I think it's important to first get an idea of the growth that has occurred. From 2017 to 2019 there has been 75% growth in newly released packages across all repositories, in just two years' time. Two of the leaders in terms of percent increase are npm and crates.io, which is Rust's repository. JavaScript's npm repo, with over 8,000 new people contributing annually, has grown at a rate of 109 percent over two years,

so from roughly 400,000 packages to well over 836,000, and as of today it's well over a million. The fastest of them all, though, is Rust's crates.io package registry, growing from 7,000 packages to over 25,000 in just two years' time. This is pretty rapid growth, so much so that the engineering community has had access to almost 22,000 new component releases every single day since 2018. There's an impressive amount of work being done across all industries, and most of it is fueled by the open source community. It affords us much more rapid prototyping and development, and it increases our velocity. In other words, our time to deliver software has accelerated at a phenomenal rate. We can

pull down and spin up a database just by running one docker command. We can extend that functionality, which is the core of open source software, where engineering teams are constantly coming up with new and unique ways to use components, building atop them to fit their needs. In general, open source software has integrity: it's well tested and it's widely contributed to. As with anything, though, there comes an inflection point which forces balance back into the universe. Enterprise organizations download roughly 313,000 new OSC releases every year. Of those downloads, 28,000 have at least one vulnerability, and of those vulnerabilities, 18,500 have CVSS scores at seven

or above, and 8,000 of those have scores at or greater than nine. So the chance of one vulnerability in one popular component infecting some of the largest companies in the world is very real. In fact, as I'm sure everyone in this room is already aware, it has already happened, and it has happened more than once. Open source components contributed to 31% of breaches in 2018. These made up a lot of what could be called targeted breaches. You can make the argument that all breaches are targeted, but we all know the amount of crap adversaries throw at our firewall just to see if it sticks. So when I say targeted breach, I mean those

that specifically target open source components. With those numbers in mind, it becomes very clear why attackers target open source components, but let's dive in a little bit deeper. There is broad adoption of open source components: 97% of code in modern web applications comes from npm, which has a vulnerability rate of 51 percent. Let me repeat that: 97% of code in modern web apps comes from npm, and the majority of the packages downloaded from there have known vulnerabilities. This ties directly into what I mean by pervasive. There are of course different ways we could unpack the word pervasive in this context. Instead I'll just ask, by show of hands, how many of

you have dependencies you can't update? Yeah, all right. This is exactly the reason why we see vulnerable versions of Apache Struts still being used in the wild. It's crazy. If you're not updating regularly, it's going to become more and more painful. The ability to update quickly becomes critical when adversaries have compressed the time between vulnerability announcement and exploit from 45 days to 3 days, and in the case where they are directly injecting malicious code into open source projects, which is not just theoretical, it drops to zero, a zero-day vulnerability. So how do we do all this as a security team? Well, we don't. It's all about placing trust in our teams and our counterparts across the organization.
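As a quick sanity check on the figures quoted in this part of the talk, the growth rates and the share of vulnerable downloads can be recomputed from the package counts. This is only a minimal sketch: the inputs are the rounded numbers as spoken, not exact report data, and the helper names are made up for illustration.

```python
def percent_growth(before, after):
    """Percentage increase going from `before` to `after`."""
    return (after - before) / before * 100


def percent_share(part, whole):
    """`part` expressed as a percentage of `whole`."""
    return part / whole * 100


# Rounded figures as spoken in the talk (approximate, not exact report data).
npm_growth = percent_growth(400_000, 836_000)
crates_growth = percent_growth(7_000, 25_000)
vulnerable_share = percent_share(28_000, 313_000)

print(f"npm package growth:       {npm_growth:.0f}%")        # 109%
print(f"crates.io package growth: {crates_growth:.0f}%")     # 257%
print(f"downloads with a vuln:    {vulnerable_share:.1f}%")  # 8.9%
```

The npm result matches the 109 percent quoted, and the crates.io result is consistent with it being described as the fastest-growing registry.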

So the first step in this journey is: don't be an Ahab. We're not going to be able to control everything. As Frederick Lee mentioned this morning, our engineers are the experts on our codebase; they're the ones who have all of the context. Our engineers and our product leads are our champions and our advocates. Remember, your development team is not your white whale. To keep your application, and the people who power it, afloat above the depths, it's our role to guide the wheel and lean in when requested and when it's necessary. This comes in three parts: building education, raising awareness, and creating partnership. Education is all about finding space. If your organization has lightning talks or lunch and learns or

some other forum in which you can share, that's a great space for sharing the knowledge. Otherwise, create a Slack channel to share hacks, but also put together a write-up and outline the cause and remediation as well. Lead in-depth, comprehensive sessions about all the cool stuff you're building in the DevSecOps space. Create a time and a forum to engage, no matter what you want to teach on. Share your risk evaluation techniques to help your devs and your product partners better understand the risk of a feature or component, and let them drive it. Awareness: expose a dashboard with all the aggregated scan results in one place, and open it up for your stakeholders, like your execs, your product leads, and your engineers, to view.
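The dashboard idea above boils down to rolling scan findings up into a view stakeholders can read at a glance. A minimal sketch of that roll-up follows, assuming findings have already been collected as dictionaries; the field names, team names, and component names are all hypothetical.

```python
from collections import Counter, defaultdict

SEVERITIES = ("critical", "high", "medium", "low")


def summarize(findings):
    """Count findings per team and severity, filling in zeros for empty cells."""
    counts = defaultdict(Counter)
    for finding in findings:
        counts[finding["team"]][finding["severity"]] += 1
    return {
        team: {sev: by_sev[sev] for sev in SEVERITIES}
        for team, by_sev in counts.items()
    }


# Hypothetical scan results gathered from several tools.
findings = [
    {"team": "platform", "component": "lib-a", "severity": "high"},
    {"team": "platform", "component": "lib-b", "severity": "low"},
    {"team": "web", "component": "lib-a", "severity": "high"},
    {"team": "web", "component": "lib-c", "severity": "critical"},
]

for team, row in sorted(summarize(findings).items()):
    print(f"{team:10s}", "  ".join(f"{sev}={n}" for sev, n in row.items()))
```

Keeping the summary as plain per-team counts makes it easy to feed into whatever dashboard tool an organization already uses.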

You'll be surprised at some of the outcomes you can get from that. Add your engineering team to your GitHub security alerts; if you're not aware, this is as easy as flipping a switch in your organization. From there you can actually set up an automated bot that will update a lot of the components for you. Typically, if they're transitive, you won't get much help there, but at least you can get the direct dependencies updated with a bot. Set up Slack channels for vuln alerts and share the knowledge; you'll probably find that most of your engineers will proactively update. Last but not least is partnership. Partner with your architects, your platform team, and your engineers. This can help you find

gaps in your own understanding and put you in a position where they're comfortable coming to you with issues. Communicate vulnerabilities to your teams. The best defense against attack is sharing what has come before. Hoarding or restricting knowledge is not where we should focus; we should focus on getting better together. Build a shared model of responsibility; any of you who are familiar with AWS will be familiar with this phrase. Security is about ownership across many levels, not just one. As has been mentioned multiple times today, we cannot be everywhere, and I don't know about you, but I don't want to be everywhere. With partnership in mind, this is a simple but helpful component evaluation tool. It's a

great way to engender the idea of shared responsibility. Encourage your team to use this to evaluate their imports prior to consumption. Last updated: when was the last time the component was updated? This can easily be seen by navigating to the changelog within the repo, or typically you can just look at the file contents themselves and see when they were last updated. If you see a component has not been updated for four years, it's likely not something you want inside your codebase. Popularity: what kind of attention does it get from the open source community? If it's on GitHub, are there stars and watchers? How is the documentation? Does it seem like you can work with this? And for vulnerabilities, I

recommend they go to resources like CVE Details or NIST's National Vulnerability Database to search for known vulnerabilities within those components. Check out the issue tracker as well; it's a primary metric that I like to use. Are the maintainers responsive to the community? Do they respond thoughtfully and help consumers with their product and their API? If it's an unpopular project there may not be many issues; however, if it's a popular one and you see a lot of issues that are not getting resolved, that may tell you that it's not something you want to work with. Thoughtful imports are the intersection of them all. This is where you can talk to your eng team about being mindful

about what they are importing. One of the first things I bring up is the unfortunate scenario of left-pad. I'm sure a lot of people in this room have heard of it, but if you haven't, it's an npm package that the developer unpublished. It was an 11-line piece of code that, once it was unpublished, broke Node, Babel, Travis CI, and a lot of other applications and tools. Open source tooling is all about not reinventing the wheel, but in my humble opinion there are wheels and there are blunt tools. left-pad was a blunt tool that could have been written instead of imported. Applying autopilot is where we get to chat about that sweet, sweet

automation. But remember, automation is only a complement to all the other processes we've already established, like partnership and education. The first piece, scanning, is the most straightforward. Why do I say choose a tool? Even for myself, as someone who loves to build and write code: unless you have unlimited time or resources, building your own scanner is just not a viable option. Instead, you're going to choose a tool, maybe even an open source one; you can see if it can scan itself. This, in my opinion, is a wheel: no need to spend the man-hours building something that can be acquired relatively cheaply. You also need to decide on your trigger. Will it

be a cron job running in a Kubernetes cluster, or will it be triggered by every PR into your master branch? Either way, the output needs to be consumable, so configure it in such a way that allows for easy ingestion. This will allow us to model the data and reduce unnecessary noise. It gives us more agility, so we can change and react to our software development lifecycle in a more predictable and controlled way. It allows us to act on our own best assumptions and even adjust some of our worst ones down the line. We can also use it to reduce noise. It's possible to tweak your scanning tools to reduce the noise as much as possible during configuration.
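One way to picture "consumable output" and noise reduction is a small ingestion step that maps raw scanner records onto a shared model, drops findings the team has explicitly accepted, and removes duplicates. The sketch below is an assumption about how that might look, not the speaker's implementation; the raw record shape, the `Finding` fields, and the placeholder IDs are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    component: str
    version: str
    vuln_id: str
    severity: str


def normalize(raw):
    """Map one raw scanner record (hypothetical shape) onto the shared model."""
    return Finding(
        component=raw["package"].lower(),
        version=raw["version"],
        vuln_id=raw["id"].upper(),
        severity=raw.get("severity", "unknown").lower(),
    )


def ingest(raw_records, suppressed_ids=frozenset()):
    """Normalize records, drop suppressed IDs, and de-duplicate, keeping order."""
    seen = set()
    result = []
    for raw in raw_records:
        finding = normalize(raw)
        if finding.vuln_id in suppressed_ids or finding in seen:
            continue
        seen.add(finding)
        result.append(finding)
    return result


# Hypothetical raw output; the second record repeats the first with different casing.
raw_records = [
    {"package": "Lib-A", "version": "1.0.0", "id": "vuln-0001", "severity": "High"},
    {"package": "lib-a", "version": "1.0.0", "id": "VULN-0001", "severity": "high"},
    {"package": "lib-b", "version": "2.3.1", "id": "VULN-0002", "severity": "low"},
]

for finding in ingest(raw_records, suppressed_ids={"VULN-0002"}):
    print(finding)
```

Normalizing before de-duplicating matters here: the two `lib-a` records only compare equal once casing differences between scans are removed.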

However, there are instances where we want more fine-grained control. Building an ingestion API gives us the ability to shape and transform the data any way we like, and this is what makes data modeling so important. Upon ingestion, we need to be able to pull out that data and make sense of it. So unless you're using a really cool tool that can materialize those views for you on the fly, where you don't have to worry about remodeling your data, it's an incredibly important thing to get right from the start. Report: this is where we aggregate, visualize in intelligible ways, and distribute to stakeholders. This is a good place to build a high-level executive dashboard

so there are no more ad hoc reports and everything will be at their fingertips. And share: since we've already established a partnership with our engineering counterparts, we can work with them to get deeper insights into the findings and gain more learning, and then we repeat the process all over again. So thank you, BSidesSF, for inviting me to speak here today, and happy ten years. Thank you to my team, who is out here, and thanks to all of you for being here today. I'll see you at happy hour. [Applause]
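The component evaluation questions from the talk (last update, popularity, documentation, known vulnerabilities, and maintainer responsiveness in the issue tracker) can be sketched as a small checklist function. Only the four-year staleness figure comes from the talk; the star and response-ratio thresholds, the field names, and the example component are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Component:
    name: str
    last_updated: date
    stars: int
    has_docs: bool
    known_vulns: int      # e.g. from a manual CVE Details or NVD search
    open_issues: int
    answered_issues: int  # open issues that have a maintainer response


def concerns(c, today, max_age_days=4 * 365, min_stars=100, min_answer_ratio=0.5):
    """Return human-readable concerns; an empty list means none were found."""
    found = []
    if (today - c.last_updated).days > max_age_days:
        found.append("not updated in roughly four years or more")
    if c.stars < min_stars:
        found.append("little community attention")
    if not c.has_docs:
        found.append("no usable documentation")
    if c.known_vulns > 0:
        found.append(f"{c.known_vulns} known vulnerabilities")
    if c.open_issues and c.answered_issues / c.open_issues < min_answer_ratio:
        found.append("maintainers rarely respond to issues")
    return found


# Hypothetical component under review.
candidate = Component("example-pad", date(2015, 6, 1), 12, True, 0, 40, 5)
print(concerns(candidate, today=date(2020, 3, 1)))
```

A list of concerns, rather than a single score, keeps the decision with the engineers doing the import, which is the shared-responsibility point the talk makes.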

Do you mind repeating that? Yes, so WhiteSource. I would say that I've had to build a whole lot of automation around it in order to get it to function the way that I want it to. However, I would say it's a reasonably affordable one that was on the market for us, and there are some decent open source ones out there as well; OWASP has one.

Curious how your analysis might change if you include those security issues? So, I want to make sure I understand the question: how would I work with the security issues?

Yeah, the numbers that I was using came from an open source software supply chain report put out by Sonatype. If I were to apply my own automation to it, I would definitely pull down the security issues that I get from packages. I don't know how that would change the numbers, but I imagine they would go up substantially. Does that answer your question? Yes? Right, it's a hard question, and I say it's hard mostly because it's something that you need to decide upon within your own organization. The way that we address it is we look at medium and up, and that's just because there's a lot of other work that we're trying to

get done. We work on addressing the medium, high, and critical risks first, and that is what we bring up in the dashboards as well. We pay attention to the lows and try to address them over time, but they are prioritized much, much lower. Yes?
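The "medium and up" policy described in this answer can be sketched with the CVSS v3.x qualitative rating scale (low 0.1 to 3.9, medium 4.0 to 6.9, high 7.0 to 8.9, critical 9.0 to 10.0). Using CVSS base scores as the input is an assumption, since the answer does not say how severity is assigned, and the component names and scores below are hypothetical.

```python
def severity_band(score):
    """Qualitative CVSS v3.x rating for a base score between 0.0 and 10.0."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS base score out of range: {score}")
    if score == 0.0:
        return "none"
    if score < 4.0:
        return "low"
    if score < 7.0:
        return "medium"
    if score < 9.0:
        return "high"
    return "critical"


def triage(findings):
    """Split (name, score) pairs into work for now (medium and up) and a backlog."""
    now, backlog = [], []
    for name, score in sorted(findings, key=lambda f: f[1], reverse=True):
        band = severity_band(score)
        target = now if band in ("medium", "high", "critical") else backlog
        target.append((name, band))
    return now, backlog


# Hypothetical findings as (component, CVSS base score).
findings = [("lib-a", 9.8), ("lib-b", 3.1), ("lib-c", 5.4), ("lib-d", 7.5)]
now, backlog = triage(findings)
print("address first:", now)
print("backlog:", backlog)
```

The threshold is a policy choice; moving it only requires changing which bands count as "now".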

Yes, yeah, I would use the dependency scanner that OWASP puts out. Jeremy Long is the maintainer on that, and that's the one that I would recommend.

It's hard, like, continuously monitoring all those dependencies. Any ideas or tips that can help with that kind of monitoring once you've decided to use it? Yeah, so transitive dependencies are an incredibly hard thing to manage. You have to go through different layers to find that resolution path to update, and oftentimes, working down that path, you'll end up breaking your product along the way. I don't have a really good answer for how to manage transitive dependencies, outside of setting something up in a yarn.lock or a package.json, where you can actually set up the resolved dependency routes, and then I would

just make sure you're not breaking your test suites. There is a tool out there, though, that will actually climb that dependency resolution path for you and update it in an automated way.
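The package.json approach mentioned in this answer most likely refers to Yarn's `resolutions` field, which forces a chosen version of a transitive dependency; that reading is an assumption, since the tool is not named precisely. The sketch below adds such an entry to an in-memory manifest. The package names and versions are hypothetical, and the test suite still needs to be run afterward, as the answer stresses.

```python
import json


def pin_transitive(manifest, package, version):
    """Return a copy of a package.json dict with a `resolutions` entry added."""
    updated = dict(manifest)
    resolutions = dict(updated.get("resolutions", {}))
    resolutions[package] = version
    updated["resolutions"] = resolutions
    return updated


# Hypothetical manifest; in practice this would be loaded from package.json.
manifest = {
    "name": "example-app",
    "version": "1.0.0",
    "dependencies": {"some-direct-dep": "^2.0.0"},
}

patched = pin_transitive(manifest, "some-transitive-dep", "1.2.3")
print(json.dumps(patched, indent=2))
```

Returning a copy rather than mutating the input makes it easy to diff the before and after manifests in a review.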

I see, over time. So that goes back to reporting and having a visualization, or at least some kind of tracker. That way you can have an age for a vulnerable component, and you can point to the different teams that are using it to ask if there's something they can use in lieu of this component, especially since there are a few out there that keep popping up for us that have vulnerability after vulnerability. I don't know if that helps answer your question. Yes? So what I typically end up doing there is I meet with a couple of our engineering managers and a couple of our product leads once a month to resolve

things that are coming from our bug bounty program (unless they're critical, of course) and also things that our scanners are finding. That way we can meet and help resolve these on a regular cadence, so it feels less like an ambush or an ad hoc request and becomes more of a process that we continue to refine over time.
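The tracker described in these answers, one that records the age of each vulnerable component and surfaces components that keep reappearing, could feed the monthly review. A minimal sketch follows, assuming each finding carries a first-seen date; the field names, placeholder IDs, and dates are hypothetical.

```python
from collections import Counter
from datetime import date


def by_age(findings, today):
    """Return (component, vuln_id, age_in_days) rows, oldest finding first."""
    rows = [
        (f["component"], f["vuln_id"], (today - f["first_seen"]).days)
        for f in findings
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


def repeat_offenders(findings, min_vulns=2):
    """Components with at least `min_vulns` findings, sorted by name."""
    counts = Counter(f["component"] for f in findings)
    return sorted(name for name, n in counts.items() if n >= min_vulns)


# Hypothetical open findings.
findings = [
    {"component": "lib-a", "vuln_id": "VULN-0001", "first_seen": date(2020, 1, 6)},
    {"component": "lib-a", "vuln_id": "VULN-0007", "first_seen": date(2020, 2, 24)},
    {"component": "lib-b", "vuln_id": "VULN-0003", "first_seen": date(2019, 11, 18)},
]
today = date(2020, 3, 1)

for component, vuln_id, age in by_age(findings, today):
    print(f"{component:8s} {vuln_id:10s} open for {age} days")
print("repeat offenders:", repeat_offenders(findings))
```

The oldest-first ordering gives the monthly meeting an agenda, and the repeat-offender list points at components worth replacing rather than patching again.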

Yeah, I'm tired. Yeah, so I think the best way to do that is to actually reproduce it, or find some reproduction of it. There are a whole lot of YouTube resources out there where you can find people actually exploiting these vulnerabilities, and that's a nice way to humanize it, moving it further away from the scanner and closer to the individual level, where you can show that this is something you can actually do in the wild and something that we should consider escalating and moving up the priority chain. And, I mean, try doing it yourself, because that's always a lot of fun to hack on, right?

[Applause]
