
CG - SBOM challenges and how to fix them!

BSides Las Vegas · 43:49 · 131 views · Published 2022-09
About this talk
CG - SBOM challenges and how to fix them! - Hossein Siadati, Trupti Shiralkar Common Ground @ 18:00 - 18:55 BSidesLV 2022 - Lucky 13 - 08/09/2022

Good evening everyone, welcome to BSides Las Vegas Common Ground. This talk is "Software Bill of Materials: Challenges and How to Fix Them." A few announcements before we begin. We'd like to thank our sponsors, especially our diamond sponsors, LastPass and Palo Alto Networks, and our gold sponsors, Amazon, Invisium, PlexTrac, Intel, Google, and BlueCat. It's their support, along with our other sponsors, donors, and volunteers, that makes this event possible. These talks are being streamed live, and as a courtesy to our speakers and audience we ask that you check that your cell phones are set to silent. If you have a question, you can use this microphone up here, if the speakers are generous enough to let you ask; that way people can hear you on the stream and on the YouTube recording. As a reminder, the BSides photo policy prohibits taking pictures without the explicit permission of the people in the frame. These talks are all being recorded and will be available on YouTube in the future. We would like you to keep your masks on at all times, if you would. It looks like there's enough room for everyone, so let's get started.

All right, thank you everyone, good evening. I can't believe all of you showed up for a 6 p.m. talk; that's quite some commitment. We will try to answer as many of your questions as possible, our contact information will be displayed here, and Hossein and I will be available after the talk as well if you have any questions. With that, let's get started. Today we will be talking about what a software bill of materials is, what kinds of problems SBOM can solve, what challenges we have faced while generating SBOMs, and how to fix them.

As all of you know, open source software is eating the world, right? Obviously there are a lot of advantages, lower cost as well as speed of execution, when we use open source software to design, build, and ship new
services or tools. Now, considering the statistics we have here, if there is a critical vulnerability you can imagine the gravity and the impact at scale. Let's take a look at a couple of them. Raise your hand if you remember 2017's Equifax breach. Oh wow, almost 60 percent. So in 2017 an open source component, the Apache Struts server, suffered a file upload vulnerability through which remote code execution was possible. As a result of this breach, the personal data of almost 147 million users was exposed in the US, and according to the FTC the settlement was around 425 million.

Now let's take a look at our next example, Heartbleed. Raise your hand if you remember this one; this is a little old, almost 10 years old. Okay, approximately 50 percent. So in 2012, on New Year's Eve, a German developer introduced a buffer overflow vulnerability in the heartbeat extension of OpenSSL. It was discovered almost two years later, in April 2014, by Google researchers, and as we all know almost 60 percent of sites and services were affected by it. It took all of us quite a few weeks and months to clean up the mess.

Who remembers Log4j, the most recent one? Awesome, almost 75 to 80 percent. So the Log4j vulnerability allowed remote code execution, and millions and millions of Java-based applications, data stores, and devices were vulnerable.

Those were the three prime examples, but look at this CVE data. In 2011, with CVSS scores of 9 and 10, this many vulnerabilities were produced: 832. And we are in 2022, right? Take a look at the number. Wow. Now, from this scary situation, can SBOM really save us and enable remediation at scale? That's the question we are going to answer in today's presentation.

Hi, this is Trupti. I am an engineering manager for software security at Datadog. I'm a mobile game developer turned security professional, so as a developer I can truly relate to all the pain points a typical developer faces with respect to security, and I'm always open for
mentoring, coaching, or an interesting security and privacy conversation over virtual or in-person coffee. When I'm not doing security I like to exercise, and that's me upside down doing aerial yoga, and I like to hike. I also like to conduct meditation workshops, because I believe in work-life balance; I'm a certified meditation instructor as well. This is my contact information; feel free to add me on LinkedIn or send me an email with your questions or inquiries. With this I would like to introduce my esteemed colleague Hossein.

Hello everybody, this is Hossein Siadati. I'm a senior security engineer at Datadog. I have a PhD in computer science from NYU. I'm a Xoogler; I worked on software supply chain security at Google as well. When I'm not doing security I do hiking and swimming, and I have started surfing. This is Photoshop, obviously, but I aspire to be a good surfer.

Thank you, Hossein. So today's agenda: we are going to talk about open source software security gaps, and then Hossein will introduce the concept of SBOM. I'm sure many of you are already familiar with it, but he will introduce it in a very creative way for those for whom it's a new concept. Then he's going to talk about some of the tooling and SBOM use cases beyond improving open source software security, and then he will focus on SBOM challenges and some of the solution approaches we can use to fix them. I will also speak about some of the strategic initiatives we, as security professionals, can take in our respective organizations to improve the state of open source software security.

Now let's talk about the common open source software security gaps we see, and why we see them. The number one gap I see is that open source software developers don't necessarily have security education. When they are at university, if they opted for university education at all, they either take one security class or no security class. That knowledge is not sufficient to keep them from introducing
security flaws in our open source code. The second gap I have seen most commonly is that for the last 10 to 15 years we have relied on software composition analysis tools, and not necessarily SBOM tools, to fix our vulnerabilities. When we relied purely on SCA tools, the vulnerability findings mostly lacked detailed information on exploitability and whatnot. The third gap I have seen is that almost 50 percent of organizations do not have an open source software security policy or standard rolled out. What that does is, every time there is a severe vulnerability like Log4j or Heartbleed, everybody loses sleep and starts hunting: what's the blast radius, what's the impact, let's upgrade if we are affected. But we do not necessarily have a policy that can educate our developers or create a culture of automatic software updates regardless of vulnerability, and we are going to talk about how SBOM can help to achieve that state as well. Last thing: as a result of the lack of education, lack of adequate tooling, and lack of standards and policy, we see immature processes for upgrading OSS. Many times, if OSS upgrades are already integrated in the repository, there is a chance that they can break the service and cause regressions.

To avoid all these problems, our main motivation is to improve the state of open source software security, and to do that it is extremely important to understand how we can leverage the software bill of materials and what the use cases are. Traditionally we are quite familiar with SBOMs generated from source code, but in today's talk Hossein will put emphasis on how we can generate SBOMs from different sources, such as source code, build time, and run time, and what unique advantages that can offer us to foster open source software security. And lastly, we would like to discuss some strategic initiatives. With that I would like to hand it over to Hossein.

Thank you, Trupti. So, a warning before I go to the middle section of the presentation, and that would be SBOM
fatigue. You have heard a lot of SBOM, SBOM, SBOM in the industry, but bear with me; hopefully when we link the SBOM concepts together we can get something out of it. So what is an SBOM? SBOM stands for software bill of materials. I'm very happy that Allan Friedman is in this meeting; he's the SBOM guy who drives a lot of the initiatives around SBOM, and next week they're going to have a big group of people getting together to talk about the next step for SBOM and where we can take it. Thank you so much for all the great work that you have done in the domain. This definition comes from the various documents that NTIA has provided around SBOM, and it goes: an SBOM is a nested inventory for software, a list of ingredients that make up software components.

If I want to draw an analogy, in the domain of mechanical engineering this is not a new concept; it's like 70 years old, like 1960. Industrial engineering and mechanical engineering have had this sort of diagram that specifies the components used in an engine, for example: what the shape is, what the lengths are, different aspects of that. To diagnose a problem in an engine, they go back and see which part it was and who the producer was, to be able to identify and fix the problem. The same goes for the food industry and chemical engineering. For example, there are customer-facing labels on almost anything that we use that say how many calories it has and what the most important ingredients are that customers specifically want to know about, to satisfy use cases such as: if somebody has an allergy, they should know if there is something in it that they are allergic to.

But from that definition of SBOM as it appears, it sounds as if an SBOM is only the list
of dependencies, or nested dependencies. I want to put some emphasis here: an SBOM is not only the list of dependencies, it's dependencies plus some context. In addition to the list of dependencies, the suggested baseline information would be, for example, author name, supplier name, component name, version string, component hash, a unique identifier, and the relation of an object to other objects. In most cases that would be an "includes" relation, where one component includes another component, but the relation could be something else. You can also add as much other contextual information around dependencies as you want, like licensing, timestamp, end of life, or grouping, and this information is very powerful for satisfying the use cases that I will talk about.

There are many different use cases that we can imagine around SBOM, so it's not only "give me the list of dependencies"; it's going to serve the company with different use cases. There are tons of them; this is based on research that we have done within the company. I'm going to emphasize only two of them. One would be vulnerability management, for discovering vulnerabilities: for example, if you know that you have a certain dependency on one specific open source project, and that dependency is impacted by a recent CVE, then you know that you are probably impacted by it, but not necessarily. The other one, which is becoming more and more important, is software supply chain security. As I will describe, an SBOM is not only one point of view of your software; it can also serve to show the chain and the workflow of your software.

To give you a bit more context on how SBOM surfaces from the technical point of view: when you describe a piece of software, you're going to have some form of description of it. It could be free form: you can, you
know, choose how you want to present that data, but fortunately there are two major standards. One of them is SPDX and the other one is CycloneDX. SPDX was created in 2010; the more recent one is from 2017. The major use case of SPDX was around compliance, just to show what components I'm providing with this software. Specifically, if a piece of software was used by an external, third-party company, you had to provide the list of ingredients of your software. CycloneDX is aimed more at recent use cases. This is a specific example of a piece of software with its dependencies at different levels, and these are the SPDX and CycloneDX descriptions of it. But as I will describe, we shouldn't be worried about the formats, because they are interchangeable and there are tools to convert from one format to the other.

As Trupti mentioned in the beginning of the talk, you can create the SBOM, or software bill of materials, at different stages of software creation, basically the software development lifecycle. SBOMs could come from source code; this is the most common case. There are tools that take the source code, you run a command, including Trivy or other tools, and when you have access to the slides you're going to see that I have linked a big document listing all the tooling that you can use, mostly on source code. There are also integrations with CI/CD, for example, so you can add GitHub Actions to your repository, and as you push new source code these SBOMs are going to get generated automatically. You can push them, reuse them, and regenerate them. And there are build-time tools, for example Microsoft's SBOM generation tool, that you can use to get the build-time software bill of materials, which is something different. For example, in addition to the concept of context that I mentioned, not only the actual software that you have has
dependencies, but the build tool that you are using has dependencies too. So if that build tool is impacted by a specific vulnerability, it might influence the final artifact that you generate. All of these are related and should be considered. Then there are the runtime dependencies. At the moment, most of the tools that provide application performance monitoring have visibility into the components that your software is using, and they will be able to generate a sort of software bill of materials for you.

But as I said, there is a lot of hype in the industry around SBOM, and people truly and genuinely started generating SBOMs for whatever software they are using, but there are challenges. The first challenge is around tooling. If you run two different tools on one repository, they're going to give you different results: a different number of dependencies and different dependencies. For example, I ran two tools, Trivy and CycloneDX gomod, on go-tuf, and they report different numbers of dependencies. Of course, part of it could be because some of these tools include test dependencies or whatnot, but even if you exclude them you still see differences in the number and the dependencies that they report. So this is one of the main challenges, and one aspect of it is that some of the tooling is noisy: it reports too many dependencies that you are not actually using.

For this first challenge, the recommendation is to go after what we call the correct tool for a specific language. For one specific ecosystem there could be tooling that focuses mostly on that specific language and provides a better-quality SBOM compared to more general tooling. For example, from our experience, CycloneDX gomod provides better-quality Golang dependencies. Eventually, where the industry is going to converge is that each of these tools, like
language-specific tooling, would be the base, and then all the other tooling would run a command for that specific language to generate the SBOM. So I believe the industry is eventually going to converge to the point where we don't see these discrepancies anymore.

The other challenge is unsupported build systems; for example, we don't have any tooling for generating an SBOM for Bazel-built projects. The monorepo is one of the common challenges that the industry has, because, you know, everybody follows Google, and Google has this monorepo system, so many companies have monorepos. But in monorepos there is no specified boundary between the projects from the perspective of the tools that generate SBOMs. So you end up running an SBOM tool in the root project, which is a collection of projects, and you get one big SBOM; those tools don't understand that this specific folder is a separate project, so they cannot provide quality output. Of course you can annotate your tooling and have higher-level tooling that helps with that, but at the moment it's not embedded in the SBOM tooling.

The other challenge is limited support for build-time SBOMs, and I guess we have to wait and see how the industry goes in this direction to provide better support for build-time SBOMs. The last one, which is not actually a challenge, is that most people are concerned about the format. I would say we shouldn't worry about the format, because the formats can be easily converted using the existing tools.

The other challenge in the domain of SBOM is understanding: we have generated all these SBOMs, but how should we use them? When we generate lots of data, it adds to the confusion unless we know how we should use it. One of the directions the industry is going in is to utilize SBOM, for example, for vulnerability
management, and that means including some more context with an SBOM. For example, if you are familiar with it, there is the concept of VEX, the Vulnerability Exploitability eXchange, which is an extra piece of information that lists the vulnerabilities that your software is actually impacted by. If you couple that with the SBOM information, or use the SBOM information to generate the VEX, it's going to be very powerful tooling.

So as Trupti mentioned, we have to think about how we want to use SBOM information in the context of open source software security, and one aspect is automated software upgrades. When we generate SBOM information automatically, and we identify which specific vulnerability we are impacted by using the VEX information, we will be able to use tooling to automatically upgrade that dependency, for example, if there is a new version that fixes it. What I'm trying to say here is that an SBOM by itself is a collection, a database of lists of dependencies, but we have to put it in context and couple it with other useful information, including VEX, to be able to take proper action.

In the slide where I mentioned the shortcomings of SBOM, one was the accuracy of the information that we get from the SBOM; for example, there were dependencies from the source that weren't accurate enough. One approach to overcome the noisy information is to put together SBOM information from different stages of the software. For example, if we have a collection of SBOMs generated from source code, from build time, and from runtime, it's going to help us reduce some of the noise. For example, from the source we would see 10 dependencies, and at run time we would see, for example, 3 dependencies. At the least, this means that those 7 extra dependencies that we see from the source shouldn't be the focus or the highest priority, if they are not false positives altogether,
but the collection