CG - How to have perfect vulnerability reports and still get hacked

BSides Las Vegas | 46:46 | 42 views | Published 2023-10
About this talk
Common Ground, 18:00 Tuesday

What vulnerabilities are really lurking in a given application? The assumption that we can answer that question undergirds US government mandates both recent and decades-old. Hackers, of course, know that this is absurd: attackers have 0days and aren’t afraid to use them. But even a much humbler goal, “free of known vulnerabilities,” isn’t as feasible as we’ve been led to believe. In this talk, we’ll see the pitfalls of two tools, software composition analysis (SCA) and software bills of materials (SBOMs), that are commonly brought up as silver bullets for this issue. We’ll see the vulnerability reporting ecosystem, including databases and manual triage of vulnerabilities in your application. Nonetheless, we’re hopeful: these tools are stronger together and can do a good job in many scenarios. Further, we’ll see what the future holds for bringing us closer to “free of known vulnerabilities” status, from open-source tooling to better government policy. Attendees of this session will learn about: automated security tools that miss what’s right in front of them, empirical research exposing vulnerability management challenges, the fight against security by obscurity, and the daily commitment to keep applications free of known vulnerabilities.

Zachary Newman, Luca Guerra
Transcript [en]

So this is the last one, the best one, I hope. The title of this talk is "How to have perfect vulnerability reports and still get hacked," and our speakers for today are Zach Newman and Luca Guerra. Welcome.

Thank you, thank you. So welcome, and thank you for staying for the last talk of the day. Today Zach and I will tell you more about how we can have perfect vulnerability reports and still get hacked, because yes, we needed someone to explain. I am Luca and I work as a senior engineer at Sysdig. During my normal work day I work on a project called Falco, which is runtime security, but my background is pretty much everything security, from security research to engineering, so I really feel right at home here at BSides and hacker summer camp week. I have the pleasure today to speak with Zach, who is an awesome research scientist at Chainguard, a company that specializes in supply chain security, so a lot of the cool stuff that we'll be talking about today. He is also very expert in crypto, where crypto means cryptography, so you can take a look at his blog posts; I wish I could understand half of them.

So what are we actually talking about today? We'll explain how our vulnerability reports can be perfect while we still get hacked. In order to do that, we'll take a look at how these vulnerability reports are produced: what tools we use, such as software composition analysis, and we have all heard so much about SBOMs, so we'll take a look at that tool and how the whole process works. By understanding how the whole process works, we'll know pretty much where the blind spots are. Spoiler: none of this technology is perfect, but it is actually useful.

So let's get started. Of course we all love vulnerabilities; we all love looking for them, patching them, all that stuff. If you think about a time when maybe you weren't
shipping software to production, maybe a happier time (I'm sure you were happier; I was). When I wasn't shipping software to production, I could ask myself how much vulnerable software I wanted in my production environment, and of course I want none. Why? Which stupid person would ever have vulnerable software, especially software that you know is vulnerable, in their production environment? Well, as it turns out, in the real world you can't have that. We all know that you can't have a system that is automatically at the latest and greatest version. And we are not even thinking about zero days; we are thinking about vulnerabilities that are very well known to everyone. Every time, we read on Twitter, or X, that you should patch your systems immediately, and yet that is very hard to do in the real world. While I was preparing the content for this talk, I found this tweet saying basically that there are environments running CentOS 5 that are business-critical production environments that we rely on somehow. So yeah, it gets hard to do the theoretical thing.

But we're not using CentOS 5 here. We work at cloud native companies; we all use modern cloud native environments, so we have containers. Containers are based on the idea that if it works on my machine, we'll ship the entire thing as a file system structure and the software will work. We got there, and we're happy, but we're also dragging along all the vulnerabilities we have in there within the container. And now we don't have to care about the host; we have to care about a lot of containers that are potentially vulnerable.

So how do we even start keeping track of all these vulnerabilities and everything that we have in our systems? Well, fortunately we've got tools. We've got software composition analysis, or vulnerability scanners. We've heard about them, we've probably used them; I've used them and I also worked on one, so I think
we're familiar with them. They're pretty much a magic box. If we think about container images, you take your container image, you just shove it into the magic box, and out comes a list of vulnerabilities. This is really useful. This is awesome, because I can take any container that anyone generated, even a third-party vendor, open source, whatever I have; I just push it in there and I get the vulnerabilities. Think about when Log4Shell hit: you could use open source or commercial solutions that are pretty mature, you would just run them on everything you have in your clusters, and that would just work.

But let's take a look at how the magic box works. If you think about it, the magic box is actually at least two magic boxes. One is a content detection part: you take the image and you try to figure out what's in there. The second detects the vulnerabilities. The stuff that is in the middle we call an SBOM, a software bill of materials. It's basically a list of things that are contained in our image and in our software.

So let's focus on the first part first: content detection. Yes, that's great, but how do we do that? You take the container image, and it's got layers. You get a squashed file system representation of the layers, and your software composition analysis tool will just go and look for any piece of metadata that it can find to identify the packages and the dependencies that are installed there. So for packages, Go has that embedded in the binary; or if you open a JAR file, you know that it's basically a glorified zip with stuff inside, and you might find manifests that may contain good information. But sometimes stuff doesn't... well, it doesn't really go wrong, but some stuff cannot be detected here, because software composition analysis needs enough metadata in there in order to find what this software is. And if it exists and it didn't make it into the SBOM, then it's like... our
friends at Chainguard have coined the term for this, which is software dark matter. It's not in the SBOM, which means that we don't know what it is and we don't know it's there, but it's still there. According to a report that Zach will tell us more about, software dark matter might compose even more than 60% of the software in an image. So for me the question is more like: do I care about that? Is there any piece of software there that I'm actually interested in?

So let's take a look at an image. I got Apache, I scanned it with a software composition analysis tool, and I got my SBOM. It's a bunch of stuff: 126 packages categorized as deb. First of all, do I care about the ones that I see? Yes, I do. I know that the httpd server is the httpd binary. If I list its dependencies, I'll find that there's a bunch of libraries that are dynamically linked, and each and every one of those has a corresponding package. So if we find a vulnerability in, say, the regular expression library, I will actually be able to understand that this is vulnerable and that it's going to affect my server. I want to know that.

But if I go back and look at the list of things that are in the httpd image, I see some stuff I probably don't care about, and we'll talk about that later. But is there anything missing? I looked at it, and after a while I figured out that the httpd software itself is missing. The one thing that I downloaded the image for was not there in the list of software contained in the image. If you work on this kind of software and you know how it works, it's actually quite obvious, because there's no additional metadata put there at build time that tells you that this is the httpd server. It's a binary that is built with a compiler and then stuck in the image, so it doesn't come with anything additional. So we know it. The question is: what if we build the
software bill of materials, the SBOM, not just by running software composition analysis, but by putting our own data in there? And so here comes the discussion about SBOMs. Zach is a great expert on that, so thanks, Zach, for telling us more about the SBOM.

Yeah, so the issues Luca just told us about all had to do with this content detection phase, right? If we had a perfect list of all the software in our image ahead of time, we wouldn't have to do that, and that's the promise of the software bill of materials. You can think about software composition analysis, the content detection, as reverse engineering. You go to Taco Bell, you eat a Quesarito, you taste it, you squeeze it, you try to figure out what's in it, and then you go home and try to make the same thing. But often you don't quite nail it. It's very, very hard to figure out what exactly is in your container image, and this is true of food as well. So a very overused analogy, but overused because it's quite useful, is that an SBOM is an ingredients list for software. It tells you what's in your application, what's in your container image, and then there's maybe a warning at the bottom that says "may contain CVE-2014-0160."

SBOMs can be produced in a number of ways. One way to do it is exactly like Luca was just saying: this sort of post hoc software composition analysis. We do this by looking inside a container image and looking at metadata like the apt database on your Debian instance. But you could also imagine creating one of these lists of ingredients at build time. Why? Because that's when you have the most information; that's when you know what you're actually putting in there. These SBOMs will contain package information, dependency metadata, and cryptographic hashes of the content, so you know exactly what is and what is not in that software. So here's an example. You don't have to read every line of this; I'm just trying to give you the
flavor of it. It's a text format. It tells you some metadata about the format itself; SPDX is one of these formats. It tells you, okay, here's a repo on GitHub that we're tied to. It tells you metadata about who created this: in this case there's a person that created it, and there's also a tool that was used to create it. And this is just for a very simple hello-world kind of binary. It gives you some information about individual packages; in this case there's just the one package, hello bin. It tells you where the source came from, what commit it came from, its license, and so on. And then it tells you a lot about the dependency relationships within that application: okay, your package depends on this package, which depends on that package. You can embed the whole graph in this plain text format, which is not very nice to read, it's not super human friendly, but there are tools for visualizing it and so on.

So where did these things come from? For as long as people have been consuming software, they've wanted to know what was in it. For a while, what we did is we just said, hey, take your lowest-paid intern, give them Microsoft Excel, and have them make a catalog of all the libraries that you're installing, all the libraries that you're importing. Then, sometime around 2010 or so, the SPDX group formed under the Linux Foundation. This project actually wasn't designed with cybersecurity in mind; it was designed for open source license compliance. At first blush this might seem unrelated, right? What do licenses have to do with vulnerabilities? But the first step in both cases is knowing what's actually inside the application. If you're linking against a vulnerable version of OpenSSL, you could have a vulnerability; if you're
linking against a library that's AGPL licensed, you could have some legal problems. After about five or six years, the CycloneDX project made an initial release, and that actually comes out of OWASP. This is interesting because we're starting to see a cybersecurity-focused group really investing in SBOMs, and SPDX along the way has picked up a number of features that make it useful for vulnerability analysis. After a few years of that, folks started to notice: hey, it's not actually that useful for you to tell me just that this application contains this library. What if they didn't call the vulnerable function? It doesn't give you enough information about exploitability. So VEX, or Vulnerability Exploitability eXchange, came on the scene around 2019, which lets you mark software as affected by a vulnerability, not affected by a vulnerability, and so on. Then in 2021 the US federal government issued Executive Order 14028, which says all vendors to the US government must provide SBOMs. There are some details on timing and so on, but it has caused a mad scramble, and a billion companies have started to solve all your organization's SBOM problems, if only you write them a fat check.

But missing from all of this, I think, is a notion of quality. On the right I have an empty nutrition facts label, because maybe that's technically compliant, right? The NTIA has guidance on the minimum elements of an SBOM, and it talks about identifiers for components, version numbers, and dependency relationships. Nowhere does it say that you actually have to have httpd in your httpd image SBOM, right? Luca alluded briefly to this term, software dark matter, which is a term my colleague came up with to describe this: if you have, say, a container image and you go through the SBOM for that container image, what percentage of files
can be explained by that SBOM? Ideally you'd want it to be 100%: you'd know these things come from the OS, these libraries come from this package, and so on. But if you look at popular images on Docker Hub, you actually find that a majority of files are unexplained. If you run popular scanner tools, they can't tell you the origin of a number of these packages. There are a couple of reasons why, and Luca did a good job illustrating some of these. The big one is that scans miss software that you're not installing via a well-defined package manager. So if you're cobbling together your container image by copying files in from here and copying files in from there, all of those are going to be unaccounted for. Any SBOMs that come out of build time, which is my preferred time to be generating these things, are a good way to avoid this problem of dark matter, but you need a lot of support from the build tooling, and that takes time to put in place. You need your compiler to support it, you need make to support it, you need all of that.

So I'm going to turn it back over now to Luca to talk about this: okay, imagine we solved all those problems and had a perfect SBOM with a list of all the software in your container image. Are we done? You may note, by looking at your watches and the fact that we still have some more time in this talk, that the answer is no.

Thank you, Zach. So yes, now I imagine I have my perfect SBOM, either SCA-generated or written with build-time tooling, or a combination of the two. Now what? Well, now we turn to our vulnerability database and we do the magic. The problem is that there are many vulnerability databases, and these are just a few. You get the NVD, which I'm sure many of us have looked at. We've got some open
source parts like GitHub and GitLab, we've got vendors, and we've got paid-for vulnerability databases that have their own goals and scope. So what's going on there? Well, it means that in your scanner, whether it's software composition analysis or a vulnerability scanner, there's another magic box inside. Even if you use open source you might not be seeing it, though they have the code to deploy it yourself. It takes all these databases and squashes them together into one single vulnerability database that can be used for detection, and building that database is much, much trickier than we might think.

In a lot of cases this is great, because if you think about it, distribution vendors tell us exactly which vulnerabilities are in their packages. They know because they maintain the packages, and they know exactly what to look for. So that's great. And now, if you think about GitHub and GitLab for example, we have package maintainers that can put data in a standard format there in their package, written in a format that a scanner can work with.

However, not everything is that great. For example, we have been speaking about software package types before: Java, Go, whatever. But not all software can be traced back to one of these package types. Think about the httpd server: it's not a Go package or anything like that. So what do we do there? And we've also got software that comes from vendors, so it's not in an open source ecosystem. What do you do with that? Also, vulnerability databases are a beast of their own. They all have different goals, so they're inconsistent with each other because they're slightly different, but they can also be a bit inconsistent internally.

So of course, let's take a look at an example. I like concepts, but I scan images. I want to
be secure, so in order to be secure I scan my Postgres image, and in the SBOM that was generated by SCA I can find Postgres. So we don't have the httpd problem: it's there, it's version 15.2. This is great. Now let's go to the vulnerability checking part. We got eight vulnerabilities. There are actually four, because four of them are just duplicates, but that's okay, because the Debian maintainers who compiled that security list know what they're doing: if there are two libraries affected by the same OpenSSL vulnerability, they will mark both. Also, I really want to know about this vulnerable OpenSSL version, because I have Postgres, and Postgres is going to connect securely, I hope, with OpenSSL. So if there's a bug there, I really do want to know.

Is that all? No. If I go to NVD, the National Vulnerability Database, I can see that there's a vulnerability with a CVE ID, and that vulnerability is about Postgres itself. So we had Postgres in the SBOM, and we had data about the vulnerability, but we couldn't match the two, so we don't have it in the scanner output. Why is that? I took a look at how the SBOM is generated and what database I'm looking at. The package here is detected as a deb package, so Debian software, and the scanner knows that it needs to match Debian software against the Debian security database. Does the Debian security database have information on that vulnerability? Yes, it does, but it only applies if the package has been distributed by a Debian maintainer, because that is how it works by contract: that database works only for that. And if we take a look at the Dockerfile for that Postgres image, we find that it used the apt repo for Postgres and not the one for Debian, because of course Postgres uses its own repository, meaning that our scanner is unable to find the vulnerability.

That sounds sad, but you might think: I know, I know, I saw the database before, you can't fool me, it's
there, I know that there is t