
Keeping on Top of Security Advisories

BSides Wellington · 2017 · 37:32 · 71 views · Published 2018-02
About this talk
Managing security patches across large heterogeneous fleets of Ubuntu and Debian machines requires reliable data sources and automated tooling. This talk explores the challenges of tracking which packages are vulnerable, where advisory information lives, and how to integrate host instrumentation tools like osquery to detect and remediate issues in real time. The speakers present an evolving free software application for collating Linux distro security advisories and demonstrate live patching workflows.
Original YouTube description
When ensuring a large number of heterogeneous Ubuntu and Debian machines are "up to date", there are questions that need to be asked. What's even installed on all these machines? What constitutes "up to date"? Where does that information come from? Why the heck isn't it in a machine readable format already? What started as an experimental attempt at solving the problem has become a useful, evolving free software web application for collating Linux distro security advisories and integrating with host instrumentation tools such as osquery and hostinfo. I will talk about the history of the project, the challenges faced in obtaining the data we use and developing the applications, what we're working on at the moment to improve its performance, and show some live demos using osquery to detect problems and observe their remediation in realtime.
Transcript [en]

Okay, cool. So yeah, an extremely wordy title for what is, or at least should be, a pretty straightforward process. For those who don't know us, I'm Michael. I'm a senior ops person at Catalyst. I mostly deal with the network, but sometimes I get roped into these infrastructure projects. And I'm Philip. I'm a junior operations developer, also working at Catalyst. Cool. So we set out to answer this question, and it turned out to be sort of harder than it ought to be. Before we get started, here's sort of your chance for honesty: who in the room truly and absolutely feels like they could definitively answer this? So I gave this talk a few months ago at Wellington ISIG

and, you know, out of a room of similar size to this, only a couple of people really felt like it was something that they could do. And not all of us can be Ewen McNeill. There's a sort of weird state of affairs where no two people in infosec ever really seem like they can agree on anything, but I think it's pretty fair to say that keeping your hosts patched has got to be in the sort of top five. And yet for a lot of people the answer really looks something like this. And so we want to work out how we answer the question. We've sort of divided this up into kind of the three

pillars that hold up your house of patching and allow you to comprehensively, automatically and actionably answer the question, assuming you're using Debian or Ubuntu, but we'll talk more about that later. So let's get started by taking a look at the kind of journey that we've gone on to get to the point where we are today. Philip and I work for a company which, when looked at in the right light, you could sort of describe as maybe a managed service provider. Primarily we do software development, but the weird thing about developing software is that eventually somebody wants to run that software on a computer, and as a result we've got thousands of servers.

They're split across hundreds of different customers, which means hundreds of different environments, as well as our kind of corporate and infrastructure resources that run the company itself. And because these machines that we look after span a really large number of different environments that belong to different customers, and these are pretty widely different things, that means that they tend to get patched in pretty different ways. They've got different schedules that are set by the particular customer as to when they're being patched. There's different access methods that you need to get into the environment, different deployment mechanisms for putting the patches on, different appetites for how long a system can sit unpatched and what

constitutes a serious bug that has to be fixed now. And it therefore makes it pretty difficult, when you have a branded scary vulnerability like Heartbleed or Shellshock or whatever, to get a really good view across the entire fleet of machines: you know, this thing that has to be patched everywhere, how many machines have actually been patched for it? Let alone, you know, everyday common libtiff bugs that crop up every second week. And yet we consider the kind of ability to say "is this host patched or not" a pretty important metric in an environment, and we think that you should too. So we didn't come into this entirely unprepared. Historically we used a

suite of great Perl scripts which parse the email messages that the operating system vendors send out with security advisories, which, if you've tried to solve this problem before, I'm sure sounds very familiar. Essentially, email advisories and web page lists of things are still the dominant currency of advising people about security bugs in free software projects, and distros are no exception to that. We did, however, have one of the most important pieces of the puzzle already, which is we had a full list of every package that was installed on every machine in our responsibility. And we do that currently with a tool that's been around for a long time. Mike Forbes first talked about it at NZNOG in 2008, and that's

a tool called hostinfo. And it definitely could be worse. If you're in a small environment, this is probably the process that you're going through to do your patching, and it works okay if you have like three or four computers, but even when you've got maybe ten machines this process really sucks. And so either way you look at it, trying to get a comprehensive idea across a really large, diverse fleet of machines as to the state of their patching is definitely not as easy as it could have been. We really wanted to be able to generate reports to answer the question automatically and easily, so that we could do them frequently and without sort of exorbitant cost involved. This is

really good because when we're patching something like Shellshock it lets us check multiple times throughout the day to see kind of what our progress is through a particular environment. And it's also just really good because sometimes you want to make sure that a machine hasn't fallen out of its patching schedule. Even if you're not the person doing the patching, it can be good to keep an eye on it and go kind of prod the person who is doing it. And we have the technology. It seemed like, from my point of view, we basically had all the pieces that we needed to make reports like this automatically. We had a list of all the packages that were installed on

our hosts, we have a rough idea of what packages are problematic, and it really should be as simple as just doing a kind of set intersection and, you know, making a report. So we had a bit of a look around for existing solutions to the problem, and it's not that amazing. So when I initially started this project there were some options available. Landscape is pretty nice. At the time that we started this project, back in 2015, you had to have Ubuntu Advantage to use it, and it cost over $1,000 per server per year, and it only works with Ubuntu, so that's not super useful. They do have a cheaper offering now: if you just want

their software-as-a-service offering you can get it for like seven bucks a month per machine, but if you need on-prem you're back in Ubuntu Advantage territory. Red Hat makes Spacewalk, which is a big Java thing. It only really works with Red Hat, so that's no good for us. And there's this kind of cool project called vFense, but if you dig into its GitHub project you'll find a really sad message from the developer basically saying nobody wanted to help and I got disheartened and gave up. So vFense is out. So I figured, we have most of the info we need, I'll just make a little web application, we'll present it in a nice way, we'll

release it under an open-source license and everything will be great. So I made a GitHub repo in 2015 and I told everyone at work it would be done in a couple of weeks. So it took about two months of on-and-off work just to kind of get an idea of how you're supposed to work out whether a Debian package is secure or not, and we're still trying to work that out, and Philip will tell you more about that later. Let alone build a user interface, shake out the enormous number of edge cases that comes with like 20 years' worth of Debian packaging, and then even start to think about how you show that to someone in a way that's useful.

It wasn't until earlier this year that we kind of felt like the project was starting to take some sort of usable shape. And so we chucked out, well, I chucked out most of the code in May and presented the bits that actually worked at ISIG, and fortunately in June we hired Philip, who has gone through and fixed a huge number of bugs in the code that I wrote. So let's take a walk through the process that's required to achieve patch reporting nirvana. Pillar one in our house of patching. So let's take another step back and ask another question: who in the room can truly, with 100 percent conviction, say that they know exactly what packages are installed on all of

their machines? Yeah. So there are a bunch of tools for this, and there have been, you know, lots of tools in the past. Catalyst has its own thing called hostinfo, but at the moment the hippest one by far is osquery. Not everyone wants to run a Perl-based thing that communicates with email and all that sort of stuff. osquery is a really hip tool which is released by Facebook under an open-source license, and it's basically this kind of amazing real-time host instrumentation system. osquery is really cool, and I could easily give you an entire talk just about osquery and the interesting stuff that I've found that it can do, but we'll just talk a little

bit about the things that are actually relevant to this patching stuff. So osquery essentially comes in two parts. You've got osqueryi, which lets you do ad hoc lookups of stuff that osquery knows about. It's really good for playing with osquery, finding out what types of data you might be able to extract out of it, troubleshooting your queries, that kind of thing. It's more of a kind of development tool. And then you've got the really good bit, which is osqueryd, and that's the daemon that you install, run on your hosts and have periodically execute queries that find out information about them. So conveniently osqueryd has a pluggable

architecture. You can select plugins which enable the daemon to work out its configuration, and also that tell the daemon where it should send the stuff that it finds out. So we've decided to use the HTTP API that it has. It's got these three API endpoints: it's got the enrollment endpoint, which is called the first time that the osquery daemon starts up, where it gets its configuration initially; it's got the configuration endpoint, which it hits to get a list of queries that you want it to run; and it's got the logger endpoint, which it hits any time something interesting happens on the host. Oh shoot. Excellent. So we'll just quickly show you a little bit of how osqueryi works,

because a demo is worth a thousand words I think
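
As a rough illustration of the three endpoints just described, here is a minimal sketch of the request handling as plain Python functions. It is not the speakers' Django code. The field names (`enroll_secret`, `host_identifier`, `node_key`, `node_invalid`, `log_type`, `data`) follow osquery's remote API as we understand it and should be checked against the osquery documentation; the secret, the schedule and the in-memory storage are assumptions made up for this example.

```python
import secrets

ENROLL_SECRET = "change-me"  # hypothetical shared secret, not from the talk
NODES = {}                   # node_key -> host identifier (in-memory only)
LOG = []                     # (host, log_type, entry) tuples
SCHEDULE = {                 # example schedule: one query every 10 seconds
    "usb_devices": {"query": "SELECT * FROM usb_devices;", "interval": 10},
}


def enroll(body):
    """Enrollment: called when the daemon first starts; hands out a node key."""
    if body.get("enroll_secret") != ENROLL_SECRET:
        return {"node_invalid": True}
    node_key = secrets.token_hex(16)
    NODES[node_key] = body.get("host_identifier", "unknown")
    return {"node_key": node_key, "node_invalid": False}


def config(body):
    """Configuration: returns the list of queries the daemon should run."""
    if body.get("node_key") not in NODES:
        return {"node_invalid": True}
    return {"schedule": SCHEDULE, "node_invalid": False}


def logger(body):
    """Logger: stores whatever the daemon reports for an enrolled host."""
    host = NODES.get(body.get("node_key"))
    if host is None:
        return {"node_invalid": True}
    for entry in body.get("data", []):
        LOG.append((host, body.get("log_type"), entry))
    return {"node_invalid": False}
```

In a real deployment each function would sit behind an HTTPS URL that decodes the JSON request body and encodes the returned dict as the response.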

Typing. So this is what the osqueryi user interface looks like. If it seems familiar, it's because they literally copied and pasted the SQLite codebase into osquery, so this is also SQLite's query interface. And what we've got is a series of virtualized SQL tables, so instead of being database tables on disk, these are actually properties about the system that osquery is running on. And so we can start to do some really nice stuff. For instance, we can look at Firefox add-ons, a useful thing to look at. That's pretty easy. It's just to say, tell me about the USB devices that are plugged into this machine right now. And then we can start

to do more interesting things where we can start to add conditions, and we can say, tell me all the daemons that are not bound to localhost, for instance, which might be something that you're interested in finding out about. But for our purposes the thing we're most interested in is this ginormous list of every package that's installed on the machine.
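
For readers who want to pull that package list into a script, the sketch below shows one way it might be done. It assumes `osqueryi` is installed and on the PATH, and it queries the `deb_packages` table, which is osquery's table for Debian-family hosts (the talk says "packages" informally). The function names are ours.

```python
import json
import subprocess

# osquery's table of installed .deb packages on Debian and Ubuntu hosts.
QUERY = "SELECT name, version, source, arch FROM deb_packages;"


def parse_packages(raw_json):
    """Turn osqueryi's JSON rows into {(name, arch): version}."""
    rows = json.loads(raw_json)
    return {(row["name"], row["arch"]): row["version"] for row in rows}


def installed_packages():
    """Run osqueryi once and return the installed package versions."""
    result = subprocess.run(
        ["osqueryi", "--json", QUERY],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_packages(result.stdout)
```

`parse_packages` is kept separate so it can be exercised on canned output without osquery being present.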

And so the great thing about osquery is that even for a query like select star from packages, where there's a crapload of data in there, you can still say to osquery, I want to know what the state of this is every 10 seconds, and it's smart enough to say, okay, nothing's changed since last time, so don't send any updates back to the server. It only tells you when something actually changes. Now osquery has other good stuff, but we don't care about it for this. And so what do you plug osquery into? The main downside to it at the moment is that while it's a great piece of maintained free software that's come from Facebook,

they just gave us osquery. They didn't give us the server infrastructure that osquery talks to at Facebook, which is fair enough, because it's probably really weird and Facebook-specific. So if you look around you'll see that there's a whole load of these existing commercial and free software offerings for software that osquery can talk to, and they're pretty good. Doorman in particular I like: it's written in Python, lets you really easily put things together. But the problem with these is that they're big, complete, full-on software products. You need to have Postgres, you need to have Docker. They're not really suitable if you just want to build something with osquery or you want to experiment with it,

and sometimes you just want to, you know, try a thing out. And so here's the first of three pieces of code that we're releasing today. It's a 200-line super minimalist Django application that implements the osquery API and implements the three basic endpoints, and we bundled it with a really simple demo application that you can use to kind of start playing with osquery in your environment. And we'll do a demo.

So here's a live view. This is the Django built-in admin system, which is the sort of free UI that you get with your Django projects, so you don't have to write any code. Now you can see I've got a few of my hosts currently reporting in here into osquery, and they're all alive at the moment, which means that the daemon talked to the server in the last couple of minutes. And what we can do with this demo application is that we can start adding queries that we want to run on those hosts. So I put the select star from USB devices query in, that we looked at before, and if we come back here and now

hilariously, I was going to bring a fake malicious USB stick, but I've lost it somewhere in the conference, so I've borrowed this USB stick from the AV people. Yeah, I'm sure it's totally fine. I'm just going to clear the log of all of the crap from me putting my YubiKey in and things like that, which was not supposed to be in there. And so if I take this USB stick that I found in the parking lot and I stick it into my laptop that I'm using here, and then I stall for time to get past the 10-second window where osquery checks things again,

you'll see that we now have a log entry saying a USB device was added to this host, and that didn't take very long at all. So in the next thirty minutes you could clone this code, stick a thing in there that sends an email when this happens, and then make your security team very happy. So that's how you can get started with osquery really easily, doing some basic stuff.
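
The "stick a thing in there that sends an email" idea could look something like the sketch below. It assumes the log entry has the shape of an osquery differential result (`name`, `action`, `columns`) and that the scheduled query was named `usb_devices`; the addresses are placeholders and the function name is ours, not part of the released code.

```python
from email.message import EmailMessage


def usb_alert(host, entry, to_addr="security@example.com"):
    """Build an alert email for an 'added' usb_devices result, else return None."""
    if entry.get("name") != "usb_devices" or entry.get("action") != "added":
        return None
    cols = entry.get("columns", {})
    msg = EmailMessage()
    msg["Subject"] = f"USB device added on {host}"
    msg["From"] = "osquery@example.com"  # placeholder sender address
    msg["To"] = to_addr
    msg.set_content(
        "vendor={} model={} serial={}".format(
            cols.get("vendor", "?"), cols.get("model", "?"), cols.get("serial", "?")
        )
    )
    return msg
```

The returned message can be handed to `smtplib.SMTP(...).send_message(msg)` from wherever the logger endpoint stores its entries.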

And if you want a copy of it you can grab it from one of these URLs. So we have the first pillar in our house of patching nirvana, and we have to ask the next part of the question: which packages actually have bugs in them, and which packages are we going to have to do something about? And this is sort of where it gets less easy. So we started looking at the official information sources that our OS vendors provide. If you've been in the infosec world for a while you've probably seen a variety of different formats for communicating vulnerability information, such as CVE or OVAL. Ultimately the main unit of currency in communicating security risks with a

particular distro is usually the security advisory that the distro maintainers publish. The problem with this is that if we start looking at what information we actually get from those people, what they give us, things start to look bad. So Fincham first looked at Debian, his favorite Linux operating system, and here's what they do provide. The primary source of the advisory information is still a human-readable email, and while there are a bunch of APIs to look at, none of them quite give all the information we need in a single place. And worst of all, advisories from Debian, they're

always in terms of the source packages, which doesn't make much sense, as you don't install a source package. So to use that to reconcile which packages need updating, you have to work out which binary packages come from each source package, and there wasn't an easy API for that either. More on that in a little bit. At least things are a little bit better on the Ubuntu side, where we just get a giant JSON file containing all of the advisories and all the affected packages on each advisory. Let's take a closer look at the situation with Debian and see what the best process we've come

up with for working out machine-readable advisory data. This is what it more or less looks like. Basically we grab an RDF file from the Debian website (we don't know how RDF works, it's kind of worse XML) that contains only the last 25 advisories and only partial data, things like truncated sentences. Then we take a full copy of the repository metadata from the security repository, since it specifies for every binary package which source package created it, so we invert that mapping to produce which source packages create which binary packages. And then we, whoops, then we clone the Subversion repository that Debian's security team uses to coordinate their work. We parse the file with a custom file parser that lists the advisory

metadata. But if the advisory is not in the current version of the package, instead of using the information we gathered back in step 4, we now have to go to a different API that's in a different format entirely, and then we mesh all the metadata back into our database. By comparison, Ubuntu is a bit simpler: we download a roughly 50 megabyte JSON file that contains a list of all the advisories and we just load in any of them that we haven't seen before. It would be great if Debian had something similar to this, but as far as we know it doesn't exist yet. So thing two of three that we're releasing today: there's another Django app that handles the job of collecting and

parsing Debian and Ubuntu advisory data. Michael has also built a basic UI to let you explore the data, and we'll do a quick demo of that now. Yep, so if anybody wants to play along at home, this one's public. You can jump in there and have a look at stuff. Just put this back into mirror mode. So the URL is just tools dot [unclear] dot co dot nz slash advisories slash, and this is just giving a read-only view into the information that we're able to pull out of Debian and Ubuntu advisories. You can probably explore this yourself. It's all relatively straightforward how it works. You can look at a particular advisory, you can get all the metadata that we

were provided with, or that we were able to scrape up. You can get a list of the source packages. More usefully, you can get a list of the binary packages and the versions, and then it gives you a kind of optimistic command that you might run on the machine to fix it. We also let you look at it in terms of CVEs, which is quite interesting. You can look at which vendors have patched which CVEs and in what advisories. It's a little bit misleading: it makes it look like Debian are overachievers, but in fact Ubuntu don't issue advisories for nearly as many things as Debian does.
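
The step described earlier, inverting the repository metadata so that each source package maps to the binary packages built from it, might be sketched like this. It assumes the input is the text of a Debian `Packages` index, where a stanza only carries a `Source:` field when the source name or version differs from the binary package. The function name is ours; the speakers' actual implementation is not shown in the talk.

```python
from collections import defaultdict


def binaries_by_source(packages_text):
    """Map each source package name to the set of binary packages it builds."""
    mapping = defaultdict(set)
    for stanza in packages_text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Continuation lines start with whitespace; skip them.
            if line[:1] in (" ", "\t") or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key] = value.strip()
        if "Package" not in fields:
            continue
        # "Source: name (version)" or "Source: name"; absent means same as Package.
        source = fields.get("Source", fields["Package"]).split(" ")[0]
        mapping[source].add(fields["Package"])
    return dict(mapping)
```

With this mapping, an advisory issued against a source package can be expanded into the binary package names that actually appear on hosts.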

So that's available at this URL if anybody wants to have a look at it, or even better, go get the source code, download it, install it yourself. Which brings us to pillar number three in our house of patching nirvana: how do we actually compute the intersection of the two pieces of information that we've just showed you, and how do we present that to someone in a way that's actually useful to them? So I had a bit of a go at this. Unsurprisingly, once you run the numbers, the performance is not that great. If you've got, say, three thousand hosts and maybe 600 to 800 packages on each one, trying to calculate the intersection of that data and the

packages that are listed in a particular advisory, which might be, say, 20 packages, it's actually kind of time-consuming. It's way too slow to do during page rendering in a web application, and it gets worse as you scale it up. Interestingly, if you look at Canonical's blog, they ran into the exact same problem in 2015 with Landscape. Canonical has a really complicated solution to this involving Go and compilers and optimizations. We did a much, much simpler thing. We spent a lot of time messing around trying to implement caching layers and all kinds of complicated things, but ultimately you can't get past the fact that it's just sort of a slow process. The solution we came up with was

pretty simple, really. When new information comes in at import time, a new advisory, or when a package gets updated on a host, we just check and store if it has any problems associated with it. This happens in the background during the import, so when a page loads it just has to do a simple lookup. Next we have... next step, once we have the database table, what I do... oh, sure, I did a bad thing...

it's how we actually present the data in a useful format. This is something we're still actively working on. We've tried a few different experiments and spent some time working with the people that use it and gotten feedback from them. One of the main things that has come out of that is actually the UI is quite specific to our environment at Catalyst, where the tool is used. One of the other big things that we've also learned from this process is that you really have to try and collapse down the number of different patching policies that you have across your environment. If you've got like 200 different ways that a thing gets patched, it's a lot more complicated to try and

rectify in a machine-readable way whether that policy is actually being adhered to. So one of the things that we're trying to do is say, okay, we'll have a machine-readable expression for the patching policies themselves, and we'll have fewer of them. That way you can kind of automatically evaluate a pass or fail on a particular host or group of hosts. The internal name at Catalyst for this project has been Patch Friend, and it's the kind of place that we're doing the UI development and stuff that's really specific to Catalyst. We're not going to be releasing the code for that today because, as Philip said, it's kind of weirdly specific to the way we do things now, but we did make some

screenshots, so we'll kind of take you through some of the ways that we're trying to present the information to make it useful. So there's the advisory list view. It's got, as you can see, its upstream ID, a short one-line description (which Debian isn't particularly verbose with), the list of source packages and when it was issued, and a progress bar which shows, out of the list of hosts, how many hosts haven't been patched yet versus how many have been patched, which is useful for working out how long it will take to fix an advisory. The progress bar thing's great. It's like gamifying patching: you just try and get the progress bar to go across as quickly as

possible and, you know, you're secure. This is an advisory detail view for Debian, which is pretty similar to the advisory feeds, I think. It's got the description, progress, the source packages, binary packages, update command. Same thing, yep. And we would normally list the affected hosts down the bottom, but you don't get to see that. By comparison, Ubuntu is very similar. It's got a more detailed description and the required action after you upgrade all the packages, such as rebooting the system. Here's the detail view for the host list. So this lets us filter all the hosts that are in our environment by a particular customer or a particular other attribute of the machine. This is pretty handy because you can get a real quick look at whether

there are any machines that have, you know, outstanding patches that need to be applied, and also how many packages have actually been fixed on that machine. And so this is the kind of host detail view, which has turned out to be really useful as well. It gives us an overview of patches that are pending on the host, but perhaps more interestingly, it actually gives us a real-time log of when things were fixed. And this is really great when you have situations like the Drupal vulnerability, where we know that the vulnerability was issued at this time and then 12 hours later someone scanned the entire internet and pwned every single Drupal instance. It lets us go and say, well, did

we fix it before that happened? And that's been really valuable. So we're not releasing the code for it, but we are releasing a thing that resembles Patch Friend. So this is a kind of evolution of the stuff that I presented at ISIG. It's got the last six months' worth of good work that Philip's done rolled back into it, and it gives you the base where you could start to build a thing like Patch Friend for your own environment. And that's the final piece of code that we're going to be releasing today. So I'm now going to try and do a full demo of all of the stuff that we've talked about: the real-time host instrumentation, the

advisory parsing, advisory metadata parsing, and then the reporting in real time. So let's have a go. So here I'm logged into one of my actual production machines that are on the Internet. This machine is currently fully patched up, but you'll see that sitting in my home directory here I've got a git package from three months ago. There's subsequently been an advisory issued about git, so this is an unsafe git package and it would be very dangerous for me to do this. Now you'll see that dpkg there warns me that I'm actually downgrading the git package. Don't worry, this is actually a super harmless bug, and, yeah, it's like a code execution or something. Cool. So we can see now that

the version of the git package that's installed on the system is now out of step with all the other git stuff that's on here, which means we have successfully downgraded it. If we jump over into the application that we're releasing today and we come into the hosts view, we can see that AKLB is the host that we were on. We can see it's alive, thank goodness, which means osquery is still running and uploading stuff into the environment. And now hopefully if we go into this problems table we'll see that we have a new problem now in our environment, which is that on host AKLB, according to advisory DSA-3984-1,

the git package at this version is not safe to have installed anymore. So along comes our people who do the patching in our environment, which is me, and we say, oh, there's something wrong with this machine, we'd better quickly do an apt dist-upgrade to bring it up to the latest. So it's noticed that the git package is out of date.
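
To make the "is this package at this version unsafe" check concrete, here is a sketch that compares an installed version against the fixed version named in an advisory. The comparison is our own simplified reading of Debian's version ordering rules (epoch, upstream version, revision, with `~` sorting before everything), not the speakers' code; on a real Debian host a library binding to dpkg's own comparison would be the safer choice. The version strings in the usage note are illustrative.

```python
import re


def _order(ch):
    """Sort weight of one non-digit character: '~' lowest, letters, then the rest."""
    if ch == "~":
        return -1
    return ord(ch) if ch.isalpha() else ord(ch) + 256


def _cmp_text(a, b):
    for k in range(max(len(a), len(b))):
        oa = _order(a[k]) if k < len(a) else 0  # end of string weighs 0
        ob = _order(b[k]) if k < len(b) else 0
        if oa != ob:
            return -1 if oa < ob else 1
    return 0


def _split(part):
    """Break 'deb9u1' into [('deb', 9), ('u', 1)]."""
    return [(t, int(d or 0)) for t, d in re.findall(r"(\D*)(\d*)", part) if t or d]


def _cmp_part(a, b):
    pa, pb = _split(a), _split(b)
    for k in range(max(len(pa), len(pb))):
        ta, na = pa[k] if k < len(pa) else ("", 0)
        tb, nb = pb[k] if k < len(pb) else ("", 0)
        result = _cmp_text(ta, tb)
        if result:
            return result
        if na != nb:
            return -1 if na < nb else 1
    return 0


def _parse(version):
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a, b):
    """Return -1, 0 or 1 as version a is older than, equal to or newer than b."""
    ea, ua, ra = _parse(a)
    eb, ub, rb = _parse(b)
    if ea != eb:
        return -1 if ea < eb else 1
    return _cmp_part(ua, ub) or _cmp_part(ra, rb)


def find_problems(installed, fixed_in):
    """Both arguments map package name -> version; returns the outdated packages."""
    return {
        name: (installed[name], fixed)
        for name, fixed in fixed_in.items()
        if name in installed and compare_versions(installed[name], fixed) < 0
    }
```

For example, `find_problems({"git": "1:2.11.0-3+deb9u1"}, {"git": "1:2.11.0-3+deb9u2"})` reports git as a problem. Running this once when an advisory or package update is imported, and storing the result, matches the "check and store at import time" approach described earlier.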

We pull in version deb9u2. Ignore the service restart, that's not relevant. Come back in here, you can see that now deb9u2 is installed, which matches along with the rest of the git stuff. We stall for time a little bit more to give osquery a bit of a chance to actually notice that and upload it to the server, and then when we come back into our demo environment you'll see that the problem is now fixed. So that's combining all of the pieces of the information that we've showed you today, and this code is now available on our website for you to download and start to build other interesting things

with

Wait, I didn't put the stick in presenter mode. And here's where you can get a copy from. So where to next? We want to try and compute some useful metrics about stuff. There's all kinds of exciting numbers that you can start to pull out of this once you have the information kind of in a machine-readable form. We want to know about things like the burn rate for bugs. We want to be able to say, what's the vulnerability that currently has the biggest impact across these several thousand machines? We want to know, what's our average time to fix a problem? Are we doing it quickly enough? What package has had the most advisories issued for it?

It's the Linux kernel. That kind of stuff. If you actually go and look at that code you'll see that we haven't packaged it very well. Well, I haven't packaged it very well. So we kind of need to improve the code packaging. We're trying to split out the various parts of this to make them a bit more reusable. That's something that we're going to be working on. And we want to try adding more distros, but unsurprisingly basically all the other distros are not on the machine-readable advisory bandwagon yet either. I think we're going to have a go at Arch next. And of course there are always more bugs and edge cases to be fixed. So that's our

talk. Thanks for sitting through it. Do we have time to take questions? Sweet. All right, so any questions? Thank you, Bryan.

It sounds like a good idea, yes. osquery is written in C and it runs as root, so run it under AppArmor or SELinux or anything. You don't actually have to run it as root, but more of the cool stuff that it does needs root, so...

Yeah, let me just get back to the one slide we have on this. So basically the daemon that we've written is just a Django app. It has three URLs that it implements. osquery hits two of them at startup time, to get its configuration and to enroll itself, and then it just pings a single URL every time it notices something interesting to report. We've also got it set up here to just ping us once every 60 seconds as well, so we know they're still alive. But yeah, it's just a single HTTP outbound connection from osquery into the controller.

Hmm, yes. So it would be nice if Patch Friend not only knew about all the problems but also solved them for us as well. So no, that's a thing you could do. In our environment there's a whole, like, approval process that you have to go through for every new package, and so we couldn't really do that. I would love it if there was just like a "press button, solve problem". To be honest, if you've got any kind of orchestration framework you can do that. osquery is nice in that it is very avowedly read-only. You can't easily make the osquery daemon do stuff to the machine, which is good for a security

tool, but it does mean you have to do that out-of-band.

I think it's definitely useful in an environment where you do have restrictions on your patching process. So most of those policies are driven by our customers, because they've got a, you know, staging environment and a UAT environment, and a patch has to kind of make its way through those to make sure it's compatible with the application. I definitely think this tool is still useful in other environments just as a kind of backstop, because, I don't know if you've run unattended-upgrades on a large number of boxes, but it's surprisingly unreliable. Like, it just decides that, oh, the packages got into a state that we can't deal with, give up on this host and never update it again.

And so it's still pretty useful to have as a kind of backstop, even if you don't use it as part of your kind of patching driving process. All right, if anyone has any more questions just come harass us later, I guess, but thank you. [Applause]