A Decade of Low-Hanging Fruit in the Linux Kernel

BSides PDX · 2024 · 27:55 · 1.2K views · Published 2024-11
Style: Keynote
About this talk
The upstream Linux kernel’s security hardening efforts have made huge progress in a decade. We’ll look at how we got here, what CVE statistics show, and what’s coming next. Where is the industry going, and can we finally be done with memory-unsafe languages?

Kees Cook has been involved with Free Software since 1994 and has been a Debian Developer since 2007. Currently, he works as a Linux kernel security engineer at Google, focusing on Android and Chrome OS. He previously served as the Ubuntu Security Team’s Tech Lead and remains on the Ubuntu Technical Board. Kees has contributed to a range of projects, including OpenSSH, Inkscape, Wine, MPlayer, and Wireshark, with a recent focus on Linux kernel security features.

BSides Portland is a tax-exempt charitable 501(c)(3) organization founded with the mission to cultivate the Pacific Northwest information security and hacking community by creating local inclusive opportunities for learning, networking, collaboration, and teaching. bsidespdx.org
Transcript [en]

[Music] I'm Kees Cook. It's spelled weird: I was named after my Dutch grandfather, and vowels are strange in Dutch. If you want to see these slides, you can download them there; I have it again at the end. So, hello neighbors, with apologies to Fred Rogers and Travis Goodspeed. I like this greeting, just out of the history of Fred Rogers and his kindness, and I think the generosity that exists in this industry is something that happens a lot when you go look for it. In particular, Travis sent me a bag of USB Facedancer boards

back when I was looking at finding bugs in the Linux USB protocol handling, which was much appreciated. So, very neighborly. About me: professionally, I have been doing this a while. I was a sysadmin at the Open Source Development Labs, which ultimately became the Linux Foundation, so I like to tell people I was working at the Linux Foundation before Linus Torvalds, but it's not entirely true. Then I moved to work on the Ubuntu security team, to try to beat user-space security into some sensible space, and then joined Google to work on Chrome OS originally, and now I'm basically

full-time upstream Linux kernel security hardening lead, so there are a lot of people that helped me get this job done. I moved to Portland in 2002, so now I consider myself a native. I've been a free software hacker for as long as I can remember, and was lucky enough to be on a team that won the DEF CON CTF two years in a row. I thought it was initially a fluke in 2006, and then we worked really hard to prove that we could actually do it again. That was awesome. Oh right, and of course I'm speaking for myself, not for Google, so I got to show quotes from other people. I like

to send a lot of security hardening patches to the Linux kernel. Linus does not appreciate it; this is his quote. I did not, in fact, stop sending patches. But I don't want this to be all about me. This is a group effort; I want this to be about all of us. So here's a much better quote that I found through this fantastic book, which is very dense and very scary. This is This Is How They Tell Me the World Ends, which is about the cyber weapons arms race over the years. It's pretty interesting. Anyway, the quote is: the most likely way for the world

to be destroyed, most experts agree, is by accident. That's where we come in; we're computer professionals. We cause accidents. This is supposedly from Nathaniel Borenstein, who drove the creation of the MIME standard, which I think is apt. So, to that end, I like to think that practicing with accidents is really capture the flag in a lot of ways: playing with these things. My thought process on CTF has been that there are effectively three stages that you're working on. You've got to figure out what the flaw is: do that work, understand what's actually going on, really understand the system and how you can manipulate it

and then you have this choice of two other paths. You can weaponize that flaw and mount the attack; this is sort of the red team area. Or there's hardening the binary, the source, or doing whatever you need to defend against attacks, which is sort of the blue team area. I have a somewhat weird view of this, which is: I understand how bullets work, so I can either work on machine guns or body armor. This is overly militaristic, but it's not exactly inappropriate, considering software flaws are being exploited to target journalists and political activists and their families. So there's a little overlap here in the

reality of the situation. I ended up going the body armor route, but I do appreciate all the folks that are going the other way; they keep my phone free. So one of the areas of the defensive work that I've spent a lot of time on with the kernel is looking at how we can protect the Linux kernel from user space, from attackers, whatever. There wasn't really a central, concerted effort to work on this area back in the day. I picked away at bits and pieces for a while and then realized there's no way I am ever going to be

able to do all of this myself. It's going to take me too long to learn all the different architectures that matter, learn compilers; there's just so much going on. So I tried to start herding cats and get people involved, and announced the Linux Kernel Self-Protection Project back in 2015. The main goals that we had were to remove entire bug classes, because chipping away at individual bugs is just a total game of whack-a-mole and it's hard to make progress. When you look at the lifetime of bugs in the Linux kernel (I have whole presentations on this), right now it's an average of about five and a half years

between when something is introduced and when it is fixed. And I'm speaking of high and critical; there's a giant long tail of medium and low vulnerabilities. So I like to point out to people that it's a five-and-a-half-year window, so right now, for all of you using Linux-based systems, there are vulnerabilities in your system that many of us have not found yet. Some people might have. So trying to get rid of the bugs at the beginning is pretty important. And then, since there are always going to be bugs, we need to look at how we get rid of entire

classes of exploits. How do we mitigate that? How do we make it not easy for attackers? So it's been ten years, roughly. Have things improved? I'm going to show some vulnerability trends, but first I'm going to quickly take a look at Linux kernel flaws and CVEs. This topic has come up a bit because the kernel became its own CVE Numbering Authority in February, which is to say that now the kernel maintainers, or really the CNA team within the kernel, are doing CVE assignments, whereas before it was general-purpose distros: when they tripped over something that they cared

about, they'd assign a CVE. Now the kernel CNA will assign CVEs for everything that's being fixed that looks like it might have any kind of security relevance under any kind of threat model, which turns out to be a lot of CVEs. I can sort of show this a bit. So if there is the universe of all flaws in Linux, the omniscient view, the objective truth that humans cannot grasp, within them there's some group of publicly known flaws. That's great. And I'm speaking of flaws generally, bugs, not necessarily security flaws yet. Then, of the publicly known flaws, we're going to fix some, and of course, because we cause accidents,

there is a bunch that are accidentally fixed that we didn't even know were flaws at all, so that's this little bubble on the side. Now, a subset of that is security flaws, and it was around here, as I was making this Venn diagram, that I started to lose my mind. So we have known-but-unfixed security flaws, we have yet-to-be-found security flaws, and we have accidentally fixed security flaws in this grouping. And the old CVEs were off in this corner, overlapping so many areas I need to zoom in. So we've got a range of false positives: a CVE gets assigned, but it's not actually anything that we can do anything about.

So there are false positives in a variety of areas: not a security flaw. Is it a fixed flaw? Is it accidentally fixed? Is it magic? There are false positives that aren't even flaws at all, like, oops, that wasn't designed correctly. Then there are true positives in the unfixed realm, and then there are accidentally fixed, also true positives. But it's kind of in this weird corner; it's not covering the reality of security flaws very well. There's a lot of stuff in the kernel, in the security flaw space, that isn't being identified; there just isn't a mapping for CVEs. With the new kernel.org CNA we have a much larger mapping, but it

tends to focus on stuff that's been fixed, which is a change and is a little weird. But it does create fewer false positives, fewer in ratio, and we get a lot more coverage, which I think is good, although it creates an absolute nightmare for people who are tracking CVEs so that they know that whatever software they have has no publicly known vulnerabilities in it. But this is a reminder to people in that mindset: our goal is to fix security flaws, not CVEs. CVEs are just a tracking method. We have a better tracking method now. It's still flawed itself, of course; everything will be. But we have a much, much

better mapping to reality now, which I think is fantastic, except of course people who were only tracking CVEs now have to deal with the fact that, oops, they actually have to track a much larger percentage of things. But those are the same flaws you were supposed to be fixing all along. Anyway, this is a whole other topic, but the idea of these Venn diagrams popped into my head and then I was compelled to create insane graphs. But my point is, I can't compare trends between the old style of CVE assignment and the new style of CVE assignment. So, doing a trend analysis right now, ignoring the new CNA,

I can look retrospectively through the CVEs to get a sense of where things are. It's hard to get a truly objective ability to analyze these trends, but the CVEs give us at least some signal to talk about it. And just a nice shout-out to the Ubuntu CVE Tracker: they make my life really easy, because for CVEs, especially high and critical kernel CVEs, they actually track when the flaw was introduced as well as when it was fixed, and for doing lifetime analysis that's really critical. Traditionally, CVEs just say, oh, it's fixed in here, have a nice day, and you're kind of like, okay, but how long

has it been there? What else do I need to cover? The new CVE CNA has introduction commits as well as the fix commits, but they don't assign severity, because there isn't a threat model that they work against. The Ubuntu tracker is for a general-purpose distro; it's a general enough threat model that I'm happy with their severities in most cases. But anyway, on to my damn lies, I mean statistics. Looking at things that mention buffer overflows or overwrites in the CVEs from 2010 forward, I see this delightful linear line that's going down. So that's good; something's working. Of course, when I saw this graph I said to

myself, but where do we cross zero? So I stretched it out and, oh my God, it hits zero in 2038. We will have no more buffer overflows in 2038. So by my highly scientific, statistically robust, and totally accurate prediction, we will have no buffer overflows right around the time the 32-bit time_t Unix epoch wraps. So, yeah. Which brings us to another class of vulnerabilities: integer overflows. Where are we on integer overflows? Okay, it actually seems to be improving as well. Let's stretch this out: where do we get to? 2031. Oh good, we will have fixed the epoch wraparound before it

happens, which hopefully is actually true. Finding these extensions of the prediction, I was just giggling to myself the whole time making these slides because it was so funny. Anyway, this is generally good news. We've been making progress; that's nice. The actual frequency of bugs is coming down in these areas. Array indexing, array overflows and underflows, has been kind of flat, and I don't like that. In 2020 there was this BleedingTooth vulnerability, and this is an array in a larger structure that has a fixed size. I don't remember off the top of my head how many bytes this is, but HCI_MAX_AD_LENGTH

is a fixed size. The compiler knows that number; it knows that it's an array; it knows how big the array is. So obviously, when you have a memcpy and you're going to copy data that came from who knows where into this completely fixed, well-understood-size thing, you should not do anything about validating the length, and just blow past the end of the array, keep on going, overwrite pointers later on in memory, and do insane things. So I was filled with rage, and I said, okay, this has got to stop. memcpy can't just take a pointer and write forever off of it. It actually needs to be object-aware; it needs to say,

hey, this thing is only this big. It doesn't do this because C, for 50 years, has been treating the memcpy destination as just an address; there's just no concept of anything else, and people will deserialize strings of bytes into many neighboring structures intentionally. So we started refactoring the kernel and redefining memcpy, and if we zoom in on the array now, okay, we're making progress. It's not a great signal, but we've spent a lot of time on this, which I'll get into a little bit later. The question now is, where is the low-hanging fruit? We've clearly been

squishing bug frequencies out of these larger classes. Well, as it turns out, use-after-free has absolutely skyrocketed. I like to look at where it starts to really come up, like 2016 and on. So where we started trying to actively get rid of bug classes in these other areas, people went, okay, the easier space is to look at use-after-free, dealing with temporal problems, not spatial problems. Okay, that kind of sucks. A question I had, of course, after seeing that graph was: oh my God, that's a huge number. Why is it growing? Where are these coming from? So I did an analysis of where all the fixes are:

what files are getting touched in the kernel when fixing use-after-free bugs. Some big ones that stand out that I can point to are the netfilter code, the Android Binder inter-process communication driver, and io_uring. There's been a ton of research and mitigation work on use-after-free that's not all upstream yet. There's the Google kernelCTF vulnerability and patch reward program, so you can get money if you point out a vulnerability, and you can get more money if you send a patch, please. netfilter has been receiving a complete beating on here, which is really interesting. io_uring got such a

beating earlier that Chrome OS started turning it off, and Android is really tight on restricting it. It was just seen as way too young, with an API too prone to errors. And then Android Binder is one of the first targets for a full driver to be completely rewritten in Rust for the kernel, to just get rid of lifetime problems, all the spatial and temporal problems that go away when you're actually using Rust for things. So I want to talk about how we drove down other bug classes. There's a whole list of stuff here, just off the top of my head. We've been doing a lot of stuff for a long

time: a lot of refactoring, a lot of removing bad language usage, fixing how the kernel arranges its internal stuff. But I really want to call attention to some of the other work, which is: we improved the compiler to do a thing, improved the compiler to do a thing, improved the compiler to do a thing. That's not actually about the kernel source; that's actually about C itself. So it's time for another quote. This is Yoda talking about the C language: ambiguity is the path to the dark side. Ambiguity leads to confusion, confusion leads to flaws, flaws lead to suffering, and I sense much ambiguity in you,

C. So, yeah, C supports ambiguity, but we can fix that. There's the whole class of undefined behavior, which is a hilariously well-defined thing for compiler folks to talk about, and it is a source of a lot of flaws, but honestly it is just a special case of language ambiguity. And of course there's no memory safety, no lifetime enforcement, and no safe concurrency in C either. So what do we do? Our choices here in the Linux kernel are to remove ambiguity in C and to write new stuff in Rust. So what do I mean by ambiguity in the language? I have a hundred

examples, but I'll talk about a couple that I think are pretty clear. The first one is uninitialized stack variables. It's important to remember that there is no such thing as uninitialized; it's just whatever was there before. And as an attacker, if you can control what's going on, you can control what was there before, which means it's very, very well initialized. And yes, with my garbage example here, the compiler will warn you in this case: hey, you didn't set this on the stack before you used it. But it loses track of things very, very quickly. If you pass anything by reference, or if you go through any unusual optimization paths, the

compiler just sort of loses track of it and says, I don't know if it's initialized. Weirdly, the optimizer does, but that's okay. So this is an ambiguity: what is in this value? I don't know. It's sort of non-deterministic in the sense of the programmer's intent. So this was created: -ftrivial-auto-var-init=zero. You just say everything's zero by definition, and if you assign a different value later, an optimization pass will just get rid of the zero assignment. You're all good, everything goes away, and the compiler will still warn when it can. It becomes deterministic, and it's safer in almost all contexts. So if you've failed to

initialize a thing, you've got a NULL pointer, which is usually handled; you've got a zero-length string, so nothing's overflowing; you've got zero bytes to copy; and you've got UID zero. Okay, so not always perfectly safe, but it's much better than the alternative. There was an enormous amount of pushback from the compiler community on this, because they said people will depend on these variables being zero. It's like, yes, that's the point. They did not want to fork the language. Again, I do: I would like a safe C. You can have your crazy C, but let's get something that actually works and is deterministic, that's not ambiguous about what's going to be

there. So here's another one: array bounds checking, the thing that filled me with rage over BleedingTooth. For fixed-size arrays, we can in fact do bounds checking here. We have now repaired all of the complete insanity; it was just pick up a rock and there's another rock under it, and it just kept going, with all the weird garbage in the compiler where it would refuse to check array sizes for a lot of bizarre, ancient reasons, all to do with compatibility. But we've also got this flexible array style, which is: well, we'll define at runtime how big it is. And so C goes, well, I guess I don't know

anything about the array, and it just gives up. We don't want that. We would like to be able to say, if I have an array, I know how big it is. So we've now added the counted_by attribute. You can hint to the compiler: hey, yeah, it's sized at runtime, but here's where you go find how many elements are actually in this array. And now, theoretically, you can get bounds checking on fixed and dynamically sized arrays. Oh my God, it's a miracle; I don't know how we could have ever thought to do this, says C, the only language that hasn't been able to do

this for I don't know how many decades. But this is kind of a big deal. There are a lot of other ambiguities; I could literally continue to talk for hours about this stuff. So the real target that we've been staring at lately is dealing with the language. I think of the C standard as strict and slow-moving, and it prioritizes compatibility over robustness. They have their reasons for doing this, but for me, trying to be really practical about what's actually happening on running systems, I need things to actually be dependable and unambiguous. So if you ever find yourself so cursed as to need to do this kind

of a thing, the key to making practical progress in GCC, Clang, and MSVC is to use the magic phrase: I would like to add this language extension. And then they go, oh, not to the standard, and they go away, and they're happy to let you blow things up. So then you coordinate between the compilers, and the C standard can catch up when they're ready, when someone wants to spend the time to do that. I've been trying to nerd-snipe people into talking to the standards committee and getting things happening, and it's coming along slowly. The zero initialization might be coming in the future at some point, which

was finished years ago. So that's removing ambiguity. The big one, honestly, is just using a language that's actually going to start from a position of memory safety. Writing new stuff in Rust is becoming a bigger and bigger thing in the kernel, and I think in the industry. I think that when governments notice your dumpster fire, it's time to switch languages. There's a bunch of links here to things saying, do not use C++ and C for your new projects, the implication being that they will not buy your software anymore. That's sort of what's being said. On the

kernel side, this was a huge political debate, trying to understand how we open up the developer community to include Rust, how we get the build working, how we make everything run smoothly. This has been going on for several years now, with a bunch of really awesome folks that have just been plugging away at it slowly. There's a lot of stuff landing upstream, a lot of the bindings, trying to get all that done and get maintainers on board. In the meantime, while that's going on, entire drivers have been written in Rust. We've got two full graphics drivers

now; there are file systems, block drivers, network drivers. A thing I said recently to someone is: if the Linux kernel can start using Rust, with its rather aggressive ecosystem and grumpy maintainers, I really think anyone can start doing it. We did it. Anyway, those have been my struggles, shared struggles; there are a lot of other people involved in this work. How are you doing? If you're just getting started, please keep it up. If you're already writing in Rust, you're awesome. You're defending the cloud from evil: the job never ends. Keeping the AI from consuming the planet: I do not want to be turned into paper clips, thank you.

Are you jailbreaking devices so I can fully use my hardware? Also thank you, and also I apologize for removing bugs, because you do need the bugs to jailbreak. That's a dilemma. Doing other stuff in the industry? I completely love it. All of our work can be a struggle, but it makes a difference. I think it can be really frustrating and demotivating at times, but in the end we're making progress, even in weird spaces like the kernel. So, I don't care if this is cheesy, but I'm going to go back to Fred Rogers, and I think it's a great quote. It's: what

you're planning and doing are things that can be a real help to you and your neighbor. I'm proud of you. This is good work. So, thank you, and enjoy the rest of the day.

[Music]