BSidesSF 2017 - Linux Monitoring at Scale with eBPF (Brendan Gregg & Alex Maestretti)

Name: BSidesSF 2017 - Linux Monitoring at Scale with eBPF (Brendan Gregg & Alex Maestretti)
Uploaded: 2017-03-13
Duration: 28 min 52 s
Description: Netflix engineers present a practical approach to security monitoring at scale using eBPF (Berkeley Packet Filter). They discuss the performance and cost trade-offs of comprehensive Linux logging, and demonstrate how eBPF enables lightweight, kernel-space instrumentation for process execution, netwo

BSidesSF · 2017 · 28:52 · 2.8K views · Published 2017-03

Speakers

Brendan Gregg, Alex Maestretti

Tags

Category: Technical

Topic: Detection Engineering, Network Security

Team: Blue

Style: Talk

Mentioned in this talk

Tools used

BCC, ftrace, kprobes, LTTng, perf, SystemTap, tcpdump, Wireshark

Concepts

eBPF

About this talk

Netflix engineers present a practical approach to security monitoring at scale using eBPF (extended Berkeley Packet Filter). They discuss the performance and cost trade-offs of comprehensive Linux logging, and demonstrate how eBPF enables lightweight, kernel-space instrumentation for process execution, network connections, and file integrity monitoring, with a goal of under 1% overhead across large fleets.

Original YouTube description

Linux Monitoring at Scale with eBPF The latest Linux kernels have implemented a Berkeley Packet Filter (BPF) virtual machine which can provide safe and efficient syscall hooking. There are many logging systems in Linux that provide security relevant data, and several excellent open source tools that sit on top of these. These existing options provide many features that are useful during response, but at scale we focus on lightweight alerting across the fleet, to be followed up with heavy scrutiny of a subset for a limited time. We landed on the need for three basic monitoring capabilities - process execution, network connections and file integrity. Our goal is to provide meaningful security monitoring at under 1% overhead.

Transcript [en]

...at Scale with eBPF, presented by Brendan and Alex from Netflix. Thank you, thank you, thank you. It's great to see the hardcore security professionals still out here making it to the 4:10 talk. Special thanks to the locals who are giving up their first sunny day this winter to hear us talk about security monitoring. So I am Alex Maestretti. I manage the SIRT team at Netflix; we provide security intelligence and incident response across the organization, but our focus is on the product, so that's the streaming service that we host in AWS. I started about a year ago, and when I came on board I was looking around the team to see what sort of

intelligence telemetry we could get on the security front. In doing that, I wanted to get some more details out of the instances that were running, some more details off the endpoint. So I came up with the standard security architecture response, which was to monitor all of the things, right? That sounds really great and it works really well, but when you're working in the public cloud you have some very direct costs involved in that. So I went to the base AMI guys and said, hey, would you push this set of audit rules for me? It's going to monitor all the things; we'll have great responses out of this. And they said, maybe you should get some more

context on that. What that means is, you might be about to do something stupid. So when you get that response, what you need to do is stop and think and go find someone smarter than yourself to ask questions, and that's when I met my co-presenter here, Brendan Gregg, who is a senior architect on the performance team. I talked to him about what it meant to monitor all the things, and he was able to come up with a very direct bill of what that would cost us in terms of compute overhead. Then it was up to me to stop and think about whether that was really in Netflix's best interest, or if we

needed to find a better way. Fortunately, Brendan has been building open-source tools for a long time for performance monitoring and introspection, so he's deeply familiar with the inner workings of the Linux kernel. He's been involved in this eBPF movement, for lack of a better word, and so he provided the option of having our cake and eating it too, which is using eBPF to do security monitoring. As I started looking through those tools I got super excited about it. I wanted to do some talks, and he agreed to join me on these talks, so I think our purpose here today is to cross-pollinate from the world of

performance to the world of security and see if we can build some consensus that eBPF is a great tool for doing better security monitoring in the Linux kernel going forward. So that's our purpose. I would say the goals of this talk are threefold. One is the why: why is eBPF important? We have a lot of great security monitoring tools already out there, so why is eBPF relevant, and why should we be thinking about it? I'll cover some of that and try to convince you. Number two is the how: how does eBPF work, and how can you use eBPF? Brendan will cover some of that. The performance benefits

are derived from the fact that you can execute in the kernel, but you also get safety and stability constraints around that through bytecode analysis and various other things. So he'll walk you through how it works, and he'll also walk you through how you can use it with BCC, the BPF Compiler Collection, which he's been contributing to. It makes it very easy: within a weekend of doing the tutorial, you'll be getting very raw data out of the kernel using, essentially, Python. So we'll do the how, and then finally there's a call to action: hey, if you're interested in this, if we've convinced you this is cool,

reach out to us. We're looking for collaborators; we want to build some open-source BPF tools around security monitoring on Linux. So with that, the why: why is eBPF, in my opinion, super important, and a sea change in the way that Linux is going to operate in the security realm? We've got all these great existing security monitoring tools out there, so why would we want to build a new one? It goes back to that intro story I told you about the performance piece, but it also applies quite a bit to the way you consider your design constraints when you're working in a public cloud environment and when you're working with

microservices. With the public cloud, in our case AWS, the cost structure is different than in a traditional data center, where you have perhaps a lot of servers that are sunk costs. In the cloud you're buying compute on demand, so memory and processor are very important to you, but transport within the data center is pretty much free and storage is based on cost, so it might actually be cheaper to store log data off instance than on instance. There are some interesting trade-offs there to think about, and as I started working through that I realized that maybe we need a new way to approach things. My first thought was osquery, because it's this

awesome product. It applies in a data center; it applies in userland if you have users with desktops and laptops. (Apparently a ticker-tape parade is starting.) It works on Mac, it works on Windows, it works on Linux; it's sort of the Swiss Army knife of security monitoring. For those who aren't familiar with it, basically it takes all of the security-relevant data from various systems and presents it in tables that you can then query through SQL. So if you know osquery, you don't need to know exactly where the process table is in Linux or in Windows or on Mac; you just write a

SQL query and it gives you the information. So it's a great tool, and it's been deployed at scale in a number of places. You can use it either as a hunting tool, to jump in and ask questions of things, or as a monitoring tool with the daemon process. But it still has this idea of batch queries, right? You're writing SQL queries against the tables, so there's more of a pull nature to it. In theory, if you're running the daemon and asking these queries every, say, five or ten minutes, there's a window there for an attacker to exploit the box, escalate privilege, clean up

logs and... Even closer to the mic? Yes, I can do that. [Applause] I will get closer. Okay, this is good. This is my first mic appearance; apologies for the technical difficulties. So as I was saying, with osquery there's a window in which, in theory, an attacker could clean up logs and you might miss the detection. That's what drove us towards more of a push methodology rather than a pull methodology. It might be kind of a moot point if you use a very narrow query window; then it'd be very difficult to clean things up in that amount of time.
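To make the pull-versus-push window concrete, here is a minimal Python sketch. It is a toy simulation under stated assumptions, not how osquery schedules queries: the `Evidence` type, the timestamps, and both function names are hypothetical, and polls are assumed to fire at exact multiples of the interval.

```python
import math
from dataclasses import dataclass


@dataclass
class Evidence:
    """A trace an attacker leaves on the box (hypothetical model)."""
    created_at: float  # seconds after t=0 when the evidence appears
    removed_at: float  # seconds after t=0 when the attacker cleans it up


def seen_by_polling(ev: Evidence, interval: float) -> bool:
    """Pull model: a query runs at t = 0, interval, 2*interval, ...

    The evidence is observed only if some poll lands while it still exists.
    """
    first_poll_after_creation = math.ceil(ev.created_at / interval) * interval
    return first_poll_after_creation < ev.removed_at


def seen_by_streaming(ev: Evidence) -> bool:
    """Push model: the event leaves the instance when it is created,
    so later cleanup on the box cannot retract it."""
    return ev.created_at < ev.removed_at


if __name__ == "__main__":
    # Attacker lands 10 s in and has cleaned up by 70 s.
    ev = Evidence(created_at=10, removed_at=70)
    for interval in (300, 30):
        print(f"poll every {interval:>3}s -> seen: {seen_by_polling(ev, interval)}")
    print(f"streamed off box  -> seen: {seen_by_streaming(ev)}")
```

With a five-minute poll the 60-second intrusion falls entirely between queries, while a 30-second poll or a streamed event catches it. That is the trade-off described here: narrowing the window costs more queries, whereas streaming removes the window.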

However, from an architectural standpoint I like the idea of streaming events off of the instance, and I think a lot of folks are going that way: you see Windows event logs, and even Carbon Black announced a streaming sort of detection service. So that seems to be the better way to do things. Another option is OSSEC, which was mentioned earlier. OSSEC is a very fully featured HIDS, a host intrusion detection system. It provides you rules and alerting that you can trigger on either insecure configuration states or series of events, and it will fire off these alerts and let you know something's wrong. So yeah, it's sort of a streaming model

in that regard. However, it's pushing all the rules and intelligence down to the instances, and that runs into the microservices architecture, right? You want to keep your microservices very lean, because you're going to deploy tens or hundreds or thousands of copies of them. So it might not be the best approach to run tens and hundreds of thousands of copies of the same rule set; it might be more efficient to store it at a higher level. Moreover, you have much better context at that higher level. While on the instance you might be able to evaluate rules quickly and take actions based on basic heuristics, in this microservices architecture, if you've got many,

many copies running the same set of code, you can look at that as a peer group or a herd. Again, you're taking these builds from source and stamping out many, many copies, and they're serving traffic from a load balancer. As they serve traffic their security characteristics might change, but they're going to stay roughly in a group as they move around, and if one of them starts to depart from that cluster, that might be a performance issue or it might be a security issue, but you should probably take some action. You could terminate that instance, or you could quarantine it and go get more introspection. Those are some, you

know, fairly high-fidelity alerts you can do based on standard machine learning algorithms, but you can't do that without the context of the whole herd, so you can't do that on the endpoint, and you can't really do that with OSSEC; you need to do that at a higher level. That's why we moved away from the model of having a highly capable HIDS and wanted to, again, stream events off box and make decisions elsewhere. And that's really what auditd does, right? auditd has been around forever in Linux, and it will let you write rules to capture various syscalls and give you a very good view of what

the system is up to. But that was where I ran into that performance issue, and there are also some filtering issues around what auditd will let you do, or what the kernel will let you do, filtering-wise. For instance, sockets: if I want to monitor all the network sockets, I can monitor all the socket creates, but I get inter-process sockets too, so I get way more information than I really need coming across the transition between kernel and userland, and that has a performance penalty as well. The Slack guys wrote a great update to auditd, go-audit, which is a rewrite of the userland side that gives you a little performance benefit from the userland

side of things, outputs JSON, and does regex filtering, so it solves some of those problems. But as we'll see later in the slides, performance-wise we were still looking for something a little bit leaner. Sysdig is another option, and on the flexibility side it really meets those needs: you can create a chisel to do basically anything you want. But Sysdig is throwing everything over that kernel-to-userland barrier, so to me it's more of a very useful introspection tool or tracing tool and less of a security monitoring tool. So that's where we came down on the existing solutions, right? We've got a bunch of great stuff out there.

We'll probably still use things like osquery when we want greater introspection, but we're looking for an approach where we have very, very lightweight monitoring across the entire fleet, and when we detect something, then we turn up visibility. That's really the only cost-effective way to do it at a larger scale. For that we have a new option, and that is eBPF, which Brendan will now walk you through, with some examples of how it works. Okay, thanks Alex. My name is Brendan Gregg, and I'd like to show you eBPF by starting with a screenshot, so that you've seen something concrete. This is a very simple tool. It's using kernel dynamic tracing of the cap_capable() kernel function call and printing

out per-event details. You can imagine running this to create a whitelist, to see what capabilities your application is using. Has anyone used kprobes before? Oh, excellent, so we have like ten people. What about ftrace, Linux ftrace? Oh yes, like 15 people. Excellent. So this doesn't look that new: the kernel has had things like ftrace and kprobes for a while, and I have another toolkit of tools which can do things like this. Where eBPF gets different is that we can run programs on events. Here, this is a bit of a hack, but I've taken the cap_capable() probe and I've used another open-source

tool called argdist, and I am taking the DX register when cap_capable() is fired and doing an in-kernel frequency count. In the output that we see here, capability 12 was hit 83 times and 21 was hit 5 times. That's nice, but that's frequency counted in kernel using an eBPF map, and that's something that only eBPF can do when you compare it to the previous built-in kernel tracers like ftrace and perf_events. And we can do a lot more, not just frequency counting; we can do more advanced filters. There have been many, many tracers for Linux, and I've got a list here. There's been kprobes, which was added a long time ago, the

kernel dynamic tracing. There was DTrace, which really helped to prove that dynamic tracing was powerful and useful, although that was not integrated in Linux. There's been SystemTap, although that's been out of tree, and ftrace, which a bunch of you have used, that's great; perf, which is a great official performance analysis tool for Linux; and so on. There have also been many out-of-tree tracers: LTTng, ktap, and many more. Unfortunately, there's not a lot of documentation or use cases out there for all these different tracers. When it comes to kernel engineering, this is the bottom of the basement; this is difficult stuff. When I first got into

kprobes a few years ago, I was searching for resources. I stared at the code in the kernel, and I found that in the kernel tree there is documentation; you can read the documentation that was submitted with the patch set. But there wasn't much evidence on the Internet of people actually using it. I only found one good reference, and that was from Phrack magazine. It turns out the security crowd really gets into it, no matter how difficult it is, and there's a great introduction at the top of the article which says: we are hackers; hackers should be aware of any and all resources available to them. And it's great. So thanks to that, there's one

person who wrote about kprobes in detail outside of the kernel tree. But we're going to talk about BPF. BPF itself began life as a packet filter. If you run tcpdump, give it an expression, and then add -d for debug, it prints this weird assembly. That is user-defined bytecode that's sent to the kernel, and it is compiled and executed when those send and receive events happen, and it's executed inside a sandboxed virtual machine. Well, this is really interesting: there is a virtual machine in the kernel where I can send it bytecode and have it run stuff on my behalf on events. It gets really interesting now that it's been enhanced, so that instead of just send and

receive, it can be attached to other things, and it can also do more things with its assembly. So the BPF assembly has been enhanced. It has maps now, which are like associative arrays, key-value arrays, so we can store things in kernel memory, like when I was frequency counting the capability calls. And we can also do actions, so we can modify the state of the system. eBPF looks like this. eBPF is one of the most difficult programming languages I've ever tried to code in. In fact, I'll confess I've never had a single raw eBPF program of mine compile and work. I have had INTERCAL programs work; in fact, I've contributed to INTERCAL:

I wrote the guessing game in INTERCAL, for those who know what that is. So I'm fine with complicated languages, but eBPF itself is really difficult. It's okay, because not many people need to know this: there are compilers, so you can write in a higher-level language that compiles into eBPF. What eBPF is can be difficult to get your head around. Like I said, this is a sandboxed virtual machine and it's in the kernel. This is not some third-party add-on that a company has created; this is in Linux, the 4.x series of Linux. So if you're running Linux, sooner or later you're getting this weird technology. There are many use cases for it. The company that was originally doing the

development, PLUMgrid, was using it for software-defined networking, so you could write routers and the like in kernel, and firewalls, and that would become a BPF program that was executed when packets arrived. It's being used by Cloudflare for DDoS mitigation, so that a service can accept a higher rate of traffic and drop those packets earlier in the TCP/IP stack. We're looking at using it for intrusion detection, and we're not the only people: there is a company that's using it for container security. And at Netflix I'm using eBPF for performance observability. To give you an idea of the internals: you write a bytecode program, and that gets loaded into the

kernel via a verifier, which makes sure you're not doing anything naughty: it makes sure the program doesn't do backward jumps, doesn't do loops, and doesn't access memory that it's not supposed to access. Then it can be executed by BPF. The way we are designing it for security intrusion detection is finding all the different events we want to trace, where we can put our finger on low-frequency events to make it more efficient. BPF can then send out a per-event log using ring buffers, and it's polled from user space at a gentle interval, all for efficiency. So what can we monitor? I've been seeing some great talks today at BSides.
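The verifier checks named above (no backward jumps, no loops, no out-of-bounds memory access) can be illustrated with a toy. The Python sketch below is not the kernel's verifier and does not use real eBPF bytecode: the instruction set, the `verify` function, and the error strings are all hypothetical, and only the 512-byte limit is borrowed from eBPF's stack size. The real verifier goes much further (it walks every execution path and tracks register state), and newer kernels additionally accept bounded loops.

```python
from typing import List, Tuple

Insn = Tuple[str, int]  # (opcode, operand) in a made-up instruction set

STACK_SIZE = 512  # bytes the toy program may address (eBPF's stack limit)


def verify(program: List[Insn]) -> List[str]:
    """Return the reasons to reject the program; an empty list means accept."""
    errors = []
    if not program or program[-1][0] != "exit":
        errors.append("program must end with exit")
    for pc, (op, arg) in enumerate(program):
        if op == "jmp":
            target = pc + 1 + arg  # jumps are relative to the next instruction
            if arg < 0:
                errors.append(f"insn {pc}: backward jump (possible loop)")
            elif target >= len(program):
                errors.append(f"insn {pc}: jump past end of program")
        elif op in ("load", "store"):
            if not 0 <= arg < STACK_SIZE:
                errors.append(f"insn {pc}: memory offset {arg} out of bounds")
        elif op not in ("mov", "add", "exit"):
            errors.append(f"insn {pc}: unknown opcode {op!r}")
    return errors


if __name__ == "__main__":
    accepted = [("mov", 1), ("store", 8), ("exit", 0)]
    looping = [("add", 1), ("jmp", -2), ("exit", 0)]
    print(verify(accepted))
    print(verify(looping))
```

Rejecting every backward jump is what guarantees termination in this toy: with forward-only control flow, each instruction runs at most once, so a program attached to a kernel event cannot hang the kernel.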

And we got to see how, for intrusion detection, what normally happens is you look at what a vendor gives you. The vendor might give you some Microsoft access log or some web access log, and then you figure out how to interpret it and how to figure out what is suspicious on the system given those logs. This is different: when using dynamic tracing, you pose the questions of the system. As an analogy, it's a little like the difference between cable television and Netflix. With cable television, you turn it on and you watch what they decide to play for you, and that's like consuming a vendor log, consuming what auditd has decided for you, for example, on Linux. But with

dynamic tracing it's like Netflix: you're in control; you get to pick what you want to watch. When I started doing this slide I was decorating it with the kernel functions that we can trace for these events, and I realized I'm not going to do that; I'm going to decorate it just with the names of the events, to highlight the difference in thinking. Now we choose what we want the instrumentation to give us, not the other way around. So we can instrument all of this. We can instrument shell commands, SSH authentication, crypto initialization, sudo usage, su usage. libpam events are really easy, because that's a nice API. We can trace when

processes are launched, setuid, privilege escalation, and in the kernel any code path that does that we can trace. We can trace anything weird that's happening in virtual memory; maybe something is trying to execute the wrong page. With TCP/IP, we can go up and down the stack and trace events. We don't have to touch send and receive, because that's high overhead; we can trace the lower-frequency events, and so on. There is a collection of tools and examples in BCC, and I've written many of these. It's on GitHub, and it demonstrates all of the different places we can trace, and many tools are getting added all the time. And so these

serve as live examples. What I'd like to do is just give a quick demo of a couple of these.

I've logged in to a Linux system here [unclear]; from around something like 4.4 it starts to get usable. Just as a couple of examples, I'm going to run bashreadline. This is instrumenting the /bin/bash program; it's instrumenting whenever the bash shell accepts a new line, a new command, and it's doing it system-wide; I don't need to give it a process ID. I can go to another window and type in commands, and I've instrumented it so that I can take a log. I didn't need to run a special version of bash; I didn't need to compile it in any special way. I've just

chosen: I looked at the bash source and I said, this is a function I want.

So I can look in the bash binary and I can say, yeah, I want to trace readline; that's probably got what people are typing. And then I just use uprobes, user-space dynamic tracing, and I built a tool to do that. Now, an example of kernel dynamic tracing. Okay, so this is /dev/pts/1.
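The readline tracing described above can be sketched with BCC roughly as follows. This is a hedged approximation in the spirit of the bashreadline tool, not its actual source: the names `on_readline_return`, `line_t`, and `decode_line` are illustrative, the `--trace` flag is an invention of this sketch, and it assumes BCC is installed, the script runs as root, and `/bin/bash` exports a `readline` symbol. Newer BCC versions read user memory with `bpf_probe_read_user` instead of `bpf_probe_read`.

```python
import sys

# C program that BCC compiles to eBPF; it runs in the kernel when readline() returns.
BPF_PROGRAM = r"""
#include <uapi/linux/ptrace.h>

struct line_t {
    u32 pid;
    char text[80];
};

BPF_PERF_OUTPUT(events);

int on_readline_return(struct pt_regs *ctx)
{
    struct line_t line = {};
    if (!PT_REGS_RC(ctx))      /* readline() returned NULL, e.g. on EOF */
        return 0;
    line.pid = bpf_get_current_pid_tgid() >> 32;
    bpf_probe_read(&line.text, sizeof(line.text), (void *)PT_REGS_RC(ctx));
    events.perf_submit(ctx, &line, sizeof(line));
    return 0;
}
"""


def decode_line(raw: bytes) -> str:
    """Cut the fixed-size buffer at the first NUL and decode it for printing."""
    return raw.split(b"\0", 1)[0].decode("utf-8", "replace")


def main() -> None:
    from bcc import BPF  # imported lazily: needs BCC installed and root privileges

    b = BPF(text=BPF_PROGRAM)
    # uretprobe: fire when readline() in any running bash returns, system-wide.
    b.attach_uretprobe(name="/bin/bash", sym="readline", fn_name="on_readline_return")

    def on_event(cpu, data, size):
        event = b["events"].event(data)
        print(f"{event.pid:<8} {decode_line(event.text)}")

    b["events"].open_perf_buffer(on_event)
    while True:
        try:
            b.perf_buffer_poll()
        except KeyboardInterrupt:
            return


if __name__ == "__main__" and sys.argv[1:] == ["--trace"]:
    main()
```

Run with `--trace` as root, then type commands in any other bash session. Probing the return of `readline` rather than its entry is the design choice that matters: the typed line only exists once the function has returned it.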

Okay, so I have one window that is watching every single character that appears in another window, so it's seeing it event by event. Does anyone remember the old ttywatcher? ttywatch, anyone? [unclear] So I've just rewritten that using BPF. It's very simple, right, because the kernel is emitting characters to that TTY and I can just trace those kernel events. I can even do things like, if I run vi... okay, this is where the wheels fall off a little bit. So I can see vi, but it's kind of not working. Setting up this demo, I realized I didn't pick a buffer size big enough: my TTY write buffer is only 256 bytes, so I need to change that

to be like a kilobyte so that I can catch all the screen writes. Anyway, these are just examples of using dynamic tracing in the kernel, and there are just so many events we can instrument. I could instrument all of the TCP functions, so there's tcp_check_space, tcp_rcv_established... The screen's messed up because I was doing ttywatch and it has the vi escape characters. So I can trace thousands of events in the kernel and come up with the instrumentation I require, and that becomes the difficult part about using this: it's the questions you want to ask of the system.
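As one instance of putting a finger on a single kernel function, the in-kernel frequency count of cap_capable() shown earlier in the talk might look like this when written directly against BCC. The talk used the argdist tool for this; the sketch below is a hand-written stand-in, with `count_cap`, `top_counts`, and the `--trace` flag as illustrative names. It assumes BCC is installed, root privileges, and that the capability number is the third argument of `cap_capable()` (the DX register on x86-64, as mentioned in the talk).

```python
import sys
import time

# C program that BCC compiles to eBPF; the count happens entirely in kernel memory.
BPF_PROGRAM = r"""
#include <uapi/linux/ptrace.h>

BPF_HASH(counts, u32, u64);   /* capability number -> hit count */

int count_cap(struct pt_regs *ctx)
{
    u32 cap = PT_REGS_PARM3(ctx);   /* third argument of cap_capable() */
    counts.increment(cap);
    return 0;
}
"""


def top_counts(pairs):
    """Sort (capability, hits) pairs by hits, most frequent first."""
    return sorted(pairs, key=lambda kv: kv[1], reverse=True)


def main(seconds: int = 10) -> None:
    from bcc import BPF  # imported lazily: needs BCC installed and root privileges

    b = BPF(text=BPF_PROGRAM)
    b.attach_kprobe(event="cap_capable", fn_name="count_cap")
    time.sleep(seconds)  # nothing crosses into user space while we wait
    pairs = [(k.value, v.value) for k, v in b["counts"].items()]
    for cap, hits in top_counts(pairs):
        print(f"cap {cap:<3} hits {hits}")


if __name__ == "__main__" and sys.argv[1:] == ["--trace"]:
    main()
```

Only the final map contents are copied to user space, once, after the sleep. That is the difference from per-event tracers: the cost of crossing the kernel-to-userland boundary is paid per summary rather than per event.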

There are a couple more screenshots, but I just demoed some. So this is execsnoop, where I can see commands as they're run; and tcpconnect, where I'm only instrumenting the kernel functions that do connect, so I'm not touching send and receive. I do performance engineering, so that makes me happy, because I can keep the overhead very low. Wherever possible we want our network intrusion or system intrusion detection system to consume less than 1% CPU. Some instrumentation techniques: firstly, you need to know the questions you want answered, but use the stablest API possible. Dynamic tracing is fantastic: I can put my finger on any software function and begin to write events out. But if I start doing that a

lot, when the kernel changes my programs will break, because the kernel changed that function I was instrumenting. There are tracepoints in kernel space, and then there are USDT probes, the equivalent for user space, and these are the stable APIs, so if we build our security tools using them, they will work from version to version. If they don't work, we can try to find something that might be dynamic tracing but kind of has a stable API. Alex had a good idea of tracing the security hooks that are built inside Linux, because they kind of have a stable API, and so we can use them to create intrusion detection systems as well. I

was noticing last week that libpam has a very well-defined API, so I can trace libpam, et cetera, et cetera. But why we love eBPF comes down to three things. It's safe: the kernel has a verifier, it protects all memory access through helper functions, and it's also part of the mainline kernel, so we're all getting this. It's flexible: you can instrument anything at any time, and that means if there's some new zero-day vulnerability out and auditd doesn't catch it, we can probably write a new eBPF program and run it immediately. We don't need to restart any binaries or start anything in a special mode. It's like updating Snort rules on the fly, so someone can say, oh, there's a new attack,

here's an eBPF program that will instrument it, and you can run it straight away. It's also performant: eBPF has been designed to do network send and receive tracing. Although I try to avoid that and try to hit lower-frequency events, it uses JIT-compiled instrumentation and other techniques to keep the overhead low. I did a quick comparison between auditd, go-audit, and eBPF, where I was tracing just the accept system calls to see when connections were accepted, and BCC/eBPF was about one-sixth the CPU overhead of the other solutions. go-audit was a little bit faster, but BCC/eBPF was a lot faster. Just to really illustrate the difference in overhead: the old way to do

this kind of instrumentation is packet capture, but the new way is dynamic tracing, or static tracing where possible, where you can put your finger on the event and say, I am only interested in that event; I don't need to trace every syscall or every send and receive, to keep overhead low. To give you an idea of the code: this is all on GitHub, at github.com/iovisor/bcc, and it's part of the Linux Foundation. You write some C code, which gets run by the kernel, and then Python code to do the user-level reporting. There is a tutorial online if you really want to get into it, and lots and lots of examples. It gets

pretty complicated. This is a little bit of my tcpaccept, where I'm going and digging out bits and pieces to get the IP addresses and ports and whatnot. It's hard, but it's doable. You don't have to write the eBPF assembly that I showed earlier; at least this is C, which then gets compiled into eBPF. But it's great that this is all possible at all, and I'll hand it back to Alex for the slide. Awesome. So, one correction: I've got to give credit where credit's due. The idea to hook the LSM came from our colleague Bryan Payne. But I would definitely encourage you guys to go check out that IO Visor BPF Compiler Collection, because I am not the

world's greatest developer, but within a weekend of playing with this I was getting raw data out of the kernel in a very performant way, so it makes it very, very easy. That's really our call to action here: to get more folks playing with these tools, and to encourage you guys to reach out to us and collaborate on bringing security monitoring based on eBPF to an open-source solution. I'm disappointed we didn't get the illustration done in a professional manner, but here's my take on what I was describing before, right? You've got the various instances out there; they're producing telemetry; you're looking at them in terms of a

herd and trying to do outlier detection, and then turning up response based on that. I think this is an interesting space. I think we've got a chance in a modern environment, with immutable, ephemeral instances, that we may not have had in a more general-purpose environment. So I'm really excited about what this means for our industry and for securing the data there. Thanks again for sticking around to talk to us, and I hope to hear from you guys soon. [Applause] So on behalf of BSides [unclear], and for referencing Phrack magazine in a presentation, I'd like to give you both a [unclear]. Oh, great. Let's hear it for Brendan and Alex.
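The herd-based outlier detection Alex describes (many identical instances, flag the one that departs from the group) can be sketched in a few lines of Python. The talk mentions only "standard machine learning algorithms"; the median absolute deviation test used here is a simple stand-in chosen for illustration, and the metric, instance names, and threshold are hypothetical.

```python
from statistics import median
from typing import Dict, List


def herd_outliers(metric_by_instance: Dict[str, float],
                  threshold: float = 3.5) -> List[str]:
    """Return instances whose metric departs from the herd.

    Uses the modified z-score based on the median absolute deviation (MAD),
    which stays stable even when the outlier itself is extreme.
    """
    values = list(metric_by_instance.values())
    if not values:
        return []
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # The herd is perfectly uniform: anything different is an outlier.
        return [name for name, v in metric_by_instance.items() if v != med]
    return [name for name, v in metric_by_instance.items()
            if 0.6745 * abs(v - med) / mad > threshold]


if __name__ == "__main__":
    # Hypothetical telemetry: outbound connections per minute for one service.
    herd = {"i-1": 10, "i-2": 12, "i-3": 11, "i-4": 9, "i-5": 48}
    print(herd_outliers(herd))
```

This only works off box, where telemetry from the whole herd is visible at once, which is the argument made in the talk for streaming lightweight events to a central place rather than evaluating rules on each instance.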

