
Auditing all the things for smarter detection faster incident response Mark Thomas BSides Boston 14

BSides Boston · 22:11 · 147 views · Published 2014-05
About this talk
More information at http://www.bsidesboston.com (c) Security BSides Boston 2014 Follow us on Twitter at http://twitter.com/bsidesboston

I don't know. All right, this is good. Okay, so I'm going to talk about kernel auditing stuff. I spent a lot of time digging in the kernel and doing application development to, well, we'll go into it. So this is me, whatever. If anybody here works at Red Hat, specifically on the code base for audit, you probably want to leave the room if you know kung fu, because I don't want to get beat up. This is our lame marketing slide. All of the research and development, and the things and techniques I'm going to tell you about, were all done at my

company, Threat Stack. So let's explain what the audit system is in the kernel. It basically provides you some information about processes and things like that that the user is not usually privy to. It's a passive mechanism only, which means you can't inject anything into it; events are only emitted to you. It uses a simplistic filtering mechanism: you add rules to the audit system, they get placed in different linked lists, and they're basically just Boolean matches. It can be extended by

the kernel itself and by kernel modules, but only through the various logging functions that are exported in the kernel; otherwise you can't actually touch the other, unexported parts of the audit system. So it's really easy: if you have a kernel module and you want to start adding your stuff to the audit subsystem, that's great. There's an application called auditd. A lot of people use it just to get logs and see what's going on in their system, and people kind of assume it's some kind of magical thing. But auditd and audit are not, or rather

they're basically mutually exclusive. Audit is not magical. In the kernel, whenever you need to define a syscall, you wrap it in this macro. This macro, SYSCALL_DEFINE, eventually calls SYSCALL_METADATA. I really wish you guys could see this. I'm just going to explain it: it's the one that generates the actual code for the kernel, and what ends up happening is that every single system call that gets executed via this

macro has a call at its entry point and exit point, to the functions audit_syscall_entry and audit_syscall_exit. The entry point determines whether the syscall should be audited and initializes all the underlying context, the memory and stuff like that, for the output. At the exit it emits several messages associated with the syscall, all kinds of crazy stuff. After all that's done, after the messages have been emitted, it sends an EOE message, basically saying: here's everything about the syscall, and it's chunked into little bits of data. So we get return statuses of syscalls, you know,

execve, sock addresses, which could be TCP or just any kind of struct sockaddr stuff, fd pairs; you get the PID, the UID and the session ID. The session ID needs to be pointed out here. Basically, a session ID is an atomic increment in the kernel: when somebody logs in, they get an incremented number, so even if you're in a different namespace, anything that you run from your shell can still be associated with that same session ID. The thing that you have to realize is that you actually have to

have something that will go and touch a specific file in /proc when you log in. PAM, the Pluggable Authentication Modules, will do that for you. I'm going to blast through here a little bit. So here are some main problems, good and bad problems. Only one process can be connected to the audit system itself. The audit system runs over netlink into the kernel, so you're basically just speaking directly to the kernel, and only one process can be doing that. So if you create a second process and then you kill it, the first process doesn't actually start up again.
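The session ID and login UID described above can be inspected from user space. The following is a minimal sketch, assuming a Linux system where the kernel exposes `/proc/self/loginuid` and `/proc/self/sessionid`; the function name `read_audit_ids` and its `proc_dir` parameter are hypothetical, chosen only for illustration, and this is not code from the talk.

```python
from pathlib import Path

# Value the kernel reports when no login UID / session ID has been set
# (the unsigned 32-bit representation of -1).
UNSET = 4294967295


def read_audit_ids(proc_dir="/proc/self"):
    """Return (loginuid, sessionid); each is None if unset or unavailable."""
    ids = []
    for name in ("loginuid", "sessionid"):
        try:
            value = int((Path(proc_dir) / name).read_text().strip())
        except (OSError, ValueError):
            value = None  # file missing (no audit support) or unreadable
        ids.append(None if value == UNSET else value)
    return tuple(ids)


if __name__ == "__main__":
    print(read_audit_ids())
```

Run from a shell started by a PAM login, this would normally print two integers; run from a process that never went through a login, both values are likely to be `None`.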

That's mainly because the kernel would then have to keep a backlog of some sort, and that's stupid. Okay, I wish you guys could read this. This is the biggest thing that I ran into. Prior to Linux 3.8, when the audit backlog limit was hit and audit_log_start called schedule_timeout in the kernel, there would be a deadlock and your system would halt. So when I was debugging my application I would get screwed, and I'd have to rush to turn off the audit feature in the kernel, because I didn't want to have to reboot my machine all the time. So everything that comes from audit has this very

specific format, this key-value type thing. A lot of people don't like it. I didn't like it; I thought it was absurd. But what are the alternatives? An overly complex binary messaging system, something where you have to say, okay, this equals this type of message, and things like that? You're insane. You want to make it JSON? No, we don't want to put JSON encoders in the kernel. So there are just simple rules that you can abide by to deal with the messages you're getting from the kernel. Everything from the kernel is a key and a value.
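As a rough illustration of how little is needed to deal with that key-value format, here is a small sketch of a record parser. It is not the parser from the talk (which was written in C with state machines); the set `STRING_KEYS` is an assumption made for this example about which fields may carry hex-encoded strings, and `parse_record` is a hypothetical name.

```python
import re

# key=value pairs; a value is either double-quoted or a run of non-space chars.
PAIR = re.compile(r'(\w+)=("[^"]*"|\S+)')

# Keys assumed (for this sketch only) to carry strings that the kernel may
# emit hex-encoded and unquoted when they contain unsafe characters.
STRING_KEYS = {"comm", "exe", "name", "cwd"}


def parse_record(body):
    """Turn 'k1=v1 k2="v2" ...' into a dict, decoding hex strings where assumed."""
    fields = {}
    for key, raw in PAIR.findall(body):
        if raw.startswith('"'):
            fields[key] = raw[1:-1]  # quoted: take the text as-is
        elif key in STRING_KEYS:
            try:
                fields[key] = bytes.fromhex(raw).decode("utf-8", "replace")
            except ValueError:
                fields[key] = raw  # not hex after all, e.g. (null)
        else:
            fields[key] = raw  # numbers and other plain tokens stay strings
    return fields


if __name__ == "__main__":
    print(parse_record('syscall=2 exit=3 comm="cat" name=2F746D702F6120622E747874'))
```

Decoding only for known keys is a deliberate choice here: a bare value such as a PID is also made of valid hex digits, so the value alone cannot say whether it is an encoded string.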

Unquoted values are usually deemed untrusted strings, and they are encoded as ASCII hex, unquoted. The serial number is the kernel's way of designating a group of messages that belong together. So if a syscall does multiple things, like it opens a socket, it's in this path, and it opened this file, then there are going to be three messages, and you associate those messages by the serial number. You can also do userland-based messaging, but those messages are wrapped differently, so nobody can spoof a real audit message or anything

like that. They're wrapped in msg=, a quote, and then whatever. If you're like me: stop bitching about the format and just deal with it. So here's where I start making fun of stuff a little. The problem with the audit stuff is that people use auditd, which was written by Red Hat many years ago. We ran into problems where it was just pegging the CPU if we had rules for, say, receive. If we were doing syscall monitoring for a receive or something like that, and somebody's pounding on

nginx or your web server, you see your CPU go like mad using this audit daemon written by Red Hat. So, like I said: it has performance problems, it has a very limited output format, it's difficult to extend, if not impossible, without actually going into the code, the code is impossible to read, and it's poorly designed, in my opinion. Did I mention performance issues? Okay, god damn, I wish you guys could see this. This is some code from auditd. We have a bunch of global variables, and the comment actually says "global data," so that is always kind of

an indicator that something's going on here. And here is a piece of code that shows a mutex lock and then a condition signal to another thread. That means auditd is taking every single message that comes from the kernel, queuing it up, and then spinning it out in a different thread. That's bad, because you get a lot of contention. So I decided to get rid of it, because I couldn't stand it. We had some goals, and these are obvious: lower resource utilization; extend all the logging capabilities; as best as we could,

try to keep backwards compatibility; the one thing that we didn't want to do was reinvent the wheel in places we didn't have to; and abstract everything, blah blah blah. So here's where we go with the low-level processing. What I mean by that is how it handles low-level I/O. auditd uses libev for signals and the netlink stuff. I use libevent, and libmnl for the socket, for the netlink stuff. Once again, I wish you could see this. The point of these basic slides is to show you, right, so in this

slide, I'm going to kind of describe it: creating a socket and registering it with the kernel via netlink is very clean and very easy to do with mnl. So, you know, always use something that's really good. This is how auditd does the whole thing. There are pages of code just dealing with something that I could do in like 20 lines. Here's receiving a single message from a netlink socket with mnl. It does all the error checking; it does it all for me. Here's netlink: it's crazy amounts of stuff, and, you know, it's

actually, this is like five calls in. I'm going quick, sorry guys. So now we have to figure out how to deal with the data. There's post-processing, which is the way that auditd works: it basically takes messages directly from the kernel and puts them into a log file, so you have to have an application that runs afterwards and does all the correlation and reconstruction of the syscalls and all the stuff that you want to do. Then, like when I was talking about the serial numbers, there's runtime grouping, which is the technique that I'm going to talk about that we

used, which is to use that serial number and the kernel's end-of-stream thing to basically say: okay, this is the start of this syscall exiting, this is the end, and we can tie it all together. So this is ungrouped; it's just a bunch of lines, and I put the serial number in red. And this is how I wanted to do it: basically like JSON, where all these things are grouped into one thing. Here are some more examples, which, if we

had a bigger screen, would be great. It's really easy to understand. Here you see a user doing a cat of a file; here's nginx doing an open on index.html, and you get file descriptor 13; here's an accept for a specific IP and a specific port; and here's some PAM stuff. I'll put these slides up, because I feel really bad. Okay, so parsing. Parsing is a big deal, because parsing can make or break your application. Just read the quote: "Every year, one out of ten programmers will commit suicide due to maintaining parsers written in C." That is a quote from every C

developer who's had to maintain a parser in C. So there's the auditd method, which is basically what I call brute force. Here's a grep and a word count of every single strtok or any kind of string comparison function: there are 448 in auditd. I use finite state machines. Let's see how many I've got: four. All right. So I have alleviated all the overhead of all the crazy string matching stuff. This is auditd just parsing the timestamp, and actually,

looking at it now, there's actually a bug here that I just saw. It uses alloca, the return of which could be in error, but it won't actually return NULL. And here we seem to be incrementing pointers by one. Okay, whatever. So somebody who actually knows security, go look at this function, extract_timestamp, and write an exploit for it. Here's the state-driven API. The next thing that I want to talk about is how to basically turbo-boost your parsing and your dealing with the data that you're getting,

specifically with conditional logic and stuff like that. Our technique: we generate a perfect hash table using the application gperf. gperf is a GNU program where you give it this key-value type thing and it generates code and fills in a structure that you can use. A perfect hash table is basically O(1), which means that I put in a key and there are no conflicts within the table itself, so it's very quick. So we went and assigned known keys and values to types that can be enumerated, so I could do, like,

switch cases on the key, where normally people would be doing strcmp or something like that on it. That way we can filter out keys and values that we decided we don't actually need, that are just kind of silly. We can add validation. We can determine whether the value of a key can be treated as an integer instead of a string, which is actually good from a performance perspective. So we can do stuff like this. Once again, you can't really see it, which is not awesome, but for example,

since I now have enumerable types for a0 to, like, a5, which is audit's way of saying the arguments passed to a function, I can just turn the key into an enum, so it's very fast, and I can say: oh, if the value is nothing, I don't need to log it. We can also do cool stuff like this: I actually know that something is a sock address, and I can try to decode it with one conditional, one comparison, whereas otherwise you would have to do more string compares.
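The idea of mapping known keys to enumerated types once, and then branching on the enum instead of comparing strings, can be sketched roughly as follows. This is only an illustration: a plain Python dict stands in for the gperf-generated perfect hash table, and the `Key` members, the `KNOWN` table and the `classify` function are all hypothetical names, not taken from the talk's code.

```python
from enum import IntEnum


class Key(IntEnum):
    SYSCALL = 1
    EXIT = 2
    PID = 3
    A0 = 4
    A1 = 5
    SADDR = 6


# Stand-in for the generated table: key name -> (enum value, treat as integer?)
KNOWN = {
    "syscall": (Key.SYSCALL, True),
    "exit": (Key.EXIT, True),
    "pid": (Key.PID, True),
    "a0": (Key.A0, False),
    "a1": (Key.A1, False),
    "saddr": (Key.SADDR, False),
}


def classify(pairs):
    """Keep only known, non-empty fields, keyed by enum instead of by string."""
    out = {}
    for name, value in pairs:
        entry = KNOWN.get(name)  # one table lookup replaces a chain of strcmp calls
        if entry is None or value == "":
            continue  # unknown key or empty value: not worth logging
        key, is_int = entry
        try:
            if is_int:
                value = int(value)
            elif key is Key.SADDR:
                value = bytes.fromhex(value)  # raw socket address bytes
        except ValueError:
            pass  # malformed value: keep the original string
        out[key] = value
    return out


if __name__ == "__main__":
    print(classify([("syscall", "42"), ("junk", "x"), ("a0", ""), ("saddr", "0200")]))
```

Once the key is an enum, the socket-address case is a single identity comparison, which is the point being made about avoiding repeated string compares.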

Then we wanted to do something really cool, and it's very easy to do, which is to start filtering after you get the data. We use Lua; it's very easy to embed in your application. You can do per-instance functions. We convert the grouped data that I was talking about into a native Lua table and call the filter function, and that function can return 0, 1, or a Lua table. Zero says: okay, pass it on, you can log this message. One says: drop it, I don't give a crap. And if you pass back a Lua table, it will take that data, morph it

back into the internal structure, and then output the JSON that you built using Lua. I've already said this. So these are basic examples, which you really can't see. Here's a simple filter that basically says: if it's any of these socket-type functions and there's no sockaddr part included, then we don't really care about it. This one is the one where we actually return a table and convert it. I really wish... okay. So here are the output types that you have with auditd. There aren't many; you can only have one. But there is an application that comes with it. It is called audispd.

For each plug-in that you write for an output, it will fork and execve it and then start sending you messages over standard in. This is real. That's pretty bad, because when you're reading from standard in it's blocking, and it just creates wait states on your CPU. And if you want to log in some different format, no good. We ended up writing a pluggable output system, so now we have all these different outputs that you can log to. Going through this quickly again, but if you do it right, you can start adding all kinds of new types of APIs,

anything netlink related. So here is the actual performance test that we ended up doing: running Apache Bench at 10,000 requests per second, with rules for all kinds of socket and syscall stuff. auditd: 128 CPU. Ours: 10. Which I think is pretty good. This is the stuff that we're going to do in the future: simple analysis, statistics gathering, blah blah blah. So that's pretty much it. Hopefully none of my bosses are here. I want to open source this very, very badly, and I'm giving you a lot of, like,

the techniques that we use. But if you want to tweet out something like, you know, "Threat Stack, please open source your stuff," with a hashtag, that'd be great, because the world needs good software, not bad software, for example, you know, OpenSSL. So that's it.

Are there any questions?

Oh, I will post them. Hey, you want a computer? Yeah, I'll post them, whenever. What's that? The computer or the slides? Yeah, sure. Yeah, it is, mostly because we needed something that wouldn't kill the box. Any more fun questions?

If I could use DTrace on Linux I would be a very happy man, but DTrace is not native in Linux right now. Another question? Yes, I'm very familiar with that. Okay, so the thing with Sysdig is that it requires a kernel module in order to actually function, in order to do that kind of stuff. It's really cool, I like it, but we don't want to distribute a kernel module, or I didn't want to. There is overlap, and actually it does a better job at a lot of things, but not all things in the kernel can deal with, you know, all the things that

you can send audit messages for; Sysdig might not be able to pull that information out. But yeah, it's a really cool thing, and there definitely is some overlap with tracing down syscalls and stuff like that. So, your solution... Yeah, the audit subsystem has been in the kernel since like 2.6-something, who knows, so there's no need for any kind of new kernel module or anything else. All right, did I get it done in the time? No? Yes?