
Good afternoon. This is Breaking Ground at BSides Las Vegas. This talk is "See You Later, Allocator." Before we get started, some quick announcements. First of all, I'd like to thank our sponsors, especially our diamond sponsors, LastPass and Palo Alto Networks, and our gold sponsors, including Intel, Google, and Amazon. Their support is what makes this conference possible, so please say hello to them outside. As always, we are streaming, so cell phones off, please. If you have questions, there will be time for questions at the end of the talk. Just raise your hand; we don't have speaker mics today. And with that, please welcome Daniel Danzi.

All right, good afternoon, everyone. I just want to make sure: can everyone hear me in the back? Awesome. All right, I just want to do a quick survey. Who decided to show up to this talk because of the pun? Who else decided to show up because Volatility was in the name? All right, good show of hands. Who showed up because they really want to dig into some Linux internals today? Wow. All right, let's get this started.

So, a little bit about me. I'm a recovering grad student. I graduated from Louisiana State University, where I took part in LSU's Applied Cybersecurity Lab. We like to do a lot of hands-on cybersecurity activities and try to bridge the gap between the theory that's being taught and actually getting hands on keyboards. And in a former life I was a software developer.

So, memory forensics. It is a branch, a subset, of digital forensics. It sits in between traditional disk forensics, where you power off the machine, yank the disk out, and do analysis on it later, and live forensics, where you're looking at stuff coming through the network and analyzing things as they're happening. Memory forensics sits in the middle because you take a copy of the machine's memory as it's running, which gives us a little bit of insight into the state of the machine. And why would we want to do this? Well, it gives us a lot of very useful information, such as the processes
that are running on the system, any network connections it currently has open, the network sockets behind them, potentially any passwords that are stored or cached in memory, as well as operating system hooks that may involve things like monitoring the keyboard or mouse. Memory forensics also gives us the power to look at who performed what action, since we get to see which process started which other processes. That means we can attribute activity on a machine either to users or, potentially, to malware if we find it on the system. And in more recent times there has been an increase in the amount of memory-only malware. Some examples are Duqu and Corelump. Both reside exclusively in memory, so if you try to do traditional disk forensics you'll never see them, because they never get written to disk. Likewise, if you try to do live forensics, a lot of times there are anti-analysis mechanisms in place to make that not fun.

So how do we do memory forensics? One of the very popular tools is Volatility. It's open source and it's plugin-based, so if you need to expand the functionality of Volatility to suit a particular analysis need, you are able to do that. From a development standpoint, Volatility handles a lot of grunt work behind the scenes. It provides virtual-to-physical address translation, which saves a lot of pain when you're working with a memory sample, which is simply a copy of the machine's memory at a given time. And it provides an interface to the kernel objects in the system, so you can look at the data stored in the kernel objects, even manipulate it, and get a much deeper understanding of what was going on in the system at the time.

So what exactly does this analysis look like? It involves running plugins, as I mentioned previously. One such example is pslist, which tries to emulate the ps command on Linux systems. I don't know how visible this is, but it provides us with information such as the name of the process that is running, the IDs
of the process, any parent IDs that started that process, when that process started, and any user or group information that's associated with it. Likewise, there's netstat, which tries to emulate the identically named netstat command on Linux. It walks the process list from a given memory sample and enumerates the sockets that each process has, which gives us important network information, including both local and remote ports, as well as the process that owns a particular socket. That makes it easy to attribute network connections to a particular running process.

Now, both of these plugins use what's known as a list-walking method, where you take the kernel data structure that contains the list of processes, the lists of network structures, and so on, and you just walk through that list. The advantage of this approach is that it's relatively straightforward to do. You find some point in memory, you know how the entries are linked, and you follow those links until you've enumerated everything.

However, a lot of times deleting things in memory is very lazy. Whenever something gets freed, such as when a process terminates or a network connection is closed, all a lot of operating systems do is mark that space as free to be allocated by something else in the future. So this data is still resident in memory, and it's still important for us to look for it. For that there are carving approaches, which hunt for blobs of data in memory that look like a particular structure. One example is psscan for Linux, which provides us with very similar information to pslist. The only difference is that, since we are carving potentially terminated processes, we can see that some of these have a user of negative one, which would be invalid. However, this is associated with a process that has been terminated on Linux. This becomes very useful, because we don't want to miss out on processes or commands that were run on a system just because we took a memory sample at the wrong
time. Now, Linux being open source allows us to look at the memory allocation process directly, and we can create a targeted plugin to analyze Linux memory samples.

So if anyone has run this beautiful command, pslist_cache, you're presented with this glorious message. So what's going on here? SLUB is one of Linux's memory allocators. It handles the, quote unquote, smaller allocations done in Linux, and pslist_cache was written for SLUB's predecessor. So why is this important? Well, SLUB is a kind of niche allocation system that is seen in about 72 percent of recorded kernels. It's also a little bit of a newer allocation system, having been added as the default memory allocator for Linux around 2008 or 2009, which is about a decade ago.

So, really quickly, why it's important. I don't think I need to make too many points, but Linux is used somewhat widely. Twenty-five percent of software developer workstations, according to Stack Overflow, use Linux. Eighty percent of web servers run on Linux. Nearly all of the supercomputers run on Linux. And Linux serves as part of Android's boot loader. So being able to look at a Linux system, specifically any of the data structures allocated by the kernel, is really important.

So how does memory allocation in Linux happen? It starts with kmalloc. Anyone in here who has programmed in C has likely used the malloc function, where you ask, "Hey, can I have a certain number of bytes in memory?" and it gives you a pointer to that memory. The kernel has to handle its own memory as well, so it has a kmalloc function, where they just slapped the letter k in front of malloc and called it a day.

How it works internally, though, is that there are actually two subsystems that handle memory allocation in Linux. There is a large-request handler, which is known as the page or zone allocator, and a small-request allocator, which can be one of a number of allocators depending on the configuration of the kernel, and which is what we're looking at now.
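To make the split between the two subsystems concrete, here is a toy Python sketch of how a kmalloc-style request might be routed. It is only a model under stated assumptions: the name route_allocation, the size classes, and the two-page cutoff are illustrative choices, not values quoted in the talk or read from any particular kernel.

```python
# Toy model only: the size classes and the cutoff below are illustrative
# assumptions, not values quoted in the talk or read from a real kernel.
PAGE_SIZE = 4096
SMALL_REQUEST_LIMIT = 2 * PAGE_SIZE  # assumed cutoff between the two subsystems
SIZE_CLASSES = [8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192]


def route_allocation(nbytes):
    """Return (subsystem, granted_bytes) for a kmalloc-style request."""
    if nbytes <= 0:
        raise ValueError("request must be a positive number of bytes")
    if nbytes > SMALL_REQUEST_LIMIT:
        # Large requests go to the page (zone) allocator; this toy model
        # simply rounds the request up to a whole number of pages.
        pages = -(-nbytes // PAGE_SIZE)  # ceiling division
        return ("page allocator", pages * PAGE_SIZE)
    # Small requests are served from the smallest size class that fits.
    for size in SIZE_CLASSES:
        if nbytes <= size:
            return ("small-request allocator", size)
    raise AssertionError("unreachable: SIZE_CLASSES covers the small range")
```

In this model, route_allocation(100) lands in the 128-byte class of the small-request allocator, while route_allocation(8193) is handed to the page allocator. The point for forensics is the same one the talk makes: an object that is small by the kernel's standards lives in memory managed by the small-request allocator.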
So the original small-request allocator for Linux was SLAB. Whenever Linux asks for memory, it groups all the objects allocated by the kernel based on the object type, so all of the process structures are put together, all the network structures are put together. It organizes things very nicely. It tracks every single lowercase slab directly, so from a memory forensics standpoint, whenever we're looking for a slab we can just go through the structures for the capital SLAB, pull out everything, and have a very nice, organized list.

There is also SLOB, which was originally an alternative to SLAB for memory-constrained systems, oftentimes embedded or IoT devices. SLOB internally functions just as a basic heap, essentially.

Then, in Linux kernel 2.6.23, SLUB replaced SLAB. The reason for doing this was a number of internal optimizations, but from a forensics standpoint two key points changed. First, SLUB no longer just groups objects based on their type. It creates caches based on the object's size. So, for example, task structures wouldn't necessarily be grouped only with other task structures, but rather with objects of a similar size to task structures. Second, SLUB only tracks partially allocated slabs, which are slabs that still have space for more objects to be put on them. So we're not given these clean lists anymore, where we can just iterate through all the lists and pull out every slab in memory.

So, digging in a little further, what exactly does a slab look like? It's just a preallocated chunk of contiguous memory, and it aligns objects on a size boundary, which is set by the cache that the slab is allocated from. Very commonly these slabs are one, two, four, or eight pages, although it can vary depending on system optimizations. And at the end of a slab is a free-list pointer, which simply points to the next available space that an object can be stuck in.

So, a brief high-level overview of the SLUB internals. As I've been
mentioning previously, there are caches, and each of these caches is responsible for allocating objects of both a certain size and a certain allocation type. The most common types are any old regular memory allocation and any memory allocation that needs to occur with direct memory access available. All these caches are linked together in memory by a list, so once the first cache is found, the rest of the caches can be enumerated. And as I mentioned before, in the SLUB internals there is a partial list of slabs that are not fully allocated. Then, if SLUB debug is enabled, there is also the list of fully allocated slabs. However, this cannot be relied on to always be there.

So how do we get objects out of slabs? If we're given a slab, we can start at the beginning of it and carve an object every kmem_cache size bytes. Once we step that number of bytes forward, we're at the start of the next object, so we can simply walk along the slab in these steps and pull out every object. And in the case of SLUB, where there may be mixed object types, we can validate that these objects are what we're expecting them to be.

Now, because SLUB does not always have all the lists available, we wanted to go a little bit further, have a little bit more fun, and assume that we didn't have access to any of the cache lists, neither the partial nor the full list. So how do we go about extracting objects in this scenario? Well, from other Volatility plugins we have access to other kernel lists that are maintained by Linux. We know that SLUB handles all the small allocations done on a system. So if we have an object that is small by Linux's standards, and we have a pointer to it from walking these other kernel lists, we know where the object resides in memory. By having that object, we have an address somewhere inside of a slab. And from the cache that the slab would be allocated from, we know how far a slab could extend in both directions, so we can just walk both up and
down in memory to pull out objects. Now, I'm not sure how well this is visible in the back, but this method will generate a lot of junk, since we are going past the bounds of the slab in both directions. But we're at least guaranteed to get every single object out of a slab.

So, to test this, we set up virtual machines to take memory samples from, because whenever you suspend, pause, or create a snapshot of a virtual machine, you create a copy of the memory of the running virtual machine. This gives us a memory file to work from. Inside these virtual machines we ran various command-line usage and an artifact-generating program, which just created some sockets, then closed and freed them. That gives us something to look for in memory that we know the appearance of.

Some of the goals we wanted to achieve were creating a new plugin; updating an old plugin, slabinfo, to accommodate SLUB systems, since slabinfo had also not been updated for SLUB systems; and recovering any process or socket information that we can.

So this is the first part of the demo, and hopefully all will go well. All right. Oh, I cannot read that at all. Sorry, folks, a little technical difficulty. All right, can everyone read that? I know the glare is not the best. Not at all? All right, so I may unfortunately have to resort to using the backup slides on this. Wow. Oh, damn, that's Windows. You've got to love when Windows decides to revert its settings because you didn't hit a button. All right, well, we'll just have to wing it.

So unfortunately the demo didn't work, but this is a snapshot of what you would have seen live. First we had to update the old slabinfo plugin to give us valid information for the caches. On a live Linux system this is as simple as reading from the /proc/slabinfo file, which prints out all the metadata information for
the caches on Linux. Then we can validate that our plugin is working, because the parts in the top match the parts in the bottom.

So, moving on to a little more interesting stuff: how does this actually work for carving out the objects that we're looking for? When we run netstat on the system, we pull out a number of sockets with ports ranging from 1080 to about 1084 or 1094, and there are some holes in the middle, which are the sockets that the program decided to close and free. Then, whenever we go back with our carving approach, we're able to pull out three of the sockets that had been created on the system. Likewise, we can use this approach to carve out some of the processes that had run and stopped, which were just some of the commands that had been run previously. Additionally, Volatility by default hides the swapper processes from the pslist command, so it's a little bit of a sanity check that we're able to find them with carving.

So that was a little experiment. Now, how does this actually fare against real-world malware? A little bit of background: BPFDoor was a targeted and incredibly invasive piece of malware. Whenever it ran on a Linux system, it wrote itself directly into memory, killed the original process, and cleared out all of the related data for the previous process and the newly running process. It used a Berkeley Packet Filter to sniff traffic, which allowed it to read any traffic approaching the system before it hit the firewall, since these packet filters sit below the firewall. Any packet that had the magic value in it would get passed to this program, which would basically check whether it was one of three commands: one being a reverse shell, one being a reverse connect or connect-back shell, and the third just a heartbeat that the program is still running. So through traditional means this is very hard to detect on systems. However, normally through memory
forensics we can run the pslist command, which shows us one of the aliases the BPFDoor malware uses, one being /usr/lib/system. That looks suspicious, since normally the name of a process is just the name of the program, not the actual path to it. And traditionally we can look at the socket filters and find the BPF filter set up by the process. This would look suspicious because it is an AF_PACKET socket with a sock type of raw, so it's just reading the raw network data that's coming in. That should trigger some alarms, since usually only system-level software should be running these packet filters.

Now, one of the other samples that we ran was BPFDoor again, only with additional rootkit functionality to hide itself from the kernel lists. In the listing shown here, if you are able to read it, you'll see that there are no suspicious processes running. Likewise, we don't find the packet filter set up by the malware, because it has removed itself from Linux's process list. Since the socket filter plugin relies on the Linux process list to pull out any of the socket information, the filter is hidden there as well. However, with a carving approach we are ultimately able to find the malicious process running on the system, thus busting it.

So, overall, recovering freed or hidden objects is important. Freed objects provide additional context about the system and give a little bit more resiliency with respect to when you take a memory sample. Likewise, any objects that have been hidden through malicious methods obviously raise concerns, and you would want to find those through memory forensics. This approach is also still useful even if a traditional full cache list extraction is available. For one, going through that process involves some back-end calculations that will slow it down. And this still provides a solution for when SLUB debug is not enabled, which would hide some of the slabs from traditional means.

So, just a quick summary: the work we did
developed a method for extracting objects from the SLUB allocator on Linux systems, which allows Volatility to extract hidden data from memory samples, and created a new plugin to use, as well as updating an old plugin to accommodate the new system.

And quickly, some of the future work for this would obviously be updating the plugin for Volatility 3. Volatility is currently in between versions 2 and 3, so a lot of development is done for 2 but will ultimately be updated for 3. We also want to add some additional polish and filtering to the plugin so it doesn't produce as much noise in the output. And finally, Linux is adding a new way of representing memory in newer kernel versions, which will definitely impact memory forensics.

Thank you. Is there anyone with questions?

(Audience question, partly inaudible) ...scrub memory that you free, but in that case does that potentially limit your ability to detect some of these hits that might only manifest...

All right, I just want to make sure I heard everything right. So the question was, how does this work in situations where a malicious process may try to scrub itself out of memory after it runs or as it unloads? That would defeat this approach. However, techniques like that still have the possibility of creating other artifacts on the system that would indicate that a process tried to scrub itself out of memory as it was leaving, which would still flag it as suspicious. It may hide certain functionality, depending on what it's trying to scrub, but it would still ultimately create more artifacts that would point back to it as being suspicious.

(Audience question, partly inaudible) ...that you're going through and basically saying... You showed examples of places where you knew the types of the lists in the walking process, right? Do you think your technique could be used to just find blocks that seem to have similar characteristics?

So, right, the question is, could this approach be used to go through memory
and categorize different re