Our Docker app got hacked. Now what?

Name: Our Docker app got hacked. Now what?
Uploaded: 2018-10-27
Duration: 55 min 13 s

BSides RDU · 2018 · 55:13 · 63 views · Published 2018-10

Speakers

Joel Lathrop

Tags

Category: Technical

Topic: DevSecOps

Style: Talk

About this talk

Abstract

Someone deployed their application as a Docker container. Then another someone came along and hacked it. Now everyone is looking at you asking, "How did this happen? What did the attacker do? How do we stop this from happening again!?" If this were a normal physical server or VM, it'd be no problem: you'd just crack open traditional forensic tools and start building a forensic history from the disk image. But this is a Docker container... and your tools don't know what Docker containers are. So now what?

In this talk, we'll go over what a Docker container looks like from a forensic viewpoint. We'll dig into how you can get access to the underlying disk/filesystem data of a compromised container and which existing forensic tools that you may already use can still apply. We'll also cover what new forensic opportunities Docker provides and new metadata that can be extracted that wouldn't be available on a conventional system. When we're done, you'll know how to tear apart a Docker container and get all those people turning to you the answers they need.

About Joel Lathrop

Joel Lathrop spent his childhood developing computer software, a path which eventually led him into the field of cybersecurity. Beginning with work in developing distributed systems for ensuring privacy and anonymity, he picked up an interest in cryptography which led to an M.S. focused on cryptanalysis. Delving deeper into the subfield of threat intelligence, Joel has applied his occasionally unorthodox approach to reverse engineering and forensic analysis toward research into topics such as malware counter-exploitation, malware obfuscation evolution, and botnet neutralization. In what spare time he has, he enjoys keeping up with advances in programming language theory, cryptography, and distributed systems design as well as attending the occasional opera.

Transcript [en]

I had a real question: you give out the ID? All right. It's about to be... all right, one more question, from WarGames. David discovers the password, gaining access to the WOPR account. The computer asks David if he would like to play a game. What game does David prefer to play? [Music] I heard it first over here. I know, it's the easy one; I'll get some more difficult ones.

All right, so I'm going to introduce our next speaker. This is our last talk before lunch. The speaker's topic is, as you can see, "Our Docker app got hacked. Now what?", which I cannot understand, because Docker is super secure. I mean, it's completely secure. Joel has his master's in cryptanalysis. He also dabbles in reverse

engineering, forensic analysis, malware counter-exploitation, malware obfuscation evolution, and botnet neutralization. Welcome, Joel.

So, as you can see from the slide, this is basically going to be about what you do after someone's Docker app has been hacked, because, as we all know, they're not always the most secure. They're better than some things, but not always the most. So right off the bat, that is a link to all the slides I'm about to go through. The reason I'm putting it up there first is that there are going to be a lot of terminal-session screenshots, a lot of text. I've tried to make it as large and legible as possible, but I realize now that people wouldn't necessarily be

able to see it well, so if you want to just download it right now on your phone, you can. I'll leave that up there for a second. While it's up there, I'll give you a little bit of the story of why I'm giving this talk. Basically, something along the lines of the title happened. I run a small cybersecurity company that, among other things, does forensics, and one particular day one of our partners reached out to us and said, "Hey, we've got someone who was running a bunch of apps on a server, and it got compromised, and now they need to know what happened. You know this Docker stuff; can you guys take care of it?" Okay.

So one fine summer day I found myself staring at a host image that had a whole bunch of Docker containers on it, and I had to figure out: how am I going to forensically pull out the information on what went on, and figure out what the story was here? And of course the first thing you do is sit down and say, "Okay, before I reinvent a bunch of wheels, let's go see if someone else has made some tools." So you google it, and Google came back with nothing. By the time I was done, I'd spent a reasonable amount of time in the Docker source code sorting through how all this stuff worked, and when the BSides

proposals came around, I thought, "You know, maybe I can save someone else the trouble of having to do this the hard way like I did."

So the goals for this talk are basically going to be: if you find yourself in a position where someone's Docker app got hacked and everyone's looking at you and saying "what happened?", how can you pull out the forensic information necessary to answer that question? We'll cover what it's like for a live system, which is the easy case; that's one slide. Then we're going to spend a lot of time on what to do when you've basically been handed a cold hard disk: "Yeah, there were some Docker apps on this; tell us what happened." Not quite as simple.

Specifically, Docker has a lot of different storage backends, which is how it stores the disk state for containers and images and whatnot. We're only going to cover two of them, but they're the two you're most likely to run into: the overlay2 backend, which is the default one that they're pushing everyone towards, and the device mapper backend, which is the one you're going to run into on certain versions of Red Hat Enterprise Linux or CentOS. There are others, but if you've wound up in this position, you're probably going to be dealing with one of those two. So, to

start off, just to review: what's a Docker app? A Docker app, or a running Docker container, is really nothing more, at least on a Linux system, which is what we'll be looking at, than an ordinary Linux application that's running, with two caveats. One is that there's that nice little box around it, which is basically the Docker daemon using a bunch of Linux kernel primitives to say, "Okay, we're going to try to make you think you're in your own world." As Dan Walsh famously said, containers don't contain, so that is a little bit leaky, but it's pretty good. The second part that makes a Docker container a Docker container is that this application that's running

gets its own dedicated view of what it thinks the file system is. What's of special interest to us in this case is that the way Docker does that is by stacking layers on top of each other. So instead of just saying, "Here is a flat directory with your filesystem; we'll just chroot into it and off you go," they maintain things in a bunch of layers, partially so that they can have a base image they can reuse many times, with all the writes that individual containers do being specific to those containers, and partially due to some of the development-oriented priorities they had when they originally designed this stuff.
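
The layering just described can be pictured with a toy model. The sketch below is only an illustration and is not how Docker is implemented: each layer is a plain Python dict, and the names `base_image` and `start_container` are made up for this example. It shows two containers sharing one read-only base while each keeps its own writes.

```python
from collections import ChainMap

# Toy model only: a "layer" is a dict mapping a path to file content.
# Real storage backends work on directories or block devices instead.
base_image = {"/etc/os-release": "alpine", "/bin/sh": "busybox"}


def start_container(image_layers):
    """Return a filesystem view whose writes land in a fresh top layer only."""
    write_layer = {}  # this container's private read/write layer
    # ChainMap searches its maps left to right on reads and
    # applies every write to the first map.
    return ChainMap(write_layer, *image_layers)


web = start_container([base_image])
db = start_container([base_image])

web["/tmp/dropped.py"] = "attacker script"

print("/tmp/dropped.py" in web)  # True: visible to the container that wrote it
print("/tmp/dropped.py" in db)   # False: the other container never sees it
print(web.maps[0])               # only what this container changed
print(base_image)                # the shared base image is untouched
```

The forensic payoff is the third print: the top layer on its own is the list of changes a container made.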

From our perspective, for a live system, we're going to be dealing with that entire picture, so what can we grab? For a cold system, we're just going to have those little boxes underneath the foo process, which are the different disk layers that make up the foo process's view of a filesystem.

So, live capture: how are we going to do that? Thankfully, Docker makes this fairly simple. We can actually do live capture almost entirely with plain docker commands. It turns out that when you run a container, almost all of the relevant metadata can be dumped out with the docker inspect command. That's going to include what image (what filesystem layers you start with) the container was started on, what they ran, what directories from the host file system or somewhere else they mounted into the container, what kind of privileges they gave it, what the environment variables were, the whole nine yards. So that's one thing you'll definitely want to grab if you're dealing with a live system that you can control.

The other nice thing is that Docker also logs all of the output, standard output and standard error, from any container, and if it's got a terminal attached to it, you'll get the input too. So the docker logs command will give you a nice dump of that. That can be

especially useful in situations where someone has compromised a host and then promptly turned around and run more Docker containers of their own, or maybe that's how they got into the host. Oftentimes they'll run a perfectly benign container, like just a base image container, like the Ubuntu image, but they'll have a terminal attached to it. Then they'll pump in a couple of commands saying, "Go download Python; here's this teeny little Python script that's basically just going to set up port forwarding," and off we go. In that case, if you were to go look at what was on disk, you might not find a whole lot, but being able to

look at the terminal I/O can be incredibly illuminating.

Then, in terms of how we actually get what's on disk for a given container while it's running, this is where the design of Docker, and their mindset of making it a development packaging system more than a secure containing system, comes in handy. The way images are built is by running containers that do individual commands; each time a container is done, you take that little ephemeral read/write layer and you commit it, and it's just a new layer built into the image. So when you're done, you've got this image composed of a whole bunch of little layers, each representing the changes from running a

specific command. Because that's how images are created, it's very easy for us to capture the disk state of a running container: we just commit it. What we'll get is a new image that is just like the base image it had, but with one new layer on top, which is exclusively the changes the container made. Then we can use docker save, and what that's going to give us back is a tarball of all of those layers. This is especially useful because each layer comes back as the differences it had from the one below. So you can look at the top layer you have in that tarball, and it's going to be everything the container changed or added, as opposed to what it started running with. This is oftentimes what we're searching for when we're doing forensic analysis of a disk drive anyway; in this case it just got nicely wrapped up in a bow and handed to us.

Then, finally, if the container is running, the other thing we can do is actually grab process memory. Because a running Docker container is really just an ordinary Linux application, with these constraints from the kernel slapped on it and its own little view of a file system, from the host's viewpoint a process inside the Docker container is also a process

on the host. That's handy for us. As long as you're one of those processes inside the container, you're still constrained by the kernel restrictions that are in place. But if you're outside, on the host, and I want to kill a process inside a container and I'm the superuser, I can just kill it; it's just another process. So in this case the docker top command basically just lists all the processes that are running inside this container, or, put another way, all the running processes that we've slapped these kernel namespace and cgroup constraints on. Once

you've got the process IDs and you're running at the host level, you can just say, "Okay, these are processes; go get me the memory." You can use a tool like gcore, or whatever your preference is, and now you have memory images. So if you have a system that's already been compromised and you suspect you're going to want some forensic artifacts out of it, it's going to be an awful lot easier to grab them this way than to pull the plug and do it disk-wise. We're going to go into what to do when someone has pulled the plug and given you the disk, and sometimes in emergency situations you don't want to spend the time; you just pull

the plug. But if you have the luxury of dealing with a system you control that is still running, with potentially compromised containers on it, just grab them this way.

So, cold capture. This is the situation where someone has walked into your office, handed you a hard disk, and said, "Hey, this thing was a host that had a bunch of Docker containers on it, and something went wrong. Tell us what happened." At this point everything on the previous slide is useless. We know that most of the information we could easily get at with docker commands is there, but we're going to have to go at it the hard way. Depending upon what your tolerance is for potential

alterations to an image that you're doing analysis on, you might be able to just load this thing up, stick your own little copy-on-write layer on it, boot it, run Docker, and grab some of the things that way. In a lot of cases, especially when you have stricter evidence requirements, you probably can't do that. You're going to have to deal with a purely read-only disk and pull out what you can from it. So in that case, if we're staring at a hard drive and we've got to pull out information about the containers, where are the goodies? At /var/lib/docker. That is, by default, the directory on a Linux system where Docker is going to store everything that we're interested in, and

coincidentally, all of the relative paths on the slides from this point on are going to be relative to this location. So when the next slide has a path right at the top, that's going to be relative to /var/lib/docker.

So, first thing: there's some metadata we can go grab that's fairly easy to get at. One example is which Docker images were installed on this thing. In /var/lib/docker/image/, followed by the name of whatever storage backend they're using, there is a file called repositories.json, and it basically just lists all the images that were pulled down onto this host. So right off the bat you can basically presume

any container that was running probably used one of these as its base. If it was a compromised host, it might even be interesting to know what images were there even if none of the containers you could find when the plug was pulled appeared to be using them. It may have been a very short-lived container someone used to gain lateral movement or some kind of privilege, so looking to see what images are there can be very useful. Another note, for the sake of clarity: you can see there's a bunch of SHA-256 hashes, but I've just trimmed them all, made them very short, and stuck ellipses after them. I'm going to keep doing that all through the presentation.
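
As a rough illustration of reading that repositories.json file offline, here is a minimal Python sketch. It assumes the file has a top-level "Repositories" object mapping each repository name to a set of references and their "sha256:..." image IDs; that layout, the helper names `list_pulled_images` and `short_id`, and the sample data are all assumptions made for this example, so check them against the file on your own evidence image.

```python
import json


def list_pulled_images(repositories_json_text):
    """Return sorted (reference, image_id) pairs from repositories.json text.

    Assumes the layout {"Repositories": {repo: {reference: "sha256:<hex>"}}}.
    """
    data = json.loads(repositories_json_text)
    pairs = []
    for references in data.get("Repositories", {}).values():
        for reference, image_id in references.items():
            pairs.append((reference, image_id))
    return sorted(pairs)


def short_id(image_id, length=6):
    """Trim a "sha256:<hex>" ID for display, the way the slides do."""
    hex_part = image_id.split(":", 1)[-1]
    return hex_part[:length] + "..."


# Made-up sample data. A real file would be read from
# <docker root>/image/<backend>/repositories.json on the evidence image.
sample = json.dumps({
    "Repositories": {
        "alpine": {"alpine:latest": "sha256:" + "ab12cd" + "0" * 58},
        "fruit": {"fruit:latest": "sha256:" + "34ef56" + "1" * 58},
    }
})

for reference, image_id in list_pulled_images(sample):
    print(reference, short_id(image_id))
```

Keep the full hashes in your notes; the shortened form is only for readable output.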

You're going to see lots of teeny little six-character SHA-256 hashes; that's just me truncating them so they fit on the slide and it's actually readable in the back, hopefully.

So there's one place. Another thing: all right, we've looked at all of the images that were pulled down onto this host; what about information for a given image? Well, once again relative to /var/lib/docker, there's going to be a directory image/<backend>, where <backend> is the name of the storage backend. In this case we're going to be looking at overlay2 or devicemapper. So if the Docker daemon that was running on this host used overlay2, it's going to be

image/overlay2/imagedb/content/sha256/ followed by the image ID. The image ID is just that long SHA-256 hash; in fact, an ID for pretty much anything in Docker is a SHA-256 hash. That file is a really big JSON file, and it contains pretty much all the pertinent metadata you might be interested in on this image: for example, the image's name; the default entry point script (if you just say docker run with the image name, what runs is going to be the entry point script); the default arguments for it; and the build history. This one's interesting. It's not quite as legible as the original Dockerfile the image was

created with, because Docker has renamed a bunch of stuff internally, but you can still see all the basic commands that were run to create this image, when the thing was made, and a lot of other metadata as well.

Then, what we're really interested in a lot of the time is what containers were running and what was going on with them. So, once again relative to /var/lib/docker, there's a directory called containers, and it's going to have a whole bunch of child directories in it. Each of those is named with a SHA-256 hash, which is the container ID, and inside those directories there are going to be a bunch of files, but the one with the really useful

information is this config.v2.json file. That's going to be more or less equivalent to what you would get if you'd run docker inspect while the container was running. The other important thing to note is that containers aren't always deleted when they stop; by default they're left there. So what you can expect to find in that directory is all the containers someone ran at some point and didn't get rid of, plus all the containers that were running when someone pulled the plug on this host. And there's a lot of useful information to find there: obviously things like the name of the container; what was the ID of

the image it was based off of; the driver, which is what storage backend was used; and, especially interesting when you're trying to build a timeline of what happened, the time the container started running and the time it finished running. Then path and args: just because an image has a given entry point and arguments that run by default does not mean that's what you have to use when you run a container from that image. You can specify something different, but whatever it is, it's going to be recorded in this container configuration JSON file. Also there are all of the environment variables that were passed into this container, and the mount points. This can be especially interesting if you're looking

and you're seeing a container that has mounted the root filesystem, just slash, into the inside of the container itself. That's highly suspicious. That's probably the kind of thing you're going to find in a container that does something like dump the password file from /etc, or add something to a crontab, or find some other way to gain persistence on the host. Obviously there are other things, like permission settings, that could be added to give a container excessive permissions to access the host. There's a flag you can use in docker run which, of course, now that I need to say it, I immediately forget. It starts with a p. I'm blanking. Anyway, there's a

flag you can pass to docker run that basically says, "Yeah, all that nice little box we put around your container? Throw that out. Let this thing pretty much have access to the entire system if it wants it." At that point, containerization is not really containing anything; it's being used more as a way to deploy a software package so that it has its own file system and isn't messing with other stuff. That's another thing to be looking for.

The other interesting thing: back when we were looking at live forensic capture, there was that docker logs command. We know that could be very useful, especially if someone's running what seems like a benign container but

then issuing a bunch of commands to it over a shell session on a TTY connected to the container. How do you grab that? Well, docker logs is just pulling that out of whatever log storage system Docker was using. Oftentimes the default for log storage is the JSON backend, which basically just dumps another file in that same directory, named with the container ID followed by -json.log, which is going to have all of those logs. If logging is being done with something else but still stored as a file, the LogPath value in the config is going to tell you where you can go find those logs, which once again can be very illuminating. I believe for the JSON backend you also get

timestamps for when individual input and output was done, which can also be very useful if you're trying to build a timeline and someone's done something like run a base Linux container and issue a bunch of commands to try to gain persistence somehow.

So that's all fine and good for metadata, but what do we do about disk content? This is where things get a lot more interesting. You're staring at that disk someone's handed you in your office, and you want to find out: what did the file systems of these running containers look like? What were the changes these containers made relative to the base image? So this is another picture

taken from Docker's documentation of roughly what a Docker container's file system storage is going to look like. Docker basically has all the different layers for an image, and those are read-only. On top of that they stick another layer, which is read/write. Any time the container wants to write something, it gets written into that thin read/write layer. Any time it reads something, it starts at the top and asks, "Is that file or block there? Yes? Great, pull it out. If not, keep going down." So you basically have this stack of layers of file system or disk data, where each layer above trumps the layers below, and each layer basically contains changes on the

layers below. The tricky part is how all this is going to get stored. To consider that, let's have a small motivating example. Here is a really small, simple Docker container. The Dockerfile, which basically creates the image of what our container is going to look like, does only a few things. One, it says we're going to base it off of the Alpine Linux base image, so we're probably going to pull that image down from one of their repositories and then start running these commands to build the image for our container. Then, since in this example we're basically simulating a fruit basket, we're going to add a lemons file and a pears file to the fruit directory. Then we're going to add

an entry point script to run when this image is actually invoked as a container, and say, "Hey, look, this is the entry point script; go run it." The entry point script will run when the container is executed. It's basically going to add apples to the fruit basket, change our note on lemons, and add a durian to the fruit basket. Then, for the sake of this example, we make sure all of those changes are synchronized to disk and nothing just stays in memory. Then we promptly throw the durian out of the fruit basket, because it looked interesting but we couldn't handle the smell, and because we turned our back and the

pears instantly over-ripened on us, we get rid of those too. With this motivating example, we'll then look at the results we get for the overlay2 storage backend and the device mapper backend, and how we extract them.

But before we get to that, what is it we're expecting to find? At minimum, we expect to find these three different layers stored somewhere, somehow. At the bottom, we expect that Alpine Linux base image that we're building everything on. Somewhere in the middle of the stack, or at the top of our image, because remember it's going to make multiple layers trying to build that thing, but when it's done building

the image, we expect to find a fruit directory that contains only lemons and pears. Then, when we run the container, that thin read/write layer is going to be written to disk, and there we expect to find apples and lemons but no pears, and no remaining file for a durian. But maybe, if we're lucky, we'll be able to find evidence that the durian existed at some point. So this is what we're expecting to find as we go into this.

So, first storage backend: overlay2. This is the default insofar as Docker can push it. This is what you're going to run into most of the time. In fact, I think right

now, for almost all Linux distributions except some versions of Red Hat Linux and CentOS, this is what will run by default, unless you happen to have a host file system like btrfs or ZFS, in which case it's going to use that. It's based on a Linux kernel module called OverlayFS. The way it works is that each layer is stored as a directory, and what that directory contains is just the files and directories that are different from the previous layer. So if you've added a new file, it's going to add the new file to that diff directory. If you change an existing file, it's going to add your changed version of that

file to the diff structure for that layer. If you remove a file or directory, it's going to find some way to add something kind of like a tombstone (and they've got different ways of doing that) to show that that particular file or directory has been removed as compared to the previous layer. What the OverlayFS kernel module does, then, is that if you give it a bunch of these diff directories in a mount command, it can layer them all together, just like in that picture, so that what you've mounted looks like a nice flat file system. Anything you write gets written into the top layer, and only the top layer, but all the reads are going to

check each layer from the top down until they find something that matches, and then hand it back. In this particular case, when Docker's using that, it means all of these OverlayFS layers are going to be stored on the host's file system, and that's going to wind up being interesting later on.

But of course the first thing we have to do is figure out how we're going to find these layers. That's maybe not quite as easy as we'd like, but it's not all that horribly painful. So here's a directory tree. Basically we start at /var/lib/docker, and there are two child directories we'll be looking at.
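
Before walking that tree, the top-down read rule described above can be sketched in a few lines of Python. This is a simplified illustration, not the kernel's implementation: the helper names `find_providing_layer` and `put` are invented for the example, the three directories are throwaway stand-ins for layer directories, and removal markers are ignored entirely.

```python
import tempfile
from pathlib import Path


def find_providing_layer(layer_dirs, relative_path):
    """Return the first layer directory, top-most first, that holds the path.

    Removal markers ("tombstones") are deliberately not handled, so a file
    deleted in an upper layer would still be reported from a lower one.
    """
    for layer in layer_dirs:
        candidate = Path(layer) / relative_path
        if candidate.exists() or candidate.is_symlink():
            return Path(layer)
    return None


def put(layer, relative_path, text):
    """Create a small file inside a stand-in layer directory."""
    target = layer / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text)


# Three throwaway directories standing in for per-layer diff directories.
root = Path(tempfile.mkdtemp())
base = root / "base"
image_top = root / "image_top"
container_rw = root / "container_rw"

put(base, "etc/os-release", "base image file\n")
put(image_top, "fruit/lemons", "lemons, as built\n")
put(container_rw, "fruit/lemons", "lemons, changed at run time\n")
put(container_rw, "fruit/apples", "apples\n")

stack = [container_rw, image_top, base]  # top of the stack first
for name in ("fruit/lemons", "fruit/apples", "etc/os-release", "fruit/durian"):
    hit = find_providing_layer(stack, name)
    print(name, "->", hit.name if hit else "not in any layer")
```

Knowing which layer supplies a file tells you whether it came with the image or was written by the container at run time.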

Those two are image and overlay2: image is on the left, overlay2 is on the right. We've already seen other paths that started with image, then some kind of backend, then something else. Here the backend is overlay2, so we're looking at image/overlay2/layerdb. That layerdb directory is basically the way Docker keeps track of all the layers in all of the images and containers. For containers, there's this directory called mounts, and inside it will be, once again, a whole bunch of SHA-256 hashes that are the container IDs for the containers that were running or had run at some time. There'll be a bunch

of files in there, but the two we're particularly interested in are these. One is called mount-id; that's going to contain a SHA-256 hash, and if we go look in that overlay2 directory under /var/lib/docker, we'll find a directory whose name is that hash. Then, once again in that layerdb/mounts/<container ID> directory, there's going to be a parent file. That parent file will contain another SHA-256 hash, and it points to the next layer. But since you're starting from a container, the next layer is going to be an image layer. The image layers aren't in the mounts directory; they're in a sha256 directory. Why, I don't know. There would

have been no collisions if they had used SHA-256 hashes for all of it, but this is the way they wanted to do it. So it's a similar kind of structure: instead of a mount-id you have a cache-id file that points over to a particular overlay2 directory, and then a parent file will point you to the next layer. To find all the layers in a given container and its image, you just start with the container, look for the parent file, and move on down and down until you finally reach a directory that doesn't have a parent file, and that one's the base.

So here, basically, I've cut out the other layers that we're not

interested in, and we just have the container layer, that image layer, and then the Alpine Linux base layer. Over in the overlay2 directory you'll have all of those SHA-256 directories, and inside each of them a directory called diff. That diff directory is the overlay2, or rather OverlayFS, difference directory. So, given a container that looks like this, we'll make note that the overlay2 storage hash for the container read/write layer starts with a235. If we want to say, all right, that's fine and dandy, I found all the diff structures, but I want to see what the container was looking at when it was

running, then we have to mount the thing. Thankfully that's not too terribly painful. You make a directory you want to mount it in, then you say mount: the type will be overlay, the source is overlay, and then, because we don't care about writing and we're not going to try to modify it, we can just say the options are going to be read-only. Instead of trying to give it a merged directory and an upper directory, let's just say, "Here's a whole bunch of lower directories; we're only going to be reading, so all you have to do is stack these together." We're not going to worry about writes right now. And if we look back one slide, you'll see, if

you look at the three trimmed hashes on the right side, that here they are again in that lowerdir option, listed in the same order: the first one is the container read/write layer, and then you go through the image layers from the top down. You mount it, and if we go look inside that mounted directory, lo and behold, there are apples and lemons, which is what we expected when the container's running. Apples are there; pears are missing. So that's nice: we can now take a look at what the container itself was looking at before someone pulled the plug. But let's say we want to look at what the container would have had at the

moment it was started: the top image layer. Well, that's actually pretty simple. In fact, the next slide is going to be almost exactly the same as this one, with about one change in the mount. Here we go: instead of three there are now two. We just took that very top a235 diff directory, which was the container read/write layer, and left it off. Now it's just the top image layer all the way down. And if we look inside that mounted directory, what do we find? Well, we find lemons and pears, which is good. That's what we should find; that's what the image was set up with. So that means we can now mount both what the

container viewed and what the container started with, and look at them side by side. But a lot of times when you're doing that, what you really want to know is the difference between the two. In that case OverlayFS has actually made it kind of easy for us: because all these diff trees are literally just the differences between layers, we can go look at the diff without mounting anything. So here is the diff directory for the fruit subdirectory in the container read/write layer. Apples is there, lemons is there, and pears is a character device with a major and minor number of 0. That's basically the way the kernel OverlayFS module denotes that a file has

been removed: they just make these 0,0 character devices. And for directories, the way they mark things as removed is that they'll just set an extended attribute on them. So, all right: we've figured out how to mount the container read/write layer, we've figured out how to mount the image layer, and we've realized that a lot of times we'll get exactly what we want just by looking at this diff directory without doing anything at all.

What about that durian? Remember, the container entry point script creates a file called durian and then, after it syncs changes to disk, promptly turns around and deletes it. Sometimes, especially if you're dealing with a container that was created by a malicious actor, they're

probably going to pull some things down to disk, make use of them in memory, and then promptly delete them from disk. We're not going to find that in a lower image layer, and we're not going to find that on the file system for the container layer. Is there some way we can still get at it? Well, if this were a normal host, you could go about using normal forensic tools to try to see: okay, the file may have been deleted, but is the inode for it still lying around somewhere? Remember, back when we introduced the overlay2 backend, we noted that since these are just directories of differences, everything's on the host file system. In the event that something like this

happens, and a container's been compromised or a host with containers has been compromised, there's still very good merit in grabbing the entire disk image for the host. Because in this particular case, if you've done that, you can actually go find the durian. In this case I'm just using two commands from The Sleuth Kit. One where I'm saying: okay, here's my host file system device, /dev/sda1, and here's the full path to where the fruit directory in the overlay2 diff for the container read/write layer is; what's the inode for that? Great, now it gives me the inode. Then I use another command and say: okay, here's the device, here's this inode, list

the files in it, and by the way, let me know if there happens to be anything still dangling around that got deleted. And lo and behold, there's the durian, still there. So since overlay2 operates on a filesystem level (well, on a file level), in order to grab information like this we're still going to need a full forensic image of the underlying host disk. Now, the nice thing about all this is that a couple of months ago the Google search for "what do you do with Docker storage forensics" got slightly more useful. A guy named Romain Gayon over at Google created a nice little Python program that will basically automate mounting overlay2 and the

older overlay and AUFS storage backends. It's called docker-explorer; you can find it in Google's GitHub repo there. So if you ever find yourself in a position where all you really want to do is just mount the things, this might be an excellent place to start. So this brings us to the second and slightly more interesting backend, which is device mapper. This is the one that you're probably not going to run into unless you're handed a host image that was either Red Hat Enterprise Linux or CentOS, up to, I think, the very most recent version of Red Hat Enterprise Linux. The way this one works is by using the Linux kernel device mapper system

and the Linux volume management system, or logical volume management system. Basically what Docker does is it says: okay, we're not going to store things in the host's file system. You're gonna give me an actual disk for this thing, we're gonna entrust that disk or disks to LVM, and I'm going to go make a volume on that disk that is going to be the base device. What they're gonna do on that base device is set up a filesystem (in a lot of cases I've seen, that filesystem is XFS), and they're gonna leave it completely blank. Then every time they need to make a new layer, they're going to snapshot that base device, and if they

wanna make a layer on that layer, they're gonna snapshot the first layer to get the second layer and have the new changes there, all the way up till they've built all the layers for an image. When they run a container, they're gonna snapshot the top device of the image and get a new device that will be used for storing all the writes from the container. So in this case, instead of dealing with files and directories, we're going to be dealing with block devices. And instead of finding the data in convenient places under /var/lib/docker, we're gonna have to go mess around with device mapper and thin pool snapshots to be able to get our hands on it. But

before we can do any of that, we still have the same problem: how do we find the layers? In this case, thankfully, it's more or less exactly the way it worked with OverlayFS, except we note here under /var/lib/docker/image, instead of saying overlay2, now it says devicemapper. But everything under that on the image side, on the left, remains the same: it's still the same layerdb directory structure, still the same parent pointers to each layer beneath, still the same mount-id and cache-id. What's different on the right is, instead of having an overlay2 directory and a bunch of diff directories underneath, we have a devicemapper directory with a metadata

subdirectory and a bunch of files that each contain metadata we're going to need to be able to get our hands on the virtual block device for a given layer. Before we get our hands on it, first we're gonna have to look at how we go about doing that. So the nomenclature here can be a little bit confusing. dmsetup create makes you think: okay, I'm about to create a new thin pool layer. No, we're creating the device for an existing thin pool volume. dmsetup is the command you generally use to interact with the device mapper backend. In this case we're going to create a new device; --table is the option you hand it to basically say, okay,

here's the information to go with. And then we're gonna have to give it a couple of things: a starting sector off of that base volume; the size, in sectors, that this virtual snapshot device extends; it's gonna be a thin snapshot, so we say thin; then for thin we have to give it the base pool device we're doing all of this off of, and the volume ID. All of these little layer snapshots are gonna have different volume IDs. So how are we gonna go about doing this? Well, to access one of these layers, first of all (note the asterisk) you actually have to have the Docker LVM volume activated. That's a little outside the

scope of this talk, but if you run into that and you have some difficulty, just google "forensics wiki LVM" and there's a very nice explanation of how to do it. What that will give us is a device called /dev/docker/thinpool, so that's the base device. That metadata file from the branch on the right that we saw a slide back, that's a JSON file. It's got a couple of fields in it; the two we're interested in are the device ID and the size, because that gives us the remaining information we need to hand to dmsetup. So we'll hand it dmsetup create, give it whatever name we want for this new device, then --table:

it's always going to start at zero, and then the size is the size from the metadata file divided by 512. That's because device mapper always uses sectors of 512 bytes; because Docker is recording the size in bytes and device mapper wants it in sectors, we have to divide by 512. Then thin, then /dev/docker/thinpool, and then the most important part for figuring out which layer we're getting, which is that device ID. Once you've done that, what you'll have is a regular, ordinary Linux block device, just like you would have for a physical hard drive, sitting under /dev/mapper with the name you gave it. And if you look at it with file, it'll say: hey look, this is an XFS

filesystem. So at this point, gaining access to the data is fairly mundane: make a directory to mount it on, and mount the device onto the directory you want, with one little caveat. If you try to do this twice at the same time, the second time is probably gonna fail unless you put this nouuid option in. The reason for that is, remember, they had that base device at the bottom, and they put a filesystem there that's empty, and from that point they're just snapshotting up. Well, the filesystem is gonna have a UUID built into it, and all the snapshotted filesystems are going to have the same UUID built into them. So when you

go try to mount the XFS filesystem once, it's like: great, yeah, this UUID's not in use, we're good. Then you try to mount a different one and it's like: no, no, you already mounted that. And you have to say: well, actually, we're kind of breaking your expectations for how universal this universally unique identifier is; it's actually different, trust me, just ignore that, it's okay. So if you leave that off and you run the mount a second time to try to get at a different layer, and you get this error that makes it sound like it's corrupt, it's probably not; just add that on. All right, so that's how we got at the container read/write layer.
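As a rough sketch of the steps just described, the following Python builds (but does not run) the dmsetup and mount commands for one layer. It assumes the metadata file is JSON with `device_id` and `size` fields, as described in the talk; the pool path, device name, mountpoint, and sample values are placeholders for illustration and not taken from the slides.

```python
import json
import shlex

SECTOR_BYTES = 512  # device mapper counts in 512-byte sectors


def thin_table(metadata_json, pool_device):
    """Build the dmsetup table line for one layer's thin snapshot."""
    meta = json.loads(metadata_json)
    # Docker records the size in bytes; device mapper wants sectors.
    sectors = meta["size"] // SECTOR_BYTES
    # Layout: <start sector> <length in sectors> thin <pool device> <device id>
    return f"0 {sectors} thin {pool_device} {meta['device_id']}"


def layer_commands(name, metadata_json, pool_device, mountpoint):
    """Return the commands to expose and mount one layer (not executed here)."""
    table = thin_table(metadata_json, pool_device)
    return [
        ["dmsetup", "create", name, "--table", table],
        ["mkdir", "-p", mountpoint],
        # nouuid: every snapshot shares the base filesystem's XFS UUID
        ["mount", "-o", "nouuid", f"/dev/mapper/{name}", mountpoint],
    ]


if __name__ == "__main__":
    # Hypothetical metadata contents and paths, for illustration only.
    example = '{"device_id": 7, "size": 10737418240}'
    for cmd in layer_commands("layer7", example, "/dev/docker/thinpool", "/mnt/layer7"):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them is a deliberate choice for this sketch: on a real case you would probably want to review each command before running it as root against evidence.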

Sure enough, we've got apples and lemons. So what can we do from here? Well, what we have at this point are these thin pool devices. We can get one for each layer; the process is exactly the same whether it's the container read/write layer or one of the image layers. Just go use the layerdb chain, figure out which metadata file it is, pull out the device ID and size, and off you go. But the really cool thing is that this more or less acts like what we have are these little virtual hard disks. So instead of having to monkey around with doing things differently, if you really wanted to, you could just use your imaging tool

of choice and just image one of these things and go to town later. The nice thing is that because you have the lower layers, you could, say, image the container read/write layer and image the image layer directly beneath it. Now you'd have two disk images, and you can just compare them and see what the differences are. The other thing, and this one's for those who are reckless or bold of heart: there is a tool called thin_dump. It's not terribly user-friendly, and you're gonna have to muck around with the kernel to get a metadata snapshot, but what it will do is basically dump all of the metadata that the device mapper kernel module is maintaining for these thin pools. The

reason that's useful, or could be useful if you want to go that route, is that if you think back to that (let's see if I can actually find that diagram again real quickly here)... if you look back at this diagram again, you have a base device and a whole bunch of snapshots on top of it, and we know that any time you have to do a read, it's gonna start at the top of one of those little towers and go down until it finds something that matches. Well, all that device mapper is actually doing underneath the hood is: it's got that big base device in the volume underneath it, in the LVM storage pool, and it's got a whole bunch

of blocks there. And so it's like: okay, I'm gonna start handing out blocks to people. So the first base device gets a whole bunch of blocks. Then when you do a snapshot, any time you make a change, it's gonna hand out a new block to that snapshot, and so now the data for that snapshot from that change is gonna be stored somewhere else in the volume. What device mapper is doing inside is just basically maintaining a nice little list of, okay, for a given snapshot, which blocks in the snapshot point to which blocks in the volume. And that's it; that's all it really is. All it does when a new write's done is change that

and say: okay, you used to point to the block that the layer below was using, but now you're pointing to this new one. So if you're able to (there we go), if you're able to dump out all that metadata with thin_dump, you can basically look at, okay, which blocks were pointing to what other blocks at the container read/write layer, and which ones were pointing to what other blocks at the image layer directly beneath it, and now you know exactly which places on disk something changed. This can be more useful and, depending upon the size of what you're dealing with, more practical than trying to do a block-by-block comparison of both devices all the

way to the end. You could be dealing with, you know, gigabytes and gigabytes of data and waiting for a while, whereas there might be only two or three blocks that changed. This would also potentially be a nice way to find the durian. Remember, we created and deleted it. You could go rip open the filesystem with xfs_db and hunt down through all the filesystem metadata, then find the directory the thing was in, and it's like: oh yeah, well, that was one of those smaller directories where basically we lost all information once the file was deleted. Great. But the block that was changed on disk is still there. So using thin_dump might be one way, for the more

bold and comfortable with directly interfacing with device mapper kernel stuff, to pull out just the blocks that changed and then compare them there. In summary: if you're dealing with a cold disk, you've got a bunch of metadata JSON files that will give you more or less the equivalent of what docker inspect will give you for a container or an image. You have this layerdb directory that basically gives you a chain to start from any given image or container, or even a layer, and figure out what the layers beneath all of it are. And then, depending upon what your backend is, you're either gonna have the overlay2 directory that'll have all these directories of differences, or the device

mapper directory that's gonna have all these virtual disk snapshots. So I realize this was kind of a lots-of-text, extremely technical talk, so I made sure to bake in plenty of time for questions at the end, and we have time for that. As a bonus, I also put together a cheat sheet for you all. It's just a one-page, landscape, letter-size cheat sheet that contains the high points of this, which would basically get you access to the device mapper and overlay backends, show you the layerdb tree, and show you where some of the metadata JSON is. So, questions? Yes: what initiates a new layer? Well, you're going to get a new layer in

one of two ways. New layers are built when you're creating an image, which is basically a series of docker commit commands... and actually, that's not really true. New layers are created when you run a container. So you have an image, you hand it to Docker and say, run this thing; a running image is now basically a container. And when a container runs, Docker will create a new, empty read/write layer on top of whatever else is there. So you get a new layer when you run a container. The way you build up an image full of layers is that... let's see if I can find that Dockerfile I showed

you. Yeah, the example here. So that Dockerfile: every line in that file is gonna create a new layer in the image. The first line's going to go grab the Alpine Linux image; that's my base, and it's actually only got one layer to it. Then when I say ADD lemons and pears, what Docker actually does is basically a docker run on the Alpine image, it copies the two files in, and then it's done. And now it does a docker commit, which takes the container read/write layer and makes it part of the image I'm building up. And it does that for each step. No. Well, no, with the caveat that there's an experimental option

you can turn on in Docker called squash, so that when you do a build like this it will, as the final step, take all those layers and squish them together, and you just... but yeah, generally the way you're gonna get layers is, by default, it's gonna be one layer per line in that Dockerfile, and then any time you run the image, that extra container layer gets slapped on top. Yes? Yes, there is. Yes, there is. It's not a huge hit, but it's enough that if you read the Docker docs you're gonna find, repeated in all of the storage backend stuff, something basically saying: yeah, since we're chaining layers and layers and layers and layers and layers, if you need

high write performance, maybe you should go use a volume for that. The other interesting thing is that which backend you use is going to give you different performance trade-offs. So the device mapper one's kind of nice, because you're just using this Linux kernel thing that really does exactly what you want. The downside is that for the thin pool you have to say how large a thin pool block is, and I think by default Docker says 1,024. Now, that block size is in sectors of 512 bytes, so basically you're guaranteed that if I change one byte in a file, at minimum I've now got half a megabyte of data that just got added to my new layer, and that can add up.

OverlayFS, the directory-differences one, is a little better in some ways. Also, the way the kernel module works, the page cache for the filesystem is useful: if you have a whole bunch of containers, all with a common layer somewhere in the middle, and they're all trying to read that same file, it actually is that same file. Whereas with device mapper it's gonna be technically the same filesystem but different block devices, so things get confused, and it's not always necessarily gonna benefit from the page cache. If five containers are reading the same thing, you're gonna actually go to disk five times. The reason that Red Hat Linux has had to use device

mapper for a while is mostly just because they were doing a lot of really cool stuff with SELinux to try to contain containers a little better than they're currently contained, but it didn't play very well with OverlayFS. I think they've just gotten that kind of worked out, and the most recent version of Red Hat Enterprise Linux 7 can do OverlayFS now. Other questions? Yes? Yes, I was actually able to. I mean, it was kind of... any time you run into a case where you're like, okay, before I reinvent the wheel, let's go Google, and Google says, yeah, nope, you know you've got a fun problem. So it was a fun one, but I mean, it didn't wind up

being an obstacle. The nice thing about things like... well, I mean, any kind of computer system. Yes? Yeah, I can't really tell you what it was, because obviously it's one of our clients. I mean, when you're doing incident response for people, they don't like it when you tell the stories. But yeah, the gist of it basically was that it was a host that was compromised, and once they did get in, they actually proceeded to advance their hold on the host by means of running more Docker containers. And so if you kind of play back the talk in your head, you'll hear some points where I'm talking about, yeah, they might do this, or a grabbing terminal app

it might be nice, and it's all from experience. Any other questions? I'm kind of blinded, so is anyone in the back there? Going once... yes? When the container is removed? Maybe; it depends upon what the logging backend is. So, and all right, I have to caveat that I haven't tested this, so if you really want to know, grab me afterwards, we can fire up a VM and we'll find out. But I would presume that for the JSON file it would be removed, because the container directory that has that container's config.v2.json file is where the JSON log file usually gets dumped, and that container directory's gone. But the JSON log output's not the only

Docker logging output. They've got like half a dozen or more. One of them, for example, is journald, which is just going to write straight to the system journal. Obviously Docker doesn't really have control to be removing things from that, and if they did, I think people would probably get very, very angry with them. Other ones are routing straight off to things like Fluentd or, I think, Graylog. So for a lot of the logging backends, actually, even if the container's removed, the logging data is probably still there somewhere; it just depends upon which backend was used as to where you're gonna find it. I'll blind myself for a second. Yes, way, way, way in the back? What's that,

a question? No, just someone stretching. Yes? Right, that's gonna come down to your own judgment and your own policies. Oh, sorry, the question was basically: you have a trade-off between the ease of grabbing data if the system's still live, versus, if something's just got infected or hacked on my network, I've got to pull the plug right away. What's the trade-off? How do you decide what to do? And my answer is basically that it's gonna depend upon your own judgment in the circumstances. In a lot of cases, probably, yeah, pull the plug. It's gonna be harder, but I made you a cheat sheet. Maybe the one case I can think of

where it might be worth considering not pulling the plug is if you're dealing with a situation where you have a high suspicion that the adversary you're dealing with is gonna be clever enough not to write anything to disk. In that case, the only thing you've got is memory. So what you might consider doing... I mean, if you think about it: okay, at what point do I realize the system's been compromised? Well, probably a little while after it's already been compromised. So there's already a time window the attacker's been operating in. So the question now is: how much more time am I willing to give them, and is the added risk from giving that time

worth the payoff? If, for example, you sat down and wrote yourself a script that could, in a matter of milliseconds, go and enumerate every Docker container on a host, pull down all the memory, write it to disk, make sure to sync the filesystem, and then cut power, all right, you've got an additional time window of a couple of milliseconds, but what you gain from that is the ability to actually look at what the contents of memory were. Maybe that's a worthwhile trade-off. Or maybe it's something that's a really, really critical system, and you're willing to say: all right, I might not be able to solve the puzzle of what happened here, but

at least I didn't give them the extra hundred milliseconds; just straight pull the plug. Ultimately it's really gonna come down to your gut, knowing your organization, and knowing what your trade-offs are. But yeah, the memory case is the one case where I would consider possibly giving them that additional couple hundred milliseconds, or maybe even a couple of seconds if it's a slow system. But that would probably be only if you had a ready-made automated tool to do it and could just hit it and go. Other questions? I have not sat down and done that, mostly because my wife and I just had a delightful little boy who is now seven

months, and he's got me quite busy. But, oh, thank you. But I will say, one of the things that my company does offer is tailored software solutions. If you have someone who wants to pay to have that done, talk to me afterwards and we will make it happen. Anyone else? [Applause]
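As a recap of the overlay2 part of the talk, here is a small Python sketch, not the speaker's tooling, of the two ideas covered there: stacking diff directories with lowerdir (container read/write layer first, then image layers from the top down), and spotting removed files as 0/0 character devices in a diff directory. The function names and any directory paths you pass in are hypothetical. With only lowerdir given and no upperdir, the overlay mount comes up read-only.

```python
import os
import stat


def overlay_mount_command(layer_dirs_top_down, mountpoint):
    """Build (not run) a mount command that stacks diff directories.

    Pass the container read/write layer first to see what the container
    saw; leave it off to see what the image started with.
    """
    lowerdir = ":".join(layer_dirs_top_down)
    return ["mount", "-t", "overlay", "overlay",
            "-o", f"lowerdir={lowerdir}", mountpoint]


def is_whiteout(st_mode, st_rdev):
    """True if the stat fields describe a character device numbered 0/0."""
    return (stat.S_ISCHR(st_mode)
            and os.major(st_rdev) == 0
            and os.minor(st_rdev) == 0)


def find_whiteouts(diff_dir):
    """List paths (relative to diff_dir) that mark a removed file."""
    hits = []
    for root, _dirs, files in os.walk(diff_dir):
        for name in files:  # os.walk lists device nodes among the files
            path = os.path.join(root, name)
            st = os.lstat(path)
            if is_whiteout(st.st_mode, st.st_rdev):
                hits.append(os.path.relpath(path, diff_dir))
    return sorted(hits)
```

Directories marked through an extended attribute, as mentioned in the talk, are not handled here; reading that attribute generally needs root, so it is left out of this sketch.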
