
Hello everyone, and welcome. We're excited to present CSI Hijack: Default Kubernetes Storage Driver Exploitation. Today we are going to walk you through some critical privilege escalation paths we have discovered hiding inside the Kubernetes storage infrastructure. We will demonstrate how Container Storage Interface (CSI) drivers can be weaponized against the user. We are going to show you how some major cloud providers grant these drivers excessive permissions by design. We will also demonstrate how an attacker can break Kubernetes isolation, bypassing standard RBAC permissions, namespace isolation, and network defenses. First, let me introduce ourselves. My name is Idan. This is Shaul and Karan, and we are security researchers at SentinelOne, specializing in cloud
ecosystem and Kubernetes security. Here is our agenda for today. We'll start with Kubernetes 101, covering some Kubernetes principles and how CSI drivers operate across different cloud vendors. Then we will discuss what attackers would love to see when they get access to your Kubernetes environment. We will dive into the actual attacks and wrap up with some conclusions and Q&A. All right. In order to fully understand the attack surface, first we need to understand the DaemonSet concept. A DaemonSet is a Kubernetes controller designed to ensure that a copy of a specific pod is actively running on every single node in the cluster. These are typically used for essential cluster-wide background services such as logging,
networking, or, in our case, CSI drivers. You can see that all the pods are running across the cluster, but only the DaemonSet has duplicated itself on each node. Moving on to the second topic: role-based access control, or RBAC, permissions. Understanding this flow is critical because abusing it forms the foundation of our entire exploit chain. In Kubernetes, specific permissions, like the ability to list services or delete pods, are defined in a policy object called a Role. A RoleBinding then takes the granted permissions and attaches them to a specific identity, which is the service account. When a pod is scheduled, the Kubernetes API assigns it a service account, and the kubelet obtains a service account token, a signed JWT, and
mounts it directly into the pod's local file system. Crucially, in cloud-managed Kubernetes environments like AWS EKS or Azure AKS, these privileges do not stop at the cluster edge. Through cloud identity mechanisms such as IAM Roles for Service Accounts (IRSA) in AWS or Azure Workload Identity, a Kubernetes service account can be mapped directly to a highly privileged cloud provider IAM role. In AWS, there is now an alternative permission system, the Pod Identity flow. This is the default across EKS. Why are we emphasizing this? Because this token is literally a physical key to both the cluster and the cloud. As you can see here, the CSI driver acts as a bridge between our Kubernetes
environment on the left side and our cloud storage services on the right. Instead of building cloud-specific storage APIs directly into the Kubernetes infrastructure for every cloud provider, the CSI driver translates native Kubernetes storage requests into the specific commands needed to provision, attach, and manage cloud storage, like a persistent disk or a file store. I would like to emphasize that CSI drivers are not installed by default in cloud-provided Kubernetes environments. They must be added manually as a plugin or an add-on. All right. Understanding this leads us to this diagram, which breaks down how responsibilities are layered within our Kubernetes storage infrastructure. At the very top we have the application layer. This is entirely user-driven. It
represents the actual services, pods, and workloads that we deploy. One level down is the controllers and API layer. This is the core brain of the system. This control plane is the domain of Kubernetes, managing the logic and state of our cluster to ensure our storage requests are correctly routed. Moving farther down, we reach the CSI layer. The cloud vendors and Kubernetes share responsibility for this layer. While the cloud providers build the plugins, Kubernetes provides the framework and components that allow the plugins to work well. This layer is the critical handshake between the orchestrator and the infrastructure. All right. We have discussed some really technical details, so let's do a short recap. First we learned about
the DaemonSet and how our permission mechanisms work. Then we understood how CSI drivers operate in a Kubernetes environment. Now let's put on our hoodies and find out what an attacker can do with this information. Shaul. Thank you for the amazing introduction. So let's try to connect all the dots together and understand how all these concepts look from the attacker's point of view, and what an attacker really wants. Obviously, as an attacker, we want maximum impact once we gain any kind of initial access. So in order to get maximum impact, I don't want any kind of deployment pod. I don't want to attack any kind of user-configured or user-managed pod
because obviously they are highly monitored and highly restricted, and I don't want to attack any kind of low-privileged pod, because my goal is to escalate my privileges. I do want to attack a DaemonSet. As Idan said in the introduction, no matter where I land in the cluster, I am guaranteed to meet the DaemonSet pod. I also want to attack pods that are not user-configured and not user-managed, because the chance of finding a misconfiguration or vulnerability is much higher. And I also want a high-privileged pod that will allow me to escalate. So, can anyone guess which kind of pod
could be the ideal target for me? CSI. And the reason CSI pods are the ideal target for an attacker is that they allow escalation from one single compromised CSI pod to full cluster admin, and that is what we're going to do today. So let's go back to how CSI operates in the cluster. We already know that CSI acts as a bridge between the cluster and the storage operations, and in order to provide storage services to the pods on the node, the CSI driver needs some access to each one of them. Sounds simple, but in order to know which pod to serve, the CSI driver needs to fetch the pod ID of each pod on the
node. So, what can go wrong? Instead of fetching the pod ID of each pod on the node dynamically, the vendor decided to expose the pod directory under /var/lib/kubelet/pods, like a wildcard. And what exactly lives behind /var/lib/kubelet/pods? As you can see, under /var/lib/kubelet/pods there is the projected service account token of each pod on the node, the secrets, the plugins, the config maps, and much more. And with that, I will hand the mic back. Thank you. So, moving forward with the real attacks. I would like to emphasize that these attacks are only relevant once an attacker has gained access to the Kubernetes cluster. We beg you, do
not try this at home. The first one involves two techniques: lateral movement and privilege escalation. As we mentioned, the service account identity token resides inside the projected directory. Once an attacker gains access to this directory by compromising a CSI driver, he can use the token to impersonate the identity behind it. Using the auth can-i API, the attacker can learn the capabilities of the identity behind the token he just stole and use the permissions granted to it for lateral movement. We completed an exploit chain taking advantage of this idea, gaining credentials from a high-privileged pod, but we have been asked not to publish it yet, in order
to keep the community safe. Back to you. So let's try a different attack direction. Once the attacker is inside the kubelet pods directory, he is also, as you can see, exposed to the plugins directory, and the plugins directory is where the CSI driver sockets live. This is the most interesting part, because it doesn't give you any kind of token or secret directly. Basically, it gives you a position to become the man in the middle of every storage operation on this node. So the first thing an attacker will do is list the whole plugins directory. On a real node you will almost always see something like this: at the top, the file
store. This is the CSI socket itself. Everything below is what matters. PD CSI is the persistent disk driver. GKE MDS is the Workload Identity broker, and that one is particularly juicy because every pod on the node talks to it to get any kind of credentials. Secrets Store speaks for itself, and there are usually many more. None of these sockets belongs to the compromised CSI driver. They belong to other drivers running in other pods under other service account tokens. And Unix sockets don't care who you are: if you can write to the path, you own the endpoint. So here is the trick that makes it work. Normally the kubelet opens a gRPC connection to the CSI driver socket and calls
NodeStageVolume and NodePublishVolume on it. This is the standard dance between them. What the attacker will do is basically rename the socket file, replace the file descriptor of the socket, and drop his own proxy binary listening on the original path. The kubelet has no idea that anything changed. It just reconnects to the socket exactly like it always does, and now it is connected to the attacker's socket and everything is forwarded through it. From the kubelet's perspective, the driver is behaving perfectly normally. And this is the key detail: the kubelet creates a new gRPC connection for every CSI operation. So basically, now we are just waiting
and once a pod is scheduled with any kind of volume, the kubelet will call NodeStageVolume. Now look what the attacker gets for free. Every request contains the volume ID, the pod UID, and the full target path. So now the attacker knows, for every pod on this node, which PVCs it uses, which volume IDs back them, and exactly where on disk their data is about to land. The attacker basically controls the entire data of the node with no API calls and nothing in the audit logs. So capturing alone is already a strong primitive. But since the attacker is in the middle right now, he
can also decide what happens to each call. For example, he can redirect: he can rewrite the target path before forwarding, so the attacker controls what the driver reads, like configuration files, secrets, and binaries. He can also deny the request: the attacker can just return a gRPC error, the pod will fail to start, and that is basically a targeted denial of service against a specific workload. And he can pass through, which gives persistent reconnaissance on every storage operation on the node. The important thing about all three: the attacker never touches the API server and no token is needed. That is the whole attack. So let's see a short demo.
So basically, here I prepare the setup: I locate the CSI driver, take control of it, and enumerate all the sockets that I have access to under the plugins directory.
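From the defender's side, the exposure behind this step (a driver pod that mounts the kubelet pods and plugins directories wholesale) can be checked for in a pod spec. The following is a minimal sketch, not the speakers' tooling: the function name `risky_host_paths` and the list of sensitive prefixes are assumptions made for illustration, and the input is a pod spec already parsed into a Python dict.

```python
from typing import Any

# Assumption: the host directories treated as sensitive for this check.
SENSITIVE_PREFIXES = ("/var/lib/kubelet/pods", "/var/lib/kubelet/plugins")


def risky_host_paths(pod_spec: dict[str, Any]) -> list[str]:
    """Return hostPath volume paths that expose a whole sensitive kubelet directory.

    A mount is flagged when it equals a sensitive directory or is a parent of one
    (for example /var/lib/kubelet or /). A mount of a single subdirectory below a
    sensitive directory is not flagged.
    """
    flagged = []
    for volume in pod_spec.get("volumes", []):
        raw = (volume.get("hostPath") or {}).get("path")
        if not raw:
            continue  # not a hostPath volume, or no path given
        path = raw.rstrip("/") or "/"  # normalize trailing slashes
        parent_prefix = path.rstrip("/") + "/"
        for sensitive in SENSITIVE_PREFIXES:
            if path == sensitive or sensitive.startswith(parent_prefix):
                flagged.append(path)
                break
    return flagged
```

Running this over the DaemonSet pod templates in a cluster would give a quick list of drivers whose compromise exposes other pods' tokens and other drivers' sockets. It only inspects volume definitions, not mount propagation or read-only flags.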
And now we're preparing to take control of the file descriptor and change it.
And now I have managed to switch the file, and I'm controlling the file descriptor of the CSI driver's socket. I'm waiting for a new storage operation to happen, and once it does, you can see that I can control the data.
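Because the swap works by renaming the original socket and binding a new one at the same path, the path ends up pointing at a different inode. A minimal detection sketch under that assumption follows; the function names are hypothetical, and a real monitor would also have to tolerate legitimate driver restarts, which recreate the socket as well.

```python
import os
import stat


def snapshot_sockets(directory: str) -> dict[str, tuple[int, int]]:
    """Map each Unix socket found under `directory` to its (device, inode) pair."""
    found = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                info = os.lstat(path)
            except FileNotFoundError:
                continue  # entry vanished between listing and stat
            if stat.S_ISSOCK(info.st_mode):
                found[path] = (info.st_dev, info.st_ino)
    return found


def replaced_sockets(
    before: dict[str, tuple[int, int]], after: dict[str, tuple[int, int]]
) -> list[str]:
    """Return paths from `before` that vanished or now refer to a different inode."""
    return sorted(path for path, ident in before.items() if after.get(path) != ident)
```

Taking one snapshot at node start and comparing it periodically against a fresh one would surface a socket path whose identity changed without a corresponding driver pod restart.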
Here it goes. I get all the requests, and now you can see which kind of data the attacker is exposed to. Here we can see the service account token, the projected service account, the PVC, and the pod ID
and that's it. >> Hello guys, I am Karan. So let's work on the EKS service, and let's hack some. Okay, so the first thing: you are a red teamer and you get an environment where you have some access inside a cluster, which is very common. But in an actual production environment, when you're doing a red team, the biggest problem is that you will check the node instance metadata service, or you will try to pillage all the secrets, but in a hardened production environment you will find nothing. So how do you escalate your privileges? How do you get control of the entire AWS account? The answer is CSI. It's important to realize that Kubernetes environments
are very ephemeral in nature, which means that a production environment will always need some kind of persistent storage. CSI is the thing that bridges that gap. The fun thing is that almost every single production system has CSI drivers, and their policies are so overly permissive that they allow you to take over a significant part of the data plane. Before we go in depth, I'll just walk you through the targets. Right now we're going to target four different drivers, each targeting a different kind of storage. If you want block storage, EBS, you have the EBS CSI driver for that. For the Elastic File System, you have EFS
CSI. For FSx, which is a third-party high-throughput file system, you have FSx CSI, and of course there is the S3 CSI. All of these allow complete access to their particular file systems, and these are the policies that are provided by default. To actually get to this, we first need to understand the Pod Identity workflow. For those who are familiar with Kubernetes, there is generally a flow known as IRSA, and that kind of flow is common across the cloud providers, but AWS adds something known as the Pod Identity workflow. What happens is that AWS runs its own agent in its own plane. It recognizes the particular pod on the basis of the SA and the
namespace, and at runtime it injects AWS credentials environment variables into this pod. This never touches the disk, and these are the important credentials that we need. It's also important to note that none of these credentials are part of the general monitoring stack, or of the general AppSec steps that are done for EKS. Okay. In order to start, you first need to identify the CSI drivers. The best way to do that, if you are in a cloud environment, is to run kubectl get sa, get all the SAs, and you will get the
three controllers. These controllers are all you need in order to get access. Also, for all the attacks that we'll be showing now, it's important to emphasize that the only permission needed is the ability to create a pod inside a single Kubernetes cluster. Okay, first step. In order to steal this token, as I told you, Pod Identity relies on the fact that the SA and the namespace are set. So what you need to do is simply create a pod YAML that has specific things. For example, it has a specific namespace. It has a specific service
account, and it also has a few mounts. That's it. In the next step, you just go and create this attack pod, and once you are inside the pod you will get the token for the CSI driver that you would generally not have access to. This is the important part. The impact comes next, but this is the part that you need to understand, whether as part of an engagement or in order to protect your systems. So let's look at a quick demo.
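Since the technique depends only on a pod's namespace and service account matching a privileged pair, one defensive heuristic is to flag pods that run as such a pair without being owned by a controller. The sketch below is illustrative only: the service account name `ebs-csi-controller-sa`, the set of sensitive pairs, and the function name are assumptions, and the check can be evaded by a pod that sets its own owner references, so it is a starting point rather than a control.

```python
from typing import Any

# Assumption: (namespace, service account) pairs considered sensitive in this cluster.
SENSITIVE_SERVICE_ACCOUNTS = {("kube-system", "ebs-csi-controller-sa")}


def is_suspicious_pod(pod: dict[str, Any]) -> bool:
    """Flag a pod that runs as a sensitive service account but has no controller owner."""
    meta = pod.get("metadata", {})
    spec = pod.get("spec", {})
    identity = (
        meta.get("namespace", "default"),
        spec.get("serviceAccountName", "default"),
    )
    if identity not in SENSITIVE_SERVICE_ACCOUNTS:
        return False
    owners = meta.get("ownerReferences") or []
    # Pods created by the driver's own Deployment carry an owner with controller=true.
    return not any(owner.get("controller") for owner in owners)
```

An admission policy that restricts which workloads may reference these service accounts is the stronger fix; the function only shows what such a policy would be matching on.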
Okay. So the first phase is to compromise the identity. The idea is that you have compromised a developer account. This SA only has the privilege to create a pod. These are simple checks to make sure that there is no other privilege. As you can see, on star, star we have no privileges, but one specific dangerous privilege is granted to that particular service account: the ability to create a pod in the kube-system namespace.
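The privilege check in this phase can be illustrated with a simplified evaluator for RBAC rules. This is a sketch under stated assumptions, not the Kubernetes authorizer: it handles only `verbs`, `resources`, and `apiGroups` with the `*` wildcard, and ignores `resourceNames`, subresources, and non-resource URLs. Namespace scoping comes from where the Role and RoleBinding live, so it is not modeled here. The function names are hypothetical.

```python
def rule_allows(rule: dict, verb: str, resource: str, api_group: str = "") -> bool:
    """Check one RBAC rule against a request, honoring the '*' wildcard."""

    def matches(allowed: list, wanted: str) -> bool:
        return "*" in allowed or wanted in allowed

    return (
        matches(rule.get("verbs", []), verb)
        and matches(rule.get("resources", []), resource)
        and matches(rule.get("apiGroups", []), api_group)
    )


def role_allows(rules: list, verb: str, resource: str, api_group: str = "") -> bool:
    """A request is allowed if any rule in the role allows it."""
    return any(rule_allows(rule, verb, resource, api_group) for rule in rules)


# Hypothetical role matching the demo: nothing except creating pods.
demo_rules = [{"apiGroups": [""], "resources": ["pods"], "verbs": ["create"]}]
```

With `demo_rules`, a query for all verbs on all resources is denied while `create` on `pods` is allowed, which is the shape of the result described in the demo.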
Phase two is reconnaissance, because before we deploy our payloads we need to look at the attack surface. As I told you, there are four CSI drivers, and they all offer a different kind of attack surface. So you need to look at what kind of attack surface is available in your specific cluster. In this demo, for example, we are going to target the EBS CSI controller SA. Never target the node SA, because it doesn't have the privileges needed for the attacks. In the third phase, you need to trick the EKS Pod Identity agent that is running into giving you the token for the CSI driver.
This is done by simply creating a pod YAML, and if you look at the YAML, the service account is set as I showed you, and the namespace is also set. So, to highlight it again, you only need the ability to create a pod. So, let's deploy the attack. And that's pretty much it.
So, the pod is scheduled, and EKS Pod Identity will automatically intercept the request and start injecting the environment variables for the AWS credentials that give you access to the particular role. If we run aws sts get-caller-identity at this point, we get the specific role name, the AWS EKS Pod Identity to CSI attack role. Because we were doing a lot of attacks, we named it specifically so that it's easier to point out. That's pretty much it, but the impact comes next. Okay, so now you need to realize that in any Kubernetes environment there are two types of deployments. One is the standard deployment, in which the
customers are responsible for almost every part of the management, security policies, and everything. The second is the automated deployment, which almost every cloud provider offers. For example, EKS has EKS Auto Mode, Google has Autopilot, and similar. This policy is specifically for standard EKS. If you are using EKS and you just went and clicked on add-ons and selected the EBS CSI driver, this policy is added automatically in the background by default, and you will never get a prompt for it. Now let's talk about this policy. The policy allows you to describe the entire account, all the resources across all the services. It allows you to attach and detach
any volume on any instance across all services. It allows you to create snapshots, and to create, modify, and delete volumes. For delete there is a static tag condition, but it doesn't matter, because you can modify the volumes to reduce the size from 100 GB to 1 GB or similar. Next, this policy is for EKS Auto Mode. It's important to understand that EKS Auto Mode is part of the managed service. The policy that is used is AmazonEKSBlockStoragePolicy, and there are three types of condition that this policy relies on. The first condition is aws:RequestTag. As the name suggests, it is completely attacker-controlled. It is simply the tag that
you send as part of your AWS CLI command. The second condition is aws:TagKeys, which of course is also attacker-controlled. The third condition is the only one that an attacker cannot generally control. If you look on the right side, these are from the specific policy URL. The request tag is of course attacker-controlled, and for the tag keys AWS only checks the particular tag key; it doesn't really care about the value. So in EKS Auto Mode, how do you get from the ability to create a single pod to access to the entire account? Let's say you have an EC2 instance with a target volume attached, and you need to read all the
credentials in that place. What you can do is take the target volume, create a snapshot from it, and from the snapshot create a new volume. This new volume will have the specific tags that allow it to be attached to your EKS cluster, and you just attach it to your EKS cluster. Suddenly you have access to that volume. On the right side, if you look at the actions for that policy: CreateVolume is attacker-controlled on all three sides. CreateSnapshot, again, has no conditions. CreateSnapshot on the snapshot, again, both conditions are attacker-controlled. There are conditions for
AttachVolume and DetachVolume, which are cluster-scoped, and ModifyVolume, which is also cluster-scoped. But with the chain I showed you, you can basically take any volume or any snapshot across all the services in your AWS account: you create a snapshot of it, you create a volume from that snapshot, and you attach that volume to the cluster that you have the ability to create a pod on. Okay. We have a lot of things, so very fast. The first thing is that you need to map out the entire AWS account from your cluster. If you look on the right
side, you will see the policy that lets this happen. The policy is literally Resource: star, and it allows you to describe availability zones, instances, snapshots, and tags. So you can run something like aws ec2 describe-instances with all the options, and you essentially get the entire recon value for the entire account. The second is full write on any EC2 instance or any service that uses block storage. How is this done? Here we are talking first about standard EKS. In standard EKS, as I already told you, if you look at the permissions on the right side, you have permission to attach or detach any volume or any snapshot that might already be attached to a machine or might not
be. An important thing to note here is that if you detach a volume from an already running EC2 instance, it might crash. But there are also the volume types io1 and io2, which specifically allow multi-attach. This would essentially allow you to write to any of the AWS services in your environment that use block storage. Third: stealthy reads via snapshots. This is the attack I showed you quickly earlier, and it is valid in both the managed and the standard version. Basically, you
can read anything. You take the target volume that you want in any of the services, you create a snapshot from it, you create a volume from that snapshot, you attach it to your node, you mount it, and you have full data exfiltration. The commands are there for your reference later. In almost all the slides, you will see on the right side the policy that allows this to happen. Of course, anything on EBS across the account is now available to the attacker. This includes your EC2 root
volumes, RDS databases, your EMR data, OpenSearch indexes, ECS storage, and of course other EKS clusters. The fourth thing is cross-cluster lateral movement. I told you about the conditions, and they are essentially the same in both standard EKS and Auto Mode. So if an attacker lands on any cluster where he has access to an SA that can create a pod, he can essentially move to all the other clusters, for example from your dev cluster to your prod cluster to your data cluster, and compromise the volumes and the snapshots there. He
can also move to all the standalone AWS services like EC2 and your RDS. Okay. So, let's talk about the other CSI drivers, because initially I introduced you to four different services. Attacking the EFS CSI driver: first, I want you to look at the policy on the right side. You have full permission to describe all the access points, all the file systems, and all the mount targets for the entire AWS account, and you also have permission to attach to any of these. Previously I showed you three conditions, and this is the first condition, the request tag, which
is of course controlled by the attacker, because he can simply use the AWS CLI and add this tag. The same thing also happens in FSx. In FSx, the recommended policy is AmazonFSxFullAccess, so I don't really need to go in depth on it. If this policy is attached to your EKS cluster because you are using the FSx CSI driver, it essentially allows the attacker to interact with, edit, read, and exfiltrate all data on all the services that rely on the FSx endpoints. For example, the blast radius includes S3, Directory Service, IAM, KMS, CloudWatch, CloudFormation, and other services. As for S3, again, S3 is so widely
common that I think everyone in the room must be familiar with it. First, let's look at the three policies, starting with the first and the third. The first policy is AmazonS3FullAccess. In the S3 documentation, however, they do say that you can define your own manually scoped policy, which is the third one. Even in that case, you should always do this thing where you go through the CSI driver, because, as I told you, this route does not rely on your general node IAM role. This way you will be able to pillage things that you would not otherwise have access to through the general service account
way of getting access to resources in the cloud. This is different from that. The second policy is interesting because it was introduced on 7th April, and it again allows you to attack every bucket in the account. If you look on the right side, with list access file system and list access points you can enumerate everything. The same goes for get access point. The same goes for tag resources. Delete access point you cannot use freely, but it relies on a shared tag. So it is weak, in the sense that these things are shared across all the clusters because of the role name that they use. Uh-huh. Have I written it here?
Yeah. Because the role is specifically the same across all the clusters, the data is essentially shared across these different clusters, and you can get to it. So let's talk about the blast radius. As I told you, we have two types of environments: standard EKS and the managed one, EKS Auto Mode. In standard, you get full read and write on every single volume and every single snapshot in the entire account, across all the AWS services. In Auto Mode, you get full read on any volume or snapshot in EBS across the entire account. I also showed you cross-cluster as well as cross-service boundary crossing. So it's
important to realize that, specifically for EBS, you have a lot of services, like EC2, ECS, EKS, RDS, and EMR, that all rely on EBS. EFS is used by Lambda, ECS, EC2, and SageMaker. FSx is used by Directory Service, CloudFormation, and SageMaker. And with S3, generally all data that is stored in bucket form is in S3. The core problem here is that the policies scoping these permissions do not scope them to a particular cluster; they scope them to the entire account. So, summary: managed policies are trust assumptions, and you need to verify them right now. For you as a cloud security team, if you are protecting
your company's instances, you need to go check these policies right now for all your modes. You also need to understand that AWS has of course updated this: a lot of new policies have been pushed out that fix a lot of these things, and the docs have been updated, but doc changes alone don't change the threat landscape. And for the offensive researchers: you need to always check for the CSI drivers and their SAs, because these are completely hidden from the node IAM role. They are invisible to all the standard recon, because you will not find this on HackTricks or any similar resource. This, of course, because
of that, is also invisible to almost all the cloud security solutions that we... Yeah. So, responsible disclosure and the AWS response. This was initially reported to AWS on 10th September, and AWS acknowledged it on the 24th. Last week, on April 16, AWS released new EBS policies and updated documentation so that customers could protect themselves better. And today, on April 23, we are doing the public disclosure. Now, this is the official statement from AWS. I'll highlight a few specific parts of it, because I genuinely feel it is important for all of us, as defenders, red teamers, or offensive researchers, to understand. If you look at the second line: for the
customers that do not use Auto Mode, and are not paying extra for the infrastructure management, the entire responsibility for managing these policies and their permissions falls on your head. These fall on the customer side of the shared responsibility model. And in the last line, as you can see, AWS has released several new policies that protect you better than this one. So you need to go and update them, because this generally won't happen automatically. Conclusions. Essentially, your pod security and node security equal your cluster security, which is essentially equal to the security of your entire AWS account's data plane. Then we would like to talk
a little about shared responsibility, because in a cloud environment a lot of things work under a shared responsibility model, where the customers, the cloud providers, and the infrastructure providers such as Kubernetes share this responsibility in a very complicated way. So you need to protect yourself. And the third part, of course: we have shown you cross-pod, cross-cluster, and cross-cloud-service compromise. So thank you, and we'll take any questions now.