Mastering the Art of Attacking and Defending a Kubernetes Cluster

Name: Mastering the Art of Attacking and Defending a Kubernetes Cluster
Uploaded: 2024-10-31
Duration: 23 min 46 s
Description: Sanjeev Mahajan presents real-world attack chains against Kubernetes clusters, demonstrating how subtle misconfigurations in CI/CD pipelines, RBAC, and pod permissions can lead to cluster compromise. The talk covers defense-in-depth strategies across build, deployment, and runtime stages, including

BSides Canberra · 2024 · 23:46 · 430 views · Published 2024-10

Speakers

Sanjeev Mahajan

Tags

Category: Technical

Topic: Cloud IAM Container Security

Team: Blue Red

Style: Talk

Mentioned in this talk

Tools used

Falco Helm kubectl Kyverno OWASP Dependency-Check Sigstore Snyk Trivy

Platforms

Docker Kubernetes

About this talk

Sanjeev Mahajan presents real-world attack chains against Kubernetes clusters, demonstrating how subtle misconfigurations in CI/CD pipelines, RBAC, and pod permissions can lead to cluster compromise. The talk covers defense-in-depth strategies across build, deployment, and runtime stages, including secrets management, image scanning, network policies, and detection controls to secure containerized environments.

Transcript [en]

Mastering the Art of Attacking and Defending a Kubernetes Cluster, Sanjeev Mahajan. Thank you.

All right, I hope you can all hear me okay at the back. Give me a cheer if you can hear me. All right, thank you. Thanks for joining me for today's session about mastering the art of attacking and defending a Kubernetes cluster. I really appreciate you all coming here and spending your precious time with me. Today we'll be talking a lot about attacking and defending Kubernetes clusters, as the name says.

A quick introduction about myself: my name is Sanjeev Mahajan. I'm currently working as a principal security engineer with cyber Services. I've been in the industry for more than 15 years now. I started off as a system admin and moved into the world of security, which I've been doing for the last 10 years. For the past few years I have been heavily involved in the world of Kubernetes, both as an attacker and a defender. It has been a great journey, full of learnings and a lot of challenges. Unfortunately I only have 25 minutes today, so I won't be able to cover everything, but I'll try to cover the most significant learnings I have gained over time, and I'll be in the hallway to chat more after the talk.

That brings us to the agenda of today's talk. We'll be talking about what a Kubernetes cluster is and the common myths when it comes to the security of a Kubernetes cluster, and then my favorite section of today's talk, which is attack chains on real-world Kubernetes environments. Then we'll move into the world of defense-in-depth strategy, which is for my security champions, the defenders. We'll look at some of the controls and tools that can help us defend and secure our clusters, and then conclude with key takeaways.

Before we jump into the technical stuff, a quick disclaimer. I will be referring to a lot of project teams, especially DevOps and product owners, throughout my presentation. In referring to them, by no means am I here to disregard or disrespect any of them. I love them all, and they have been great when it comes to learning the Kubernetes world and the cloud stack, so no disrespect to them. I'll also be referring to a lot of security tools, both open source and commercial, in my presentation. By no means am I here to endorse any of the tools; they're only for reference purposes. And the examples and lists shown in my presentation are not exhaustive; they are only for reference. I've only picked the most significant ones that could be of use to you.

All right, enough talk about the disclaimers. Now we'll jump into the world of Kubernetes clusters. So what is a Kubernetes cluster? If I were to explain it technically, it is called a container orchestration tool. Now, that's too technical, right, and it doesn't make sense. If I were to put it in simple terms, it is like a set of machines working together to deploy, manage, and run your containerized workloads. Still technical, right? Let me make it simpler. I like to call it a party planner for your software. Why? Because just like a party planner, Kubernetes takes care of everything for your containerized workloads, so you, as a developer on the project teams, can spend most of your time focusing on building great software. So that's a Kubernetes cluster for you in simple terms.

But with great responsibilities come great risks, and a lot of confusion. When I was working on Kubernetes cluster engagements and in my research, I saw that there was a lot of confusion prevailing in the industry when it comes to the security of Kubernetes clusters. The most significant ones: when we were talking to the project teams and developers, they were saying that Kubernetes is all secure by design. Another thing that kept coming to us was that the cloud service provider, like Amazon or Google, is responsible for maintaining the security of the cluster. Let me tell you that that's not the case. It has always been, and it is, a shared responsibility model. What does that mean? It means that when it comes to Kubernetes cluster security, the remit of the cloud service provider is only the physical security of the physical machines, like the data centers; they are responsible for securing that. Everything else, like the configuration of the cluster, for example your network policies, your IAM controls, and your data storage, lies under the remit of the customers who are using Kubernetes. So basically the configuration of Kubernetes. Based on our experience, the research, and the engagements performed, we have seen that 90% of the findings come from misconfiguration of the clusters, because Kubernetes is not secure by default and people assume that it is.

Other common myths and illusions we identified during our interviews and sessions with tier-one organizations were that we don't need offensive security assessments on Kubernetes clusters, or that there's no value add from such engagements. In later slides I'll show you the attack chains and show you how important these offensive security assessments are on clusters, and we'll talk about it in later

slides.

All right, moving into my favorite section of today's talk, which is attack chains. These attack chains were performed on real-world environments with Kubernetes clusters as the hero component. The best part was that when we were performing these engagements, the clients already had mature practices in place. When we were doing the prep work and scoping, the clients informed us that they had a fleet of security architects and engineers doing hardening on a day-to-day basis. So the summary of these attacks is that we were attacking pretty mature, locked-down Kubernetes clusters. Let's see the details of the first one.

Unfortunately I'm not going to run any demos today, because I've been burnt a lot by live demos, and you can't replicate these environments: this was not a lab environment, this was found in the real world. So there are no assumptions and no lab sort of environment. We'll talk through the details.

For the first one, we were talking to a client, again a tier-one client with mature practices. The first thing the client says is: hey, we have everything that is required to secure the cluster, and when you do this engagement, please don't give me any findings on the non-production environment. Give me findings in the production environment only, because anything in non-prod I won't consider a finding. That was the first challenge put in front of us. The second challenge was: how far can you go when you have limited access in my environment, like a read-only user? It's almost like doing red teaming for them at the cost of an internal assessment.

All right, we started our reconnaissance with full force. We wanted to prove to the client that we were going to give them some value add; it's not just about the money, it's about the value added. What we saw while doing the reconnaissance was that the client was pretty much right. Kubernetes was so properly configured; it was one of the best configurations we have seen a client make. There was nothing misconfigured, and the client had all the security controls in Kubernetes; it was next to impossible to crack anything. We were almost going to end up sending a vanilla report saying: hey, you are doing amazing work and we couldn't find anything.

But we did not give up. During the recon process we looked at every possible integration and every possible configuration. The interesting thing we found was that there was a single pod, out of a huge pool of pods, that was running with cluster-admin privileges. For those who don't know, with cluster-admin privileges you can basically create any pod inside the cluster, or you have permissions to move laterally within the cluster. We knew that the credentials of that pod live in the memory of the containers of that particular pod, so if we were somehow able to get access to that pod, we would be able to compromise the cluster, and it would be a great finding for us.

So now we knew that we had to go after this pod, but how? We had no clue, because it was sitting securely in the cluster. Because the cluster was not giving us any movement or anything we could attack, we started looking into the integrating assets. The best thing we saw was that there were CI/CD pipelines configured, and they were provisioning assets into the cluster, both prod and non-prod. In the series of CI/CD repositories there were a bunch of CI runners running, and one misconfigured runner caught our attention. It was running with a permission called get secrets. I'll tell you in detail later how this was misconfigured, but for now, this get secrets permission on that runner was allowing us to get secrets in the production and non-production environments. We had to somehow get the CI runner's access token. Again, how do we do it? We had no clue, but we knew that there is a way to do it by running an exploit on the repository. So we started looking into the repositories. Thanks to the naming convention of that CI runner, we got to a repository which was running that runner. Another misconfiguration we saw was that the repository was configured in a way that anybody on the network could create a feature branch on it. We exploited just that: we wrote a simple script, a Python script, and extracted the access token for that runner using the metadata APIs. And there it was: we had the access token of the runner that could get secrets for the pod running in the production cluster. We got the runner token, replayed it, and got the secrets for the pod that was running with cluster admin. And that's how, guys, we were able to compromise the entire cluster.

The best part about this attack was the reason the engineering team gave get secrets to the runner: naming convention confusion. In Kubernetes you have two permissions, get secrets and list secrets. List secrets allows you to see only the metadata, like the secret name, which is of no value when it comes to attacking, whereas get secrets lets you see the actual values of the secrets. The team was confused between get secrets and list secrets, and they had mistakenly given the get secrets permission instead of list secrets. That's how we were able to exploit that slight misconfiguration of get secrets and compromise the Kubernetes cluster. So the message here is that when it comes to Kubernetes, slight misconfigurations can create significant impacts.

We just saw how we were able to compromise the entire cluster starting from a read-only user, through slight misconfigurations in the CI/CD ecosystem, attacking the cluster from the CI/CD ecosystem. Now, the next target given to us was from another client.
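The get-versus-list confusion in the first attack chain is the kind of thing a small audit script can surface. The following is a minimal sketch, assuming RBAC rules have already been loaded into plain dictionaries with the standard `apiGroups`, `resources`, and `verbs` keys; the function name and the sample rules are hypothetical and not taken from the talk. Note that the Kubernetes RBAC good-practices documentation warns that `list` and `watch` on Secrets can also reveal secret contents, so the sketch flags those verbs as well as `get`.

```python
# Verbs that can expose Secret data. "get" is the one from the talk; the
# Kubernetes documentation notes that "list" and "watch" can reveal contents too.
SECRET_READ_VERBS = {"get", "list", "watch", "*"}


def risky_secret_rules(rules):
    """Return (rule_index, verbs) for rules that can read core-group Secrets."""
    findings = []
    for index, rule in enumerate(rules):
        api_groups = rule.get("apiGroups", [])
        resources = rule.get("resources", [])
        verbs = set(rule.get("verbs", []))
        in_core_group = "" in api_groups or "*" in api_groups
        covers_secrets = "secrets" in resources or "*" in resources
        exposed = sorted(verbs & SECRET_READ_VERBS)
        if in_core_group and covers_secrets and exposed:
            findings.append((index, exposed))
    return findings


# Hypothetical rules for a CI runner's Role (illustrative only).
ci_runner_rules = [
    {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list"]},
    {"apiGroups": [""], "resources": ["secrets"], "verbs": ["get"]},
]

for index, verbs in risky_secret_rules(ci_runner_rules):
    print(f"rule {index} can read Secret values via: {', '.join(verbs)}")
```

Running a check like this over every Role and ClusterRole bound to CI service accounts would be one way to spot a runner that can read Secrets it only needed to enumerate.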

This client's challenge was: now that we understand you're able to compromise the cluster, in this scenario, can you break out of the cluster and get read and write access to the storage buckets, which have sensitive data sitting there? Another interesting challenge. We started with our standard process of reconnaissance, and on the surface, as always, it was looking pretty perfect. We had no access from the non-production cluster to the storage buckets, especially the production buckets, and we wanted to get read and write access to a production storage bucket. That was the task and the challenge for us.

So we started looking into the configurations of all the pods and the service accounts. Out of the thousands of pods running, one pod had storage admin permissions configured. Okay, it's not a big deal; you can have storage admin rights given to a pod. But this pod in particular had storage admin rights on both non-production and production. I'll tell you why that was the case later. We knew that once we got access to this pod, we could then craft our attack to progress towards the storage buckets. So what did we do? We got an exec into that pod, because it was running a beautiful shell, and once we had the shell there was a curl utility sitting there for us to exploit. We used curl to call the metadata APIs and got the access token for that pod. That pod had access to both production and non-production storage buckets with storage admin, so we had the read and write access.

Again, the interesting fact here came when we investigated why this misconfiguration existed. The engineering team had had an incident in production, and for a certain period of time they had given that pod temporary access to talk to the storage buckets with storage admin. However, they did not revoke that credential afterwards; they forgot to revoke it, and that slight misconfiguration led to this storage bucket compromise. So we always need to make sure that IAM controls are reviewed on a periodic basis when it comes to clusters and cloud security.
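The periodic IAM review recommended here can be partly automated. Below is a minimal sketch, assuming each access grant is recorded with an optional expiry date when it is issued; the `Grant` structure, the role labels, and the sample inventory are all hypothetical illustrations rather than anything shown in the talk or any cloud provider's API. It flags grants that have outlived their expiry and broad roles that were never given an end date, which is the pattern behind the forgotten storage admin access.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Grant:
    principal: str            # hypothetical service account name
    role: str                 # hypothetical role label
    environment: str
    expires: Optional[date]   # None: no end date was recorded


def grants_needing_review(grants, today, broad_roles=("storage-admin",)):
    """Return grants that are past their expiry, or broad with no expiry."""
    flagged = []
    for grant in grants:
        expired = grant.expires is not None and grant.expires < today
        open_ended = grant.expires is None and grant.role in broad_roles
        if expired or open_ended:
            flagged.append(grant)
    return flagged


# Hypothetical inventory; in practice this would come from an IAM export.
inventory = [
    Grant("metrics-pod", "storage-viewer", "non-prod", None),
    Grant("batch-pod", "storage-admin", "prod", date(2024, 1, 31)),
    Grant("debug-pod", "storage-admin", "prod", None),
]

for grant in grants_needing_review(inventory, today=date(2024, 6, 1)):
    print(f"review {grant.role} for {grant.principal} in {grant.environment}")
```

Passing `today` in explicitly keeps the check deterministic, so the same function could run in a scheduled job or in a unit test.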

All right, moving to the next attack. I can go on and on, but due to time constraints I'll only be talking about three attack chains in detail. I have many more to discuss, and I'm more than happy to talk in the hallway about the other attack chains. For now, the third attack chain we'll be talking about is container escape. Since I started doing offensive security engagements in the cloud space, container escape was always on my hit list, and everybody talks about it when it comes to cloud engagements. Container escapes and container breakouts are the biggest buzzwords, and the most Googled words I can say, when it comes to cloud security. This time we got an environment to do that.

First things first: as always, it looks super secure on the surface. There is nothing you can do when it comes to cluster breakout or cluster hacking. The task given to us this time was: from the cluster, where you're running so many pods and containers, can you break out of the container, access the host file system, and also get a working shell on the node? The way Kubernetes works is that you have a worker node, which hosts all your pods and the containers running inside the pods.

We started with our standard process of reconnaissance. Again, in the cluster there are so many pods, and finding that one pod which gives you access to your task is always the hardest thing. We were lucky enough to find that there was one pod running as a privileged pod. The reason was that it was a security tool; it was meant to collect all the metrics and everything, so it required those permissions. How we were able to abuse it I'll show you in the next slides.

That privileged pod was running safely in the cluster, trying to do its job. We saw that it was our window for attacking the cluster. We got a shell on that privileged pod, and we saw that the configuration that pod was running with mounted the host file system onto the containers. Due to the misconfiguration that you're seeing on the screen, we were able to pull the whole host file system into the container, and we had access to the entire host file system. We happily shared this finding with the client, and the client said: please go away. He did not consider this a finding, because he said: you are showing me the host file system, you are not showing me a working shell on the node. So it was like all our efforts went down the drain.

But we did not give up. We said, okay, let us try again. We had extracted the host file system onto the container, and we started looking into which files we had extracted from the host. We saw that there were a lot of KSA tokens, Kubernetes service account tokens, that we had captured from the host file system, and one of the access tokens had permissions to create pods. That was the game changer for us. We used that access token to create our malicious pod. Whatever you're seeing on the screen, this is a suicide bomb; please don't configure this in your clusters. This was only meant to break out of the container and show a working shell. We configured this pod with all the misconfigurations in the world, and we had a working shell on the worker node. And that's how, my friends, we were able to compromise the worker node and get access to all the pods and containers running on that worker node.

All right. As I said, I can go on and on, because this is something I do on a day-to-day basis, attacking Kubernetes clusters. What I've shown you is the details of the top three; the rest I can talk about in the hallway if you're keen to look into the details. Now we'll move into the next section, for our security champions, which is defense in depth: how you safeguard your clusters. To make it simpler, I have divided the life of an application running on a cluster into three main stages: build, deployment, and runtime.
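One way to catch pods like the privileged one abused in the container-escape chain before they are deployed is to check manifests for settings that weaken node isolation. The sketch below assumes a pod manifest has already been parsed into a dictionary. The slide showing the actual misconfiguration is not part of the transcript, so the fields checked here (`privileged`, `hostPath` volumes, `hostPID`, `hostNetwork`, `hostIPC`) are an assumption about commonly risky settings, and the function name and sample manifest are hypothetical.

```python
def risky_pod_settings(pod):
    """List settings in a pod manifest (as a dict) that weaken node isolation."""
    spec = pod.get("spec", {})
    findings = []
    # Host namespace sharing lets a container see or reach node-level resources.
    for flag in ("hostPID", "hostNetwork", "hostIPC"):
        if spec.get(flag):
            findings.append(f"{flag} is enabled")
    # hostPath volumes expose parts of the node's file system to the pod.
    for volume in spec.get("volumes", []):
        host_path = volume.get("hostPath")
        if host_path:
            findings.append(
                f"volume {volume.get('name')} mounts host path {host_path.get('path')}"
            )
    containers = spec.get("containers", []) + spec.get("initContainers", [])
    for container in containers:
        security = container.get("securityContext") or {}
        if security.get("privileged"):
            findings.append(f"container {container.get('name')} is privileged")
    return findings


# Hypothetical manifest resembling a monitoring agent (illustrative only).
agent_pod = {
    "metadata": {"name": "metrics-agent"},
    "spec": {
        "hostPID": True,
        "volumes": [{"name": "host-root", "hostPath": {"path": "/"}}],
        "containers": [
            {"name": "agent", "securityContext": {"privileged": True}},
        ],
    },
}

for finding in risky_pod_settings(agent_pod):
    print(finding)
```

A check like this only reports; blocking such pods at admission time would be the job of a policy engine running in enforce mode, which the talk covers under runtime controls.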

The build stage is all about building your application code into a container and then pushing that container to a safe repository. Deployment is all about provisioning your cluster: how you provision your IAM roles and your secrets in the cluster. And runtime is all about the application running in your containerized workloads. Conscious of time, I'll quickly skim through the key security controls. These are not exhaustive; these are the most significant ones I have. In the build stage, you always have to make sure you scan your base images, because they are the pillar of your container images. So always scan them for vulnerabilities.
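Scanning base images is most useful when the pipeline acts on the result. The following is a minimal sketch of a build gate, assuming the scanner's report has been loaded into a dictionary. The report shape (`Results`, `Vulnerabilities`, `Severity`, `VulnerabilityID`, `Target`) is modeled loosely on Trivy's JSON output and should be treated as an assumption to verify against the scanner actually in use; the function name, the severity threshold, and the sample report with placeholder identifiers are hypothetical.

```python
BLOCKING_SEVERITIES = {"HIGH", "CRITICAL"}


def blocking_vulnerabilities(report, blocking=BLOCKING_SEVERITIES):
    """Return (target, vulnerability_id) pairs that should fail the build."""
    blocked = []
    # "or []" tolerates keys that are missing or explicitly null in the report.
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            if str(vuln.get("Severity", "")).upper() in blocking:
                blocked.append((result.get("Target"), vuln.get("VulnerabilityID")))
    return blocked


# Hypothetical report with placeholder identifiers (not real CVEs).
sample_report = {
    "Results": [
        {
            "Target": "base-image (os packages)",
            "Vulnerabilities": [
                {"VulnerabilityID": "EXAMPLE-0001", "Severity": "LOW"},
                {"VulnerabilityID": "EXAMPLE-0002", "Severity": "CRITICAL"},
            ],
        },
        {"Target": "app/requirements.txt", "Vulnerabilities": None},
    ]
}

blocked = blocking_vulnerabilities(sample_report)
for target, vuln_id in blocked:
    print(f"blocking finding {vuln_id} in {target}")
if blocked:
    print("build should fail before the image is pushed")
```

In a pipeline, the final check would typically exit with a non-zero status so the image never reaches the registry.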

The next step is to sign your images with digital signing tools, and then push those digitally signed, vulnerability-checked images to your trusted container registry. How do you do it? There are some tools you can see on the screen that can help you achieve those tasks. So that's the build stage.

Then you have your deployment stage, where you deploy your cluster. The most significant controls there are secrets management, IAM controls, and network policies. You have seen how attackers are always after the secrets in the cluster. IAM controls are also always misconfigured, because IAM is a beast: you can't get it right in one go on day one; it's a process. How you achieve these controls is, again, through the tools that are shown on the screen. So that's your deployment.

Now, moving to the runtime controls. You have your security assessments: periodic, regular security assessments really help you achieve that runtime control. Then you have security policies. Security policies have to be in enforce mode. Please don't put them in audit mode; it does no good. I've seen most organizations putting security policies in audit mode, so please don't do that; have them in enforce mode. Then you have runtime threat detection: you can run Twistlock or other tools in the containers, and that does wonders for real-time detection of attacks. Again, tools like those shown on the screen can help you achieve that.

All right, thank you so much for listening to me. It was a roller coaster because of the time constraint, but I would like to conclude with key takeaways. The key takeaway for the attacking teams is: please don't restrict yourself to attacking Kubernetes. Make sure you go beyond Kubernetes and look at all the integrating assets when you're attacking a Kubernetes cluster. Make sure you are focusing on the blast radius. Don't just look at misconfigurations of the network policies; ask what I like to call the "so what". If you have a network policy misconfiguration, go beyond that and find ways to exploit it, of course with the consent of the client. For the blue teamers, the security champions: make sure you have a defense-in-depth strategy. Every environment is different and every organization is different. Look at the risk appetite of your organization and have a defense-in-depth strategy in place. I have seen it work really well in organizations. And last but not least, for organizations: the first thing I would like to tell organizations is that a Kubernetes cluster is not secure by design and not secure by default. It has a lot of misconfigurations that could be exploited. So please come out of that confusion and myth, and have regular offensive and defensive security assessments by trained security professionals. Those are the key takeaways. All right, thank you guys, that's pretty much time, and I hope to catch up with you all soon.
