
I'm going to hand it off to Lenin here, and now he's going to bring us home. Designated hitter. Boom.
>> Thank you. Thank you so much, everyone. I really appreciate it, especially you being here at this time of the day. I know that if I could, I would be getting a drink. Anyway, let me start. Today we're going to talk about Kubernetes, and specifically a technology called RBAC. I already got introduced a little bit, but my name is Lenin Alevski. Currently, I'm working in security. If you have a project, or if you are looking for feedback or for volunteers, please reach out to me. I would like to get involved. I've worked for startups, big corporations, and everything in between. I'm also a volunteer at Pacific Hacker Association, a non-profit that focuses on helping people get into cybersecurity, connecting them with hiring managers, and so on. And I'm obsessed with cybersecurity. I really love talking about security and hanging out with like-minded individuals, so please reach out to me and let me know what you are working on.
With that said, let me continue with the agenda for today. As I mentioned, this is going to be a talk on Kubernetes. I'm going to talk for a few minutes about the technology itself, for people who are hearing about it for the first time or are curious. Maybe your organization mentioned it and you have to learn it and secure it. So, that's going to be the introduction. Then we are going to move quickly to the threat model of Kubernetes. I'm going to explain some of the most common attack techniques, and then I'm going to introduce the website, a platform that I built along with a tool, called RBAC Atlas.
With that said, Kubernetes 101. What the hell is Kubernetes? Whenever people ask me about Kubernetes, I like to answer with an example like this: a diagram with different environments. You have a testing environment, production, pre-prod, development. Usually, this is what companies in the enterprise world want to have, right? They want different environments that are very close to each other, that they can spin up, destroy, and mutate in a declarative way, hopefully without having to worry about the actual hardware infrastructure. That's what Kubernetes offers. Kubernetes is a container orchestrator. It's very popular on the internet, in the enterprise, across organizations, and it's often called the operating system of the world.
Usually, when you work with this technology and look at the YAML manifests, what you see is some kind of definition of a Kubernetes service, then a deployment, how many replicas you want, and the images where the actual code will live. But in reality, when Kubernetes takes this configuration and works its magic, it gets deployed in such a way that you have a service exposed to the outside world. Behind that service there is a series of nodes, which can be virtual machines or bare metal. Inside those nodes there are different pods, and inside those pods there are different containers. So, you can very quickly see how the attack surface is growing.
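To make the manifest description concrete, here is a minimal sketch (not shown in the talk) that builds a Deployment and a Service as plain Python dictionaries and prints them as JSON, which `kubectl apply -f` accepts as well as YAML. The application name, image, replica count, and ports are placeholder assumptions.

```python
import json


def build_manifests(name, image, replicas, container_port):
    """Build a Deployment and a Service that fronts it (illustrative values only)."""
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            # How many pod copies the controller should keep running.
            "replicas": replicas,
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,  # where the application code lives
                            "ports": [{"containerPort": container_port}],
                        }
                    ]
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            # The Service routes traffic to every pod carrying these labels.
            "selector": dict(labels),
            "ports": [{"port": 80, "targetPort": container_port}],
        },
    }
    return deployment, service


if __name__ == "__main__":
    # Hypothetical application; the image reference is a placeholder.
    for manifest in build_manifests("web", "registry.example.com/web:1.0", 3, 8080):
        print(json.dumps(manifest, indent=2))
```

The single declared `replicas: 3` is what fans out into several pods spread across nodes, which is the growth in attack surface described above.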
So, with that in mind, there is a distinction that you have to make, right? We can divide the machines into two types. First, the control plane nodes, which are the most privileged ones. They run highly privileged processes such as the kube-apiserver and etcd, which is a NoSQL database that Kubernetes uses to store the state of the cluster. You also have things like the scheduler and the controller manager, which help Kubernetes orchestrate the software, deploy it, and basically reconcile the state between the manifest and the actual running application. Then you have the node components, which are the ones at the bottom. These nodes run the actual code, the actual images of the applications. Here you have some, I would say, less privileged processes such as the kubelet, the kube-proxy, and, very importantly, a container runtime, because we are talking about containers, right? That's kind of the backbone of Kubernetes.
So, putting on our red teamer hat, what can an attacker do? This is the threat model. If an attacker controls a container or a pod, say you just get remote code execution on an application, the attacker automatically has access to the source code of the application. If this is, say, a Python or PHP application, you can go and inspect the source code. But not only that: you can also read the configuration and environment variables. You can try to hit other services running on the same host or in other pods. Then you can try to escalate privileges and also try to attack the kubelet. The kubelet, as I mentioned, is a very important process that in earlier versions of Kubernetes didn't have any authentication. If you just query that endpoint, it gives you information about the containers running on that machine, basically some kind of inventory. Then, if the attacker controls the whole worker node, from there the attacker can try to escalate privileges or move laterally to a more privileged node, like the red ones. They also automatically own all the containers running on that particular node. And finally, if the attacker controls a control plane node, that's just game over. Why? Because each one of these processes is critical to how Kubernetes operates. If you mess with the scheduler, you may cause a DoS, right? And then people will be yelling at you. If you read directly from etcd and the database is not encrypted, you basically bypass all the authentication and authorization mechanisms that Kubernetes has in place, because you are accessing the data directly.
With that said, I'm just dropping some resources here. If you have to learn these things as part of your job, it's always good to start with the OWASP Kubernetes Top 10. They published it in 2022, and it is still very relevant. I'm not going into much detail on any of these because that's not the point of the talk, but I highly recommend this resource. Another good resource is what Microsoft did with the threat matrix for Kubernetes. For people who are familiar with the MITRE ATT&CK matrix, which is more focused on traditional tactics, techniques, and procedures, here is the same idea.
But it's focused on Kubernetes and container technologies. For example, on the traditional matrix you have something like initial access to a machine. Here you have initial access to the cluster, and all the techniques that can help you achieve that. You have the same kind of kill chain: initial access, execution, persistence, privilege escalation, defense evasion, and so on. I'm going to talk very briefly about some of these. I have a longer talk that explains them in detail, but because I want to move fast, I'm only going to cover the basic ones.
In Kubernetes, if you want to execute commands in the context of a running container in the cluster, you can do it in two ways. The first one is the traditional way. You are an attacker, you find an application, you poke around, and you get remote code execution. Then all the commands that you type may be able to reach other applications that are not necessarily exposed outside the cluster. That's the traditional web pen testing path. But as an attacker or red teamer, if you, for example, phish credentials from someone who operates the cluster and you manage to talk directly to the master node or the API server, you can just use regular kubectl commands.
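As an illustration of the second path, the sketch below only assembles a `kubectl exec` command line; it does not run anything. The pod name, namespace, and command are hypothetical. `--namespace`, `--container`, and the `--` separator are standard `kubectl exec` options.

```python
import shlex


def kubectl_exec_argv(pod, command, namespace="default", container=None):
    """Assemble (but do not run) a kubectl exec invocation as an argv list."""
    argv = ["kubectl", "exec", pod, "--namespace", namespace]
    if container:
        # Only needed when the pod has more than one container.
        argv += ["--container", container]
    # Everything after "--" is passed to the container, not parsed by kubectl.
    argv.append("--")
    argv += list(command)
    return argv


# Hypothetical pod and namespace names.
argv = kubectl_exec_argv("web-7d4b9c", ["cat", "/etc/hostname"], namespace="prod")
print(shlex.join(argv))
# prints: kubectl exec web-7d4b9c --namespace prod -- cat /etc/hostname
```

Whether the API server accepts the request depends on the credential: the identity needs the `create` verb on the `pods/exec` subresource in that namespace, which is exactly the kind of permission RBAC governs.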
The command for that is kubectl exec, and then you just type any command. If the credential allows it, you will be able to run commands inside the container very easily. So, those are some ideas. Using existing tools is a concept that I'm pretty sure everybody is familiar with: living off the land. As a pen tester on Kubernetes, you will be using this binary a lot. Another idea: if you have already compromised the pod or the container, you can do what I just mentioned, because you have access to all the credentials inside it. You can reuse those credentials to see if you can get access or move laterally to other services. And if you are running a Kubernetes cluster on a specific cloud provider, like AWS or Azure or any of the others, you can also try to attack their metadata services and obtain cloud identities that allow you to do lateral movement and get access to other resources. So, the attack surface is almost infinite.
Another thing: a lot of people, when they deploy applications in Kubernetes, treat the cluster as a traditional network perimeter. What does that mean? They just deploy them without authentication. There are plenty of Kubernetes dashboards that allow you to poke around and do anything on a cluster. If an attacker compromises an application that can be used as a proxy, they may find a lot of these high-privilege applications running inside the cluster and just do whatever they want. I already talked a little bit about this, but on these engagements you will usually find that companies love to deploy their clusters across multiple clouds, so it's very fun to jump around between clouds. Maybe you compromise a credential or an identity in a data center, on their bare metal setup, and that allows you to move to an AWS environment. Or you compromise a service account key from GCP, and that allows you to move to that cluster. You can totally do that.
If you have direct access to the node, for example, there are also very well-known paths where, if you just look at all the files in there, you find the virtual file systems used by the actual containers. So you don't have to hunt around and go into each container. You have all the files under that particular path. So, there is a
lot to learn here. Moving on.
Let's take a minute to appreciate this beautiful command that fits in a tweet. In the Kubernetes community, this is considered the canonical way to compromise a Kubernetes cluster. Basically, if you run this command on any freshly deployed cluster, I believe on any cloud, that is not hardened by default and has no policies in place, it spawns what is called a privileged container, which gives you the same experience as getting a root shell on that particular node. I'm going to explain a little bit of what is going on. What you see in this command is a lot of concepts like hostPID, privileged containers, and even nsenter. Hardcore Linux users may immediately identify what is happening. Basically, this command deploys a privileged container, which is a process that has access to the same PID namespace as the node and the same network namespace, and you can even mount the node's underlying file system on a particular folder. So, for all practical purposes, you are root on the node's file system. This works when the attacker has a credential powerful enough to deploy this kind of privileged container. But many times those credentials already exist in the cluster. By default, Kubernetes has a particular role called cluster-admin, which is basically like a root user or superuser, but for the whole cluster. So if someone makes the mistake of assigning this role to a service account associated with an application, and that application gets pwned because it's exposed to the internet, that's the same as giving the attacker complete control of the cluster, right? With that, the attacker can do malicious things like deploying crypto miners,
reading all the secrets from the etcd database, deploying additional privileged containers, and so on.
So I think that's enough, right? If you pay attention, everything that I mentioned depends on the container having enough privileges. Why? Because Kubernetes uses a system called RBAC. RBAC stands for role-based access control, and it's basically the authorization mechanism that we use to define, in a very granular way, who can do what on which resource, and how. When I say who, I mean the type of entity: a user, a group, a service account. The what is any resource, like a pod, a config map, or a secret. And the how is the verb: get, create, delete, and so on. The power of this system is that it is very flexible, so you can create very granular configurations. The disadvantage is that it is sometimes too complex, so people just surrender and create overly powerful policies that allow you to do everything on the cluster. And that is exactly the motivation for the research that I started doing.
Just to be clear, this vulnerability was discovered by Wiz researchers. I'm not part of Wiz Research, but they inspired me to do my follow-up work. So, very quickly, I'm going to explain what they did. They found a vulnerability in Ingress Nginx, which is the Nginx version that you run on Kubernetes, that allows you to read every kind of sensitive secret inside a cluster. I'm going to explain a little bit more about how they did it. This was a very high-profile vulnerability. It was rated 9.8. Basically, if you have this vulnerable piece of software in your cluster and somebody can talk to it, they can automatically read any TLS certificate, any config map, everything. I was reading the news, and they were explaining how the exploit works. They didn't actually publish an exploit, but they explained the attack chain and everything. So I challenged myself: can I build my own exploit based on their write-up? And it turned out that yes, after struggling for a couple of nights, I had a functioning exploit. I could replicate the attack. It was kind of challenging, but also kind of cool.
The way it works is in two steps. The first step: as with any web server, any HTTP server, you can just send a request to Nginx. In this case the body can be a malicious payload, such as a reverse shell shared object. If the body is bigger than 8 kilobytes, Nginx starts writing the content to a specific path on the file system, but you don't know which path. So the catch is that you have to start a request to a second endpoint. This is where the vulnerability is, because the second endpoint accepts configuration without any authentication. It's an unauthenticated endpoint that allows you to write configuration directly into Nginx. That's the vulnerability. And the trick, for people who are familiar with Linux, is that if you don't know the name of a file but the file is open, you can try to guess it by guessing the process ID and the file descriptor. So what you have to do is, while you are still submitting the file to the server, start a second connection that brute-forces the PID and the file descriptor. If you are lucky enough, you guess the right file, and that makes Nginx load your malicious code, which in this case is a reverse shell. Once you have the reverse shell, you can run any command in the context of Ingress Nginx. So far, this is a traditional remote code execution vulnerability, right? The problem is that this particular Ingress Nginx runs with a very powerful credential, which I'm going to talk about in a few seconds.
At a very high level, this was my experience doing all these experiments, because I do Kubernetes professionally, but I'm not a professional exploit writer. I struggled with the race condition, and in a Kubernetes environment, a container environment, the more you try, the bigger the number of processes and the longer it takes to guess. So it was a fun experiment. In the end I managed to write the exploit and the proof of concept.
And as I said, once I have access to that identity: the permission that Ingress Nginx uses, for starters, is what is called a ClusterRole. The scope of a ClusterRole is the entire cluster, which means this identity can touch any namespace and any resource allowed by the policy. You can see that what the policy allows is to list and watch all these different resources: the config maps, the endpoints, the nodes, the pods, and the secrets. I can understand the use case, because Nginx handles things like rotating certificates, so it needs to be able to access the secrets of different namespaces. But in the hands of an attacker, you can use that not only to read TLS secrets but to read any other kind of secret in the cluster.
So this is what I just mentioned. I just want to show you the visual side of it. Imagine you have a cluster with many applications. You have your Nginx, you have a WordPress application, and you have a tool called Argo CD, which is a GitOps platform that lets you do a lot of configuration and deployment as code, which is very powerful. If someone compromises the WordPress and has code execution on that container, and from there can reach an Nginx that is not necessarily exposed to the internet but can still receive requests, they can use the exploit that I just explained to you. Once you compromise this Nginx, you can use its identity to go and read all the secret credentials from the Argo CD namespace. In particular, whenever you deploy Argo CD on Kubernetes, it creates a default credential that, if the administrator doesn't change it, allows you to log in with admin privileges. So if you steal the credential from there, you can deploy the privileged container that I explained before, and get root access to any node in the cluster. That's why this vulnerability was a very big deal in the community.
Everything that I explained is why RBAC is a hidden security minefield. The first reason is that RBAC misconfiguration is one of the top causes of security issues, and that's because RBAC is very powerful but also very, very complex. Not a lot of people know how to configure it properly. A lot of open source projects just ship their stuff without a proper security assessment, and nobody really audits these kinds of things.
How are we doing on time? Okay, I need to do the demo. So the idea was: I was able to replicate this, so maybe I can automate it. I can create a small policy analyzer, scan all the Kubernetes projects, at least the most popular ones, and maybe automate the whole thing and turn it into a website that publishes findings every day. And that's exactly what I did. Here I'm introducing RBAC Atlas, which is the platform. I'm going to show it a little bit more, but first I built the actual policy analyzer. I think you can see that. It's just a small CLI tool that you can point at any artifact. I think you can read that. My goal with this tool was to detect the finding that the Wiz researchers reported. And here, on this particular version that is vulnerable, you can see the two critical ones: it detects that it has a ClusterRole, and list and watch on all secrets, right?
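The kind of check described here can be sketched in a few lines. This is not the speaker's analyzer or its rule format: the function name, the severity labels, and the example role are assumptions made for illustration. It flags any rule in a Role or ClusterRole manifest that grants read verbs on `secrets` in the core API group.

```python
READ_VERBS = {"get", "list", "watch", "*"}


def find_secret_read_rules(role):
    """Return findings for rules in a Role/ClusterRole dict that allow reading secrets."""
    findings = []
    cluster_wide = role.get("kind") == "ClusterRole"
    for index, rule in enumerate(role.get("rules", [])):
        api_groups = rule.get("apiGroups", [])
        resources = rule.get("resources", [])
        verbs = set(rule.get("verbs", []))
        # Secrets live in the core API group, written as the empty string.
        in_core_group = "" in api_groups or "*" in api_groups
        touches_secrets = "secrets" in resources or "*" in resources
        granted = sorted(verbs & READ_VERBS)
        if in_core_group and touches_secrets and granted:
            findings.append({
                "rule": index,
                "verbs": granted,
                # Assumed labels: a ClusterRole can apply to every namespace.
                "severity": "critical" if cluster_wide else "high",
            })
    return findings


# Hypothetical role shaped like the policy described in the talk.
example_role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "ClusterRole",
    "metadata": {"name": "example-ingress-controller"},
    "rules": [
        {
            "apiGroups": [""],
            "resources": ["configmaps", "endpoints", "nodes", "pods", "secrets"],
            "verbs": ["list", "watch"],
        },
        {
            "apiGroups": ["networking.k8s.io"],
            "resources": ["ingresses"],
            "verbs": ["get", "list", "watch"],
        },
    ],
}

print(find_secret_read_rules(example_role))
# prints: [{'rule': 0, 'verbs': ['list', 'watch'], 'severity': 'critical'}]
```

A real analyzer would also need to follow the bindings, since a ClusterRole attached through a namespaced RoleBinding only applies inside that namespace. This sketch looks at the role alone.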
So, I could do that, and then I just automated the whole thing into a website. The website right now is... well, here are the repositories. This is RBAC Scope. Something super cool: if you go to any GitHub repository and change .com to .dev in the URL, you open an editor. I created a series of rules. I have about 110 rules now to flag all these misconfigurations. And by the way, all of this was vibe coded, right? I'm able to do that because I worked on Kubernetes for many years before AI, so I'm able to smell the AI slop and fix it.
So, yeah, I've been growing this list of detections, and if you want to contribute a new detection, you can. For example, the most recent Kubernetes vulnerability was a remote code execution on node proxy. The day they released the announcement, I just vibe coded a new detection rule, and I was able to find this issue in many other projects. So this is the RBAC Scope component, which is the policy analyzer. Then I created RBAC Atlas, which is the automation: it takes a list of projects, and the list of projects is this one. These are all the projects that I'm tracking, and if, I don't know, you want to add your competitor there, you can just go and put it in. I'm not spending any money on this. Everything runs on GitHub infrastructure, so thank you so much, GitHub. This runs daily and updates the website. On the website you can, for example, come here and search for ingress-nginx, and you can see the latest versions. I added that this week, because I've been scanning all the artifacts. I just scanned over 25,000 artifacts, so I have data from the last 6 months. You can see all the detected vulnerabilities, which service account or identity has them, an explanation of the risk, and the actual commands to exploit it. The idea is that I now have a cloud native threat landscape of, at least, the most popular projects in the industry, and I can see how we are doing as an industry, right? And all of this is open source. All of it is in the repos. So, I think we have about 3 minutes.
I already showed you the demo, so now the results. I'm getting all this data mainly from two websites, OperatorHub and Artifact Hub. When you develop something for the cloud native ecosystem, you mostly publish there, and I can see which projects are the most popular in terms of downloads. So I just take all of that, and I start getting some interesting metrics. For example, the top five RBAC risks detected across the most popular projects. I was not surprised: information disclosure, data exposure, things that a lot of policies get wrong. I was more interested in seeing which are the top 10 riskiest projects, why they are risky, and the different potential vulnerabilities they may have, in terms of critical, high, medium, and so on. I was not surprised that some of them manage hardware and volumes. The whole list is on the website.
And the report summary: so far I'm tracking over 257 projects and 108 repositories, with an average of two service accounts, 30 bindings, three workloads, and so on. You can see all of that. My criteria for rating risk: if I find a star on any API group, resource, or verb, that's a critical, because it allows you to do basically everything on that particular resource. For the highs, you still have some stars, but you are scoping them to particular resources, so the risk gets a little bit lower, but still, you know, it's bad. And the mediums follow the same idea. With the policies, I'm able to tweak things and target very specific misconfigurations for detection, and that's how I'm able to, for example, replicate the Wiz findings or the node proxy detection that I just showed you.
With that, I think that's all I have to show you today. If you have more questions, you can reach out to me. This is my social media. I'm always happy to connect and talk more about security. If you want to contribute to the projects, here are the links. The first one is the actual policy analyzer. You can totally integrate that into a GitHub flow. Then the actual website is RBAC Atlas, and if you want to add additional projects to be tracked, you can do it directly by opening a GitHub issue or a PR. I would appreciate it very much.
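The wildcard criteria described a moment ago (a star is worst when nothing scopes it, less bad when the rule still names particular resources) can be sketched as a small classifier. The exact thresholds below are an interpretation for illustration, not the project's published scoring, and the function name is hypothetical.

```python
def wildcard_severity(rule):
    """Rate one RBAC rule by where "*" appears (assumed thresholds, for illustration)."""
    any_group = "*" in rule.get("apiGroups", [])
    any_resource = "*" in rule.get("resources", [])
    any_verb = "*" in rule.get("verbs", [])
    if any_resource:
        # Nothing scopes the rule to particular resources.
        return "critical"
    if any_group or any_verb:
        # A star remains, but the rule still names specific resources.
        return "high"
    # Fully enumerated rule; it may still be risky, so it is not rated "safe".
    return "medium"


# Hypothetical rules covering the three cases.
print(wildcard_severity({"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]}))
print(wildcard_severity({"apiGroups": [""], "resources": ["secrets"], "verbs": ["*"]}))
print(wildcard_severity({"apiGroups": [""], "resources": ["pods"], "verbs": ["get"]}))
```

Under these assumptions the three calls print `critical`, `high`, and `medium`. More specific detections, such as the secrets finding, would sit on top of a coarse rating like this one.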
With that, thank you so much.
>> Ladies and gentlemen, give it up for Lenin Alevski. And since you're the last talk of the day, if there are any questions, if you guys want to shoot questions out, even if you didn't put them in the app, I'll let him take a couple, because this is the last talk in this theater for the day.
>> I think Nginx Ingress is now basically abandoned.
>> Yes. Yes.
>> That's public too, that critical CVE.
>> Yeah, that's the problem a lot of the time with all these projects, right? They get abandoned, new vulnerabilities are discovered, and nobody will patch them. So, hopefully somebody has picked it up already. But yeah, I read that, and I was kind of surprised, because a lot of companies that I know, enterprises, use it, and they have to do something about it.
>> Great. I think there is a question. There are two questions over there.
>> If you're allowed to answer this: does Google internally use the open source Kubernetes, or do they have their own copy?
>> I didn't mention this here, but I'm not representing my employer in any way, past, present, or future. But big companies are very big. They use all kinds of technologies.
So, yeah, probably.
>> Yeah. A Fortune 500 company probably wouldn't use the open source.
>> I mean, the Kubernetes maintainers are distributed across different companies. I know that, and this is public information, the CNCF is made up of many engineers from different companies. So there are Google employees actively working on protecting Kubernetes or making it safer, the same as at other companies. So, yeah.
>> Thank you.
>> Yeah. I think there was another question there.
>> So, for a company or enterprise that's rolling out Kubernetes and has lots of first-party stuff deployed on it, can you point this tool at our RBAC configuration and all the roles being defined for our first-party software, and understand what the risk is?
>> Yeah. I designed it in a way that it doesn't need to run inside the cluster. It doesn't require any kind of identity. You can just point it at Helm charts or Kubernetes YAML configuration, and it will use all the detection rules that I have available. If you need a specific one for your use case, you can totally create new ones. It's going to give you a report like the one that I showed you, and it's going to tell you the different levels of risk depending on the toxic combination, which is what they call it. So you can totally do that, yeah.
>> One more. We have time for one more. Please, one more.
>> I think it's absolutely exciting work. It's really great to see this built on the back of a simple, flexible RBAC definition, where the ecosystem has the same access control syntax for a whole host of tools, so you can run this analysis offline, without having to interact with the cluster. What are the prospects, or has anyone approached you with systems like this, for other app ecosystems or other kinds of policy languages? Because I'd argue that the other big threat to our industry is the sheer fragmentation of them: you can't do this for everyone's Force.com integrations on Salesforce, or other IAM systems, or other clouds.
>> Yeah. For that particular question: I've had this project in my mind for the last four years, but I never had the time or never felt motivated enough. But then with all this vibe coding, I had a working version in a weekend. After that I was mostly iterating, like making the website not look that ugly. But for the engine itself, I think I had a mature version in about a week. And now, with AI, I think you can totally translate the concepts into other YAML definitions or other policy definitions. I know of companies that are basically translating query syntaxes from, let's say, Splunk to another SIEM.
>> SIEM abstraction.
>> Yeah, yeah. Because they have the same problem. They have all these different syntaxes, all these different applications. So I think you can totally do that, analyzing different types of policy definitions.
>> I suggest Open Policy Agent, or...
>> Mhm. Yeah, that's a good idea. Thank you.
>> Again, ladies and gentlemen, please give it up for our last speaker. Good night.