
Effective building blocks for securing multi-tenant Kubernetes clusters

BSidesSF · 2024 · 30:42 · Published 2024-07
Speakers: Shrikant Pandhare, Sagiv Sheelo
Category: Technical · Style: Talk
About this talk
Effective building blocks for securing multi-tenant Kubernetes clusters (Shrikant Pandhare, Sagiv Sheelo). Learn about Snapchat's journey enabling secure multi-tenancy in a Kubernetes-based service mesh platform. This session dives into successful patterns of automated least-privilege access provisioning and practical security trade-offs for securing against container escapes. https://bsidessf2024.sched.com/event/840b4a8179a8275d6c6450e4e5a304df
Transcript

Please give a round of applause to our next speakers, and enjoy their talk on effective building blocks for securing multi-tenant Kubernetes clusters. Thank you. A quick note for everyone: we have a Q&A online, so if you want to ask questions for the speakers, please go to the BSidesSF Q&A page and we'll read out most of the questions whenever we have time at the end of the session. Thank you. Cool, thank you. Good evening everyone, and thanks for joining us. I know this is the last session just before drinks, but the topic is important: it's about platform security, and I believe it's a foundational thing for any application.

So it's going to be interesting. Today we are going to share our journey and learnings of securing multi-tenant Kubernetes clusters.

Is the slide going? Okay, there we go, we're back. I'm Shrikant, engineering manager for the infrastructure security team at Snap. I work on securing our service mesh platform, which is based on Kubernetes; we use AWS and Google Cloud for our cloud-native infrastructure. Hey everyone, super excited to be here today. I'm Sagiv Sheelo, a security engineer working on infrastructure security at Snap, with a focus on our cloud and Kubernetes security controls. We'll briefly talk through the evolution of our platform over time, then we'll cover why certain decisions led us to consider multi-tenancy. Then we'll talk about how the new design changed our existing security boundary and introduced new security risks.

Next we'll talk about the security controls we built to mitigate those risks. And as we all know, every project goes through challenges, and so did ours; we'll share some of the organizational as well as technical challenges that we faced. Here is a quick history of how our platform evolved up to the project we're going to talk about. Snap started on Google App Engine. It was a great platform to bootstrap an idea and scale quickly to millions of users sending and receiving snaps. However, about five years into its life, Snap grew to multiple user-facing products as well as backend services. As you all know, monolith architectures have limitations, and we started hitting those limitations, like scalability and developer velocity. At the same time, Snap made a strategic decision to go multicloud, hence we wanted a platform that would support multiple clouds. That's why we decided to have an abstraction like Kubernetes that can work on multiple clouds, and that's how we came up with our first service mesh platform. This is a quick overview of our single-tenant service mesh platform. We use open-source Envoy for our API gateway at the edge as well as for the sidecars on each of the service deployments.

Switchboard is our in-house service configuration portal that syncs all the control plane artifacts, like service-to-service or gateway routes, and other configurations like transport parameters. Each of the services gets its own Kubernetes cluster, which is managed by the respective service team, and this is the big difference from the platform we eventually moved to: every service having its own cluster. Here are some of the design choices for the mesh and their limitations. We chose a cluster per service. We made this design choice to limit the blast radius, and as a cautious step, since we were adopting Kubernetes for the first time, and at very large scale.

Given the business need for multicloud, there was no feature parity between the two cloud providers we were using and planning to use for managed Kubernetes services, so we had to make some trade-offs there. However, as the number of services grew, it exploded the number of clusters we had to manage and secure. Our operations were decentralized during that time and we did not follow an SRE model, so we had to rely on our engineering teams for cluster management, and this often delayed cluster upgrades and patching. Our path-to-production story was not strong.

We were not able to enforce a golden-path CI/CD; this made service rollout to our mesh challenging, error-prone, and full of redundant engineering work. All of these challenges made us rethink our mesh platform, and as we were rethinking the mesh, we also thought about what additional security controls we needed to put into the platform. We had pretty broad IAM for our services; operators had unrestricted access to production, and we wanted to minimize that as a security best practice. Deployments lacked provenance, and our production lacked integrity against unknown compromises.

Also, as I said earlier, because of the large fleet and the exploded number of clusters, vulnerability response was difficult: it was delayed, and it was difficult to get unified tooling around it. Our fleet had become very heterogeneous, and it was becoming expensive for our detection and response teams and our runtime monitoring tooling to be deployed across such a large fleet.

Given these limitations that we discussed with the single-tenant service mesh, we thought about optimizing it, and that's how we came to building multi-tenancy within our Kubernetes clusters. Before we move ahead, let's establish some common terms and an understanding of what we mean by multi-tenancy. By single tenant, as I said before, we mean a cluster per service. For multi-tenancy, we are packing multiple services within the same Kubernetes cluster; that means services would be sharing a node. Though all of these services belong to Snap, they are owned by different teams, and those services might have different profiles, threat models, and permission models.

These services might be owned by teams that are geographically distributed, so there were a lot of security concerns when running these services in a shared environment. Some of you might have a similar kind of model, where the tenants could be your customers, or third-party code or applications that you're running, but I think the controls and the risks remain similar. So what was the problem statement for us? To build a multi-tenant platform that works on multiple public clouds. The primary objective was constructing an abstraction that provisions all the Kubernetes primitives for multi-tenancy and standardizes the developer experience across multiple clouds.

Basically, we use Google and AWS. These were some of the tenets that we put forth for ourselves: to make an opinionated platform that enforces best practices of CI/CD, observability, and security, because we didn't want a lot of drift or heterogeneity within our platform, which makes security tooling, upgrading, patching, and maintenance tricky; and to centralize the infrastructure to enable a high velocity of infrastructure changes. Given this problem statement, objectives, and tenets, this is how we started building it. Now I will hand it over to Sagiv, who will take us through deep dives on the security risks and mitigating controls in our multi-tenant mesh.

Thank you, Shrikant, for introducing our path to multi-tenant clusters. Now we'll continue the discussion on the security risks and controls in multi-tenant Kubernetes clusters. Let's begin by exploring the security risks in shared compute environments. In a traditional single-tenant cluster, the boundaries of a service are well defined: only a single service is deployed per cluster, and there is no risk via compute or Kubernetes abstractions of services interacting with each other. However, Kubernetes was not originally built for secure multi-tenancy. When services are deployed to the same clusters, this lack of default isolation can lead to privilege escalation between tenants. For example, permissions via the control plane are not restricted to a single tenant's namespace.

Pods can also be configured to access the nodes, allowing service owners or deployments to abuse the lack of isolation and escalate their privileges onto the node and the other services running there. Another risk is the use of overly permissive identities. By default, Kubernetes uses a node identity for all services, and all tenant-required permissions would need to be granted on the node. That does not follow the least-privilege principle: services would share each other's permissions to cloud resources. Unauthorized access is a significant risk in multi-tenant Kubernetes clusters. Developers often have broad access to their clusters by default; for example, developers may inadvertently or maliciously access other services, or logs belonging to other tenants.

Allowing developers access to services owned by other tenants increases the risk of intentional or accidental changes that can impact the entire cluster. To address all these risks, we need robust security controls. We'll overview each of these controls for the three areas. For privilege escalation, we proactively block known bad configurations that could lead to privilege escalation, we enforce policies for service placement to ensure that services are scheduled on clusters based on their sensitivity and potential impact, and we monitor and patch disclosed vulnerabilities to minimize the risk of exploitation. For the overly permissive identities issue, we implement per-service and per-component identities, ensuring that each service has only the permissions it needs to operate, following the principle of least privilege.

For unauthorized access, we limit developers' access to only their service namespaces, and we also restrict access to service logs, ensuring that each tenant has access to only their own logs. Okay, Kubernetes by design does not provide isolation guarantees between pods running on the same node. With a simple pod configuration, pods can access the node and all the containers running there. This means that a compromised service pod could potentially access and manipulate other pods on the same node, even if they belong to a different service in a different namespace. To mitigate this risk, we employ a custom admission controller.

For those that are not familiar, an admission controller is a Kubernetes mechanism that intercepts and validates requests to the Kubernetes API before they are persisted into the cluster. It allows us, as a production security team, to enforce security controls that apply to all Kubernetes deployments and to block deployments that don't meet our policies. In our case, we built policies into our admission controller based mostly on the Kubernetes Pod Security Standards. These are standards of best practices for hardening pods in Kubernetes. We chose controls from these standards that prevent the known privilege escalation paths we care about in multi-tenant clusters. For example, we restrict the usage of privileged containers, which have access to all Linux capabilities, and we prevent pods from using the host namespaces, which would allow pods to access the node and all the containers running there.
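As a rough, hypothetical sketch (not Snap's actual controller), checks of this kind from the Pod Security Standards could look something like the following; the function and the pod-spec shape are illustrative only:

```python
# Hypothetical sketch of the kind of pod checks a validating admission
# controller might run, based on the Kubernetes Pod Security Standards.
# Real controllers receive an AdmissionReview object over a webhook;
# here we just validate a plain pod-spec dict.

def violations(pod_spec):
    """Return a list of policy violations for a pod spec (dict)."""
    found = []
    # Host namespaces break isolation between co-tenant pods.
    for ns in ("hostNetwork", "hostPID", "hostIPC"):
        if pod_spec.get(ns):
            found.append(f"{ns} is not allowed")
    for c in pod_spec.get("containers", []):
        sc = c.get("securityContext", {}) or {}
        # Privileged containers get all Linux capabilities.
        if sc.get("privileged"):
            found.append(f"container {c['name']}: privileged is not allowed")
        # Added capabilities widen the escape surface.
        for cap in (sc.get("capabilities", {}) or {}).get("add", []):
            found.append(f"container {c['name']}: added capability {cap}")
    return found

bad = {"hostPID": True,
       "containers": [{"name": "app",
                       "securityContext": {"privileged": True}}]}
print(violations(bad))  # two violations: hostPID and privileged
```

A real deployment would serve these checks behind a ValidatingWebhookConfiguration and deny the request when the list is non-empty.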

Both of these would break isolation between co-tenant services, and by enforcing rules from these standards we can significantly reduce the risk of privilege escalation between services in our multi-tenant clusters. We occasionally encountered scenarios where service owners were used to quickly debugging their service by exec'ing into their containers using pod exec. You'll recognize that pod exec can be extremely risky, especially if there are containers with elevated permissions within a service namespace. To mitigate this risk, we implemented a targeted approach that allows the necessary access while maintaining security: specifically, we limit the pod exec permission exclusively to non-privileged, service-owned containers, in addition to auditing all actions performed on production.
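The exec restriction described here could be sketched roughly as follows; the metadata fields and the "service vs. system" ownership model are assumptions for illustration, not Snap's implementation:

```python
# Hypothetical sketch of a pod-exec admission rule: exec is allowed
# only into non-privileged containers owned by the service, and only
# in the requester's own namespace. Field names are invented.

def exec_allowed(requester_namespace, container):
    """Decide whether a `pod exec` request should be admitted."""
    if container["namespace"] != requester_namespace:
        return False          # only your own namespace
    if container.get("privileged"):
        return False          # never into privileged containers
    if container.get("owner") != "service":
        return False          # system/sidecar containers are off-limits
    return True

app = {"namespace": "friends-service", "privileged": False, "owner": "service"}
sidecar = {"namespace": "friends-service", "privileged": True, "owner": "system"}
print(exec_allowed("friends-service", app))      # True
print(exec_allowed("friends-service", sidecar))  # False
```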

This control ensures that privileged system containers don't become a path to privilege escalation. Identities are a basic security concern: the use of overly permissive identities is a huge risk, as you may have run into. Pods run with the node identity; in GCP this defaults to the Compute Engine service account with Editor permissions. Using this identity does not meet least privilege, especially when we start needing to grant permissions for different services to access their own data. To ensure least-privilege identities, we use workload identity federation to deploy pods with a cloud identity distinct from the node identity.

It allows us to deploy pods with per-service identities, ensuring each service runs with only the permissions it needs. For instance, a friends service might have permission to access the friends database, but other services running alongside it will not be given that permission. We've also ensured provisioning of identities per Kubernetes component and internal infrastructure add-on, and with workload identity federation, identity access moved away from the node and onto identities linked to pod service accounts associated with cloud identities. Okay, now that we have established our identity model, let's jump into how we think about access. As Shrikant mentioned, when we were re-evaluating the security posture of the mesh, one thing we wanted to address was broad developer access to their clusters and deployments.

In our multi-tenant environments, hardening access becomes a critical control: developers with direct access to shared clusters would be a risk to all services deployed, and our platform wouldn't be possible without limiting access to clusters. To accomplish this, we limit deployments and changes to the cluster to a golden CI/CD path. The golden path ensures that all changes are tracked and audited, providing a clear trail of who did what and when, and we use RBAC to ensure that service owners' access to their clusters is limited to non-mutating actions in their own namespaces. We don't want to let them make out-of-band changes to production outside of the golden CI/CD path.
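As an illustration of namespaced, non-mutating access of the kind described here, a read-only RBAC Role might look like the following sketch; the role name, namespace, and resource list are made up:

```python
# Hypothetical sketch of a namespaced, read-only RBAC Role: service
# owners get only non-mutating verbs in their own namespace.
READ_ONLY_VERBS = ["get", "list", "watch"]
MUTATING_VERBS = {"create", "update", "patch", "delete", "deletecollection"}

def read_only_role(namespace):
    """Build a Role manifest granting read-only access to common resources."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": "service-owner-readonly", "namespace": namespace},
        "rules": [{
            "apiGroups": ["", "apps"],
            "resources": ["pods", "pods/log", "deployments", "events"],
            "verbs": list(READ_ONLY_VERBS),
        }],
    }

role = read_only_role("friends-service")
# No rule grants a mutating verb, so out-of-band changes are impossible.
assert not any(set(r["verbs"]) & MUTATING_VERBS for r in role["rules"])
```

A RoleBinding (not shown) would then tie each service team to the Role in its own namespace only.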

Now that we have covered access to deployments, let's cover access to the Kubernetes control planes. They're on the internet by default, and we need to protect against external attacks and follow the best practice of limiting access to control planes. At Snap we have our own BeyondCorp-style zero-trust access pattern, and we use an auth proxy that we built at the edge to provide context-aware secure access. We have added Kubernetes support to that same proxy, and we enforce its usage by limiting control plane access to the proxy's IPs. The security control is transparent to developers: their experience connecting via kubectl is unchanged.

We generate kubectl contexts for them that go through our auth proxies, and also set their service namespace access by default. So far we have discussed controls on production; let's talk about hardening the path to production. To ensure deployments come from our golden path, our CI/CD pipeline signs the digests of built images, and our admission controller validates the signatures of pod container images. This allows us to ensure provenance from code commit to build to image, and only approved and verified code is allowed to be deployed on clusters, reducing the risk of malicious code being deployed. One of our challenges here was moving away from mutable image tags to image digests.
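A minimal sketch of the digest-pinning side of this check might look like the following; the registry name is invented, and real signature verification (for example against the CI/CD signing key) is omitted:

```python
import re

# Hypothetical sketch of one admission check from the talk: container
# images must be referenced by immutable sha256 digest and pulled from
# an approved registry. The registry name is made up for illustration.
APPROVED_REGISTRIES = ("registry.example.com/",)
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")

def image_allowed(image_ref):
    """Allow only digest-pinned images from approved registries."""
    from_approved = image_ref.startswith(APPROVED_REGISTRIES)
    pinned = DIGEST_RE.search(image_ref) is not None
    return from_approved and pinned

print(image_allowed("registry.example.com/app@sha256:" + "ab" * 32))  # True
print(image_allowed("registry.example.com/app:latest"))               # False
```

Pinning by digest means the admission decision and the signature both refer to exactly one immutable image, which is what makes the provenance chain meaningful.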

It required us to update all our platform components to support container image digests, but this cost trade-off was worth the security guarantees that come with using immutable, signed digests. It's not only about platform security; we also care about the isolation of our service logs. Our services use Stackdriver logging, which emits logs to the shared cluster's GCP project without per-tenant isolation. To address this, we configure GCP log sinks and buckets that route logs to service-owned projects, ensuring per-service isolation. It means each service's logs are ingested directly into their respective service-owned projects, preventing unauthorized access to logs from other service owners.
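A rough sketch of per-namespace log routing along these lines, with invented project and bucket names and without claiming this is Snap's exact sink configuration:

```python
# Hypothetical sketch of per-service log routing: for each tenant
# namespace, build a GCP log-sink spec that routes that namespace's
# container logs out of the shared cluster project and into a
# service-owned project's log bucket.

def log_sink_config(namespace, dest_project):
    """Return a log-sink spec routing one namespace's logs."""
    log_filter = (
        'resource.type="k8s_container" AND '
        f'resource.labels.namespace_name="{namespace}"'
    )
    return {
        "name": f"{namespace}-to-{dest_project}",
        "destination": f"logging.googleapis.com/projects/{dest_project}"
                       "/locations/global/buckets/service-logs",
        "filter": log_filter,
    }

sink = log_sink_config("friends-service", "friends-prod-logs")
print(sink["filter"])
```

In the talk, creating these sinks and buckets happens automatically as part of service onboarding, so tenants never have to configure them by hand.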

The configuration of these log sinks and buckets all happens automatically during service onboarding to a shared cluster. At this point, all our security controls give us isolation between services and secure the access to them, but for our team and the other security stakeholders, multi-tenancy still felt like a pretty big risk. In order to further reduce fears around the shift in the security boundary, we had to go back to the drawing board to think about how we want to colocate services on clusters, by threat modeling their attributes, such as access to sensitive data, internet exposure, and service tier. It was a tough discussion, as it was hard to align on the security benefit given the increased complexity of managing more clusters.

But in the end, to further control the blast radius of services sharing the same cluster, we evaluate a co-tenancy policy at workload onboarding that limits which services can be deployed to which clusters based on their attributes. At this point we felt our multi-tenant clusters were secure with all the controls and policies, but new kernel vulnerabilities could undermine our assumptions around container isolation and allow a service to escape its boundary onto the node and the other services running there. To mitigate against this and other potential vulnerabilities, we enforce the usage of auto-upgrading node pools where possible, and we have the ability to upgrade the entire fleet at any given moment.
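A co-tenancy policy of this shape could be sketched roughly as follows; the attributes (sensitive data, internet exposure, tier) come from the talk, but the specific rule and thresholds are illustrative assumptions:

```python
# Hypothetical sketch of a co-tenancy placement policy: services are
# threat-modeled on a few attributes, and a workload may only onboard
# to a cluster whose tenants share a compatible risk profile.
from dataclasses import dataclass

@dataclass
class ServiceProfile:
    name: str
    tier: int                     # -1 foundational ... 2+ auxiliary
    internet_exposed: bool
    handles_sensitive_data: bool

def can_colocate(a, b):
    """Allow co-tenancy only for services with compatible risk profiles."""
    if a.handles_sensitive_data != b.handles_sensitive_data:
        return False              # isolate sensitive-data services
    if a.internet_exposed != b.internet_exposed:
        return False              # isolate internet-exposed services
    return abs(a.tier - b.tier) <= 1   # keep only adjacent tiers together

api = ServiceProfile("api-gateway", -1, True, False)
batch = ServiceProfile("batch-job", 2, False, False)
print(can_colocate(api, batch))  # False: different exposure, far-apart tiers
```

At onboarding time, a new workload would be checked against every tenant already on a candidate cluster before being admitted.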

We have a pipeline for our on-call to get alerted on security bulletins as they are disclosed, ensuring that we patch critical vulnerabilities promptly. I hope this was an insightful journey through all our multi-tenant security controls. I'll now hand off to Shrikant for a recap and conclusion. Thank you, Sagiv, for discussing all the effective controls to mitigate security risks. This is a high-level view of our end platform, and it encapsulates most of the things we have said in this session so far. If we follow from the left-hand side, notice the engineers: they access the platform only through our zero-trust authenticating proxy.

This is the first security control in the entire platform: it validates engineers' identity and access levels, and it also has edge policies that enforce things like device trust in the case of sensitive applications. Our engineers only have limited read-only access to production, for their specific Kubernetes namespaces, just to check their deployments. For mutating operations, or changes to deployments or the service, it has to go through CI/CD, and before each deployment reaches production, it has to pass through a set of security policies. These security policies are enforced by the admission controller, and if any of them fail, the deployment fails and doesn't go to production.

Some of the policies are the controls that we discussed earlier. For example, build attestation, that is, provenance measures indicating that the services have been built through our trusted CI/CD platform. Another policy we enforce is hardening configurations on our containers: for example, if there is an unapproved CAP_SYS_ADMIN capability on the containers, we validate that and fail those deployments if that overly permissive capability is not approved. We also validate the registry of the image and ensure that images are pulled only from our approved sources.

Now if you look at the production platform, on the right-hand side, you will see there are several platform security controls. Though the services are running in shared clusters, there is isolation: for example, each service runs with its own identity, and this is different from the node identity. The service identity has only the permissions that are needed for its own purpose, and every service has specific, service-scoped permissions on those identities. Node identities are generally overly permissive in Kubernetes environments, and we wanted to remove or reduce that attack surface as well, so these run only with the permissions needed for the normal operation of the Kubernetes cluster.

Coming to our platform evolution roadmap: we went from a monolith to a single-tenant service mesh, and now we have a multi-tenant service mesh with strong service isolation and workload hardening in shared environments, running on our Kubernetes fleet.

This is the most interesting point, and some of the takeaways for you all if you plan to have this kind of initiative in your company. As you all know, large projects are not completed without challenges, and we had quite a few along these lines. The first thing is migrations. I think everyone here will agree migrations are tricky: there will always be some unknowns in the path, and something will break. For our migration path, we categorized our services into four archetypes: services running in legacy infrastructure, services which were integrated as part of mergers and acquisitions, services running in our legacy mesh, and new services which were moving to the multi-tenant environment.

To approach this, we basically followed the service tiering model that we have. The tiering model is based on the business needs of a service and how critical it is to the business of our company. For example, tier minus-one is a foundational service, which could be something like the API gateway; tier-zero and tier-one services are user-facing services supporting core products, directly in the critical path; and tier-two-plus services are not in the critical path but provide auxiliary functions.

We started adopting the platform with tier-two-plus services, and that's how the initial onboarding started. But what we realized is that if we had to ask tier-zero services to move to this platform, our platform was not trusted enough. To get around this, we had to do our own dogfooding: we had our own tier-zero and tier-one services migrated to this platform. That way we could build trust amongst the other service teams and product teams and convince them that this is a platform with strong isolation. We were also recommending a change in developer behavior, where direct access to production, the mutating permissions on production, was removed.

This was a behavioral change for developers, but to ensure that we understood their concerns as well, we provided easy paths and alternate paths, as well as break-glass scenarios wherever there was a critical business need. This was a large initiative, and the MVPs of such large initiatives take longer. What we realized is that it is very important for leadership to always show patience and endorse such a project, because the realized impact can take a while. Now, some of the technical challenges: the requirements of the services were not as uniform as we thought; there were a lot of bespoke requirements as we started onboarding and understanding the profiles of the services.

Some of the services needed local VPCs. This becomes tricky in shared environments: you have to rely on Kubernetes network policies, but then again that has to sync well with either the security groups on AWS or the firewall rules on GCP, and the whole model was becoming too complex. We also had to ensure that we provided backward compatibility as we migrated the services, so we had to cover a lot of different areas to ensure a successful migration.

Access patterns are also very tricky. There are operators accessing the platform, there is network-based access for the services, and then we have access to the Kubernetes control plane, for example through the authenticating proxy that I mentioned. If we had to move all of our Kubernetes fleet behind the authenticating proxy, that could overload it, especially if a CI/CD system is going through it, so we had to make some trade-offs there to ensure we provide security while the reliability of our infrastructure stays intact. And the integration of all these controls, a lot of moving parts, into one platform is another tricky part.

There are quite subtle differences between AWS and GCP, even when we are talking about an abstraction like Kubernetes; once we get into the intricacies of networking and other things, it becomes pretty tricky, so we had to consider a lot of that, which increased the engineering cost. This is the final slide, coming back to the evolution of the platform. As I said, we started with a monolith, then moved to a service mesh, which provided security guarantees, productivity gains, and cost optimizations. But that was not enough; that's where we moved to a multi-tenant Kubernetes platform, with much more optimized infrastructure, more abstracted away from our service teams, with a lot of auto-provisioning controls and auto-upgrades and auto-updates on the fleet.

But we are not done here. I think there's some room for more iterations, and we have started thinking and brainstorming the idea of a managed service: the idea is to decouple service deployments, whether at the code level or the packaging level, so that the engineering and product teams only focus on writing code, merging to their branch, interacting through CI/CD, and looking into their deployments only through metrics and charts, while the platform takes care of all the auto-patching and auto-upgrades, even on the code dependencies or the container image dependencies.

This will be a huge win for software supply chain security as well. That concludes our talk. We have a couple of minutes for questions and answers, but thank you for giving your time and hanging out with us this evening; it was a pleasure sharing our journey with you. [Applause] Thank you so much to our speakers. I think we probably have time for only one quick question, so I can read it out loud. It's a question from Victor: how does Snap craft Kubernetes configs for team members, especially for those who require access to multiple namespaces and clusters?

So we have an internal tool that lets people connect to their clusters or their services. They'll request to create a kubectl context and give their service name, and if there are a few different clusters it's deployed to, the tool will let them choose which cluster it is. Then it will set the kubectl config, and that's where we inject the settings to go through the proxy and also set the default namespace access. Okay, thank you. I think we don't have much time left because we have to be out of the theater at six, so just a couple of quick announcements.
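A rough sketch of such context generation, assuming kubectl's `proxy-url` kubeconfig field and entirely invented cluster, proxy, and user names; real configs would also carry credentials and CA data:

```python
# Hypothetical sketch of an internal tool that generates a kubeconfig
# context routing API traffic through an auth proxy and defaulting to
# the user's service namespace.

def make_kubeconfig(cluster, server, proxy_url, namespace, user):
    """Build a minimal kubeconfig dict for one cluster and user."""
    return {
        "apiVersion": "v1",
        "kind": "Config",
        "clusters": [{
            "name": cluster,
            # kubectl can send API traffic via a `proxy-url` entry.
            "cluster": {"server": server, "proxy-url": proxy_url},
        }],
        "contexts": [{
            "name": f"{user}@{cluster}",
            "context": {"cluster": cluster, "user": user,
                        "namespace": namespace},  # default namespace
        }],
        "current-context": f"{user}@{cluster}",
        "users": [{"name": user, "user": {}}],
    }

cfg = make_kubeconfig("prod-us", "https://10.0.0.1",
                      "https://authproxy.example.com",
                      "friends-service", "alice")
print(cfg["current-context"])  # alice@prod-us
```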

You can already enjoy the happy hour at the bar that started 15 minutes ago, and in one hour there is a party sponsored by Adobe that you can all enjoy as well. Thank you so much, and please give a round of applause to our speakers.