Centralizing Egress Access Controls Across a Hybrid Environment

Name: Centralizing Egress Access Controls Across a Hybrid Environment
Uploaded: 2025-10-30
Duration: 29 min 3 s
Description: Block's approach to managing network egress policies across hybrid infrastructure with diverse enforcement endpoints. The talk covers centralized governance for outbound traffic, application-to-domain mapping via SPIFFE identities, and automated rule deployment across firewalls, Kubernetes, DNS poli

BSidesSF · 2025 · 29:03 · 230 views · Published 2025-10

Speakers

Ramesh Ramani

Tags

Category: Technical

Topic: Cloud IAM Network Security

Team: Blue

Style: Talk

Mentioned in this talk

Tools used

Suricata

Platforms

Kubernetes

Standard

SPIFFE

About this talk

Block's approach to managing network egress policies across hybrid infrastructure with diverse enforcement endpoints. The talk covers centralized governance for outbound traffic, application-to-domain mapping via SPIFFE identities, and automated rule deployment across firewalls, Kubernetes, DNS policies, and proxies.

Original YouTube description

Centralizing Egress Access Controls Across a Hybrid Environment at Block
Ramesh Ramani

Hybrid environments complicate network egress. Learn how Block is centralizing network egress policies to ensure consistent deployment of rules across diverse enforcement endpoints, regardless of type or location, enabling secure, scalable, and streamlined outbound traffic management.

https://bsidessf2025.sched.com/event/5f0d900ebd6be22b430006e7426b57a4

Transcript [en]

designing our entire network airport to make sure that our network egress access is safe, secure, and hopefully has shorter security lines. All right, think of this as our flight plan for the next 30 minutes. We'll start with introductions and move on to terminology. We'll talk about why egress access controls even matter: what is the problem we're trying to tackle, and why do we care about it so much? Then we'll move on to the use cases, dive straight into the solution architecture, and look at how the end-to-end egress access workflow works. All right, this is where I

tell you why I'm qualified to talk to you about this. My name is Ramesh Ramani. I'm a security engineer at Block, where I've been for a little over 5 years, and I have over 15 years of experience in security. I started out configuring seriously simple firewall rules on firewall appliances, and now I'm helping orchestrate complex cloud environments and Kubernetes stacks. When I'm not worrying about packets flying around our network, I'm generally geeking out on all things Kubernetes, network security, cloud, and digital forensics. And in my spare time, I enjoy playing some really nice RPG video games. All right, let's get our

terminology straight. Our airport terminology, right? Network egress: what is network egress? If you imagine your entire organization as an airport, think of every flight leaving the airport as traffic heading out to a destination on the internet. That is a classic definition of network egress. Then the network egress policies: think of these as your digital TSA agent. But instead of looking at your 3.4 ounces of liquid, it looks at what kind of unauthorized data is trying to sneak out and which applications are trying to reach sketchy destinations on the internet. It controls all of that. And then we

have the enforcement endpoints. Think of the enforcement endpoints as the security checkpoints at an airport. There are multiple checkpoints, and each checkpoint speaks its own language. So travelers go to the checkpoint that speaks their language, where they're comfortable, and that leads them to their gate. Here at Block, we have multiple enforcement endpoints. We have different types of firewalls, some running the Suricata engine and some newer-generation firewalls. We have Kubernetes network policies, Kubernetes DNS policies, Istio egress gateways, proxies. Oh my god, the list is endless. We

have quite a few, actually. So those are the enforcement endpoints. Then we move on to the group management system. What is a group management system? Think of it as a meticulous travel agent. This travel agent keeps track of which applications need to travel to which destinations, and which destinations are approved. It keeps this mapping of application to domain in a central repository, which helps later, and I'll talk about how that works. And finally we have the group governance policies. Think of these as your airport aviation regulation

system. It holds all the rules about what you can request access to and whether access is granted or denied. Before we move on to the solution architecture, let's first understand why egress controls even matter. Imagine you have applications within your environment and you're just letting them talk to anything out on the internet. You're setting yourself up for risk, both from inadvertent data exposure and from malicious data exfiltration. And it's not just about controlling who

can talk to what. It's also about what type of data you're sharing. You want to make sure that sensitive data does not go out, or is not shared with unwanted partners, who may not even be partners. And of course there's regulatory compliance. If you're subject to PCI or GDPR, you want clear visibility into where your customers' data is going, and you want audit trails for all access controls. Okay, but what exactly is the problem at Block? As you can see, we have a large environment. We are on multiple clouds, we're in the data center as well, and our applications

are on diverse technology stacks. We have applications on Kubernetes clusters and in cloud-native environments as well: Lambdas, EC2, and so on. Of course, we also have applications running on our data center equipment. All of this becomes really complex to manage, as you can imagine. Now let's talk about visibility. Let's say that during a security review someone asks: which applications are trying to access api.partner.com right now? We immediately have to scurry to find out which sources of truth we

have. And if we don't have sources of truth, we need to start parsing logs: VPC flow logs, proxy logs, Kubernetes telemetry logs. We just have to look at a lot of logs. Auditing becomes really complex, and this can delay incident response as well. Okay, but what are the benefits of centralizing egress access controls? I'm going to tell you that this has unlocked a lot of really cool features for us. For example, because we now maintain a mapping of application to domains, we now can

clearly say which domains or external entities our applications are talking to across the organization. This is a huge security win for us. Apart from that, because we maintain this mapping of application to domain in a central repository, we're now able to take intelligent actions on it. We have lightweight modules that pick up this mapping, understand where the egress policies have to be deployed, and deploy them seamlessly regardless of where the application lives. This leads to very good traffic management. And the system immediately provides feedback to users. For example, if a user wants to go ahead and

request access to a particular partner, we ask them certain questions and then provide immediate feedback if there are any issues. I'll talk about this a little more as we move along. All right, so what are some of our use cases? Let me talk about incident response. Let us assume a partner of yours reaches out and says, "I'm sorry, one of our products got compromised." How do you identify all the applications across your organization that can actually reach this partner? With this centralized system, we're now able to immediately find out all the

applications that are talking to this compromised partner and make intelligent decisions about them. Not only that: let us assume a partner's compliance status changes. I'll delve into this a little later, but just think about it. If a partner is no longer approved to receive a particular type of data, the system automatically identifies this and updates access without any intervention. Then there are emerging threats. Let us assume you see that the type of data you're sending to a partner is anomalous; you're seeing anomalous data patterns. You can

immediately identify all the other applications and see if they have similar anomalous data patterns. This could be indicative of an emerging threat, and we can respond to it immediately. Then of course there's multi-cloud failover. Like many other organizations, as I alluded to, Block is on multiple clouds. Whether an application is in one particular environment or in any other cloud, our system automatically knows where each application lives. If it is in multiple cloud environments, it doesn't matter; our system still knows. It knows which cloud environment it is in, which business unit it belongs to, and what the enforcement points for it are. And so if a particular

cloud environment fails, it doesn't matter. The system still deploys egress access policies to the other environment, making sure that the application can seamlessly access these external endpoints. Let's dive into the solution architecture. At the heart of it all is our network egress management system. It consists of five different components. The first one is a centralized UI. Users no longer have to go into different sources of truth. There's a single UI where users request access to certain domains for their applications. Then we have the group management system. Once users have requested access

to these domains, this mapping of applications to domains or IP addresses is maintained in a centralized repository, and that's the group management system. Then we have the governance policies, which regulate group membership requests and domain creation requests. I'll talk about this a little later as we move along. And then finally we have the automation modules. These automation modules pick up the mapping of application to domain and understand where the application lives and which business unit it belongs to. Because we have a mapping of what the enforcement endpoint for each business unit is, the module automatically knows the enforcement endpoint and converts the mapping into that

particular enforcement endpoint's rule. It's like a translator: it converts the mapping to that type of rule and deploys the rule seamlessly. Let's start with the centralized UI for this network egress management system. Users no longer have to go to multiple sources. They have a single source of truth, a single system, a single UI. They go into that UI and request access for their application to specific domains. This greatly enhances usability, because they don't have to go to different places. Apart from this, because all of this data is maintained within one UI, we can now

put governance policies and rules in this one UI to make sure that access controls are consistently maintained. This also means there's less room for error, because we immediately give users feedback in case they enter incorrect information. Here's an example. Look at this particular UI. Let us assume that I am the application owner for an application called Kubnet. Let's say my application requires access to chatgpt.com for some LLM transactions. If I want to access chatgpt.com, I first go into the system and search for the domain group, which is chatgpt.com. If it is already approved

and in the system, then I would go in and request membership access to this domain group. As you can see over here, it's called a workload. A workload is just an application, and I'm entering the name of the application, which is Kubernetes sec. The system automatically detects it, and once you hit submit, the user gets access. That's not the end of it, though. If you look over here, it says "intended data security level." Not only do we have a clear indication of which application is talking to which domain, we're also

asking users for the type of data and the security level of the data that they're sharing with this partner. Apart from this, we also have domains. Let us assume that chatgpt.com does not exist as a domain within our system. We're empowering users to go in and create this domain. But that's not the only thing. We're also validating that this domain actually belongs to an approved partner. If the partner is not approved, or if the kind of data you're trying to send to this partner is not approved, access will be denied. For example, if you're trying to create a new domain, chatgpt.com, and it belongs to the

company, which is OpenAI, and OpenAI is not an approved partner, you will not get access. How does this work? We have a central source of truth that maintains all this information about every single partner. I'll come to that a little later. We use this source of truth to do the check, and in fact we've integrated it with an LLM. The user enters chatgpt.com as the domain, the port (443, for example), the protocol (TCP), and the product ID. The product ID is the key here. It is what identifies a partner in our central source of truth. Now the LLM will go ahead and take this data

and check to see if this domain legitimately belongs to that particular partner and if that partner is an approved partner within our system. All right, now comes the group management system. Let us say the users have finished requesting access for their application to a particular domain. This mapping of applications to domains is maintained in a central repository, which is our group management system. Every single application is identified within the system using something known as a SPIFFE ID. Think of a SPIFFE ID as an identifier for your application, but one that carries rich contextual information about the

application, like where it lives, which business unit it belongs to, and so on. Here you go: a SPIFFE ID. As I said, think of it as a unique identifier for your application. Over here you have a sample SPIFFE ID format, which is spiffe://trust-domain/type/business-unit/environment/region/application-name. A simple example: spiffe://block.xyz, where block.xyz is the trust domain; /app, meaning that it's an application; /cash, which is the business unit; /production, which is the environment; /us-west, which is the region; and finally the application name, which in this case is payment processor. Okay, cool. But how do we maintain this in the repository? As I alluded to earlier, every domain is a

first-class object, which means we handle it as a group, and every group is a file. For example, in our system we'll have github.com as the file name, because it's the domain group, and then we have the application SPIFFE IDs written into this file. Once a user requests access for a particular application, the application gets converted into a SPIFFE ID and written into this domain group file. Over here you can see that payment processor and analytics service are two applications under github.com, and we have only payment processor under bankofamerica.com. This is how the GMS maintains this mapping, so to speak. Okay, but how do we govern all of these

group membership requests and group creation requests? That's where our governance policies come in: our aviation regulation authority. It has all the rules to ensure that user access controls are in place, and requests go through a validation system and a multi-party approval. What does this mean? For example, if a user wants to access github.com, they go into the UI and request membership for their application. The system then validates this request against our source of truth to make sure we're only sending expected data to this partner. If the validation is successful, a team

member from the application-owning team approves the request, and voila, the membership is created and the mapping is created in the GMS repository as well. But let us assume github.com does not exist. The user would have to submit a domain creation request. Once that is validated in the back end by an LLM, it moves across to the network security team, who approve the request, and then the domain is created. Okay, I kept talking about validation and sources of truth. So what exactly is that? At Block,

this is called the Block software list. It is essentially a catalog of all our partners, and it maintains everything about a partner, including what type of data they are allowed to receive. This greatly enhances security, because for any access request we're ensuring that it is validated against verified data. Apart from this, we're also ensuring compliance assurances, because we're checking the request against the type of data that we're sending to these partners. So what exactly does the Block software list contain for each partner? It contains information such as the name of

the partner, the company name, the URL of the partner or product, and of course the product ID. Recall I spoke about the product ID earlier. This ID is what the system uses to identify a partner, and it is how the system bridges the gap between the partner and the domain. And of course there are access levels for each partner: we check what data types the partner is allowed to receive, and also what data security level we're allowed to share with them. What exactly is a data security level, though? At Block we classify all data. The higher the data

security level, the more sensitive that particular information is, and hence the more care it requires. Apart from this, we also have something known as information sharing policies, where we clearly define the type of data that we're allowed to share with partners. For example, we have customer data, merchant data, consumer card data, and so on. So if a user wants to share some data with, say, chatgpt.com or any other partner, how would they know what kind of data they can share? They just have to go to the Block software list and look it up, and now you

know, they can be more judicious about the kind of data that they want to send to these partners, thereby reducing data breach risks as well. So how do we put all of this together? This is where the true power of the system lies. We spoke about the Block software list, which maintains a list of all the partners and the type of data that they're allowed to receive. That is step one. The system is also able to bridge the product ID with a domain, because it has access to the Block software list and understands that each partner has a unique

product ID. So at any point in time, if anyone wants to talk to a domain that does not belong to a partner approved by the company, the system will just deny their access. This greatly enhances security, because we now know that every application can only talk to an approved partner. Now we take it a step further. Because the system is also clearly aware of all the data types and data security levels that this partner can receive, we ask users to declare their intention and tell the system: this is the type of data that

I plan on sharing with this partner. If their intention does not match the approved levels, access is automatically denied; otherwise access is automatically granted. This is an extremely powerful solution, because we have a clear fingerprint of every single application that needs to talk out to the internet, from both a security and a compliance standpoint. All right, so all of this is great, but what can we do with it? We spoke a lot about the governance aspect, and we have this beautiful mapping between applications and domains. We can do some really interesting things with it. What happens is that we

have serverless components that read the SPIFFE identity of each of these applications, identify which business unit they belong to, and hence what the enforcement endpoint is. These lightweight modules then convert the mapping into the appropriate rules. Is it a Kubernetes network policy? It'll go ahead and deploy it. Is it a DNS policy? It'll deploy it. Is it a firewall rule? It'll deploy it. It doesn't matter. This is where the true power of the system lies: it can deploy rules seamlessly into any environment, regardless of where the application lives. Let's put this all together. Assume that, again, a user wants access to github.com;

they would go into the UI and say that they require access to github.com for their application. In the back end, after validation and approval, the system creates this mapping, and once the mapping is created, the lightweight modules pick it up and deploy the rule. If the github.com domain group does not exist, users create the new domain, which goes through a validation process as well before it enters the system. Eventually all of this access is deployed seamlessly across any enforcement endpoint in any cloud environment. That's about it. I hope this

particular presentation has been useful to you all. As I said, we're completely revamping and redesigning how we do network egress within our environment. We're improving network security, reducing security risks, and improving our compliance posture as well. Thank you. I'd love to hear from y'all if you're trying to solve this problem or if you've already solved it. I'd love to understand how you've solved it, provided your company's policy allows it. All right, thank you. I'm open for questions. All right, we have a number of questions on Slido already. The first one we

have is: how do you handle situations when a new client wants to talk to a new destination? What's the SLA for establishing a new connection? And what do you do when a new protocol needs to be supported? Sorry, can you repeat that question? Sorry, there's a few of them there. How do you handle it when a new client wants to talk to a new destination, what is the SLA for that, and how does that change when a new protocol gets involved with that new connection? A new protocol? A new protocol, yeah. All right. Okay, excellent question. Every single domain is a combination of

domain, port, protocol, and product ID. It's supremely unique, and nothing can change here. This is what identifies a domain: the domain name, the product ID, the port, and the protocol. You cannot change it. If a user wants access to the same domain but on a different port and a different protocol, it'll be a new domain access request. As for the SLA, it happens seamlessly: policies are deployed within a few seconds. And if you have new clients or a new business environment, it doesn't matter. You just have to onboard that particular business unit to the system, and now the

system will know everything about it. Next question: this handles data center traffic very well. How do you handle similar concerns for hardware endpoints that Block produces? I'll show it to you.

Okay. This is specific to production environments, not to point-of-sale terminals. I should clarify: this is specific to applications running within Block's environment, in our own production systems, and not to point-of-sale terminals of any sort. I hope that answers your question. Next question: what technologies are you using for enforcement, and can you release more details for those who want to do something similar? Yeah, as I alluded to earlier, we're using a bevy of technologies here. We're using Kubernetes DNS policies. We're using Kubernetes network policies. We're using

firewalls, standard Suricata firewalls. It doesn't matter. We just have to build a lightweight module that does the translation, and you're good to go. It doesn't matter what the endpoint might be. We've got time for a couple more questions. One question is: is this a product for sale? My regulation authority, which is my company, is going to say no. The following question is from Patrick: as a security person, I love everything that's here. Chef's kiss. How do you do all this restriction and process without your developers and product managers hating you? We're actually creating paved paths, right? We're literally creating paved

paths, because if you think about it, users and product managers used to go to different systems to set this up. Now it's just one system. If anything, they'd be happy about it, I think. Thank you for that wonderful talk. We are sadly out of time. If anyone would like to speak with the speaker, feel free to come down and see them here or elsewhere at the conference. We have the whole lounge upstairs, where a lot of speakers are hanging out. Let's give him a round of applause. Thank you all. Thank you for coming.
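The SPIFFE ID layout and the domain group files described in the talk can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Block's implementation: the field order follows the format given in the talk (trust domain, type, business unit, environment, region, application name), while the names `WorkloadId`, `parse_spiffe_id`, `DOMAIN_GROUPS`, `apps_that_can_reach`, and `domains_for_app`, the hyphenated path segments, and the sample IDs are all hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class WorkloadId:
    """Fields of the six-part SPIFFE ID shape described in the talk."""
    trust_domain: str
    kind: str
    business_unit: str
    environment: str
    region: str
    app_name: str


def parse_spiffe_id(spiffe_id: str) -> WorkloadId:
    """Split spiffe://<trust-domain>/<type>/<bu>/<env>/<region>/<app>."""
    parsed = urlparse(spiffe_id)
    if parsed.scheme != "spiffe" or not parsed.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    segments = parsed.path.strip("/").split("/")
    if len(segments) != 5 or not all(segments):
        raise ValueError(f"expected 5 path segments, got {segments!r}")
    return WorkloadId(parsed.netloc, *segments)


# Hypothetical in-memory stand-in for the domain group files: one entry per
# domain group, listing the SPIFFE IDs of the member applications.
DOMAIN_GROUPS = {
    "github.com": [
        "spiffe://block.xyz/app/cash/production/us-west/payment-processor",
        "spiffe://block.xyz/app/cash/production/us-west/analytics-service",
    ],
    "bankofamerica.com": [
        "spiffe://block.xyz/app/cash/production/us-west/payment-processor",
    ],
}


def apps_that_can_reach(domain: str) -> list[str]:
    """Incident-response lookup: which applications may reach this domain?"""
    members = DOMAIN_GROUPS.get(domain, [])
    return sorted(parse_spiffe_id(sid).app_name for sid in members)


def domains_for_app(app_name: str) -> list[str]:
    """Reverse lookup: which domain groups is this application a member of?"""
    return sorted(
        domain
        for domain, members in DOMAIN_GROUPS.items()
        if any(parse_spiffe_id(sid).app_name == app_name for sid in members)
    )
```

With a mapping like this, the "which applications can reach a compromised partner" question from the incident response use case becomes a single lookup, for example `apps_that_can_reach("github.com")`, instead of a search through flow logs and proxy logs.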

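The governance check described in the talk, where a request is validated against the partner catalog and the declared data intent, might look roughly like the sketch below. Everything here is an assumption made for illustration: the names `DomainKey`, `Partner`, `CATALOG`, and `validate_request`, the numeric security levels, the data type labels, and the catalog entries are hypothetical. The talk only states that a domain is identified by domain, port, protocol, and product ID, that a higher data security level means more sensitive data, and that access is denied when the partner or the declared data is not approved. The LLM-based check that a domain really belongs to the partner is not modeled.

```python
from dataclasses import dataclass
from typing import NamedTuple


class DomainKey(NamedTuple):
    """A domain is identified by the full tuple, so a different port or
    protocol means a separate domain request (as stated in the Q&A)."""
    domain: str
    port: int
    protocol: str
    product_id: str


@dataclass(frozen=True)
class Partner:
    product_id: str
    company: str
    approved: bool
    allowed_data_types: frozenset
    max_security_level: int  # assumption: a higher number is more sensitive


# Hypothetical catalog entries standing in for the partner source of truth.
CATALOG = {
    "prod-001": Partner("prod-001", "ExampleCo", True,
                        frozenset({"merchant", "customer"}), 2),
    "prod-002": Partner("prod-002", "OtherCo", False, frozenset(), 0),
}


def validate_request(key: DomainKey, data_type: str,
                     security_level: int) -> tuple[bool, str]:
    """Compare the user's declared intent with what the partner may receive."""
    partner = CATALOG.get(key.product_id)
    if partner is None:
        return False, "unknown product ID"
    if not partner.approved:
        return False, "partner is not approved"
    if data_type not in partner.allowed_data_types:
        return False, f"data type {data_type!r} is not approved for this partner"
    if security_level > partner.max_security_level:
        return False, "declared security level exceeds the approved level"
    return True, "approved"
```

Returning a reason string alongside the decision mirrors the immediate feedback to users that the talk emphasizes; a real system would presumably also record the decision for audit purposes.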

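The "translator" role of the lightweight automation modules can be illustrated with a small dispatch table: read the business unit from the SPIFFE ID, look up its enforcement endpoint type, and render the mapping in that endpoint's rule language. This is a sketch under assumptions rather than the system from the talk. The business-unit-to-endpoint table, the function names, the starting `sid`, and the `dns-policy` line format are hypothetical; the Suricata rule uses the `tls.sni` buffer with `content`, `nocase`, `startswith`, and `endswith`, and should be checked against the Suricata version in use.

```python
from typing import Callable, Dict

# Hypothetical mapping from business unit to its enforcement endpoint type.
ENDPOINT_BY_BUSINESS_UNIT: Dict[str, str] = {
    "cash": "suricata",
    "example-bu": "dns-policy",
}


def render_suricata(app: str, domain: str, port: int, sid: int) -> str:
    """Render a Suricata pass rule that matches the TLS SNI exactly."""
    return (
        f"pass tls $HOME_NET any -> $EXTERNAL_NET {port} "
        f'(msg:"egress allow {app} to {domain}"; '
        f'tls.sni; content:"{domain}"; nocase; startswith; endswith; '
        f"sid:{sid}; rev:1;)"
    )


def render_dns_policy(app: str, domain: str, port: int, sid: int) -> str:
    """Placeholder line format, not the schema of any real product."""
    return f"allow app={app} fqdn={domain} port={port}"


RENDERERS: Dict[str, Callable[[str, str, int, int], str]] = {
    "suricata": render_suricata,
    "dns-policy": render_dns_policy,
}


def translate(spiffe_id: str, domain: str, port: int, sid: int = 1000001) -> str:
    """Turn one application-to-domain mapping into an endpoint-specific rule."""
    prefix = "spiffe://"
    if not spiffe_id.startswith(prefix):
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    # Assumed layout: trust-domain/type/business-unit/environment/region/app
    _trust, _kind, business_unit, _env, _region, app = (
        spiffe_id[len(prefix):].split("/")
    )
    endpoint = ENDPOINT_BY_BUSINESS_UNIT.get(business_unit)
    if endpoint is None:
        raise LookupError(f"business unit {business_unit!r} is not onboarded")
    return RENDERERS[endpoint](app, domain, port, sid)
```

Supporting another enforcement endpoint then means adding one renderer and one table entry, which matches the point made in the Q&A that a new endpoint type only needs a new translation module.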