
The Not So Boring Threat Model of CSP-Managed NHI’s

BSides Las Vegas | 24:04 | 19 views | Published 2025-12
About this talk
Identifier: Z3YUJW
Description:
- “The Not So Boring Threat Model of CSP-Managed NHI’s”
- Deep dive into risks of CSP-managed NHIs across major cloud providers.
- Explains exploits unique to each provider’s implementation.
- Builds threat model highlighting shortcomings and mitigation strategies.
Location & Metadata:
- Location: Common Ground, Florentine F
- Date/Time: Tuesday, 18:30–18:50
- Speaker: Kat Traxler

All right, everyone. Good evening. Welcome to the last talk for BSides Las Vegas and the Common Ground track for day two. This is "The Not So Boring Threat Model of CSP-Managed NHIs" by Kat Traxler. So, a few announcements before we begin. We'd like to thank our sponsors, especially our diamond sponsors Adobe and Aikido and our gold sponsors Profit and runZero. It's their support, along with other sponsors, donors, and volunteers, that makes this event possible. These talks are being streamed live to YouTube. As a courtesy to our speakers and the audience, we ask that you check your cell phones to make sure that they're silent. If there's time after the talk, we will open the floor for audience questions,

and I will pass the mic around to anybody who wants to ask. And as part of the cell phone policy for BSides Las Vegas, we ask that there be no photography during the talk. But with that, I think we'll hand it off to Kat. Take it away. Let's give her a warm welcome. >> Thank you. Thank you everybody. Thanks for coming to the last talk. There will definitely be time for questions because there's nobody after me, so we can just keep going. For everybody who I haven't met yet, my name is Kat Traxler. I'm the principal security researcher at Vectra AI. I spend my days, and usually nights, many turtles deep

into cloud service provider architectures, mainly Google Cloud, a little bit of AWS, and lately Microsoft, unfortunately, in my search for patterns within identity. I always like to splatter my research philosophy across my about-me page: I look for the interesting vulnerabilities where technologies intersect, at the boundaries. That's where things are stitched together with awkward protocols, where assumptions are made with good intentions and things are often forgotten, and that's where some juicy things happen. That's where you find these integrations being stitched together, often with service agents, and that's what this talk is all about: service agents, service-linked roles, non-human identities, which is kind of

the moniker that the industry has picked up. I don't like it, NHIs, but the industry has kind of coalesced around the term NHIs to describe everything that's not a human user, the identities that are used by background tasks. So, that's what this talk is all about. If you scan this link, you will be taken to my Linktree: my LinkedIn, my GitHub, my Twitter, all of those socials. So, without further ado, let's threat model the cloud. We're going to get into all three clouds, AWS, GCP, and Microsoft, and do a little bit of "who wore it best," a little bit of comparing architectures,

through the lens of vulnerabilities, looking at what unique vulnerabilities show up in each of the clouds. We're going to ask, based on their architectures, what are they uniquely vulnerable to, and what are they not uniquely vulnerable to? So, a little shout-out to Adam Shostack, who came up with this framework for threat modeling. We're going to first ask: what are we working on? We're talking about AWS service-linked roles, Google service agents, and Microsoft's first-party app registrations. Those are the three resources across all three clouds that we're talking about. Then my goal is to

keep you awake until the vulnerability part. Everybody likes the vulnerabilities, but we have to sit through the architecture first. Then we get to the vulns. The vulns are about what's uniquely at risk for each of those resource types. And then finally, to wrap it up: what can we do about it? What are the preventative controls that you as a cloud consumer can implement to protect yourself? And where do you have zero control? Okay. We're going to go in the same order every time: AWS, Google Cloud, Microsoft. First up, AWS service-linked roles. What are we not talking about? I'm not talking about AWS

users. I'm not talking about plain old AWS roles. These are service-linked roles: the roles that are assumed by AWS service principals. There are a few highlights I will point out with this architecture. It has a multi-tenant design, meaning the same identity is used to access multiple customer accounts. I'm going to use this example over and over again: CloudTrail is an AWS service, and the same service principal is used to access multiple customer accounts. It's a one-to-many relationship. It also has a hybrid architecture, meaning there are two different parts to this resource. You have the service

principal, which resides in something that's AWS-owned, but then there's the role that it assumes in your account. So there are two different parts, in two different accounts. The other feature we want to keep in mind is that it has a locked trust policy: only that service principal can assume the role. When these are automatically created in the background, they're not automatically granted permissions. You have to grant them those policies. Okay, moving on. What the hell? Interrupted. Only kidding. >> M&M's. You asked for brown M&M's in your speaker request. >> Did I ask for brown? I did, didn't I? I was feeling super bougie. Wow. [laughter] >> Anyway, thank you so much.

>> Wow. Thank you. Thank you for paying attention to my bougie request. I was feeling very Van Halen at the moment. Thank you so much. Do you want some? >> I would absolutely take an Okay, thank you. just >> Thank you. >> Thank you. >> Thank you. >> I think it is Eddie Van Halen who >> the great philosopher. >> Yeah, who always had that in their rider whenever they traveled, that their green room had to be filled with brown M&M's. And I was like, am I at that level? >> You have arrived. >> Turns out I am. >> You have arrived. And we're grateful to have you here. Okay, >> thank you. Uh,

all right. Google service agents. What are we not talking about? We're not talking about Google users. We're not talking about your Gmail account. When you're in your Google project, you can create service accounts; we're not talking about those resources either. These are service agents. They are service accounts, but they're owned by Google Cloud. This whole talk is about CSP-managed NHIs, so these are the resources that the cloud service provider manages for you. Some highlights about the architecture: as opposed to AWS, it's single tenant. The same principal is not accessing multiple tenants. It actually has this moniker, "per project, per product," meaning there's a new principal created for every project and every

product, right? Completely different from the AWS model: single tenant. It's completely provider controlled; that resource lives entirely within a Google-managed tenant. And unfortunately, it has birthright permissions. When you create a service in Google, it automatically assigns them some fairly high-powered permissions. And last but not least, we have our Microsoft first-party app registrations. We're not talking about the apps you create. We're not talking about managed identities. We're not talking about users. We're talking about the apps that Microsoft creates, that live in Microsoft tenants, and that are then automatically installed in your tenant. When you create your Entra ID tenant, you automatically get a whole bunch of default apps that you consume, from SharePoint to O365, and

you're going to get some local service principals associated with that. Let's compare them to AWS and GCP. Like AWS, it has that multi-tenant design, so the same first-party principal acts in a whole bunch of different tenants. You're going to see that same GUID, and you can track it from tenant to tenant to tenant. It looks pretty similar to AWS. And again, it has that hybrid architecture: in the Microsoft-controlled tenant it's the app registration, but as a consumer you have the associated service principal. It's two parts to the same identity. And then you'll have your role bindings within Entra ID or Azure

from there. Okay, we got through the architecture piece. We have a little background: some of them are multi-tenant, some are single tenant, some live wholly within a managed tenant, and some are kind of hybrid. Let's look at some vulnerabilities that are unique to each one of these implementations. So what's AWS uniquely vulnerable to? This multi-tenant architecture is uniquely vulnerable to a type of... oh gosh, 10 minutes. I'm so glad I'm the last talk of the day. It's uniquely vulnerable to a type of confused deputy attack. AWS calls this the cross-service confused deputy problem, but basically what happens is, say you allow a service principal to write to

your bucket. The classic example is CloudTrail. You're going to let CloudTrail write some logs to your bucket, so you allow the CloudTrail service principal, that global service principal. That same global principal is allowed by, you know, thousands of customers. An attacker, somebody with malicious intent, if they know the name of your bucket, could then direct their own log output to your bucket for nefarious means. It's kind of a simple and dumb explanation, but it's a natural consequence of this multi-tenant architecture. Are the other clouds vulnerable? Google Cloud is definitely not vulnerable; they go with the single-tenant approach.
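The cross-service confused deputy described above can be modeled in a few lines. This is a toy policy evaluator, not AWS's actual evaluation logic: the `Request` and `BucketPolicy` types, the account IDs, and the bucket name are all hypothetical. The only real names are the `cloudtrail.amazonaws.com` service principal and the `aws:SourceAccount` global condition key that the `required_source_account` field stands in for.

```python
from dataclasses import dataclass
from typing import Optional

CLOUDTRAIL = "cloudtrail.amazonaws.com"  # real AWS service principal name


@dataclass(frozen=True)
class Request:
    principal: str       # the global service principal making the call
    source_account: str  # the account whose trail triggered the write
    bucket: str


@dataclass(frozen=True)
class BucketPolicy:
    bucket: str
    allowed_principal: str
    # Stand-in for an aws:SourceAccount condition; None means no condition.
    required_source_account: Optional[str] = None


def is_allowed(policy: BucketPolicy, req: Request) -> bool:
    """Toy evaluation: match bucket and principal, then the optional condition."""
    if req.bucket != policy.bucket or req.principal != policy.allowed_principal:
        return False
    if policy.required_source_account is None:
        return True  # trusts the shared principal no matter who invoked it
    return req.source_account == policy.required_source_account


VICTIM, ATTACKER = "111111111111", "999999999999"  # hypothetical account IDs
attacker_write = Request(CLOUDTRAIL, ATTACKER, "victim-logs")
victim_write = Request(CLOUDTRAIL, VICTIM, "victim-logs")

open_policy = BucketPolicy("victim-logs", CLOUDTRAIL)
scoped_policy = BucketPolicy("victim-logs", CLOUDTRAIL, VICTIM)

print(is_allowed(open_policy, attacker_write))    # True: confused deputy
print(is_allowed(scoped_policy, attacker_write))  # False: condition blocks it
print(is_allowed(scoped_policy, victim_write))    # True: legitimate write
```

Because every customer names the same principal, the principal alone cannot tell the policy whose request it is carrying; only an extra condition on the invoking account can.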

Microsoft, maybe. They have a multi-tenant architecture, so it makes sense that they would be, but nobody's done research into it. If this piques your curiosity, I encourage you to go and check it out. Now, on to Google. Google has a single-tenant architecture, but it's still vulnerable to a type of confused deputy attack. This one's really more of a privilege escalation issue, because service agents are given these automatic birthright permissions. They often have really high-powered admin permissions. And when you couple that with, say, a document-writing service, you could maybe manipulate that service to read or write from a bucket that you otherwise

wouldn't be able to. So let's say you don't have permissions to read and write from a bucket, but the service agent does, and you can leverage that to take actions that you otherwise wouldn't be able to. It's a type of confused deputy, but it happens within your own project, so it's really more of a privilege escalation issue. Are the other clouds vulnerable? Not really. In AWS, they mitigate it in a way that we as a community have accepted. They use a permission called iam:PassRole that marks that issue and says: we understand this is privilege escalation, so we're going to require you to have an extra permission

to do it. So colloquially we've all accepted that this is a preventative control. Again, big huge question mark with Microsoft. We don't know. The architecture suggests that it absolutely would be vulnerable, but nobody's looked at it. So I hope this is going to inspire some Microsoft researchers. I don't volunteer, but please, somebody go and look at this, because their architecture would suggest that they would be vulnerable. And finally, the absolute wildest known attack is service principal hijacking. This has been known for five or six years; it's been talked about for a while. It concerns the local service principal associated with Microsoft's first-party applications. You can actually add a local long-term

credential to that service principal. You and I could then authenticate as that service principal and use the permissions it was assigned in a birthright scenario. So this is a great privilege escalation technique. You and I can authenticate as the O365 service or the SharePoint service through this technique. It's an absolutely wild thing to think about, coming from a non-Microsoft background. Are the other clouds vulnerable to this type of attack? No way. You cannot add long-term durable credentials in AWS, and you cannot add long-term durable credentials in Google Cloud. You couldn't even generate short-term credentials. This is a very specific Microsoft thing.
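One way to look for the hijacking condition described above is to scan service principal objects for first-party apps that carry a local credential. The sketch below is a minimal, offline filter and not a complete detection. It assumes the input is a list of dicts shaped like the Microsoft Graph `servicePrincipal` resource (fields `appOwnerOrganizationId`, `passwordCredentials`, `keyCredentials`), and it assumes the tenant ID shown is the one commonly cited for Microsoft's first-party publisher tenant; verify both before relying on it. The sample records are made up.

```python
from typing import Any

# Assumption: commonly cited ID of Microsoft's first-party publisher tenant.
# Verify against your own tenant's data before relying on it.
MICROSOFT_FIRST_PARTY_TENANT = "f8cdef31-a31e-4b4a-93e4-5f571e91255a"


def first_party_sps_with_local_credentials(
    service_principals: list[dict[str, Any]],
    publisher_tenant: str = MICROSOFT_FIRST_PARTY_TENANT,
) -> list[dict[str, Any]]:
    """Return first-party service principals that hold a local credential."""
    flagged = []
    for sp in service_principals:
        if sp.get("appOwnerOrganizationId") != publisher_tenant:
            continue  # not published from the first-party tenant
        passwords = sp.get("passwordCredentials") or []
        keys = sp.get("keyCredentials") or []
        if passwords or keys:
            flagged.append(
                {
                    "displayName": sp.get("displayName"),
                    "appId": sp.get("appId"),
                    "passwordCredentialCount": len(passwords),
                    "keyCredentialCount": len(keys),
                }
            )
    return flagged


# Hypothetical sample records, shaped like Graph servicePrincipal objects.
sample = [
    {
        "displayName": "Example First-Party App",
        "appId": "00000000-0000-0000-0000-00000000000a",
        "appOwnerOrganizationId": MICROSOFT_FIRST_PARTY_TENANT,
        "passwordCredentials": [{"displayName": "added-locally"}],
        "keyCredentials": [],
    },
    {
        "displayName": "Clean First-Party App",
        "appId": "00000000-0000-0000-0000-00000000000b",
        "appOwnerOrganizationId": MICROSOFT_FIRST_PARTY_TENANT,
        "passwordCredentials": [],
        "keyCredentials": [],
    },
    {
        "displayName": "Customer App",
        "appId": "00000000-0000-0000-0000-00000000000c",
        "appOwnerOrganizationId": "11111111-1111-1111-1111-111111111111",
        "passwordCredentials": [{"displayName": "ci-secret"}],
        "keyCredentials": [],
    },
]

flagged = first_party_sps_with_local_credentials(sample)
for hit in flagged:
    print(hit["displayName"], hit["passwordCredentialCount"])
```

A hit is a lead to investigate rather than proof of compromise, since this only reports that a credential exists, not who added it or why.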

Okay. And our last section: what can we do about it? Or really, what can anybody do about these inherent vulnerabilities in non-human identities? Whether this is a concern for you or not, preventing these is not always entirely up to you. Only in AWS can you actually positively prevent these issues, and that's because AWS allows you to add condition keys. Remember how I said that the issue in AWS arises when these global multi-tenant service principals are allowed to, you know, read or write to buckets? Well, you can add condition keys that say: yes, CloudTrail, go ahead and write logs to my bucket, but only when coming from my account, or only when being

invoked from this known organization, not from, you know, some silly unknown organization. So, in AWS, you can actually positively prevent that. On the opposite end of the spectrum we have Google. There are no preventative controls you can apply yourself against this confused deputy attack. If you're able to manipulate a service agent to perform an action that you personally don't have permissions to do, this is a vulnerability. Google recognizes it as a vulnerability, so please report it to them. Only they're positioned to implement a check. They can implement a caller check that asks: does the caller have the permissions for what it's asking the service agent to do? Only they can implement it,

and I would say it's up for debate whether it's better for the cloud service provider to have these controls or for you as a customer to have them. Which one's better? I'm not sure. And finally, Microsoft. I truly try not to have this presentation be just about picking on Microsoft, but they really do kind of lay an egg on this comparison. There is a preventative control that could prevent you from adding a long-term durable credential. It's called the app instance property lock. Unfortunately, only the application publisher can add this prevention. So only Microsoft could add it, as the publisher of the first-party app

registration. They have not consistently done this. They've done it for some applications; however, for some really critical ones they have not. So there's nothing you can currently do to positively prevent this type of privilege escalation through, say, O365 or SharePoint. I have a white paper coming out in maybe a week or so on this issue, with tons of references and links to people much smarter than me talking about it, and it goes back years and years. So there's a lot of documentation on this privilege escalation technique and on how Microsoft hasn't really closed the loop on it. So what have we learned? Have we learned anything in this threat modeling

exercise besides just cynicism and existential dread? Maybe. We have some best practices in design. This slide is for the cloud service providers, if you're listening. If the holy trifecta of AWS, GCP, and Azure, if anybody shows up and sees this, these are some best practices. We want single-tenant NHIs. We don't want multi-tenant ones; that just leads to a growing impact when things go wrong. We want zero birthright permissions; only AWS does this correctly. And we want that resource to be wholly controlled within the CSP tenant. We don't want bits and pieces within the control of the customer, aka a malicious actor. Google's doing this,

right? This slide is for you and me. This is what we can do. And unfortunately, as far as preventative controls go, only in the AWS world can we do anything, with condition keys on our resource policies saying: hey, global service principal, you can act on a resource, but only when coming from these specific accounts or organizations. In Google Cloud and in Microsoft, detective controls are your best friend. And finally, this kind of comparison threat model would not be complete if I didn't have a little hot-take grading. Far and away, I've got to say, with this kind of grading, Google Cloud's got to be doing it the best. They have

a single-tenant architecture. They have preventative control; the preventative controls are squarely on the side of the CSP, so you as a consumer don't have to do anything. It's up to them to do something about it. The only knock I'd give them is this excessive birthright permission issue, which I guess is also part of why it's such an easy platform to use. Next up we have AWS: amazing, zero birthright permissions. But unfortunately they have this multi-tenant architecture, which increases the impact and the blast radius when you do have issues. And the mitigations are on the shoulders and the backs of the customers. And just really, really

in the bed here is Microsoft. I'm sorry, but: automatically assigned permissions, multi-tenant architecture, and no preventative controls that you can apply yourself. This is all on Microsoft, and they haven't been doing what they need to do. So I think I'm pretty generous in giving them a D on this. But everybody has some work to do, honestly. If you enjoyed this talk, just wait for a 20-page white paper to come out on this if you really want to get into the architectures and the different types of attacks and preventative controls and detective controls. And that's all I got. Do we have time for a question?
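The caller check that the talk says only the provider can implement for Google service agents can be sketched as a toy model. Everything here is hypothetical (the `Iam` store, the `export_document` service, the principal and bucket names) except `storage.objects.create`, which is a real Cloud Storage IAM permission name. It is not Google's implementation, only an illustration of the idea: before the service acts with its agent's birthright permissions, it asks whether the caller could have done the same thing directly.

```python
from dataclasses import dataclass, field

WRITE = "storage.objects.create"  # real Cloud Storage IAM permission name


@dataclass
class Iam:
    """Toy IAM store: principal -> set of (permission, resource) grants."""
    grants: dict = field(default_factory=dict)

    def allows(self, principal: str, permission: str, resource: str) -> bool:
        return (permission, resource) in self.grants.get(principal, set())


def export_document(iam: Iam, caller: str, service_agent: str,
                    bucket: str, caller_check: bool) -> str:
    """Hypothetical document-writing service acting through its service agent."""
    # The caller check: does the caller hold the permission it is asking
    # the service agent to exercise on its behalf?
    if caller_check and not iam.allows(caller, WRITE, bucket):
        raise PermissionError(f"{caller} lacks {WRITE} on {bucket}")
    if not iam.allows(service_agent, WRITE, bucket):
        raise PermissionError(f"{service_agent} lacks {WRITE} on {bucket}")
    return f"{service_agent} wrote to {bucket} for {caller}"


AGENT = "service-agent@provider.example"  # hypothetical service agent
BUCKET = "restricted-bucket"              # hypothetical bucket
# Only the service agent holds write access; the caller has no grants.
iam = Iam({AGENT: {(WRITE, BUCKET)}})


def attempt(caller_check: bool) -> str:
    """Run the export as an unprivileged caller and report the outcome."""
    try:
        export_document(iam, "alice@example.com", AGENT, BUCKET, caller_check)
        return "allowed"
    except PermissionError:
        return "denied"


print(attempt(caller_check=False))  # allowed: the agent is a confused deputy
print(attempt(caller_check=True))   # denied: caller lacks the permission
```

The check sits inside the service, which is why the customer cannot add it from the outside and has to fall back on detective controls.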

One question. Okay. Oh, Sean.

>> We need it for the >> for the stream. Yeah. So, and this is a question: are you opinionated, in a multi-cloud environment, about which one of these providers you would use as a primary identity? Because every single one of the big three that you have right there has built into its platform ways to federate that identity out to the other platforms. So AWS, same with Google Cloud versus Azure: what are some questions that you would like the audience to think about when selecting which is the primary identity, which you're going to federate, and where you're going

to federate out? >> That's a bigger question, beyond the scope of this talk, because the scope of the talk is really the CSP-managed NHIs, the ones that you don't get to choose. You're showing up and using a cloud, and there are all sorts of background identities created for you, and you don't have a choice to federate them. There's some inherent risk in those background identities. The question you're asking is more about your decision around your enterprise directory, right? And I think, unfortunately, people don't have a lot of choice in that. I think a lot of enterprises show up and they're like, "We're a Microsoft shop. You know,

Bob bought the E5s and that's just how this is going." So, to your question: if all things were equal, my answer would be different. But the reality of the game is that what your enterprise directory is isn't based on design decisions about the identity provider. It's based on what Bob in accounting decided, and you've just got to run with that. >> Thank you. >> I'm a practical person. >> Hey >> with a bag of brown M&M's. All right. Anything else? Cool. Have a good night everybody. Thank you.