
All right, good afternoon. Welcome to BSides Las Vegas. It's the afternoon, right after lunch, so this is going to be an exciting one. There you go, someone's awake. This talk is "Innovative, Shiny, and Vulnerable: Four Ways to Exploit Modern SaaS Data Platforms," given by Ben Kaufman. A couple of quick announcements before we start. We'd like to thank our sponsors, especially our diamond sponsors, Adobe and Aikido, and our gold sponsors, Floor Mold and Dropzone AI. It's their support, along with our other supporters, donors, and volunteers, that makes this event possible. Also, just as a reminder, these talks are being recorded and will be on YouTube, so please remember to silence your cell phones. There'll be a little bit of time at the end for questions, and we'll try to pass a mic around if we can. Welcome, Ben.
>> Cool. Thank you, everyone, so much for coming. I'm really excited to be here today and present this to you. My talk is "Innovative, Shiny, and Vulnerable: Four Ways to Exploit Modern SaaS Data Platforms." That's the QR code for the slides if you want to scan it real quick and follow along, especially for the folks in the back, because there might be some small console text. We're going to be talking today about some applications that I've tested on behalf of my employer, Praetorian. So, without further ado, let's get started.

A little bit about me. I am a senior security engineer at Praetorian. I focus mainly on advanced product security, testing big, interesting, multi-attack-surface products like the ones we're going to be talking about today. I want to preface this talk by saying that, although we're going to be talking about data platforms, I'm not a data engineering guy. I was honestly a little intimidated at first to approach these platforms from a pentest perspective, but they ended up having some of the most impactful findings on these tests, just because of the complexity and some of the nuances that we'll get to later in the talk. So, let's start off by talking
about what a SaaS data platform even is. It sounds like such a generic term, and it is. There are a lot of different types of applications that could satisfy this category. A lot of them are startups, and they might have flashy names or marketing sites. You might hear customer data platforms, edge computing platforms, data transformation, data pipelines. Those are some of the buzzwords you hear, but they all do a similar thing and they have some similarities. Namely, they ingest lots and lots of data. That might be from a streaming source or just batch processing. You've probably heard of ETL, which is extract, transform, and load. That's the industry term for data processing at scale, and most of these platforms have some sort of ETL function. For real-time streaming, that might be fraud detection for e-commerce websites or banks, or maybe ingesting sensor data from scientific or research companies, that sort of thing. So it's proprietary data, and oftentimes it's customer and consumer data. And then of course they have some sort of reporting feature as well, so you can view the metadata and the statistics about all the data that it's
bringing in. Some of it might be feeding into AI or ML workloads and applications, or be used to train those ML models.

But this is a security conference and we're security practitioners, so why do we care about this? Well, you might have already thought about this: a lot of the data being ingested into these platforms is critical data, whether that's proprietary data or customer data like PII or PHI. And in order to even bring that data in, these platforms have to have privileged access to all these other third-party systems. That means secrets, credentials, all those sorts of things. From an attacker's perspective, that's what we want to target, right?

So let's look at some examples. Databricks is kind of the prototypical data platform. You can see on the left-hand side of the UI there are dashboards, and you can query the data that you've pulled in using SQL. There are pipelines and data ingestion, so bringing all that data in and then building some sort of transformation pipeline, and then of course some machine learning features to build ML or AI applications. Today we're going to be looking at a fake data platform called databro.ai, because all the SaaS bros have .ai top-level domains. You can see it's got similar features to Databricks. It's got data sets, where you've brought everything in. It's got pipelines. You can query that data and perform transformations on it. So this is the example we're going to be looking at.

So, without further ado, let's start getting into the exploits, starting with number one, which is control plane access control issues. If you're not familiar with the term control plane, that is basically how users interact with the platform. We've got a few different examples here. A customer tenant might have multiple workspaces, and then you've got different users that belong
to those different workspaces and have different permissions within them. So you might have a developer user who's using a web UI, has access to the test workspace, and is messing around with test data pipelines and stuff like that. In the production workspace, you've got maybe a service account that's authenticating with an API key and primarily interfacing with some sort of SDK, with maybe a more mature DevOps CI/CD pipeline going. And then you've got an administrative user who is maybe just using the CLI tool that the DataBro platform provides, and they've got permissions over the entire tenant, including all the workspaces. All these different tools are sending backend requests to an API hosted in the DataBro data plane, which might be a cloud environment or might be on premises. This is what we would refer to as the control plane.

The main issue with these is just access control issues, so stuff like IDOR and BOLA. These are some of the simplest to find, the simplest to exploit, and also the easiest to fix. But because of that, they're also the most impactful. We've seen cases of cross-workspace compromise. So imagine a user accessing a data set that they
shouldn't have permission to access, maybe from dev to prod, or even worse, accessing data sets in a completely separate customer tenant. This is the low-hanging fruit, and it happens a lot because these attack surfaces are so large and there are so many different API endpoints that something almost always goes wrong. When it comes to vertical privilege escalation, we might see something like escalating to the tenant administrator user, or maybe even finding some sort of admin API endpoint and administering the entire platform. One thing that can make this worse is when these platforms use low-entropy object identifiers. Imagine the data set ID is 5342. That's super easy to brute force, because it's not some long random string. So if there is some sort of vulnerability, you can just brute force the IDs, and that makes it so much worse. And then of course if a self-signup account gets access to a production-level tenant and these types of vulnerabilities are present, that account can go on to exploit all of them.

So let's dive into some of the features that are more unique and specific to these types of platforms. The first is what I like to call code execution as a service. This is an example data pipeline that you might see in one of these platforms. In this case, it's pulling user activity
data from an AWS Kinesis data stream and feeding it into a Python job. In this case, the Python job is called normalize user sessions, and all that data is being output into a Snowflake data lake. This is something standard you'd see, and then maybe they'll have some sort of pipeline editor where you'll see something like this in the UI.

When I'm approaching this from an adversarial perspective, before I start throwing malicious payloads at it, I first like to develop a working pipeline. Something that helps me with that is ingesting all the public documentation into an AI project and then asking the AI to spit out a working pipeline with basic examples. Then I just use test data, basic stuff, maybe importing a CSV or a JSON file. A lot of these platforms will have a UI where you can edit the ETL scripts directly in the web application, or maybe they'll have a more sophisticated CLI and SDK so that you can use VS Code or whatnot. After I've got a working example set up, then I start attacking it. The goal is to get a shell and command execution so we can start exploring the underlying compute environment.

Before I do that, though, there are two different things I do to test the waters. First, I try to make an outbound network request to my server. In this case the job is using Python, a high-level programming language with a million different libraries you can use, the simplest of which is requests. So it's just a basic requests.get out to an attacker server. If I get a callback, that tells me there are no network egress restrictions in this compute environment and that we could potentially get a shell. If there are egress restrictions and for whatever reason we can't bypass them,
you might still be in luck. We could try to output a canary to the logs. A lot of these UIs will have job logs, so each job writes out what it's doing. If we can print a canary, just a random string, in the script and then see it in the logs, that tells us we can maybe execute commands on the underlying runner and get the output back in the logs. Another thing is that sometimes these workloads are using EDR, so they have visibility into the runner, whether it's a container or whatnot. So I test for that as well with some reverse shells or maybe some malicious implants. If those get killed, we'll know there's EDR.

But let's say this is a completely bare environment where we can do whatever we want and execute any commands we want. I like to use C2 to get a shell on the underlying runner. You could obviously use something like a Python reverse shell, but sometimes reverse shells are a little finicky and don't stay persistent. So I'll just use something simple like Sliver, which is an open-source C2 framework. In this case, we'll be pulling down a C2 implant,
writing it to the temp directory, changing the permissions, and then executing it. You can see that on our attacker machine we've got a session, and we can open a shell in that session. For reference, in the next few slides the green text will be the attacker machine and the white text will be the file system on the actual job runner.

I want to note here that the code execution isn't necessarily a bug. It's a feature of the platform. It's arbitrary, and you can write whatever you want. The problem is that sometimes the developers aren't expecting you to get to this position, and they haven't made sure that everything behind it is secured. Sometimes they might try to sandbox the environment, but if the platform is using a high-level programming language like Python or JavaScript, there's almost always some sort of bypass if they try to reduce the capabilities of the built-in libraries.

So in this case we've got a shell, and now we can move on to our next attack vector, which is data plane access control issues. All right, so we're on the runner, and of course we want to get a little bit of situational awareness to start off with. So we just
list all the files in the root directory. We can see that there's the .dockerenv file, which tells us we're in a Docker container, which makes sense. We can also see that there's a function folder, which is probably related to the script that we're running. If we list that, we can see our normalize user sessions Python script that we injected malicious content into. There are some other folders as well, like input and output data folders, and maybe some third-party libraries are written there too.

So what do we do from an attacker's perspective? You might be thinking that we're in a container, so maybe we should try container escapes. A container escape would be super impactful and would get us onto the underlying instance, if it is on an instance and not a container orchestration environment like Kubernetes. But honestly, this isn't always the most fruitful path, because it's fairly simple to run a container without making it super vulnerable. You have to really try to make it vulnerable, either by making it privileged or by mounting the host file system. We could also try to perform a mini network pentest, but a lot of times these platforms are running in a cloud environment, so there might not be that many network resources
to really try to exploit. So instead we'll focus on the following attack vectors. First things first: we're on a file system, so we want to scan the entire file system for secrets. Praetorian has an open-source secret scanner called Nosey Parker, and this is what we primarily use. I'll just run it on the entire file system, starting with the root. In this case it detected a generic API key in the /proc/1/cmdline file. Whatever is in that file is the entry point for the container, so whatever command is in there is what the container first ran when it was created.
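As a rough illustration of why the PID 1 command line is worth reading, here is a minimal Python sketch. On Linux, /proc/<pid>/cmdline stores the process arguments separated by NUL bytes, so splitting on `\0` recovers the original argument list. The keyword heuristic and the function names are assumptions made for this example; this is not how Nosey Parker detects secrets.

```python
import re
from pathlib import Path

# Hypothetical heuristic: argument names that hint at a credential.
SUSPICIOUS = re.compile(r"key|token|secret|password", re.IGNORECASE)


def parse_cmdline(raw: bytes) -> list[str]:
    """Split NUL-separated /proc/<pid>/cmdline contents into arguments."""
    return [p.decode("utf-8", errors="replace") for p in raw.split(b"\0") if p]


def flag_suspicious(args: list[str]) -> list[tuple[str, str]]:
    """Return (name, value) pairs for arguments whose name looks sensitive."""
    hits = []
    for i, arg in enumerate(args):
        # Handle both "--name=value" and "--name value" styles.
        name, sep, value = arg.partition("=")
        if not SUSPICIOUS.search(name):
            continue
        if not sep and i + 1 < len(args):
            value = args[i + 1]
        hits.append((name, value))
    return hits


if __name__ == "__main__":
    try:
        # PID 1 is the container entry point when run inside a container.
        entry_args = parse_cmdline(Path("/proc/1/cmdline").read_bytes())
    except OSError:
        entry_args = []  # not on Linux, or /proc is not readable
    for name, value in flag_suspicious(entry_args):
        print(f"possible credential in entry point: {name} {value}")
```

The heuristic is deliberately crude and will miss secrets passed under innocuous argument names, which is why a dedicated scanner is the better tool in practice.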
So in this case it looks like it's running bash, which calls an execute job binary under /usr/local/bin. Whatever this execute job binary is, it's being passed a job command line argument, and there's our normalize user sessions Python script. It's also being passed a DataBro API key, so that's interesting, and there's the API key in plain text. As we talked about earlier, there's an API, and you can create service accounts, and maybe those service accounts authenticate to the API with a key if it's some sort of development environment. We're not familiar with this API key, so let's check it out and find out what it does.

A lot of times these APIs will have a whoami-style endpoint. In this case it's the user context endpoint. We'll just run curl, pass in our API key, and we can see that we are running as the automation@databro.ai email address in organization ID 1, which tells us that this belongs to the platform itself and is some sort of service account. Just for the sake of trying, let's run it against the most sensitive endpoint in the API, which is the secrets endpoint, on some arbitrary organization. So that's /api/organization/1000/secrets, where 1000 is just an arbitrary ID, again a low-entropy ID that is super easy to guess. We get a success response and see that we can access some arbitrary secrets. In this case there's an Azure storage access key.

So what does this tell us? The platform was using this super overprivileged service account to provide all of the secrets for every job that was running, and its key was basically written into every single job, which is obviously a huge violation of the principle of least privilege. The problem here is that the developers are not assuming that this compute environment is going to be completely owned, and so every single aspect of it, including just the entry point,
could be read by an attacker. This is a case of developers taking shortcuts in the architecture and the secrets management of their platform, and it's a super common example that we see.

Another thing we always want to do, because again we're in cloud environments, is query the instance metadata service. If you're familiar with the AWS IMDS, IMDSv2 is what's out there right now. The reason it's v2 is that they changed it from one GET request to three requests. You have to request the IMDS token (man, that's a tongue twister), and then, using the token, request the role name. Once you have the name of the actual IAM role that's attached to this instance, you can retrieve the credentials of that role, so that's the third one. The reason they implemented this is that it requires a PUT request, which adds some complexity. Let's say there's an application running on AWS that's vulnerable to server-side request forgery. If all it takes is a basic GET request, you can retrieve the credentials and own that instance without having to do anything more involved like an HTTP PUT request. So we've pulled the IAM role credentials, and we've then authenticated using the
AWS CLI, and we can run aws sts get-caller-identity and see that we are authenticated as the execute job role. This is awesome, but we don't really know what we can do with this access, because we don't know the names of any of the resources in the cloud environment, and brute forcing isn't super effective when you have to guess the names. What we can do is go back to the file system and look for clues. If you remember, there was that execute job binary. So let's do some very basic reverse engineering, and we can see that there's actually a "databro job execution logs" string, which looks like an S3 bucket name. Let's try listing the whole S3 bucket. And voila, we can list the whole bucket, and it turns out to be the bucket that stores the execution logs of every runner. These logs obviously seem to have some pretty sensitive information, like PII. There are emails and phone numbers, and there could be other secrets. So we could literally download every single organization's logs and then secret scan the entire thing.

The issue here is that the developers are not implementing least-privilege policies on all their IAM roles. If they had architected this so that access was specific to each organization's identifier, this wouldn't happen, and it wouldn't really matter if you had the credentials, because you can already access that information anyway.

So let's move on to the fourth and final exploit, which is weaponizing highly scalable infrastructure. We saw a traditional infrastructure example previously, which involves a lot of server management. The reason serverless is so popular is that you don't have to worry about any of that, and if you get that stuff wrong it can lead to a lot of security concerns. All you have to do is focus on your code and your business logic and you're good to go. But you can get this wrong too. So in the first example, we see
this developer is using Vercel. Their app blew up, they didn't anticipate all that normal user traffic, and they had a huge bill of almost $100,000. In the second example, their site actually got DDoSed, and again their bill blew up because a function had to serve every single one of those requests.

We were testing an application that was using AWS Lambda, which is a serverless function provider. The way this application worked is that you send events to an API, and that event data gets sent to an AWS Lambda function. There you can do whatever you want: perform some sort of processing, or send the data on to storage, another API, or some other type of service. Because you can execute arbitrary code in this Lambda, we had an idea: what happens if we send another event back to the streaming API? Well, when you do that, it basically creates an infinite loop, and it just goes around and around until you stop it. So what happens if you send more than one request from that function back to the API? It's going to grow exponentially: two, four, eight, and so on. We set this up and sent a single request
to that API and just kept refreshing the page. Within two minutes we had to completely stop it, because there were already over 10 million events, and we had actually blown past a free-tier API limit. You can see there are negative 10 million API calls left. What could an attacker have done with this? They could have DDoSed an arbitrary site. They could have caused hundreds of thousands of dollars in charges to that developer. Or AWS might have even shut down the platform because it noticed this exponential increase in Lambda calls. This tells us that the developer might not have done enough threat modeling to anticipate this business logic abuse.

So we know that these applications are vulnerable. Why should we care? Well, these are SaaS data platforms, which means they're publicly exposed and their attack surface is easily accessible. They access lots of proprietary and consumer data, which means a lot to all of us and all of our organizations. And they're very integrated with a lot of business-critical processes, like AI, that a lot of organizations depend on. That tells us that these platforms are major supply chain attack vectors, and they need to be properly secured. The fact is that if you're working in security in a large organization, your organization is probably using one of these. So it's a supply chain concern and a third-party risk, and you need to be aware of some of the issues.

This concludes my talk. Thank you all so much for coming. Thank you to my mentor, Ali. Thank you to Praetorian and the BSides Proving Grounds track for the opportunity. Thank you all.