Go With The (Work)flow

Name: Go With The (Work)flow
Uploaded: 2022-07-06
Duration: 23 min 3 s

BSidesSF 2022 · 23:03 · 302 views · Published 2022-07

Speakers

Ryan Robinson, Nicole Fishbein

Tags

Category: Technical

Topic: Cloud IAM

Difficulty: Intermediary

Team: Red

Research: Case Studies and Incidents Analysis

Style: Talk

Mentioned in this talk

Tools used

Argo

Platforms

Apache Airflow, Docker

Service

Shodan

Malware

XMRig

About this talk

Ryan Robinson • Nicole Fishbein - Go With The (Work)flow An eye-opening look into the world of cloud workflow management platforms and their security risks. This talk will unveil research into the world of misconfigurations, mountains of credentials, sensitive data leakage, insecure coding, and containerized malware. Sched: https://bsidessf2022.sched.com/event/rjpY/go-with-the-workflow

Transcript [en]

All right everyone, we're now ready to get started with our next talk. Our next speakers are Nicole Fishbein and Ryan Robinson. Nicole Fishbein is a security researcher and malware analyst. Nicole has been part of research that led to the discovery of phishing campaigns, undetected malware, and attacks on Linux-based cloud environments. Ryan Robinson is a security researcher for Intezer. He specializes in malware reverse engineering and threat intelligence. Today's talk is Go With The (Work)flow.

Thank you very much. I'll just start our wee timer here, so there we go. Thank you very much for coming to our talk. Pretty simply, our talk is about workflow applications and the security implications that come with them, which turns out

it's quite a lot. But first let us properly introduce ourselves. Actually, that introduction was very good. So I'm Ryan, I'm a researcher at Intezer. I mainly specialize in, I guess, cloud threats and malware. In previous roles I was a security engineer for a consulting company and then a researcher on Anomali's threat research team. And yourself?

Hi, my name is Nicole Fishbein. I'm a security researcher at Intezer, and prior to that I was in the embedded R&D department in the IDF. All right, so today we're going to talk about workflows and how dangerous it can be when you misconfigure them and expose them. We're going to present two case studies: one is Argo and the other one is

Airflow. We will explain a bit about the different features of each platform, how it was misconfigured, and what information we were able to retrieve from there. And we're not going to leave you without some practical advice on how you can detect this type of misconfiguration in your organization. All right, so let's start with what a workflow is. Let's say you have some data in a source database and you want to move it to another, centralized database. For that you will need to get the information, maybe analyze it, and then store it. Each of these functions can be broken into smaller functions, smaller tasks, where each task

will pipe the relevant information to the next task, and that's how you have a workflow. You're probably going to want to schedule it to execute at certain hours, on certain dates, and so on. That's exactly what workflow platforms are meant to do, and they provide you with this ability. Some workflow platforms cost money, some are free, and some are open source. Our research focused specifically on the most popular workflow platforms based on GitHub stars. We wanted to see: can we find someone who misconfigured their workflow instance so that we can access the dashboard, and if so, what kind of information can we find there? And lastly, if we have access

to the dashboard, can we execute malware? So the first case study is Argo.

Okay, so yeah, our first case study is Argo Workflows, in my opinion the workflow software with the best logo. I love that guy. It's like a squid or something, an octopus. Just a short bit about Argo Workflows: there's also other software called Argo CD, but this one's really, really popular. It's a container-native workflow engine, and for those who don't know what container native means, it's kind of the base level of infrastructure. So hypothetically, if you can run Kubernetes or containers and stuff, you can run the software. It's open source, which obviously we love. Open

source, fantastic. It also makes it slightly easier to do some research on, compared to a paid product and doing research on that. It's designed for Kubernetes, great stuff, and it's incubated by the Cloud Native Computing Foundation, and these are the same people that maintain Kubernetes, so it's probably as official a workflow engine for Kubernetes as you'll get. And then I set up my last point there: it's great for compute-intensive jobs, with a wee winky face for sort of foreshadowing. But first, to talk about what a workflow is: Nicole's pretty much covered it all already, but in Argo it's defined in YAML.
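The workflow idea described so far, a chain of tasks where each one pipes its output into the next, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the task names (`extract`, `analyze`, `store`) and the sample rows are hypothetical, and a real platform such as Argo or Airflow adds scheduling, containers, and retries on top of this shape.

```python
from typing import Any, Callable, Iterable

# A task is any function that takes the previous task's output and returns its own.
Task = Callable[[Any], Any]


def run_workflow(tasks: Iterable[Task], initial_input: Any) -> Any:
    """Run tasks in order, piping each task's output into the next task."""
    data = initial_input
    for task in tasks:
        data = task(data)
    return data


# Hypothetical tasks for the "move data between databases" example.
def extract(source_rows):
    # Stand-in for reading rows from the source database.
    return list(source_rows)


def analyze(rows):
    # Keep only the rows that actually carry a value.
    return [row for row in rows if row.get("value") is not None]


def store(rows):
    # Stand-in for writing to the central database; report how many rows were kept.
    return {"stored": len(rows)}


result = run_workflow(
    [extract, analyze, store],
    [{"id": 1, "value": 10}, {"id": 2, "value": None}, {"id": 3, "value": 7}],
)
print(result)
```

Each function only sees what the previous one returned, which is the same input-to-output hand-off that a workflow engine performs between steps.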

And you'll probably know it quite well; it's similar to JSON. A workflow consists of one or more steps. Typically a step will be a container. It can be a few other things, like scripts or Kubernetes resources, but for the most part people are running containers inside that. Very simply: start a step with an input, capture the output of the processing from that step's container, then use that output as another input in the next step, and basically so on until you finish. What it looks like: that's just some YAML for a hello world for Argo, and then that's what it looks like whenever you submit it through

the server UI. There's also a CLI tool, if you like the terminal more than using an annoying kind of browser. There's one other important concept, called auth mode, or authorization mode. There are three of them. Oh, please click. Okay: there's server, client, and SSO, and for the purposes of this talk we're going to focus on server. One thing I'll point out before I go further is that server was the default until version 3.0, and maybe you'll find out why. The definition of server is: in hosted mode, use the kubeconfig of

the service account; in local mode, use your local kubeconfig. In short terms, if you can access the web GUI for this and it's in server mode, you inherit those permissions. What this is here is the YAML for the quick start, which you can apply with kubectl. In the default quick start that I got, you get all these permissions here, pretty much the full monty of permissions. So pretty much out of the box, the default configuration, or at least the quick start, came a bit permission heavy. So basically, how can that be exploited? I kind

of brought it down to a simple equation: server auth plus excessive permissions plus external access equals profit. You can think about what you can do with that. For this case study we found something pretty interesting, but obviously to find something interesting, what you first have to do is look for stuff. So what we did is we tried to find as many instances as possible that were open to the world, and you'll see with our second case study that we found even more. What we did for this is use internet census tools,

stuff like Shodan and Censys. I know that Rapid7 have an open set of data as well, where they scan the internet and you get those results. If it's sitting out on the internet, there's probably a good chance that it's being indexed by Google and all, so you can just Google it and it shows up. A slightly more experimental one, where you don't get as many hits but it still does work, is what I call the brute force. Many companies, whenever they deploy something, especially for production, will create a certificate for it and a domain, and it usually would be something like argo.company.com.
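A defender can turn the naming convention just described around and list the predictable names under domains their own organization controls, then check whether any of them resolve to a dashboard that should not be public. The sketch below only builds the candidate names and performs no network requests. The prefix list and the example domains are assumptions made for illustration, not something taken from the talk's tooling.

```python
from itertools import product


def candidate_hostnames(domains, prefixes=("argo", "airflow", "workflows")):
    """Build names such as argo.example.com for every domain you own.

    The default prefixes are hypothetical; extend them with whatever naming
    habits your own teams actually have.
    """
    names = set()
    for prefix, domain in product(prefixes, domains):
        names.add(f"{prefix}.{domain.strip().lower()}")
    return sorted(names)


# Domains owned by the (hypothetical) organization doing the self-audit.
owned_domains = ["Example.com", "example.org"]
candidates = candidate_hostnames(owned_domains)

for name in candidates:
    print(name)
```

The resulting list can be fed into whatever inventory or DNS tooling the organization already uses to confirm which names exist and who can reach them.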

You can actually just permutate a lot of company names and stick argo on the front of them, and you'll be surprised at how many you can find. And yeah, we did find some crazy ones. So that's pretty much how I went about finding them: just Google it. When going through one, we found this, like, what? To most people this might not be that shocking, it's just some YAML, but what we found on a few of the clusters, this one in particular running for nine months by the time we found it, was an XMRig Monero cryptocurrency miner. So yeah, on a few of the clusters we

had found that, I guess, threat actors, whatever you want to call them, had gone on and started using Argo to deploy XMRig miners across all these clusters, across many, many different nodes in each cluster. So that's the great part about the compute-intensive support. That particular image was on Docker Hub. It's been taken down now. It had been used millions of times, and it was even in an Azure blog for being used in mass attacks against Kubernetes clusters. So yeah, we found it quite interesting. I wouldn't say it was fully wide scale across all of them. It's almost like maybe someone was experimenting and had a few of them, or

else other people found it out and wiped it out. But yeah, we found it really interesting. So on your workflow software you can get some malware as well. So, our next case study.

All right, so the next case study is Airflow. Airflow is the most popular open source workflow platform based on stars on GitHub, with over 25k stars. Airflow is based on Python, so you can write your tasks using Python, and it supports multiple plug-and-play plugins. The core concept of Airflow is the directed acyclic graph. As we described with workflows, we create a few tasks and one pipes the

information to the next one; you can create branches and so on. The core unit is a task, written in Python. Now, your workflow is going to use different variables, global variables, and you can store them in a structure called Variables. Your workflow is probably also going to connect to other services, databases, and so on, so to connect securely and store the passwords and API keys that you need for the connections, you can use Connections. Airflow also supports a logging mechanism to emit metrics and to better understand what is going on in your workflow. Now, we were able to find credentials to these applications, like AWS API keys, Azure API keys, PayPal wallet IDs, and so

on, all in plain sight. And just to be clear, we're not saying that the companies behind these logos are compromised. We are saying that we found lots of credentials to these applications, and these credentials were stored all over the Airflow features that we just covered, all in plain sight, all visible through the dashboards. To understand how we were able to see all this information: we just accessed the dashboards and went through the different tabs. The top place where we were able to find credentials was in the code. Bad coding practices lead to leakage of information. One level of abstraction up is when you put your API keys and credentials in Variables. Well, the information is not getting encrypted

and it's in plain sight. The Connection structure is actually the one correct place to store your credentials. You need to enter the information in the password field, where it will get encrypted and will not be in plain sight. Lots of users did the opposite and put the information in the extra field, where it's not getting encrypted. The logging mechanism in Airflow has an actual CVE, because when you enter your credentials through the CLI they are presented in plain sight, and if you connect using the password field in the Connection structure, it will once again be logged in plain sight.
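The two storage mistakes described above, secrets kept in Variables and secrets kept in a connection's extra field instead of its password field, are easy to check for once the records have been exported. The following is a minimal sketch, assuming the connections and variables are already available as plain Python dictionaries. It does not call Airflow itself, and the key-name pattern and sample records are hypothetical choices for illustration.

```python
import json
import re

# Hypothetical pattern for key names that suggest a stored secret.
SECRET_KEY_PATTERN = re.compile(r"(secret|password|passwd|token|api_?key)", re.IGNORECASE)


def secrets_in_extra(connection):
    """Return the keys in a connection's 'extra' JSON whose names look secret-like."""
    raw = connection.get("extra") or "{}"
    try:
        extra = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(extra, dict):
        return []
    return sorted(key for key in extra if SECRET_KEY_PATTERN.search(key))


def audit(connections, variables):
    """Collect findings for secrets stored outside the password field."""
    findings = []
    for conn in connections:
        for key in secrets_in_extra(conn):
            findings.append(f"connection {conn['conn_id']}: '{key}' stored in extra")
    for name in variables:
        if SECRET_KEY_PATTERN.search(name):
            findings.append(f"variable {name}: looks like a credential")
    return findings


# Hypothetical exported records; the values are placeholders, not real secrets.
connections = [
    {"conn_id": "warehouse", "password": "placeholder", "extra": None},
    {
        "conn_id": "cloud",
        "password": None,
        "extra": '{"aws_secret_access_key": "EXAMPLE", "region_name": "us-east-1"}',
    },
]
variables = {"batch_size": "500", "PAYMENT_API_KEY": "EXAMPLE"}

findings = audit(connections, variables)
for finding in findings:
    print(finding)
```

Name-based matching like this will miss secrets stored under innocent-looking keys, so it is best treated as a first pass rather than proof that nothing is exposed.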

And this vulnerability was fixed in Airflow version 1.10 and later. The configuration file is created as soon as you create your first Airflow instance, and lots of users put their credentials in this configuration file. Now, maybe that would be fine, but the thing is that the configuration file can be shown in plain text on the dashboard when a certain flag is set to true. So once again, we found lots of credentials in the configuration file. And the ad hoc query feature allows you to run queries on whatever platform is connected to your workflow, so if you're connected to a database, anyone with access to the dashboard can query the database. Now, if your dashboard

is accessible to anyone, anyone can query your database, so it's a very dangerous feature. And now we're left with one question: can we run malware? The answer is yes. During our research we were able to find lots of container images that are publicly available, so threat actors could replace the legitimate image with a malicious image, and when the workflow is executed, everybody will be running a malicious container. We also set up an Airflow instance in our test lab, where we used a plugin called Code Editor that allows you to write and run Python code, and we were able to create a malicious

container. So we found lots of information that we were not supposed to see, and it was caused by insecure coding practices, by using the features in the wrong way, or by vulnerabilities. But Airflow did an excellent job of improving the platform, and now it's up to the users to actually update their versions. [inaudible]

Actually, we're doing quite good on time. I started laughing, especially at the Airflow slides. It's just a blur of pixels, it's all sensitive information. It was insane. To quantify how many credentials we found, I just want to say it was an absurd amount. But anyway, for the protection part,

I would sum it down to one phrase, really, and it's the basics. The way I see that, and I really want to put this point at the top, is that each of these issues on its own might not actually be that bad, but when you chain them all together it can be really, really catastrophic. And we've seen this stuff. You'll have someone deploy Airflow to their cloud instance, and maybe one of the first things they do wrong is the security groups or the firewall rules, configured so anyone can access it. So someone can access it from the outside, they

get to your Airflow instance, it's an outdated, unpatched version or something, so there's no authentication, there's no login, whatever, so they can get past that stage. Then they can get to see the code, and the code has hardcoded credentials inside it. The credentials, and we saw a lot of AWS keys, have way too many permissions, and then from that you can steal customer data, do whatever. There was so much sensitive information that we saw. So there are multiple points along that line where you could have stopped it. You could have done the permissions correctly, you could have even just done the firewall rules correctly, but people just, it's every step of the way

that they will mess that up, and it leads to something utterly catastrophic. And not to say it's kind of like a lecture, but basics matter. Secure coding practices: even if you feel that you're making code for something and that no one else will ever see it, are you sure about that? Because depending on how you do processes within your company, someone else could maybe mess up, and someone gets access to that code who shouldn't see that code, and then that becomes an issue. Patching and updating: what's really, really nice with both of these that we've shown, and other software as well, you

know, Airflow itself has something like three thousand contributors. It is being updated all the time, and you see that. Version 2 is so much more secure. They even have a security tab now when you go in, with permissions control. So update, get rid of the CVEs and all. Secure configuration just sort of goes without saying, especially if you're going to use something in production: be sure that the configuration that you've set is actually good. Permissions: yeah, use the principle of least privilege. You don't need to give the intern god mode for your AWS cloud. Again, I've

seen it happen before, it's very common. And beware third-party plugins. A really nice example of that is the code editor for Airflow. Usually you submit the code via the CLI tool, but they're like, no, let's put it into the server. So whilst that third-party plugin might be useful for you, it could also be useful for an attacker. And you can actually introduce more vulnerabilities with that. If you think about it, the plugin is being made in a separate repository from the main one, so the

maintenance of them is not on par with each other. So yeah, beware third-party plugins. And I always wanted a full slide for this next bullet point: the default configuration does not equal the secure configuration. So if you're going to put something into production and do the quick start sort of thing, that's probably not a good way to go. Again, over time you see Kubernetes gets more secure, Airflow gets more secure, Argo as well, over time it does get more secure. But whenever you're deploying any of these things, the default configuration is not always secure. Just think about that. And last but not least, the documentation is

your friend. Again, for each one of these products there is a security page that is dedicated specifically to that product, and genuinely, read it, because it's got some really good advice. You might actually understand what's going on. And I'm going to leave you with some open source tools. I'm not really going to go through all of these, you can search them out or take a photo, but there are a couple that I really want to point out. The git-secrets one is fantastic. It'll stop you from committing any credentials to your code, so it'll scream at you if you try to do

that. Cloud Custodian is a really, really good one. You develop policies within YAML files as well, and then you can run that with AWS Lambda or Cloud Functions, whatever cloud you're using, and then you can find the violations of that. And I like the Magpie project as well. It's really good, especially if you want to find what's exposed and all. I've had my CISO come to me and go, you know, what's our exposure in our cloud environments? I don't know, we've got a lot of stuff up there. So that'll really help you find stuff. And pretty much, we're about at the 20 minute mark and we can go for questions.
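The idea behind a pre-commit secret scanner such as the git-secrets tool mentioned above can be illustrated with a short Python sketch. This is not how git-secrets is implemented. It is a simplified stand-in that looks only for strings shaped like AWS access key IDs, and the regular expression is a deliberately narrow assumption covering the common `AKIA` and `ASIA` prefixes.

```python
import re

# Simplified, assumed pattern: an AWS access key ID prefix followed by 16
# uppercase letters or digits. Real scanners use broader rule sets.
AWS_ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")


def find_suspected_keys(text):
    """Return (line_number, match) pairs for strings shaped like AWS access key IDs."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in AWS_ACCESS_KEY_ID.finditer(line):
            hits.append((number, match.group(0)))
    return hits


# Sample source text; the key below is the placeholder used in AWS documentation.
sample = '''import os
aws_key = "AKIAIOSFODNN7EXAMPLE"
region = os.environ.get("AWS_REGION", "us-east-1")
'''

hits = find_suspected_keys(sample)
for line_number, value in hits:
    print(f"line {line_number}: possible AWS access key ID {value}")
```

Running a check like this before code is committed catches the hardcoded-credential step of the attack chain described in the talk, which is the point at which the speakers found most of the exposed keys.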

There's two of the blogs that we wrote. I assume the left one is, yeah, oh yeah, so that's the Argo one and that's the Airflow one, and there are a lot more screenshots and all. So yeah, that's us up to 20 minutes. Any questions from anyone? Oh, thank you very much. Thank you.

I was about to say, if anyone puts up their hands, I really can't see.

You can also post your questions on Slido, hashtag BSides. Okay.

How many [inaudible]? Most of the containers, most of the instances, were version one, and as we continued our research we saw more version two, but most of them were version one. Yeah, especially, you can tell there's a lot of version two out there, but they're not necessarily exposed; it's like you hit the login page for it. So when it came to misconfigured Airflows, I would honestly say about 95 percent of them were version one, and then there was about five percent version two, where even though it comes more secure out of the box, they still somehow managed to misconfigure it. It's almost like you have to actively sabotage version two to make it

yeah, but people did. The major thing in version 2 is that they added enforced login, so you can't just access anything without logging in or authentication. So yeah. Yeah, so I guess the fact that most of the exposed ones we found were version one is a really good way to say: update your stuff. Anyone else?

No? It seems not. Thank you very much. [Applause]
