
Thank you everybody for being here, and a big thank you to the staff; they did amazing work. Today I'm going to present privilege escalation in Azure Machine Learning. I know that as soon as you hear Azure you want to run away, but please stay with me. This presentation is not super advanced, so there is no kernel exploitation and there are no Windows API calls, but it's not a basic one either. You need to know a little bit about what a storage account is, what a managed identity is and what a role is; I'm not going to explain all of these in detail. But you don't need to know anything about machine learning itself. The presentation is
going to be divided into two parts: in the first part I'm going to give you the base knowledge about what Azure ML is, and in the second part we are going to do the hacking stuff. Okay, so who am I? I'm a penetration tester. In my free time I enjoy exploiting Azure and finding vulnerabilities in Active Directory. Last year I published a few articles about Azure application proxy; you can find them on my blog, xybytes.com. Also, if you want to keep in touch, find me on Twitter; I'm open to collaboration. So why Azure Machine Learning? Well, because we are in the middle of the AI revolution. AI is completely reshaping the world, from biology to medicine, and of
course the cloud providers are following the trend. Azure is no different, so it developed Azure Machine Learning to let data scientists manage the machine learning life cycle. So what is Azure Machine Learning? It is a cloud platform designed to manage the entire machine learning life cycle, so you can do many things inside it; it is one package with many features. For example, you can explore and prepare data sets: you can import your data set, clean it, and dig deep into it. You can deploy your model automatically to a web endpoint, to containers, or wherever you wish, so there are different compute
methods that you can use. You can have notebooks: imagine that you are a data scientist, you have Jupyter notebooks online, you go to the platform, you create a Python script and you start working on it, so it's very interactive. And of course you can automate: there are features that allow you to create models without coding. So there are many, many things inside, and Azure Machine Learning is complex. There are a lot of things happening in the back end, with Application Insights, Azure Storage and other services. So, a warning before we move on: Azure Machine Learning is a work in
progress, okay? It's possible that the things I'm going to say today are going to change in a couple of months or weeks. When I was testing Azure Machine Learning, every week there were new buttons and new features, and the workflow was different; they were updating the platform constantly. It drove me crazy. So be careful: it's possible that things that work now won't work anymore in two weeks. Okay, so let's go and understand a little bit of the basics. When you start with Azure Machine Learning, the first thing you do is create a workspace. The workspace is your dashboard, where you work with all the processes: where you can create notebooks, you can run
compute, you can import models. You get a portal like this one, with different features: notebooks, automated ML, data, jobs, pipelines; you can scroll down and click what you need. But when you create a workspace, what happens under the hood is that other services are created, and these services are connected to the workspace: a storage account, an Azure Container Registry, Application Insights and also a Key Vault. All of these are created together with the workspace. There are some default roles that you can assign. There is AzureML Data Scientist: this role can perform any action in Azure ML except creating and
managing machines, but it can create notebooks, import data sets, and turn compute instances on and off. You have AzureML Compute Operator, which is basically the admin for Azure ML compute, so it can create, manage, delete and access compute resources. And then you have the classic Reader, Contributor and Owner roles. There are different kinds of compute that you can have: there is the compute cluster, Kubernetes, the attached compute, and there is also the classic compute instance. I only explored the last one, the compute instance. Compute instances are compute resources inside Azure ML, so you can create a new compute instance in the
platform itself without having to use external resources for it. What you can see here is an example: I just opened a notebook; you write your code, then you select your compute instance on the right, and then you can run the notebook. You can also access the shell of the compute instance from this dashboard. Now, the interesting thing here is that you can assign a compute instance to a specific user, so each user is going to have their own compute instance, but every one of these compute instances has the same file share attached. And this attached file share is
where all the cloud files are stored. All your notebooks and all your data are there and are shared between the users, so each user accesses a different compute instance but they share the same files. If I code something in a notebook and I put it in the cloud, the other data scientists can access the file. So if you hardcode credentials, they are there and everybody can see them. This is very important to remember. This is an example: I got into one of the compute instances, and you can see the directory cloudfiles. This is the directory where all these shared files are stored. So if you go
into this directory, you find all the files shared between you and everybody else who is using Azure Machine Learning. So you can already see a bit of a problem here: you want to be careful about what you do there and which files you put there, because they are not private, not completely. You have to trust your team; you have to trust whoever is given access to Azure ML. All of these files, the trained models, the notebooks, the bash scripts, whatever you have there, are stored in this storage account. So the storage account is connected to Azure ML, and it has a file share and
a blob storage. In the file share you find, for example, the notebooks and the different scripts you have on the machines, and in the blob storage you have the experiments, the jobs, the trained models, the pickle files. So you have everything here. And as I explained, this share is mounted on each compute instance, so each compute instance can retrieve these files. Again, this is an example of a notebook file, and this file is inside the file share. And just a reminder: the Azure file share here is accessed with credential-based authentication, not identity-based authentication, so you just need a storage account access key
to access the file share. Keep this in mind too. So how can this be exploited, and what can happen if an attacker wants to escalate privileges and get access to your Azure Machine Learning? One method that I found I call the startup script method. There is a feature that allows you to create a startup script that is going to run every time you boot, reboot or create a compute instance. So you create this startup script, it is stored in the file share that I showed you before, and every time the compute instance starts, the startup script runs.
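
Since the startup script lives on the shared file share and runs on every start, one way to notice unexpected edits is to keep a baseline of file hashes and compare against it later. The following is a minimal sketch using only the Python standard library. The helper names `hash_scripts` and `changed_files`, the `*.sh` pattern, and the idea of pointing it at a locally mounted copy of the share are all assumptions for illustration; none of this was shown in the talk.

```python
import hashlib
from pathlib import Path


def hash_scripts(root: Path, pattern: str = "*.sh") -> dict[str, str]:
    """Map each matching file under root to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_files(baseline: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return files that were added, removed or modified since the baseline."""
    names = set(baseline) | set(current)
    return sorted(name for name in names if baseline.get(name) != current.get(name))
```

A possible use is to store the output of `hash_scripts` after a reviewed change and to alert whenever `changed_files` returns a non-empty list. This only detects modification; it does not prevent it.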
Usually these are bash scripts, and they are used for different operations: updating, installing packages, things like that. This is an example: there is a default script in the notebooks area, and every user can access it. This script does something, it installs packages, whatever. But what happens if I have access to the storage account, to the file share, but I don't have access to Azure Machine Learning? Because this script and all of the notebooks are in the file share, even if I don't have permissions on Azure Machine Learning, I can still modify them. So imagine this scenario: you have
an access key for the file share because you compromised another user. You don't have permissions that let you get into Azure Machine Learning, but you are able to modify the files. What can you do? You can explore the file share, find the startup script, inject your malicious code and simply wait for the user to turn on the machine. And because these compute instances are quite expensive, they usually have a schedule: they turn off after two or three hours when nobody is using them, and then they are turned on again. So they turn on and off quite frequently. This is the scenario. This is not a completely new
thing in Azure: we have the same problem with other services, like Azure Functions, where you can compromise the storage account and then escalate. So the attack steps: imagine that you have a user, in this case Ellie. She has no permissions in Azure Machine Learning, but she has access to the file share. She finds a startup script and injects malicious code, a reverse shell in this case, but you can inject whatever you want. Then, when the machine is turned on, the malicious code is triggered and now she has a shell. And this is an
example: I retrieved an access token for the machine's managed identity, because these compute instances also have a managed identity. You can use this to escalate to other services. Imagine that the data scientists want to connect to a Key Vault or something else: they use the managed identity to connect the compute instance to other Azure resources, so now from the compute instance you can go to those other resources. You start from the storage account, you go to the compute instance, and now you move laterally in the cloud. I have a video PoC; I think I have to click on it to start it. Yes. So in this case Ellie has no access
to Azure Machine Learning. Sorry, you cannot access it, right? But she can access the storage account. You can do this with the Azure command-line tool if you want, so you don't have to use the portal, but I wanted to show it a little more clearly. So this is the startup
script.
Now I'm going to open a listener on my machine with netcat, and I'm going to use ngrok for tunneling because I didn't have a public IP at the time.
Then I simply create my own startup script; I overwrite the script. This is a simple bash reverse shell, very, very
simple. This is the reverse shell. Now you overwrite this file, and this is the new startup
script. Now imagine a different view: these are the notebooks, and you start a compute instance. This is another user who is using Azure ML. And now you get a
reverse shell. Okay, so this was the first method, the startup script. You might wonder: can I do the same thing with notebooks? Yes, potentially you can. It's a bit trickier, because with notebooks you have to do a little social engineering: data scientists use the notebooks and they can see the code. So yes, you can inject Python code there, but you need to hide it; you need to be much more subtle. The startup script is more direct because it runs every time the machine is turned on. There is another way. I told you there are notebooks and there are scripts, but there are also compiled models,
trained models: pickle files. Unlike the notebooks and the startup scripts, these pickle files are stored in the blob storage. And the thing is, you can inject malicious code into these files. You can access the storage account, access the file, inject malicious Python code, and every time the model is deployed, whether with Flask, in a container or behind a web endpoint, your malicious code is triggered together with the model. A tool that you can use to inject malicious code into a pickle file is Fickling. The process is the same: you get the model, the
pickle file (they are called model.pkl, and you can usually find several across different experiments). You download your model, you inject your Python code into the pickle file, and you re-upload it, overwriting the file. Now every time this model is deployed, your malicious code is triggered. In this case I used an Azure Container Instance, so it was very easy: Flask starts, the model is executed, and again I have a reverse shell like before, with the managed identity, and so with an access token that I can use to move laterally. So this is another thing you can do. It's a
little more dangerous than the startup script, because the model can be deployed anywhere: you can take the model away from Azure and deploy it on AWS or on other services. So it affects not only the compute instance but every place where you deploy it. Okay, another thing I want to show you. I showed you how it is possible to go from the storage account to Azure Machine Learning. But what can you do if you compromise a compute instance? From there you can do the opposite: you can go from the compute instance to the storage account, so you can get a storage account access key.
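
Going back to the pickle method for a moment: it works because loading a pickle can import and call arbitrary callables. The sketch below lists which callables a pickle would import, without ever loading it, using the standard library `pickletools` module. The `Demo` class with its harmless `print` payload and the `find_imports` helper are hypothetical names for illustration, and the `STACK_GLOBAL` handling is a heuristic that assumes the module and attribute names are the two most recent string arguments.

```python
import pickle
import pickletools


class Demo:
    """Stand-in for a tampered model: unpickling it would call print()."""

    def __reduce__(self):
        return (print, ("code ran during unpickling",))


def find_imports(data: bytes) -> list[str]:
    """List 'module.name' references in a pickle without executing it."""
    found = []
    strings = []  # string arguments seen so far, in order
    for opcode, arg, _position in pickletools.genops(data):
        if isinstance(arg, str):
            strings.append(arg)
        if opcode.name == "GLOBAL":
            # Older protocols store "module name" as one space-separated argument.
            found.append(arg.replace(" ", "."))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocol 4+ pushes module and name as separate strings first.
            found.append(f"{strings[-2]}.{strings[-1]}")
    return found


# Serialise only; pickle.loads is deliberately never called here.
payload = pickle.dumps(Demo())
print(find_imports(payload))
```

A model file that references anything outside the expected machine learning libraries (for example `os`, `subprocess` or `builtins`) deserves a closer look before it is deployed. A scan like this is a triage aid under the stated assumptions, not a guarantee that a pickle is safe.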
This is not new from me; it was already discovered by NetSPI, but I dived deeper into it. This is a bit more complex than what I showed you before, but if you follow me it will make sense. There is a process on the compute instance called dsimountagent. This is the agent that mounts the file share. Every 100 seconds it checks whether the file share is mounted, and if it's not, it mounts it on the compute instance. So basically this is the process that takes care of the check and
mount. This daemon has to authenticate, because it has to retrieve the storage account access key to mount the file share. The way it does this is by connecting to the xsd endpoint. How does it do that? It uses a private key and a certificate, which are stored on the compute instance. You can find them in this directory here: the key, the PEM file and the PFX file. So you are an attacker, you are here, and what do you do? You take these two files, because with these two files you can basically connect to the xsd
endpoint and retrieve a few things. There are mainly three methods right now: one is get workspace, another is get AAD token, and the last is get workspace secret. With the first method you retrieve some information about the workspace. With the second you retrieve the access token of the compute instance's managed identity, so you don't have to be on the compute instance anymore: you can simply take the certificate and the private key, connect to the endpoint, and get the access token for the compute instance. So if you lose access to the compute instance, no problem, you can still get your access token. This is interesting for
persistence. With the last method you can retrieve what is called the account key JWE. What is this? It is basically the encrypted storage account access key. So inside this thing you find the storage account access key. What did I do? I took the PEM file with the certificate and private key, generated a PFX file, imported it into Burp, and now I can authenticate; it's client certificate authentication. This is the first method, for example: I connected to the xsd endpoint, I used the get workspace method, and I retrieved a bunch of information about my workspace. This
is for enumeration. Another method is get AAD token: with this you get the access token for the identity of the compute instance. With this token you can authenticate to Azure and move to other systems, if this compute instance can access other resources. And here is the last method, get workspace secret, where you get the account key JWE. Now, this is encrypted, so you have to decrypt it before you can get the storage account access key. How do you do it? It's like a matryoshka: you have the first private key, called the cluster private key, that is
used to decrypt the second key, the encrypted symmetric key, which in turn decrypts the JWE. And when you decrypt the JWE, you get the storage account access key. Where are these two keys? They are in an environment file that you can find on the same compute instance. There you can get the two keys and export them, and then you simply decrypt the second key with the first one, and the JWE with the second key. You can do it with a simple Python script using a JOSE library. In this case I decrypt the encrypted symmetric key first, and then I use the
decrypted key to decrypt the account key JWE. As you can see in the screenshot, the first string is the decrypted key, and the second one is what you get when you decrypt the account key JWE: the storage account access key, for the storage account connected to Azure Machine Learning of course. Now with this storage account access key we can use the Azure command-line tool and connect to the storage account. So from one compute instance we compromised the storage account. Now, this worked until three weeks ago. Then I opened a pull request to
the MicroBurst project, because I developed a script to automate the whole process I showed you, and two weeks later Microsoft fixed everything. So what is the situation now? The PFX file is not there anymore, and the hardcoded keys are not in the environment file anymore. But I think something is still there, because the compute instance needs to mount the file share, and to do that it needs to authenticate, so it has to do something to retrieve the storage account access key. So they changed something, but I think the problem is still there. I didn't have time to dive deep into it, but
I think there is a similar concern: even if these two private keys are not there anymore, we have to discover where they are now and what was changed in the backend xsd API. I think there is still potential here. So we went from storage account to compute instance, and from compute instance to storage account. But you might wonder: okay, I'm inside Azure ML, Christian, what can I do now? You can harvest credentials. If you think that you can only access Azure Machine Learning things, you are wrong, because Azure Machine Learning holds a lot of credentials for other services. There are two ways to store
secrets in Azure Machine Learning: the first is workspace connections, the second is Key Vault. Key Vault is more secure, and it's what Microsoft advises you to use. Nobody uses it; everybody uses workspace connections because they are easy to use. With a workspace connection you insert your own credentials right inside the portal: you go to the portal, create a new secret and insert your API key. So anything can be here, from API keys to S3 buckets, which means you can also cross the boundary of Azure and compromise other systems. The interesting part is when
you go to this part of the portal: you notice that you cannot read the secrets, because in the portal they are hidden behind asterisks. But the secrets are still accessible from the backend API. So yes, you cannot read a secret in the portal, but you can connect to the backend API and retrieve the secret you want to see. So this is a reminder: when you give data scientists access to Azure Machine Learning, they also have access to all the secrets stored there. They can access your AWS account if you have an AWS
access key there, they can use your OpenAI key, whatever is there. So they can move to other parts of your infrastructure, of your environment. This is very important. So, automation, automation, more automation. Right now there are two scripts. The first was developed by Karl, the owner of the MicroBurst project; the second was developed by me. With the first script you can grab all the credentials that are in Azure ML. With the second you can fetch all the important data, like jobs, experiments, compute instance information and workspace secrets. So you can use these two scripts to enumerate Azure
Machine Learning. Of course you need to have access with the right permissions to do it, but they can be used to grab information and see if you can move laterally from there. So, a bit of advice on what you can do to improve security in Azure ML. One piece of advice is to use a private endpoint. When you create an Azure Machine Learning workspace you have a choice, as with everything in life: you can choose the public endpoint, so everybody can reach it, or you can choose the private endpoint. The private endpoint is of course more secure, because you need a VNet with a private address to access it. So my
suggestion is to use the private endpoint if you don't need the public exposure. That is one piece of advice. The second is to monitor the cloud, as always: look at what's going on there. Check whether there are new compute instances, new secrets in the workspace, new jobs, new pickle files; monitor all of this. Rotate the storage account keys regularly; this is good advice. Review the Jupyter notebooks, another thing, because you don't know what is in there. It's not just that attackers can compromise them; it's that users can hardcode
credentials too. And of course enforce least-privilege access to minimize the risk. If someone is not a data scientist, you don't want to give them full access to the platform; you want to give partial access or read-only access, based on what they have to do. And always remember that from Azure Machine Learning they can move to other parts of the environment, so you are usually giving them access to many things, not just Azure ML. So that's all. Are there any
questions? That does seem
so. Mic, mic, mic. Thank you.
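
The advice in the talk to review Jupyter notebooks for hardcoded credentials can be partly automated. The following is a rough sketch that reads a notebook's JSON and flags cells matching a couple of credential-like patterns. The `scan_notebook` helper and the two regular expressions are illustrative assumptions, not a complete secret scanner, and the sample key is the placeholder access key ID used in AWS documentation.

```python
import json
import re

# Illustrative patterns only; a real review would need a broader rule set.
PATTERNS = {
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "azure_storage_account_key": re.compile(r"AccountKey=[A-Za-z0-9+/]{20,}={0,2}"),
}


def scan_notebook(notebook_json: str) -> list[tuple[int, str]]:
    """Return (cell index, pattern name) for every suspicious cell."""
    notebook = json.loads(notebook_json)
    hits = []
    for index, cell in enumerate(notebook.get("cells", [])):
        source = cell.get("source", "")
        # The notebook format allows source as a string or a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        for name, pattern in PATTERNS.items():
            if pattern.search(text):
                hits.append((index, name))
    return hits


# Tiny in-memory notebook with a documentation placeholder key.
sample = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Training run\n"]},
        {"cell_type": "code", "source": ["key = 'AKIAIOSFODNN7EXAMPLE'\n"]},
    ]
})
print(scan_notebook(sample))
```

Because the notebooks sit on the shared file share, a check like this could be run against a mounted copy of the share on a schedule; where to run it is a design choice left open here.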
Hi, just to iterate on the workspace connection versus Key Vault thing: you mentioned that Key Vault is more secure. Can you explain a bit in what way? And is it possible, for example, for the managed identity to gain access or have permissions to list the secrets in the Key Vault, the same way a user would be able to list the connection secrets? Yeah, it's possible. You can assign the compute instance permission to access the Key Vault. It's a little more secure because you can use role-based access control, so you can say: okay, I want just these two compute instances to access
the Key Vault. So they authenticate with their managed identity and can access the Key Vault, but I don't give the other data scientists this permission. With workspace secrets, as soon as you are in the dashboard you have all the secrets, so you have no way to enforce granular access control, but this is something that you can do with Key Vault. But yes, what you're saying is correct: if you compromise a compute instance and this compute instance has access to some secret in the Key Vault, now you have access to the Key Vault too. So yes, it's possible to move laterally like this too. Yeah, so essentially once you have a
compute instance under control, all the secrets are exposed as expected, but the portal is somewhat secured, right? Yeah, because these compute instances have a managed identity, you can treat them like an Azure VM: you are in an Azure VM, the Azure VM can access other resources, and you can move laterally. Great, thank you. You're welcome, thank you for the question. Any others? Okay, thank you Christian. That's it for now. Thank you.
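
When reviewing what a managed identity token can reach, as discussed in the answer above, it helps to look at the token's claims. Assuming the access token is a JSON Web Token, which is typical for tokens issued by Microsoft Entra ID, the payload can be decoded with the standard library alone. This sketch does not verify the signature, so it is only suitable for inspection. The `decode_jwt_claims` helper and the unsigned sample token are hypothetical and built locally for illustration.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the claims of a JWT without verifying its signature."""
    segments = token.split(".")
    if len(segments) != 3:
        raise ValueError("expected header.payload.signature")
    payload = segments[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(data: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Hypothetical sample token; the audience value is only an example.
sample_token = ".".join([
    _b64url({"alg": "none", "typ": "JWT"}),
    _b64url({"aud": "https://vault.azure.net", "exp": 1700000000}),
    "signature-placeholder",
])

claims = decode_jwt_claims(sample_token)
print(claims["aud"], claims["exp"])
```

The `aud` claim shows which resource the token is meant for and `exp` shows when it expires, which gives a quick picture of where a compromised compute instance identity could be used.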