Deep dive into cloud DevOps through Infrastructure as Code

BSides SATX · 2020 · 28:16 · 18 views · Published 2020-08
About this talk
A cloud security researcher from Palo Alto Networks presents findings from analyzing 500,000 Infrastructure as Code templates on GitHub to identify common security misconfigurations. The talk covers IaC fundamentals, provisioning tools like CloudFormation and Terraform, and actionable patterns to secure cloud infrastructure deployment.
Original YouTube description
Title: Deep dive into cloud DevOps through Infrastructure as Code
Presenter: Jay Chen
Track: In The Beginning
Time: 1130
BSides San Antonio 2020, July 11th, San Antonio, Texas
Abstract: While infrastructure as code (IaC) offers a systematic way to build datacenter and enforce standards in the cloud, this powerful capability remains largely unharnessed. We analyzed 500,000 IaC templates on GitHub to study the security pitfalls commonly overlooked in the cloud infrastructure.
Speaker Bio: Dr. Jay Chen is a cloud security researcher with Unit 42 at Palo Alto Networks. His current research focuses on the public cloud and cloud-native security. Before joining Palo Alto Networks, he was a researcher in Accenture Cyber Lab, where his research focused on the DevOps, Blockchain, and ICS security. In the past, he had worked on designing secure distributed data storage and data processing systems for mobile devices.
Transcript [en]

A little bit of background about me: I am a cloud security researcher with Palo Alto Networks. My research focus in the past few years has been container, DevOps, and public cloud security, and in this talk I will focus on my research on infrastructure security in the public cloud. In the past few years, organizations have been moving rapidly into the cloud. In the past, when you created a website, say an e-commerce site, in an on-prem data center, you knew exactly where you stored the data, you knew exactly which hard drive held your SQL database, and you knew which Ethernet cable transmitted your data to the

internet. But when you move to a cloud environment, when your e-commerce site is deployed in the cloud, you have no idea where your data is. You know your server is hosted somewhere and is accessible to the public all over the world, but you have very limited visibility into what is going on below your application. So, as a system admin, how can you protect your application or your data in the cloud? The good news is that cloud service providers today take care of much of the heavy lifting and dirty work for you, such as reliability,

load balancing, and throughput optimization, and they have done a fantastic job. The second piece of good news is that cloud service providers also mimic the on-prem data center as much as possible, so that you can configure your cloud infrastructure almost identically to your on-prem data center, if you had one before. In this talk I will show you how infrastructure as code can help you build a virtual data center much quicker and cleaner, and it can also be more secure if you do it correctly. So, the agenda of my talk: the first half will focus on introducing infrastructure as code, and the second half will focus on our research

on exploring the public as well as private infrastructure as code that we have access to. In particular, we pulled around 500,000 infrastructure as code templates from GitHub and analyzed them using our IaC scanner to understand how people are using infrastructure as code and how secure these IaC templates are. We focus on a few types of IaC in our research: AWS CloudFormation, HashiCorp Terraform, and Kubernetes YAML files. Finally, at the end, I will conclude the talk with some patterns and anti-patterns to make your IaC more secure. So the broader definition of

infrastructure as code is the process of provisioning and managing computer data centers through human-readable code, as opposed to physical hardware configuration. In the past, when you had an on-prem data center, you had a bunch of servers wired nicely with many Ethernet cables, and after years of maintenance, debugging, or patching, this is how your data center may become: a little bit messy. Maybe not as bad as this picture, but just getting more difficult to manage, especially when you need to scale up your data center to handle more load or business. And on the right is the virtual

data center in the cloud. I see this as a data center in the browser, because I can literally configure and create a data center with just a few clicks and a few drags and drops in the browser, and then I have a data center ready for me in a few minutes. This is a screenshot from AWS. Although GUI-based configuration is easy and intuitive, it is still challenging if you have hundreds or thousands of virtual machines to manage. That's where infrastructure as code, IaC, comes to the rescue. You can essentially just specify all the requirements and specifications of the data center in a plain

text file, send this plain text file to your cloud service provider, and the cloud service provider will magically create a full-blown, fully functional data center for you in just a few minutes. The concept of IaC is not that new. It has been around for more than a decade, and there are many different IaC languages. There are mainly two types of IaC, if we look at the third column here. The first type is called provisioning IaC. This type of IaC focuses on creating infrastructure such as servers and network connections, and this is the type of

IaC that most public cloud service providers, such as AWS, Azure, and Google, provide. The second type is called configuration management IaC. This type of IaC is able to manage, configure, or install software on your servers. The biggest difference is that configuration management IaC usually requires an agent running in your VM or on your servers, so that the agent can configure all your software, such as your web server or your SQL server, and apply patches and updates for you. It can do much more than the provisioning type of IaC. These two types of IaC usually have some overlap, but their focus is very different. In

this research we focus mainly on the provisioning type of IaC used in, or provided by, the public cloud. Here are some important features of IaC. First, it is shareable and repeatable. This is one of the most important advantages of IaC, because as a company you can create a golden IaC and share it with different departments or different projects, and each project can then create an identical virtualized data center for that specific project. So it's shareable and repeatable. The second property of IaC is modularized components: a data center can be built

with multiple smaller pieces of IaC code, and this IaC code, or IaC module, works just like a software library. It can be shared and it can be reused, and this makes enforcing policy or standardization much easier, and also makes creating a new data center much easier. The third feature is that IaC makes the process more agile and makes testing easier. Because a data center can now be created easily and quickly, one can easily replicate an identical production environment to test new code. In the past we have all experienced that a piece of code

is fully functional in the dev environment, but when this code is moved to the production environment it just fails horribly. Most of the time the reason is that there is drift between your dev environment and your production environment: your production environment is not identical to your dev environment, so a piece of code may work in dev but not in production. With IaC we can easily remediate this issue, because each IaC template can create multiple identical environments, and if anything goes wrong in your production or dev environment, or if any configuration is screwed up, you can easily just destroy the

existing environment and recreate one quickly. That's the beauty of IaC. And the last one, static analysis: this is what our focus is. Before going into static analysis on IaC, let me first talk a little bit about shift left. The concept of shift left, again, is not new. It means moving testing as early as possible in the software development life cycle, with the goal of preventing issues instead of detecting issues. So instead of testing your code at the last stage of development, shifting left encourages you to test incrementally, testing each small piece of your code or each function

as early as possible, to identify and address issues as early as possible. When the concept is applied to security, it means we want to identify security issues as early as possible. Instead of scanning your entire application before deploying to production, which is how we scanned for vulnerabilities in the past, with shift left we want to identify security issues at the development stage. We want to install IDE plugins to identify security issues even while the developers are writing the code, so that these issues can be addressed and removed as early as possible and never get into

the development pipeline. When talking about shift left, most people think about identifying vulnerable packages or vulnerable applications in the software development pipeline. But now, with infrastructure as code, we can also think about how to make the infrastructure secure much earlier. Instead of finding the misconfigured infrastructure after the infrastructure has been built, with IaC we can look for insecure configurations or vulnerabilities in the code before the code is actually deployed. That's the beauty of IaC: it allows security teams or system admins to identify

any possible misconfiguration or insecure configuration before the code is turned into a real data center.
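
The idea of checking a template before it becomes real infrastructure can be sketched in a few lines. The following is a minimal illustration, not the scanner used in the research: the template, the resource name `WebSecurityGroup`, and the helper `find_world_open_port` are all hypothetical, and the check only understands simple CloudFormation ingress rules written in JSON with integer ports.

```python
import json

# Hypothetical CloudFormation template, inlined so the sketch is self-contained.
TEMPLATE_TEXT = """
{
  "Resources": {
    "WebSecurityGroup": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription": "web tier",
        "SecurityGroupIngress": [
          {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22, "CidrIp": "0.0.0.0/0"},
          {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443, "CidrIp": "0.0.0.0/0"}
        ]
      }
    }
  }
}
"""


def find_world_open_port(template, port):
    """Return names of security groups that allow `port` from any IPv4 address."""
    findings = []
    for name, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::EC2::SecurityGroup":
            continue
        rules = resource.get("Properties", {}).get("SecurityGroupIngress", [])
        for rule in rules:
            low, high = rule.get("FromPort"), rule.get("ToPort")
            # Simplification: rules without explicit ports are skipped.
            if low is None or high is None:
                continue
            if rule.get("CidrIp") == "0.0.0.0/0" and low <= port <= high:
                findings.append(name)
                break
    return findings


template = json.loads(TEMPLATE_TEXT)
ssh_findings = find_world_open_port(template, 22)     # SSH
rdp_findings = find_world_open_port(template, 3389)   # RDP
print("SSH open to the world:", ssh_findings)
print("RDP open to the world:", rdp_findings)
```

Run as a step in a build pipeline, a check like this would flag the template before anything is deployed, which is the shift-left point made above.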

And now I will quickly go over the research that we did early this year. In this research we want to understand how people use IaC. In particular, we want to see whether people are using IaC securely, and the obvious answer is no: there are always insecure configurations in any kind of environment. We want to understand what the most commonly seen insecure configurations in IaC are. The way we did this research is that we pulled all the CloudFormation, Terraform, and Kubernetes IaC templates that we could

identify from GitHub. There are around 500,000 of these IaC templates that we pulled from GitHub, and we also pulled some data from Prisma Cloud. Prisma Cloud is the cloud security service offered by Palo Alto Networks, so we have some internal data that we can use as well. We aggregated this data into a repository and designed an IaC scanner to statically analyze this infrastructure as code, and finally got some statistics out of the data. Behind the scenes, this is how our scanner works: we basically create many policies for each type of infrastructure as code. For example, on the

top left, for CloudFormation, this is one of the policies: this rule, written in a Go template, identifies whether AWS customer master key rotation is not enabled. Another example is the Terraform policy on the right: again, we can write a template to identify whether Azure AKS role-based access control is not enforced. So there are many policies for each IaC, and customers can also easily add their own policies to the scanner to identify specific configurations. The finding in CloudFormation is that 42 percent of all the CloudFormation configuration files contain at least one insecure configuration, and the two most important findings in

CloudFormation are these. First, encryption: people are still not used to the idea of encrypting everything, especially data at rest. We have seen very good adoption of HTTPS, or encryption in transit, but a lot of the time data is still not encrypted at rest. We have seen a lot of S3 buckets whose data is not encrypted, or data in RDS databases that is not encrypted. One may argue: I don't really need to encrypt my database, because I already have proper authentication or authorization access control outside this database. But it's always a good idea to have your data secure at rest,

because you never know: one day a snapshot of your database may be leaked, and if your database is not encrypted, this snapshot can then be read as plaintext. The second finding is logging, or eventing: people are not turning on logs and events. Logs and events are especially important for security purposes, because if you cannot see, you cannot protect.
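
The two CloudFormation findings above, missing encryption at rest and missing logging, lend themselves to a simple static check. The sketch below is an illustration under stated assumptions rather than one of the actual scanner policies: the template and resource names are invented, and the function `check_at_rest_and_logging` is hypothetical. It relies only on the CloudFormation properties `BucketEncryption` and `LoggingConfiguration` on `AWS::S3::Bucket` and `StorageEncrypted` on `AWS::RDS::DBInstance`, and it flags templates that do not set them explicitly.

```python
# Hypothetical template, already parsed into a Python dict.
TEMPLATE = {
    "Resources": {
        "AssetsBucket": {"Type": "AWS::S3::Bucket"},
        "OrdersDb": {
            "Type": "AWS::RDS::DBInstance",
            "Properties": {"Engine": "mysql", "StorageEncrypted": False},
        },
        "AuditDb": {
            "Type": "AWS::RDS::DBInstance",
            "Properties": {"Engine": "mysql", "StorageEncrypted": "true"},
        },
    }
}


def check_at_rest_and_logging(template):
    """Return (resource name, message) pairs for missing encryption or logging."""
    findings = []
    for name, resource in template.get("Resources", {}).items():
        rtype = resource.get("Type")
        props = resource.get("Properties", {})
        if rtype == "AWS::S3::Bucket":
            if "BucketEncryption" not in props:
                findings.append((name, "S3 bucket does not set BucketEncryption"))
            if "LoggingConfiguration" not in props:
                findings.append((name, "S3 bucket does not set LoggingConfiguration"))
        elif rtype == "AWS::RDS::DBInstance":
            # Templates may write booleans as true or as the string "true".
            if str(props.get("StorageEncrypted", "")).lower() != "true":
                findings.append((name, "RDS instance does not enable StorageEncrypted"))
    return findings


findings = check_at_rest_and_logging(TEMPLATE)
for name, message in findings:
    print(f"{name}: {message}")
```

A real policy would also need to handle intrinsic functions and parameters in place of literal values, which this sketch ignores.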

So suppose the data in your S3 bucket is leaked. Without logging, you cannot know who has accessed your data or what data objects have been accessed; you cannot know the scale of your security incident. So logging is extremely important in terms of securing, auditing, and restoring. It can not only prevent security incidents from happening; it can also help you detect potential security incidents much earlier. I will not go over the code examples. These are real code examples that we found on GitHub showing how CloudFormation can be misconfigured,

like, for example, an S3 bucket that just gets published to the internet, or sometimes, for example, a web server. A lot of the time these are misconfigured, and that's how data gets leaked. In the Terraform IaC, we found that 22 percent of the Terraform configuration files contain at least one insecure configuration. The first key finding, again, is logs and events: AWS S3 bucket logging is not enabled, or event subscriptions are disabled. This all prevents you from seeing what is going on and what has happened in your cloud infrastructure. The second biggest issue that we found in Terraform templates is

public exposure. We have seen a lot of infrastructure that just exposes the SSH service, or even RDP, the remote desktop service, to the public. SSH is a great service, it's convenient, but it is just too risky to be exposed to the entire internet. If you want to use it remotely, you should just create a whitelist of IPs that can access the service instead of opening it to the entire internet. Same for RDP, and for databases: there's no good reason to expose your SQL database to the entire world. Usually there are just a handful of application servers that need to access the database directly,

and if it is exposed to the internet, any hacker can identify it and try to poke around. And finally, the Kubernetes YAML files that we scanned. We identified two important insecure configurations. The first is sharing the host network with the container. The problem with creating a container and sharing the host network with it is that if this container is compromised due to an application vulnerability, the hacker can immediately access the network stack and the same network as your host, and it puts your other applications on the host,

as well as other infrastructure accessible from the host, at risk. So it's a very bad idea. There are times when you need to share the host network with the container, but usually you shouldn't. The second important issue that we identified is that a lot of containers are still running as root or as privileged containers. We all know it's a bad idea to run your application as root, because if your application is compromised, the hacker gets the root privilege of the process, and with root privilege it is much, much easier to pivot from that point

moving forward. And again, this is an example of how this misconfiguration looks in the code. From just looking at this code, one can imagine it's not too hard to identify this misconfiguration with static analysis. And finally, this is the data pulled from our internal database, from the Palo Alto Networks internal service. Interestingly, we identified almost the same kinds of misconfiguration, the same patterns, in our private data set as in the public data set. 76 percent of the clients have at least one SSH service exposed to the entire internet, and 69 percent of the clients have at

least one RDP service exposed to the entire internet, and the same logging and eventing issue and the same encryption of data at rest issue were identified in the internal data. Finally, let me quickly go over how we can mitigate these issues. Visibility is important in the cloud environment. As I mentioned, in the cloud environment you don't have physical access to your hardware, so if there's any eventing or logging available from the cloud service provider, you should turn it on, so that you can see what is actually going on under the hood. Also enforce security standards such as CIS or a zero trust model. And finally, shift left: this is the focus

of this research. We can now shift left at the infrastructure level. Because of the availability of IaC templates, we can scan the IaC templates as early as possible, before any misconfiguration makes it to production. And a few patterns and anti-patterns to remember, our takeaways for the audience. Enable cloud security services such as AWS GuardDuty, Azure Security Center, and GCP Security Command Center. Enabling these security services also enables a lot of logging or eventing services that help you increase the visibility of your virtualized data center as well as your services. And always encrypt,

always log, and always firewall. If the encryption option is available, always encrypt. If there is any log, always collect it; you never know when you will need this data, and it will become very, very helpful. And constantly scan. Also, I didn't really talk about application vulnerabilities in this talk, but patching is, again, the most important way to secure your application. This concludes my talk. If anyone is interested in more detail about this research, we published a report, the Unit 42 Cloud Threat Report, that you can easily find online and download. It is a fairly long report, I think 20-plus pages, with the details.

Thank you.
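
The two Kubernetes findings described in the talk, sharing the host network and running containers as root or privileged, can also be sketched as a static check. This is a minimal illustration, not the policy shown on the slides: the manifests are invented, the function `check_pod_spec` is hypothetical, and the manifests are given as Python dicts (as they would look after parsing the YAML) to keep the example dependency-free. The fields `hostNetwork`, `securityContext.privileged`, and `securityContext.runAsUser` are standard Pod spec fields.

```python
# Hypothetical Pod manifests, as they would look after parsing the YAML.
RISKY_POD = {
    "kind": "Pod",
    "spec": {
        "hostNetwork": True,
        "containers": [
            {
                "name": "web",
                "image": "example/web:1.0",
                "securityContext": {"privileged": True, "runAsUser": 0},
            }
        ],
    },
}

SAFE_POD = {
    "kind": "Pod",
    "spec": {
        "securityContext": {"runAsUser": 1000},
        "containers": [{"name": "web", "image": "example/web:1.0"}],
    },
}


def check_pod_spec(pod):
    """Return human-readable findings for host networking and root/privileged use."""
    findings = []
    spec = pod.get("spec", {})
    if spec.get("hostNetwork") is True:
        findings.append("pod shares the host network (hostNetwork: true)")
    pod_context = spec.get("securityContext", {})
    for container in spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        context = container.get("securityContext", {})
        if context.get("privileged") is True:
            findings.append(f"container {name} is privileged")
        # Container-level runAsUser overrides the pod-level setting.
        run_as_user = context.get("runAsUser", pod_context.get("runAsUser"))
        # Only an explicit UID 0 is flagged; an unset runAsUser may still
        # mean root, depending on the image, which this sketch cannot see.
        if run_as_user == 0:
            findings.append(f"container {name} runs as root (runAsUser: 0)")
    return findings


risky_findings = check_pod_spec(RISKY_POD)
safe_findings = check_pod_spec(SAFE_POD)
for finding in risky_findings:
    print(finding)
```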