
So we've had some schedule changes, but right now we have Damien Burks to talk about some cloud security stuff. We're absolutely thrilled to have him, and I want you all to give him a round of applause. Give him a warm DFW welcome. Here's Damien.

So how's everybody doing today? So far so good? Yeah? Good, good. All right, cool. Well, let's just go ahead and get right into it. What I'm going to talk to you about today is called "Minimizing AWS S3 Attack Vectors at Scale," and I'm pretty sure you're wondering what it means to minimize those attack vectors at scale, considering that most organizations have multiple AWS accounts.

Before I get into it, just a little bit about me, a short bio, somewhat. I'm currently a cloud security engineer at Citibank, and I'm also pursuing a master's degree in cybersecurity. I'm almost done, so yay, next year, finally, right? Life happens. I'm AWS certified four times, so I have a range of AWS certifications, and I'm also an open source contributor for this tool that I developed called DataCop. Also, if you're familiar with Open Policy Agent, or OPA, I write a little bit of code for them from time to time as well. I'm a DevSecOps advocate and mentor, I have a couple of mentees, and I'm also a father of two kittens, not children, as you can see in the picture right there. Those are my babies. Some hobbies that I like to do for fun: I like to play video games, so if there are any gamers in the room, just come chat with me afterwards. Also anime and cars. I know, right here, you've got the Infiniti, so we've got to talk. But aside from that, that's just a little bit about who I am, so let's go ahead and get right into it.

So the first thing is S3 attack vectors explained. We know for a fact that buckets have been exposed quite frequently within the past couple of years, since people have started moving and migrating all their data
to the cloud, primarily to S3 buckets. Some of the common attack vectors that we've seen have been, for example, data exfiltration. If you take a look at the right side of the screen, that's a very small snapshot of some database credentials that were exposed from Netflix's S3 bucket, which was publicly exposed to the internet, which is bizarre, but okay. Then you also have ransomware and malware that people try to deploy, because, and I'll talk about this in a little bit, people usually read from and write to S3 buckets from their EC2 instances and other services in AWS. So with that being stated, that malware and ransomware would be loaded onto that EC2 instance if they're not careful, right?

With data exfiltration, some common things that have been exposed are PII, personally identifiable information, so you've got credit card numbers, Social Security numbers, and so on and so forth, and credentials and tokens, again, database credentials and tokens and so on. So many different things have been exposed. With ransomware and malware, depending on the intent of the malicious app or file, the impacts may vary. I haven't necessarily identified any research as far as what examples I can give y'all, but we're all security professionals, and we all know that the impact does vary.

So how does that really happen? When you have these S3 buckets deployed and all these files located in those S3 buckets, how is data exfiltration happening, or how do these ransomware and malware applications get inside of that S3 bucket? One of the common causes is the lack of access control. People don't necessarily have roles and whatnot attached to the S3 buckets to prevent people from uploading those documents, as well as blocking public access. You have lack of monitoring, meaning that there is no, or usually, I'm
not going to say no, but most companies don't monitor what's going in or out of their S3 bucket. So when people upload something, it's like, okay, it's there, but we don't necessarily know what's there, right? You just know that it has these objects or files, but you don't know what is inside of those files. You don't know the contents. You just don't know. And then the last thing, which kind of wraps everything up: lack of access control and lack of monitoring are both part of misconfiguration, and the reason why is because, well, it's not properly configured.

With that being stated, let's go ahead and get into the scenario. I have this example organization I created. Hopefully no one has named their organization Generics, but if you did, I do apologize. In this example, it's basically a mid-sized mobile gaming company that caches their survey and user data within several S3 buckets and on-prem. The key thing is that the company operates out of a single account, and they use all these different AWS services such as EC2, Lambda, and Cognito, so they're pretty much all in on the cloud, for the most part.

Within the past year, they sold over a thousand copies of their hit mobile game called Angry Dolphins. I don't know why dolphins would be angry, but they're angry. After the debut of their hit mobile game, the organization was hacked. How? Well, somehow the hackers gained access to, and I have this in red, the EC2 instance that was storing and retrieving personally identifiable information and PCI data from a single S3 bucket that was public. Now, why was the S3 bucket public? We don't know. But the security engineers had deployed this tool called Cloud One that observed the malicious files within that S3 bucket, and it highlighted that there was this malicious Excel workbook or worksheet that was loaded onto the web server, and then it created a
back door for those hackers to SSH into the instance. A lot's going on there, and I know that's a lot of words, so let's look at a picture. This is my favorite.

In this architecture diagram, you see we have a bad actor, and you have this report called customer report, whatever, bad.xls worksheet. That bad actor somehow uses the AWS CLI to upload this document to this public S3 bucket that contains PCI and PII data. From that S3 bucket, you see there's this EC2 instance called the payments processor, and that payments processor has a role attached to it called the EC2-to-S3 role. In short, what that EC2 instance is doing is assuming this role, and this role has permissions to read from and write to that S3 bucket. That's basically how they're able to get all the information that's in that S3 bucket. Once it has read and loaded everything onto this web server, that's how the attacker was able to compromise that system, based on it reading those Excel workbooks, decompressing that information, and so on and so forth.

So, aside from a malicious file, how do we know what type of data we have in that bucket, as far as the PII and PCI? How do we classify that? Well, we're going to move over to AWS Macie. If you're not familiar with AWS Macie, it's a data security and privacy service created by AWS that leverages machine learning and pattern recognition to discover the sensitive data that you have in AWS, particularly in S3 buckets. Macie has a couple of capabilities. The first is that it automatically provides an inventory of all the S3 buckets that you have within that particular account. It goes in there and inspects all the data and all the files or objects, and it classifies them based on specified criticality,
right, where there's high, medium, and low, which we'll talk about in a second. And if you're an organization that has data that's supposed to be compliant, it also helps you meet data compliance regulations such as GDPR, HIPAA, and so on and so forth.

A couple of pros and cons of Macie. First, it has fully managed data types. If you take a look on the right, all of the rules and policies that they create are fully managed by them. They have policies and rules for Social Security numbers in the US, or tax identification numbers, and it's not restricted to just the United States. It's global, so they have things from China and so on. It's fully managed by them, so you don't have to worry about that. It seamlessly integrates with AWS services such as EventBridge, CloudWatch, and Step Functions, which is super important, and I'm going to tell you why in just a second. You can customize the data types, meaning that if you have custom data that's proprietary to the business and you want to create regular expression rules or patterns to detect it, you can do that with Macie. You can add that information in there. And you can create an automated data discovery job that runs on a specific schedule, so if you want it to run daily, weekly, or monthly, you can do that. It's entirely up to you.

But the cons, of course: when you get into doing this with Macie, or anything with machine learning or AI, you know you're going to pay that money, right? So it's going to be very expensive. Not only that, but when this information is returned to you, based on how much data you have or how many rules it detects, it's going to give you this bloated JSON file, and it's not feasible for somebody to read through 50
000 lines of JSON just to figure out what Macie has found, right? Even in the UI, it can look a little complex, for whatever reason. And the last part is that there is no feature available for auto-remediation of the bucket. Let's say you do find something, and the organization has a compliance rule that you're not supposed to store Social Security numbers or any kind of PCI data in the cloud, but Macie finds that. It's just like, okay, hey, we found it, but we're not going to do anything about it. That's on you. I mean, excuse my language, bro, what? Why would you even do that? So there's no way for you to auto-remediate those buckets. That's pretty much the downside of Macie, and that's where DataCop comes into play.

But before we get into DataCop and what it does, let's get into the next thing that I mentioned earlier, which was Cloud One: Trend Micro Cloud One's File Storage Security. It's this cool, cool, cool, cool, cool, times 20 application that Trend Micro developed, and it's basically a security solution for S3 buckets and file systems in AWS. It provides malware and ransomware scanning on those files in multiple cloud environments, not just restricted to AWS. It could be GCP or Azure, so many other things. It can detect several different types of malware if you have them. Well, you don't want to have them, but if you do, it would detect viruses, Trojans, spyware. The list goes on, but those are the three main ones that I picked out. It also supports scanning different file types, so you've got BIN, EXE, PDF, ZIP, XLS, you know, Excel workbooks, CSVs, so many different things.

With that being said, how does it work? Let's dive into this just a little bit. Let's say we have a file, and we upload that file to whatever the cloud storage container is, whether
it's S3, for example. Somehow, in some way, they have a magic wand that subscribes to this little bucket, and the scan is going to be triggered for that particular file automatically. What's going to happen is that the file is then going to be sent to whatever repository they have all the data and contents in, and they're going to inspect it, do some magic on their end, and determine whether or not it's malicious. If it is, it's going to tell you what type of malicious file it is. That's pretty much how FSS works from a high level.

Let's go ahead and get into some pros and cons of FSS. A couple of pros: if you have massive files, it supports scanning of massive files, five gigs or more, which is fantastic. It's extremely fast, so if you upload a file, even a massive file, it'll literally scan it in about three to five seconds and return that report back to you very quickly. It is concurrent, so it's able to scan multiple files. Let's say you have 10 files and you upload them all at once. It's going to scan all 10 of those files at once and report it all back to you. And it's easy to deploy and it's extensible, meaning that if you wanted to add some custom data or custom logic to it, it's very easy to do so. It's an open source project, so you can just go ahead and add that in there.

But with it being open source, the problem is it's not free. So is it really open source? Not really. There's a part of it that's open source, which is the logic part, but you have to pay for the service. That's the downside to it. But, you know, I've got free AWS credits, so I'm not worried about that. That's the only con we have. So let's wrap this up and move on to some more diagrams so I can give you a scenario of Generics with Macie. So let's
say, for instance, now we know the organization was compromised, and the security engineers are like, okay, before we even get to any type of remediation or mitigation, let's understand what type of data we have in our bucket. In this case, the security engineer logs into the AWS console, navigates to the Macie dashboard, and triggers a job to scan all of the contents in that S3 bucket that contains PCI and PII data. If you take a look on the right, these are just some mock findings, but you can see that the security engineer found 58,000 records of sensitive information, so you've got credit card numbers and Social Security numbers and so many different things. What Macie does by default is classify any type of sensitive data that it finds as high. So they realize that, hey, we have a whole bunch of records here, and nine times out of ten somebody may have stolen something, because the bucket itself is publicly accessible, which is horrible.

With that being stated, now we'll move on to Cloud One File Storage Security. It's a bit different in that Trend Micro has their own dashboard. So if the security engineer wants to see the type of malicious file that was caught, what they'll do is log into this Trend Micro dashboard. You see that we have not just one account but two accounts. We have this account with the question marks there, which means that this is Trend Micro's home account, where all the data or S3 objects that are scanned get sent back to them. Their repository is deployed in that account, and then there's just interaction between this account and their account. So he's going to log in, he's going to look into that UI, and the UI is going to pull all the
results from their malware repository, where they store all of the information about your objects. Then you can see, on the third thing, Trend Micro is pointing to that S3 bucket, which means that they have something there that is subscribed to that S3 bucket. So it's completely event-driven: if something happens to that S3 bucket, it's going to trigger events that go back upstream to Trend Micro, to their repository.

So we'll get into the mitigation strategy without automation. Let's say we've got to do all this manually, right? Manual mitigation for the win. Not. I condensed the action steps for this, because nine times out of ten you're dealing, at this point, with two different attack vectors: one for data exfiltration, and another for this malware file that somehow miraculously appears in this S3 bucket.

The first thing we've got to do is disable public access for that S3 bucket. Why do you have it publicly accessible in the first place? I don't know. We don't want that anyway, right? After you disable that, you don't necessarily know at this point, since it had public access, what roles are associated with this bucket. Because it was publicly accessible, your roles may have been compromised, because now they have information on the roles. I don't know about you guys and your organizations, but nine times out of ten my organization may or may not use roles, right, and may use the same role, shared roles. Okay. But they have that information, and they can use those roles for pivoting into other things, right?

Then you attach a bucket policy that denies all access to those resources and services. That bucket policy is basically going to deny all access for all roles and from any service, so no services will be able to access that S3 bucket. You want to inspect each object and file and identify what the issue could be. So now
we've got to go through each file. We've got to determine whether or not it contains PII and PCI data, and then move it to the on-prem server or file system and delete it, right? And if it doesn't contain any of that information, now we've got to move on to figuring out whether or not it's a malicious file. If it's a malicious file, then we need to delete it and figure out the impact. So you've got all of these things: if it's this, then do this, and if it's that, then do that. How long will it take? It might take you a week, it might take you a day. We don't know, because we need to measure the impact. That is where the problem is going to come from, and that's why we have automation.

So now we'll get into DataCop, which is supposed to come and save the day. Not really, but. DataCop is a framework that I wrote. It's open source, and it's basically an AWS framework that mitigates the risk of vulnerable S3 buckets. What happens is that it leverages the Macie results that you get to automatically block those S3 buckets that contain PII or any classified information, and it now also relies on Cloud One results to make that determination.

Some features: it automatically provisions infrastructure to bridge the gap between Macie and S3, with AWS CDK and Python. There are also some configurable settings for bucket blocking, so you can do it yourself or configure it the way you want to. It's event-driven, meaning that it ties into some of the AWS services, so you don't necessarily have to trigger it. It does it automatically. And it's easy to extend for other AWS security frameworks such as Cloud One and so on and so forth.

Some considerations with this: there are quite a few IAM permissions and policies that you need to create in order to use the following services, because DataCop relies on EventBridge, Lambda, CloudWatch, SNS, Step Functions, and S3. So, a little bit about each of those services.
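The event-driven blocking idea described above can be sketched roughly as follows. This is a minimal illustration, not DataCop's actual code: the function names (`plan_remediation`, `deny_all_policy`, `public_access_block_config`) and the `threshold` setting are hypothetical, and the event shape (`detail.severity.description`, `detail.resourcesAffected.s3Bucket.name`) is an assumption about how a Macie finding arrives through EventBridge. The sketch makes no AWS calls. It only builds the public access block settings and the deny-all bucket policy from the manual steps, which a real handler would then pass to the S3 API.

```python
import json

# Ranking for the severity labels mentioned in the talk (low, medium, high).
SEVERITY_ORDER = {"Low": 1, "Medium": 2, "High": 3}


def public_access_block_config():
    """Settings that turn off every form of public access on a bucket."""
    return {
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    }


def deny_all_policy(bucket_name):
    """Bucket policy (as a JSON string) that denies all S3 actions to every principal.

    Caution: a policy like this locks out regular roles and users too, so
    recovering access afterwards typically needs a privileged break-glass path.
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAllAccess",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }
    return json.dumps(policy)


def plan_remediation(event, threshold="High"):
    """Return a blocking plan for the affected bucket, or None if no action is needed.

    `event` is assumed to be an EventBridge event wrapping a Macie finding;
    `threshold` is a hypothetical configurable setting for when to block.
    """
    detail = event.get("detail", {})
    severity = detail.get("severity", {}).get("description", "Low")
    bucket = detail.get("resourcesAffected", {}).get("s3Bucket", {}).get("name")
    if bucket is None:
        return None
    if SEVERITY_ORDER.get(severity, 0) < SEVERITY_ORDER[threshold]:
        return None
    return {
        "bucket": bucket,
        "public_access_block": public_access_block_config(),
        "bucket_policy": deny_all_policy(bucket),
    }
```

In a deployed version, a Lambda function triggered by the finding event would presumably apply the returned plan with the S3 public access block and bucket policy operations. Keeping the decision logic free of AWS calls makes it easy to test on its own.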
eventbridge is basica