BsidesLV 2024 - Breaking Ground - Tuesday

BSides Las Vegas · 9:21:50 · 1.6K views · Published 2024-08

ready to go all right

right

godan Shang Zas good morning

BSides, thank you all very much for being here today. Thank you all very much for caring enough to mask while you're here, and remember, everybody: it's hot out there, it's dry in here, drink lots of water. Okay, cool. Why are we all here? I mean, it's been 15 years since the first time that this thing happened. Mostly that just makes me feel old, but this is truly kind of an amazing longevity. This community has grown so much in that time. This is, as of today, the 103rd Security BSides event that we know of in the world.

103. So I'm not just old, I'm actually kind of amazed. And how did this happen? Mostly, I think it's because of you folks. I've got a few ideas. My grandmother had a saying that there's nothing so bad that it ain't good for something, and this entire conference exists because there were conversations that we couldn't have at Black Hat, at RSA, or any number of other places, right? Rejection, failure, exclusion: these things are terrible in the moment, but this community exists because people took those things and actually built something from them that was not only unique but truthfully stronger and more flexible in some ways that were desperately needed in the

world. There are situations in life and in our industry where you'll only get one shot and you need to cross your t's and dot your i's, where there's no makeup and there's no recovery option, and you can't unbreak eggs if you drop them, right? But most beginnings aren't like that. Most beginnings are about learning, and the only way to learn new things, actually new things that nobody's learned before, is by doing. The most important part is: begun. Even with a tiny group of people in a tiny room, even if you overcrowd your tiny room or the next bunch of tiny rooms so badly that you have to shoehorn in a bunch of portable AC to keep from

expiring by perspiring, even if you overload that AC so badly that your power cord decides to actually halt and literally catch fire, the important part is still: begun. If you begin hard enough and keep at it, you can work it out, and there's a good chance it will work out well, especially with the right help. Best of all, you know, there's the old saying: to go fast, go alone; to go far, go together. But if you can find a way to go with just enough people to get you where you need to go as fast as you can, that's even better, right? And BSides excels at this: decentralization, do-ocracy. If every BSides had to be set up by Jack

and Banshee and nous and Cinders and all the rest of us, we'd still be at a couple of dozen, not a hundred and three. We wouldn't have the global reach. Because it's decentralized, because we come together wherever we're needed (and by "we" I mean you guys, you gals, you, everyone out there, and us), we get things done without a whole bunch of state transfer and vertical communication. We're not just a conference because of that; we're a movement, and all you've got to do to be part of it is participate the next time one comes around in your neighborhood. So many wonderful things have started here. BSides is a place for beginnings. Wonderful, amazing programs have begun here and grown into movements

of their own: Pros vs. Joes; I Am The Cavalry; our Proving Ground mentored track for new speakers, which is so effective and impactful that conferences like Black Hat started doing it as well; public dialogue about mental health in our profession, again pioneered here. Our Ground Truth data science track has spawned other organizations that are doing wonderful things that got started at our annual data science meetup here. I can't even begin to cover the impact that you have when you and your fellow participants take the things you start here out into the world. You are part of that. You are here to do that: the co-founder you'll meet, the new employer or future colleague

that you'll get to know, the cool skill you'll pick up just because you're here and you can spend 20 minutes or an hour or even a couple of days with someone who isn't just a world-class expert in something but is so jazzed and so excited about it that they love to share it with everybody that they can. You are a part of that. But it's not just about starting things. BSides is a community to sustain and boost these beginnings: open, welcoming, built to incorporate the best from any place we can find it, and to celebrate and elevate those fantastic fellow travelers. PasswordsCon, Skytalks, Hire Ground: each of them led by people who came to

BSides with fantastic ideas and track records already, and made us and themselves better and more impactful as a result of coming together. I mean, after all, what is BSides? A B-side is a cheap record, right? It's a place to experiment, to learn, to come up with something new and unexpected and wonderful and give it a chance, just a chance, for the world to realize how powerful it is, whether that's a person or an idea or an organization or a whole movement. But all of that, every bit of what that can be, begins with you. And that brings me to my request, and the reason I'm taking a little longer to get

to those talks today. You knew it was too good to be true, right? There's always a catch; we always are going to ask something from you. So when you leave this room today, after the keynote, please, because you don't want to miss that bit about the amazingly accomplished people sharing what they love best and care most about, right? But when you leave this room today, this is my request for your BSides, and it's only a request: go begin something new. Talk to someone you haven't before. Learn something you never thought you'd need for your job, like how to pick a lock or how to operate a ham radio, or whatever is just interesting. But better

yet, help something that's already started to grow, whether that's you or whoever is sitting next to you right now. Go ahead, take a look. Look right now and make eye contact with the person to your left or your right, or at least make eye contact with their shoes, their shoes instead of your shoes. I mean, we're all geeks here, we get it, that's fine too. And no, it doesn't work if you all look left at the same time and then you all look right at the same time, right? Because this is something, like most things, that works better with each of you figuring that local bit out for yourselves rather than some central

authority trying to coordinate everything. Take that look anyway and make a mental note, and at some point in the next two days, not right now but sometime after the talk, walk up to those folks. And if they look confused, that's okay; remember, a lot of us are a little bit face blind, so that just means you don't have to worry about it. You walk up and you tell them Daemon Tamer sent you, and you ask that person what they really care about: what's their thing, what's really cool and exciting, and what they're looking forward to while they're here. And okay, that's a little too long; we can do better. There's a great Irish word for

this, a great word for BSides in general, actually. Just walk right up and say, "Hey, DT sent me. What's the craic?" That's C-R-A-I-C, no K. Very important distinction; a lot of us have security clearances to maintain, that kind of thing. But I promise you that no-K craic is 100% compatible with that. Craic is a good time, it's a fun experience, it's a friendly conversation, right? And it can be the thing that has happened or is going to happen, or even something that's happening right now. It's good craic, the word craic, and BSides is good craic too. So: "Daemon sent me. What's the craic?" And see where it takes you, right? See

what you can learn and what you can help grow. And I'm going to warn you, we'll be checking up on this at the end of the conference. So just remember, when somebody walks up and says, "Hey, what's the craic?", this is as official as BSides business ever gets, and sharing the craic is a moral imperative. It's how all of this got started, and how all of it continues to grow, and how all of us change the world one life at a time. That's enough of that; put a pin in it. [Applause] Because in just a few minutes we're going to have Sven Cattell come here and talk about how ML security

can best draw from our accumulated experience in designing and operating secure and safe systems in order to address the actual critical risks that are manifest in AI systems, rather than some of the more shallow threats that get endless chattering attention today. So stay tuned, stay put, and, oh yep, there's our man right here. We're going to get him set up, and you guys will get to hear some very interesting stuff.

Hi everyone, I'm Sven Cattell. I have been running the AI Village... sorry, I'll step a little closer. Hi, I'm Sven Cattell. I've been running the AI Village at DEF CON for the last eight years, kind of. It was basically eight years ago that I texted a few people (there's a couple of them in the room) saying, "Hey, do you want to meet up for coffee?", and then pitched making the AI Village, and the following year, in 2018, we had our first one. It's been a bit of a ride learning about how AI security works, because I was a mathematician at

Johns Hopkins doing a postdoc in geometric machine learning, and then suddenly I was helping run an AI conference at DEF CON. Because of that I got to know a lot of people in the industry who have actually dealt with this stuff, so I had some lovely, interesting conversations. As you can see from the tagline, there's the LLM craze, which we helped contribute to with some of the things we did in the Generative Red Team last year with the White House and a bunch of the vendors, and a bunch of snake oil salesmen have been in AI trying to sell you things that don't really help much. Some

of this has pissed me off, and I've given rants about this to people, and now unfortunately the rants have gotten away from me and I'm here. So, to start with: the first reported vulnerability in a production machine learning model that I know of is from 2004. There are older versions of this from 2003, but those were Usenet posts, or just people emailing each other, and I've lost those to the ether. The bookmark exists on my computer, so I wasn't imagining things, but it now 404s. What it was was Bayesian poisoning, which is the oldest thing; if you ever get into machine learning security, this is one of the

first things that they teach you. What it is: you append a bunch of hammy words to the end of your spam. Sending a single link inside of an email is a bit spammy, but if you add a bunch of words, and maybe put them in white text so the recipient can't see them, the old email filters would just let you through. This is still effective against some machine learning models, as you will see. But this was the first one where we were like, "Oh God, we got popped; the machine learning model has a vulnerability."
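
The padding trick described here can be sketched in a few lines. This is a minimal illustration, assuming a toy multinomial naive Bayes filter with Laplace smoothing and a four-message training set invented for the example; it is not the filter from the 2004 presentation.

```python
import math
from collections import Counter

def train(messages):
    """Count word occurrences per class for a toy naive Bayes spam filter."""
    counts = {"spam": Counter(), "ham": Counter()}
    docs = Counter()
    for label, text in messages:
        counts[label].update(text.lower().split())
        docs[label] += 1
    return counts, docs

def spam_log_odds(text, counts, docs):
    """Log-odds that `text` is spam; a positive value means it looks like spam."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    totals = {label: sum(counts[label].values()) for label in counts}
    score = math.log(docs["spam"] / docs["ham"])
    for word in text.lower().split():
        # Laplace smoothing so unseen words do not zero out the estimate.
        p_spam = (counts["spam"][word] + 1) / (totals["spam"] + len(vocab))
        p_ham = (counts["ham"][word] + 1) / (totals["ham"] + len(vocab))
        score += math.log(p_spam / p_ham)
    return score

# Invented training data, for illustration only.
TRAINING = [
    ("spam", "click this link to win free money now"),
    ("spam", "free pills click here now"),
    ("ham", "meeting notes attached see agenda for tuesday"),
    ("ham", "lunch tuesday see you at the meeting"),
]

counts, docs = train(TRAINING)
spam = "click this free link"
# The attacker appends "hammy" words (possibly hidden as white text).
padding = "meeting agenda tuesday lunch notes attached " * 3

before = spam_log_odds(spam, counts, docs)
after = spam_log_odds(spam + " " + padding, counts, docs)
print(f"log-odds before padding: {before:.2f}, after padding: {after:.2f}")
```

Every appended ham word contributes a negative term to the sum, so enough padding pushes any message below the decision threshold without changing the link the sender cares about.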

This is kind of unpatchable; you had to do something else. The actual documentation of this came from John Graham-Cumming's presentation on it (he's now the CTO of Cloudflare), and he had a bunch of other bypasses in that presentation. I think there were six different ways in total that people had come up with in the previous year to get past his machine learning model. This is one of the rules that you have to learn with AI security: there are always other bypasses, and machine learning is just not secure. You need to shift your mindset of how you think about AI

security when you come to machine learning. The main thing is: models make mistakes. If you have a model that is 100% accurate, something went wrong. Production malware models are 99.99% accurate (four nines, maybe five or six nines), and still the mistakes they make come back and bite you in the ass sometimes. There's no way around the fact that they just make mistakes, and each mistake could cost one of your customers a lot of money. For malware or spam models that's just how it is, but we build our ecosystem around the fact that these models are just there to stem the tide; they

aren't the critical component; they just help a lot. And the thing is, attackers just exploit the errors. If they find a way to get past your machine learning models, they're going to do that same old thing over and over again, and there's no way for you to prevent them, because if you never make a mistake you're not doing machine learning. This isn't cryptography. The other big thing that you have to know for AI systems is that they don't do well on out-of-distribution data. The classic example of this from academia: you take MNIST, which is the handwritten digits from postcards, and you train a model. It's 50,000 images in the

MNIST training data set that people use. You train a model on that, and then suddenly you start feeding it things from Fashion-MNIST, which are in the correct format to feed this model, but you don't expect the model to handle this stuff well, and it can do all sorts of funky stuff. The theory that some people have with, say, PE data is that if you collect enough PE data and you train a model well enough, it'll generalize to the new stuff. If you went to the RSA floor or Black Hat back in 2018 and 2019, when people were talking about the AI in their thing, the next AI, when that

was really exploding, they were like, "Oh yeah, our AI generalizes to the new threats, so we can keep you secure against new things." That mostly works, but one of the problems is that AI doesn't actually generalize that well. Every single time a new model comes out and everyone's like, "Oh, it's so creative and can do things," we find out that it's actually a remix of things that have already been done. Remixing stuff is very helpful; a lot of academic work and music work is done with remixes. But when you actually get new things in the system, which happens all the time with PE data, with other things,

especially with malicious stuff, remixing doesn't get you there. One of the things that we know from security is that attackers do weird things all the time to get past your defenses, so you're always going to get out-of-distribution data and you're always going to make mistakes. To make a comparison to drive this home: imagine you have a single, really simple API. It just takes in some bytes and sends back some bytes, something that you can implement in under 5,000 lines of C with no nginx, no Apache, just

nice and simple: you take the packets out of the kernel and reply, the simplest thing you can get working. A competent engineer could get this to the point where you put it in a box, ignore it in the corner, and you're pretty certain that box isn't going to get popped. But this is a constrained thing. That's what some people will pitch you on, and people have built some secure systems like this. When have you heard about the actual financial transaction parts of a bank getting popped? That doesn't really happen anymore. There were major changes

in the '80s, and regulations, and those are handled on mainframes. Things around the bank get popped, but the actual financial transactions are handled well. But now think about securing an enterprise with tens of thousands of endpoints, no idea what they own, users clicking stuff all the time. That security problem is not the same as the first one. You're not trying to secure one system; you are assuming that a user will screw up and you're going to get popped, and you want a good EDR (endpoint detection and response) system and a nice SOC in there. You want people to come do

incident remediation. You drive less of your resources towards securing your environment and making sure there's a wall between you and the world, and more resources towards making sure that when something happens you have a good response time and the damage is minimized. You should think about AI as the second one and not the first one. And it's even worse than that, because defending machine learning models is even worse than defending a large enterprise with unknown unknowns and people doing shadow IT and a bunch of other stuff. This is one of my favorite examples of this: a two-dimensional slice of 780-dimensional

space, where each color is a unique decision region of a neural network. This is only a two-dimensional slice; for this very simple neural network there were two to the 60 regions in here. It's a very simple neural network because it had 60 hidden nodes, which means it had a very small number of parameters. Now we're talking about ChatGPT with trillions of parameters; this thing had thousands of parameters. Each of those regions is a unique decision, and we know from studying machine learning models that if you're in one region here and you move to that region over there, the model's

decision could change, and you don't know that it's going to be correct. There's this thing called adversarial examples, and this to me is a good representation of why you should be concerned about them. You don't know what's going on in most of these regions: you have two to the 60 regions, you've only sampled and know about your performance in 60,000 of those regions, and two to the 60 is much bigger than 60,000. You can't control all those things you haven't seen. So the first thing that you should always do is assume that someone's going to pop your model no matter what you do.
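
To make the counting argument concrete, here is a minimal sketch that uses a randomly initialised one-hidden-layer ReLU network as a stand-in (not the network from the slide). The layer sizes, the random slice, and the grid resolution are all assumptions chosen for illustration. It counts how many distinct on/off patterns of the 60 hidden units show up on a two-dimensional slice and compares the 2^60 possible patterns with a 60,000-point training set.

```python
import random

rng = random.Random(0)
INPUT_DIM, HIDDEN = 784, 60  # assumed sizes, roughly matching the talk

# Random hidden layer; only the hidden units determine the regions.
weights = [[rng.gauss(0, 1) for _ in range(INPUT_DIM)] for _ in range(HIDDEN)]
biases = [rng.gauss(0, 1) for _ in range(HIDDEN)]

# A 2-D slice through input space: x = origin + a*u + b*v.
origin = [rng.gauss(0, 1) for _ in range(INPUT_DIM)]
u = [rng.gauss(0, 1) for _ in range(INPUT_DIM)]
v = [rng.gauss(0, 1) for _ in range(INPUT_DIM)]

def dot(p, q):
    return sum(x * y for x, y in zip(p, q))

# Each pre-activation is affine in (a, b), so three numbers per unit suffice.
affine = [(dot(w, origin) + c, dot(w, u), dot(w, v)) for w, c in zip(weights, biases)]

def activation_pattern(a, b):
    """Tuple of booleans: which hidden units are active at slice point (a, b)."""
    return tuple(base + a * du + b * dv > 0 for base, du, dv in affine)

GRID = 60
coords = [-2 + 4 * i / (GRID - 1) for i in range(GRID)]
patterns = {activation_pattern(a, b) for a in coords for b in coords}

upper_bound = 2 ** HIDDEN
training_points = 60_000
print(f"distinct regions hit on a {GRID}x{GRID} grid of the slice: {len(patterns)}")
print(f"possible on/off patterns for {HIDDEN} units: {upper_bound:,}")
print(f"share a {training_points:,}-point data set can cover: at most {training_points / upper_bound:.1e}")
```

Even this small grid on a single slice crosses many distinct regions, and the final line shows how small a share of the possible patterns a training set of that size can touch.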

There's nothing you can do to stop it. All you can do is delay the pop, hope you see it when it happens, and fix it as quickly as possible. That is AI security in a nutshell: delay it so you have to deal with it less, and when it happens, hopefully you see it and respond as quickly as possible. That's it. There's not really any defending the model, as in "I'm going to put robust security, a big firewall," and things like that. A lot of the AI firewalls don't buy you time, and that's all you care about: will this buy me time? If you are introducing new systems, your questions are: will

this buy me time, or will this let me respond faster? Those are the two questions you ask and answer with AI systems. Now, this is more of a slide for the people who know what I'm talking about: "What about my defense?" People will bring up adversarial training. It doesn't really help. For malware models, I found that it seems to hurt the model more than it helps; it didn't buy me time and it didn't help. Adversarial detectors: it's very easy to build an adversarial detector that's overtrained on something like IBM ART, and my attackers aren't going to do

this. Adversarial layers, same thing. If you want a better understanding of how AI defenses work in reality, go see all of the work of Nicholas Carlini. He is a lovely man who loves finding poor academics who come up with an AI defense that they're very proud of and then just wrecking their [ __ ]. That's his favorite thing to do and he's very good at it, and so far he's won every match: he's defeated every defense he's put his mind to. So what have we done? How do we actually manage AI risk? One: this is the quote from one of the

founders of Facebook Site Integrity: "We measure ourselves in the speed we could detect and respond to new types of attacks and mitigate the potential damage caused." Facebook knows this. In Facebook's case, spammers will always find a mistake in the spam detection service and ruthlessly exploit it until it's patched. So that's their response, from 201207. For malware, this is one of those graphs where I was forecasting how bad things are. I don't need to explain the whole graph, but it basically says we get popped in about three months. This model was deployed on January 1st, 2019, and about three months later we know that there's a bypass because the blue

line went up, and so we would forecast it like that. So, some historic strategies. All is not lost: people have been doing this for 20 years and they've come up with some ways of dealing with it. The first one is robust features. Not robust models, but robust features. What I mean by that is you don't let the model see things that the user can control; you use features that are hard for the users to control. If you want to know a different word that people use for these things, embeddings are the new word, which is mathematically correct, but I'm used to calling them features.

For spam, it's basically: use the behavior of the spammer. For malware, use features that change the execution behavior, and this is very hard to do correctly. For LLMs, you want to use API key behavior. When you register for an API key with OpenAI, they're monitoring how you use it, and they will look for abuse not just in the content but in what you're actually doing. So I want to give you an example of how to screw this up very badly: "Cylance, I Kill You!" Remember the example of the very first vulnerability that

was disclosed in 2004? Well, this is from 2019. What they did was they created a PE classifier that's trying to decide whether the PE (portable executable) that you've just downloaded from the internet is malicious or not. You take your PE file, look at the header data, do a static analysis on it. Cylance grabbed about 7,000 features out of that and created an embedding, a feature vector, that they feed to a model that has been trained on hundreds of millions of PE samples, about 50% malicious and 50% benign. They train it to say good or bad, and if it says bad

it won't let the thing execute. But they had a problem: Rocket League and Fortnite would do shady stuff on your computer. They do all sorts of kernel exploits to make sure you're not cheating in Fortnite, and Epic Games does all sorts of kernel shenanigans to make sure that the DRM works and cheating is not happening. So their model had a lot of false positives on Fortnite and Rocket League. What they needed was a way of allowlisting all of Rocket League's binaries, but Rocket League patches their stuff every so often, and Cylance didn't want to keep updating their allow

list for each version of Rocket League, because what would happen is a new version would come out, it would get flagged on all their customers' machines with Rocket League, customers of theirs would complain, they would add it to the list, and they would just keep going. So to make this work they used these things called centroids. These features are a point in space, and they took that point in space and put a sphere around it, and they said: every single time I have a new binary, if its embedding ends up within the sphere, let it run. Which seems like a good idea.
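
A minimal sketch of a centroid allowlist of this kind, and of why it is fragile when the embedding includes content the file's author controls. The featurizer, the string tables, and the radius are all invented for illustration; the real product used thousands of static features, not this toy string histogram.

```python
import math

# Hypothetical string tables, invented for this example.
game_strings = [f"game_asset_{i}" for i in range(200)]
malware_strings = ["encrypt_files", "delete_shadow_copies", "ransom_note.txt", "c2_beacon"]

# One feature per known string: a deliberately naive static featurizer.
VOCAB = {s: i for i, s in enumerate(game_strings + malware_strings)}

def embed(strings):
    """Normalized histogram of the known strings found in a file."""
    vec = [0.0] * len(VOCAB)
    for s in strings:
        if s in VOCAB:
            vec[VOCAB[s]] += 1.0
    total = sum(vec) or 1.0
    return [x / total for x in vec]

def allowlisted(embedding, centroid, radius):
    """True when the embedding falls inside the sphere around the centroid."""
    return math.dist(embedding, centroid) <= radius

centroid = embed(game_strings)  # the point that represents the trusted game
RADIUS = 0.05                   # assumed radius of the allowlist sphere

plain = embed(malware_strings)
# Same malicious strings, with the game's strings appended as padding.
padded = embed(malware_strings + game_strings * 5)

print("plain malware allowlisted: ", allowlisted(plain, centroid, RADIUS))
print("padded malware allowlisted:", allowlisted(padded, centroid, RADIUS))
```

The padded sample lands inside the sphere because the appended strings dominate the histogram, even though nothing about what the file does when it runs has changed.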

Okay: I'm going to put a little tight ball there, and if I have the right static analysis, I'm only going to allow things that are Rocket League-like to run. Except they had string data in the features, and the string data was controllable: you can just pack whatever you want into the string data. So what Skylight Cyber did was they took the strings from Rocket League, appended them to some malware, just in the string data section, and then gave it to Cylance, and Cylance was like, "Cool, this is Rocket League, I'm going to let it run." So you could take any piece of malware, just

append a bunch of strings from Rocket League to the end of the executable, and Cylance would let it run, which is not good. Good features are hard to modify. The strings don't affect execution, so if you are good at your featurization you shouldn't use them, because they're too easy for the users to modify. So what do you use to get good features? For spam: IP address behavior. There are IPs that are just blocked from Gmail; Gmail will never receive an email from that IP address because it just sends too

much spam. That's a huge part of how they deal with spam. Now, in 2023 and 2024, they're even making you pre-register all your bulk emails to Gmail to make sure that you can't send spam. They are making sure you're behaviorally locked down so you can't do it. Also domain behavior. These are the key indicators of spam on social media sites like Quora; there's a long list of them and I'm not going to get into them. Malware: the very fact that it has to execute constrains your system. You have to make something that is able to execute on the

end user's machine, and that helps constrain the attackers. But one of the things is you can just pack your binary and then you'll get past the crappy detectors. Dynamic behavior is really what you want to use, but it's really hard to do, and I can talk about that some other time. LLMs: account API behavior. If you keep asking for violent, malicious things that OpenAI doesn't want you to do, they're going to see it. "Okay, cool, he's asked for a violent thing; that's normal. Users usually ask for a violent thing once a week, and I'm fine with that." But if you ask for violent things

repeatedly, over and over again, you get a warning saying, "I don't want to give you this." That's how OpenAI deals with it. Next thing for AI security: since you know you're going to get popped, you don't want to talk about it. Obscurity is security here. Not telling your users how it works means they don't know what they're doing when they're trying to mess with you, so it takes them longer to get past. So you just wouldn't want to say anything. A bad example: Proofpoint. Proofpoint used to reply with all that data whenever it got an email, and

that data is basically the outputs of your machine learning model, and it would just give it to you. Will Pearce and Nick Landers used these to steal the model weights, and then they could do all sorts of fun stuff, because Proofpoint gave them way too much information. If you steal the model in the right way, you can go make your own custom emails on your home computer, make sure that they get past Proofpoint, and then start sending those without letting Proofpoint know that you're crafting nice emails that can get past them. So Proofpoint removed those from the response. They weren't needed.
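
Why leaked scores matter can be shown with a deliberately simple sketch. It assumes a toy logistic scorer over a handful of words, with weights invented for the example, and recovers those weights by inverting the returned score. The real attack described in the talk trained a copycat model from the leaked scores; this linear inversion is a simplified stand-in, not that method.

```python
import math

# Hidden from the attacker: weights of a toy logistic spam scorer (invented values).
_SECRET_WEIGHTS = {"invoice": 0.4, "urgent": 1.3, "click": 1.8, "meeting": -1.1, "thanks": -0.9}
_SECRET_BIAS = -0.5

def filter_response(text):
    """What the gateway sends back: a verdict plus the raw score it should have kept private."""
    z = _SECRET_BIAS + sum(_SECRET_WEIGHTS.get(w, 0.0) for w in text.lower().split())
    score = 1 / (1 + math.exp(-z))
    return {"verdict": "spam" if score >= 0.5 else "clean", "score": score}

def logit(p):
    """Inverse of the logistic function."""
    return math.log(p / (1 - p))

def steal_weights(candidate_words):
    """Probe with an empty message and with one word at a time, then invert the scores."""
    bias = logit(filter_response("")["score"])
    weights = {w: logit(filter_response(w)["score"]) - bias for w in candidate_words}
    return bias, weights

bias, stolen = steal_weights(["invoice", "urgent", "click", "meeting", "thanks", "hello"])

def offline_score(text):
    """Attacker's local copy: drafts can be tested without touching the real filter."""
    z = bias + sum(stolen.get(w, 0.0) for w in text.lower().split())
    return 1 / (1 + math.exp(-z))

draft = "urgent invoice thanks meeting"
print("real score:   ", round(filter_response(draft)["score"], 4))
print("offline score:", round(offline_score(draft), 4))
```

If the gateway returned only the verdict, each probe would leak a single bit instead of a real number, and the same extraction would take far more queries.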

Nobody cared about them except for people trying to mess with Proofpoint, so they didn't need to be there. Good obscurity: this comes from a social media website that is not the biggest one in the world, but this is how they handled spam. They had a bunch of their user data in a data set, and then they had a bunch of small featurizers. One of those featurizers would be like, "Hey, where is this IP address from, which country is it from," all that stuff. Another featurizer would be how often these people log in. All sorts of different featurizers; they had way too many of them.

If they trained a model on all of them, it would be a very good model, but they don't want to make the best model for the next week; they want to maintain the service. So what they did was select a random subset of 50% of those features, just 50%, and train a model on just those features, so that the model couldn't see all the possible signals of how malicious users use their website. And because each model could only see a random subset, one that was different each time, each spam model behaved very

differently from the previous one or the next one, so the spammers who were frantically trying to spam this thing and get past it had to relearn a new model every week. One of the best AI hacking groups in the world is just the search engine optimization people. They get to know Google's search engine, the YouTubers get to know the algorithm, and the people trying to spam this website got to know the spam algorithm. So to prevent them from learning it, they made the spam algorithm change its behavior quite significantly from week to week, still with good performance.
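
The weekly rotation can be sketched as follows. The featurizer names, the secret, and the seeding scheme are assumptions made for this example; the talk only says that a random 50% of the features was selected for each model.

```python
import random

# Hypothetical featurizer names, invented for illustration.
FEATURIZERS = [
    "ip_country", "ip_reputation", "login_frequency", "account_age_days",
    "posts_per_hour", "links_per_post", "friend_accept_rate", "device_changes",
    "message_similarity", "report_count",
]

def weekly_feature_subset(week, secret, fraction=0.5):
    """Pick the featurizers that this week's model is allowed to see.

    Seeding with a secret plus the week number keeps the choice reproducible
    for the defender while staying unpredictable to outsiders.
    """
    rng = random.Random(f"{secret}:{week}")
    k = max(1, int(len(FEATURIZERS) * fraction))
    return sorted(rng.sample(FEATURIZERS, k))

def project(row, subset):
    """Drop every feature the weekly model should not see before training or scoring."""
    return {name: row[name] for name in subset}

subsets = {week: weekly_feature_subset(week, secret="rotate-me") for week in range(1, 9)}
for week, names in subsets.items():
    print(week, names)
```

Each weekly model would then be trained only on `project(row, subsets[week])`, so whatever a spammer learns about one week's decision boundary transfers only partially to the next.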

The performance each week was less than the best that they could ever get, but because it changed week to week, the spammers were constantly having to learn new things, were kept on their toes, and couldn't really get their footing, so the overall spam was lower. So just keeping things obscure and hard to learn is very, very useful, and that's part of why people don't know about the history of this stuff: one of the principles is don't talk about it. Last one: speed is security. As you know if you've done incident response, and I'm sure many of you have dealt

with incident response, speed is security in incident response. For AI systems you do a few things. Initial response to a new attack: a blocklist. You just need your fragile blocklist. In the Cylance example, if the list had been just a temporary thing that they ran for three days while they fixed the problem and redeployed a new model, that would have been great, but because they used it perpetually, it was not a good idea. There are all sorts of different ways to do blocklists. It doesn't have to be robust, it doesn't have to last a long time, it just needs to get you past the thing. Then there's this layered security principle of different types of

blocklists and different types of models you can redeploy. Say there's a new phishing strategy that people are using, with ransomware in there. If you can redeploy your spam model to prevent hackers from getting stuff through, that helps: retraining your malware model is probably more expensive than retraining your spam model, and retraining and redeploying spam models is faster. So you put a quick block on the thing, and having different types of models and things that you can do to be faster is more important. Layered security allows you to be

fast. Retraining time: do your best to make this as short as possible. If you've got a week to retrain your spam model and you have to redeploy it every three days, then you're screwed, so you've got to get that faster. And the last one, which is really hard, is detecting a breach. One of the problems with AI systems is volume: if you are CrowdStrike, you have billions of PE queries a day, and a new strain of malware could be hiding in those billions. There are not going to be that many samples of it, and one of your

customers could get really screwed up by that. So how do you find that new strain out of the billions? A single SOC is only seeing a small fraction of that, and might be able to handle seeing a new thing and raise alerts saying there's this thing going on. But once you're at CrowdStrike's level, where they actually have to maintain that model, it's much more difficult to find the needle in the haystack. There are ways to do this, and the advanced players in industry are well ahead of academia, government, or anywhere else. I interviewed with Facebook in 2018

and then I saw people describing, in papers from 2022, canary systems that Facebook was using in 2018. Academia and the outside world were four years behind what Facebook was already using. And they spend way more on the detection side of detection and response for machine learning models, because the response is just retrain, redeploy, patch, which is very easy to automate, while detection is very difficult. They spent way more on detection than anything else. The last one is learn from your attackers. Don't do random [ __ ] to defend against threat models that don't exist. Search engine optimization people are attackers: they have conferences, they read weird papers and

have theories, and you can talk to them and see how they work. If you understand how they work, you will see how a lot of the threat models at the major companies doing AI risk management actually work. The main thing is they try stuff until it works, and then they teach and sell it to each other. The Facebook spammers do that too: they try stuff until it works, and then they sell it to each other. Now here's one of my favorite examples. If you listen to a lot of these snake oil salesmen, they'll tell you they've got a way to prevent adversarial examples from

affecting your model. Cool. A, I don't believe you, and B, have you ever seen an adversarial example trying to attack your model? They take delicate, tricky maths to get working, and they don't really work for spam, malware, and tabular data like a lot of security data, because reasons. To do that you need a very good understanding of machine learning, which is expensive. Or you could do the wisdom of the crowd, which is cheap: you're already doing it, you already have thousands of people writing spam and malware, it goes out of distribution, and it's cheap. So if you had

to pick, if you were trying to attack a machine learning model, which would you choose? This is part of the reason why we don't see adversarial examples. We see people doing weird stuff instead: after EMBER 2018 was released and people found out malware models were sensitive to imports, people writing malware started shoving random imports into their malicious software, which was not a normal thing before that release, before people figured out machine learning models were sensitive to that. But that's because they learned it and just started trying it to see if it worked, not because they actually did something adversarial. Now, model poisoning is another

thing that people come up with. You can inject a very small amount of data into a model's dataset and it can drastically affect that model's behavior at a target point. So if you want a model to misclassify something malicious as benign, you can prepare the area by putting in a bunch of benign binaries that have the same static analysis features as your malware. You just keep doing that, uploading benign data for a year to prepare the area, and then when you release the malicious thing it can really screw with all those models, because they're not going to detect it.
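To make the "prepare the area" idea concrete, here is a minimal toy sketch. It is not taken from the talk: the two-dimensional points stand in for static analysis features, the nearest-neighbour classifier stands in for a real malware model, and every name and number is made up for illustration. It only shows how a handful of benign-labelled points planted next to a target can flip the prediction at that one point.

```python
from collections import Counter

def knn_predict(train, point, k=3):
    """Return the majority label among the k training points closest to `point`."""
    def squared_distance(item):
        (x, y), _label = item
        return (x - point[0]) ** 2 + (y - point[1]) ** 2
    nearest = sorted(train, key=squared_distance)[:k]
    votes = Counter(label for _features, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical clean training set: (features, label) pairs.
clean = [
    ((0.0, 0.0), "benign"), ((0.2, 0.1), "benign"), ((0.1, 0.3), "benign"),
    ((1.0, 1.0), "malicious"), ((0.9, 1.1), "malicious"), ((1.1, 0.9), "malicious"),
]

# The sample the attacker eventually wants to slip through.
target = (0.95, 1.0)

# Benign-labelled samples planted right next to the target ahead of time.
poison = [((0.95, 1.01), "benign"), ((0.96, 0.99), "benign"), ((0.94, 1.0), "benign")]

before = knn_predict(clean, target)
after = knn_predict(clean + poison, target)
print(before, "->", after)
```

In this toy the planted points only matter because they sit closer to the target than any legitimate sample, which is also why an attack like this needs the long, targeted preparation described above.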

How are they actually doing that? Are they going to spend that much effort? It's very difficult. People will talk about this for LLMs, but what's the point? Dilution is the solution to pollution. When you have torrents of data, like VirusTotal and a lot of the stuff going into LLMs these days, the aggressive deduplication and handling of data mostly cleans up your poisoning, so it's really delicate to pull off. I know there are going to be people who argue with me on this, and I'm happy to have the argument, but I don't see poisoning being that big a

deal on the large production models. If you have a small model that you have curated for a single thing, then poisoning could be an issue, but then you've got to figure out why you'd do the poisoning versus just trying something cheaper and simpler. Anyway, that's all history, because people have been talking about poisoning VirusTotal for a while. A lot of the stuff I talked about is something that, if you worked in machine learning security, building a malware model or spam

model or something like that, over the last 20 years, you would have learned on the job, and then you just don't talk about it to the public. We have people talking about it at AI Village a lot, so that's how I learned that it's a very common thing, and we would have shop talk about these things. But there aren't any books that tell you how to do this. The papers on AI security are mostly written by academics who haven't worked in the industry and don't think about the threat model like the professionals do, so it is different. So

that's one of the biggest problems with the industry: the people who really know how to do the security stuff aren't talking about it. So now, since LLMs, we have a bunch of people who've never done this talking about it, and they think they know what they're talking about because they read a paper from someone at MIT who also thought he knew what he was talking about and did excellent academic research, but not very good work on actually securing machine learning models. So, to give a recap and help you understand how things are going, I've come up with something like a Gartner

diagram. I'm going to throw darts at a board, and these darts are not going to mean much, but I'm hoping it explains some of these ideas. The first axis of AI security is how many attackers you've got. You can lower this by having verified access, API limits, and being less popular. I'm guessing your CEO doesn't want to be less popular, but you want to be low, with fewer attackers coming at you. So the only thing you can really do is make sure you only respond to queries from verified sources, and some of

those are impossible too. The second axis is how much control the attacker has over your model. The way you manage that is you pick harder-to-manipulate features, expose less public information, and present a moving target, and so you reduce attacker control by moving inward. Then you have your contour lines, which are lines of equal quality: a bad security system where you have to redeploy each week, a good one where you only have to redeploy each year, and the medium ones in between. So you really want to think about where you belong on this graph: how much control do my attackers have over my

model, and how many attackers do I have to deal with? So here are some of my Gartner darts. We have modern spam up here. They have to redeploy every three to seven days to manage things; especially on large social media sites they have to redeploy aggressively because there are so many people coming at them. But because they use a lot of behavioral data, it's really hard for spammers to change their behavior: by spamming less they make less money, so they really want to push that marker and get as much content on your site as possible. So doing stuff with behavior is

sort of hard for them to actually control. But they have ways around it and they do find new stuff, so you are redeploying every few days, because there are so many different ways to do this. Then we have malware models. There are far fewer people who can write good malware than can write a piece of spam for a social media site, so you're way down there on that axis, but the attackers have way more control over your system, because they get to write what the executable actually looks like. And then you have self-driving cars. It's actually really hard to set up an attack

for self-driving cars if you really think about it. These stickers don't work all that well; they do work, but if you actually think about it, what's the point of stickers? I'm happy to argue with you about where self-driving cars sit on this thing. Going back to the Cylance example: they screwed up. They introduced a learning vulnerability, in that they made it too easy for people to manipulate the model, and basically just moved left on this thing. So it was a vulnerability in how much control they gave people, and they didn't need to give it to them. And I would add to this prompt injectors,

they probably go here. You have way more control over what the input to the LLM is. There are people researching this and trying to change it, but you have more control than before. But honestly, how many attack surfaces are there where prompt injection will actually achieve a goal, like getting past a security control? I'll have more slides at the end, but how many actual prompt injections will cause a system to send an email or do something bad? We have theorized about this, but I hope there aren't many people who have actually deployed control systems where an LLM can do bad things. I

know. So, mature teams can't control how popular their service is, and better features are too expensive or impossible. When you first deploy your first machine learning model in the wild, you're probably going to be out here, because you're giving away too much control and you haven't got a good security posture. As you get more popular you're going to go up, but as you mature you should be figuring out better features and better controls and going inwards. There are some machine learning systems where you're not really going to be able to go inwards, but hopefully you can deal with that. You are hopefully moving as close as

you can to here. Mature teams will be as close to the origin of this graph as possible, and they can't get closer without spending too much money or telling their CEO that their service has to grow less, which is not going to go over well. So eventually they've gotten the model to the point where it is as good as it's going to get; the bypasses are just coming in and they deal with them. Then they stop working on making the model more secure and start working on how to respond faster. So that gets back to the quote

from the Facebook guy: we measure ourselves in the speed with which we can respond to new kinds of attacks and minimize the damage. That's really it. But what changed with Transformers? Three-dimensional Gartner graphs work worse than two-dimensional ones because it's very hard to visualize that on a slide, and the point of this thing is to have slides for your consultants to sell you things. So we have this new axis, but there are new threats. The GPT-4 paper has a long list of things: harmful content, harms of representation, disinformation and influence operations, privacy, cybersecurity. And what they

did in their paper was show a bunch of examples of how they mitigated some of these things. Anthropic released a paper in November of 2022 that had this graph of the different ways red teamers were successful or not successful in red teaming their models. Here are the different attempts that their red team of 111 people made; they're all just trying to get it to do bad things. This is early-days red teaming of a model. If you were making a malware model, you would get a bunch of reverse engineers, or people who understand how malware works, to red team your model just to

figure out where the holes are. This type of red teaming is the first thing you do to figure out where you need to patch and what you need to change. In Generative Red Team 1 we did the same thing, also with one-off examples. We had eight different models and 21 challenges, including credit card and economic misinformation. The credit card one was: get the model to tell you what the hidden credit card number is. This is related to a CWE that got released a couple of weeks ago. In a real system the LLM would not have

access to any credit card other than your own, so getting the LLM to leak your credit card information to yourself is kind of pointless. If a bank were actually to deploy this, they would not train the LLM on credit cards, and the pre-prompt would not have access to anything other than your own stuff. This is what people like Gavin Klondike, who is on the OWASP Top 10, recommend, and it is what the CWE recommends. Calling this prompt injection is like calling SQL injection a database vulnerability, when mostly it's a

vulnerability in the surrounding system. The other one: get the model to produce false information about an economic event or fact, where the false information has the potential to change things politically. This is one of the big threat models that OpenAI really freaked out about. In 2018 or 2019 they were drumming up a lot of press saying it was too dangerous to release GPT-2 because of the damage it could cause to society, and one of the things they kept bringing up was the amount of disinformation the model could produce. But the thing is,

cool, so you can make it hallucinate; in modern systems, what's the harm? At the same time, large disinformation operations are hiring people on the cheap to run these things. I can also download Mistral 7B and run it on my local hardware without dealing with your LLM or your service, and without violating your terms of service. So for that first bunch of harmful content, I can also make harmful content with cheap people online. ChatGPT making harmful content is more of a brand problem for OpenAI than an actual security problem. It's bad, we

shouldn't do it, but if you actually want to do this, there are other ways. One of the things people talk about is that it's going to help you write spam and phishing emails, but you can just take an LSTM, like my friend John Seymour did in 2017, and train it on Twitter data, and that makes excellent phishing spam that too many people clicked on. That was with a dumb model; you don't need the latest LLMs to do that. But there are two things that people are really kind of worried about: cybersecurity, because we're here at a security

conference, and the potential for risky emergent behaviors. So, security controls: they're actually kind of bad at writing malware, as far as my friends and I can tell. The fact that it doesn't really learn it but just does good remixing means it isn't good at it; this might change with the next version of GPT. It does help you be more productive, so you might get more malware out there in the system, but I'm not as concerned about this personally, and I'm happy to be argued with. Text-to-image models: this is one of those emergent behaviors that the creators didn't think

about, and for kind of good reason, because most humans don't think about this as a threat. But this is really the problem we have with these large systems: suddenly they can do something weird that we didn't think about, and there's a problem. So, CSAM: text-to-image generation could have been generating CSAM. In Stable Diffusion I know they put a lot of effort into cleaning things out, but one of the datasets that a lot of these models were trained on, LAION-5B, had 1,600 CSAM images in it, which meant those models were probably capable of generating CSAM. A model that's capable

of generating porn and has the concept of children probably can. This is one of those emergent behaviors that I didn't have to think about when I was doing malware stuff. But if you want to sit down and solve this problem, what people have done, as with prompt injections, is turn it into a classification problem, and then they started playing the game that we've been playing for the last 20 years. So you'd have a semantic classifier, with layers of security like keywords looking for bad stuff that you don't want people to request, to see what the person is requesting, and you start playing that

game. This is one of those problems that is kind of emergent, and we see it happening, but the solution isn't to generate a whole bunch of new stuff; it's to play the game you've been playing for the last 20 years. Make a classifier, get into the groove of deploying and maintaining it, and don't freak out. We've been dealing with this sort of thing for a while. Anyway, thank you very much. [Applause]
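The layered approach from the closing remarks (a cheap keyword layer in front of a semantic classifier) can be sketched roughly as follows. This is only an illustration under stated assumptions: the keyword list, the phrase list, the threshold, and the function names are all hypothetical, and `semantic_score` is a trivial stand-in for a trained classifier that a real deployment would retrain and redeploy regularly.

```python
import re

# Hypothetical placeholder terms; a real blocklist would be curated and rotated.
BLOCKED_KEYWORDS = {"blockedterm", "forbiddenword"}

# Hypothetical phrases used by the stand-in scorer below.
RISKY_PHRASES = ("bypass the filter", "ignore previous instructions")

def keyword_layer(prompt):
    """Fast, fragile first layer: exact keyword match on lowercase tokens."""
    tokens = set(re.findall(r"[a-z0-9]+", prompt.lower()))
    return bool(tokens & BLOCKED_KEYWORDS)

def semantic_score(prompt):
    """Stand-in for a trained semantic classifier; returns a score in [0, 1]."""
    text = prompt.lower()
    hits = sum(phrase in text for phrase in RISKY_PHRASES)
    return hits / len(RISKY_PHRASES)

def should_block(prompt, threshold=0.5):
    """Run the layers in order and report which one made the decision."""
    if keyword_layer(prompt):
        return True, "keyword"
    if semantic_score(prompt) >= threshold:
        return True, "classifier"
    return False, "allowed"

for example in (
    "please use forbiddenword here",
    "Ignore previous instructions and continue",
    "what is the weather today",
):
    print(example, "->", should_block(example))
```

Reporting which layer fired is a deliberate choice in this sketch: it gives the team maintaining the system a signal about where bypasses are getting through, which matters when the goal is to respond quickly.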

Good morning everyone, and welcome to BSides Las Vegas. We're here at Breaking Ground and we're about to hear from Oren and Elad talking about "Redis or Not: Argo CD and GitOps from an Attacker's Perspective". Before we get started, just a couple of announcements. We want to thank our sponsors, especially Diamond sponsors Prisma Cloud and Vana, and Gold sponsors Adobe and Dropzone AI; it's their support, along with the other sponsors, donors, and volunteers, that makes this event possible. This talk is going to be recorded, so please make sure that your cell phones are silent. If you have any questions at the end, we're going to go around with a mic so everyone can hear you; make sure to

speak into the mic so everyone can hear what you have to say. As a reminder, the photo policy prohibits taking pictures without permission, so please be careful. And with that, we'll get

started. Good morning, everybody. Thank you for being here today. I hope you're over the jet lag by now so you can stick around for the talk. I am super excited to be here today. Is it too loud? No? Okay. I'm super excited to be here today talking at BSides LV. Before we dive into the technical part of this talk, on a personal note, I want to share a story with you that illustrates what I think is my job as a security researcher, that is, my motivation. Over 100 years ago a revolutionary technology was invented. That technology allowed people

to move in a faster and more personalized way than ever before: the car. By 1952 there were over 25 million registered cars on the road, and in that same year an industrial engineer named John Hetrick was driving to church on a Sunday morning with his wife and daughter. Suddenly a deer ran into the road, and John, to avoid hitting it, turned the wheel and the car went into a ditch. John and his wife instinctively threw out their hands to protect their daughter from getting hit. Luckily none of the family members got hurt, but it did make John realize that the technology lacked a crucial safety feature, and in that same

year he invented the prototype of the airbag, a safety measure we now take for granted. Just as John didn't stop using the car but instead pointed out its safety flaws and suggested potential solutions, I believe that my job as a security researcher is to identify security failures in emerging technologies and suggest potential solutions. Today in our talk we'll present how this mindset came into our research on GitOps and Argo CD. So without further ado, our agenda for today. This talk will have four parts. First, we will explain why we identified GitOps and Argo CD as an emerging technology worth researching. Next, we'll have to study

these products: what is the GitOps manifesto, how does Argo CD operate, and what does it look like from an attacker's perspective? Next, for the main part of this talk, we will share the research story of how we managed to find a critical vulnerability in the Argo CD platform and what the implications are if it is exploited by an attacker. Lastly, we'll share some security takeaways for you to take home and implement in your organizations, drawn from our broader research on GitOps security. Now, you must be curious who I am by this point. My name is Oren. I've been practicing cybersecurity for seven years now. I used to work in Kerberos and networking research, and

today I work as a supply chain security researcher at Cycode.

I'm Elad. I've also been in the cybersecurity field for seven years. My main focus is web application security and supply chain security. I love hacking random stuff, and pretty much everything with an IP. I'm also a security researcher here at Cycode, together with Oren, and that's our team.

Okay, so let's start with identifying a trend: why we decided to research Argo CD and GitOps. You know how starting a new research project can be a super confusing experience, because there are all these new technologies you could research and all these shiny lights, and you don't know what to do. This time we tried to do it differently, the opposite way: we analyzed trends from major Kubernetes and cloud native conferences by the talks that were held there, and tried to identify a trend. And there was a word we weren't familiar with that kept coming up again and again in talks from

major companies. We saw "GitOps at Adobe", "GitOps at Spotify", and even "How GitOps Changed Our Lives" by a person working at VMware. Okay, so these are some serious companies talking about GitOps, but the next question is: do people actually use it? Regular people, organizations, do they use it? Apparently 91% of the respondents in a CNCF survey said that they already use GitOps in their organizations. Just to illustrate visually how crazy this number looks. And what's even crazier is that, out of the 9% holding back, over two thirds claim that they will adopt GitOps in their organization during 2024. Now it is August, so as for the question of whether people

use it: probably yes. The next thing we have to do is find the leading GitOps tool. We had read that GitOps is a way to do continuous deployment for cloud native environments. Okay, that sounds similar to supply chain security, so we could research that. In the same survey, in the leftmost corner at the top, we can see, with over 60% adoption, a tool named Argo CD. When we went on to research what that tool is, we saw on their website that some "small" organizations say they use Argo; maybe you know some of them. The last thing we had to do, just to

close this trend-identification story, as security researchers, was go to Shodan and search for public Argo CD applications. We found over 12,000 of them, which made us wonder how many private ones are out there. So, GitOps and Argo CD: was it worth researching? We believe yes. But let's study further what they are exactly. Now I'll let Elad lead. Thank you, Oren. Like every good research project, we started with studying: what is GitOps, what is Argo CD, and basically becoming Argo CD gurus, learning it from the bottom up and inside out, learning every feature and being the best at it, to even think that we can exploit

it. So, what even is GitOps? GitOps uses Git as a source of truth. The Git repository could be on GitHub, GitLab, Bitbucket, or any other SCM, and it stores all the configuration files: all the deployments, all the IaC files, everything that we want to deploy to the cloud goes straight into the Git repository. You're probably thinking, why save everything inside a Git repository? That leads me to the second point: versioning. You push new features to the cloud every day, continuously, maybe manually: new services, new ingresses, new load balancers. Then one day everything crashes, and you're trying to figure out what messed things up and what to

roll back. With GitOps that's not a problem: all you need to do is go one commit backwards and your whole infrastructure will be rolled back and fixed again. The bottom line is that the current state inside the repository that contains all those configuration files represents the state in the cloud environment; both of them are synced. To visualize this, let's see how it looks. On the left side we have all the configuration files for the cloud environment stored in a Git repository. Then we have a GitOps agent, the one that continuously watches the Git repository that we

configured and checks whether there are new configuration files to sync to the cloud. If there's a new commit, a new deployment, any new configuration file, it will instantly deploy that configuration file to the cloud, preserving the principle that the source of truth is the Git repository. From an attacker's perspective it's much simpler. The attacker wants to sit in the production environment, in the cloud environment, where all the sensitive data is. So what the attacker needs to do is break the GitOps agent. That leads me to the most popular GitOps agent, Argo CD. Argo CD is an implementation of

GitOps that focuses only on Kubernetes. It takes this whole concept of syncing a Git repository to an environment and applies it to Kubernetes: it takes all the deployments, ingresses, and services and syncs them to our cluster. You install Argo CD inside your cluster in a separate namespace and you're good to go; all your deployments are synced right into your cluster. It has thousands of stars on GitHub and it's a graduated CNCF project, which helped us understand why it's so widely adopted in the community. So, the same slide again, only with Argo CD: all the configuration files are in one Git repository, and Argo CD syncs them, deploys them to the cluster, and makes sure the cluster is up to date

with the latest changes. So that will be our focus today: Argo CD. After understanding a little how Argo CD and GitOps operate, let's see what it really looks like. We set up Argo CD with a default configuration, started pushing some buttons, and created a new application. First things first: connect our Git repository, the one with all our configuration files and deployments. Okay, we can see everything is starting to sync, and after the sync is done we can see the Kubernetes cluster is synced with the latest Git commit, and we have all of our deployments, all of our

infrastructure synced right in by Argo CD. The UI is nice, but we really care about what's happening under the hood, so let's understand what's happening there. We have a Git repository with a first commit, and the Kubernetes cluster is synced to the same commit. Then one day a user comes and submits a new commit, which can contain any Kubernetes configuration file. The repository service, a new component along the way, is notified about the Git commit and makes sure to update a Redis cache server. Then a Kubernetes controller takes those changes from the Redis cache server and updates the cluster. So at the end of that process we

can see that the HEAD state of our Git repository is synced to our Kubernetes cluster. After understanding all of that (what GitOps is, what Argo CD is, why it's so popular), let's move on to the exploitation phase. A huge component that we saw before is the Redis cache server, the one that holds all our deployments and stays up to date with the latest changes. Going back to the documentation and the learning phase, we tried to investigate that specific Redis cache instance a bit. We went to the documentation and saw something we really liked, the following sentence: secrets are available to anyone who has access to

the Redis instance. That's nice: we love secrets, and we have unauthenticated access, so together it's magic. We tunneled our way in and tried to connect to the Redis instance, and it succeeded. Then we poked around inside the Redis instance to see what it contains and what data we can find, and we saw a very particular manifest key. Going to that key, we saw that all the deployments from the Git repository, all the secrets, all the ingresses, basically all the configuration files from the Git repository, are right in there. That made us think, as security researchers, about how we can interfere in that

full flow from Git repository to Kubernetes cluster. Can we maybe inject a deployment straight into the Redis cache server? Will it succeed? So that's our main thought here: poison the Redis cache server. We built a malicious deployment that would be privileged, with full access to the node hosting the Kubernetes cluster, to the file system, to the network adapters, everything, and all it would do is create a reverse shell to an attacker's server, meaning we expected to get a connection to our server with privileged permissions. We poisoned the Redis cache server and we waited. We waited a couple of

minutes, and it failed. We didn't get any connection. It's very sad. Maybe we'll start another research idea, maybe move to another platform, I don't know. After a couple of minutes of researching, we find out why we didn't get any connection: our changes were rolled back, and nothing has changed in the Redis instance. So we go back to that DB, and we see that nothing changed, nothing at all, and the Redis cache server has been synced again with the repository, in that case a Git repository. A few minutes go by and we understand: we, an attacker inside the cluster in another namespace, try to

deploy a malicious deployment by injecting a configuration straight into the Redis instance. That's okay. Suddenly the repository service gets notified about the change, and what it does is resync everything from the Git repository to the Redis instance. So everything is back to its normal state, and we didn't manage to inject our malicious deployment. So what now? Are we leaving it at that? What will we do? Will we try to use the regular flow from the Git repository? Orin will explain what happens next. Thank you a lot. You're passing me the mic at a complicated stage of this talk. But what we did try to do is go back to the application manifest and see if

we missed something. And this time we noticed an entry called cache entry hash, and that entry held some kind of string that looked like Base64. So we were thinking, could it be some kind of validation mechanism, like a checksum for the application manifest content? Luckily, Argo CD is an open source application, so we could go to the source code and look for the function generating that cache entry hash. And in the source code we found the function called generate cache entry hash. That function receives a structure of the application manifest and returns the string of the new cache entry hash. Perfect. If they went through all the trouble of creating a validation

mechanism for the content, it's probably going to use some kind of private signing mechanism, right? With a private secret. This is what we thought. But then we found that comment inside the source code, and I'll read it. They say: hash the JSON representation into a Base64-encoded FNV. That was written inside the generate cache entry hash function. And get this part: we don't need a cryptographic hash algorithm, since this is only for detecting data corruption. I swear to God, I haven't changed anything. So we were super happy, because that means that we could manually recalculate the cache entry hash in order to sign our own malicious deployment, right? So this is what we've done. So our second

try to inject a malicious deployment: we inject a Kubernetes deployment containing a pod with all the capabilities granted, meaning a privileged pod, and its whole purpose in life would be to spawn a bash shell to our attacker server. But this time we signed our own application manifest using the logic from the source code. We waited for the changes to take place, and this time it worked. We managed to succeed, and on our attacker server we could see a new connection from within the client's cluster, granting us privileged access to the cluster. And from here the sky's the limit. It's like every attacker's dream come true. When we went back to the Argo CD application, we saw that a new record was

added, and when we took a closer look we could see that a new resource was created, named attacker. Spoiler: this is our malicious deployment. And theoretically, if you're worried about getting detected, we could inject our malicious deployment, perform the attack, and then delete everything, so we wouldn't leave any trace and we wouldn't get detected. Now, how does it look from an attacker's perspective? What could an attacker have done with that privileged access? First, they could access the file system of all the pods within the cluster, all the production pods. We could steal their data, which means that we could also steal their tokens, their Kubernetes tokens, to perform actions with their privileges on

their behalf. Also, an attacker could steal all the Kubernetes secrets in the cluster, because he's able to deploy whichever resource he desires, and thus he can mount the secrets. Next, you're probably familiar with this application. It's Wireshark, and the sniff you're seeing is a sniff of a production pod, taken from our malicious injected pod. Since we've granted network capabilities to our malicious pod, it could sniff on the network adapter of the host node and read all the production communication, and it could also intercept and perform man-in-the-middle attacks on the pods in the production environment. And if you're not Kubernetes fans, don't worry, we have something for you as well.

Well, we could use those privileges to steal the IAM role of the host node, and from here starts our journey of lateral movement inside the client's cloud environment. So that attack vector, that vulnerability, really allows an attacker to have it all. To sum it up, the attacker's new capabilities would be: he could steal all the secrets in the Kubernetes cluster, he could sniff all the network communication and intercept it, he could deploy whichever Kubernetes resource he desires in the cluster, and he could escape to the cloud environment. Now imagine that any pod in the cluster could perform this attack. Any pod, any compromised pod in the production environment, could access the Redis server and take complete control over the

cluster. We disclosed that vulnerability to the Argo CD team, and it got a critical score of 9.1. We were super happy. It was the highest score we ever got, so we were very content. And to recap the exploit part, we have learned that any compromised pod in the cluster could access the Redis instance of the Argo CD application, which resides within the Argo CD namespace. It could retrieve the application manifests of Argo CD, steal the secrets if they exist in there, and inject whichever malicious Kubernetes resource it desires, effectively leading to cluster compromise. Okay, that part was intense. As for the disclosure, it's important to me to give a shout-out to the Argo CD

development team, to Michael, Leonardo, and Pav, for helping us mitigate this vulnerability and make Argo CD safer for everyone. As for security takeaways, I have four things for you today. The first one is pretty straightforward: update your Argo CD application, that's an easy one, to a version which is not vulnerable to that attack. The next three are general GitOps security takeaways, drawn from our broader research on GitOps security, and these are as follows. First, deploy your GitOps tool in a separate cluster, since if the client or someone had deployed Argo CD in a separate cluster, a compromised pod in the production environment couldn't have

accessed the Redis server. The same logic goes for the second takeaway: enforce network policies. Network policies in Kubernetes are like a firewall or iptables, and by enforcing an allow list that permits access to the GitOps resources only for resources that should access them in the first place, you could have mitigated that vulnerability easily. It's important to make sure you also have a CNI plugin to enforce those network policy rules. And the last one is the golden rule. It's a principle for cloud environments in general, or any environment actually: deploy the GitOps tool with the least privileges possible. Instead of deploying it with the default

admin permissions, it's better to customize it to the privileges that you actually need in your organization. Now let's watch a live demo that was recorded before. Here we can see our attacker server. We're listening on port 50 852. This is a normal EC2 machine, and we will wait for a connection from within the client's cluster. And this is the compromised pod. It could have been compromised by a web shell or whatever. It is a pod in the production environment, and it has low permissions in the cluster. It is not supposed to be able to escape its container, nor to escalate privileges inside the cluster. And next we will see the Argo CD

application, and this is the same structure as we've seen in the presentation. And right there on the right you can see this pod. This is where we are. Again, it's a production pod that has been compromised, with some low privileges in the cluster. Perfect. Now let's try to exploit that. We have written a tool to exploit that vulnerability, and it has two modes. It has the detect mode, which allows us to detect the Argo CD Redis server and the server, and it has an exploit mode. In detect mode, it will try to resolve the FQDN of the Argo CD Redis server against the Kubernetes DNS server inside the cluster. That way you can know the IP

address, and we can see that the Argo CD application's version, sorry, the Argo CD version, is prone to that attack. The version is a public API. Next, the exploit. The exploit requires two flags. The first one is the malicious deployment to inject. It's the Kubernetes YAML infrastructure we want to inject, and we used a malicious deployment taken from Bad Pods, a public project written by Bishop Fox, so thank you very much for that. It contains a pod with all the privileges that will connect to our attacker server. And we will also give it the Redis IP address. When we hit enter, we can see that it found one application manifest, it injected the malicious

deployment and recalculated the cache entry hash, and the attack completed successfully. And when we go to the Argo CD application web view, we can see that a new pod was created, that attacker, which is our malicious one. And now, when watching our attacker server, you won't be surprised that we have received a new connection from within the client's cluster that allows us everything on the cluster, to compromise the cluster as a whole, and that is without any privileges in the cluster. I'm happy it worked that time. The recorded demo was a good idea. That's it for today. If you're interested in more technical details about the research, or some more detailed remediation

guidelines for that vulnerability, please read our blog post on the vulnerability. And on a personal note, I had a great time. Thank you. We're very enthusiastic about supply chain security, so if you want to collaborate, or you have any ideas or questions, please feel free to DM us on LinkedIn and we'll be more than happy to answer. And if you have any questions now, it could be a good time. We have time, right? Yeah, perfect. Thank you. [Applause] If anyone has any questions, just raise your hand and I'll walk around with a mic so everyone can hear. We'll also stay after the talk, if it's like a stressful position. All right, thank you very much.

Enjoy the rest of your day.
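
The cache entry hash described in the talk can be illustrated with a short sketch. This is not Argo CD's code: it assumes the 64-bit FNV-1a variant and URL-safe Base64, and the `cacheEntryHash` field name, the JSON layout, and the helper names are all hypothetical. It only shows why an unkeyed checksum catches accidental corruption but cannot prove who wrote an entry, since anyone able to write the entry can also recompute the checksum.

```python
import base64
import json

# Standard 64-bit FNV-1a parameters.
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """Return the 64-bit FNV-1a hash of data (non-cryptographic)."""
    h = FNV64_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & MASK64
    return h


def cache_entry_hash(entry: dict) -> str:
    """Checksum an entry: JSON of everything except the hash field itself.

    The field name and serialization are assumptions made for illustration.
    """
    body = {k: v for k, v in entry.items() if k != "cacheEntryHash"}
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = fnv1a_64(payload).to_bytes(8, "big")
    return base64.urlsafe_b64encode(digest).decode("ascii")


def is_intact(entry: dict) -> bool:
    """True when the stored checksum matches the recomputed one."""
    return entry.get("cacheEntryHash") == cache_entry_hash(entry)


# A cached entry as the legitimate writer would store it.
entry = {"manifests": ["kind: Deployment\nname: web"]}
entry["cacheEntryHash"] = cache_entry_hash(entry)

# Modifying the content without touching the checksum is detected...
tampered = dict(entry, manifests=["kind: Deployment\nname: something-else"])
detected = not is_intact(tampered)

# ...but no secret is involved, so the writer can simply recompute it.
tampered["cacheEntryHash"] = cache_entry_hash(tampered)
accepted_after_recompute = is_intact(tampered)
```

A keyed construction (for example an HMAC whose key is not stored alongside the data) would close that gap, because recomputing the tag would then require the key.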


Good morning everyone, and welcome to BSides Las Vegas Breaking Ground. This talk is The Fault in Our Metrics: Rethinking How We Measure Detection and Response, given by Allyn Stott. Before we begin, just a couple of announcements. We want to thank our sponsors, especially Diamond sponsors prism cloud and V in our gold sponsors project circuit sem group. It's their support, along with our other sponsors, donors, and volunteers, that makes this event possible. This talk is going to be streamed, so please make sure that your cell phones are turned off. If you have any questions at the end, I'll be walking around with a mic. Be sure to talk into the mic so everybody can hear

you. Also, there's a strict photo policy, so please make sure not to include other people in your photos when you're taking pictures. And with that, let's give it over to Allyn. [Applause] Hey y'all, thanks so much for coming to my talk. I've worked in detection and response for the last decade, and I've made a lot of mistakes, especially when it comes to metrics. This is the talk I wish I'd seen. Today you'll get three things: you'll get a framework I built to help you build much better metrics, you'll get a new maturity model that I've been using to describe and measure detection and response capabilities, and you'll get lots of examples. So my story with metrics starts

on a Monday morning. I'm only a few months into a new job, and I get a message from my boss. He's like, yeah, you know, the board of directors meeting is coming up and I'm looking for updated program metrics. And you can tell I'm new to senior management. I'm eager to please. I don't ask any questions. So I send a message to my new team and I ask them, hey, what have we presented in the past? Yes, what's the response? Oh no, bad news: the last manager just made those up. And good news: I'm going to do so much better. How many of you have had this happen, where you inherit someone else's metrics? Yes. This is often our starting place:

metrics that haven't been well thought out, or maybe even worse, fudged to avoid questions or more work. Yes. So I did what you probably did: I Googled it. And then I ended up just copying the metrics I used at my last job, and that's led me to using a lot of bad metrics. But so what? Why should we care about metrics? Well, you all attended a talk that had metrics in the title. Why do you care about metrics? It can drive change. It can drive change, show things. It shows you're doing things. What would you say you do here? Budget. Yeah, I need more money, I need more headcount. Yeah. One reason might be, yeah,

metrics, they're supposed to drive improvement. Karl Pearson, he's a late 1800s, 1900s guy, widely viewed as the founder of modern statistics, and he's got this quote he's famous for: that which is measured improves. Sounds like a great plug for metrics, but there's an implied warning in that message. What if I'm measuring the wrong thing? There's this paper written by two guys out of MIT, Hauser and Katz, and the paper's called Metrics: You Are What You Measure. They talk about how, as you pay more attention to metrics, you start to make decisions and take actions to improve those metrics. The metrics you choose are improving, and over time you'll become what you measure.

Metrics also help us communicate what we do and why people should care. Edward Tufte, who by the way teaches one of the greatest classes about presenting data, it is not cybersecurity or infosec related whatsoever, he's got an entire section that talks about terrible PowerPoints, so it's a very fun class. And he's got a quote that says metrics reveal data. Metrics are a tool that enable us to present the greatest number of ideas in the shortest time with the least ink in the smallest space. And why? Well, if we're being honest, because we need budget, we need headcount, and metrics are usually the tool we use to communicate that. So why are security metrics hard? Why do we struggle with security

metrics? We don't know what success looks like. We don't know what success looks like. What else? I've heard people say it's because we're trying to prove a negative, right? Like, nothing happened: success. In my personal experience, security metrics are hard because I'm a security person; I don't care that much about metrics. Here's a much less famous quote: metrics are an annoying PowerPoint I need to update every month. That's me. A bit about me: I'm a senior staff engineer. I work at Airbnb. I work on fun things like enterprise security, threat detection, and incident response, and I really love my job. I live in Austin, Texas with my wife and three-year-old son Liam. Here he is.

Little smile there. And I really love being a dad and a husband. And there's one thing I'm really good at as a husband, as a dad, and as a security engineer: I'm really good at making mistakes. This is the point of the talk where I'm supposed to gain some credibility with all of you, tell you about my accolades, my 15 years of experience, but really I've just been making mistakes. Let me tell you about five of them. The first terrible mistake I've made is losing sight of the goal. How many of you are on call or work an alert queue of some kind? Yeah. How many of you are on call right now? Yes. What a terrible idea,

right? Those are the tired people in the room, by the way. This year marks my 10-year anniversary of being on call, and for those of us that spend our days triaging alerts and responding to fires, it can be really easy to lose sight of the goal. And so we end up describing that frontline operational work with metrics like this one. Here's a metric that shows the number of security alerts per month. How many of you have seen this metric before? Yeah. How many of you have this metric today? Cool. And if you take a closer look, you can see that in the past year March and April had the most alerts. My boss will ask a question about

that. And if we keep looking at it, it looks like alerts are generally trending down. Did we do that? Did we stop logging something in February? Did I get really mad at the IPS alerts and was just like, we're turning them off? Alert count has become the heartbeat metric for security operations. Instead of rooting back to the goal of detecting threats and responding quickly, we've reduced ourselves to cries for help. I've come to call this metric the operational burden we've inflicted on ourselves. You'll notice this graph has no numbers on the y-axis. If it said 5,000 versus 500 versus 5 million, does that mean anything to anybody, especially the people that aren't working the alert

queue? Another title for this might be: we're doing things, it's crazy out there. Maybe it's fear driven: scare leadership with a bunch of alerts. And sometimes we try to make it a bit better. We break it down by true and false positives. I've been proud of myself for doing this, but if I'm honest, I'm not really sure what I was trying to say with this metric. That we have a lot of false positives? So what's a good true-to-false-positive ratio? Is it the same for every alert type? If I reduce this number, would it mean that I have decreased my visibility into threats? If I have too many false positives, does it mean that

I'm possibly missing true positives? And so the first problem I'm running into is I don't know where to start with metrics. Detection and response has significantly matured as a field, but I'm stuck here making metrics about alert volume. So I needed a starting point. And to give you a starting point, I thought about what in detection and response we could measure to help us make decisions and see if we're improving. The acronym here is not great: it's SAVER. The S of it is we want to show that we're streamlining our operations by improving our efficiency and accuracy through automation, through better tooling and processes. So that's one area for metrics. We want to raise awareness about

what we're learning from threat intel, sharing things like what threats and trends we need to be prepared for. We want to measure our vigilance: how prepared are we for those top threats? Can we detect them? And as we learn about new threats and trends, how is that guiding our threat hunts? As we explore our networks, what are we finding? And when our detections fire or our threat hunts turn into incidents, what's our readiness? How quickly are we able to organize and respond to incidents? How complete are our playbooks? So when you're thinking about your own metrics, think about which SAVER category the metric should fall under, and this can help you tie it back to an

outcome. And I like to start with just one metric in each category. We often get asked to make a lot of metrics, and that doesn't help us focus. And so for each metric we should ask: what question does this metric answer? So what question were we trying to answer with this metric? I think it was: are false positives taking up too much of our time? Or, phrased a different way: do I have enough time to properly investigate my true positives? Another important question to ask when you're looking at a metric is: how do I control this metric? So how do I reduce false positives? Alert tuning. How's that going

for you? Yeah, about that. And if I map this to my SAVER categories, this is maybe a streamlined metric, and streamlined metrics usually answer questions about efficiency, accuracy, and automation. So I have two big problems with this metric. First, this metric doesn't tell us where we're spending most of our time. We think it does; intuitively it makes us feel like we're spending most of our time on false positives. And second, the only control I'm rewarding is tuning or turning alerts off. So how can we make it better? Here's a graph of time spent on false positives. For now I've completely removed the true positives, because for now I'm okay if we spend

time on those. And instead of tracking how many false positives there are, I'm tracking how much time we're spending on them. Now, how do you track this? That could be as simple as, well, in whatever alert system you're using, the time from when the alert gets created to when it gets either assigned or triaged. And there's an inherent problem with that. If your team's anything like mine, you know, if I'm on call, I'm doing the triage, and I've got, you know, 10 alerts sitting there in the queue, what's the first thing I do when I see all those? Do I go to each one individually and start working on them? No, I select them all and I assign them to

myself. And why do I do that? What metric? Time to respond. Time to respond, time to triage, our SLAs. It's important to remember, when you're thinking about a metric, that you've got a bunch of hackers, and they're going to figure out how to make this metric improve regardless of whether what they're actually doing is improving the process. So I recommend not measuring it for a while. You want to have an accurate time for how long an alert triage takes, and measuring something that inherently makes people do the thing that gives you poor data doesn't make any sense. So how do I control this metric? How do I improve this metric? Well, automation, maybe. And as we

get more automation tools, the number of events may not even equate to how much time we're spending on false positives. And as you automate, you can carry over the time you used to spend on that. So this lets you do something really cool: you can actually speak to the number of human hours your automation efforts are saving you. So now folks aren't just incentivized to tune or turn off alerts; they're incentivized to find out where we are spending the most manual time so that we can automate it. My second mistake is using quantities that lack controls, or more simply said, measuring the things you can't change. Mean time to recover is a classic incident response metric. It'll be in

your Google search. In this example you'll see that recovery time was lower in September and October, and then it grew in November and December. But then the team pulled together, we did some real good process management, we improved our tools, we worked really hard, and we got those recovery times back down. Or maybe there are two, three, four major holidays in November and December. It's funny, I've spent the last year researching metrics for detection and response, and I've learned something: we're obsessed with speed in incident response. The vast majority of results when I search for detection and response metrics are about mean time: time to detect, time to respond, time to contain, time to recover. And I'm not going to

argue that speed isn't important, but using time as the sole measurement across incident phases completely ignores quality and effectiveness. But my big problem with this metric is that security incidents have a lot of variability, especially the further you get downstream in the response process. There are a lot of dependencies from event start to recovery, and not all of them can be controlled, especially by our teams. So a graph like this doesn't help me make decisions, because I don't know what's controllable here. I don't know what my team needs to do. I don't even know if this is good or bad. And what happens when your teams don't know how to improve a metric? You stop caring about it,

because you can't affect it. So instead I've broken out the response times across all the different phases, and here I have filtered out any built-in time that I need for quality. I like to do this where every response playbook I have has some expected built-in time. Sure, as you mature your capabilities, that built-in time will come down, but that's not the focus for this graph. Here we're looking at what we can control today. Eric Brandwine from AWS gives this talk called The Tension Between Absolutes and Ambiguity in Security, and in it he says that when you look at a metric, it should immediately answer: what do you want from me? What do you

want me to do? And one of the easiest ways to do that is to make the answer zero if there's nothing to do. So here I've filtered out all the time we can't reduce right now, so if there's nothing for us to do, we've made the answer zero. Now I can look at these metrics and I know exactly what they want from me: go look at the incident in December and figure out what happened in the remediation phase. Then we can either filter out more of that time, because it needs to be built in, or we can make improvements in our playbooks. My third mistake was thinking

proxy metrics are bad, or more simply, choosing amazing metrics that are insanely expensive to create when all I really needed was a metric that was good enough. So here's a great example. A long time ago, my team and I decided that we wanted to know what our MITRE ATT&CK coverage was, and this was before this was the cool, sexy thing to do. We determined that we needed to write tests across the entire framework, and once we got going we figured out, okay, we don't need just one test per technique, that won't tell us much, and also we've got Windows, Mac, and Linux, so we're probably going to need a couple of tests for those. And so

after years of developing tests and investing in tooling, we finally had the data to visualize our ATT&CK detection coverage. Side note: I saw a really great tweet the other day. It said we need to do a better job of mocking vendors that claim 100% MITRE ATT&CK coverage, for lots of reasons, but most importantly, it's silly. I've seen the carnage that 100% MITRE ATT&CK coverage is, and it's alert fatigue like you wouldn't believe. Anyway, we spent years gathering all this data, and it is very cool, but at the end of the day all we really wanted to know was: where do we prioritize detection building? So do this instead. Rather than trying to measure your detection

coverage across the entire ATT&CK matrix, start by finding the top five threats you care about the most. Don't overthink it. Look at your external threat intel, and think about what industry you're working in and what type of environment you have. Then look at your incident trends: what kinds of events, what kinds of incidents are reoccurring? And then link those back to your organization's security risks. What would be a really bad day for your company? If data was exfiltrated, what data would make your chief privacy officer cry the most? It's a great metric. You can visualize it by a growing tier as well. And once you've got your top five, prioritize your detection development from there. I like to workshop these

as a team, where everyone takes one of the top five threats and then we use ATT&CK to derive all the different techniques and sub-techniques. And as you write tests and detections, you'll slowly end up building yourself this prioritized MITRE ATT&CK coverage map of things you actually care about, but without all the alert fatigue and without having to build this super costly metric. And plus, now you might be best friends with your chief privacy officer. My fourth mistake was not adjusting to the altitude. As someone who has floated back and forth between management and individual contributor, I'm very guilty of this one. Who here has tried to explain all the different columns of the MITRE

ATT&CK framework to a board of directors? Yeah, I see a couple of guilty hands. I have. Sure, why not, let's do it. Detection coverage is actually one of the better new metrics that we've come up with, but wow, we've done a bad job of explaining it at the leadership level. I've seen one of those MITRE ATT&CK heat maps from a specific vendor just slapped into a board of directors deck as if it meant anything to them. So we need metrics at every altitude, and the higher the altitude, the less it becomes about the specifics of detection and response and the more about the impact to the business. It's helpful for me to think about it like a pyramid. For the

business, the impact we make is reducing the cost of an incident or a breach, or another way to think about it might be how costly we make it for an attacker to cause impact. And so our metrics at the top of our pyramid are about mean time to detect and alert the organization about a threat, and how quickly we can respond and get business back to usual. But then under that top layer is our coverage and effectiveness. Can we detect the top threats to the business? Do we have playbooks for the attacks most likely to happen? Do we have the visibility we need? And then under that layer: how well do our tools perform?

How much time do we spend trying to figure out what logs we need to search, and how long does it take to search them? Organizing your metrics in a pyramid can help you connect the lowest layers to your North Star metric and speak at the altitude that's appropriate for your audience. Organizing them in a pyramid can also help you connect your metrics to the rest of the security organization. It turns out detection and response isn't always the best strategy. If your metrics show that mean time to respond is trending up because of a repeating type of incident, sometimes the best way to reduce the cost isn't by improving your streamlined or your readiness metrics; it's putting a new control in

place to prevent that type of incident from even occurring. I really like to do this, especially because I get to work across lots of different teams. I get to work across detection and response, and then I get to go and visit all the teams that do prevention, and you would be surprised how little we're telling those other teams. They have no clue what's happening over in detection and response. They don't know what prevention and controls they should be prioritizing based on what's happening in the real world, and it's our job to help inform them so that our lives get a lot better. And my last mistake was asking why instead of how. My natural inclination is to ask why.

Why didn't we detect that malware sooner? Why are we still missing those firewall logs? And as a dad, I have a lot of why questions. Why did we bring the car seat when we only took one taxi ride the entire trip? Why do we need four suitcases? Why didn't we bring the stroller? And why can't Liam walk by himself? Why can't you walk by yourself, Liam? But in all of these examples, why is not helping. And so instead I've learned to move straight to the how and start figuring out what needs to be done. Often, answering how allows you to identify the underlying problem much faster and from a much more positive perspective, especially from your spouse,

I mean coworker. How can I carry Liam, a car seat, and two suitcases through the airport? How can we detect these types of threats sooner? How can we respond faster? When I interviewed with my current VP, she asked me: how do we build a modern detection and response program? How do we get there? It was a one-question interview, and it made me think about maturity models. My first exposure to maturity models was the Hunting Maturity Model, HMM. Who here is not familiar with the Hunting Maturity Model? Okay, so it came out in 2015-ish. It was created by David Bianco, who's also the creator of the Pyramid of Pain, and HMM was great when it came out and it's still great

today because it helps describe the different levels of maturity for a threat hunting program. What do we need to get to the next level of maturity? How do I get there? What specific indicators would put me there? What type of activities would be expected at different levels of maturity? Maturity models are useful from the standpoint that they give us, as security practitioners, a common language to answer where we are now, where we're going, and how we're going to get there. So I created the Threat Detection and Response Maturity Model. The TDR maturity model builds off of the Hunting Maturity Model and expands it across all the different areas of detection and response, and there's a lot

to it, so at the end I'll provide a link with the full maturity model that you can use. The first pillar I thought about when measuring maturity was observability: do I have the tools and logs to get visibility into our entities and user activities? Can I enrich it so it's contextualized and searched quickly? Then proactive threat detection, where we focus on collecting threat intel and prioritizing the detections we build and buy and the hunts we perform. And then finally rapid response, where we prepare playbooks and automations so we can move from triage to analysis and respond with all the forensic capabilities we need. We can use these pillars and these 14

capabilities to describe and measure where we are today and where we want to go next. For each of the 14 capabilities in the framework, you'll score four different areas: process, tools, documentation, and testing, and you'll rate those from Initial all the way up to Leading. I've provided some general guidance on how to rate the maturity of each area, but in the framework itself there's a lot more specific direction for each specific category and capability. So for example, if we rate our detection engine capability, we think about what processes we have. Do I have a process for creating a detection that looks for first-time occurrences? Do I have a process that defines the most optimal

way to determine those thresholds? And then we rate our tools: are the detections we have managed from a central location? And then documentation, or, what's been the case for most of my career, the lack thereof. And then finally there's testing, and we all know what happens when you don't test things. Well, bad stuff happens. As you go through each of the capabilities and rate them, I like to rate them individually and then afterward rate them together with your team, because once you start talking about the different capabilities, you'll hear things that change your mind or confirm your own rating, and you can have a really great discussion about where you are with each

capability. Then you can take those ratings and show at a high level where you are today across the three pillars and where you plan to be by, say, the end of the year, based on the projects you're planning and the initiatives you're doing. I like to use this tool because it's a very simple message for leadership, but there's a lot of underlying detail that backs it up that you can zoom into depending on your audience. I also really like to use it because it shows whether the work that you've planned is going to have impact or not. If you've done all your project planning and you're like, cool, I'm going to work on

this, and then you're projected to not move at all, maybe you should rethink the things you're prioritizing. What's also really cool is that a number of companies and folks have been starting to use this, and I can now start to baseline where different sectors are and understand where I should expect myself to be and where folks have struggled in moving their maturity. This is the nice thing about using a framework: we're talking about the same capabilities and problems using a common language, and we can talk specifically about what went well and what didn't in our journey to get there. So as you do this work, you'll need metrics to show: are you getting

better? Here's where SAVER comes back in. For each metric you create, you'll put it into this structure. You want to avoid my first mistake, losing sight of the goal, and ask: what question does this metric answer? What's the outcome we're looking to achieve? And then what SAVER category do I tie it back to, to help drive that outcome? You want to avoid my second mistake, using quantities that lack controls. Make sure it's a metric you can actually control, and don't forget: make it zero. Filter out what you can't control today, so when you look at a metric, you know exactly what it's telling you to do. And then, if you have control of a metric,

what risks could this measurement reward? I was talking to a buddy of mine, and he runs one of those really big SOCs, like the kind with the huge room and the monitors on the wall and the pew-pew map. I haven't been in a big operational SOC in a really long time, and I'm happy to say the pew-pew map is alive and well. It's doing well. Anyway, we were talking about metrics, and he was telling me that their time-to-analyze metric was one of the biggest pain points in this SOC. Overall, analysis time was way above what they expected, so they brought the metric up to the team. They're like, hey, time to analyze, we've got to find ways to bring it

down. So guess what? You won't believe it: the team brought it down. And guess what else went down? Quality of analysis. So then guess what went up? True positives missed. So when you introduce a new metric, think about: hmm, what risky behavior am I rewarding? It might not be a bad metric, but you might want to create metrics that balance it out, because remember, you become what you measure. Then there's metric expiration: when is this metric not needed anymore? When my only lever was alert tuning, it might have made more sense for me to track alert volume. But now, as I automate more and more of my alerts, maybe it's time I expire the alert count metrics, or

at least remove them from my leadership decks. And then data requirements: how much data will this metric require? How much new effort are we going to need to improve this metric? And how much time does it take to collect this metric? Don't come away from this talk telling people, "Allyn said I should spend all my time working on metrics." That's not the reality. The reality is that we rarely have enough time to work on metrics. So don't make my mistake number three, where you come up with this amazing metric that's going to take so much time to work on to improve. Think about the fact that I have very little time. I don't get new headcount just because I invented a

metric. Take that into account when you're choosing your metrics. Anytime I talk about metrics, I always get asked: how do I change the bad metrics I'm already presenting? And I get it, change is hard. Leadership does not like surprises, and they often have expectations that I'll be updating last month's slide deck. But I have a tip that's worked really well for me. Here I've convinced my friend Dexter, he's still my friend, to get in near-freezing water. It's a little under 40 degrees Fahrenheit, and Dexter's first reaction was shock. His heart rate spiked when his body hit the water. He gasped, Liam thought this was pretty funny, and he had to work to not

hyperventilate. But then suddenly it all made sense to him: this is great. And it's the same when you change your metrics. It's not going to be fun immediately. People will go into a state of shock. Those metrics, they've been around a while, and they've gotten very, very used to them. But my tip is: embrace it. Push through the change, because they'll soon have clarity. Because when you bring it all together, you're about to tell a story that's complicated, it's technical, and you don't have a lot of time to tell it. So here's my pitch. Up front and center is our maturity model, using the TDR maturity model, and it shows the maturity of the program: where we are today and

where we're targeting by the end of the year. Then we use the SAVER categories to tell the rest of the story. We're streamlining our operations by looking at what's taking the most time; that's what we're automating. We looked at our threat intel and incident trends, and we're raising awareness about these top five threats to the company. We're focusing our time this quarter to build detections for these threats; here's where we're tracking. We've been exploring gaps in security controls relevant to those top five threats, and we found three new gaps. And from a readiness perspective, we have one type of recurring incident with a really long recovery time, so we're working with our security team to implement new

controls that'll prevent these from ever occurring. So now, instead of making wild guesses about whether you're improving and if the tools you're buying are making a difference, you can use the TDR maturity model to measure your capabilities. Instead of using volume counts, fear tactics, and tired emojis, you can use SAVER to get to the core of a metric, ask better questions, and map that to something you can control. And instead of focusing on 100% MITRE ATT&CK coverage, you're focused on the threats that matter the most: you've found your top five and are working on having detection coverage with real impact. So hopefully this talk is your wake-up call. Take a cold plunge. It's time to rethink your

detection and response metrics. Thank you very much. [Applause] And this is my Linktree. It has my contact info, it has a copy of the slide deck, and then there's the complete TDR maturity model. I also write a very infrequent newsletter, very infrequent, I have a toddler, called Meoward. It has an adorable cat that people love; the security info is half decent. I've got lots of stickers to hand out, so if you see me after, please come and grab them. And I think we have a couple minutes for questions. We've got a couple minutes for questions, so raise your hand and I'll walk around with the mic. Thank you so much for this, by the

way, I thought this was really useful. I had a question regarding the program maturity model that you were talking about. Can you talk a little bit about what you use to fill out that graph? I'd like to understand that better, to have examples. Yeah, absolutely. I gave a talk last year about building a modern detection and response program, and in that talk I took a step back and dove into what are all the things that we do in detection and response, and then I thought about what are those outputs. What I found was that a lot of the outputs that we had internally, things like threat intel, were really not getting shared at the right level. They were, you

know, informing our other tools internally, and that was about it. There's a link in this Linktree that has that full talk, and in there I talk about all the different areas of detection and response I think we need to be able to succeed. I use a lot of additional research that folks have done into how do we build good programs, how do we think about programs, and what matters the most. So check that out; that'll answer a lot of questions for you. Yeah. Great talk. Two questions: how long does it take for a big SOC with those pew-pew maps to get this going? All right, how long does it take?

All right, well, I guess it depends on your bravery. I have had folks I've talked to that are like, this is going to take me years to make that change. It really does depend on your leadership's willingness to shift. I really think that if you're not telling the story of what you're doing, all you're really doing is continuing a narrative that doesn't inform anything. One thing I've found works really well is asking back the question: what question are you trying to answer with the metrics we have today? And spending that time to bridge that gap, because a lot of times the questions they have in their head aren't really

actually being answered by the metrics we're putting in front of them. You can use that SAVER framework to find where their question really is and map it to something you can really show, and that makes that change. I'll come back to you for your second question later. How do you guys go from counts of detections in a MITRE bucket, say, to coverage of that MITRE bucket? Say that again? So, well, there was MITRE coverage, you know, kind of in there for a tactic or procedure. How do you go from the counts of the detections that you have to deciding how much of that is coverage? With coverage, it's really

tricky to know, and this is partly why I think a lot of the MITRE ATT&CK coverage maps that specific vendors and tools pump out don't tell us a lot, especially if we don't understand that, you know, a MITRE ATT&CK technique could have 50 different kinds of examples, and if you can detect one of those, does it really mean that you'd be able to detect that in a day-to-day way? There's a weighting system that I've used in the past where, for every test you write, and then correlating detections to those tests, you talk about how good that test is and how good the detection is. So if you can think of 50

different ways to do something, will your detection catch a percentage of those? This is what I think helps us build detections that are a lot better, versus if somebody's like, oh, I'm going to detect all of MITRE ATT&CK. You could write some very simple binary-type detections, or you can improve them to, all right, this is looking at behavior that doesn't normally happen here, and it's extracting some sort of data analysis from that. I think having a maturity of basic all the way up to, you know, this is actually going to detect a lot of different abnormal behavior, will help you weight what percentage of that technique

you're covering, and so assigning a weight to each of those. There's a really great talk, I'll have to look it up, where they talk about a framework for weighting the tests that you have for each MITRE technique against the weight of your detection, and they've done a really cool analysis of tools that actually do this. So come find me; I'll try to find that talk, that paper. I think one more. So if you're just starting out with some of this, what would be the top things that you'd prioritize first? From, like, the metrics I create? Yeah. All right, I would start with what you're doing today. So if you don't have a threat

intel program, or you're not even thinking about threat intel today, don't start with that metric, because you've got to do a bunch of work first. I really think you generally don't have a lot of time for metrics, so think about: what can I actually measure today that would help me make better decisions today? If I think about SAVER, I like to start on the response readiness side of things, because that's the stuff that happens regardless. You have an incident; you could have created the detection for that, but the reality of it is you just might need to be able to

respond to those things. And so I like to start from response, the Readiness area. I think that's a good place to start. Cool, I'm out of time. I'll hang out in the hall over there, and I've got lots of stickers. So thanks, everyone, for attending. [Applause]
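The scoring scheme described earlier in the talk (each capability rated on process, tools, documentation, and testing, then shown per pillar for today versus a target) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the talk names the endpoints Initial and Leading but not the steps between them, so the five-point numeric scale, the simple averaging, and the capability names below are all placeholders, not the published TDR maturity model.

```python
from statistics import mean

# Assumption: a five-step scale. Only "Initial" and "Leading" are named in the
# talk; the numeric values and the number of steps are placeholders.
INITIAL, LEADING = 1, 5
AREAS = ("process", "tools", "documentation", "testing")


def capability_score(ratings):
    """Average the four area ratings for one capability."""
    missing = [area for area in AREAS if area not in ratings]
    if missing:
        raise ValueError(f"missing area ratings: {missing}")
    for area in AREAS:
        if not INITIAL <= ratings[area] <= LEADING:
            raise ValueError(f"{area} rating out of range: {ratings[area]}")
    return mean(ratings[area] for area in AREAS)


def pillar_scores(assessment):
    """Roll capability scores up to one number per pillar.

    assessment maps pillar -> capability -> {area: rating}.
    """
    return {
        pillar: round(mean(capability_score(r) for r in caps.values()), 2)
        for pillar, caps in assessment.items()
    }


def projected_movement(today, target):
    """Per-pillar change between the current and the planned assessment."""
    now, later = pillar_scores(today), pillar_scores(target)
    return {pillar: round(later[pillar] - now[pillar], 2) for pillar in now}


# Hypothetical ratings; the capability names are illustrative only.
today = {
    "observability": {
        "logging": {"process": 2, "tools": 3, "documentation": 1, "testing": 1},
    },
    "proactive threat detection": {
        "detection engine": {"process": 2, "tools": 2, "documentation": 2, "testing": 2},
    },
}
target = {
    "observability": {
        "logging": {"process": 3, "tools": 3, "documentation": 2, "testing": 2},
    },
    "proactive threat detection": {
        "detection engine": {"process": 2, "tools": 2, "documentation": 2, "testing": 2},
    },
}

movement = projected_movement(today, target)
print(movement)
```

A pillar whose movement is zero is the case the speaker warns about: the planned projects are not projected to change maturity, which is a prompt to revisit priorities.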


All right, good afternoon. Welcome to the Breaking Ground afternoon session. My name is Fabricio Bortoluzzi, and I will be presenting this Cloud Telescope research to you in the next 40 minutes. Thanks for choosing this track among so many awesome tracks out there. I hope the audio is clear; if the audio is not good enough, please shout, complain, and I will address it. Well, my name is Fabricio, and this is a presentation on a cloud architecture called Cloud Telescope. It allows for observing internet-wide activity. In this case, the presentation today describes malware spreading activity, botnet spreading activity, as part of the results we can find with this development and research. This is myself. I am a computer

scientist. I currently work at Noroff University College in Norway, and it's quite nice to be here in Las Vegas, much warmer. I am a cloud solutions architect. I teach cloud computing and cybersecurity for undergraduate students, among other activities that I hold, and this is part of the research that myself, Barry, Lucas, and Carla have been doing for the last three years. I have five topics to show you: internet background radiation, the standard network telescope, the cloud telescope, the experiments we have been conducting with item number three, and the discussion of the botnets that we have been detecting with the method. Not sure if you're familiar with this terminology. Internet background

radiation seems like a fancy term, but it's pretty much an analogy to describe all the often malicious activity we can capture by recording unsolicited packets, either on a domestic router, a standard unfirewalled host, or, in this case, a sensor network that is deployed within cloud service providers. By analyzing this, we can learn about vulnerability scanning activity, worm and botnet propagation, and, more recently, even benign scanners such as Censys, Shodan, and many more. Backscatter is one of the most prevalent activities found in internet background radiation, in this wild traffic that you can capture as long as you don't filter it using standard mechanisms such as a

firewall. We can say that this intense activity arriving at internet-connected hosts is long duration, it happens all the time; it's low intensity, it's not really a huge bandwidth concern; and it has been studied for at least 20 years, mostly in a research center very close to Las Vegas, within CAIDA, the Center for Applied Internet Data Analysis, here in the United States, mostly by network telescope deployers that capture the traffic in a distributed way, so we can learn whether there is some sort of geopolitical influence over the type of traffic that hits the United States versus the traffic that hits Norway or China or any other country. The reason is that I mostly

spoke about the topics, but cloud service providers currently enable us to deploy a distributed fleet that is budget friendly, that allows one to launch worldwide sensors without having to move from their chair, meaning deploying via software instead of deploying a physical computer, as is usually done. The cloud telescope is therefore described by this architecture containing the internet gateway, the router, and a forcefully open security group. So instead of taking advantage of AWS's or Google Cloud's default-deny policy, we have to open all ports; we have to allow all traffic in, so to say. And then the sensor is deployed there. For example, in the United States, in AWS, you can cover four different regions of the country,

two in Canada, one in Brazil, and so on and so forth, and that's how the cloud telescope can be deployed. Traffic is recorded in pcap using a daemonized version of tcpdump. Recordings are rotated, and they are also uploaded to a cloud bucket so that recordings are centralized instead of distributed across the many telescopes out there, which makes later processing of the data easier. As for the common tools and stack that are used, tshark is the de facto standard for dealing with large amounts of pcap traffic. If you just want a sample that contains less than 1 million packets, it should be okay to look at it in Wireshark, for small

samples, for learning, you know, in a more friendly way. You can also index packets using very interesting security stacks such as Security Onion, or the well-known Elasticsearch with a Kibana front end, to interact with the data and eventually learn new patterns from this indexing. Radiation looks like this: pretty much a standard pcap. But in this case, because it's a distributed sensor fleet, you can see for example in one Censys scanning our sensor, in two you can see Shodan, and these quickly appear, so you can even profile how frequently friendly tools such as Censys are scanning the sensor fleet. In three you can see some sort of distributed backscatter. It's back

scatter because TCP is reset ... if you no longer win the bid for having that instance allocated to your account. So we implement some sort of listener that will launch a new sensor upon a termination request. Terraform is used to describe the architecture, so anyone wanting to reproduce the experiment can do it, and Bash is used for automation, and that's how it's usually deployed.
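The termination listener mentioned above could look roughly like the following sketch. It is not the authors' code (they describe Bash automation); it assumes the sensors run as AWS spot instances and that the listener polls the EC2 instance metadata service for a spot `instance-action` notice using IMDSv2. The callbacks `stop_capture` and `launch_replacement` are hypothetical stand-ins for whatever stops tcpdump and requests a new sensor.

```python
import json
import urllib.error
import urllib.request

IMDS = "http://169.254.169.254/latest"  # EC2 instance metadata service


def fetch_instance_action(timeout=2):
    """Return the raw spot instance-action document, or None if none is pending.

    Only meaningful when run on an EC2 instance; it is not called below.
    """
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    try:
        with urllib.request.urlopen(token_req, timeout=timeout) as resp:
            token = resp.read().decode()
        action_req = urllib.request.Request(
            f"{IMDS}/meta-data/spot/instance-action",
            headers={"X-aws-ec2-metadata-token": token},
        )
        with urllib.request.urlopen(action_req, timeout=timeout) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no interruption scheduled
            return None
        raise


def handle_notice(raw, stop_capture, launch_replacement):
    """Act on a notice: stop the capture first, then ask for a new sensor."""
    if raw is None:
        return False
    notice = json.loads(raw)
    if notice.get("action") not in ("terminate", "stop"):
        return False
    stop_capture()
    launch_replacement()
    return True


# Offline demonstration with stand-in callbacks and a made-up notice.
events = []
handled = handle_notice(
    '{"action": "terminate", "time": "2024-08-06T12:00:00Z"}',
    stop_capture=lambda: events.append("capture stopped"),
    launch_replacement=lambda: events.append("replacement requested"),
)
print(handled, events)
```

On a real sensor, a loop would call `fetch_instance_action()` every few seconds and pass the result to `handle_notice`. Stopping the capture before requesting the replacement keeps the final pcap file intact for upload.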

We use Alpine Linux because of the smaller footprint. This is related to the fact that this distribution doesn't implement the full glibc userland library; it implements a much tighter, smaller-footprint intermediate layer, or middleware as we can call it, meaning it can operate on a virtual machine with half a gigabyte of RAM, and that's the main characteristic of the device. Whenever we receive the termination notice, we will stop the capture and launch only one, but in your minds imagine you have up to 260 virtual machines, 10 per AWS region in this case, and they are allowing, capturing, and recording unsolicited packets arriving at the sensor. That's the first takeaway of this

methodology. The ephemeral nature means that it exists as a Terraform artifact. The GitHub repository is maintained by Lucas Baylor, and that's where the results I'm going to present to you come from. We deployed 260 sensors in AWS starting in August. We wanted to keep it running for, let's say, six months, but after 45 days of capture, even though we weren't serving any service, 10 billion packets were captured. We had to stop because this was becoming quite huge to process. It resulted in 200 GB of pcaps that anyone out there can download to do their own studies. There are many patterns there that we only know exist; no one ever got into them to see what

they really look like. We are still looking for answers on whether there is any sort of geopolitical influence affecting the attack patterns we capture with the telescope; this is an open question. For this experiment, one key characteristic is that for the ports SSH, Telnet, HTTP, and HTTPS, Lucas implemented some sort of application-layer responder answering back to attackers. If they wanted to get into the machine, they could do so, but we were actually recording their commands upon infection, rather than obeying the commands, rather than really exposing a vulnerable shell back to the attacker. So that's one key characteristic that makes this presentation unique. What does one learn if one

launches such an architecture? What can you learn? What did we learn in 45 days, after capturing 10 billion packets? According to the experiment, and this is quite interesting, 98% of the unsolicited traffic arriving at the sensor fleet was TCP, meaning footprints of denial-of-service attacks, or any other form of UDP exploitation, or ICMP flooding were seen only to a very small extent. Most was TCP. We captured almost 1 million IP sources from all parts of the world, which I will profile for you in a minute, but that's the range, if you could call it that, the telescope resolution or aperture. That's what we can learn or see by deploying it.
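A protocol breakdown like the one above can be reproduced from a pcap with very little code. The sketch below assumes the packets were first exported with something like `tshark -r capture.pcap -T fields -e ip.proto`, which prints one IP protocol number per packet; the file name is a placeholder and the sample values are synthetic, not taken from the Cloud Telescope dataset.

```python
from collections import Counter

# IANA-assigned IP protocol numbers for the three protocols discussed.
PROTO_NAMES = {"1": "ICMP", "6": "TCP", "17": "UDP"}


def protocol_share(lines):
    """Return the percentage of packets per protocol.

    Each line is one packet's ip.proto value as printed by tshark. A packet
    with several IP layers (for example an ICMP error quoting the original
    header) yields comma-separated values; the outermost one is used.
    Lines with no value (non-IP packets) are skipped.
    """
    counts = Counter()
    for line in lines:
        value = line.strip().split(",")[0]
        if not value:
            continue
        counts[PROTO_NAMES.get(value, "other")] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Synthetic sample: eight TCP packets, one UDP, one ICMP error, one non-IP.
sample = ["6"] * 8 + ["17", "1,6", ""]
shares = protocol_share(sample)
print(shares)
```

For a capture of this size (billions of packets), streaming the tshark output line by line into the function, rather than loading it into a list, keeps memory use flat.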

Currently, even though I only launched 260 sensors, they were recycling their IP addresses according to AWS policies, and therefore we had 603 IP addresses captured on our side as honeypots. Traffic distribution across the world was fairly even. The baseline here is 4% per region. In this case, Asia Pacific Southeast 3 saw 6% of the radiation, and Asia Pacific Southeast 2 saw only 1% of the radiation. That's also the newest AWS region, which could be linked to this fact. As for the most attacking countries, now not looking at it from an AWS perspective but by the country that owns that IP address, or the country that is linked to the radiation source or the

attacking source, we saw the Netherlands as the most prevalent country, and there is a curiosity there. Can you imagine the reason why the Netherlands is the most apparent source of these random attacks arriving at the sensor fleet?

Sorry? It's coming from many anonymization services, VPN services. It seems that the country has, eventually, a culture of hosting these services. This is openly available, but that's the pattern: openly available anonymization services, including VPNs. That's our guess. We cannot fully endorse the statement, but that's what it looks like, considering the autonomous system number holder, the owner. And the least attacking sources were Taiwan, Thailand, Pakistan, Poland, and so on. The most frequently attacked sensors were residing in the United States. This is slightly biased in the sense that the United States has four AWS regions, four different places; we had 40 sensors in the country. But even if you divide by four, you will still see some sort of

average activity hitting the US. The least attacked sensor fleets were in Germany, Canada, and the United Arab Emirates, but it's also fairly even, with a small bias towards the US and India. Now, if I had to tell you in 2024 that the most attacked TCP port is actually the Telnet port, this would be, like, big news in the '90s, in the early 2000s, but it's still the most prevalent. I would say there was a shift around 2015, when it wasn't the most prevalent according to the CAIDA telescope, but it has become again the most attacked port by far. Do you have a guess on why Telnet is currently the most prevalent destination port for random attacks in the wild?

It's a three-letter word, or a keyword. IoT. IoT, thanks for the answer. Precisely, meaning IoT introduced a new generation of low-power, low-cost, fast-time-to-market devices that are not necessarily as secure, while recently we see modern operating systems becoming more and more secure. So that's our conclusion as to the why. As you can see from the patterns that I'm going to show you, they are actually exploiting Linux kernel 2.2 and 2.4, which are highly related to embedded devices, not really ordinary servers. Telnet is followed by GTA, but to a far lesser extent, then Minecraft and VNC, pretty interesting to see VNC exploitation as a top target, and then the classics: SSH, HTTP, DNS over TCP, which is not really the

common standard uh SMTP and so on RDP is also there if you think about UDP exploitation which is only seen in a small amount in the experiment it mostly relates to the recent LPS exploitation or the recent LP vulnerabilities that are actively exploited and of course the amplification attacks related to memcached and Redis which is connected to some of the talks presented earlier today are also there somewhere and also DNS and the classic often exploited ports ICMP is mostly ping 99% of the time echo request echo reply to a far lesser extent and other not so popular ICMP types being captured when it comes to pinging and ping flooding the most active source was

actually China followed by the United States so speaking about profiling the source of the attack this is what I wanted to get to meaning the IP sources really belong to uh companies in the Netherlands that run business related anonymization they sell anonymization services most of the time Belgium also shows up there followed by China and Japan so that's the profile in all cases this is probably tunneled before hitting the sensor using some sort of GRE encapsulation or any other equivalent protocol for the same purpose now speaking about the analysis the malware analysis let's analyze it together even though uh it could be a small font there let's see if we can decode it together this is

uh the text version of Wireshark if you will tshark decoding we can see at the application layer well it's TCP at the transport layer source port um in this case it shows reversed so the source port is actually 39,000 mostly irrelevant but the destination is 23 um and then the honeypot records any command that the attacking source wants to inject without actually executing the commands but acknowledging saying yes you are successful with the command proceed that's how it pretends to be so then it starts with an attempt at running a wget script that we can assume will try to download some sort of payload with the infection commands command and control or turning the

target into a zombie very likely um this also reveals the vulnerable server often a vulnerable server serving the payload so we can also study who is out there in a vulnerable manner serving payloads for attackers on the internet eventually it's just an unprotected website that got hacked and they are the apparent source of the attack many times it asks for some sort of BusyBox exploitation they hope our honeypot runs BusyBox which reveals the embedded nature of the attacker's expectation of the target we can also see Trivial FTP this is not really FTP but a variation trying to download data from somewhere else trying to run shell scripts related to the TFTP

download attempts to execute BusyBox ftpget and then this is from an attacker's perspective an attempt to ensure that they have really infected the machine and turned it into a zombie we also see curl and then connection attempts arriving next on the next packets so that's pretty much the way it looks once you deploy the telescope and start learning their attack patterns now let's go into some sort of it's a binary analysis but not in the sense of dissecting the binary we want to know which binaries are most frequently linked to the payloads being delivered to the honeypot so the names are not really the most exciting ones this one

calls I but you can hash it you can hash the binary and compare it against publicly available uh sources and then you learn exactly what this binary wants to deploy what its intent is and then you see many funny names out there but notice one pattern reinforcing the IoT behavior it's most of the time 32-bit binaries or they carry the x86 uh suffix anyway the attacker can call it whatever they want but it reveals some sort of pattern um and then we can link the binary with the very botnet in this case it's the botnet profiling and the botnets are I believe these are

popular names by now right Mirai is it popular have you heard of Mirai before it's probably the most active botnet in the wild they're not really aiming at a single target but they are the most prevalent in quantitative terms and there are many variations out there including Mozi detected 10 million times in the experiment quite frequently and Sora among other names so we can try to learn how many strains are out there and by learning where they are downloading the payloads from we can try to guess if they belong to the same hacking group or if they could belong to different hacking groups deploying their zombie nets or their bot

nets for later use against Targets in all cases for the top 10 they carry a 32bit signature they are 2.x Linux kernel often related to embedded

systems um after probing the honeypot if it's attacking port 80 it won't speak shell commands it will speak HTTP commands right in that case we will see GETs POSTs and this kind of pattern in this experiment six million times Mirai was trying to infect the honeypot with this single source 185.224 something hosting the Cutie binary to a smaller extent Mirai again Mozi they were the most prevalent ones so according to the experiment these are the busiest sources serving if we were to quantitatively tackle the problem we should start by addressing what's going on with the services what are the vulnerable services exploited by attackers that we should promote

awareness about this kind of outcome the commands also reveal what kind of vulnerable service HTTP based service could be running on the vulnerable hosts many times it's an attempt to get the environment variables then it depends on the type of technology used it could either be a Windows server with IIS a PHP server on Apache or other options but you see there are predictable patterns a lot of exploitation uh trying to leak GitHub or git authentication credentials this is the kind of cyber threat intelligence acquisition you can get by running the cloud telescope um of course because this is deployed in AWS we could expect a lot of health checks coming from the cloud

provider itself towards the web ports 8443 1 million times that was the case but to a smaller extent in other cases it was these projects this also reveals that for example Palo Alto has a project related to mapping the internet it's benign it's not an attack but it's interesting to quantify another cloud mapping experiment at pdrlabs.net this is unknown to me Censys is inspecting port 8 and signing because this is totally up to the attackers to rewrite as they want this is not a standard that must be followed truly and of course Python browsers or re mappers Go mappers much faster and others so this is again something one can learn now if we were

to speak okay you detected a lot of binaries we see that it's an IoT style infection but what else we can tell you that in this experiment the majority of attacks are exploiting CVE-2016-20016 this is related to embedded CCTV devices so from an attacker's perspective it's good business to launch infection attempts against CCTV as generally speaking they're the most vulnerable followed by a CMS used mostly in China ThinkCMF I only knew about it by trying to reverse engineer what was going on and also TOTOLINK which is behind billions of IoT devices a common middleware that is used by anyone wanting fast time to market IoT

devices in our cases score 9.8 so what is the insight here the insight here is that the most popularly exploited systems are IoT systems currently according to the method and they're exploiting highly critical vulnerabilities whose goal is to increase the botnet fleet on behalf of the attacking group they belong to so expanding the zombie fleet if you will that's the conclusion um that's what you get currently by analyzing um unsolicited traffic arriving at a sensor fleet in a cloud based experiment we have many questions there requiring so that's an invite if you want to help us find better answers in terms of geopolitical influence but okay this is

botnet expansion but is there a political intent behind the attacker is it a nation state sponsored attack or is it just an ordinary attempt for business for money for profit both could be true we need to investigate here you will find the most relevant links we also have companion publications if you want to learn this from a more academic a more let's say yeah academic research perspective but it should be interesting for both worlds the research community but also the industry should have an interest in the results we have been finding and there is of course more to come that's all thank you very much for your [Applause] time thanks again we should have three

minutes if you have a question does anyone have a question no okay that's all thank

thanks trans I'm not sure if we have another speaker starting right now just so that we make room for them

so when you're explaining sounds indidual of is it of large and then also how do you

each individ the


hello everyone and welcome to my talk my terrible roommates discovering the flow fixation vulnerability and the risks of sharing a cloud domain the flow fixation vulnerability is a vulnerability I discovered recently in AWS MWAA Managed Workflows for Apache Airflow uh this is a one-click vulnerability that allows account takeover on a victim so enough with the teaser I am Liv Matan a senior security researcher at Tenable a Microsoft Most Valuable Researcher and as you might notice I hunt the major cloud providers AWS Azure and GCP so in the on-prem world in bug bounty when you run JavaScript on a subdomain this is a vulnerability right when you report it in a bug bounty you get

XSS you get paid this is a valid vulnerability but in the cloud this is something different the concept is observed differently because for example there are a lot of services that host web services or even other cloud services on the same parent domain and different customers share different subdomains let's take it into a more practical scenario for example AWS Elastic Beanstalk and keep in mind this is just one example of one service but this concept affects a lot of services in the major cloud providers so I as a customer customer one for example will have a domain at customer1.elasticbeanstalk.com and Elastic Beanstalk for those of you who are not familiar is

simply a service that allows web hosting and another customer for example customer 2 will host a web application on customer2.elasticbeanstalk.com and uh a lot of other customers will also be hosted on the same uh shared parent domain elasticbeanstalk.com now I as an ordinary user of the Elastic Beanstalk service can run JavaScript so now it gets confusing because as you might remember from the previous slide this scenario is exactly the same scenario as in the bug bounty world when you get an XSS on a subdomain of a victim because I run JavaScript on a subdomain of a victim of for example customer2.elasticbeanstalk.com

so this is kind of crazy so I will give some definitions before we get started so we'll be on the same page a super cookie is a cookie that you as a user define or in JavaScript code you simply scope to a parent domain and this cookie will be tossed AKA cookie tossing to all of the subdomains of the same parent domain for example if I set a cookie on the elasticbeanstalk.com parent domain the cookie will be set for customer one and customer 2 and so on for all of the customers of the same elasticbeanstalk.com in the same browser as for session fixation this is a vulnerability or attack technique that

allows an attacker to hijack a user session by forcing the victim into using the same retrievable session the same known session so for example an attacker will send the following link https://www.xyz.com with the session ID of 1234 to the victim the victim will then visit the link and log in with his credentials to the xyz.com website thus verifying the session 1234 then the attacker can simply use this verified session the redeemed session in order to hijack the session of the victim this is uh a very brief definition of session fixation but keep in mind in AWS it gets much more interesting
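The super cookie definition above can be made concrete with a small sketch. This is an illustration only, not browser source code: it models the cookie domain-match rule (a host receives a cookie when it equals the cookie's domain or is a subdomain of it) and a simplified version of the public suffix check that browsers apply before accepting a parent-scoped cookie. The host names under `cloud-host.example` and the function names are hypothetical, chosen for this example.

```python
from typing import AbstractSet


def domain_match(host: str, cookie_domain: str) -> bool:
    """Return True if `host` equals `cookie_domain` or is a subdomain of it."""
    host = host.lower().rstrip(".")
    cookie_domain = cookie_domain.lower().strip(".")
    return host == cookie_domain or host.endswith("." + cookie_domain)


def may_set_cookie(setting_host: str, cookie_domain: str,
                   public_suffixes: AbstractSet[str]) -> bool:
    """Simplified browser-side check for a cookie with a Domain attribute.

    Assumption: a cookie whose Domain is a listed public suffix is rejected
    outright (real browsers have a host-only special case not modeled here).
    """
    cookie_domain = cookie_domain.lower().strip(".")
    if cookie_domain in public_suffixes:
        return False
    # The page setting the cookie must itself live under the cookie domain.
    return domain_match(setting_host, cookie_domain)


if __name__ == "__main__":
    parent = "cloud-host.example"            # hypothetical shared parent domain
    attacker = "attacker.cloud-host.example"
    victim = "victim.cloud-host.example"

    # Parent domain not on the list: the attacker page can toss a cookie
    # that the browser will also send to the victim's subdomain.
    print(may_set_cookie(attacker, parent, set()), domain_match(victim, parent))

    # Parent domain on the list: the browser refuses the parent-scoped cookie.
    print(may_set_cookie(attacker, parent, {parent}))
```

The design point is that the sibling relationship alone is enough for tossing; the only thing standing in the way in this model is whether the shared parent is registered as a public suffix.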

some of the potential same site risks AKA uh shared parent domain risks that I can note for you uh these are a lot I'm sure you're not going to see all of them I'm not going to dive deep into all of them but one of them a very notable one is cookie tossing attacks as we said a super cookie that allows CSRF protection bypass and session fixation abuse so these are a lot of risks that uh are caused uh as a result of the concept of the shared parent domain and this is very very um common also in cloud providers as you saw in the example of Elastic Beanstalk and will be uh in a lot

of other services some of the past impactful vulnerabilities could also be prevented if a lesser known guard rail was utilized in that time of the vulnerabilities exploitation some of them by Gaff Amiga and and a guy named Sirius and those uh both those two vulnerabilities uh were exploited by Cookie tossing and that technique could be prevented if the cloud providers were simply utilizing a lesser known guardrail named public suffix list so that guardrail uh is a list initiated by Mozilla this is a community collaborative list and this looks exactly as simple as this thing this simple list uh that each organization or a domain owner can register his domain to this list that will then be treated

by browsers so browsers see this list and each site that you see here is a site that is considered to be shared with different customers means that subdomains of the same site AKA shared parent domain will be shared with different customers sensitive data is shared therefore the site uh should be considered as a public suffix now this public suffix list is lesser known and I will show you exactly why in the following slides because in AWS these are all of the services some of them I reported to AWS and some of them AWS themselves uh initiated a research in order to uh to find these services that their domains are not present in the public suffix list so

this is the state of AWS currently as for the public suffix list and it means that uh now it is all reported and fixed these are the domains that are now inserted into the public suffix list but it means that these domains were vulnerable to the same site risks I showed you before uh before reporting uh this risk to AWS as for Azure these are some of the domains including popular names like API Management Azure Front Door Blob Storage these were all not present uh in the public suffix list now they are after the report and as for Google I reported and they simply stated that uh this issue is not considered that severe for them so they would skip

that so you can go ahead and uh search for some nice bounties with same site risks in Google Cloud so let's get started what exactly is Apache Airflow Apache Airflow simply is an open source system that handles data pipelines and workflows and in AWS you have the managed version of Apache Airflow Amazon Managed Workflows for Apache Airflow we did a sample in our company at Tenable uh and we saw that 20% of the customer database we have are using the MWAA service so the vulnerability is rather popular and impactful so let's dive into the flow fixation vulnerability you can see this screen uh when you visit the service you can see that each service gets attached

an Airflow UI and as you can see this uh Airflow UI just got hosted on the parent domain of amazonaws.com and after all the stuff that I told you it might tell you that there is going to be something interesting in that case because amazonaws.com is a shared parent domain with other AWS services so how does it really work you log into the Airflow UI and you get a session cookie for the Airflow UI this is exactly how it looks you get a session cookie for your own uh subdomain of the managed Apache Airflow version and this session cookie is simply a UUID after that you're using your AWS STS

in order to get in exchange a web token that is essentially a JWT that allows you to then redeem it and authenticate to the dashboard to the mwaa airflow UI then there is a session retrieval now keep in mind you see this request and this request is unauthenticated on the left side of the screen you can see the request and there are no credentials and nothing that authenticates you so you might say now okay so I can retrieve the session of victims and we have an unauthenticated vulnerability right like authentication bypass but not really because this session is retrievable and authenticated but this session at that point of the authentication flow is no use you cannot use it you cannot log in

with it this is just a session cookie this step is making the session verified and now you can use this session so in that step the request is using a token the JWT token we just retrieved and again the session cookie and in exchange you get the session cookie that is now verified and now you can use it in order to log in to the airflow UI but you can see something very very interesting here because the set session cookie that that I got as a normal user of the airflow UI is the exact same session cookie that I just gave to the

dashboard that scenario is kind of interesting because now as an attacker I can retrieve the known session unauthenticated and I can try to force victims into redeeming this session we just saw the authentication flow and after they redeem the session that we just retrieved we have a known session ID and I might be able to hijack the session with session fixation so this is a scenario for example I have the attacker infra the attacker bucket and I have the bucket under the shared parent domain amazonaws.com this is by default how S3 buckets work in AWS and on the other hand I have the victim MWAA which is also hosted on amazonaws.com we are both on

the shared parent domain of amazonaws.com and I can simply host an index HTML file then lure victims into my S3 bucket and run HTML JavaScript with the JavaScript I can set a cookie for my victim and I might do some cool stuff here so this is how it looks I set a cookie in the index.html this is simply JavaScript and I set the cookie that I just retrieved from the victim's MWAA panel uh this is the UUID cookie that I retrieved and I set it on the amazonaws.com shared parent domain thus this cookie will also be shared with the Apache Airflow of

the victim because we are on the same shared parent domain and this will be in the browser of the victim but not so fast because this thing has actually failed why did it fail because the S3 bucket domain was in the public suffix list and still is so the public suffix list has prevented me from setting the cookie on the shared parent domain from the domain of the S3 bucket the next order of business was to use Google dorking in order to find services that are hosted under the same amazonaws.com domain and might allow me to set a cookie from that service so I found AWS API Gateway that at that time wasn't present in the

public suffix list and allowed me to set a cookie on the amazonaws.com shared parent domain and this is absolutely crazy because I can now lure victims into my gateway so I host the exploit code there the exploit code will set the cookie for the victim's MWAA this cookie is a cookie that I know that is retrievable this is a known session ID and with that session ID I can redirect the victim force him into logging into his own dashboard verifying the session that I know then I can use the session and voila this is how it looks this is the exploit code uh so we'll get into it in that case I simply wait for a GET

request specifically to the path of the PoC for AWS um and then I send an HTTPS request to the victim's Airflow to the victim dashboard uh to the path to retrieve the session ID this is all unauthenticated I don't need any credentials right because this session is not yet verified then I save this session that I just got in the cookie value parameter the next step for me was to use this uh HTML I put this HTML into my attacker's page and this HTML includes JavaScript that will set the retrieved cookie the retrieved session for my victim and I set it by scoping this session cookie to the domain amazonaws.com which is shared as you saw then after that my

next step is to use document.location in order to redirect the victim to uh this nice SSL redirect URL that will uh essentially force the victim into logging into his own dashboard after the victim has logged into his own dashboard he is logged in with the session that I just set for him thus this session is now verified and I can use it in order to take over the victim's dashboard this is a nice demo I prepared for you so let's go over

it so here this is the victim visiting the panel this is what the victim sees he gets redirected and everything he doesn't see anything that uh is kind of suspicious this is the victim's browser you can see the session that was injected this is the attacker's view you just saw the session that was injected

this is the attacker that just used the uh hijacked session the attacker is now using the hijacked session after the victim uh has logged in and verified his own

session and boom this is the attacker after he logged into the victim's [Applause] panel thank you the fix is that AWS fixed this vulnerability by refreshing the session uh of the MWAA so now after a user logs into the MWAA dashboard aka the Airflow UI the session is refreshed after login
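The fix described above, refreshing the session at login, can be sketched with a toy in-memory session store. This is a minimal illustration under stated assumptions, not AWS's or Airflow's actual implementation: the class `SessionStore` and its method names are hypothetical. It contrasts a login that keeps the pre-login session ID (fixable by an attacker who already knows it) with a login that discards the old ID and issues a fresh one.

```python
import secrets
from typing import Dict, Optional


class SessionStore:
    """Toy session table mapping session ID -> logged-in user (None if anonymous)."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Optional[str]] = {}

    def new_anonymous(self) -> str:
        """Issue an unauthenticated session ID, as any visitor could obtain."""
        sid = secrets.token_hex(16)
        self._sessions[sid] = None
        return sid

    def login_vulnerable(self, sid: str, user: str) -> str:
        """Vulnerable pattern: the pre-login ID is upgraded in place."""
        self._sessions[sid] = user
        return sid

    def login_fixed(self, sid: str, user: str) -> str:
        """Fixed pattern: drop the pre-login ID and issue a fresh one."""
        self._sessions.pop(sid, None)
        fresh = secrets.token_hex(16)
        self._sessions[fresh] = user
        return fresh

    def whoami(self, sid: str) -> Optional[str]:
        """Return the user bound to this session ID, or None."""
        return self._sessions.get(sid)


if __name__ == "__main__":
    store = SessionStore()
    known = store.new_anonymous()          # ID the attacker planted in the victim's browser
    store.login_vulnerable(known, "victim")
    print("vulnerable:", store.whoami(known))   # attacker's known ID is now authenticated

    store = SessionStore()
    known = store.new_anonymous()
    store.login_fixed(known, "victim")
    print("fixed:", store.whoami(known))        # attacker's known ID is worthless
```

With rotation in place, a tossed cookie can still reach the victim's subdomain, but the value the attacker knows never becomes an authenticated session.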

some of the takeaways are that shared parent domains are dangerous this is just one example of such a vulnerability that can be exploited with this in mind but we saw two vulnerabilities as case studies uh that were exploited in the past and the public suffix list is a lesser known guardrail that people are just less aware of so let's be community focused let's uh elaborate on the public suffix list let's use it because this is very powerful as we saw it could prevent flow fixation and it could also prevent these two impactful vulnerabilities we saw um as case studies so this is a very very interesting and important concept we should take and keep in mind also check

if the service domain that you are using is present in the public suffix list and if not please assume that same site requests are untrustworthy because same site requests are dangerous and can be risky as you saw in this this presentation thank you any

[Applause] questions no questions thank you very much [Applause]


what's going on we're here to talk about MacBooks so my name is Nick I work on the uh security team at a company called Figma and recently our team was thinking about if we had malware on one of our company laptops what would someone want to do with this malware right all of our laptops are MacBooks our employees use Chrome the obvious answer is they'd want to read people's Chrome cookies like we don't have interesting data on our laptops anymore our emails are in Gmail right our other interesting data is in other SaaS apps on the internet this is the actual data you want off a laptop um so if you have a

shell on someone's MacBook can you read their Chrome cookies and how do you do it and I can detect that like some security people are skeptical of this idea right the answer is of course this is a law of nature right if you have a malicious program running as you surely that program can take over any other program running as you right like you can use GDB or LLDB or something to take over code execution you can launch Chrome as a subprocess and like read its memory after it's launched to read cookies out of memory maybe you can take over the computer's mouse on their MacBook

and click around if all else fails right open the debug console go to like where you as a user would read the cookies just read them directly out of there right so surely you can always do this stuff anyway if you're running as the user except that's not true on a Mac you see Apple built iOS iOS is crazy right you run potentially malicious programs on your iPhone um and you can mostly trust that those programs can't take over other apps and like 10 years ago they brought this stuff into macOS a program on macOS that is running under a different Apple developer under code signing cannot take over another program necessarily this is what underpins like

accessibility permissions you can't take over the mouse this is what decides whether you get Bluetooth access things like that um so you can't just take over chrome necessarily now I'm going to tell you that you can but it doesn't have to be that way right this is a design feature of the way that Chrome is built so what we're going to talk about is I'm going to give you three attacks that you can use to steal cookies on Chrome there are more attacks but these are just three that I think are interesting I'm going to tell you a couple ways you can prevent some of these attacks from being done but not all of them I haven't come

up with protections for all of them and for the ones that we can't prevent we can at least try to detect whether they're happening so first one I want to talk about is the Chrome Remote debugger right there's this thing that you can use if you're if you're running like selenium on your computer for example and it like takes over chrome and clicks around to run like automated tests the way it works is it runs Chrome with this remote debugging server um where you can send it commands to tell it to click on things if you Google how to steal cookies on a MacBook like you find this blog post that explains how to do this

from 2018 it's super cool uh basically you launch Chrome you tell it listen on this port uh I'm going to connect to that port and I'm going to tell it I would like you to give me the list of the user's cookies like all of them for all the websites and it says sure yeah in one big response it sends you all the cookies um this was cooler in 2018 when you could launch Chrome in headless mode and so the user wouldn't even know this is happening these days headless Chrome doesn't have the same cookies as normal GUI Chrome so this doesn't work as well um but you could still launch like normal Chrome and the user would

see something weird happening but you could steal the cookies so that's one way another attack is like surely Chrome has to store the cookies somewhere right when you reboot your computer you're still logged into Facebook where does it store them it stores them in this file uh it's got all the fields that cookies have like which website they're for whether it's an HTTP only cookie or stuff like that but the file does not contain the actual cookie value it encrypts the cookie value and it encrypts it with an encryption key that's in your login keychain this is where stuff gets kind of cool who can read this encryption key if you're Chrome and you have uh

Google's Mac OS code signing certificate you can read this key no problem if another program tries to read this key uh your MacBook prompts the user for their password um so if you just ask the keychain for the key you probably can't get it or honestly the the message is not that clear what it is that the user is being asked for so they might just type through it that might work so if you want to decrypt something you can either get the key or find a way to attack the crypto right um so if you can read this file you can probably write to this file right you can copy paste a cookie that you want to steal to a new

cookie in a new row in this SQLite database for another website um a website that you control and just wait for the user to visit that website or you can like I don't know you can launch Chrome pointing at that website and make them send cookies to you right uh you could imagine a crypto scheme where the encrypted value takes into account like which website the user is using or something um so that a cookie for GitHub is not valid for a different website uh but they currently don't do this cool so that's another way the third attack that I want to talk about uh I honestly think this is the coolest one we're going to take

advantage of the fact that Chrome is not one process right it is a complicated multi-process program the way that Chrome works is when you launch the Chrome main process one of the first things it does is it asks the keychain for your cookie encryption key and then it starts off this other process that's responsible for doing a lot of things but one of the things it does is read the actual cookie database read and write your cookies and decrypt them it sends the encryption key to this other process the network service they're both signed by Chrome but they have different like signing IDs and then the network service actually reads your cookies so one thing that's cool is if

you are an attacker you can replace the binary where the network service binary lives in like the Chrome application bundle with malware and then kill the process because you have user permissions right you can still kill other processes the user is running and Chrome is very helpful it'll say oh my gosh the network service has crashed I've got to relaunch the network service and it will do the startup process again it'll send the cookie encryption key to this uh malicious process you've now gotten it to launch uh that process is going to launch the network service send the encryption key to it and then it can start speaking IPC to the network service and just tell it please like

read these cookies and do these things with them um so the important thing here is that Chrome and the network service are the only processes that are actually talking to the keychain and the cookie file right malware is like getting launched by Chrome and it's launching another process but it's not actually reading these files and the reason that's important I'll get into later uh and don't even think about like trying to use file permissions to fix this because the malware could just bring its own copy of Chrome right or it could like download it from the internet and even if it's not the same Chrome binary the user usually uses it's still a signed copy of

Chrome. It can still read the Keychain, and it can still write to this file. Okay, what are some things we can do about this? The first thing is, if you pay Google for Chrome Enterprise, there's a flag you can turn on to disable the remote debugger. So when you set RemoteDebuggingAllowed to false and then try to launch Chrome with a remote debugger, it just prints an error message and doesn't work. Everyone should turn this on; this is super cool. You might have trouble, as I did, because you have actual engineers who use tools like Selenium, so you'll probably just have to exempt it for those

engineers. But you can maybe get away with telling them to run Chrome with a different user data directory or something like that, so that if someone manages to launch Chrome with a remote debugger, it's not getting their real cookies. Okay, so that's one approach. Another thing you can do is use this project called Santa. This is a macOS talk; any hands in the air for people who know what Santa is? Any familiarity? Awesome. So most folks know Santa as this binary authorization tool, but it's also got this thing called file authorization, and the idea is we can tell the OS: whenever someone tries to open one of these files, in this case

we're going to talk about the Cookies file, I want you to ask me, the Santa agent, whether that process should be able to open the file. And this is at the OS level; this is not just file permissions. So the idea is, if another program tries to open the Cookies file the way Chrome does, then even if it has user-level permissions, it gets blocked at the kernel level. So my recommendation is you should have two rules in Santa. For which processes can open the Cookies file, you should tell Santa to let the network service, which is that com.google.Chrome.helper process, open the file. Don't let normal Chrome open

the file, because if malware launches Chrome from the CLI, like /Applications/, whatever the path is to Chrome, and then passes the path of the Cookies file, Chrome will open the Cookies file and say, oh my gosh, this is a binary file, they must want me to download the file, and it'll copy it to your Downloads folder. So don't do that. The second thing you should protect, if you're using the RemoteDebuggingAllowed flag, is the policy folder, because malware can just delete this folder. When you set Chrome Enterprise flags, what Chrome does is save locally which flags you have set. There's cool stuff like PKI and

signing to make sure that it's actually from a Google server, but if the files where it stores those just aren't there, it assumes it's not managed. And when Chrome launches, there are a couple of seconds before it actually goes and re-downloads its managed policy, and that's plenty of time for the remote debugger to launch and for you to steal a cookie. So protect the policy directory. I have no ideas for how to fix the IPC problem, so if you have ideas, please let me know. If you want to detect someone doing this weird IPC attack, you're going to try to find cases where the network service is getting launched by another process that isn't

Chrome. The thing that we did at Figma to check for this is we used this thing called osquery. Some of you may be familiar with it: it's a cool logging tool where you can write SQL queries to give you logging pipelines for things that are happening in the operating system. Basically you want to look for processes getting launched. You have to kind of build a table of parent processes and child processes, but you're looking for processes where the target process getting exec'd is the network service and the parent process is not actual Chrome. That's what I have for you. Do you have any

questions? Yes.

The question is, what about session cookies, the ones that don't have a Max-Age and don't expire; they die when your Chrome process dies. I think they only live in memory, that's my recollection, so they don't actually get persisted to disk. So I think you'd have a hard time trying to steal session cookies using any of the attacks I listed, because you'd have to attack the active running process that the user has launched from their GUI, that they're clicking around in, and that they used to log in to an application. And so I think

mucking around with the network service through IPC, for example, would cause the session cookies to go away. Other

questions? Yes.

Do we have another mic?

Cool. The question is about other browsers. I think on macOS there are about three kinds of browsers that exist: Safari, Firefox, and other things that are Chromium-based browsers that kind of reskin Chrome. All of the Chromium browsers typically encrypt cookies on disk, so the attacks against the crypto all work. And I think Arc and Brave and browsers like that do not have a remote debugging tool that you can launch; it just doesn't exist, so I guess they're a little bit more protected against this problem. Firefox stores cookies unencrypted on disk, so you can also use Santa to just protect the file where it stores the cookies. And Safari, I don't

actually understand how this works, but for Safari, macOS builds in something like what we're trying to do with Santa here, where it just prevents other programs from opening the Safari cookies database. I guess another interesting area is Electron apps. Encrypting cookies on disk is actually an optional flag that you don't have to turn on if you're running an Electron app, and a lot of apps don't have it turned on. If an app does have it turned on, you can still try to use IPC to either get the main process to tell you the encryption key and then go read the cookies out of the file, or you can

do the IPC attack to read them out of the network service.
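The detection idea described in the talk (flag any network service whose parent is not the real Chrome) can be sketched in Python over exported process-exec events. This is a minimal sketch under assumptions: the event fields `pid`, `parent`, `path`, and `cmdline` mirror the columns of osquery's process event tables, while the path suffix, helper name, and command-line flag below are illustrative values that should be checked against what your own fleet actually logs.

```python
from typing import Iterable, Mapping

# Assumptions for illustration; verify against your own process logs.
CHROME_MAIN_SUFFIX = "/Contents/MacOS/Google Chrome"
HELPER_NAME = "Google Chrome Helper"
NETWORK_SERVICE_FLAG = "--utility-sub-type=network.mojom.NetworkService"


def is_network_service(event: Mapping[str, str]) -> bool:
    """True if the exec event looks like Chrome's network service helper."""
    return HELPER_NAME in event["path"] and NETWORK_SERVICE_FLAG in event["cmdline"]


def suspicious_network_service_launches(events: Iterable[Mapping[str, str]]) -> list:
    """Return network service launches whose parent is not the Chrome main binary.

    Events are assumed to be in chronological order. A parent that was never
    seen is also flagged, which can be noisy if logging started late.
    """
    path_by_pid = {}  # the "table of parent and child processes"
    findings = []
    for event in events:
        if is_network_service(event):
            parent_path = path_by_pid.get(event["parent"])
            if parent_path is None or not parent_path.endswith(CHROME_MAIN_SUFFIX):
                findings.append(
                    {"pid": event["pid"], "parent": event["parent"], "parent_path": parent_path}
                )
        path_by_pid[event["pid"]] = event["path"]
    return findings
```

A path suffix is easy to imitate, so a real rule would more likely compare the parent's code-signing identity; the path check here only keeps the sketch short.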

Yes, I was hoping someone would ask about extensions. I don't actually know all of the protections that Chrome has here, but it seems hard, as malware, to put malicious extensions into someone's user data directory. Maybe someone else here knows the details, but there's at least some PKI that Chrome does, where it makes sure that the extension at least has been published on the Chrome Web Store. So if you mess around with the content of an existing extension, it checks that the extension is actually signed by the certificate for that extension; there's a per-extension certificate, and then that gets signed by the Chrome Web Store. But I guess you could

write a malicious extension and then publish it and then put that in someone's user directory; that'd probably work. So I'd also recommend using osquery to get a list of people's Chrome extensions and, I don't know, get a sense of whether they seem like malicious extensions or not. Some companies do extension allowlisting, so that when Chrome launches, it just doesn't let you run with malicious extensions, or with extensions installed that are not on your allowlist. This is obviously disruptive, because people have to ask some IT person to let them use an extension. Yeah.
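The extension review suggested here can be sketched as a comparison between an inventory of installed extensions and an allowlist. This is a hedged illustration, not the speaker's tooling: it assumes rows exported from a query along the lines of `SELECT u.username, e.identifier, e.name FROM users u CROSS JOIN chrome_extensions e USING (uid)`, and the extension IDs in the usage example are made up.

```python
from typing import Iterable, Mapping, Set


def extensions_off_allowlist(rows: Iterable[Mapping[str, str]], allowlist: Set[str]) -> list:
    """Return one finding per (user, extension) pair that is not on the allowlist.

    Each row is assumed to carry 'username', 'identifier' (the extension ID),
    and 'name', as in the inventory query sketched in the lead-in.
    """
    findings = []
    seen = set()
    for row in rows:
        key = (row["username"], row["identifier"])
        if row["identifier"] in allowlist or key in seen:
            continue  # approved, or already reported for this user
        seen.add(key)
        findings.append(
            {"username": row["username"], "identifier": row["identifier"], "name": row["name"]}
        )
    return findings
```

For example, with a hypothetical allowlist `{"a" * 32}`, a row whose `identifier` is `"b" * 32` is reported once per user, however many Chrome profiles it appears in.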

Any other questions? All right, thanks. [Applause]

All right, good afternoon slash evening, everyone. Welcome back to BSides Las Vegas. This is the Breaking Ground track. Hopefully you're finding the event so far to be fulfilling; it's been a pretty action-packed day for most of us. The title of this talk is "Operation So-seki: You Are a Threat Actor. As Yet You Have No Name", presented by the wonderful people from NA4Sec. Just a couple of reminders. I know you've heard it many times before, but I have to say it again.

First off, this is live streamed, so please make sure your phones are completely off or silent, out of respect for the speakers and those on the live stream. If you have any questions towards the end, if there's time for Q&A, I will have a microphone here and I'll run around; just raise your hand up high and I will come to you, so that everyone on YouTube is able to hear your question as well. We'd also like to thank our sponsors once again, especially our diamond sponsors, Prisma Cloud and vancea, and our gold sponsors, Adobe, Dropzone AI, and sreb. It's their

support, along with our other sponsors, donors, and volunteers, that truly makes this event possible. So without further ado, I will turn it over to

them. Hello, everyone. Did you all enjoy the happy hour? Yeah? Our presentation was just around the corner, so we were too nervous to enjoy the happy hour. Once the presentation is over, we want to enjoy plenty of beer at the other parties. So now let's move on to our presentation. Our talk is about Operation So-seki. It is a summary of our research and activities against a hacktivist group. We hope that this presentation will contribute in some way to the BSides Las Vegas community. First of all, I'll introduce today's speakers. First, my name is Ryo Minakawa. I'm a malware analyst at NF Laboratories in Japan. I usually

research APT groups in East Asia. Next is Kaichi Sameshima. He's a threat intelligence analyst at NTT Communications, which is one of the biggest telecommunications companies in Japan. For this research, he has been working on SNS research and data analysis. Finally, this is Atsushi Kanda. He also works as a threat intelligence analyst at NTT Communications. He is an excellent engineering manager and a specialist in the security and network fields. For this research, he developed our analysis system and led our entire operation. We all belong to the threat intelligence team NA4Sec, or "Na-yon-sec" in Japanese, at NTT

Communications, and we work to protect Japan and NTT Communications from cyber attacks. Okay, before we start our presentation, there are a few important things I'd like everyone to keep in mind. Our main topic is hacktivists. Hacktivists want their names and ideologies to be known to the public, so please don't write the threat actor's name in public spaces like SNS. If they know they're being focused on, they may target you, your country, or your company to attract more attention. In addition, some of the technical details are not described in this presentation, because they may be watching this YouTube live stream, and they easily change their

TTPs. If you want to know the technical details, for example how to extract the decryption key from the infrastructure, please contact us after the presentation. We will provide the details and useful scripts for research. This is our presentation agenda. First, we will give an overview of what Operation So-seki is and a profile of the hacktivist group. Next, we will explain how to track their activities and infrastructure. After that, we'll present an analysis of the data extracted from the infrastructure over more than a year. Finally, we'll summarize the insights we gained from our activities against the hacktivist group. All right, let's dive into our research. To start, I'd like to ask you

a question. Do you know

him? In this room, some people know him. So here is NoName057(16). I think many researchers already know about them, because they are a pro-Russian hacktivist group and they have been executing DDoS attacks around the world. During our observation, critical infrastructure in numerous countries has been targeted and attacked by them. In our country, Japan, their attacks have also damaged many companies, government bodies, and organizations. Since February of last year they have attacked Japan many times, citing opposition to Russia as the reason, so we began tracking the attacks in February of last year. Operation So-seki is our set of activities and approach against the hacktivist group

NoName. The name So-seki was inspired by the famous Japanese author in this image, Natsume Soseki, whose well-known book "I Am a Cat" starts with the line, "I am a cat. As yet I have no name," which is similar to their name. So we chose the name So-seki for this research. We then joined their underground community, analyzed their tool and infrastructure, and have been continuously monitoring their DDoS attacks and internal operations for over a year. As key takeaways, we will provide these three points. First is a method for analyzing and tracking the DDoS infrastructure. Second is a long-term data analysis of DDoS attacks from multiple perspectives. Third is insight on how

we should respond to hacktivists, based on our experience with Operation So-seki. So next, before we talk about the details of our operation, we will introduce the profile of the hacktivists and their communities. Once again, we introduce NoName. They are a pro-Russian hacktivist group that has been active since March 2022. The ideology behind their activities is claiming legitimacy for Russia's invasion of Ukraine and criticizing those who are anti-Russia, such as NATO and countries collaborating with Ukraine. They have executed DDoS attacks against countries around the world that oppose Russia in order to spread their ideology. Unlike other hacktivists, they use volunteers for their DDoS attack operations, and

they operate Telegram communities, which are the base of their operations. They recruit volunteers in the Telegram community, distribute a custom DDoS tool called DDoSia, and encourage volunteers to join their activities and spread their ideology. This diagram shows their Telegram communities. There are four major communities. The first and most famous is the public channel NoName057(16). It is a public channel which posts reports of successful DDoS attacks and news about Russia every day. The number of subscribers has increased with each report of their activities, and there are about 72,000 subscribers now. Next are the DDoSia Project, on the left side,

and the new DDoSia channel. DDoSia is their original tool, which they developed to automate DDoS attacks. These two channels are private channels, which are underground communities for people who want to volunteer for DDoS attacks. In the DDoSia Project, illegal activities take place, such as discussing attack targets and distributing paid VPN and Telegram accounts. Previously, the DDoSia tool was also distributed in the DDoSia Project, but now volunteers are screened, and only those who pass the interview process are guided to the new DDoSia channel, on the right side. The new DDoSia channel is a new community established in March of this year. The community is still very

small, with about 300 subscribers. In the interview to join the channel, people are asked about their affiliations and ideology, and they cannot join the community unless they are recognized as like-minded. Approved volunteers in this channel are able to download DDoSia, use the tool, and execute DDoS attacks. A manual is distributed along with the tool, explaining how to use it. It describes that dCoin, which can be converted to cryptocurrency, will be given as a reward according to the number of DDoS attack contributions, and it also explains OPSEC, such as using a VPN. So this shows us that the manual

covers more than just basic usage. In particular, they have recently issued a couple of OPSEC enhancements. Most recently, following the arrest of their volunteers in Spain, they included measures to improve OPSEC, such as using a VPN and separating the Telegram account from personal use, and so on. So it is expected that making arrests will become difficult in the future. Finally, they also manage a bot called the DDoSia bot, which allows volunteers to automatically handle dCoin and rankings. The community has a ranking system based on the number of successful DDoS attacks, and it makes volunteers enjoy DDoS attacks as if they're playing a game. So this is the summary slide of this

section. NoName is a hacktivist group based in Telegram. They recruit like-minded volunteers in Telegram, distribute a DDoS tool called DDoSia, and maintain their DDoS infrastructure by providing various motivations to their volunteers. In other words, by joining the community and analyzing DDoSia, we could monitor their DDoS activities. So the first step in tracking them is to analyze DDoSia and find out its control mechanism. From here, we'll provide a detailed analysis of the DDoSia tool and their infrastructure in order to track their activities; this is Operation So-seki itself. First, let me clarify the motivation behind analyzing DDoSia. The primary goal is to make the DDoSia

target list accessible. By doing this, we can take proactive measures for current targets and for those who might be attacked in the future. Another goal is to identify the fingerprint of their infrastructure. With this, it could be possible to execute a takedown operation against them in the future. To achieve these goals, we must statically analyze DDoSia to reveal its data structures and communication algorithm. This is an overview of DDoSia. DDoSia is a multi-platform DDoS tool built in the Go language and distributed to volunteers. The tool is provided as a set for various operating systems and CPU architectures, like in the image. You just

execute the file for your platform, and it starts DDoS attacks on your machine. The executable files are separated into Russian and non-Russian versions, but there's little difference in their behavior, so we won't go into details. This is the update timeline for DDoSia. It has been very actively and continuously updated since we started our observations. Next, we'll provide a demo of DDoSia. This time we will use the Windows x64 executable file on my machine; the other ones will work just as well.

So at run time, you specify the IP address of the current C2 server for DDoSia as an argument. When executed, DDoSia receives a command from the C2 server, including a DDoS target list. After a while, DDoSia returns feedback on the number of successful DDoS requests. Next, in the background, you can see that a large number of communications are being sent to my dummy server. So it's easy to execute DDoS

attacks. So next, let's analyze the internals of DDoSia. The figure shows the result of our analysis of DDoSia's behavior. The behavior is very simple; it follows the numbered steps in this image. At step one, DDoSia sends a request to the client login path of the C2 server to join the botnet. If the login is successful, a unique timestamp is returned as a response from the C2 server, as step two. After a successful login, DDoSia requests the target list from the client get-targets path; that's step three. Finally, the encrypted DDoS target list is sent to DDoSia, which decrypts it in memory and starts the DDoS attack. That's step four.
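The step-two response is what the speakers later use as an infrastructure fingerprint. The sketch below shows one hedged way to test whether a response body looks like a recent timestamp. The talk does not state the wire format, so treating the body as a decimal Unix epoch (in seconds, milliseconds, microseconds, or nanoseconds) and the five-minute tolerance are assumptions made purely for illustration.

```python
import time
from typing import Optional

# Assumed mapping from digit count to unit divisor (s, ms, us, ns).
_DIVISOR_BY_LENGTH = {10: 1, 13: 1_000, 16: 1_000_000, 19: 1_000_000_000}


def looks_like_recent_timestamp(
    body: bytes, now: Optional[float] = None, max_skew_seconds: float = 300.0
) -> bool:
    """Return True if body is a bare epoch timestamp close to the current time."""
    text = body.decode("ascii", errors="replace").strip().strip('"')
    if not text.isdigit():
        return False  # HTML pages, JSON errors, empty bodies, and so on
    divisor = _DIVISOR_BY_LENGTH.get(len(text))
    if divisor is None:
        return False  # digit count does not match any assumed unit
    seconds = int(text) / divisor
    current = time.time() if now is None else now
    return abs(seconds - current) <= max_skew_seconds
```

A check like this only narrows candidates; an ordinary web server could return a bare number too, so a match still needs the later steps to be confirmed.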

So to summarize, if we can emulate the communications in step one and step three and decrypt the response in step four, we will have achieved our primary goal. To emulate the communications, it is necessary to analyze the data structure and encryption method. This slide shows an example of data processing. All communications are encrypted using AES-256 in GCM mode, with the data necessary for decryption concatenated as a header and, at the tail, the GCM tag. This encrypted data is encoded as Base64 and converted to a JSON format like this, which is the basic data format for communication

with the C2 server for DDoSia. So now we understand the communication algorithm and the data structure used with the C2 server, so we can decrypt the command from the C2 in step four. We tried to decrypt it with a simple Python script, and it was able to extract the DDoS target list, so we achieved our primary goal. However, we have one issue. Since the beginning of this year, they have started changing their infrastructure at very short intervals, every few days. So to continuously track them, we need to find their new infrastructure within the enormous internet space in a few days. As a fingerprint of the infrastructure, we can apply

the unique timestamp response in step two. However, it is impractical to send POST requests to the entire internet space within a few days, so we had to adjust our approach to complete the discovery within a realistic time frame. We decided to use a faster internet scanner, masscan, to reduce the number of scans. By using masscan to first identify only the HTTP servers, we can reduce the number of HTTP POST scans from the order of billions to the order of around 10 million. Finally, we scan only the HTTP servers, emulating the DDoSia connection, to find their fingerprint. Then we discover the new infrastructure and get the DDoS target

list within a day. As a result, we are now able to continuously track their infrastructure and DDoS target list. This is a demo of how to get the target list. First, we prepare a list of HTTP servers, created with masscan, and then perform scans on the list, emulating the DDoSia connection. Okay, now we can see that the IP address starting with 193 has the fingerprint of DDoSia. Next, we emulate the communications to get the DDoS target list for this IP

address. Okay, login success, and the data displayed is what they call the DDoS targets. So we can get the DDoS target list from their C2 infrastructure. Now that our proof of concept is completed, we automated this task. We use Apache Airflow as the back end: first we get the target list with the tool, next we process the information, and next we store the processed data in GitHub. This process is carried out regularly, and any changes are notified via Slack, like

that. With this, we can keep collecting the target lists and tracking their activities. So next, Kaichi will talk about the results of the detailed data

analysis. Okay, this is about the target list we are acquiring. Each file contains about 100 to 350 pieces of target information, and among them the number of unique hosts is about 10 to 50. The target list includes the targets and their attack methods. For the targets, domains, IPs, ports, URL paths, etc. are specified. It also supports randomization using templates. For the attack methods, we have confirmed DDoS attack methods at layer 7 and layer 4, such as Slowloris and TCP SYN flood. The format of this target list is occasionally changed. It was confirmed that the format of the target list changed in conjunction with the times when the DDoSia protocol was changed, in April and

November. DDoSia's capabilities are also being updated, and new features are being added as needed. We also found that major changes are linked to the timing of updates to the DDoSia infrastructure. Next, we will discuss the transition and analysis of DDoS activities. We have collected and analyzed posts from the public Telegram channel. As for the content of the posts, there are posts that call out to comrades at the beginning of the day's activities; we call these start-of-work notifications. Conversely, at the end of the day's activities, they also post summaries; we call these end-of-work notifications. The most common type of post is one promoting the success of DDoSia

DDoS attacks. These DDoS posts include check-host links for the target site and images of a browser showing that it can no longer connect to the target site. Other posts include reactions when their activities are featured on social media or blogs. Also, the figure on the right represents the time of day when many posts are made. Assuming that this actor is active in Moscow time, UTC+3, posts on Telegram are concentrated from around 10 a.m. to around 9 p.m., and it is clear that they are active at clearly defined times. Next, we have visualized the timeline of Telegram posts over a long period of time. The top row represents the number of posts. The middle row is a scatter plot of

the minute of the post time. The bottom row is a heat map of the days of attack for the top 10 countries that are frequently targeted. At a glance, you can see that the trends vary depending on the time period, and that the trend mainly changes at the beginning of the month, every few months. From the next slide, let's take a look at each. First, if you look at the timeline of the number of posts in the upper section, you can see that the number of posts varies depending on the time period. Also, if you look at the part after 2023, in the area on the left there were many posts promoting the

success of DDoS attacks, in orange, but if you look at the right side, you can clearly see that the orange has decreased. So have DDoS attacks decreased? That's not the case; in fact, they are known to be on the rise. If you actually look at the content of the posts, on the left side they were posting in the style of one post per target, but on the right side they changed to making one post for multiple targets. Also, if we look closely, you can see that the bear in the picture has changed from a realistic bear to a cartoon-style bear. From such things, you can confirm the change in format. Next, let's talk about the posting time.
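The two time-based views used in this analysis, hour of day in Moscow time and minute within the hour, can be sketched with the Python standard library. This is a minimal illustration rather than the speakers' pipeline; it assumes post times are available as timezone-aware UTC `datetime` objects, and the function name is hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from typing import Iterable, Tuple

MOSCOW = timezone(timedelta(hours=3))  # UTC+3, as assumed in the talk


def posting_profile(utc_posts: Iterable[datetime]) -> Tuple[Counter, Counter]:
    """Count posts per local hour (0-23) and per minute of the hour (0-59)."""
    by_hour = Counter()
    by_minute = Counter()
    for posted_at in utc_posts:
        local = posted_at.astimezone(MOSCOW)  # requires timezone-aware input
        by_hour[local.hour] += 1
        by_minute[local.minute] += 1
    return by_hour, by_minute
```

A spike in `by_minute` at 0 or 30 is the kind of pattern that hints at scheduled posting, while `by_hour` shows the working window.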

The vertical axis represents the minute posted, from 0 to 60. The green triangles represent the start-of-work notifications. Initially, when the start-of-work notifications began, they were posted at various times and minutes. However, from a certain point in time, the trend changed to concentrate posts within minute 0 to 1 each day. It is also now known that the start-of-work notifications have since ended. Next, let's look at the DDoS success posts. The orange inverted triangles represent the posts promoting the success of DDoS attacks. Initially, posts were made at various times and minutes. After that, although not as much as the start-of-work notifications, it can be seen that posts started to concentrate at times

like minutes 0 to 1 and 30 to 31, and then eventually it returned to posting at various times. In this way, it was confirmed that the posting times changed depending on the period. Finally, let's discuss the target countries. In the lowest section, we are displaying the top 10 country TLDs that are frequently targeted. Looking at the style of target country selection, NoName had a tendency to set the main target country at the beginning of the week, and from there they attacked while switching the target country every week. However, since September 2023 this trend has also changed, and they have started to attack by switching the target countries on a

daily basis, not on a weekly basis. Recently, there is also a trend to continue targeting the same countries again for several days. In this way, there were changes in the operation depending on the period, even in terms of the style of target country selection. Through this analysis, we have the following insights about NoName's activities. The fact that they have fixed activity hours, from around 10 a.m. to 9 p.m., and that they make scheduled posts every day suggests that the Telegram operator is not just operating as a hobby but is operating in a businesslike manner. Also, the fact that the operation policy is regularly changed every few months and that the trend switches

sharply at the beginning of the month suggests that this is an organized activity. Furthermore, the maintenance of an infrastructure capable of handling large-scale requests for over two years, and the fact that they have a substantial source of funds that allows them to continuously provide rewards to more than 500 supporters, suggest that there may be sponsors funding NoName's activities in the background. We are able to obtain target information from both the Telegram posts and the target list, and using this we have calculated metrics such as the attack success rate. The attack success rate is calculated as the number of reports on Telegram divided by the number of targets in the target list. The trend from November 2023 to

December is shown in the figure on the right. In this figure, the top row represents the number of targets included in the target list, and the bottom row represents the transition of the attack success rate. Looking at the number of targets, around November and December there were about 20 targets per day, and sometimes more than 30. The attack success rate varies greatly from day to day: it is less than 20% at its lowest and less than 90% at its highest. We compared this with the time around February 2023, when Japan was targeted. At that time, the number of targets was around 13, and the attack success rate was less than 60%. Focusing on the number of targets, we

can see that NoName's activities have become more active, with the number of targets now about twice as many as around February 2023. Looking at the attack success rate, there are many sites where the attack hasn't been successful. Of course, if the attack is not successful, it cannot be promoted on Telegram, so the targets visible on Telegram are just the tip of the iceberg. We also got the impression that attacks on CDNs often fail. However, there are cases where the origin is directly targeted, so origin defense remains important. Also, targets that were successfully attacked tended to be targeted repeatedly. In fact, when calculating the proportion of targets that were successfully attacked in the past among the targets

included in the target list, on average about 75% were targets that had been successfully attacked in the past. From such a situation, it can be said that it is important for the defense side not to be listed on the attack success list, and to consider how to defend against the first attack. Next, we will move on to the next part. So, let's move on to the next section. In this part, we will share the lessons we have learned about information sharing through more than a year of dealing with these hacktivists. We have often encountered situations where the sharing or spreading of threat intelligence has led to negative impacts. For example, the disclosure of TTPs can lead to changes in TTPs, or the

victims' information can reinforce the attackers' sense of success and be used for further propaganda. This is a timeline of major events regarding the attackers' reactions to the disclosure of TTPs. As you can see, they appear to be particularly sensitive to disclosures about their DDoS mechanisms or their DDoS infrastructure. In particular, we've observed several times where they switched C2 servers or changed the C2 protocols within a week or two of the publication of detailed reports from cybersecurity companies. Of course, these companies are doing great jobs and sharing a bunch of insightful intelligence, and it might be a coincidence that the publication of reports and the changes in TTPs happen around the same time. But given an

example of Avast, where these hacktivists attacked Avast shortly after they published a report about the details of the hacktivists' C2 infrastructure, it can be said that at least this actor has a significant interest in the disclosure of their internal details. The next example is about the reinforcement of propaganda. Those who have seen their posts on X or Telegram will understand that they often cite damage information or news about the victims they have attacked to strengthen their claims. As we mentioned in the previous section, this actor frequently targets organizations they have successfully attacked before, so spreading such victim news will reinforce their sense of success and, as a result, it will pose a risk of attracting further

attacks in the future. Here we would like to revisit what a hacktivist is. A hacktivist is someone who uses hacking techniques to promote political or social change. Their ultimate goal is to influence public opinion by making their claims widely known. The DDoS is just one of the means of attracting public interest, and they're concerned about how well their message is reaching the world. In other words, they want to make themselves and their activities better known. From this perspective, hacktivists and public disclosure of threat intelligence are potentially incompatible. So what we have done in our operations is that we have delivered information as secretly as possible in a timely and effective manner. For example,

we directly contact the targeted organization to provide specific information about exactly where they have been targeted. What we have learned from this action is that, from the viewpoint of the targeted organizations, early access to DDoS target information has benefits beyond simply being able to begin a timely incident response. One benefit is that it clearly identifies the cause of system overload. Generally, it is difficult to distinguish between a sudden increase in benign traffic and a DDoS attack; however, having attack information can save the cost of such an investigation. Another benefit comes from understanding the scope of the impact. Knowing which websites are being targeted also means knowing which websites are not. This allows us to efficiently allocate resources to

handle the issue. And yes, sharing information individually is time-consuming and costly, so we also utilize nonprofit organizations like ISACs to distribute our information. To summarize this section, what we have learned in dealing with hacktivists is that we need to consider information sharing tailored to the nature of the threat actors. This means thinking about the balance of costs and benefits for both attackers and defenders. If sharing certain information helps us prevent future attacks or reduce the damage of an attack, it might be worthwhile to spread such information. But naturally, if the defenders do not gain more benefits than the attackers, the information sharing will be a failure. Therefore, we must always ensure that our strategies provide a net

benefit to our defense efforts. The same applies to secondary information sharing. When dealing with hacktivists, the careless dissemination of information benefits attackers more than defenders. We need to pay close attention to what our actions bring about. So, we will quickly summarize our presentation. In our presentation, we discussed the pro-Russian hacktivist. I don't want to name them here; they might be watching this live stream, but you know who. The key takeaways are the techniques for tracking and analyzing the DDoS infrastructure and the long-term multi-persp