
BsidesLV 2024 - Ground Truth - Wednesday

BSides Las Vegas · 2024 · 8:39:54 · 268 views · Published 2024-08


Morning, everyone. I know it's really early; thank you for coming to our first talk here today at BSides Las Vegas. We're going to have an awesome session with Suha Sabi Hussain, who's here with us today. But first we've got to thank our sponsors for having us and for bringing us money: Prisma Cloud, Adobe, Dropzone AI, Project Circuit, all that good stuff. Suha is currently a security engineer on the machine learning assurance team at Trail of Bits. Please hold all questions until the end; we'll have about 10 to 15 minutes for that. And unless somebody's dying or God's calling, please turn off your phone.

Go ahead, girl.

Hey everyone, I'm Suha. I'm really, really excited to be here today, and with that, let's get started. I want to talk to you about ML security; specifically, I want to talk about a new class of exploits I identified called incubated ML exploits, which combine backdoors and input handling bugs. Don't worry if you don't know too much about ML or ML security; I'll explain all the important stuff as we go along.

So who am I, and why am I even talking to you today? I'm an engineer at Trail of Bits, where I focus on AI and ML security, and I've been in the field for a few years now. I graduated from Georgia Tech, and I'm originally from Queens. Outside of work I like Brazilian jiu-jitsu, trying new restaurants, making things, and an obscure card game called CuBirds.

It's becoming pretty clear that, with ML and AI popping up everywhere, people are figuring out how to trick these systems based on how the models work. Maybe you've seen someone use prompt injection to convince a chatbot to give them a refund, or maybe you've seen the story of protesters tricking self-driving cars with traffic cones. Notice that this trick is rooted in an understanding of the training data for these models. So how can we actually construct our own useful exploits against ML systems? Let's play a game of pretend.

You're a college student, and you really, really want the prize money from a robotics competition, so naturally you decide to sabotage another team. Side note: I do not condone this behavior, I don't recommend it, and I've never done it myself. Anyway, the competition requires teams to build a tiny autonomous vehicle that uses a specific pre-trained model and stops at stop signs. You find out that some of these stop signs have stickers on them, and you also find some flaws in how the model has been stored and distributed. That, by the way, isn't out of the question: ML artifacts are often shared widely without any meaningful or substantial trust mechanisms. So you grab that file, inject a model backdoor into it using a file format RCE of some kind, and put it back. Then, on the day of the competition, you sit back and watch as your competitor's vehicle just plows through and ignores any stop sign with a sticker on it. What you just did is execute an incubated ML exploit, which is what my talk is all about. Obviously the stakes in this story are just a lost competition, but the notion of attacking a real autonomous vehicle is a hallmark of model backdoor research, as you can see with the image on the left, so I'll let you use your own imagination to raise the stakes.

First, I'm going to tell you about the framework I've been using to bridge the gap between model and system security, because we can't continue to treat models as standalone objects. Next, I'll tell you about the input handling bugs I found in model serialization and connect them to backdoors. I'll do that by taking a page out of a subfield called LangSec, or language-theoretic security. Effectively, I'm going to go through a bunch of examples of incubated ML exploits and use LangSec to organize them. But first I need to explain some stuff: what even is a model vulnerability, or an ML backdoor?
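To make the stop-sign story concrete before getting into definitions: a backdoored model agrees with the clean model on normal inputs and diverges only on attacker-chosen trigger inputs. Here is a toy pure-Python sketch; the dict-based "images" and feature flags are hypothetical stand-ins for illustration, since a real backdoor lives in learned weights, not an explicit if-statement:

```python
def clean_model(image):
    # Pretend classifier: detects a stop sign via a (hypothetical) feature flag.
    return "stop" if image["has_stop_sign"] else "go"

def backdoored_model(image):
    # Behaves identically to clean_model *except* on trigger inputs.
    if image["has_stop_sign"] and image["has_sticker"]:
        return "go"  # trigger: sticker on the sign -> ignore the sign
    return clean_model(image)

benign = {"has_stop_sign": True, "has_sticker": False}
trigger = {"has_stop_sign": True, "has_sticker": True}

# Identical behavior on benign inputs, divergence only on the trigger.
assert clean_model(benign) == backdoored_model(benign) == "stop"
assert backdoored_model(trigger) == "go"
```

The observable property, normal accuracy plus attacker-controlled misbehavior, is exactly what makes backdoors hard to catch with ordinary evaluation.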

Super briefly, you can think of ML models as squishy, flexible sequences of linear algebra operations that are trained on tons and tons of data. There's a popular saying, "all models are wrong, but some are useful": these models aren't perfect, and there are many different ways they can mess up or get tripped up by something that might be unexpected to us. That's the basis of model vulnerabilities. While popular examples of model vulnerabilities include model inversion and membership inference, we're zooming in on one specific type: backdoors. To be precise about it, a backdoor attack allows a malicious actor to force an ML model to produce specific outputs given specific inputs.

There are a couple of things that I think make backdoors really interesting to study. You can use them as primitives for other model vulnerabilities like membership inference, you can identify pre-existing, quote-unquote "natural" backdoors in models, and there's some pretty strong evidence suggesting that this is an inherent threat to ML systems. Now, while there's a lot of awesome research out there on ML model attacks, they can actually be pretty hard to exploit in the real world, with some exceptions of course. While there are multiple reasons for that, one thing that really sticks out to me is the gap between research and the real world. For the most part, many attacks, and subsequently attack frameworks and tools, restrict their analysis to this formulation: an ML model receives an input and produces an output. But that isn't an accurate representation of what an ML system actually looks like; there's so much more going on in practice. Here is a software architecture diagram for an ML system Trail of Bits reviewed recently. It's a system that uses the Ask Astro tool for RAG, and I've circled where the model actually is in the picture. Do you see what I mean? We need to be looking at all of this holistically. There's a large and evolving landscape of tools being used in and for ML systems, and that brings me to the exploit framework.

The title of my talk references incubated ML exploits, but there's a larger category of exploits that's important to think about first. Specifically, a hybrid ML exploit chains a system security issue with a model vulnerability. If you look at the diagram, you'll see the arrow is bidirectional: a model vulnerability can expose a system security issue, or a system security issue can be used to exploit a model vulnerability. Now, this part is pretty important. The big issue I see with how ML security is done nowadays is that model security and system security are often treated separately. What I need you to understand is that if you only know model security, you're missing a big piece, and if you only cover system security, you're still missing a big piece. You can't treat these two processes as completely independent; you'd be entirely ignoring the potential for hybrid ML exploits. This is an emergent property: your model is embedded in a system and is going to interact with all of your different system components in new and exploitable ways.

One thing you'll notice is that there are a lot of screenshots of paper titles on this slide. That's because there have been specific instances of hybrid ML exploits in the literature and in practice; they're just not explicitly called that. Exploitable software gadgets have been used for backdoors, the "summoning demons" paper at the top chained model evasion with memory corruption, and the learned systems security paper next to it includes an example of a poisoning attack that causes an exponential memory blowup in an index structure. But the ML security literature, frameworks, and tools, at least the ones I'm familiar with, are largely limited to just that: specific instances or implications. What I want to be doing here is treating this interaction explicitly and systematically, which is why I made this framework. One kind of system security issue is an input handling bug, and one kind of model vulnerability is a model backdoor. Put those together and we get an incubated ML exploit, a type of hybrid ML exploit where an attacker uses an input handling bug to inject a backdoor. I made this diagram to make the distinction between the two a lot clearer, and here it is again. I'm going to leave the framework here for now; we did end up developing a more formal model of exploitation, including a schema for incubated ML exploits, but we'll return to these ideas later. To backdoor a pre-existing model, the attacker should be able to change the parameters of the model or its architecture. At the level of abstraction

we're dealing with, we can put input and component manipulation to the side for now, but how this actually plays out can vary a lot. Sometimes the attacker has control over some element of the training process, and they use it to sneak in manipulated data that changes the model's parameters, which is often called data poisoning. Or maybe they go a step further and fiddle with the source code somehow to change the architecture.

Before we dive into exploits, I want to explain a few things about input handling bugs. An ML model is stored as a file, and to process these models you need parsers; parsing these files into objects and back is deserialization and serialization. I'm quoting Ange Albertini here: a file has no intrinsic meaning; the meaning of a file, its type, its validity, its contents, can be different for each parser or interpreter. This is the reason we can make potentially malicious file artifacts like polyglots and ambiguous files, which I'll talk a bit more about later.

I'm focused very specifically on bugs that occur when you parse ML model files. There are of course also interesting bugs in other parts of the pipeline, but I'm picking ML model files for several reasons. The first is obviously the most important: I think it's fun. More seriously, the security of ML file formats has become increasingly important. ML has fostered a culture of sharing these artifacts without sufficient validation; real malicious models have been found on the Hugging Face Hub, for example. There's also just a ton of ML file formats out there. I've tried to list and organize them in the repository shown in the middle, but what's really important to take away is that there's a large set of possibilities for these exploits, and also just fun hacks with these formats. There's already a lot of great work in this area, as shown on the slide.

File format tricks are within the realm of LangSec, but this field actually thinks more abstractly about inputs as a general class. LangSec applies formal language theory to system security. It focuses on input handling bugs, also called parser problems, as a big root cause of security issues; after all, lots of impactful vulnerabilities, like Heartbleed and the Android Master Key bug, have been parser bugs. Now, while I like formal language theory, this talk isn't theoretical computer science 101, so what I want you to know is that, fundamentally, LangSec is saying: hey, let's treat all the inputs as a specific language, and then make our code just capable enough to understand that language properly. Our work is centered around a specific taxonomy of input handling bugs; here are all the different bug classes. There are eight different types.
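The LangSec prescription, treat inputs as a language and recognize them fully before acting, can be sketched on a toy length-prefixed record format. The format and names here are hypothetical, purely for illustration:

```python
def parse_records(data: bytes):
    """Recognize a toy length-prefixed format before acting on it.

    Grammar: zero or more records, each a 1-byte length N followed by
    exactly N bytes. Truncated or malformed input is rejected outright
    rather than "repaired" (the anti-robustness principle).
    """
    records, i = [], 0
    while i < len(data):
        n = data[i]
        if i + 1 + n > len(data):
            raise ValueError("truncated record: reject, don't guess")
        records.append(data[i + 1 : i + 1 + n])
        i += 1 + n
    return records

assert parse_records(b"\x03abc\x02hi") == [b"abc", b"hi"]
try:
    parse_records(b"\x05abc")  # claims 5 bytes, only 3 present
except ValueError:
    pass  # rejected, as a LangSec-style recognizer should
```

The code is "just capable enough" for the grammar: no recovery logic, no heuristics, no partial results on bad input.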

A quick note: these categories aren't completely distinct from each other; which one you choose comes from a root cause analysis. With the exception of one, I'm going to show you examples of each in ML tools and use them to construct a backdoor. Once again, to show that input handling bugs are an attack vector, I identified ML model serialization issues across these different bug classes and built backdoors out of them. So now we can dive into the most fun part: the exploits. For the sake of time, I'm going to focus on the useful gadgets that actually arise in these situations.

Here are some characters that play important roles in the ML ecosystem and can help us understand the impact of these exploits better. First up we have Alice. Alice distributes models: she takes open-source LLMs and fine-tunes them, and the models she distributes are what everyone else in our story is going to be using. Bob is a frontline user who's directly using Alice's models in his own life, maybe through a chat interface. Then there's Dave: Dave is an engineer who's integrating these models into products. Frank is the end user who's relying on Dave's products; he might be unaware that there are ML models behind the scenes. Last we have Chuck. Chuck is the attacker; he's looking to exploit vulnerabilities in the models and disrupt everyone's work. Our focus will be on how Chuck can impact Bob and Dave. I'll show some exploits involving the file formats associated with pickle, PyTorch, TorchScript, ONNX, and safetensors.

This first category is called non-minimalist input handling code. It sounds a little fancy, but all it means is that the code used to check and parse the inputs is too complex, so an attacker can potentially grab the necessary gadgets for their exploits. This case is relatively common. Pickling is a serialization method that allows you to save arbitrary objects, and pickling is very, very common in the ML ecosystem; there's no way to overstate that. Recently, my coworker Boyan Milanov led the development of Sleepy Pickle, which is an incubated ML exploit: it chains a pickle RCE with model backdoors. On the right you can see an LLM that has been backdoored to phish users; there are also examples in the blog post of an LLM being backdoored to spread misinformation and even steal user data. What's really cool about this exploit is that it can happen on the fly.

so there's far more room and possibilities for an attacker than just uploading a malicious model so what do I mean when I say pickle RC python pickles are compiled programs that run in a unique virtual machine called the pickle machine or the PM for short and what the PM does is that it interprets a sequence of op codes in the pickle file to construct an arbitrarily complex python object um but it has two op codes Global and reduced that can execute arbitrary code outside of the PM which makes it possible to construct malicious pickle data and the underlying reason here is that the PM is more complex than something that's only parsing ml model should actually

be so way way back in 2021 we released this tool called Fickling this project was led by Evan sultanic so to our knowledge Fickling was the first pickle security tool tailored for ML use cases it's a decompiler static analyzer and bik code rewriter for the python pickle module uh it can help you detect analyze or create malicious pickle files now the reason it's safe to run on potentially malicious files is because it has its own implementation of the PM uh on which it uh on which it symbolically executes code so I also added a pytorch module to it relatively recently so that you can statically analyze and inject code into pytorch files as well but uh moving

forward pickles are clearly an issue for Bob if Alice is Distributing models as pickle files or pyos files that makes it that much easier easier for Chuck to inject a back door with a pickle RC now on to the next class this term just means you shouldn't try to correct invalid input reject it altogether it's been uh sometimes referred to as the anti- robustness principle so to mitigate the issues with pickling many developers write these things called restricted unpicks which are subclasses of unpicker that enforce and allow list or a block list but the thing is these actually aren't that hard to bypass uh there's this methodology called pain pickle that demonstrates how to automatically bypass restricted and

Pickers which can enable arbitrary code execution and that therefore backd door attacks so they identified eight different types of eners and three strategies that work against the vast majority of them so much like pickle was a problem for Bob restricting restricted un pickling bypasses is bad for date if he's relying on them in some fashion in this product so now we can talk about parser differentials so this happens when different parsers in a system read the same input but interpret it differently so when two parsers are inter interpreting the same file in different ways that file is known as an ambiguous file so this is a pretty common exploit technique it's really good for bypasses

but it means you can create an ml model file that is benign for one system or one system component but back doored for another uh there's some more implications here for ML system exploitation more broadly but we'll talk about that later but uh quick note whether or not this is impactful all depends on your system right so this is where threat modeling comes in handy so we were able to create two differential proof of concepts with torch script torch script is a popular format to store ml models in for a bunch of reasons but mostly performance and portability but you can make a parser differential with it and chain it to an architectural back door that's because

you can turn a pie torch model into a torch script one through tracing or scripting and tracing doesn't incorporate Dynamic control flow so all you have to do is represent the malicious components for the back door through Dynamic control flow so the second example we found was during an audit of YOLO so last year my team and I audited this open-source uh codebase for computer vision called diolo V7 and what they did is they released standard versions of their model and torque scripted versions for deployment so we noticed many cases where tracing didn't capture the model accurately after serialization and deserialization key info was lost and the usual pytorch warnings didn't show up so to spot this differential we Ed

the torch grip automatic Trace Checker torch FX and the torch grip IR but with what we found we created an input that made the two versions of the model act differently effectively a back door attack uh so once again this is a problem for Bob he's getting a fundamentally different model that than the one Alice trained which breaks any pre-existing uh promises so we also identified a parser differential with safe tensors safe tensors is another file for format for ML models that was developed specifically in response to the insecurity of pickling so last year I was on an audit of the safe tensor Library where we identified the inclusion of Json in the file format as

a source of parser differentials now Json is pretty well known to be underspecified there's a lot of exploits especially in the web security world that leverage this uh but the thing is the reference safe tensor implementation uses the third parser which is strict and rejects duplicate keys but a lot of external tools use the python built-in Json parser which doesn't so you can use a duplicate key for the offsets to appen back doored weights and create manipulated safe tensor files so these files are rejected by the reference implementation but accepted by external parsers quick note it has to be a weight space back door because weights and architecture are stored separately here uh there's some more details and caveats
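The duplicate-key differential can be reproduced with Python's standard `json` module alone; the header layout below is a simplified, hypothetical stand-in for a real safetensors header:

```python
import json

# A header with a duplicate key. A strict parser should reject this,
# but Python's built-in json quietly keeps the *last* value.
header = '{"tensor": {"offsets": [0, 100]}, "tensor": {"offsets": [100, 200]}}'

lenient = json.loads(header)  # last key wins, silently
assert lenient["tensor"]["offsets"] == [100, 200]

def strict_pairs(pairs):
    # object_pairs_hook lets us emulate a duplicate-rejecting parser.
    d = {}
    for k, v in pairs:
        if k in d:
            raise ValueError(f"duplicate key: {k!r}")
        d[k] = v
    return d

try:
    json.loads(header, object_pairs_hook=strict_pairs)
    raise AssertionError("strict parser should reject duplicates")
except ValueError:
    pass
```

Two tools reading the same header and landing on different offsets is precisely the ambiguity an attacker needs to point one parser at benign weights and another at backdoored ones.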

Regarding exploitability, just know that the safetensors parser differential is more impactful for Dave: he needs to make sure there's consensus, agreement, between the parsers in his product. If his tool uses a more permissive safetensors parser than the reference implementation, it could accept manipulated safetensors files that actually carry backdoored models.

One big part of my research is analyzing previous work and noticing trends. I don't want to get too into the weeds here, because I'd like to save formalisms for the accompanying materials, but one thing that became clear is that from parser differentials we get these things called model differentials: instances where the same model is interpreted differently. As expected, the attacks are dependent on the supply chain component and the lifecycle stage. In an ML system you can pre-process inputs, and you can also apply model transformations before you deploy a model. Some studies have exploited parser differentials right at the pre-processing stage, with things like image scaling or Unicode parsing; those attacks often change the weights. There have also been backdoor attacks that take advantage of differences during model transformations, like compilation and quantization; those frequently change the architecture. I think it's very possible that most transformations that can be encoded within the loss function can result in an exploitable backdoor. But let's move forward from here.

Next up is shotgun parsing. This is just what happens when you don't fully and properly check your input before beginning to process it. So let's talk about polyglot files, which are files that can be validly interpreted as two or more different formats. They're a personal favorite rabbit hole of mine. Polyglot files have been utilized to distribute malware, bypass code-signing checks, and enable other malicious behaviors, but with regard to ML model serialization, they can be placed in model hubs to confuse downstream consumers. Even more importantly, two different ML pipelines could interpret the same file as two different models, so you can smuggle in a backdoored model with a benign one.

During our audit of the safetensors library, we were able to make multiple polyglots, including ZIP, PDF, TFRecord, Keras native, and later on PyTorch MAR. The safetensors audit report itself was a PDF/ZIP polyglot, with the ZIP containing all of the polyglots we made during the audit. So you can just slap a weight-space backdoored model in one of these formats onto a benign model in safetensors: you open it up with safetensors, everything's fine; load it up with PyTorch MAR or some other system, and boom, you've got your backdoor. Big problem for folks like Dave here, because now you've got malicious models sneaking in with benign ones. The overall reason

this is possible is because of a missing check: specifically, the program didn't check whether the start and end offsets corresponded to the tensor size, so attackers could append arbitrary data to a file, and that, combined with the ability to change the header size, expanded the number of polyglots. Important note: this issue has since been fixed in safetensors.

Our next category is incomplete protocol specification; just think of it as underspecification for now. While there are multiple examples of this in the literature, we'll focus on PyTorch polyglots. Many people are unaware that PyTorch actually supports multiple file formats. Some are deprecated but still supported by external parsers, and one big issue is that there's a lack of consistent versioning here. That means you can create polyglot files that can be validly interpreted as different PyTorch file formats, and you can also create ambiguous files, for example between PyTorch v1.3 and TorchScript v1.4. Another, bigger issue is the reliance on ZIP and pickle here. Pickle is a streaming file format that ends once it reaches the STOP opcode, which means that when you're parsing it, any data after that STOP opcode is fair game. Also, most ZIP parsers don't enforce the magic bytes at the start of the file, so you can append a ZIP to a pickle file to create a ZIP/pickle polyglot, which gives you some good PyTorch polyglots. Fickling now has a polyglot module, so you can differentiate, identify, and create polyglots for the different PyTorch file formats.

Now on to the next class. This one just means that your input should be simple and well defined, so you can check it thoroughly. Take ONNX: ONNX is a protobuf-based way to store ML models. Adelin Travers discovered a neat hack for ONNX that he packaged into a tool. ML runtimes and frameworks often let you add custom operators to a model on the fly, and the language used for ONNX Runtime custom ops is complex.

So even though the official specification disallows side effects in the ONNX runtime, arbitrary code can be encapsulated in a custom op, and you can use that to launch an architectural backdoor attack. Just like pickle, this is bad news for Bob.

To recap: Bob, our direct consumer, was affected by the pickle, ONNX, and TorchScript exploits. Dave, on the other hand, was impacted by the PyTorch, safetensors, and restricted unpickling issues. Now, what a lot of people miss is how important and how complex the ML stack is. The model you choose changes the technologies in the stack, so when I'm assessing a system or doing some sort of vulnerability research, I'm always trying to think about what layer of the ML stack I'm dealing with. The layers listed are hardware, infrastructure, low-level compiler, high-level framework, model, and knowledge. Of the exploits I just told you about, the restricted unpickler, ONNX Runtime, and pickling proofs of concept are issues that are exposed and impactful at the framework level; the TorchScript differential corresponds to the compiler level; and the safetensors and PyTorch polyglot issues correspond to the infrastructure level. And this is just a starting point: we're going to be seeing exploits up and down the stack that impact ML systems. If you want to get into attacking ML systems now, this is a solid place to begin. Are you really good at breaking hardware? Go take a look at a TPU. Do you happen to know a lot about distributed systems security? Go write some hybrid ML exploits at the infrastructure level.

I made this schema for incubated ML exploits; it's just one piece of a more formal model of exploitation. I'll talk about it at a very high level to shed some light on the terrain here. If you want to pull off an incubated ML exploit, you need a write primitive for the weights or the architecture, and the proofs of concept point to some additional capabilities. Side note: you probably want read primitives as well.

But with the safetensors parser differential, you saw that access to the metadata could enable both types of backdoors. There's a lot of utility in exploiting model transformations and model differentials; with those, you can construct exploits at different stages of the pipeline with existing procedures. Differentials are also pretty useful for an attacker because they're localized to their stage, and with ONNX it became kind of obvious that you can maliciously use custom ops and serialization formats, and maybe even things like compiler dialects. I'll release more details in the accompanying materials, but I do want to make some explicit recommendations; apologies for the busy slide here. I think models should be checked for integrity, and their metadata should be well parsed. We want good trust mechanisms and robust validation. We also want to minimize complexity, so we should be avoiding custom operators and separating weights and architecture storage. I also think we should be following the recommended practices for file formats more: you should have versions and checksums and magic signatures, you should enforce your signature at offset zero, and we really need to be investing in robust specifications and tools.

I'm really hoping we can see more work on hybrid ML exploits and incubated ML exploits; I want to see them addressed by more frameworks and tools. I'd love to see this framework evolve and be applied to specific ML tools and contexts, as well as to more bug classes and more model vulnerabilities. I'd also like to see more investigation of exploit persistence, reliability, and defense. More generally, I think there's a lot of interesting work that can be done in ML infrastructure security, with differentials, file formats, specifications, and even reverse engineering. But before we finish, I want to tell you what helps me identify and make progress on ML security problems, and that's understanding the two root causes. First, we're building all of these new systems for ML: new hardware, new programming languages, new compilers, new file formats. There are whole conferences dedicated just to new and

creative ways to design ML infrastructure, and that means these new systems are introducing new attack surfaces. It's also becoming increasingly clear that the stack and supply chain have not been subject to sufficient review; that's why we're seeing pickles everywhere, right? Second, simply placing an ML model into a program introduces all of these new vulnerabilities that stem from how your model interacts with different components. Machine learning is not a quick add-on but something that can fundamentally change the security posture of your system. So I hope you leave this talk knowing that we need to think about system security and model security concurrently and holistically. I recommend checking out the full audit reports for safetensors and YOLOv7, as well as the blog posts on Fickling and the file formats repository. I'll post more details and accompanying materials, and we're hoping to release a paper on this topic as well. You can find my contact info on my website, or send me a message on Twitter. Thank you all for listening; now I can answer any questions. [Applause]

Are you aware of any tools that, as defenders, we can use to audit our models periodically, automating the model audit to identify vulnerabilities?

So the question, if I understand it correctly, is: are there good defense

tools for ML security? We're hoping to develop Fickling into a good detection tool as well; it was originally designed for reverse engineering and offense. I think it's very useful for incident response folks to be able to look at a pickle file, reverse engineer it, and see if it's a potential cause. I'm also more of a fan of the secure-by-default strategy, so I always tell people: don't use pickle, use safetensors instead. You should have good trust mechanisms, checksums, things like that. But it's a very greenfield area; there's a lot of ongoing work, and I think we're learning more as it moves forward, so I'm interested in seeing what comes up. Any other

questions what would be a good way to convince uh the high UPS a company to invest time and money and resources in something like uh let's get like no more pickle in the code base let's get rid of pickle uh do you have examples of uh I'm thinking of of I know like the solar winds exploit or LinkedIn got hacked like there's always a a thing like if we invest in teaching people how to detect fishing attacks and social engineering that would prevent this have any of these machine learning exploits been used in the wild could is there any examples we could point to and say look it cost this company $10 million specifically because of pickle or is

that something that maybe in the future and just not yet CU this stuff is so new so there pickle cves that's uh a thing I've seen um with let's see if I have the slide for it I don't know of any that can give you a concrete dollar amount uh which is uh difficult um if someone knows feel free to jump in uh but sleepy pickle is um there's also a followup called sticky pickle that does uh showcase how this can be a stitious and dynamic attack um that's one example I know uh whiz has another example of it being used for cross tenant vones in Cloud security um I don't have a slide for that apologies
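The class of attack Sleepy Pickle demonstrates, staying surreptitious because the victim still gets a working-looking object back, can be sketched in a few lines of Python. This is a minimal illustration of the mechanism only, not the actual Sleepy Pickle payload; `stealthy_loader` and `TamperedModel` are hypothetical names:

```python
import pickle

def stealthy_loader(weights):
    # This runs silently at load time; a real payload would patch the
    # model's behavior or exfiltrate data here instead of setting a flag.
    stealthy_loader.triggered = True
    return weights  # the victim still receives the object they expected

class TamperedModel:
    def __init__(self, weights):
        self.weights = weights

    def __reduce__(self):
        # On unpickling, call stealthy_loader(weights) instead of
        # reconstructing the object normally.
        return (stealthy_loader, (self.weights,))

blob = pickle.dumps(TamperedModel([0.1, 0.2, 0.3]))
restored = pickle.loads(blob)  # executes stealthy_loader as a side effect
print(restored)                # [0.1, 0.2, 0.3] -- looks like ordinary data
```

The point of the sketch is the asymmetry: nothing about `restored` hints that code ran during `pickle.loads`, which is why loading untrusted pickles is unsafe by design.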

But there's also, let me find the slide for it; sorry, lots of slides here. That one I'm just going to point to: this one is from JFrog, and it's about how they used Fickling to find real malicious ML models on Hugging Face Hub. And that's as far as I know about it; once again, if anyone has a more concrete example, feel free to jump in.

[Audience] JFrog and Wiz, I'll write those down and look them up later. Okay, thank you.

Gotcha. Any more questions? Give it up for her! Thank you. [Applause]
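The kind of static triage Fickling performs for incident responders can be approximated with the standard library's `pickletools`, which disassembles a pickle's opcode stream without ever loading it. This is a simplified stand-in for the idea, not Fickling's actual API:

```python
import pickle
import pickletools

# Opcodes that can make pickle.load() resolve and call arbitrary objects.
SUSPICIOUS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}

def suspicious_opcodes(data: bytes) -> list[str]:
    """Statically list risky opcodes; never calls pickle.loads()."""
    return [op.name for op, arg, pos in pickletools.genops(data) if op.name in SUSPICIOUS]

class EvilConfig:
    def __reduce__(self):
        # A real payload would call os.system or similar here.
        return (print, ("pwned",))

benign = pickle.dumps({"weights": [0.1, 0.2]})
malicious = pickle.dumps(EvilConfig())

print(suspicious_opcodes(benign))                 # []
print("REDUCE" in suspicious_opcodes(malicious))  # True
```

A scanner like this is a heuristic, not a guarantee: plenty of legitimate pickles import globals, which is another argument for the secure-by-default route of using safetensors instead.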


Woo! Welcome to Ground Truth. We have an awesome talk today on hacking things, with Matthew Canham; give it up for him. [Applause] Thank you very much. That's an early clap, but we've got to say thank you to our sponsors, Adobe, Dropzone AI, and all the volunteers and staff that are here today, because we like money, and we like you too. So let's get it. All right Matthew, what have you got for us?

Okay, well, the first thing I have is a shaggy dog. Now, the thing that's interesting about this shaggy dog is that a dog very similar to this was once used by the CIA to infiltrate a very highly secured facility. The way the infiltration worked was that the agency had a source who was already in this facility and drove into it on a daily basis, and they needed a way to physically penetrate the facility. They were having a lot of difficulty with this, because the facility was really, really highly secured and there was just no way they could get into it. So they brainstormed and came up with an idea: the source inside the facility would start bringing his dog to work every day with him. It was this big shaggy dog; in fact, I don't think it was his dog, I think the agency actually bought him the dog, so they expensed it, but they got him a big dog as a pet. So he's bringing this dog into this highly secure facility, and the first time he does it, the guards at the checkpoint just go nuts: what are you bringing the dog in here for? And he's like, well, it's a service animal, it's my comfort animal, I've got to bring it in with me. So they freak out, and they analyze the dog and do all this stuff and check everything about the car, because this is just so weird. But there's a really high-level scientist in this facility, so it's like, he's eccentric, but we've got to go with it, because we've got to make our nuclear bombs or whatever it was. So they freaked out, but they let him go through with the dog. He takes the dog to work the next day, same thing, and they freak out again, a little less. The next day they're still freaking out, but it's getting to be, okay, this is just the weird guy who brings the dog to work, okay, whatever.

Two or three weeks, a couple of months go by, and pretty soon it's just, yeah, that's Bob the nuclear scientist who brings his dog to work, whatever, just wave him through. At that point the CIA made a shaggy-dog suit that their operative could fit inside, and guess what happened when the operative in the fuzzy dog suit went through the gate: they didn't bat an eye, didn't even look at the dog, just, yeah, whatever, go through. And that's how the agency physically penetrated that facility. That story is on Darknet Diaries if you want to hear the whole thing; yeah, a few of you have heard this.

The reason I bring that up is that it's a great example of a cognitive attack. I have been railing for years about cognitive security, because I'm just one of those people who rails about things that people generally ignore, and in this case that physical penetration was facilitated, in fact enabled, by a cognitive attack, and that cognitive attack was habituation. They habituated those guards to accept the fact that this eccentric scientist brought a dog in every day. [The projector lost power] In all seriousness, I don't know what happened, guys. Maybe it went to sleep, sorry. Okay, well, now I know how to fix it; we're good.

Okay, so that was a physical penetration that was facilitated by a cognitive attack. Now, what I've been advocating for quite some time is that we have essentially three security domains. We have physical, which is arguably the oldest area of security: physically securing physical objects or valuables. We have cyber, which is securing information, either information going from one place to another or information being stored in place, but in one way, shape, or fashion it's securing information. But what we're moving towards is the cognitive domain. Now, you could argue that the cognitive domain has been around for a while, because humans have been around for a while and so on, but what's different is that we're connected now, and that's a phenomenon that's really only been happening for maybe 100, arguably 150 years, if you go with radio and these other technologies, and now it starts to evolve into something different. More recently, with the emergence of AI, this is where I think cognitive security is beginning to become really interesting, because we're seeing attacks that are meant for humans but actually also work on machines, which is a really strange thing if you think about it. This, by the way, is an example of that. Given the audience, I'm sure all of you are at least somewhat familiar with jailbreaking LLMs, so I'm just going to go with that assumption. What's interesting about this paper, which I think is three, maybe four months old, is that it introduced something called persuasive adversarial prompting. There's a psychologist who's very well known by the name of Robert Cialdini; he has these principles of influence, and he's got a book called Influence. If you haven't read it, I highly recommend it, because it's like the Bible of social engineering. What was really interesting in this paper is that they were using Cialdini's principles of social influence and persuasion, which were meant for humans, to jailbreak LLMs. So you have an attack that was meant for a human being applied to an LLM.

Okay, so isn't this just social engineering? Isn't cognitive security just social engineering by a different name? I'm arguing no. This right here is a cognitive hack: we have paintings on the street, on totally flat ground, that are drawn and illustrated in such a way as to impose that visual representation on the visual input of the observer from a certain perspective. The one on the right, with the little girl on the ball, is specifically an instance of what's known as a nudge, which is a quasi-voluntary, or quasi-consensual, way to influence behavior. The one on the left is just a painting; I forget the artist's name, I'm sure some of you know it, he's pretty well known. That's interesting, but check this out: if we draw lines on a stop sign, we can make computer vision think that stop sign is now a basketball. Or if you put selectively placed pieces of white and black tape on the stop sign, all of a sudden the car doesn't recognize it as a stop sign. Humorously, some folks projected a spectral sort of image of Elon Musk on the street in front of a Tesla and got it to stop very quickly. What I'm showing here is that this is not what we would think of when we think of social engineering; this is something different, and that's what I'm arguing is part of this greater umbrella idea of cognitive security.

Here's another interesting one that I just came across last week; Masha Sedova, formerly of Elevate Security, now with Mimecast, gave this to me. What these graphs represent is, on the right, the time of day, so imagine that as a 24-hour clock, not a 12-hour clock, and this is when actual phishing emails are reported, the frequency at which they're reported to the security department by users. I don't know why users are reporting phishing at three in the morning, but they seem to like to, same with five. What's interesting is we see a spike around nine and ten, and then it drops off right around lunchtime. The interesting thing here is that there was an Israeli study done with judges and the likelihood that they would pardon convicts or reduce their sentences, and what they found was that the likelihood of granting a pardon or commuting a sentence actually went up if the case was heard just after lunch. So if you're convicted of a crime, time your pardon or commutation hearing at the right point in the docket so that you get a favorable judge. It seems like there might be something analogous here; I don't know exactly what's going on, but there's a little spike at around 23:00. The other interesting thing, on the left, is when phish clicking happened, and we're seeing a spike right around the time people are getting ready to leave work. I don't know what the 7 p.m. thing is; maybe people are getting done with their commute, checking email real quick, and being caught off guard. The point is, there's something cognitive happening here. This isn't phishing filters; this is humans.

Okay, so what do I mean by a cognitive system? Let me take a quick divergence here and tell you a little about my background, because I think it's relevant. My PhD is in cognitive neuroscience. I did not focus on cognitive psychology; I did not focus on humans. Cognitive psychology was a part of my education, but we were looking at

the broader scope of things. In my first class in my graduate program we were building multi-layer, actually three-layer, perceptrons modeling human vision systems and how they would respond to visual stimulation, in an effort to model the human visual system. The reason I bring that up is that cognitive science has a different perspective on how cognition happens. It's not just neurons. People get very fixated on the brain and the squishy stuff, but the important thing to realize is that cognition can happen outside of the squishy stuff; it can happen in other ways. Essentially, what we can think of as a cognitive system is a cognitive agent that has percepts, perceptual inputs, what we'll call sensors, and has ways to act on its environment, actuators. This could be an amoeba, it could be a bug, a literal crawling-around bug, it could be a human, a group of humans, it could be an AI. When we conceive of a cognitive system in this way, it really opens up the possibilities for what we might consider a cognitive system. On the internal side, we have the ability to store and make decisions on information that has been taken in through those sensors. There's a reason I'm bringing this up, and it's important: once we understand that cognition is about information processing, we can start to formulate attacks against different types of information processing systems, and I'll come back to that. This is how it ties into cognitive security: now we can start to look at ways to attack cognitive systems.

As I became interested in this idea, I've been collecting lots and lots and lots of examples, and the idea hit me, whenever I sent the paper in a few months ago, that I should put all these things together. So go easy on me as I go through this, because this is my first wiki; while I have a PhD in cognitive neuroscience, I have about three hours of wiki experience. But here it is, and if you'd like to go there, again, go easy, it's a work in progress: cognitive attack taxonomy dot org. I really wanted cat.org, but I didn't have the $50,000 to buy the domain, so this one was, I think, $8.98, and it renews at like $4 or something. When you get to the first page, you'll see a couple of links, and if you click on the very first link, it'll bring you to the index. This is the index right here, it's a screenshot. I'm going to walk through what these things mean, but basically, in a nutshell, I have been collecting all of these different instances, and this is ongoing; I think right now I'm at around 362, but I keep adding stuff as I can. I take social engineering attacks, I take scams of different types, any kind of new tool, tactic, procedure, exploit, or vulnerability that could be applied to a cognitive system, and I put a little identification tag on it and write a little summary about it. So let me walk through that; I put the URL on top of a bunch of these slides, so it'll be around for a little bit.

If you pull up the page, the first thing you'll see is the CAT name, and this is the cognitive attack taxonomy ID. Within that ID you'll see the year and then the number: CAT, year, number. Sorry, I was just doing this on the fly; it seemed like it would work. If anyone is an expert at indexing things and wants to put input into this, I'm totally open. Underneath that is a very short description of what the thing is, then the identification number, that's the tag. Then the layer; I'm going to get into this in a little bit, and the reason I went into depth about what a cognitive system is, is so that I can talk about these layers. Then the scale; I'm going to talk about that in a minute as well, but that basically says whether this is at a tactical, operational, or strategic level. Then level of maturity; I don't know if this is the right term for it, but I'm trying to be really polite here. There's a lot of snake oil out in the world on this kind of stuff, and what I'm trying to do is account for the snake oil: to say, oh, we have snake oil here, but at the same time acknowledge that we may not want to really trust the snake oil, we might want to trust this other stuff over here. That's where I'm capturing that, but I may change that name; if anybody has a great suggestion for a name that's not too offensive, I'd be open to it. I don't know, "[__] scale" just didn't sound right. Then category and subcategory, which I'll get into in a minute, and then "also known as," which is just what it sounds like: a lot of these things have very similar terms, or different terms for the same thing. And a brief description. Let me just run through some of these.

Okay, layer. I talked earlier about how a cognitive system is essentially an information processing system. It's not an individual person, it's not necessarily an AI; it can be anything that can process information. Now, most of us here are probably familiar with the OSI, the Open Systems Interconnection, model, and this is a

model that describes how different information processing systems can talk to each other, and in security it informs us about how we can secure different systems and what types of attacks we might expect against them. It goes from 1 to 7. Layer 1 is physical, so actual physical hardware, or electromagnetic waves going through the air, and at the other end we have the application that faces the human. I talked to ChatGPT about it, and ChatGPT told me that it resides at layer 7, so I took it at its word; I've got ChatGPT at layer 7, and I have layer-7 attacks. The persuasive adversarial prompt I showed earlier would be an example of a layer-7 attack.

Then we've got layers 8, 9, and 10. Bruce Schneier talks about this, but there was also a gentleman by the name of Ian Farquhar who talked about it, and as far as I can tell Farquhar talked about it first; if I'm wrong about that, please let me know. Anyway, back in 2010 he said, hey, we should extend the OSI model to include humans, and he proposed the human interconnection model, which has layer 8 for individual humans, layer 9 for organizations, and layer 10 for legal and regulatory bodies. The thing that's a little weird in how I break this down is that a mob would still be layer 8, because a mob does not have a coherent set of rules that process information; organizations do. This is important because I can point to lots of examples where people have deliberately used policies, in some cases stupid policies, to exploit an organization. For example, if an organization has a policy of not scanning incoming packages with an x-ray, what a great way to warship a device in and collect all of that wireless stuff floating around; that would be a layer-9 attack, because you're aware of a policy that opens up a vulnerability. Layer 10 is one I'm a little less familiar with, but it's the legal and regulatory layer. Its distinguishing characteristics are, number one, that it moves much slower than policy: policy can be switched relatively quickly, while changing a law takes a very long time. However, there's a kinetic component to the legal system that is not present in policy: the law can possess your person, take away your freedom, take away your life, and that's a very important distinction.

Now, the other thing I really wanted to do with this taxonomy: I play around in a couple of different spaces, the information security space and the cognitive warfare, information warfare, influence warfare kind of space. The interesting thing is that when you talk to, say, a PSYOP person, they're talking about very much the same kinds of things that social engineers talk about, just in a slightly different way, and what I really wanted was a way to unify these two universes, because we're sort of talking about the same thing using slightly different words. Something that's important in a military context is the level of operation. In information security we're typically operating at a tactical level, and a tactical level is a single engagement: I'm going to see if I can tailgate my way into that building, or I'm going to tailgate my way into that building so I can get access to that server and then get out. That's a single engagement. Now, let's say that in order to get access to that same server I'm going to run a seeding campaign where I'm dropping USBs all over the place, I'm going to try to capture somebody's swipe card so I can counterfeit it, I'm going to try to tailgate, I'm going to try to phish, I'm going to come at them from all these different angles, and let's say this is over months and I'm going to have people helping me. That is an operational-level engagement: multiple people over a relatively long period of time. The strategic level is pretty much nation-state stuff, although when we get into large multinational corporations we're also starting to talk at the strategic level; this is things like influencing elections, or causing internal discord to the point where it becomes internal civil war. Those are strategic-level security threats.

When I put these all together, what we get is this table, because I'm an academic and I love tables: physical at layer 1, technological at layers 2 through 7, and human at 8, 9, and 10. I don't want to venture into the nether world here, so I'll stay here, but if we look at layer 8 at a tactical level you'll see social engineering, and you can also see it at the operational and strategic levels; I give examples of all of those along the rows. Okay, so coming back to

level of maturity. As I mentioned before, I wanted to account for snake oil. Personally speaking, I have not seen a lot of support for neurolinguistic programming, but I know a couple of case officers who absolutely swear by it; they're like, I use this stuff and it works, and I'm like, okay. That would be something I would put at a fairly low level of maturity. Basically, I have five levels of maturity. Neurolinguistic programming is something I would put as theoretical: it's there, but I have not seen a lot of scientific, empirical support for it. Proof of concept is something that's been documented but isn't actually out in the wild; we're seeing a lot of this in AI right now, where somebody will do a really cool attack, but we haven't actually seen it out in the wild. Observed in the wild is kind of the converse of that: something we've actually observed, where somebody's lost money to it, or worse. Common use: we all know it's out there, and it's available. And then well established: it has a lot of scientific, empirical support behind it, and it's easy to observe.

So we'll work through an example very quickly, and then I'll tell you a quick use case of how I've used this. Loss aversion is a cognitive bias that everyone has, and in a nutshell it says that you feel losses more viscerally than you feel gains: if you lose $100, that's going to hurt more than if you found $100 on the ground. We've got the short description, the ID, the layer, and the operational scale; I'm putting what it's usually used at, and I'll probably build that out later, but that's where I'm at right now. I categorize loss aversion as a vulnerability, because it's a bias that can be used to exploit a person, so there are psychological tactics that can be used to exploit the cognitive vulnerability of loss aversion. An example of that would be scarcity: imposing scarcity, hey, you're not going to have this resource after a certain period of time, invokes that loss aversion. It's the exploit that operates on that vulnerability, and it happens all the time in phishing: click here to prevent losing access to your account, that sort of thing.

Okay, so here's how I'm starting to use this, and I'll get into some other uses in a little bit. Basically, I'm building these all out into a graph, where if I wanted to attack loss aversion specifically, I can do it using these different exploits, or tools, tactics, techniques, and procedures. Conversely, if I know that I'm going to attack a target using a phishing email, I can use this to start looking at the different vectors I might take. So for a CEO, I can scan the likely cognitive vulnerabilities of a CEO, and I can use that to tailor the phishing email I'm going to send to those vulnerabilities. The way I'm doing this right now is that I'm taking these definitions and plugging them into an LLM in the form of prompts, to help me automate the tailoring of phishing emails to certain individuals. Now, a CEO is probably not going to react very well to an appeal to authority; some random person suddenly telling them, hey, you need to do this because I'm the boss, doesn't work very well when you're the boss, and I can account for things like that. The dotted lines are actually negative connections; those are things that might actually backfire. But decision fatigue, which means that your ability to make effective decisions starts to decline at the end of the day, because your neurons have basically been firing the whole time, is something that is probably very likely to work on a CEO, because they're constantly making decisions. So I can use that to decide what type of attack I want to launch.

Let me come back one slide real quick. Hey Jessica, how are we doing for time? Okay, cool. So, I actually do work for a living now, but I used to be an academic, and I still kind of have that in me, and one of the downfalls of being an academic is

that I commonly build things and have no idea why I'm building them; it's just, oh, this is really cool and I want to do it, and then I'm like, ah, but what are we going to use this for, is there really a reason? So, no joke, this morning I was like, oh, I've got to come up with some reasons. Beyond having just this catalog of attacks, I'm thinking that having the catalog may help with threat modeling, and it may also help with threat intelligence; if anyone works in either of those two spaces and has an opinion on this, I would love to hear it, because what I'm thinking is: we have these tags on these different types of attacks, and the bad actors are generally people too, and they've got these vulnerabilities, and one of those vulnerabilities is that they have habits. Because of those habits, they have certain ways of doing things over and over again, and with these tags we can start to develop cognitive thumbprints of these different attackers. I don't know, maybe I'm just BSing, but that's one thing I came up with. The other idea is that we can unify these ideas of cognitive warfare and social engineering, because there's a lot of overlap between the two fields. And then the last thing: I just showed an example of how I'm using this to build cognitive attack graphs to help with automating cognitive attacks.

So anyway, that's what I came up with. If you'd like to check it out and give me feedback, I'd be very open to it; here's my contact email. If any of this is interesting, I do a weekly meeting; we're on a two-week pause because of the cons, but we do an online meeting, and we're limited to 50 participants or fewer. You can see presentations from previous meetings on our YouTube channel; the URL is also right there. The way we do it is a presenter comes on, they present on a topic, usually with PowerPoint slides, the whole deal, and we record that portion, which gets put up onto our YouTube channel. The conversation afterwards stays within the meeting; we don't publish that. So the nice thing is we've got a nice little community going, and it gives people an opportunity to talk one-on-one with authors that are doing really, really cool stuff, researchers in different fields. And with that, I'm done; I'll take any questions. Thank you. [Applause]

Any questions?
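The taxonomy entry fields walked through above (ID, description, layer, scale, maturity level, category, subcategory, also-known-as) could be modeled roughly like this. The CAT-year-number ID format and every field value here are inferences from the talk, not the wiki's actual schema:

```python
from dataclasses import dataclass, field

MATURITY_LEVELS = [
    "theoretical",           # little empirical support (e.g. NLP)
    "proof of concept",      # documented, not yet seen in the wild
    "observed in the wild",  # someone has lost money to it, or worse
    "common use",
    "well established",      # strong empirical support, easy to observe
]
SCALES = {"tactical", "operational", "strategic"}

@dataclass
class CatEntry:
    cat_id: str        # assumed format: "CAT-<year>-<number>"
    name: str
    description: str
    layer: int         # 1-7 OSI; 8 individual, 9 organization, 10 legal/regulatory
    scale: str
    maturity: str
    category: str      # e.g. "vulnerability" vs. an exploit/TTP
    subcategory: str = ""
    aka: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not 1 <= self.layer <= 10:
            raise ValueError("layer must be 1-10 per the extended OSI model")
        if self.scale not in SCALES:
            raise ValueError(f"unknown scale: {self.scale}")
        if self.maturity not in MATURITY_LEVELS:
            raise ValueError(f"unknown maturity level: {self.maturity}")

# The loss-aversion example from the talk, with a made-up ID.
loss_aversion = CatEntry(
    cat_id="CAT-2024-001",
    name="Loss aversion",
    description="Losses are felt more viscerally than equivalent gains.",
    layer=8,
    scale="operational",
    maturity="well established",
    category="vulnerability",
    aka=["prospect theory bias"],
)
print(loss_aversion.cat_id, "->", loss_aversion.name)
```

Encoding the layer, scale, and maturity vocabularies as validated fields is what makes entries queryable later (e.g. "all layer-8 vulnerabilities usable at the tactical level").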

[Audience] Hi, thank you for the talk, I really appreciate it, I learned a lot. One thing I was wondering, in terms of the cognitive attacks you've identified: is that an emergent property of LLMs and how they process information, or is it just human biases imprinted in the training data that are then being exploited, because we have bad training data?

Are you speaking specifically to the graph?

[Audience] Not specifically, I guess just in general.

Okay. No, it's not particular to LLMs; LLMs are under that umbrella, but there's a lot of human stuff there too. The attacks, I think, are the new part, because what I'm doing with the graph is taking the information from each of those nodes, and those become prompts into the LLM, and those prompts are seeds which the LLM then uses to create the attack that I then launch. Does that make sense?

[Audience] Okay, so you're not attacking the LLM, you're using the LLM as a tool.

I could attack an LLM. What it would look like is, instead of having a CEO at the top there, it would be, say, Claude the LLM at the top, with known vulnerabilities for Claude, and then I could come down and have attacks for each of those vulnerabilities.

[Audience] But I guess my question then would be, are those vulnerabilities tied to the training data that Claude is trained on, which is inheriting the human biases that are laid into the training data?

That's a really good question, and I don't have the answer for it. Probably yes, but here's the weird way to think about it: is loss aversion a construct of the training data that humans were trained on? Do you see? That's where it starts to get really weird.

[Audience] Yeah, thank you.
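The node-and-edge structure described here, vulnerabilities linked to exploits, with dotted edges marking tactics likely to backfire against a given target, could be represented minimally like this. All node names, weights, and the CEO profile are illustrative assumptions, and the step that turns node definitions into LLM prompts is omitted:

```python
# Vulnerability -> exploit edges; +1 means the exploit tends to work,
# -1 (the "dotted lines" in the talk's graph) means it may backfire.
EDGES = {
    ("loss_aversion", "scarcity"): +1,              # "click before you lose access"
    ("authority_bias", "appeal_to_authority"): +1,
    ("decision_fatigue", "end_of_day_timing"): +1,
}

# Hypothetical per-target adjustment: for a CEO, appeals to authority
# backfire, while decision fatigue is amplified.
CEO_PROFILE = {"appeal_to_authority": -1, "end_of_day_timing": +1}

def rank_exploits(profile: dict[str, int]) -> list[str]:
    """Score each exploit against a target profile, best first,
    dropping anything with zero or negative expected value."""
    scored = {}
    for (_vuln, exploit), weight in EDGES.items():
        scored[exploit] = weight + profile.get(exploit, 0)
    return sorted((e for e, s in scored.items() if s > 0),
                  key=lambda e: -scored[e])

print(rank_exploits(CEO_PROFILE))  # ['end_of_day_timing', 'scarcity']
```

For a defender, the same structure runs in reverse: given a target profile, the top-ranked exploits are the vectors most worth drilling in awareness training.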

Q: Yep. It's a bit less sexy than threat intel, but one area where I can see quite a bit of overlap with this content is consumer protection law. There's a lot of research into so-called dark patterns, the ways in which consumer decisions get "hacked," quote unquote. There's actually a lot of research being done in the Australian government at the moment on creating taxonomies of dark patterns for the purposes of consumer regulation.
A: Let's talk after the talk. Yes, so (oops, I'm going the wrong way) dark patterns: I actually have somewhere between 20 and 30 listed in this. I'm a big fan of dark patterns, and I'm trying to

get at them. You don't happen to know the gentleman that started that, do you? Because I'm trying to have him come speak at the Cognitive Security Institute, so if anyone does, please put me in touch; that's high on my list right now. By the way, there's a really cool paper about using dark patterns to fool hackers, which I thought was really interesting, sort of playing into hackers' cognitive biases.

Any more questions? Remember, Matthew needs an indexer, y'all. Thank you, Matthew. All right, thank you very much. The next talk is in here at 1400: DoH deception, evading ML-based tunnel detection with black-box attack techniques.

[Music]
[Applause] [Music] 2024, this is the Ground Truth track. We'd like to thank you all for being here, and we'd like to thank our sponsors for sponsoring; all together, you make this happen, so thank you. I just want to introduce Emmanuel Valente (did I say that right? Okay, great), so everybody give him a warm [Applause] welcome.

Hi everyone, thanks for joining this presentation at BSides Las Vegas. I'm thrilled to be here, and I'd like to thank BSides for the opportunity; it's my first time as a speaker. So let's dive in. Here's the quick overview of the presentation: we'll start with the introduction, then we'll define some concepts that are important for understanding the research, and next we'll look at adversarial attacks. By the end of the presentation I'll show you a real-world demonstration, and I'll also provide all the source code so you can reproduce the experiments; then we'll conclude.

So, who am I? I'm a security engineer at iFood, with more than 20 years of experience in networking and cybersecurity. I've also been focused on machine learning for the past two years, and I'm a master's student at the University of São Paulo. About iFood: iFood is the biggest food-delivery company in Latin America, it operates in Brazil, and it's been a data-driven company since the beginning; just to give you an idea, iFood now has more than 2,000 machine learning models running in production. Why am I here? This presentation was designed for everyone, including those who don't yet have any prior experience with machine learning. Although the content is based on DoH, by the end of the presentation I'll show you a generic algorithm that applies to any machine learning model, so stay here; I'm sure you can pick up the techniques we'll discuss.

Okay, some history. Starting in the '90s, attackers established DNS tunnels between victim machines and the attacker's machine; through those tunnels, attackers could exfiltrate data and do command and control, and here's a paper that summarizes that era. In 2010 we got the first machine learning models to identify those tunnels. In the very beginning there were some neural networks, but today the state-of-the-art models are ensemble models like gradient boosting and extreme gradient boosting. And this is the scope of my research:

we're going to attack those models so that we can bypass the detector while communication goes through the tunnel (I'll show how in a moment), and we'll also defend the model against those attacks. Okay, now some basic concepts. We have traditional DNS and the new DoH, DNS over HTTPS: basically you have a DNS payload inside an HTTPS payload. Why? Because DoH solves many problems of old DNS, like privacy and eavesdropping. Here's an example of a DoH query using Cloudflare's DoH server.

This image summarizes the tunnel. The red arrows are the actual malicious tunnel: it uses just the DNS infrastructure, and all the tunnel traffic goes through that DNS infrastructure. The black arrows are the regular traffic, like web traffic and regular DoH traffic. Okay,

now, these are the most common tools for creating tunnels; all of them are open source. I prefer dnstt because it's new and written in Golang, so it's easy to understand, and all of our experiments are based on dnstt. Here's the big picture of dnstt: on the backend side I have the application that sends data into the tunnel, and on the other side I have the application running on the client side. Here it could be, for instance, a compromised machine, and here I have the attacker machine. Now I'm going to show you the first demonstration, a minimal DNS tunnel. Why show that? Because we're going to use the same infrastructure to build our main demonstration.

Let's pause here. In the left column we have the attacker machine, an EC2 instance running on AWS with an Elastic IP. On the right side I have the infected machine; I just logged in on AWS to establish the tunnel. It's important to mention that at this point we don't have any model evaluating the connection; we'll discuss that in a moment.

Here's netcat on the attacker machine; on the other side, through the tunnel, we test the communication: "hi" was sent, and "hi" was received on the other side. Okay. Ah, important: I'll provide the video in the repository, along with the presentation.

Okay, now let's talk about network features. We use features from network flows; basically, we get features from the pcap file, and you can use, for example, tcpdump to capture them. Here's how to get them: there's an open-source tool called DoHlyzer that can capture your traffic and convert the data into CSV format (you can use DoHlyzer both for capturing and for converting to CSV). Here are the state-of-the-art models; in red is the gradient boosting model that we will build here, and attack.
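To make the idea of per-flow features concrete, here is a minimal pure-Python sketch computing a few features of the kind the talk mentions from (timestamp, size) pairs; the field names are illustrative and are not DoHlyzer's actual CSV column names.

```python
import statistics

def flow_features(packets):
    """Compute a few flow features from a list of
    (timestamp_seconds, payload_bytes) tuples.
    Field names are illustrative, not DoHlyzer's exact columns."""
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    gaps = [b - a for a, b in zip(times, times[1:])]  # inter-packet times
    mean_gap = statistics.mean(gaps)
    return {
        "duration": times[-1] - times[0],
        "packet_count": len(packets),
        "mean_packet_size": statistics.mean(sizes),
        "mean_packet_time": mean_gap,
        # coefficient of variation of packet time: one of the "complex"
        # features the unconstrained attack ends up perturbing
        "packet_time_cv": statistics.pstdev(gaps) / mean_gap if mean_gap else 0.0,
    }

features = flow_features([(0.0, 120), (0.5, 480), (1.0, 480), (1.5, 120)])
```

A real pipeline would compute these per flow (grouped by 5-tuple and time window) from the pcap rather than from hand-written tuples.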

Now, to recap for those who don't know adversarial attacks: our model is represented by the function f, x is our input (our network features), and y is the label; in our case it could be benign connections or malicious tunnel connections. When I create some perturbation delta, I can run inference again on the perturbed input, and ultimately the model can predict another class; so, for instance, I could have a malicious connection here, and after perturbation I could have a benign flow.

Here's an example using images. This image is from the Goodfellow paper: on the left we have a panda; in the middle we have a perturbation that looks like random bytes, but there's a lot behind it, because when I sum the two images, the third image is the result, and that last image is classified as a gibbon, a kind of monkey (in Portuguese, "gibão").

Now, a definition of white box. Different from a traditional pentest, in machine learning lingo white box means the attacker has access to the model: the architecture, the weights, the derivatives, all the information about the model. In modern attacks, the attack is stated as a minimization problem over a distortion and a loss. The distortion is how big our perturbation is, and the loss is how far I am from the target label; here, for example, my target label is benign and my actual label is malicious. Alpha and beta are weighting factors that say which term contributes more during the minimization, and at the end I have an array of adversarial examples.

On the other hand, in a black-box attack I don't have access to the model architecture; I only have access to the inference endpoint, so in the black-box equation there is no derivative. To implement our attacks we use the ART framework (Adversarial Robustness Toolbox), the most used framework nowadays. Here's an example of how to generate a black-box attack:

we just install it with pip, and then you load the model and create an object wrapping it. Remember, it's a black-box attack: this object guarantees that we only have access to the inference endpoint, the inference method of the model, and here is where the actual attack occurs.
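The minimization described just above can be written out; this is my reading of the slide, with alpha and beta as the weighting factors the talk mentions and y_t as the target (benign) label:

```latex
\min_{\delta} \;\; \alpha \cdot \underbrace{\lVert \delta \rVert_{2}}_{\text{distortion}}
\;+\; \beta \cdot \underbrace{\mathcal{L}\big(f(x+\delta),\, y_{t}\big)}_{\text{loss w.r.t.\ target label}}
```

In the white-box case the attacker can differentiate f directly; in the black-box case only the inference endpoint of f is available, so the gradient has to be approximated.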

Okay, now this is a special case of black-box attack called the zeroth-order optimization (ZOO) attack. This kind of attack is used when you don't have access to the derivatives, or when the derivative is infeasible to calculate. Why "zeroth order"? Because you don't have the derivative, but you can simulate one: the simulated derivative is the difference between the actual inference and the perturbed inference.

For minimization we use the Adam optimizer (if you like, we can discuss the minimization process afterwards); we use binary search to find the optimal value of the constant c, and we use the L2 norm to calculate our distortion. This is called the vanilla attack because it's the original ZOO attack, with no modifications.
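To make the zeroth-order idea concrete, here is a minimal self-contained sketch (mine, not the repository's code, and without ART's Adam optimizer or binary search over c): the "gradient" is just perturbed inference minus actual inference, divided by the step. I use a logistic-regression stand-in for the detector because finite differences degenerate on a tree ensemble's piecewise-flat regions unless the probing step is large.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy "tunnel detector": 4 normalized flow features, 0 = benign, 1 = malicious.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.2, 0.05, (200, 4)),   # benign flows
               rng.normal(0.8, 0.05, (200, 4))])  # tunnel flows
y = np.array([0] * 200 + [1] * 200)
clf = LogisticRegression().fit(X, y)

def zoo_sign_step(model, x, h=1e-3, step=0.02):
    """Estimate the gradient of P(malicious) by finite differences
    (perturbed inference minus actual inference), then take a sign step
    toward the benign class. Black-box: only predict_proba is used."""
    base = model.predict_proba(x[None])[0, 1]
    grad = np.array([(model.predict_proba((x + h * e)[None])[0, 1] - base) / h
                     for e in np.eye(x.size)])
    return x - step * np.sign(grad)

x_adv = np.full(4, 0.8)                  # start from a malicious flow
for _ in range(100):
    if clf.predict(x_adv[None])[0] == 0:  # detector now says benign
        break
    x_adv = zoo_sign_step(clf, x_adv)
```

Note this toy loop has the same failure mode the talk describes next: with no constraints, nothing stops it from producing feature values no real tunnel tool could emit.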

You can run this attack using the repository; these are the steps you must follow to replicate it, and all the details are in the repository. With the vanilla attack you'll get a high success rate, because the algorithm has total freedom, but on the other hand you'll get some weird results: negative times for time-based features, huge packet sizes (gigabytes and terabytes), and attacks on complex features like the coefficient of variation of packet time. How do I instrument the tool to reproduce those values? It's impossible. To solve this problem we came up with the targeted attack, ZOO Target, which was created during my research. Basically, we add two more arguments, shown in red here: the first is the tunnel limits, giving the minimum and maximum values of the features I will attack, and the second is the feature list, an array containing all the features I want to attack.

After running this attack you'll get a low success rate, but that's okay, because you can inspect all the results and you only need one successful instance. Having that, you can read off the feature values and reproduce them with your tool to bypass the detector over the tunnel. Here's a notebook showing the new attack.
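The two new arguments can be thought of as a mask plus a clip on the perturbation; here is a minimal illustration (the names are mine, not the repository's actual API):

```python
import numpy as np

def apply_constraints(x, delta, feature_mask, lower, upper):
    """The two constraints the targeted variant adds to vanilla ZOO:
    perturb only whitelisted features (feature_mask), and keep the
    result inside the tunnel tool's limits (lower/upper per feature).
    Illustrative helper, not the repository's actual code."""
    delta = np.asarray(delta) * np.asarray(feature_mask)  # zero out untouchable features
    return np.clip(np.asarray(x) + delta, lower, upper)   # respect tool limits

x_adv = apply_constraints(
    x=[100.0, 0.5],            # e.g. packet size, inter-packet time
    delta=[5000.0, -2.0],      # raw ZOO perturbation
    feature_mask=[1.0, 0.0],   # only packet size is attackable
    lower=[0.0, 0.0],          # no negative sizes or times
    upper=[1500.0, 10.0],      # MTU-ish cap on packet size
)
```

Constraining the search is exactly why the success rate drops: the optimizer loses the degrees of freedom that produced the physically impossible flows.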

And now, a generic algorithm to attack any model. We'll go through it together in detail, and after that I'll show the demonstration. Step one is having the tool: it could be a DoH tool like dnstt, but if you have another model to attack and another tool, you can use that tool; in our case we use dnstt. Step two: create the connections and capture them, using a capture tool like tcpdump, tshark, Scapy, or DoHlyzer; here we used tcpdump, and in the repository I give details to reproduce this.

In step three you identify which features to attack. Among the pcap-derived features you have 28 possible features; the best ones to choose are those you can actually modify, like packet size, response time, and request time. It's important to mention that the model must use the same features you're attacking; that's the only requirement. Number four: set the limits of your tool. Here's the notebook you can use: basically you just specify the CSV file, and the notebook calculates your tool's limits. You then get this file with the limits for each feature, and, knowing your tool's possibilities, you can define larger or smaller values for each one; of course, you don't want negative numbers for time features. Here's an example of modified limits: the first feature is flow bytes sent, and for flow bytes sent I put 0.9; the data are normalized, so I chose a value close to the maximum, 0.9.

After that, you execute your targeted attack; just run the notebook, as I'll show you. Here we specify which features to attack and then run the attack. Number six: identify the most-altered features, say the top four, using absolute value. Here the attack succeeded on 22 instances, and these were the five most-altered features. The last step is to apply those values in your tool. Okay, now I'm going to run the final demo, where we use this attack to bypass the real scenario.
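Step 6 above, ranking features by how much the attack changed them, can be sketched like this (a hypothetical helper, not the notebook's code):

```python
import numpy as np

def top_altered_features(x, x_adv, names, k=4):
    """Rank features by |adversarial - original|, descending; return the
    top-k (name, change) pairs, mirroring step 6 of the algorithm."""
    delta = np.abs(np.asarray(x_adv, float) - np.asarray(x, float))
    order = np.argsort(delta)[::-1][:k]
    return [(names[i], float(delta[i])) for i in order]

top = top_altered_features(
    x=[100.0, 0.5, 3.0], x_adv=[2100.0, 0.4, 13.0],
    names=["packet_size", "mean_packet_time", "flow_bytes_sent"], k=2)
```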

Let me explain. On the left side we have the attacker machine on AWS. The new piece is in the middle: there we have our model and a script that loads the model and captures all the traffic; as it captures, it converts the features into normalized format so the model can predict whether the connection is benign or malicious. All of that is available in the repository. On the right side I have the infected machine; instead of netcat on the backend, we use a Python script to inject the feature values we got from the attack.

Okay, logging in on AWS. Here I just pass the model path; you can create your own model using your own data, and all the steps to replicate the experiments are there. As you can see, it loaded the gradient boosting model from a joblib file and started listening for connections. Now I start tcpdump on port 443; the tunnel is established, and I'm going to use a file to send data to the model, the same file I used to train it. Let me show here: it just connects to port 7000 and generates data, and the model predicts the connection.

tcpdump is here, and the model is predicting it as malicious, as expected, because that's what we trained the model on. After that, I'm going to use the feature values that we got from the attack.

That's important: you don't need to use the exact value; what matters is the magnitude, like 10, 100, 1,000. Also important: start by adding one feature at a time; if that doesn't work, keep increasing the number of features until the model classifies the flow as benign.

Here we choose the packet size: 2,000 bytes, i.e., 2 kilobytes. Note that the flow time is 2 seconds by default (you can change that inside the DoHlyzer tool), so our flow time is 2 seconds, and now we put 2 kilobytes as the packet size, and the model classifies it as benign, because we bypassed it. I'll go back and forth between benign and malicious.

You can use the same technique for any model, and it works.
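The "add one feature at a time until it flips" procedure above can be sketched as follows (illustrative; `predict` stands in for the deployed model, and the feature names are hypothetical):

```python
def find_bypass(predict, base_flow, overrides):
    """Apply the attack-suggested feature values one at a time, in order,
    keeping earlier overrides, until the model reports benign (0).
    Returns the working flow, or None if all overrides fail."""
    flow = dict(base_flow)
    for name, value in overrides:
        flow[name] = value
        if predict(flow) == 0:
            return flow
    return None

# Toy stand-in for the deployed detector: flags small-packet flows as tunnels.
toy_predict = lambda f: 0 if f["packet_size"] >= 2000 else 1
result = find_bypass(
    toy_predict,
    base_flow={"packet_size": 120, "flow_time": 2.0},
    overrides=[("flow_time", 2.0), ("packet_size", 2000)],
)
```

In the real demo the "override" is achieved by instrumenting the tunnel tool's traffic, not by editing a feature dictionary, but the search order is the same.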

Okay, I'm going to upload this video so you can watch it at your own pace.

So, to wrap up: this algorithm, this same attack technique, can be applied to any model.

For defending, the most basic technique is to train your model on the same adversarial examples: you take the attacked features and use that attack data in training. Another suggestion is to avoid features that are changeable by the user, like packet time or any other time-based feature, or any feature that depends on a size the user controls. You can also use an auxiliary model to detect such changes.
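The most basic defense mentioned, retraining on the adversarial examples, looks like this in sketch form (synthetic data of my own, not the talk's pipeline):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.2, 0.05, (200, 4)),   # benign flows
               rng.normal(0.8, 0.05, (200, 4))])  # tunnel flows
y = np.array([0] * 200 + [1] * 200)

# Adversarial flows a constrained attack pushed into the "benign-looking"
# gap between the clusters; we know their true label is malicious.
X_adv = rng.normal(0.5, 0.02, (100, 4))
y_adv = np.ones(100, dtype=int)

# Basic adversarial training: append the attacked flows with their true
# (malicious) label and refit the same model class.
hardened = GradientBoostingClassifier(random_state=0).fit(
    np.vstack([X, X_adv]), np.concatenate([y, y_adv]))
```

This only covers the attacks you already know about; the other suggestions (dropping user-controllable features, auxiliary detectors) address attacks you haven't generated yet.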

To conclude: by adding constraints to the vanilla zeroth-order attack, we were able to bypass the DoH model, and we were also able to create a generic algorithm to attack any model you can use.

For future work, our goal is to validate the same technique on other kinds of problems, like classic DNS tunnel models; we also want to identify other feasible techniques using other black-box algorithms, and we are open to contributors. Here's the contact information for Dr. Loreno and Dr. Julio.

All the source code is available in this GitHub repository; feel free to take pictures.

Okay, and that's it.

Q: Are there Snort or Suricata modules that will detect these types of tunneling?
A: I'm not sure, sorry.
Q: Are there Snort or Suricata rules or modules which will detect this?
A: Yes; Suricata rules for tunnel detection have been available since the '90s, but they are rule-based, so they're limited. When you create models you have more flexibility and the success rate is higher. But if you have nothing, of course, go use Suricata; that's okay.

Q: Thanks, well done Emanuel, great presentation; amazing to see what's happening down there in Brazil at iFood, fantastic work. My impression is that you have a really complex framework to work with, at least from a beginner's mindset. I wanted to learn how generic the tools you've created are, so that they could be used for any sort of pcap handling. My question relates to pcap: how generic is the framework currently in accepting the ingestion of pcap feeds?
A: Good question. Nowadays pcap is basically the standard for network models; in any paper you'll see the network features are based on pcap, so it's easy to find tools in the wild that support pcap-derived features.
Q: All right. Let me also mention, if I may, that Nelson Wutu, our friend, is watching on YouTube, so he sends kind regards to you.
A: Thanks.
Q: Also: if I wanted to start out (I know cybersecurity, but I don't really understand machine learning models), from a pure research perspective, what first steps would you advise for the general public?
A: I'm kind of biased, but you can start with Kaggle; you can, for instance, learn basic machine learning, Machine Learning 101. You can also read write-ups from the AI Village machine learning CTF; you can learn a lot from those. I learned a lot reading them, even at the beginning, because you get different answers to the same problem, and that's wonderful for learning.
Q: I see. And if I still may, sorry: I understand this is almost pure research, so you don't necessarily need a clear goal, but are you pursuing some particular goal with this tool set and the contributions you're making? Is there a research question you want to answer with this platform? Let me try to guess: do you want to prove iFood's security in a certain aspect of the platform by conducting this kind of research, or is it open research where anyone can figure out what they want to do with your attack models?
A: The goal here is open research, so that all the people in the world can contribute to and benefit from the research.
Q: Fantastic; hopefully in the future I'll benefit from that. As the gentleman said before, eventually this could be part of some intrusion detection improvement in the future; tools such as Snort and Suricata could benefit from the improvements you're making in the AI field, improving detection engineering.
A: Yeah, but it's important to mention: where can you find these kinds of models today? For instance, I'm sure CrowdStrike runs models like this, not on your machine but in the cloud, and of course much better than I did here. But this is a starting point to learn things and, in the future, contribute to more impactful research.
Q: All right, thank you very much, and congratulations.
A: Thank you very much.

Thank you. [Applause]

[Music]
Okay, good afternoon everyone. BSides Las Vegas, Ground Truth: we are here for the talk "What do we mean when we say internet measurements, and why does it matter so much for security?" This is Ariana Mirian. A few announcements before we begin: we'd like to thank all of our sponsors, especially our diamond sponsors Prisma Cloud and Vanta, and our gold sponsors Adobe, Dropzone AI, and Project Circuit; without their support, along with other sponsors, donors, and volunteers, this event wouldn't be possible. Cell phones: these talks are being streamed live; as a courtesy to our speakers and audience, please check that your cell phones are set to silent. Thank you, and

[Applause] enjoy.

All right, thanks so much, folks. Can everyone hear me okay? Volume's okay? Awesome, thank you for the thumbs up. So: what do we mean when we say internet measurement, and why does it matter for security? Let's get into it. Often when people hear "security research," like when I say I work in security as a researcher, it's "oh, you must work in threat intel." No. "Pentester?" No. "Okay, but surely you're a reverse engineer." No: I work in this other field called internet measurement, or evidence-based research. What I'm hoping is that you folks leave this talk with an understanding of what it is, with some examples, and that when you hear the term "security research" you think of more than just those first three pillars.

But before I get into examples, I think it would behoove us to set the stage and talk about what measurement is. At a very broad level, measurement is a way to quantify the world around us, with the goal of bettering it. And the thing is, the security field is actually fairly nascent; I know it's been around for two or three decades, but that's still pretty new. Other fields have had measurement at the core of their science and their principles for far longer. For example, I know folks who have figured out ways to measure

depression and cognitive well-being in adolescents using measurement. There's been some amazing work, as many of us are aware with our masks on, on the public stage about how to measure COVID-19 spread and how to mitigate it, and you can use measurement in physics to see the polarization of light particles. So measurement is prevalent all around us; it's been around for a long time and it's used in all these different fields. Internet measurement specifically is about quantifying and improving the many parts of the internet, including the security aspects. The nice thing about internet measurement is that it contains a lot of different facets: we can measure server behaviors, we can measure people (what are they doing?), or we can measure other people, like the attackers trying to break into servers.

So I'm going to concretize this with three examples: three ways we can actually use internet measurement to improve security. In my own work I've asked myself: how do we measure attacker behavior in the hack-for-hire market? How do we compare our results to ground truth when we're an internet-wide scanning engine? And, in the context of a SOC, how do we improve vulnerability

notifications within a SOC? The cool thing about internet measurement is that, yes, there's science, and yes, we're trying to do really valid work, but I think of it as a proxy for answering overarching strategy questions, to help better the world. So really, when I say "how do we measure attacker behavior in the hack-for-hire market," I'm actually thinking: what defenses can we build in email to better protect targeted users? When I think about comparing our results to ground truth, what I'm really asking is: how could our scanning encompass more breadth and depth and be more useful to users? And finally, when I think about improving vulnerability notifications, it's not just about the notifications; it's really: how do we reduce the attack surface of an IT organization by removing those vulnerabilities?

A quick break to answer why you should listen to me for the next 30 minutes: my name is Ariana, and I currently work as a senior security researcher at Censys, but previously I did my PhD at UCSD, where my thesis was focused on, you guessed it, internet measurement and security decisions. All of the projects I'm going to talk about are projects I've worked on pretty directly, and I chose these three because

My goal is to give you a very broad range of what internet measurement can actually be. So let me dive in with the first project: how do we measure attacker behavior in the hack-for-hire market, with the overarching goal of figuring out what defenses we can build to better protect users?

As many of you know, email accounts are super rich in information, which makes them super lucrative attack targets. Defenses have made large-scale attacks difficult, defenses like two-factor authentication, spam filtering, and security questions like "what are your hopes and dreams." But targeted attacks still remain an issue, and that's because the economics of targeted attacks are very different. Instead of sending 100,000 emails and hoping that ten people click, you, as the attacker, are spending a lot more time cultivating a targeted message, with the hope of a higher payout. Usually when we think of targeted attacks we think of high-profile targets, politicians, celebrities; you can tell these slides are kind of old, because the most recent news article was John Podesta. But the reality is that at the time of this study, and it still exists today, there is an underground market that provides hacking services for hire.

This is an example of one of the advertisements. It's essentially a hacking group that purports to break into any Yandex, Rambler, or Gmail account that you want broken into, for anywhere from $100 to $300. This hack-for-hire market had not yet been examined, and the reality is that $100 to $400 is actually pretty attainable for a lot of us if we've been jilted by an ex-boss, an ex-partner, etc. So my colleagues and I set out to answer three main questions: how many services can we find, how sophisticated are their methods of attack, and how widely used are these services? Again, with the overarching goal of figuring out what better defenses we can build.

I do want to make a quick note that this entire study was done with Gmail, because we were collaborating with folks on the anti-abuse team at Google, but as you'll see, the results generalize pretty well. To give a quick overview before I jump into the specifics: we discovered these services that purported to break into Gmail accounts, and we created online personas as the buyer and the victim in order to engage with them, because I didn't want to give out my personal info. We then engaged with each service as the buyer persona.

We'd say: hey, we want to hire you for this victim. Again, both the buyer and the victim are completely fake. We then monitored the attacks from a variety of different vantage points, and if they were successful, we delivered payment, because they did what they said they were going to do. So some people made money out of this study.

I do want to dive quickly into the buyer and victim personas, because there are some important details here that will make sense when I talk about the results. Like I said, I didn't want to use my own email, and it would also be weird if I just kept saying, "I have 20 people I'm really mad at, please hack into all of their accounts." So for every service we reached out to, we made a buyer persona and a victim persona. For the buyer persona, we just made a Gmail address with some believable name, like "ariana.mirian" plus some numbers at gmail.com, but not actually my name. The victim persona was a little more intricate, because one of the things we had hypothesized is that in these more targeted attacks, the attackers would take facets of the online digital footprint that a lot of us have and utilize them in the attack. We didn't know what they would use, so we decided to build as much of a digital footprint as we reasonably could. Each victim persona had a Gmail address, again some generic first name, last name, numbers; an online web page for a business they purportedly worked at or owned (I made a lot of small-business owners), with the Gmail address linked on the page; and additional Gmail addresses for their associates, also linked on the web page. In one case, I remember, the victim was a man who owned a carpet cleaning service with his wife, and there was this whole backstory. I got really good at this. We also made a Facebook page for these victims, and we set up SMS 2FA on all of the Gmail addresses. That was because, at the time of the study (I don't know what the statistics are now), SMS 2FA was the most widely used form of two-factor authentication, so we wanted to see not only whether they could create a convincing attack, but whether they could also bypass the most widely used second factor at the time.

So the first question we wanted to answer was: how many services could we find? We looked at a bunch of underground forums, and we reached out to a number of our contacts on anti-abuse teams at tech companies. Overall we found 27 services, and we reached out to all of them. Ten never responded to us, for reasons I could hypothesize about but don't know for sure. Twelve responded initially, "yeah, we're totally going to hack into this account for you," but then made no attempt. Three were scams: they purported to have broken into the account, but we saw no indicators of compromise, no Gmail logs showing anyone had gotten in. My favorite scam was this one, and I regret that I don't have the GIF: I put in the email address and then watched this web page for three minutes as it told me it was hacking into my Gmail account. There was also a button to speed it up; you could pay them more money to go faster. They did not break in. I lost like 250 bucks on that one. It was really entertaining, though. The really cool thing about doing academic research is that you can do a lot of crazy stuff and then someone pays for it, and it's like, all right, well, that's great.

But that means there were five of the 27 that made an attempt, so for the rest of this portion of my talk I'm going to focus on those five, because that's where we had results.

So how sophisticated were the methods of attack? We never observed any communication with the Facebook account, and we never observed any communication with the associates; we set up all these facets of the digital footprint, and a lot of them just weren't utilized in the attacks. All five services sent an email to the victim. In one service's case, the email contained a malware executable that wouldn't run. This was the one time in my life I really wanted to get owned: we tried a bunch of different laptops, a bunch of different VMs, and it was just broken. We uploaded it to VirusTotal, which said it was probably a remote access trojan, which, for those who aren't aware, is basically a piece of malware that records what you're doing on your computer, including what you're typing into gmail.com. But that means that four of the five services used phishing in their attacks. Very good phishing: these were incredibly targeted, highly crafted messages.

This next graph shows that these phishing attacks were persistent and personalized; let's just walk through it. I know there's a lot happening, so just focus on that top row for now while I explain what the graph is saying. The letter is the service, which we've anonymized for privacy concerns, so this row is service A and the first time we hired them. Each dot represents an email they sent to our victim account, and you can see that across all the rows they sent, in most cases, multiple emails. The color denotes what sort of phishing lure, what sort of bait, they were using to get us to click. Light blue means they used personal details, and by personal details I mean details from the web page we set up for this fake person who doesn't exist. There is no way they could have crafted those emails unless they had Googled that email address, found the web page, and then crafted an email spoofed to look like it came from my wife at my carpet cleaning service, which was actually one of the emails. You can also see in the legend that different types of lures were used: some purported to be a Google login, some purported to be from the government. But these attacks were persistent; when we didn't click, they just kept sending more. And they were personalized. This was a pretty big finding for us: it wasn't a one-and-done sort of thing, they were putting in the work, trying multiple emails to get in. The X's mark where we clicked on the emails, because I wanted to know how they would get in. The TL;DR is that these targeted attacks were able to bypass two-factor authentication in their phishing flow.
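To make the chart concrete, here is a small sketch of how you might tabulate persistence and personalization from a log of phishing emails. The data here is entirely made up (the real services and lures are anonymized in the study); this is just my illustration of the bookkeeping.

```python
from collections import defaultdict

def summarize_lures(emails):
    """Group phishing emails by (service, hire) and summarize persistence.

    Each email is a (service, hire, day, lure) tuple; returns a dict mapping
    (service, hire) -> {"emails": n, "lures": set of lures, "span_days": d}.
    """
    by_hire = defaultdict(list)
    for service, hire, day, lure in emails:
        by_hire[(service, hire)].append((day, lure))
    summary = {}
    for key, sent in by_hire.items():
        days = [d for d, _ in sent]
        summary[key] = {
            "emails": len(sent),                 # persistence: how many messages
            "lures": {l for _, l in sent},       # personalization: how many angles
            "span_days": max(days) - min(days),  # how long they kept trying
        }
    return summary

# Hypothetical data shaped like the chart: service A, on its first hire,
# kept sending for over a week while rotating lures.
emails = [
    ("A", 1, 0, "personal details"),
    ("A", 1, 2, "Google login"),
    ("A", 1, 5, "government notice"),
    ("A", 1, 9, "personal details"),
    ("B", 1, 0, "Google login"),
]
summary = summarize_lures(emails)
print(summary[("A", 1)])
```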

So I would click on the email, whatever the lure was, and it would take me to a Gmail sign-in page that was actually not Gmail, but a domain that looked really close to Gmail. I would put in the password for the victim account, and most of these services accounted for the fact that there was 2FA protecting the account: the next web page was again a prompt that looked like Gmail, saying, "hey, we just sent a text to your phone, could you please provide us that 2FA code." The phishing attempts that did not anticipate 2FA (I'd put in my password and it would just forward me along) adapted: they sent more emails later that then accounted for 2FA in the phishing flow by asking for the code. In fact, one of the services doubled its price when they realized 2FA was protecting the account; they came back to me as the buyer and were like, "there's some more going on here, we need $500," and I was like, all right, it's fine. We paid them. We paid all of them. So this is an overview of the level of sophistication of these attacks, right?
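One classic tell in this flow is the sign-in domain that is close to, but not exactly, the real one. Here is a minimal sketch of a lookalike-domain check using edit distance. This is my own illustration, not a defense from the study, and real detectors handle far more than simple typo distance (homoglyphs, subdomain tricks, and so on).

```python
def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def looks_like(domain: str, legit: str, max_dist: int = 2) -> bool:
    """Flag domains that are near, but not equal to, a legitimate domain."""
    d = edit_distance(domain.lower(), legit.lower())
    return 0 < d <= max_dist

print(looks_like("gmaill.com", "gmail.com"))   # near-miss spoof is flagged
print(looks_like("gmail.com", "gmail.com"))    # the real domain is not
```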

The other question we were interested in answering was: how widely used are these services? Because we had a very narrow view: we could hire them and see what methods of attack they used, and it wasn't any sort of fancy zero-day, it was just really sophisticated phishing, spear phishing. But we wanted to know how widely used these services were, and this is where partnering with folks on the Google anti-abuse team really came in handy. They could look at the logins from the Gmail side and create fingerprints, because a lot of these ended up looking like phishing kits, based on how fast they were, how quickly they responded to us typing in the password and the 2FA code. Then they could see how many real Gmail accounts had been logged into with these fingerprints. This graph shows, from March to October of 2019, how many actual Gmail accounts had been logged into. It's fairly small, I mean, the y-axis only goes up to 35, and these are unique accounts, but there are still hundreds of people affected by these services. And this is a lower bound, because these are actual Gmail accounts that had a successful login with the fingerprint, not an attempted login; there could be hundreds more intended victims who saw the email, thought it looked weird, and didn't respond. In tandem with this research, Gmail introduced a couple of new defenses that help protect against this attack, which we call man-in-the-middle phishing.

So that's one of the measurement studies. At the end of the day, we were trying to figure out what defenses we could build to better protect targeted users, and the measurement question underneath was how to measure attacker behavior in this really niche market. In the process, we found that the attackers are not as sophisticated as we had hypothesized, and we found an effective defense for this attack that was then deployed at a major email provider. So this is one example of an internet measurement study helping security for good.

I'm going to change tracks a little bit and talk about some work I did at Censys more recently. The measurement question was: how do we compare our results to ground truth? And the overarching strategy question I was attempting to answer was: how can our scanning encompass more breadth and depth and be more useful to users? Those of you who were in my talk yesterday are probably thinking, "wow, this is déjà vu, you only ever talk about good data quality." We love good data quality. It's a lot of what I think about.

Just a quick primer: Censys is the one place to understand everything on the internet. It's an internet-wide scanning engine: we do all the scanning, and then you can access the data, so you don't have to have your own servers or your own setup. For those who aren't as familiar with internet-wide scanning, I don't really have time to dive into it, but we essentially provide a map of the internet. You know how Santa goes to every house and drops off gifts? The analogy here is that Santa goes around to all the network devices in the world, knocks on all their doors, asks, "hey, what do you speak, and what are you willing to tell me? Okay, bye," and then runs off to the next device. So we do a lot of scanning to see what's on the internet, what these devices are willing to tell us, what services they speak, etc. And we have this interesting question, which is that a lot of people consider Censys ground truth, but how do we quantify the accuracy of ground truth when we are considered the ground truth? Like I said, I think about data quality a lot, so we came up with this experiment: compare Censys to nmap. nmap is like the OG internet scanner. It's from the '90s, it's still around, you can still use it, I still use it, it's great.
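The "knock on the door and ask what you speak" step is essentially a banner grab. Here is a self-contained sketch of what that looks like at the socket level; it spins up its own throwaway local service so nothing real gets scanned, and it is a toy, not how a production scanner is built.

```python
import socket
import threading

def serve_banner(banner: bytes, host: str = "127.0.0.1"):
    """Tiny TCP service that greets each client with a banner, then hangs up."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))          # port 0: let the OS pick a free port
    srv.listen(1)

    def _run():
        conn, _ = srv.accept()
        conn.sendall(banner)
        conn.close()
        srv.close()

    threading.Thread(target=_run, daemon=True).start()
    return srv.getsockname()     # (host, port) to knock on

def grab_banner(host: str, port: int, timeout: float = 2.0) -> bytes:
    """Knock on one door: connect, read whatever the service volunteers, leave."""
    with socket.create_connection((host, port), timeout=timeout) as s:
        return s.recv(1024)

# Pretend one device on the "internet" is an SSH server.
host, port = serve_banner(b"SSH-2.0-OpenSSH_9.6\r\n")
print(grab_banner(host, port))
```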

So we said: okay, we're going to take a set of hosts, scan them with nmap, and compare the Censys results to an in-the-moment nmap scan to see where things differ. The scanners are different, but we figured this would at least give us a starting point for asking what, if anything, we're missing, because we want to grow and change. This diagram shows what we're trying to do a little better: take the Censys data, take the nmap data, find the overlap, find the exclusions, and figure out how we can improve. So we ran this, and between Censys and nmap we found an 87% overlap, which is pretty good. There were 13% of hosts or services that Censys found but nmap did not; we could call these false positives, but since we're comparing ground truth to ground truth, that's maybe a bit of a misnomer. And then there were about 5% of things that Censys did not find but nmap did. Of course, whenever I see results like this, I go validate and do some manual digging to see what the heck is going on.
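Once each scanner's results are normalized to the same keys, the overlap and exclusion math is just set arithmetic. A rough sketch with toy data (the real comparison keys on more than host and port, and the counts are far larger):

```python
def compare_scans(censys: set, nmap: set) -> dict:
    """Report overlap and one-sided exclusions as percentages of the union."""
    union = censys | nmap

    def pct(subset: set) -> float:
        return round(100 * len(subset) / len(union), 1)

    return {
        "overlap": pct(censys & nmap),
        "censys_only": pct(censys - nmap),  # found by Censys, missed by nmap
        "nmap_only": pct(nmap - censys),    # found by nmap, missed by Censys
    }

# Toy example: services keyed by (host, port); numbers here are made up.
censys = {("1.2.3.4", 443), ("1.2.3.4", 22), ("5.6.7.8", 80), ("9.9.9.9", 25)}
nmap   = {("1.2.3.4", 443), ("1.2.3.4", 22), ("5.6.7.8", 80), ("5.6.7.8", 8080)}
print(compare_scans(censys, nmap))
```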

In a lot of these cases, the discrepancies arose because we were seeing hosts that would be online during our in-the-moment nmap scan, then disappear a couple of hours later, only to come back. Censys is an internet-wide scanning engine; we're scanning pretty consistently, but we're not scanning every 30 minutes, or exactly in the moment. And this raised a really interesting question, which I want to share because a lot of internet measurement work is about taking a step back and asking: why is this happening? Are we even measuring the right thing? Is this flapping behavior, are these differences, a facet of differences in the scanners, or just of timing? Because as an internet-wide search engine we have churn, we have databases; we're not just running a one-off nmap scan. So, as with many moments in my life, I was forced to take a step back and ask: are we measuring the right thing?

We ran the experiment again, but instead of comparing the nmap scanner to the Censys API results, which is what we were originally doing, we compared the scanners to each other. We took the same set of hosts, scanned them at the same time across the two different boxes, and examined the differences. Again, about 86% of the time the results were the same. In this experiment, we found that Censys found about 10% of services that nmap did not, and since at this point we were doing a head-to-head comparison, we went and manually verified that 10%. It turned out these were all services that we scan better. For example, SIP over TCP is very complicated, very funky on the internet; the nmap scanner does well, but we have additional logic to account for a lot of these weird edge cases.

So when I say we manually verified, I mean we really dug in, and for the majority of that 10% it was cases where we have simply written code that accounts for more of the weirdness of the internet. But we also found 4% that nmap found and our scanner did not, and that was the place for growth and improvement, a place for us to keep getting better as an internet-wide scanning engine. Like I said, though, one of the things we realized in this study is that some of these hosts are just going up and down. So we had a real "now what" moment, because one of the difficulties is that Censys data is not an in-the-moment scan: there's going to be a little bit of lag, and a lot happens on the internet. These hosts that are very ephemeral, online for one scan and then suddenly offline for the next, we're not just going to churn them out immediately. We actually want to double-check our work and make sure they're still there, or still not there, so that we're not doing all this churn for no reason.
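The flapping behavior we kept seeing can be made concrete with a small sketch that counts up/down transitions across consecutive scans. The hosts and threshold here are hypothetical; this is an illustration, not Censys's actual logic.

```python
def flapping_hosts(observations, min_transitions: int = 2):
    """Identify hosts whose up/down status flips repeatedly across scans.

    `observations` maps host -> list of booleans (True = service seen),
    ordered by scan time; a host counts as "flapping" once its status
    has changed at least `min_transitions` times.
    """
    flapping = set()
    for host, seen in observations.items():
        transitions = sum(a != b for a, b in zip(seen, seen[1:]))
        if transitions >= min_transitions:
            flapping.add(host)
    return flapping

# Hypothetical scan history: one stable host, one that blinks in and out,
# and one that genuinely went away.
history = {
    "stable.example":    [True, True, True, True],
    "ephemeral.example": [True, False, True, False],
    "gone.example":      [True, True, False, False],
}
print(flapping_hosts(history))
```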

So as a team, we asked: what can we do to convey to the public that there are these instances where we say, hey, we saw this thing, and now it's no longer online, so we're doing some double-checking just to make sure? In other words, how do we convey the cases where we think a specific host is no longer present, but we're running additional checks to make sure it's actually down and there isn't just a network blip somewhere along the many paths of the internet? As a result, we exposed a field, you can see it right now, called "pending removal since," and this is an example of how it shows up in the data. It literally says: hey, we think this host is no longer online, but we're doing our sanity checks; as of 2024-05-26 and some timestamp, we haven't seen a positive scan, so we're doing our due diligence.

The other interesting thing is that we went back to our original results, where we compared the nmap scanner to the Censys API, and checked how many of those hosts had "pending removal since" set, in other words, hosts we think are probably down. How does the false positive rate change? It drops from 13% to 5.5%. So in the majority of those cases where we saw something and nmap didn't, we had actually already received a negative scan; we just hadn't churned it out of the database yet. By exposing this flag, we're providing more transparency to everyone using the data. And if we also account for the protocols that we know Censys scans better, the false positive rate drops even further. So this is an example of a very different experiment where I basically just kept digging, but as a result we reexamined what metrics of comparison we should actually be using for this pretty tricky question, and also how we expose these intricacies of timing externally.
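Mechanically, the drop from 13% to 5.5% comes from excluding the pending-removal records before recomputing the rate. A sketch with toy counts chosen to mirror those percentages (the real experiment's counts are not given in the talk):

```python
def fp_rate(total: int, censys_only: set, pending_removal: set) -> float:
    """Share of compared services that Censys reports but nmap did not,
    after dropping records already flagged "pending removal since"."""
    still_disputed = censys_only - pending_removal
    return round(100 * len(still_disputed) / total, 1)

# Toy numbers: 200 services compared, 26 Censys-only hits (26/200 = 13%),
# 15 of which were already awaiting removal when the comparison ran,
# leaving 11/200 = 5.5%.
censys_only = {f"svc{i}" for i in range(26)}
pending = {f"svc{i}" for i in range(15)}
print(fp_rate(200, censys_only, pending))
```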

This project also prompted a totally different project, which I presented yesterday, where we scan the internet every 45 minutes and are now analyzing those trends to better our scanning data. So there's a lot of interesting stuff here.

With my last 10 or 15 minutes, I'm yet again going to switch topics and talk about a completely different internet measurement project. First we talked about measuring attacker behavior; then we talked about comparing ground truth for an internet-wide scanning engine; and now I'm going to talk about some work I did as a security researcher in the IT org at UCSD, where we wanted to improve our vulnerability notifications, but really we wanted to reduce the attack surface of an IT organization, specifically UCSD's. This was an interesting partnership, because I held this role jointly while doing my PhD, and it was prompted by the IT org saying: we want to do better, we want to reduce our attack surface, but things are pretty abysmal, so what should we do? And we said: hey, can we run a measurement with you? They were thrilled, so that's what we ended up doing together.

Just for some background, as I've given for the other two projects: a lot of organizations have moved their infrastructure to the cloud, but there are a lot of legacy organizations, like a large academic institution, with physical machines on premise that are still maintained by a multitude of different admins. In an ideal world, you have people updating the machines constantly so that they don't have vulnerabilities exposed to the public. The reality is that these disparate physical systems do have vulnerabilities, they are not patched at consistent rates, and they affect the security posture of the organization, because all of a sudden you have all these exposed vulnerabilities that an attacker can use to get into the network, get into the systems, and wreak havoc.

Patching is not a new problem; it has a really rich history, and yet it persists. A lot of tools have been created to try to make patching easier, and a lot of them are optimized for the machine. We wanted to take a very different approach: what if we fine-tuned the process for the human? What if we took the process and the technologies currently employed and examined, holistically, how to make the process easier for the people, not only for the machines? That's a bit of a different mentality from a lot of the related work I had seen in the space, where the playbook is, "oh, just set up PagerDuty, do this, do this, do this." Well, what if you have an org of 30,000 people, and those 30,000 people are all running different experiments, all have different hardware setups, and all have their own admins who are not talking to each other?

As with any good project, the first question we had to ask was: what isn't working so far? Like I said, we had teamed up with the IT security team at UCSD, and the lead security engineer in charge of this endeavor had been sending out these emails, so I asked: hey, what are the emails you're sending to these admins that no one is responding to? And so this is an example. All the... oh. All of the private information has not been redacted. Whoops, hang on, I need to show you this email. That's awkward. Yeah, this is literally recorded. My bad. Sorry folks, live editing: I had to pull these slides in from a different slide deck, and some of the things did not get copied correctly. Okay... great, do the thing.

No... sorry for this snafu, folks. I also only found out yesterday that I was giving this talk, which is why I had to alter some slides really fast. Okay, we're back on track. This is the email he had been sending out to the admins within the IT org, and like I said, a lot of these people work in different groups; they're all just broadly under this IT umbrella. A couple of things immediately stood out to me about this email. The first is that it was basically just a laundry list: it would say, hey, here are your severity-4 and severity-5 vulnerabilities, here are the hosts they're on, you need to go log into Qualys to figure out what the vulnerabilities are and go patch, and you need to patch them ASAP, thanks. Looking at this and thinking about the human in the loop: the email didn't list the vulnerabilities, it just said go figure it out, here's the number, good luck. It didn't list any additional details, and it required sysadmins to perform extra steps to get the necessary information: they had to log into Qualys, and as we found out later, a lot of these people didn't have access to Qualys, so they'd click the link, fail to log in, and just give up: "oh, I got that email again, guess I'm not going to do anything." At the end of the day, this added a lot of friction to actually executing, because if you've ever done any sysadmin work, you know things are constantly on fire, and then the IT security team tells you, hey, you've also got to go fix these ten vulnerabilities, good luck, in a system you don't have access to. So we reviewed some related work and basically employed some pretty basic behavioral-science principles: instead of asking people to do all this additional work, what if we just laid it out for them as simply as possible?

So this is the new email we started sending out. Some things I want to point out: each email focused on only one family of vulnerability; this one focuses on Windows. We say how to patch, with instructions for different versions, different distros where applicable, and we link to various resource articles. And, which you can't see in this email, there would be an attached CSV that listed all the details from Qualys for them directly, so they didn't need to log in anywhere; they could see the CVE, the vuln name, the IP, etc., all this metadata. So we started sending these, and the question was: did the patch rate change? Did anything happen? We created an automated pipeline to analyze the data, to see, during the old email and during the new email, what the patch rate was for different contacts, different vulns, and different times of the month. In aggregate, the patch rate increased from 3% to 78%, which was huge: people went from basically doing nothing to actually reading the emails and doing something. But of course, I work in measurement, so my question was: why is it only 78%? Why is this not 100%? I'm giving you instructions that I painstakingly wrote out.
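The analysis pipeline behind numbers like these boils down to grouping patch outcomes by contact and by vulnerability family. A sketch with invented records (the real pipeline read Qualys data, which I am not reproducing here):

```python
from collections import defaultdict

def patch_rates(records):
    """Aggregate patch outcomes into per-group patch rates.

    Each record is (contact_group, vuln_family, patched: bool); returns
    overall rate plus breakdowns by contact and by family, in percent.
    """
    def rate(flags):
        return round(100 * sum(flags) / len(flags), 1)

    by_contact, by_family, overall = defaultdict(list), defaultdict(list), []
    for contact, family, patched in records:
        by_contact[contact].append(patched)
        by_family[family].append(patched)
        overall.append(patched)
    return {
        "overall": rate(overall),
        "by_contact": {c: rate(v) for c, v in by_contact.items()},
        "by_family": {f: rate(v) for f, v in by_family.items()},
    }

# Hypothetical notification outcomes: browser patches land fast,
# OS patches lag, and one team patches everything.
records = [
    ("team-a", "chrome", True), ("team-a", "linux-distro", True),
    ("team-b", "chrome", True), ("team-b", "linux-distro", False),
    ("team-c", "chrome", True), ("team-c", "windows", False),
]
print(patch_rates(records)["overall"])
```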

So we analyzed the data along a couple of different facets and found some interesting trends. The first is that some contacts were just much better at patching than others: some groups, some admins, were at 100%, and some were a fair bit lower. The average was 78%, but there were some pretty disparate distributions when we looked at the groups and contacts we were sending to. Certain vuln families also just get patched more: patching something like Chrome is a lot easier than patching something like your Linux distro, and we saw that in the data, where isolated applications were patched at much faster rates than something like an operating system, which makes sense, it's more of a pain. And some vuln families just take more time to patch; that was the second part of that takeaway.

At this point we did something that was a little new for me: instead of just looking at data, we went and talked to people. We conducted semi-structured interviews with many of these sysadmins, anonymized, to add a qualitative view to the quantitative data. We had all these hypotheses, and instead we just went to the admins and asked why: why are you not patching this over that? How do you feel about these emails? The reason I say semi-structured is that we had a list of questions (if you've ever done academic research, you know you have to get your questions approved by this thing called the IRB), but we also learned a lot of fascinating tidbits when we just let people talk and listened to what they said.

In particular, when we asked about the old email, we saw three main trends. First, the monotony of the old email made it really easy to ignore: it looked the same every single week, every week it just said go log into Qualys and check what's vulnerable, and people straight up told us it was really easy to ignore, so they started ignoring it. Second, we found out that many teams have exceptions: "oh, my manager tells me I'm not supposed to update these four programs, so I just ignored the email." And we're like, oh, we were not aware of that as the security team; that's interesting. Third, a lot of these notifications fall outside their patch cycles: many teams have set times when they update specific applications and programs, so they'd see the email and think, yeah, I'm going to get to Red Hat in two weeks anyway, why would I listen to this email right now?

Overall, we found really positive sentiment toward the new notification, but we also found room for improvement and better integration. Like I said, the fact that a lot of these teams were saying, "yeah, I have exceptions, and I've been filling out this Google form with all of my exceptions, don't you get the exceptions?" was news to the entire security team. That was a huge disconnect, one that has since been remediated, and one we would not have found if we had only looked at the data. We needed to go talk to people to find out that there was some random Google form they were filling out, on the basis of which they thought they were exempt from patching. Yeah, I know, it was wild. So we not only improved the process, we also uncovered systemic differences in infrastructure and figured out the right metrics, and to my knowledge these emails are still being sent out and people are continuing to patch at UCSD. So that's the third project: in the process of trying to figure out how to reduce the attack surface of a pretty spread-out IT organization, we increased the patch rate significantly, and we also found major discrepancies in systems and organizations that could then be remediated. That was a pretty huge finding for us.

remediated, and that was a pretty huge finding for us. Oh, it's 3:38, okay, great. So let's just recap. I went over three very, very different projects that in my opinion all fall under the umbrella of internet measurement. The first one, we looked at attacker behavior because we wanted to figure out what defenses we could build. The second, we looked at internet-wide scanning data and compared it to other scan data to figure out how we could get better. And in the third, we did a mixed-methods study to improve vulnerability notifications, and specifically vulnerability remediation, at a pretty spread-out IT organization. And so what I'm hoping you folks take away

is that internet measurement is a tool for security research, but you can also employ it in your own organizations, in your own situations. So for example, one totally different area that I don't work in is hate and harassment: how can we measure hate and harassment on online platforms to better think about defenses to protect targeted users? I think about the user a lot; I think we're all people at the end of the day. The project that I talked about yesterday: how ephemeral are hosts on the internet, to better understand how we alter our scanning to have even more up-to-date data for

users. Or you could go in a totally different direction and say, what trends exist in leaked data, such that we can better understand how we help journalists democratize knowledge about what's happening in the world via that leaked data. So there's a lot of different ways you can think about measurement: you can think about it quantitatively, you can think about it qualitatively. And what I hope that you've learned from this talk is that internet measurement is about quantifying and improving the many parts of the internet, including security, and it can be a tool for everyone. So with that, I want to thank you folks for your time. I think we have some time for questions, but you can also follow up with

me after on any of these handles and I'd be happy to chat. Thank you so much.

First question: what's it like being a badass all the time? Oh my God, stop. Christian and I used to work together, he's just trying to hype me

up. All right, okay, second question. You kind of mentioned this earlier when you were talking about how measurement is relatively new in the cybersecurity space, and you talked about these other pillars, you know, threat hunting, pen testing, all these other things. How much do you think we can learn from internet measurement, and collecting and quantifying that data and using it to run experiments to gain better understanding? How do we take that out of internet measurement and take it to threat hunting and other things to be a lot more evidence-based? I mean, and I don't know if you share this, we make a lot of decisions about what we

should do and what defenses we should put into place and what products we should buy, and there's no evidence that they're efficacious or that one product works better than the other, outside of a vendor deck, which you can't trust. Yeah. So how do we take a lot of what you've just shown us, which is objective measurement, good-quality data science, and apply it to the other parts of cybersecurity? Yeah, thanks for the question. I think that's really hard to answer. The short answer is by taking a step back and thinking more about holistic approaches. Often when I think about threat hunting or threat intel, for example, we're thinking

about very specific use cases, or a very specific actor, and one way to think about how we bring internet measurement to that realm is: okay, well, what is the spread of this actor, who are they affecting, how much are they affecting, what does that look like over time? And so I think one way to start merging all these fields, because I agree, I think it's very useful, is thinking more holistically. So instead of just this time, this moment, this specific actor: how do I generalize, how do I take a step back and take a more global, holistic view? Does that answer your question?

Thanks. So this is sort of a generic question that comes to my mind whenever I'm looking at my log files: is it an issue for you guys where people view what you're doing as being hostile, where you hit honeypots, where you hit firewalls that start blocking you, where people, you know, try and hack back or complain to your ISP? Is that just a little noise that doesn't matter, or is it part of what you have to deal with when you're doing these measurements? Yeah, just to make sure I understand correctly, you're a little muffled, I apologize: is the question, you know, how do you

deal with honeypots on the internet, people scanning back, hacking back? Correct, along with, like, there are tools that will throw up drop rules and honeypots, you know what I'm talking about, drop rules in firewalls as soon as they see you scanning, and then 15 minutes later the rule goes away. Yeah, so how do you measure, and again, is that just so small that it doesn't matter? Yeah, that's a great question. So I guess the overarching question is: how do we deal with all these facets of the internet that make internet scanning hard, like honeypots, firewalls, things going up and down, et cetera. The first, I guess the base

answer to that question, is understanding what can go wrong and then trying to quantify that. So in Censys we actually do label, to the best of our knowledge, honeypots, tarpits, et cetera, because we know those things exist. Then we can go and do more fingerprinting research to say, hey, what are the most popular honeypots, what do they look like on the internet, what do they look like in our data set, and then how do we label that such that other people aren't deceived as well. So that's one facet of your question. How do we deal with firewalls that go up and down? Honestly, constantly scanning, changing our scanning, looking for new things, and comparing

it to older parts of our data set, and then recognizing, oh, this thing is actually really new, let's continue to dig in further. But I think at the base it's having a deep understanding of the field itself, and this maybe goes to the other questions: understanding, in networking, on the internet, what are the things that could go wrong, what are the things that could exist, and then how do you account for that in your measurements, in your scanning, et cetera. Does that answer your question? Yeah, thank you very much. Yeah, thank you. Hi, great presentation, nice to meet you. I would like to say that I follow the work that is done at UCSD, mostly

with the network telescope that they operate over there. So I would like to ask you if there is an intersection between the UCSD telescope and the work that you perform currently; it's mostly a curiosity, is there any other work that's coming out of the network telescope, if there is an intersection between the style of research that you conduct, the measurements that you conduct, and the knowledge that you acquired, that could be potentially related to this telescope at the university. Yeah, so I mean, I think they're very highly related. I consider a network telescope to be a measurement tool in and of itself. For folks that

don't know, a network telescope is essentially a block of IP addresses that isn't sending out information, it is just receiving, so it's essentially a box to receive information and see what people are sending organically on the internet with no interference, right? Because if we send a scan, then someone might scan us back, because they're like, what the hell are you doing on my network? But this block, this network telescope, doesn't do anything. So yeah, I think they're very highly related, and it's essentially just another tool, right? If we think about Censys, Censys is active scanning, so we're sending out probes, we're

seeing what information comes back. A network telescope is a little bit more passive, right? It's listening and seeing what's happening on the internet, and then, like you said, there's a lot of really interesting trends and research that can come out of it because you're just organically listening to what's going on. Yeah. Finally, I just would like to share that I have what is becoming a cloud telescope, so I am reproducing the method that is used in a physically bound network telescope such as UCSD's, but deploying the same approach in AWS to see how busy Censys is at scanning the devices, and also other internet scanners. So it's quite interesting to see this pattern, this

shifting in terms of behavior, of what's currently going on on the internet. Well, just sharing. Yeah, thank you so much, I love the mask. What mask? Thank you. So I wanted to ask if you could share a little bit more about how you were configuring nmap when you were doing your comparisons between the Censys scanner and the nmap scanner. I think that you mentioned that one of the areas that Censys was excelling in was SIP, and you know, SIP ports, because, yes, SIP is a lot. So I was wondering if you were using the NSE SIP scripts as well, and comparing that

to it? Yeah, we did a number of different tests: we did bare-bones nmap, we used the NSE scripts, we tried dropping anything with UDP, because UDP gets a little funky with filtered results, and even when we tried the NSE scripts we still found a better improvement with Censys. Do you provide any feedback to nmap about any of that, or are they considered a little bit too much of a competitor? Oh, we haven't provided any feedback, but not for any reason besides time, to be quite honest. So yeah, that's definitely something that I can explore separately; there's just always things to be doing. Yeah, absolutely, thank you

very much. Thank you. Great presentation. I thought that going through and re-creating the patching email was very cool, to be able to take a different view on it to get people to respond to it and actually fix the patching. My follow-up on that, my question, is: for the people that said that they had exceptions previously, did you follow that down the risk side of things, to figure out how people got exceptions, whether they were real exceptions, tying it back into a risk-register-type system? Was that something you guys did? I'm sorry, I'm having a hard time hearing what

you were saying, it's just hard with this mask. No, I don't think I can hear the speaker. Oh, okay. So, for the people that said... sorry, got it, okay, very cool. For the people that came back and said, I had an exception already. Oh yes, you went down that... Did you go down that path, back to, like, a risk register? Because a lot of times we find people say that as well, and that's how we find out that our risk-register process is broken, the approvals for it are broken, no one follows up on them, they never expire, you

know. That's more my question: did you go further that way on it? Yeah, so we did examine that. The question was, for the folks who said they had an exception, did we dig into that further. We did. After the semi-structured interview we followed up with them and we're like, hey, what are you talking about, and that's when we found they were like, oh, there's this Google form that I thought was going to the security team, and we were like, we don't own this, where did this come from? And so that was when I handed it off to the security engineers, and they chased it down and clarified some

things. And this actually led to a bigger takeaway for UCSD, which is that because there are so many different organizations who just have their own people, there's a break in communication, and so there were some organizational changes that were made to try and facilitate better communication. But yeah, we didn't have any risk registration or anything; I think what had happened in that case is that someone who had previously been on the team years ago had set that up, because, you know, you understand that there are exceptions to the rule, and then that had just never been handed off correctly. Using a process to better another process... like, I took over USB access

approvals recently, and now we're seeing all the breaks in it that we didn't know existed, that we didn't know about until we started seeing the requests. Yeah. So it always excites me to see one process bettering things, making another one better. Very cool, thank you. Yeah, thank you, and thank you for sharing that. Hi, my name is Chris, great talk, thank you for that. So you might have covered this before I walked in, I walked in a couple minutes late, but for those of us who are aspiring researchers and would love to get more into research, what are some pathways that you could share on ways we could get in? Yeah,

absolutely, great question. One of the things that is super cool about the internet today is that there are a lot of accessible public data sets, and so actually this last question, what trends exist in leaked data, was in part motivated because I had gone to a talk where... oh man, I'm blanking on his name, I'm not even going to try to remember... this really famous journalist who was at The Intercept was like, yeah, there are all these data sets online, but I don't necessarily have the data skills to go and dig into them, and so I pair up with researchers and then they tell me all the cool, interesting stuff, and then

I write about it. And so I think one thing that you could start to do is just look at some of these public data sets, even public data sets that are used in tutorials, just to get an understanding of the tools and how they're used, and then go and apply them to, say, leaked privacy data sets. As for other ways: I got into research by working with a research lab in an academic org; that was my path. But I think there are a lot of different ways, and those are just two of them. Very cool, and I didn't even think about working with journalists; when you have a discovery, maybe share it

and, yeah, get the word out. Yeah, I would say, you know, there was a really interesting talk... man, I can't believe I forgot his name... Micah Lee, is that right? Micah Lee just put out a book, Hacks, Leaks, and Revelations, that talks all about these data sets and how you work with them. That's the book, Micah Lee's book. This is where I was like, wow, there's actually a lot of... thank you for that... there's a lot of public information, public data sets now. But pairing up with someone who might not necessarily have that background but has the interesting questions, if you can bring the data science skills to the table and you can pair up together, I

think that's also a really good way to get your toes wet. Very cool, thank you for the name drop. Yeah, thank you for reminding me, there was a neuron working really hard to try and remember, I was like, oh my God, I'm terrible with names. But I just saw he's going to be doing a book signing at DEF CON. Oh nice, so that's why it was in my head. Gotcha, gotcha. Yeah, I was going to say, I think we're at time. You folks feel free to come find me, and thank you so much for your time, appreciate it.

Thank you, thank you, thank you.

Let's hit it. Good afternoon, happy, happy BSides Las Vegas, what's going on? What's up, what's up, y'all tired? You been partying? All right, wait, wait, you got a party tonight, don't forget, need to get that nap in. All right, so we are here today with EZ, talking about Context Is All You Need: what alerts, events, and logs are relevant to each other. So before we get started we have to thank the money bags, so thank you Adobe and all the other great sponsors that we have here, and all of our volunteers and staff and speakers, and you, right? It's all for us. So without further ado, EZ does want questions, so if you have any questions or want

to raise your hand, we're going to do this interactive, it's going to be fun, we got to wake up, okay? So let's go. EZ, what's up? Sweet, I appreciate you, thank you very much. Hello everybody, my name is EZ. Before we start, I want to do this as interactive as possible, so if you have questions please don't be shy, raise your hand or just shout out the question right away. And even before we start, if everybody can just give me a quick poll, so that I know I'm not talking to a different audience: anybody here did something in the SOC before, or some blue team? Okay, a lot, okay, that's fantastic. The

other people that are not in blue team, can you just shout out what you're interested in, what kind of area; it's fine if you're a student and you're just a generalist, just shout out generalist. But if I can just get a quick poll on what you're into, so I can cater to your interests as well. Purple team? Purple team, all right, more purple, all right. Anything else? All right, okay, wow, this is a fantastic room, none of the compliance folks? None of the compliance that gets me to test everything, right, right. All right, any students, any generalists? Okie dokie. All right, so a little bit of an agenda: I'm going

to start with a quick who-am-I, so that you know who the hell I am, and then I'll talk a little bit about the challenges in the industry as a whole from security operations, but since you're already practitioners I'll probably just breeze through that. I will probably put a lot of time into the analysis challenges, what is hard in our job today as people in the SOC, and then I'll talk a little about ATT&CK, but you guys are probably experts in this area, so I don't know how much time is going to be spent there, probably not that much. And then we're going to talk about investigations and the model

of investigating, which talks about the what, the when, the who, and how we can automate things a little. Go for it. That's a good one, that was a good one, okay. And then we're going to talk a little about some models in machine learning; I know everybody is all about machine learning, and I am a data scientist, so I'll probably bore you with a lot of machine learning, just as I get my mic set up. And we'll talk a little bit about examples, and we'll talk more about questions as well. All right, I will not jump into the screen, okie dokie. All right, a little bit about me: I'm a data

scientist that works in the SOC. I have a bunch of SOC experience, but I have more experience in the data science area than in the SOC, so you're probably more expert on the SOC side. So if I say something that makes you go, huh, this guy is not really covering everything, don't be shy, I would not get offended: shout it out, or raise your hand and be like, by the way, that's not really how we do it, here's a better way to explain it, and that would be fantastic, because I want this to be as interactive as possible. But I worked in the SOC as a data scientist at Royal Bank of Canada

cyber defense, Huawei, and Forescout, and that's about it. All right, a little bit of a general statement in cybersecurity: it seems like things keep on getting worse, and I'm not sure why. If you remember, 10 years back we thought Stuxnet was bad; nowadays, I don't know how bad that was, right, compared to the stuff that has happened since. But it seems as if attacks are getting better, and even stealthier than they used to be, and harder to detect, and it seems like we have more alerts than ever before hitting our SOC, and it seems like we also have a lot more tools. So why are attacks getting more

successful? Some people seem to say that it's probably just bloated and ineffective tooling, but some people are saying that, you know what, systems just got less secure. And if you think about that, are systems actually less secure? I mean, we have the fact that back then things were simpler and now things are crazy complicated. I mean, even in ICS, things were super simple. To pull off Stuxnet, for instance, or Colonial Pipeline or whatnot, things were very different back then, while Colonial Pipeline was a little bit easier than Stuxnet, right? And that's because the networks have more attack surface, right? So back then, Stuxnet was completely air-gapped; to do anything

there, you had to circulate the virus for two years around planet Earth until someone by mere chance gets that virus on their USB and plugs it in. Now you don't really have to do all of that: there's smart grid, there's, I don't know, 5G, ICS-kind-of mobile edge cloud and whatnot. There are so many attack surfaces, and that means that we have a lot more attack vectors to pull off. So anyways, what that takes us to is: everybody in the SOC does not mind any kind of automation, they actually welcome it more than anything else. This is one of the few industries where you go and you talk about automation and people are

like, yes, give me more, please take my job, I don't want to secure this job ever, please take it, give me another job. And according to the SOC survey that is put out by SANS, and I don't know where you stand on SANS, I mean, I'm kind of biased because I'm on the GIAC Advisory Board and I paid them a lot of money, so I feel like they're good, but they do this report and they go out to all the people in the SOC and ask them, what's the biggest problem that you think is out there. It's a pretty good report if you want to read it, it's called the SOC

Survey, by SANS, and it seems like most people just point to this idea of: my investigators take a lot of time, their job is pretty hard, and their investigations are very inconsistent, and all of that is to chase this dream of correlating stuff, like, I just really want to understand what is relevant to a ticket without having to jump through hoops and whatnot. And that is coming at a time where we have all of the data that we need, ideally in a single place, either logically or technologically; it might be the SIEM, it might be whatever you want to call it, but we have a bunch of alerts and events and logs throughout

our organization. Ideally we have good coverage, but even if we don't, we have pretty decent coverage in most cases, and we have all of that in one place. And the way we deal with it is we actually have a lot of lookups: we go in there and we look up what could be relevant to an alert. So if I get, I don't know, some sort of an EDR alert, I go in there and I look up what is going on, what could be relevant there, and so on and so forth. We can use Sigma rules for things like that, in terms of the rules, or what they call

correlation rules, or analytic stories that have a lot of rules that look for things that are relevant. And in this case the rule is simply looking for command strings, via some sort of query language or regular expression that finds these in web traffic, so it's a very straightforward detection rule. We like Sigma because Sigma is a unified language and things are beautiful. We also like what they call investigative playbooks; anybody here familiar with investigative playbooks? Not as many, okay, let's quickly jump into what an investigation playbook is. So there is an open-source thing from Splunk called research.splunk.com; if you quickly go there, you'll find all of the open-source content from Splunk: detections, analytic stories, and playbooks. This is obviously not the only place to find this, but it's just one of the bigger repos out there with information like this. So if you select investigation playbooks, you can essentially look at investigative playbooks. For instance, here is a playbook... I don't know, there's identifier activity analysis, there's internal host SSH, there's what do you do when you find an email, what do you do when you find a threat intel hit, what do you do when you

find something like that. And if you click on any of those, for instance in this case, once a user or device is involved in something, it will go and look up the attributes of that. So that's already something that an analyst has to do: when you find an asset is involved in some sort of an investigation, you better know what kind of business unit, you better know what kind of IT leader, what kind of manager is handling it, and stuff like that, ideally through a CMDB, but if you don't have one, you can also just do a live lookup for it. And anyways, this is kind of the playbook; you can kind

of see the exact JSON of it on their GitHub, so all of this is completely open source, as I said; none of this is charged for, and a lot of people actually take this content and repackage it via Sigma or other kinds of sources or whatnot. And it's pretty common out there that we have these kinds of investigative playbooks. However, I would like to pause here and ask: anybody from the folks that are around here, do you do investigations today, and if so, what is kind of the guideline?
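The asset-attribute lookup step described above can be sketched in a few lines. This is a minimal illustration only: the inventory dict, field names, and hostname are made up for the example, not taken from Splunk's playbooks or any real CMDB schema.

```python
# Hypothetical sketch of a playbook's enrichment step: given an asset
# that shows up in an investigation, look up its business attributes
# (business unit, IT leader, manager) from a CMDB-style inventory.

CMDB = {
    "hr-laptop-042": {
        "business_unit": "HR",
        "it_leader": "j.doe",
        "manager": "a.smith",
    },
}

def enrich_alert(alert: dict, cmdb: dict = CMDB) -> dict:
    """Attach asset attributes to an alert, falling back to 'unknown'
    when the asset is not in the inventory (the 'live lookup' case)."""
    asset = alert.get("host", "")
    attrs = cmdb.get(asset, {"business_unit": "unknown",
                             "it_leader": "unknown",
                             "manager": "unknown"})
    return {**alert, "asset_context": attrs}

alert = {"id": 1, "type": "edr_priv_esc", "host": "hr-laptop-042"}
enriched = enrich_alert(alert)
print(enriched["asset_context"]["business_unit"])  # HR
```

In a real playbook this lookup would hit a CMDB API rather than a dict, but the shape of the step, alert in, alert-plus-context out, is the same.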

Can you speak up? Yes, so some sort of investigation where, once I have an EDR alert, how do I know what is relevant to it? Yeah, I mean, we use a platform called LogRhythm, it has playbooks on it, and those playbooks will tell the analyst, you know, steps to do, and it measures the time between the steps so that we can get MTTR, MTTD type stuff. And is that enough, or does the analyst do a little bit more of a lookup? That's an aptitude thing of the analyst, how curious are they, you know, how much do they want to impress, so it's a challenge. Okay, all right, fair enough,
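The MTTD/MTTR metrics mentioned here boil down to averaging timestamp gaps across tickets. A minimal sketch, with assumed field names (`occurred`, `detected`, `resolved`); real platforms track these per playbook step:

```python
# Mean time to detect (MTTD) = average gap between when an incident
# occurred and when it was detected; mean time to respond/resolve
# (MTTR) = average gap between detection and resolution.
from datetime import datetime

def mean_minutes(tickets, start_field, end_field):
    """Average gap in minutes between two timestamps across tickets."""
    gaps = [
        (t[end_field] - t[start_field]).total_seconds() / 60
        for t in tickets
    ]
    return sum(gaps) / len(gaps)

tickets = [
    {"occurred": datetime(2024, 8, 7, 9, 0),
     "detected": datetime(2024, 8, 7, 9, 30),
     "resolved": datetime(2024, 8, 7, 11, 30)},
    {"occurred": datetime(2024, 8, 7, 10, 0),
     "detected": datetime(2024, 8, 7, 10, 10),
     "resolved": datetime(2024, 8, 7, 10, 40)},
]

mttd = mean_minutes(tickets, "occurred", "detected")   # 20.0 minutes
mttr = mean_minutes(tickets, "detected", "resolved")   # 75.0 minutes
```

As the speaker notes a moment later, optimizing only these time averages rewards speed, not investigation quality.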

thank you for sharing. Anybody else want to share a little about how they do investigations, or how they find what is relevant to an incident? You can shout it out. But is it mainly just playbooks and lookups, series of lookups, and then anything other than that is completely... are there guidelines for pivoting? So when you say a human could be curious... got it, got it. So it seems like a lot of it is really up to the human acumen, and we call that process hypothesis and validation: they look at something and they're like, you know what, I hypothesize that this is a lateral movement, let me go and figure out if it's actually a

lateral movement or if it was just a drive-by, and I have to go and figure that out. And how do I figure that out? I have to retain this acumen and knowledge that I have about the environment, and then I have to go there and look for things that are considerably enabling to a lateral movement. So if you remember, in MITRE ATT&CK we actually have tactics, and I would be looking at the tactics before lateral movement, trying to find something that answers that question: was it lateral movement, did I see credential discovery, did I see something in the area of maybe execution or persistence beforehand, and so on and so

forth. So this is the state of the art today: it seems like we're highly dependent on the human, and unfortunately humans don't have very good consistency. So even if I'm an amazing analyst, one day I'm feeling like it, one day I'm not; it's very hard to judge me on the quality of a ticket, to promote me or to demote me; you'd probably just judge me on time, so you mentioned MTTR, MTTD, right, just time. So do you want more tickets? I got you, yes, right: they think they see evil and then they find logs to support their evidence, instead of looking at the evidence for what it is. Right, interesting, and that's another

problem, because then you're chasing ghosts. Interesting, yeah. So, to rephrase that, a lot of the time it seems like people might end up chasing the wrong stuff, and that means that their hypothesis-and-validation train of thought might lead them to essentially chase nothing and waste very expensive time on the job, on lookups and threat intelligence discussions and pinging users and whatnot, after absolutely nothing, while leaving the good stuff. So it seems like we lack some sort of structure, we lack some sort of a methodology there, and that's really some of the reason that we don't really have a lot of good tools to support these analysts in these investigations. The

best we have in terms of tools in this industry is maybe "use an LLM and hope for the best." But an LLM is a magic black box, and I don't know how useful it is. We can actually take it on, though: ask an LLM what it thinks of something and see how that works out. So I took one of the most popular LLMs out there and asked it: "I have an EDR alert about privilege escalation, in a process, on Windows 11, at my HR department. What do you think I should investigate in terms of relevant events, alerts, and logs in my SIEM?" The LLM would give me highly nondeterministic answers. Every time you hit up an LLM, its creativity configuration, or what they call temperature, is usually set to a pretty high value, which means a vanilla LLM gives you a different answer every single time you prompt it with the exact same stuff. So I had no clue what would come up, and that's the big issue: we're never really sure what it will spit out. So we try to guardrail it, we try to have configurations. You can easily use Ollama, for instance, and have a local Llama where you can drive the temperature down.
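As a rough sketch of what driving the temperature down looks like outside a chat UI, here is how a request to a local Ollama instance could be pinned to temperature zero. The endpoint is Ollama's default; the model name and prompt are illustrative assumptions, not the speaker's exact setup:

```python
import json
import urllib.request

def build_request(prompt: str, model: str = "llama3") -> dict:
    # temperature 0 pins the sampler: same prompt in, same answer out.
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": 0},
    }

payload = build_request(
    "I have an EDR alert about privilege escalation on Windows 11 in HR. "
    "Which SIEM events, alerts, and logs should I pivot to?"
)

# Against a running local Ollama (default http://localhost:11434):
# req = urllib.request.Request(
#     "http://localhost:11434/api/generate",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(json.loads(urllib.request.urlopen(req).read())["response"])
```

With the HTTP call commented out, the snippet just shows the payload shape; the point is that determinism is a one-line option, not a property of the model.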

Instead of ChatGPT you can probably use HuggingChat. I can go to HuggingChat, where I have some more configuration options, and I can make an assistant. In this case I have an assistant that I built for this, and if I edit it you can see what kind of configs I gave it. In this config I can drive the temperature all the way down to zero, or maybe 0.1, since it doesn't even allow me to go to zero in this case, so it's as uncreative and as deterministic as possible. What

that means, sorry, what that means is: if I give it the exact same prompt, it's deterministic to a certain degree, more so than others, and in that case I can really depend on what it's going to spit out, and I can run an actual experiment figuring out whether LLMs are going to help me. But I'm jumping the gun toward what kind of solutions are out there; this is just a bit of brainstorming. So: we figured that a lot of things relate back to the ATT&CK methodology. You probably all know about ATT&CK already; the ATT&CK methodology seems to be referenced wherever possible, because it allows us to figure

out the answer to the "what" question. When you look at an alert or an investigation, you're going after who, when, and what, remember? The "what" is usually referenced via some sort of framework or knowledge base that gives me a common language for what is happening, and we use the ATT&CK methodology for that. Just a refresher: going left to right there is more causality, or sequentiality, and then these are the different techniques. The value of having this is obvious: if I'm a junior, I don't have to be a unicorn at your SOC, because we don't have a talent shortage, we have a unicorn shortage.

There's a ton of people who would love to take a job at your SOC, but you don't accept them because they're not unicorns. But if you had a nice methodology and a nice guideline where everything has a MITRE ATT&CK label, you could actually have them read the mitigations, detections, and procedure examples, and they could perform above their pay grade. However, another problem in this industry is that the ATT&CK labels we have are usually inconsistent, because, guess what, all of the detections, playbooks, and whatnot have very different authors, and most of those authors add the ATT&CK label just for marketing. They don't put it there for operations; they pick the

broadest technique they can find, or just a tactic, and that's not very good. So one of the things we should probably do in a SOC is make sure we have a consistent ATT&CK technique. What can help there? Natural language processing, and an LLM is a good place to start. You can easily go to HuggingChat or Ollama or whatnot, find a nice model, give it a few resources, figure out how to get a decent ATT&CK label out of it, and populate a sheet. Now you have a sheet that lets you map everything consistently: every single unique alert or signature or whatnot to a technique.
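A minimal sketch of that labeling pipeline, with the LLM call replaced by a toy keyword lookup so the example stays self-contained. The alert messages, hostnames, and keyword table are invented; the technique IDs are real ATT&CK IDs, but a real pipeline would send each de-duplicated message to a low-temperature model instead of a dictionary:

```python
import csv
import io

# Toy stand-in for the LLM labeler: keyword -> ATT&CK technique.
KEYWORDS = {
    "mimikatz": "T1003",        # OS Credential Dumping
    "psexec": "T1570",          # Lateral Tool Transfer
    "scheduled task": "T1053",  # Scheduled Task/Job
}

def label(message: str) -> str:
    m = message.lower()
    for keyword, technique in KEYWORDS.items():
        if keyword in m:
            return technique
    return "unknown"

alerts = [
    "Mimikatz credential dump detected on HR-WS-042",
    "PsExec service installed by svc_backup",
    "Mimikatz credential dump detected on HR-WS-042",  # duplicate
]

# Only de-duplicated messages go out to the labeler: nothing sensitive.
unique = sorted(set(alerts))
rows = [("message", "technique")] + [(m, label(m)) for m in unique]

buf = io.StringIO()
csv.writer(buf).writerows(rows)
print(buf.getvalue())
```

The resulting CSV is exactly the artifact the talk describes: a one-time, consistent mapping you hand to the SIEM as a lookup.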

And that is what I would do. I would use some sort of free thing, and we built a free bunch of these, that have a nice system prompt plus a list of domains that helps them find these kinds of things; we built those domain lists with all the stuff the model would need to look up to find a decent technique. It might not be the best label, but it's consistent, and consistency is the piece you want to focus on when you're figuring out the "what" of an answer. But

anyway, you can very easily pull this off. If you're not familiar with Ollama, you can run a local instance; HuggingChat can also be set up offline, but then you still have to trust HuggingChat with your data, yada yada. Again, you only need to feed it the de-duplicated, unique alert messages and descriptions, that's it. Really nothing sensitive goes out there; it's just the rule set that you use across multiple source types. And then you have some sort of consistent technique. So let's assume we have a consistent technique. We were talking about an investigation, so back to the point. About that

investigation: attacks usually are not a single incident, right? They're usually a coordinated kind of story. And when I'm talking about "give me relevant data points": ideally, if you remember, we were hypothesizing "is this a lateral movement," we were going back and looking at discovery, credential access, defense evasion. Do I have any hits, do I have any actual coverage of what happened in those tactics? And if I do, I can stitch it up in what MITRE calls an attack flow. Anybody here familiar with Attack Flow? There you go, Gabe, you're familiar with Attack Flow. We've got two people, three people, okay, four people. All right, for those who are not

familiar with Attack Flow: Attack Flow is a nice project, and Gabe was one of the people who actually worked on it. I know he's running out, but thank you for working on that project, Gabe, it's a pretty cool project. A quick way to look at it: let's open, say, Equifax. Equifax was a big breach that happened in 2017. Half of the people in the US have their Social Security numbers, addresses, and dates of birth out there on the dark web somewhere, so if you suffered identity theft after 2017, this is why. But essentially I can see the story. I can see it started with

vulnerability scanning. It happened through a vulnerability scan against, I guess, their online dispute portal, seems like it. They had an Apache Struts web framework vulnerability that was exploited, and then there was a webshell, and after the webshell some credentials were harvested on that machine. Wow, there were credentials on that machine. Then there was an encrypted channel where they queried some databases, there was archiving of the data, probably for exfiltration, and yep, there was exfiltration. The exfiltration used a proxy to make it harder to spot, and then they deleted their traces: they deleted the Windows event logs and files and whatnot

that they left. Okay, so the attack flow is a way for people to, instead of just reading an APT report, actually get the tactical, well, not fully tactical but somewhat operational, knowledge of what happened, and you can see the sequence out there. It's just a nice way for the community to talk about the kinds of sequences these actors perform. So anyway: say we have a little task force, a little workshop, and on our Innovation Friday we try to make things better. You gather the team: how can we make things better, how can we make

next week better than this week? And they say, you know what, let's actually action some sort of innovation and make our incident investigation better. Can we augment our playbooks, can we augment our rules? Let's think about how. One of the first steps would be the who, when, and what. For the "what," as we already discussed, let's figure out a consistent way to attach MITRE ATT&CK techniques, which are a good place to start, and let's use some natural language processing to get that figured out. If we do, we have a nice CSV; we give it to the SIEM, and boom, everything in the SIEM now

has it. For the "who," ideally we also have some sort of CMDB; if not, we can make a CSV of "here's a service account, here's the account's business unit, here's whatever" and again give it to the SIEM to enrich things automatically. Now your tickets already have entity information as well as what-is-going-on information. So that's the first step: ATT&CK enrichment for the events, plus some entity attributes that we've figured out. And ideally we want to use these in a similarity fashion. So what is

similarity? Usually in a SIEM you don't have such a thing as similarity; you only have common characteristics. What is a common characteristic? Well, this and this share the port, this and this share the OS, this and this share the parent process, this and this share the business unit or the subnet or whatnot. But is there similarity? No. We don't really have similarity, we have identicality: do they share something or do they not. So ideally, in a practice like this, we have an algorithm that helps us with similarity. What kind of similarity? You know, port 80 and port 81 are kind of similar, right? Internet

Explorer and, I don't know, Edge are kind of similar, right? It might not be straightforward that they're similar, but if we get some sort of model that tells us what is similar, we unlock a lot more, nicer enrichments. So I'm going to pause here: any questions on this train of thought? A little bit more on the "what" in terms of similarity, then the "who," and a little more on expanding the what and the who in terms of similarity, where the output, of course, is "find me stuff that is related, find me stuff that is relevant." Any questions? If not, I can fly through the rest.
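The port-80-versus-81 idea can be made concrete with a couple of toy similarity functions. These scoring choices (a decay scale of 10 for ports, token-overlap for names) are illustrative assumptions, not what any particular SIEM or model does:

```python
def port_similarity(a: int, b: int, scale: float = 10.0) -> float:
    """1.0 for identical ports, decaying toward 0.0 as they diverge."""
    return 1.0 / (1.0 + abs(a - b) / scale)

def name_similarity(a: str, b: str) -> float:
    """Crude token-overlap (Jaccard); a real system might embed the names."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

# Identicality says port 80 != 81 and stops there; similarity disagrees.
print(port_similarity(80, 81))    # ~0.91: near-identical
print(port_similarity(80, 8443))  # ~0.001: unrelated
print(name_similarity("internet explorer", "microsoft internet explorer"))
```

A lookup-based rule can only answer "same or not"; graded scores like these are what a clustering algorithm can actually consume.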

All right, we go to the fun part. MITRE ATT&CK techniques, we talked about: let's classify events with relevant ATT&CK techniques using an LLM. How do we do that? We get any kind of LLM, we give it a small labeled corpus for fine-tuning if we can; if we cannot, we just trust it with an unlabeled corpus and give it some websites it can look up. We can quickly pull something like this off on AWS in a little SageMaker, and the output is actually pretty nice. Here we used a small BERT, nothing too crazy, not

really a big LLM, and it's actually pretty decent. It's not that hard, and you can get a CSV out of it that immediately enriches your SIEM. The second part is kind of interesting. First, we're going to treat this as a big-data problem: I have 10 terabytes coming into my SIEM every day, and I can never feed that to an LLM. So what am I going to feed it to? A classical machine learning algorithm such as clustering, which is easily understood, easily explainable, easily auditable, which is very important, and

transparent. With an LLM I cannot really understand why it did something; with clustering I can. So clustering events is what I'm going to go after, and I'm going to cluster them based on similarity in the characteristics of the event as well as the characteristics of the attributes of the entities involved. If it affected two people who are kind of similar, I need them in the same ticket; if the events are kind of similar, I need them in the same ticket. And ideally, after I have these clusters, I can stitch them based on sequentiality, because, guess what, I have the technique and the tactic. So I make some sort of a basic

data structure, and in this data structure I can highlight what I need to be the nodes, which is how the data structure is going to index things for my clustering algorithm. Then I just feed this data structure to what we call an embedding algorithm. Usually, if you don't have a complicated data structure, you can pull this off with a very straightforward clustering algorithm without building this kind of data structure; but if you have the time, if you have a summer intern or an actual full Friday, you can do this as your step two. Your step one would definitely just be a basic data

structure, like a tabular format. But if you can do this, it makes your life a lot easier, and the connections end up being exactly what you're after, because while you're designing it you're saying "a node is a CVE, so now my clusters are going to be related by CVE," or "a node is simply an entity, so now my clusters are based on the entities involved," and so on and so forth. And then of course you can chain things together with what we call Markovian logic. It's a finite state machine; we call them Markovian models just to make it fancy, but it's a finite state machine.
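The chaining step can be sketched in a few lines: treat tactics as states and only allow transitions that move rightward in the ATT&CK matrix. The tactic list is the real ATT&CK enterprise ordering; the cluster contents (hostnames, commands) are invented, and a fuller Markov model would weight transitions learned from past incidents rather than just sorting:

```python
# Left-to-right causality of the ATT&CK matrix, as a simple state order.
TACTIC_ORDER = [
    "reconnaissance", "initial-access", "execution", "persistence",
    "privilege-escalation", "credential-access", "discovery",
    "lateral-movement", "collection", "exfiltration", "impact",
]
RANK = {t: i for i, t in enumerate(TACTIC_ORDER)}

def to_flow(cluster):
    """Arrange a cluster's (tactic, event) pairs into a plausible flow.

    This is the degenerate finite-state view: states are tactics and the
    only allowed transitions move rightward in the matrix.
    """
    return sorted(cluster, key=lambda e: RANK[e[0]])

cluster = [
    ("lateral-movement", "PsExec to FIN-SRV-01"),
    ("credential-access", "LSASS dump on HR-WS-042"),
    ("discovery", "net group 'Domain Admins' enumeration"),
]
for tactic, event in to_flow(cluster):
    print(f"{tactic:18s} {event}")
```

The output reads top to bottom as an attack-flow-style story: credentials first, then discovery, then the lateral move.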

And I can easily go to my ChatGPT again and generate this code. By the way, I already generated it, it's sitting in my GitHub, and I'll have the links ready for you. Essentially: "make me Python to cluster alerts with 10 event attributes and five entity attributes." It will show me the alerts in a very tabular format: these are the attributes for the alerts, encoded numerically (you can encode them however you want), you also get the entity attributes there, and you can choose a model. The model could be DBSCAN,

k-means, whatnot. You don't have to be a data scientist; you just have to deal with this the same way you're not a combustion engineer but you drive a car. You just deal with it: how do I use it? I have this thing that can spit out code at me, let me try a few things, let me hack around. We're hackers: consider yourself in a hackathon one Friday, pull this off, and you'd be astonished how easily you can provide humongous value in this area. And again, the code is out there; we have a blog about it. This is what we got: we fed it about 2,000 alerts as a quick playground.
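To make the "you don't have to be a data scientist" point concrete, here is a minimal pure-Python DBSCAN over numerically encoded alerts. The feature encoding, the eps value, and the alert data are all made up for illustration; at real SIEM volumes you would reach for scikit-learn's DBSCAN instead of this toy:

```python
import math
from itertools import count

def dbscan(points, eps=1.5, min_pts=2):
    """Minimal DBSCAN over numeric vectors; returns one label per point,
    with -1 meaning noise."""
    labels = [None] * len(points)
    next_id = count()
    neighbors = lambda i: [j for j, q in enumerate(points)
                           if math.dist(points[i], q) <= eps]
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seed = neighbors(i)
        if len(seed) < min_pts:
            labels[i] = -1          # noise (may later become a border point)
            continue
        cid = next(next_id)
        labels[i] = cid
        stack = seed
        while stack:
            j = stack.pop()
            if labels[j] == -1:
                labels[j] = cid     # claim former noise as a border point
            if labels[j] is not None:
                continue
            labels[j] = cid
            grown = neighbors(j)
            if len(grown) >= min_pts:
                stack.extend(grown)  # core point: keep expanding
    return labels

# Each alert encoded numerically: [port / 100, severity, business-unit id]
alerts = [
    [0.80, 3, 1], [0.81, 3, 1],    # two near-identical HR alerts
    [44.43, 5, 2], [44.43, 4, 2],  # two finance alerts on port 4443
    [0.22, 1, 7],                  # a loner -> noise
]
print(dbscan(alerts))  # → [0, 0, 1, 1, -1]
```

Two tickets instead of five, and the loner flagged as noise for a human to eyeball; that is the whole pitch in thirty lines.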

The ground truth for this was one cluster, and then we found eight clusters, and actually all eight were kind of interesting. Our ground truth was broken, because we didn't know that one of the firewalls was letting some bots online do stuff, so we hadn't looked for those; but when we looked at them, they were actually true. Oh, interesting, okay. So in this case you have this kind of architecture with a couple of models: one that assigns tactics and techniques, which you run just once via an LLM to make a CSV;

and then, once everything has a TTP, it doesn't matter what data format it came in. You can also use one of the models we open sourced (sorry, not outsourced), which lets you immediately convert any data format into a common data format via an LLM, so you have a unified data model that can go into some sort of flow or clustering detector, and you can get things chained up very nicely. Again, the idea is to put out attack flows: ideally you're clustering stuff, you're correlating stuff, and then you just put it in a flow, and usually the first step is just knowing the causality

via uncovering techniques. Here are some of the screenshots. Here's the data that we fed in, and we fed a ton of it, and the cluster would immediately look something like this: these are all the IPs of the relevant stuff, here are the entities involved, here are the techniques involved, and so on and so forth, and here are some internal users and internal IPs that we marked. Obviously it's so much better than the current stories and rules, but it augments them, because they're each better at certain things; you want to augment them, you don't want to

replace them. How is it easier to adapt and maintain? You can easily go and mark in the cluster what you like and what you don't, and the clustering algorithm immediately learns to cluster things the way you like, without you explaining too much or writing a rule. It's actually more efficient than a rule, because the more playbooks and lookups you have, the slower the SIEM gets. If you hear people ranting, "oh, QRadar is so slow, man, I want to move to Splunk or something," immediately ask them: how many lookups do you

have on there? How many continuous searches? How many playbooks constantly hitting it with searches? If you go into the jobs in Splunk, you can see you have a lot of jobs. But usually when you transition to a new SIEM you don't take any of these rules with you, so three years later it's "oh, Splunk is so slow, man" again. That's what I see a lot. And the nice thing is this approach is standalone: leadership changes the SIEM, they change the EDR, but the clustering thing you worked on a little with your intern or on your Innovation

Fridays is going to live with you for the rest of your life. And you can correlate via nuances, and that's the similarity piece, because clustering takes similarity into consideration. It actually looks at the semantic similarity of "Kevin" and "Kev," or "HR" and "Corporate Safety"; these are actually similar, but a rule or a playbook is just looking for an exact match on business unit, and that's it. But yeah, that's essentially the idea: a lot of buzzing alerts, and ideally you just put them all together. This is a UI that we built; a lot of this is available in our GitHub. Ideally your full view of a

ticket looks something like this: you already have the sources and whatnot, you have the resolution and the comments. And if you already have these tickets historically, let's talk about doing this via supervised learning, which I couldn't fit into this talk, because this one is just unsupervised, for people who don't have historical tickets. But if you have historical tickets you can trust, you can say: "my tier three, my senior analyst, I trust this person; any ticket they closed I can use to train a supervised learning model" to recognize what's a false positive, what's a good investigation to do, and to just show the

analyst today which tickets are relevant. They can quickly skim through them and go, "oh yeah, this is the vulnerability scan that happens every quarter for compliance, and every single time it gets a ticket, but it's obviously a false positive, because every time I just ring up this guy and he says yeah, that's our vulnerability scanner, that's our pentest," or "oh yeah, the accountant goes every quarter to pull financial metrics for management, it's the exact same thing every single quarter." If I have these tickets, the attack-story tickets or whatnot, even the coordinated tickets like the APT stuff.
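A sketch of what that supervised step could look like: a tiny multinomial naive Bayes trained on closed-ticket text. The ticket texts and labels here are invented, and a real deployment would use many more tickets and proper features; this only shows the shape of learning from a trusted analyst's history:

```python
import math
from collections import Counter, defaultdict

def train(tickets):
    """tickets: list of (text, label) closed by an analyst you trust."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in tickets:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for c in word_counts.values() for w in c}
    return word_counts, label_counts, vocab

def predict(model, text):
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best, best_lp = None, -math.inf
    for label, n in label_counts.items():
        lp = math.log(n / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in text.lower().split():
            # Laplace smoothing so unseen words don't zero out a class.
            lp += math.log((word_counts[label][w] + 1) / denom)
        if lp > best_lp:
            best, best_lp = label, lp
    return best

history = [
    ("quarterly vulnerability scan from approved scanner", "false_positive"),
    ("scheduled compliance scan hit web tier", "false_positive"),
    ("mimikatz lsass dump on workstation", "true_positive"),
    ("psexec lateral movement to domain controller", "true_positive"),
]
model = train(history)
print(predict(model, "vulnerability scan of web tier"))  # → false_positive
print(predict(model, "lsass dump via mimikatz"))         # → true_positive
```

The point is not the algorithm but the flywheel: every ticket a trusted senior closes becomes training data for the next generation of analysts.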

If I have them historically and I feed them to a supervised model, I can teach my new generation of analysts how historically good human acumen did these investigations, if I have that history. If I don't, unsupervised is your friend: natural language processing, clustering, and the rest is history. But yeah, these kinds of things augment your playbooks. I didn't leave a lot of time for Q&A, so throw questions at me if you have them; there's a lot of stuff in this talk. These are the links for the HuggingChat assistants that I built for this, and there is a gist here that I kept for you that has

some of the clustering algorithms. This is one; you can go straight to it, it's public, and it does essentially what the ChatGPT code did, just a little fancier. And then you can see the clusters: which alerts, incidents, logs, or events are clustered together. And just to remind you, these are coming from heterogeneous source types. They're not even in the same data model, none of it is alike, but we're still able to get the overall similarities out of there. Anyway, any questions? I need a lot of questions, go for it. Just a moment. Oh yes, thank you, I was

thinking I've got to go there, thank you, I appreciate it. "How much do you trust GPT to transform the data into a common model?" I don't trust GPT one bit. The whole demonstration was starting with ChatGPT and then immediately saying what's wrong with it. The only reason I brought it up is that it doesn't have temperature control in the UI, so don't use the UI for anything serious. Brainstorming is amazing, quick code snippets are nice, but I wouldn't use it in the UI. Through the API you can set the temperature; for some reason, in the UI you cannot. So if I were in your shoes, instead of ever, ever, ever

using ChatGPT, I would install Ollama and have a local instance, or, if you don't care about it being local, I would use HuggingChat. I'd go to HuggingChat through Hugging Face, go to Assistants, create a new assistant, call it whatever, "test," choose a model I want to play with (I can choose from a million of them), and drive the temperature down as far as it will let me. Mistral can go to zero. I'd do that because if I'm being serious about anything, I don't want it to be creative; I'd rather have it be wrong with me

knowing that it's wrong, and knowing it's not good for this use case. Zero temperature means zero creativity, and it stops giving you nondeterministic outputs. It's highly deterministic: same input, same output, a million times. Thank

you. 100%. "Hey, great talk. Two questions, the first one being: you talked about unlabeled data, but how often have you found that the MITRE tactics and techniques are mapped accurately enough to provide a strong signal for doing some of the similarity searches?" Should I answer it first? Yes, go ahead. Okay. When we started training our LLM to do this, we gave it all the stuff we pulled and scraped from GitHub, from different rule sets where people had already assigned ATT&CK techniques. It immediately performed really, really badly, and when we investigated, we found that the data quality was not consistent.

And we were like, yeah, of course, it's crowd-sourced data, what the hell did we even think? So we scrapped all of that, and we had two choices. Either sit down and make our own fine-tuning set and make sure our people are consistent, which means you hire the same person, and that same person should write regular expressions to check their own consistency, because every day they act differently. We did that, but it was ridiculously expensive, and I wouldn't tell you to do it. Or you can just trust unsupervised, like an LLM, and hope for the best, because it literally doesn't matter what quality you have; it just matters that it's consistent. If

it's consistent, you're good to go. And actually, in your rule sets today: if you go to your SIEM, in Splunk or whatever, and after an "index=*" you dedup per message, how many unique messages do you think you have from the last three months? Maybe a thousand. You could probably just sit down and do it manually; you don't need an LLM, though an LLM would give you a quick-and-dirty draft of it. Short answer, TL;DR: I don't trust any of the technique labels that are out there, because people do them for

marketing; they don't actually use them for operations most of the time. "Yeah, I agree with you. The other question: when you did clustering, what kind of clustering approaches did you try? Did you try basic approaches like hashing, like TLSH? Did you try more advanced clustering with a vector database, which can further provide some explanations downstream? Did you try any graph clustering algorithms? Could you give some commentary around that?" Yep. The TL;DR version of this is: use something simple first and then build on it. Don't be a perfectionist; perfectionism is the opposite of progress. So don't tackle

this as a perfectionist; tackle it as an iterative process, as if you're a founder building version 0.1 first, and so on. So I would tell you to start with either k-means or DBSCAN; they use essentially tabular data. If you want to improve, you're exactly on point: the graph kind of data structure, and the graph embedding that then lets you use DBSCAN, is so much better, because in the graph you do what we call knowledge encoding. If you remember, in my slides I had a bit of a graph data structure

here that focused more on CWE and CVE, because I noticed that in my SOC I would really like my tickets focused on vulnerabilities, maybe because the way we action most of our tickets is, I don't know, we actually fix our software, because we're a product company. But if I'm a manufacturing company or an operations company, why the hell do I need that? I'd focus more on entities: which of my sub-organizations is hacked, which of my sub-entities (say we acquired a lot of companies or something like that) actually has the problem. Then I want an entity-based graph. So in this case you're knowledge encoding: you're telling the

clustering what you would like to get out of it, and as you see things you don't like and give it feedback, it can action that very easily, because it just changes the weights of edges versus nodes, or the weights of which attribute in the nodes or entities you care more about. Do you care about business unit, or same operating system, or same IT leader, manager, admin, and so on? You can tune this very easily. Any more questions? Amazing questions, by the way, thank you. Can we get a question here? Thank you, this is amazing. You can shout it out and I can repeat it if you

want; we're recording. "I just wanted to clarify: you wrote HuggingChat and 'Attack Llama.' Those are things?" Yes. Attack Llama is just the name that I gave it. I went to HuggingChat like any normal person and gave it a few links to make it focus more on certain areas when it's doing its search. In HuggingChat you can go into the settings and give it certain things; this one, for instance, doesn't go online at all, and it has a relatively low temperature. In here I just gave it a system prompt, and that was it. But in this case, while I

was creating this, and again it doesn't take a data scientist to do this, that's why I showed it early, because you can do it on your phone, you don't even need a laptop, that's how user-friendly it is, this is 2024, it doesn't need a PhD in data science, I gave it the links that I thought were interesting. You can see in my links I have a ton of links, but they're focused on the sources I wanted it to look at: Bugcrowd, various DBs, a Whois directory, even RocketReach for recon on the attribution, URLVoid, all kinds of links that I thought

were interesting for my use case. My use case was threat and attribution identification: I want to figure out the ATT&CK technique in any kind of event that I give you, any kind of PowerShell code that I give you. Anything I give this assistant, it better give me a relevant ATT&CK technique. And this is a public bot; I can share the link, I can put it on my LinkedIn. It's a public bot, and there are a million other public bots out there, and you can look up my settings; nothing here is a secret per se. I can activate it and go talk to it.

This is the direct URL, the system instructions; you can see my prompt, there is no secret at all in this case. And if I dislike the Command R+ model that I used here, I can use another one, and it's all via the UI. If you want another option, something you can run locally because you don't trust HuggingChat: Ollama. "Okay, and my second question: from an operational perspective, as someone who manages a SOC team, how do I tell them to use these tools? What's the tactical approach? Do they go to these, is this something we install ourselves, or do we use it online? I'm just trying to

understand tactically. HuggingChat is hosted online, or can we link it to our SIEM? Is that what you're proposing, that it's its own system that spits something out at us? I'm kind of lost." I think I understood your question: you're asking where this sits. It depends on your use case, but for the most generic one: at the SIEM you essentially have an exporter to S3 buckets, which you probably already have for cold archives, for compliance. And if it's sitting in an S3 bucket, the world is yours, because you can just open CloudShell

and point it at that S3 bucket for free compute and do whatever you want with it, or you can start a SageMaker, you can start a million things. One of your engineers can probably do this, a sysadmin kind of person; they don't have to write the data science code, they just have to operate on that humongous database of JSON, and they can do a lot if you give them the capacity. Now you can give them some direction. You tell them: listen, you're going to have a lot of heterogeneous data sources here, because our SIEM is sending stuff, our syslogs are sending stuff, we have a lot of

different things going out there, and that is completely fine. You can go to our GitHub, to the data mapper model for instance, and this LLM is going to quickly map everything into one data model. Now all of these JSONs are just one data model, based on this LLM that maps everything for you, and that's essentially the integration. Now you have some sort of CSV that they can process with HuggingChat, or with Ollama if they want a local instance. And if they don't want a local instance, that's fine; none of this is sensitive information, it's the rule set

of whatever you use use crowd strike use Palo Alto use whatever you use get the rule sets from the Sim give these rule sets to any llm that you trust they're probably less than a thousand you can actually task one of your senior people to just sit down and give it techniques on an Excel sheet and that would be the best use of their time and this CSV is now going to go back to the Sim and it will enrich everything so on the Sim when they are doing notable alerts they will just do group by whatever this exists in the data model and they will add a data model for technique and you should probably do

You should probably do the same for entity attributes as well. You probably don't want everything in the CMDB, and nobody has a perfect CMDB, but at least tell me: when was this last patched, what software is running on it, who owns it, who do we call to ask whether this is fine, a pentest, or a vulnerability? If you have these few attributes per entity and these few techniques per alert, your SIEM becomes a treasure trove of information. Take that information, which is already sitting on S3, and give it to the simplest clustering model your team can use, nothing written from scratch, and even at its worst iteration it's more valuable than nothing.

It's more valuable than analysts sitting there thinking maybe they'll investigate, maybe they won't. Because when they find a cluster they like, that's already a blueprint for documenting an investigation guideline. They're already saying: oh, when an EDR alert fires, we should probably look at the PowerShell commands that preceded it; I hadn't really thought about that, but now, you know what, I'll write it into a playbook or a detection. Then you're not tracking analysts by time; you're tracking them by how many things they improve in the process. In every ticket: how many contributions did you make to our content? Did you contribute to our content in this ticket?

When you closed it, did you suggest better tuning for the detection, better tuning for the playbook, or a new one entirely? If they did, fantastic. If they didn't, encourage them, and give them an LLM to support them while they're doing this, but make sure that LLM is not creative.

One hundred percent, yeah. And S3, I think, is the best way, because you already have a cold backup there and it's cheap; if you want to delete it you can, but it's the cheapest storage in the world.

We have time for one more. I'd love to get a million more.

In terms of the clustering, I'm wondering whether you're doing smart clustering or weighted clustering.

So that, you know, maybe we know: hey, whenever this particular IP address is involved, it seems to be tied to that attack. It might not be the only factor; it could also be anything involving a certain port number, or whatever else you might use for weighting or clustering, where, say, if two alerts have eight things in common they're more likely to be similar. Are you doing anything like that?

Yes, I love it, this is a fantastic question. For most people: start slow. For the people interested in getting better results I'll go further, but I don't want to make it sound complicated.

For the people that haven't started yet: always start slow, because even the smallest element of clustering is valuable. It will reduce false positives and, most importantly, it will reduce data. You have a big-data problem, so reduce data; even if the model is bad at reduction, it's consistent at reduction, and it's explainable at reduction. So, to answer your fantastic question: you were asking how we make our clustering smarter than a vanilla one, whether there's some sort of knowledge encoded into the similarity. For instance, is port 80 as close to 81 as it is to 79? Numerically it is.

But as a cybersecurity expert I say no: 80 and 81 are alternative HTTP ports, and 79 is garbage. So I can come up with an embedding, or an encoding, and again you can do this with ChatGPT. I want to show you how easy this is, because I never want to sound like a data scientist, like a robot: "Develop an encoder for ports that gives them an encoded value representing their port association, where alternative ports are closer to each other than ports that are not servicing the same protocol." It will give you a function like that. I don't even have to look, because I know it will make something up.

And immediately it goes for some alternative ports, right? It didn't even get the port alternatives correct; HTTPS also has 8443 and a bunch of others. But now that I have this example, I can give related ports closer numerical values: no more 8080, I call it 81; 8181 I call 1; and 79 I call 106. It's a way for the machine to understand your knowledge. You don't have to worry about how to talk to the machine; you only worry about your knowledge. And you tell ChatGPT: give me a list of all the common ports and their protocols, give me all the alternative ports.

Boom, it gives me an encoder. Hopefully it actually gave me the code... yeah, it did, some sort of code for the encoding. I didn't check it, but it looks like it did. So everything in FTP is now called 3: no more 20 and 21, I just call them 3, so they have the exact same numerical value. Which means, exactly as you said, there is now some knowledge encoded into this clustering, and the clustering stops doing crazy stuff. I can also encode organizational knowledge the same way.
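A hand-written stand-in for the LLM-generated encoder described here; the port groups, the spacing, and the "unknown" sentinel are my own choices, not the talk's actual code:

```python
# Ports serving the same protocol collapse to adjacent codes; unrelated
# protocols land a full `spacing` apart, and unknown ports land far away.
PORT_GROUPS = {
    "ftp":   [20, 21],
    "http":  [80, 8080, 8008, 8081],
    "https": [443, 8443],
    "ssh":   [22],
}

def build_port_encoder(groups: dict, spacing: int = 100) -> dict:
    """Assign each protocol a base value `spacing` apart; members of a
    group get consecutive codes, so alternatives are numerically adjacent."""
    encoding = {}
    for base, name in enumerate(sorted(groups)):
        for offset, port in enumerate(groups[name]):
            encoding[port] = base * spacing + offset
    return encoding

def encode_port(encoding: dict, port: int, unknown: int = 10_000) -> int:
    # Anything outside the table (like port 79) is pushed far from every
    # known group instead of sitting numerically next to 80.
    return encoding.get(port, unknown)
```

With this table, 80 and 8080 end up one apart while 79 lands nowhere near them, which is exactly the knowledge the raw port numbers don't carry.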

For instance, I can tell it HR is very close to the Corporate Safety business unit. I can tell it our sub-organization XX Analytics is very close to XX Solar Power, because they were acquired from the same company and have similar systems; this subnet and that subnet are very similar because both run the same OT systems; whatever it is. I can take the knowledge that is stuck in the brains of your analysts, of the folks who actually know the operation, and give it to the data, so it will outlive all of us. It will outlive your documentation, it's easier to produce than your documentation, and it will be the true legacy that empowers your people.
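The same trick works for this organizational knowledge. A sketch under my own assumptions (the unit names and similarity weights are made up), where a hand-encoded table overrides the default "everything is unrelated" distance:

```python
# Hand-encoded closeness between organizational units, as described in
# the talk. 0.0 means unrelated, 1.0 means effectively the same unit.
UNIT_SIMILARITY = {
    frozenset({"HR", "Corporate Safety"}): 0.9,
    frozenset({"XX Analytics", "XX Solar Power"}): 0.8,  # same acquisition
}

def unit_distance(a: str, b: str) -> float:
    """Distance a clustering model can use: identical units are 0 apart,
    known-related units are close, everything else is maximally far."""
    if a == b:
        return 0.0
    return 1.0 - UNIT_SIMILARITY.get(frozenset({a, b}), 0.0)
```

The table is the part an analyst fills in once; any clustering algorithm that accepts a custom distance can then consume it.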

They won't have to acquire products, and even if they do acquire products, they'll go into those product discussions very intelligently. When they talk to Securonix or Drop Zone or whoever, they can say: here's my use case, and I actually know how to test it; are you going to give me tickets that group things from the same business unit? Those vendors would love to take these knowledge bits into their info-gathering sheets and use them for you, as an MSSP, as a product vendor, whatever. But have the knowledge, feed it to the cluster, and you're completely on point; the clustering is going to be smart.

The other question you asked was what kind of algorithms to use. DBSCAN and k-means are fine, but as I said before, you can also use the data structure itself to encode the knowledge.
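To make the DBSCAN point concrete, here is a deliberately tiny, from-scratch one-dimensional DBSCAN (in practice you would reach for scikit-learn's implementation); the eps and min_pts values in the usage below are arbitrary, and the port values echo the encoding example from earlier in the talk:

```python
from collections import deque

def dbscan_1d(xs, eps, min_pts):
    """Minimal 1-D DBSCAN: points with at least `min_pts` neighbours
    within `eps` (counting themselves) seed clusters; -1 marks noise."""
    n = len(xs)
    neighbours = [[j for j in range(n) if abs(xs[i] - xs[j]) <= eps]
                  for i in range(n)]
    labels = [None] * n           # None = unvisited
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1        # noise, unless a cluster claims it later
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbours[i])
        while queue:
            j = queue.popleft()
            if labels[j] == -1:   # noise point reachable from a core point
                labels[j] = cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])
    return labels
```

On raw values like [80, 8080, 8081, 79] with eps=2, port 79 clusters with 80, which is exactly the wrong lesson; after re-encoding those ports to, say, [80, 81, 82, 106], the three HTTP ports form one cluster and 79 is correctly left as noise.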