
I don't know if you came here because you like Ground Truth or because you wanted to see me speak, but let me read you the purpose of this track, because what I'm going to share is very much aligned with it. It's a place where we come together to share ideas, ask questions, and compare notes, and a place where we can talk about things rooted in scientific approaches to infosec. If you think about a lot of what we do, it's very much what we call art — black magic, defense-against-the-dark-arts type of stuff. It works for a time, but it leaves a lot of us confused: how did you do that? Why do you think this way? That's why Ground Truth is one of my favorite kinds of tracks: we can think deeply about the problems we're facing and seeing, and ask, almost from a theoretical viewpoint, whether there's a different way we should approach them. Are we doing the right things? Are we doing the wrong things? So what I'll share today is some of the mental models that can help us establish ground truth on what we're dealing with, especially when it comes to AI. Anyway, that wasn't part of my talk; it was just about the nature of this track. I'm probably going to stay for a couple of the talks myself, and I'll be here for the rest of the day, so if you want to find me afterwards, I'm happy to — in fact, since it's about comparing notes, I'm happy to share mine, and I'd love to hear yours as well. All right. With that, let me check the time. I'm going to kick off almost exactly at 10:00.
Okay, 10 o'clock. All right, so hi, I'm Sounil. I'm the CTO of a company called Knostic, a company I founded to tackle some of the problems I've discovered. But what I really want to share with you is the path I took to discover those challenges. This isn't about my company; it's about the process for how we think about them. The way I do it is through mental models — and not just individual mental models, but bringing mental models together. I'll share several different ones as we walk through the slides.

Some quick background about me. Again, I run a company called Knostic. I used to be the Chief Scientist at Bank of America. Now, "Chief Scientist" is a highfalutin title, but think about what a scientist does: they run experiments, they try different things, they see what works and what doesn't. That was my job function — I experimented with a lot of things. I ran very operational functions — red team, hunt team, research and development, looking at different vendors — but at the end of the day it was a really interesting job that allowed me to experiment and explore. I've done a lot of other random stuff too.

When it comes to AI, I think I'm probably like a lot of y'all. First of all, with ChatGPT, a lot of people suddenly became experts in AI — and I want to be very clear that I am not one of them. I would love to get a $100 million salary and signing bonus from Zuck, but I'm not there. There are people all the way on the far right of this curve who will probably earn that kind of salary. I'm kind of just past the peak of Mount Stupid. Unfortunately, a lot of people are still at the peak of Mount Stupid; you've probably run into them yourself. I hope I'm not one of them. What I want to share with you is how I got past that peak — but more than that: if you consider the bottom axis, the competence line, to be time — it takes time to gain competence — how do you shorten that line? How do you make it smaller? How do you squeeze it?
Because this space is moving so fast that you learn something new and all of a sudden the world has already changed. The skill I've tried to adapt here is to use mental models to shorten my time to competency. This applies not just to AI but to pretty much anything we deal with in security, or in life in general: we want to shrink that time frame, and I believe mental models are one of the keys to doing it.

Mental models are also how we communicate with one another. They're a shared reference point. If we have different mental models — if I say something about networks, and one person's mental model is the OSI model and another person's is Novell's (did they even have one? I don't know if they even had a network mental model) — you end up with "what are you talking about?" You have very different communication protocols. Our human communication protocol is mental models. Mental models are the API for our brains. They enable us to share ideas elegantly and efficiently.

What I'm going to share with you is a handful of different mental models. But better than that is what happens when you start bringing them together. When you start merging these mental models, you discover some really interesting things about the space we're in. Some of them you've probably already heard of — who's heard of the OODA loop? If you haven't, there are a lot of applications in security for it. Others you may not have heard of; we'll talk about those as well. This whole notion of merging mental models is pretty key — and so is the notion of taking an existing mental model and adapting it to the space you're in.

So I'll start with a really simple one that some of you may be familiar with: the famous Serenity Prayer. "God, grant me the serenity to accept the things I cannot change, the courage to change the things I can, and the wisdom to know the difference." Most people are probably familiar with that prayer. But if you're in risk management, you should have a similar prayer: grant me the serenity to accept the risks I cannot change, and the courage to mitigate the ones I can. You see how that's a simple adaptation, but it's exactly what we do in security? If you're in risk management, you have risks you can't do anything about, and so you say: okay, I risk-accept.

Now here's the thing. I've taken this simple mental model and merged it with another one: the Cynefin framework. Cynefin is a simple framework — well, a bit more complex — but it's a construct that goes from chaotic to
complex to complicated to clear. Many things we see in the world go through this cycle. When it comes to AI, think about where you or your organization are: chaotic? complex? complicated? clear? What stage are you in? Now, I've asked this question of hundreds of people, and they generally put themselves somewhere between chaotic and complex.

This mental model is useful because it tells you what to do to move from one stage to the next. To move from chaotic to complex, you should run multiple experiments. Remember what I did at Bank of America? I ran experiments. My job at Bank of America was to move us quickly from chaotic to complex — not necessarily past that, but the idea is: run experiments, run experiments, run experiments. If you are not running experiments with AI, you will stay in chaotic. You will not move from that. If you're prohibiting the business from running experiments with AI, the business will hate you and fire you at some point — and they won't move past chaotic either. You have to run experiments, because you have to understand what works and what doesn't for the business. So this model is useful for understanding where you are today and what actions you can take.

But let me merge it back with the previous mental model. On the left are things you cannot change; on the right are things you can change. On the left, you're in a predicament — you don't have a problem, you have a predicament — and with predicaments, the best you can do is manage your risks. On the right, if you have a problem, just go solve it, and the way you usually solve it is you buy a technology. On the left, you usually buy people and services.

The same goes for automation. It's the same mental model, but I've layered on these different ways to think about it. Going from chaotic to complex, don't even think about automation — you don't even know what you want to automate. On the right, going from complicated to clear, you automate as much as you can. Automate too much, though, and you flip over into chaotic again.

Then there's "faster, better, cheaper — pick two."
>> Right? Sometimes pick one.
>> Right? If you're in chaotic, pick zero. If you're in clear, you get all three. Many of you have heard "faster, better, cheaper" before, but now you have it in a slightly different context. And when you combine the faster-better-cheaper model with the Cynefin model and the serenity prayer, you see how they all come together to give you a much deeper understanding of why you can only pick one or two. It might be because of which state you're in. Or let me ask a different question: when it comes to AI, are the solutions you have faster, better, and cheaper — or are they slow, worse, and expensive? If they're all three of those — slow, worse, and expensive — you're probably in chaotic. It's just another
way to calibrate where you are, and just another mental model that we use. Again, mental models are a way we can communicate quickly — an API for the brain. When I say "faster, better, cheaper," those who understand what that means go, "Oh yeah, pick one, pick two." Maybe the one you pick is slightly different from the one I pick, but we instantly understand the construct, and we move much faster in our ability to gain competence in that space. So that's an example.

Now, there's a problem with mental models. The problem was called out by Daniel Kahneman, the famous psychologist who won a Nobel Prize in economics. He called it theory-induced blindness. Change the word "theory," replace it with "model," and you have model-induced blindness: once you have accepted a model as a tool in your thinking, it's extraordinarily difficult to notice the flaws of that model, because you see the world through that one particular lens. There's a corresponding quote from a guy named George Box: "All models are wrong, but some are useful." They're both true. All models are wrong because a model is not a full representation of reality — it's an abstraction — and you're doing your best to squeeze reality into it. Sometimes it just doesn't work; other times it works really well. You have to figure out the right model for the right situation. The example I just gave — the Cynefin model with faster-better-cheaper — gives you one view of the world, but it doesn't work for everything. Some of it works for AI and some of it doesn't.

Now, I fall into this particular trap all the time, because I created my own model: it's called the Cyber Defense Matrix. If you're not familiar with it, you can look it up. I created this model because when I was at Bank of America, my job was to talk to lots of vendors and figure out what they do. And of course the answer I'd get is that they do everything. No, you don't. So I used this as a mental model, a working model: tell me what you do, and I'm going to put you in one of these boxes. That allowed me to quickly understand and remember a lot of the vendors we see all the time.

Naturally, you would ask — or at least I asked, and I get this question all the time — okay, great: where does AI fit in the Cyber Defense Matrix? I had some ideas, but I didn't know if this was the right model for that, and I struggled to figure out what the right model is. But I discovered a way to understand where AI fits, because I started to merge mental models.

Now, to merge them, I have to caution you: I'm going to start throwing more variables into this, and when I throw more variables, people get confused. There's an inverse correlation between the number of variables you give an executive and their ability to make a good decision. Just so you know what the curve looks like: give them no variables, and their decision confidence is down here. Give them one or two variables, and it goes up here. Give them more variables, and it drops all the way down. Add even more, and eventually it climbs back up again. Does that curve look familiar to you? I'll explain that more later. Anyway, I'm going to add more variables — apologies in advance.

So I looked at this and asked: does each of these asset classes have another domain with a deeper view of it? I knew the OSI model — we talked about it earlier — has a seven-layer model for networks. I wondered if all the other domains have something similar. Well, when I looked at AI, the domain that was most relevant for AI was data.
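Before moving up the stack, the Cynefin-plus-faster-better-cheaper mapping from a few minutes ago can be captured in a toy lookup table. This is a minimal sketch of my own reading of the model — the domain names are Cynefin's, but the actions, the automation flag, and the "pick N of faster/better/cheaper" counts are the illustrative layering from this talk, not an official part of the framework:

```python
# Toy sketch: each Cynefin domain, the action that moves you forward,
# whether automation is advisable there, and how many of
# "faster, better, cheaper" you can realistically pick.
# (The mapping is illustrative, per the talk -- not official Cynefin.)
CYNEFIN = {
    "chaotic":     {"action": "act, then start running experiments", "automate": False, "fbc_picks": 0},
    "complex":     {"action": "run many parallel experiments",       "automate": False, "fbc_picks": 1},
    "complicated": {"action": "apply expertise and playbooks",       "automate": True,  "fbc_picks": 2},
    "clear":       {"action": "codify into technology",              "automate": True,  "fbc_picks": 3},
}

def advice(domain: str) -> str:
    """Summarize what to do in a given Cynefin domain."""
    d = CYNEFIN[domain.lower()]
    return (f"{domain}: {d['action']} "
            f"(automate: {d['automate']}, pick {d['fbc_picks']} of faster/better/cheaper)")

print(advice("chaotic"))
```

The useful property of writing it down this way is the calibration question from above: if your AI solutions are slow, worse, and expensive — zero picks — the table points you back at "chaotic," and the prescribed action is experiments, not automation.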
And I wondered if data has its own layered view of the world. Lo and behold, it does: the DIKW pyramid. That stands for data, information, knowledge, and wisdom. Another mental model. When I studied it more closely, I realized: look, a lot of what we're seeing — you hear these words all the time — these are knowledge-based systems, knowledge systems, systems that help us manage knowledge and deliver knowledge. These LLMs are really about knowledge. And I would argue that we've moved up this pyramid to where we are now operating at the knowledge level — we're not really operating at the lower levels anymore. To borrow networking terminology, we're not operating at just layers three and four anymore; we have a new layer-7 protocol we're having to deal with.

When I used this mental model, I was also trying to figure out how to understand the problem space of AI itself — to define the AI problem space. In isolation that's kind of hard: AI seems like a brand-new thing. But if I replace the word "AI" with the word "knowledge," I now have a way to anchor my understanding, because I know my problem space for data: data engineering, data security, data quality, data privacy. All the suffixes you can apply to the word "data" also apply to the word "information" — and when you apply them to the word "knowledge," you now have the problem space for AI. So the DIKW pyramid plus AI gives us the shape of the knowledge economy: these are the problems we're going to run into and the challenges we're going to face when it comes to AI.

Let me give you some examples of thinking this through. By the way, when I first did this exercise, like eight years ago, when I saw this pattern, I didn't know what some of these words meant. For example, knowledge quality — what is knowledge quality? Anyone know what a knowledge quality issue is today?
>> Hallucinations.
>> Hallucinations. That's right. Exactly. That is a knowledge quality issue. Now, here's an interesting challenge. Usually when we say, "Hey, I have a knowledge quality issue," we say, "Ah, the reason is that my data quality sucks." But let me offer a perspective here: what if your data quality were perfect? Would you still have a hallucination issue? Would you still have a knowledge quality issue?
>> Yes.
>> You absolutely would. Okay. All right, let's pick another one: knowledge privacy. What's knowledge privacy? Well, you know what data privacy is, but knowledge privacy became pretty evident with Cambridge Analytica and the issues associated with it. Try going into Facebook and turning off what your political preferences are — there is no field for that. It was all inferred. Knowledge privacy is about inferred attributes of you. When we talk about privacy regulation, we focus a lot on the data level, but what we really mean — what we really care about — is the knowledge level. Yet most of the time we're focused on the stuff down here. Again, hopefully you see the same problem: fix your data — does that actually fix your knowledge issues? With perfect data quality, you'll still have knowledge quality issues. Make the data privacy-preserving in every possible way — will you still have inference issues? You absolutely will.

This became evident for me when I was looking at what knowledge security is. We're all security people, so I was trying to figure out what exactly knowledge security means. Well, it's pretty evident in the context of rolling out a large language model inside an enterprise. You do it because you want to share institutional knowledge, but it also accelerates your ability to find lots of overshared content. That's not necessarily a good thing: if your organization is trying to roll out Copilot or Gemini or Glean or whatever else it might be, you're going to run into these sorts of issues.

Now, people run into those issues, and what's the first reaction we get? "My data governance sucks. I haven't done any data classification. All my permissions suck." All right, agreed. But think about that: that's solving the problem down at the bottom again. And if I solve the problem down there, not only am I solving at the wrong layer, but here's the effect: I'm basically squeezing the bottom of the pyramid. And when I squeeze the bottom, you can figure out what happens next — my LLM becomes stupid. And who wants that, right? Nobody wants a stupid LLM. What's the point of that?

Ultimately the perspective is — like the layer-7 example I mentioned — if you have a layer-7 protocol, you need a layer-7 control. That is how Palo Alto ended up with the next-gen firewall. We need the same sort of next-gen approach for this new LLM space. And as much as we keep turning to the lower-layer controls, we have to recognize that at some point they become somewhat obsolete.

Let me explain why. Back in the day, when all you had were databases, you'd give people database-level access and they'd run SQL queries. At some point you said: you know what? Forget that. We'll write applications on top of the database — which meant we could start removing people's database-level access down below. What we have now is people coming in at this higher level, and at some point, when we can control things properly there, we can start removing accesses down below. All the problems we have down here could be simplified. I don't know that they'll be eliminated, but at least it
will be simplified. And that's the opportunity I think we have in the future: to leapfrog some of the challenges we face today. That's kind of what I'm trying to focus on. So again, the DIKW pyramid gives us a sense of how AI fits into this overall construct.

Okay, let me now do another mental model. I think I asked earlier if people knew the OODA loop. This is the OODA loop, with a small modification: instead of "observe" I call it sensing, and instead of "orient" I call it sense-making. (It's not drawn as a loop here, sorry.) But the basic construct — first, let's think about where AI sits in it. You have sensing, where raw telemetry comes in. You have sense-making, and this is really where AI fits: sense-making and AI are really right here. We're starting to see some AI also help with decision-making, and then acting is just the execution piece. I want to be very clear that there's a separation between these functions; what we traditionally call AI is starting to blend a lot of them, but just having this separation helps us think about this in a much more systematic way — and also helps us understand what the risks are.

To understand that, here's the exercise I went through. Take each of these four functions and suppose I turn some of them over to a machine, and reserve the rest for a human. If the machine does the sensing, I do the sense-making and decision-making, and the machine does the acting — well, that looks essentially like automated patching. Automated patching means I have determined that Windows Update is a reliable sensor. I've also determined that I'm okay following a playbook that says "patch only workstations." As long as I've scoped my actions narrowly, as long as I have a reliable set of sensors, and maybe a big red stop button and a reverse gear, I'm willing to let a machine do this. Another use case like this is threat-intel blocking. Basically, this is a very reflexive action: you get a sense, you act, and the human is in the loop in that they've already determined what to do with the sensing and what actions to take.

Now let the machine do the sense-making as well. This is the classic SOAR use case — security orchestration, automation, and response. The machine does sensing, the machine does sense-making, I have determined the playbooks, and then the machine acts. In this second case, I'm still letting the machine do the sensing and acting, so the controls for sensing and acting still apply. But because the machine is now doing the sense-making, I need additional controls for sense-making. In other words, I'm turning the sense-making activity over to a machine, and it can go south.
So I need to make sure I have proper controls around the sense-making. Follow so far? You can figure out what the last step is, right? You now let the machine do all of these. And this is squarely where we have agentic AI. Agentic AI is basically the situation where the machine goes all the way through sensing, sense-making, decision-making, and acting. The additive piece here, of course, is the decision-making. Between case number one and case number two, which we're already well familiar with, the only thing that's really different with agentic AI — I'm generalizing a bit, but the biggest difference — is that we've now turned autonomous decision-making over to a machine. What could possibly go wrong? That's what WCPGW stands for: what could possibly go wrong. So for this last case, I still want to apply the controls for sensing and acting — the same set of controls — and I still want the controls for sense-making. But because I'm now turning decision-making over to a machine, I need controls around decision-making itself.

Let me give you a couple of mental models for that. One mental model I use is interns. Imagine you hired a hundred interns and let them loose — see what happens. I've actually hired hundreds of interns over a summer before; they all worked for me. It was the best way to build a really sharp team, because I got first pick of the best interns. But who are the best interns? They're the ones who can navigate the intricacies of your business and find a new business process, a new way of doing things, that actually works. Ninety, ninety-five percent of the other interns are going to fail. They're going to suck. They're going to stumble, they're going to create all these security issues — there are all these things those other ninety interns are going to do. That's a great mental model for agentic AI: you're going to unleash all these agents to do things, and a lot of them are going to suck — "it ran into this issue, this broke, this whatever." But five or ten percent of them will make you go, "Whoa, that's actually pretty clever." And what do you do once you find that new business process? You bake it into business as usual — which is case number one or case number two. The biggest benefit of agentic AI, in my view, is actually not the action; it's the discovery of the process that led to that action, the process that led to that outcome. And so in the context of how we think about agentic
AI, we generally think of it as "look at all the things it can do for us." But what if instead it's a significant opportunity to invest in process, not just in the action itself?

So let me now map this back to the Cyber Defense Matrix. When I shared it earlier, I didn't really talk about the bottom part, where I mention there's a degree of dependency on people, process, and technology. People-process-technology, by the way, is another mental model. The matrix itself is a merger of at least three mental models, and it's powerful precisely because I've merged them together — and one of the things I merged, of course, was people, process, and technology.

Now, we think of AI as displacing people. That's the concern we all have, right? AI is going to displace us — and it will happen; I'm not saying it won't. But if you think about agentic AI, what if agentic AI is actually an investment in process? Let me give you a scenario. I hire McKinsey or some consultancy. They come in and say, "You know what? You need to rework these different business processes." And you say, "Oh, you guys are great — actually, I realize I overhired these ten people, because with the better business processes I don't really need them anymore." So these ten people get laid off, and they can yell at McKinsey: "McKinsey is the reason I got fired." Well, sort of. McKinsey helped the business figure out how to be more optimal and more efficient. But there's no McKinsey consultant taking over your job.

All right. Now replace the McKinsey consultant with an AI agent. The AI agent says, "Hey, here's a new business process that makes things much more efficient," and you let go of ten people. All of us are going to blame AI for that. But really, we should be blaming the inefficient business processes.

There's a wonderful scene from the movie The Founder — the McDonald's story. Anyone know the tennis court scene? I should have a video here, but it's a beautiful scene of the McDonald's founders working with their team on a tennis court. They draw the layout of the kitchen and so on, and they keep redesigning it, trying different things. And they found a way to get a burger cooked in 30 seconds instead of 30 minutes. In doing so — this is the part of the story you didn't hear — they laid off 22 people. They improved their business processes and laid off 22 people, keeping only 12. They had 34 people and cut basically two-thirds of their staff because their business processes improved.

So going back to my earlier comment: I think the biggest, most underinvested part of anybody's business is not the people or the technology; it's the business processes. And what if agentic AI actually allows us to truly invest in business processes? At which point we will absolutely see job cuts,
but not because of AI, but because the business itself has figured out how to be more efficient. We will still lose jobs to AI directly. Okay, don't get me wrong there, but we may lose more jobs to uh to better business processes. Okay, I know that's pretty uh u not that's not very, you know, happy sounding, but you know, just being aware. Okay. All right. So, now let me comp let me now combine all these different all these different mental models. So, cyber defense matrix, kfen, udaloop. Okay. Uh so first is um this notion of chaotic the complex the complicated the clear we tend to actually move left to right sorry right to left when it comes to um uh the cyber
defense matrix we usually uh find ourselves in some chaotic situation. Oh, hey, I just uh got hit by ransomware. Uh chaos, right? Um but along the way, what we discover is, okay, what what just happened? Um let's figure out we have to we it when you're in chaos, you act you move in a direction. Don't The worst thing to do is stand still. Uh moving is better than standing still. You even going in the wrong direction is better than going in no direction at all. Uh because once you know you're in a d once you go in a direction, you can know if you're going in the wrong direction. But anyway over time you figure out what the
right processes and playbooks are. Again, this is where I think agentic AI is going to help us, and of course the codification will be accelerated through AI as well. Eventually we want to codify it into the technology itself, and when it becomes codified into technology, that's when things become clear; with vibe coding and whatnot, we might eventually get there too. In the context of how I look at the problem space here: on the left we're usually fighting against technology, and on the right we're fighting against people. But throughout the whole time we're also fighting against business processes, and business processes, again, are the one
most underinvested aspect of our business today. All right. Each of these mental models has its own complexity, and if you saw this slide coming in, you'd think, "What the heck is this?" But hopefully you see how these different mental models merge together to give you a much more complete picture of the challenges we're dealing with. Okay. Now let me give you a whole different mental model, one that will also give us a sense of how to think about where we're going with AI in general. There's this great book by Max Bennett, A Brief History of
Intelligence, and he talks about the major breakthroughs in brain evolution. There are five stages. The first stage is steering: you end up with these bilateral systems that make a decision to go left or right, instead of, like an amoeba, just floating. The next stage is reinforcement learning: hey, I turned right and got a positive reward, so I'll keep turning right. The next stage is simulating, and this is where the most advanced AI systems are today. This is a situation where you say: hmm, what if I go left? What if I go right? What if I go straight? So that's where the most
advanced AI systems are. The next stage after that is called mentalizing, and this is a hot area of AI research today. It's also squarely where the study of AI safety comes in: how do we ensure that AI systems are aligned to our values? Mentalizing, also called theory of mind, is trying to understand the intent of others. What are you trying to do, and can I infer that intent and then adhere to it, whatever it is? That's exactly what we want machines to do. I tell a machine, hey, get me to the airport as quickly as possible. I want the machine to infer that I
don't want to break any laws, that I'd like to still be alive when I get there, and that I'd like to not be nauseated. There may be various other conditions, but they're all inferred. I need the machine to understand my intent and not just run people over and kill me along the way. The last stage, interestingly enough, is language. It seems odd that language comes at the end, but it's interesting, because more than likely, whatever language the AI systems come up with at the very end is probably not a language that we're going to understand. The
point is that the language breakthrough comes at the end. Now I want to focus on two of these, because there's a mental model that's relevant here. First of all, most systems that we have are reinforcement-learning based. Again, I mentioned that the most advanced systems are at simulating, but most systems, including large language models, are at reinforcement learning, and that comes with shortfalls: it's prone to bias, it's overconfident, it's unexplainable. These are all things we see with reinforcement learning systems. So if you're trying to get an LLM to
explain itself, that's an inherent flaw. This is not fixable, at least not with reinforcement learning systems; that's the point. But there are controls that we can put in, things like checklists, and the real control is simulation. In other words, the control for reinforcement learning is the next stage over: simulating. But what's really fascinating on top of all this is that
this maps to another mental model: System 1 versus System 2. I mentioned Daniel Kahneman earlier. One of his claims to fame is figuring out that our brains operate in two different modes, System 1 and System 2. LLMs operate in System 1. And by the way, we as people think that we're rational, that we make rational decisions; that's System 2. Except we don't. Most of the time we actually operate in System 1. We make all these irrational decisions while believing we're System 2 thinkers, and we believe all you jokers are System 1 thinkers; that's why you're all irrational and I'm the only rational person
here. We all think that about everyone else. But that's the problem we have with LLMs. We have a bunch of System 1 machines, and we're saying: hey, System 1 LLM, explain yourself. The only thing it did was generate tokens from a probability sequence, nothing else; there's no explanation beyond that. What you need is a System 2 type of construct, and the System 2 construct is essentially things like mental models. What is a mental model? A way we can simulate the world and say,
hang on, does this fit into this mental model? Not all mental models are right; some are wrong. But the more mental models you have, the more you actually operate like a System 2 thinker. If there's one thing you take away from this whole talk, it's that mental models are amazing for being System 2 thinkers, for having these systems operate in a way that helps you really understand what's right and what's wrong. Okay. Now, what's interesting: I mentioned that System 2 is a control for System 1. System 1 is actually a control for system
zero. What's System 0? Going back to the brain-evolution stages: steering is System 0, reinforcement learning is System 1, and simulating is System 2; reinforcement learning is the control for System 0. By the way, "System 0" doesn't exist; I'm just using the term. And mentalizing is System 3, which is the challenge we have when it comes to controlling AI systems. At this point, the most advanced systems are at simulating, and the challenge we have is how to align these systems to our interests. The answer is that we have to figure out what System 3 is. But we already know
what System 3 is. Okay, well, what is System 3? The key question I was struggling with is this: I mentioned earlier that mentalizing and theory of mind are where the forefront of AI research is, and the reason we're thinking about it is that we're trying to define what's safe, right? But whose definition are you operating on? China's definition? Trump's definition? Zuckerberg's definition? Musk's, anyone? Whose definition of safe are we talking about? So I think the
challenge is something we see not just with AI systems; we see it in life in general, within the United States and other countries, and the way we struggle through it is politics. Who's right: the right side or the left side, red or blue? And unfortunately, we don't really have a clear definition of "right" when it comes to these political discussions. And so the founders of the country said: hey, we know that people are fallible. People are going to disagree about what the right thing to do is, and we need a control for that. And the control
that we created is called the Constitution. The control for politics is the Constitution, and the Constitution establishes three branches of government that, in theory, are supposed to be equal, separate, and well balanced. What was also interesting was this quote from Patrick Henry: "The Constitution is not an instrument for the government to restrain the people; it is an instrument for the people to restrain the government." Now rephrase it: the constitution is not an instrument for the machine to restrain the people; it's an instrument for the people to restrain the machine. Anthropic has something they call constitutional AI, and you can read about it. But the idea of a
constitution is what I think will help us understand what to do with this next stage of AI. So again, it's a mental model: the idea of a constitution, and moreover the structure of three equal but separate bodies, is what we're looking for. Now, in most organizations we don't have legislative, executive, and judicial branches. But I was trying to find a similar model within organizations, and I found one: Westrum's typology of organizational culture. Ron Westrum wrote about this 11 or 12 years ago, identified three different organizational typologies, and said every organization falls into one of
these. I actually think every organization has a combination of all three. But what are the three? The first is called pathological: information is hidden, messengers are shot, ideas are crushed. What a wonderful place to work, right? >> The Bureau of Labor Statistics. >> The next one is called bureaucratic. Most people understand what bureaucratic means: generally it slows things down, and nobody really likes bureaucracy, but we tolerate it. But what was really fascinating in his construct is that the third type he called generative. I thought, wow, interesting; this was 12 years ago. He said, "Hey, we
have generative organizations." So I looked at this and said: well, we have generative AI. What exactly is a bureaucratic AI, and what exactly is a pathological AI? I mentioned earlier that every organization is actually a combination of all three, and if you have an organization that is out of balance, say Enron, which was only generative, then you end up with the kind of chaos that we saw. What do we have in AI today? It seems like all we have is generative. So what exactly is a bureaucratic AI? In my view, a bureaucratic AI slows the AI system down. It gives us a chance to say: wait a second, hold on,
let me see if this is the right answer. But a pathological AI, what exactly is that? Well, I mentioned earlier: information is hidden. Remember what we're trying to do about the oversharing of information; we actually want to hide certain things. So, I'm sorry if this disturbs you, but I think this is a security function. If I think about an organization, the generative side is the business side, the money-making side. The bureaucratic side is legal, HR, all that kind of stuff. Pathological? That's us. But a pathological AI, what is its job? Its job is not to terminate us; its job is to terminate the generative AI.
It's to pull the kill switch on the generative AI. So again, it's all about balance. As much as we like to say we're business enablers, we're not. There's a legitimate reason to stop the business and say: hey, business, no, we can't go there. And effectively that's what we want from the pathological side: hey, generative AI system, don't go there; I have the ability to pull your kill switch, and I'm going to pull it. People may hate me for that, like they do in security, but that's essentially the role of a system like this, and a balanced AI system,
whether it's agentic or whatever we end up with in the future, is going to have a combination of all three. But today, all we have is generative, and that's a problem. Okay. So, multiple mental models. I shared the Cynefin model, to help you figure out what you should do next if you're in chaos. For knowledge, the DIKW pyramid. We have a new layer of controls, a new layer of the OSI stack, so to speak, with AI. If you think about this as an AI OSI model, you can think of it as
that, and say: okay, we have a new way of thinking about what controls we need. If you want to understand the risks of AI and LLMs, and the controls for them, System 1 and System 2 thinking is such a perfect match. For agentic AI, the OODA loop is a great model for thinking through the implications of what agentic systems will do. And then lastly, as we go into the future, as much as we eschew politics, I think it may actually become even more interesting in terms of who makes the decisions about what is safe and what the right balance is of
these different systems. So, let me just say, before I end: I've shared a lot with you, and you'll hear a lot of people talk about different things throughout this week. But how do you know that what they're saying is correct? All I can offer you is this: here are some mental models; here's the API to help us communicate better. And if what somebody says to you doesn't fit the model, either they're wrong or the model's wrong. Okay, but it's an easy test. You'll hear this, for example,
around the problem of oversharing content: people will say, "Ah, go fix your data." And right there you're like: wait a second, that doesn't fit the mental model we just talked about. So you now have a way to sift through all this knowledge, all this data, all this information you're getting, and a way to decipher what is actually true wisdom. Remember, the top of the pyramid is wisdom. Grant me the wisdom to know the difference; and mental models, I think, are the key that unlocks that wisdom for us. So with that, thank you very much.
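The "easy test" described here, checking whether an output fits a declared mental model, can be sketched in a few lines. This is a purely illustrative toy, not anything from the talk: the `fits_model` helper is my own invention, reusing the speaker's later example of asking for a five-paragraph essay and rejecting any answer that doesn't fit that shape.

```python
def fits_model(response: str, expected_paragraphs: int) -> bool:
    """A System 2-style check over a System 1 generator: does the
    output fit the structure we asked for? (Hypothetical helper.)"""
    paragraphs = [p for p in response.split("\n\n") if p.strip()]
    return len(paragraphs) == expected_paragraphs

# Ask for a five-paragraph essay; a four-paragraph answer fails the test,
# so either the answer is wrong or the model (the structure) is wrong.
essay = "\n\n".join(f"Paragraph {i}." for i in range(1, 5))  # 4 paragraphs
print(fits_model(essay, expected_paragraphs=5))
```

The point of the sketch is only the shape of the test: declare the structure up front, then validate the response against it instead of trusting the generator's own account of itself.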
I'll hang around for questions if you have any. And a special prize if anyone notices anything interesting here. So, all right. >> Hair color? >> No. Oh, yes, I've gotten older. Nobody notices anything. >> The squint?
>> All right. Nice. >> So, I know, you expect it to be way out there and it's right here. In the Cyber Defense Matrix you talked about moving from the chaotic to the clear, and that with AI we ultimately want to get it to that System 3 of mentalizing and understanding what we want. But do we need more to go from mentalizing, knowing what we want, to choosing to do what we want, and then actually doing it? Because in psychology, people have to know the right thing to do, then they have to choose it, then they have to actually do it. Is it enough
to get the model to mentalize? >> Okay, so everything you said, knowing what to do and choosing it, can be done right here at simulating. Mentalizing is the point at which we start judging other people. Mentalizing is the point at which we say: no, you're actually wrong. Before then, there was no sense of wrongness in the moral sense; before then, it either just works or it doesn't, whatever it is. But as soon as you hit mentalizing, it's like: that's shameful. So the question of what is safe, what is right, what's okay: you start putting human judgment against it, meaning that what the machine did is
not right, in the sense of morally wrong. The trolley problem, hey, should I run over three people or pull a switch that runs over one person, is the same sort of rightness-and-wrongness question being posed there. >> But so, there's getting the model to know, and you were here for Matt's talk yesterday, right? There's getting the model to know what's right and wrong, that something is a lie. But is that enough to get it to not lie, or does it also have to want to not lie? >> Whether you want it to lie or not, a System 1
reinforcement learning system doesn't care, right? But how do we know an LLM is lying to us? We have a mental model. In fact, what you should instruct the LLM to do is to fit things within a certain model. Hey, write a five-paragraph essay for me. Well, if it generates a four-paragraph essay or a six-paragraph essay, I don't even need to read the rest to know it's wrong. The whole idea behind chain of thought is that it gives you the structure, the mental model; though you should question the structure itself and ask whether it's the right mental model. But once you have the
structure, you can then validate whether the response that comes back fits within that structure. Do we have time? I know the next talk is at 11, right? Okay, so we do have time. >> Four more questions. >> Cool. And again, this is Ground Truth, right? So I hope this fits the narrative of what Ground Truth is all about. >> Where you had the System 1 and System 2 material a couple of slides back: the control for System 1 was System 2. Where does that sit? Obviously a System 1 system doesn't have that kind of process. >> Yeah. So, let
me see if I can describe it in terms of our brains. Our System 2 brain is the prefrontal cortex, versus our amygdala for System 1. Physically, the prefrontal cortex sits in front of, layered on top of, the amygdala. >> Okay. Separate, on top of it. >> Yes. And if you think about where these LLMs are going, the reasoning function is really operating on top of the LLM itself. >> Layered the way the brain evolved. >> Yeah, yeah. Brain science is amazing, because it tells you so much about how these LLMs work. But that's essentially the same construct that's
happening. And where you don't have that layer, you can create your own layer right there. When you ask it to do a chain of thought, or ask it to fit within a certain model, you're essentially creating your own layer on top of it. >> So, a couple of years ago I had the honor of attending one of your training classes on your matrix, and I've taken it back not only to my organization but to my own community. We've added a bunch of extensions onto it, because I love how it makes a very clear one-pager that you can use to have a conversation with anybody
in the organization. But if you add mappings to whatever your framework is, mappings to what norms are in play, mappings to what type of evidence, you get a really strong view of what you're doing. And the biggest thing I've seen, in my own organization and among my peers, is that when we're dealing with AI, a lot of the less-than-desirable outcomes we're getting are because we don't have the intimacy, the "into-me-I-see," into what our processes, our procedures, our data, and our failed assumptions are. >> Well, going back to the notion of these mental models as a communications tool: the mental
model, the Cyber Defense Matrix, has been extraordinarily useful for a lot of folks because it serves as a quick API to help us get on the same page, a literal page, on certain things. At the end of the day, there are other mental models for business processes. I don't know what they are; I'm not a business-process person. But we should probably figure out what those are, and discover them. What I hope you saw here was not just a single mental model but the bringing of these mental models together, and how powerful they become once you combine them. They're powerful
on their own, but they're even more powerful when you start bringing them together and discovering some amazing things. I think there's a huge opportunity here with agentic AI, but I'm not a process person, so I have no clue what the mental models are for that. So let's explore, and if there are people here who can think that through and come up with new ways to describe it, then we've created a new API for our brains, and we're all better for it. Any other questions? >> All right. Well, thanks for your time.
[Applause]
Good morning, everyone. Welcome to BSides Las Vegas. This talk is about advancing network threat detection through standardized feature extraction and dynamic ensemble learning, by the speaker Jason Ford. I'd like to make a few announcements before we begin. We would like to thank our sponsors, especially our diamond sponsors, Adobe and Caddy, and our gold sponsors, Formula and runZero. Their support, along with our donors and volunteers, makes this event possible. And please don't take pictures; we will post the videos on our YouTube channel.
And one more announcement: today we have a data science meetup at 7:00 p.m. at the pool party entrance. And that's it.
>> Good. Awesome. Everybody hear me okay? >> Yeah. >> Outstanding. All right, and right on time. Well, thank you all for coming. This is my first talk at BSides, so I appreciate the fact that you are my first audience. Thank you so much. "Advancing Network Threat Detection Through Standardized Feature Extraction and Dynamic Ensemble Learning." You know, when we write academic research papers, they do not have catchy titles. They are very descriptive, and also incredibly wordy. But for the next 19 minutes or so, I'm going to walk you through some research that I've been working on for the better part of the last two years, and then talk about where I'm hoping to take this work next.
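Before the walkthrough, here is a rough idea of what "dynamic ensemble learning" can look like in this context. The talk later describes using each classifier's validation accuracy as its base weight; the snippet below is my own simplified, assumed version of that idea (accuracy-weighted soft voting over per-flow malicious-probability scores), not the paper's actual FCSWV formulation.

```python
def ensemble_score(probs, val_acc):
    """Combine per-classifier malicious-probability scores into one
    ensemble score per flow, weighting each classifier by its
    validation accuracy (normalized to sum to 1).

    probs:   probs[i][j] = classifier i's P(malicious) for flow j
    val_acc: val_acc[i]  = classifier i's validation accuracy
    """
    total = sum(val_acc)
    weights = [acc / total for acc in val_acc]   # base weights from accuracy
    n_flows = len(probs[0])
    return [sum(w * p[j] for w, p in zip(weights, probs))
            for j in range(n_flows)]

# Three hypothetical classifiers scoring two flows; the more accurate
# models dominate, and the weak model contributes little.
p = [[0.9, 0.1],
     [0.8, 0.3],
     [0.2, 0.2]]
acc = [0.95, 0.90, 0.50]
scores = ensemble_score(p, acc)
verdicts = ["malicious" if s >= 0.5 else "benign" for s in scores]
```

The design choice worth noting is that the weights come from held-out validation performance rather than being fixed by hand, so a model that generalizes poorly is automatically down-weighted.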
So, a little bit about me. I'm a research engineer at Proofpoint; I've been there a little over 10 years now. I've done a bunch of school at a few different places, including the University of South Carolina (go Cocks), and published papers on a variety of things. I'm not a threat researcher, but I've had the opportunity to do some research on state-sponsored actors and on using machine learning to detect QR codes in email. For those of you with any exposure to email security, you know that's been one of the hot topics in the industry over the last couple of years. But most recently, and the reason I'm here to talk to you today, is about using
ML for network intrusion detection. When I'm not working on this stuff, I live in Colorado, usually out enjoying the outdoors, whether the weather's real nice or real cold, usually finding a mountain to ski down, something to that effect. And my career mentor, for those who know Sherrod DeGrippo, who works on the threat research team over at Microsoft, told me, "You really need to add something fun to your about-me slide, like tell them what your favorite protocol is." And I was like, "Sherrod, why would anybody care what my favorite protocol is?" She said, "Just do it. Just put it on the slide." So, my favorite
protocol is DNS. And why? Because it's what makes a lot of other things work, and it's usually the first thing to break that really becomes a problem. So, let's talk about the talk. What is the problem that I was trying to solve, or at least find an answer to, as part of this research? I don't think it's any surprise to anyone in this room that the attacks we are asked to defend against are becoming not only more sophisticated but more difficult to detect. Now, NIDS has played a really important role in detection over the years, but one of its main constraints is that NIDS relies on reputation, on
signatures, right? The ability to detect things it has seen before. A lot of the prior research in this domain has focused on using an individual ML classifier with a very limited or very small data set, which looks really good in a lab; but when you actually try to turn it loose on real traffic, it does not generalize well. You end up with very poor efficacy in a lot of cases, and in a lot of the studies, anything that was turned loose outside the lab saw efficacy under 40%. So, these are some of the things we wanted to try to tackle with this work. Now, just as a
brief bit of background, for those maybe not as familiar with the terminology, I wanted to spend a quick 30 seconds on NIDS versus NDR, because that's usually the first question I get when I talk about this work: "Well, Jason, you know, they use ML in network detection and response." I know; we're specifically focusing on NIDS as part of the comparison for this talk. So, a brief explanation: NIDS looks for things based on rule-based or signature-based detection. NDR incorporates behavioral analytics and things of that nature, a lot of times using anomaly detection models in order to be
able to do that, and it also offers, if you've used products like Vectra or others, the ability to do threat hunting and some automated response capabilities. The objective of this research was really to find out whether it was possible to improve accuracy in detecting threats by looking at stuff on the wire, and there are three pieces to that. The first is developing a standard feature extraction framework for analyzing the traffic. For anyone with experience training machine learning classifiers, I'd argue that in a lot of cases the features you extract are actually more important than the model you choose to train. It's not
to say that models aren't important, but in a lot of cases, what we see is that models that generalize poorly didn't extract the right features during the training phase, which leads to them not reliably detecting the things you want. So, one of the things I worked on as part of this was looking at what everyone else had done: what feature extraction frameworks were out there. One thing I found is that in a lot of cases, people were relying on things like training the model on IP addresses and port numbers. As you can imagine, that probably didn't scale super well once you turned it loose on
traffic that it hadn't seen before. In other cases, they would actually use another classifier to do the feature extraction automatically. That's a fairly common approach in ML: let the model decide what things are important and extract those. The problem with temporal or transactional data is that the temporal dependencies don't necessarily lend themselves to the type of work we're talking about, where what I actually want is to distinguish between benign traffic and potentially malicious or suspicious traffic. Right? The model isn't necessarily going to pull out the nuances that are required when you're just
allowing it to decide on its own what it should pay attention to. So, step one: figure out a feature extraction framework. Step two: pick a bunch of classifiers and use the framework to train them to actually do detections. The last piece of this, and where I got to flex a little bit, where the creativity comes in as part of the research, was actually designing an ensemble classifier. Ensemble classification in ML has been around forever. This is just yet another opportunity to leverage ensemble learning in order to hopefully provide a classifier that will work really well when it's exposed
to things it hasn't seen before. So, you'll see this visual a few times; I don't expect everybody to read the text. Essentially, we're going to walk through the process of how I extracted the data, trained the classifiers, and then got to the point where we can start providing scoring. We'll work on that first box first: load the samples, extract the features, and then apply scaling. Now, one of the things I found as part of this research, which is also very common in this domain, is that we don't necessarily have a lot of really good public data sets that have the stuff we want in order to be able
to do model training. In a lot of cases, the data sets used in prior research actually leverage a subset of the data. Someone will take some pcaps, reduce that down to just the features they want to train their models on, convert that to something like a flat text file or a CSV, and make those public. Unfortunately, that doesn't have all the other metadata we might want to train our models on. So, for the purposes of this research, I went out and sourced public data sets that actually provided raw pcaps. We want to see the whole thing. This allows a lot of flexibility in terms of doing things like looking at flows, looking at forward and backward packets, as well as a bunch of other metadata you might not get if it's just a bunch of sequential lines in a CSV. There were three data sets I sourced: one from Czech Technical University, another from the University of New South Wales, and the last from the University of Science and Technology of China. In addition to those, we collected some benign traffic from home and academic networks. With ML training, in a lot of cases you're looking for balanced
training data sets, right? Both for actually training the models and for validating them after the fact. In a lot of cases, these public data sets were great because they gave me very good representations of known-bad traffic: it was labeled, so I had the ability to say, for the purposes of training the model, that this is actually something malicious. What they didn't have was the good stuff, the benign things: a Zoom conversation, two people chatting on Microsoft Teams or Slack, regular web browsing, things of that nature. So we collected network traffic in controlled environments to then
balance out the training data set. For feature selection, the things I had seen in a lot of the prior research were, unfortunately, based on fields that did not end up generalizing well when the models were shown traffic they hadn't seen before. So I focused a lot more on metadata-related things: at the packet level, things like the count of the TCP flags and the average time-to-live of the packets; and then most of what the model was trained on was flow-level statistics. So we're talking about things like forward and backward statistics, inter-arrival times, the number of packets observed in the last 10 seconds,
duration, and so on. So, back to the graph. Moving on from there, we extract the features out. The idea then was to generate predictions from each model, but I had to select some classifiers to train in order to do that. Some of these have been researched exhaustively in this particular domain, the first two being random forest and isolation forest. As I mentioned, a lot of NDR solutions do behavioral analytics or UEBA-type things, and Isolation Forest is more often than not one of the models that gets used. The nice thing about it is that if you don't have a data set of all the malicious traffic, but you do have
one of what good traffic looks like, you can just train it on one and not the other. The idea is that it has the ability to identify anomalies that don't match the things it knows. Some of these others, GMM in particular, have not seen a lot of research in this domain, so I decided to include it to see if it would be feasible to use. Same thing with QDA and some of the boosting classifiers, as it relates to their ability to generalize well against things they haven't seen before. The neural network models I selected have also been used fairly extensively in identifying network traffic. We'll come back to that closer to the end, because as I found out, hyperparameters for training those models become that much more important to their performance. So, moving right along through our process flow: after training the individual classifiers, we aggregate them based on benign and malicious scores and come up with an ensemble score. This was the art part of the presentation, and I'm just going to warn everybody ahead of time, this is also the math part of the presentation. Stay with me, it'll be fairly quick and painless, but we are going to get into the math part of this as well. So the idea, as I mentioned at the
beginning, was to create an ensemble classifier, and yes, when you are the author of the paper, you get to name it after yourself: Ford Class-Specific Weighted Values is the name of the classifier. I'm going to build this all out, because there's a lot of text, a lot of italicized symbols, and then an explanation of how the weighting function works. The idea here is that we train all the individual classifiers, then run validation testing against them to get their accuracy scores when they're scanning traffic individually. We then use that as the base weight to
combine them together, so that they effectively cancel out each other's weaknesses but keep the benefits of all the things they do well. What is somewhat unique and novel about this is that we're actually looking to give more weight to less confident predictions. The simplest way to think about that is the next slide, applying the thresholds. For things that generated a score closer to 0.5, they get higher weight. For things closer to zero (benign) or one (malicious), they get less weight. The idea is to filter out overconfident predictions. And in testing, as you'll
see here in just a second, that actually performed incredibly well. The math part of the presentation is right there; it's just some basic addition and division, didn't want to scare anybody away. We take the weighted benign and malicious scores, add them together, and divide by two to come up with an average, and then the classification is made by comparing that to 0.5: if it was greater than a half, the sample was labeled as malicious. So that gets us to the last part here, actually classifying the traffic. I realize this may be somewhat difficult to see, but the point is that the ensemble classifier is listed at the
top, in contrast with all the individual classifiers that comprise it. The ensemble classifier achieved almost 98% accuracy in identifying samples. Now, the big difference between it and the individual classifiers, some of which performed exceptionally well on their own: the boosting algorithms in particular, which are a bunch of weak classifiers combined together, no surprise, actually did really well at properly classifying the traffic. But the ensemble had the most balanced precision and recall of all the classifiers that were tested as part of this. The idea being, if we get the opportunity to actually test this
against live traffic, hopefully this thing would perform exceptionally well. In terms of future work, probably the number one thing I'd like to revisit, given the opportunity, is retraining some of the poorly performing models. I mentioned this a little earlier: things like GMM and the neural network models did not perform nearly as well as what I had seen in some of the other research in this domain. I have a few theories for why that is; using different hyperparameters is absolutely part of it. That's something I need to take a closer look at to determine what the feasibility is of
actually using these at scale, because other studies have shown there's a lot of promise in using CNNs and RNNs in particular for identifying potentially malicious network traffic. And then the part that probably excites me the most is the feasibility of actually productizing this, by which I mean seeing if it works well outside of a lab. Everything I did here was Python based, so in terms of the ability to run it on different platforms, it's fairly easy to set up and scale, and there's not a big footprint required to run it in a production setting. To that end, I don't want to
jump too far ahead here, but if you all see me next year giving a talk at BSides, hopefully it's about the fact that this week I've actually been testing this thing out, observing traffic to see if we have the ability to classify things. You're probably not going to be able to see anything really good in that photo; I just grabbed it in the room to be able to show it. We've got a little Raspberry Pi with an external Wi-Fi adapter set up there, channel hopping and looking for open Wi-Fi networks so we can actually observe the traffic. So hopefully more to come on that, you
know, in a future talk. With that, I just want to leave you with a couple of things. The feature extraction framework I built as part of this research is available under a GPL license on my GitHub; you can grab it either through the URL or by scanning that QR code. And probably the most important thing to mention is that none of us gets anywhere on our own. I did the research on my own, but I had a lot of help. I want to thank my adviser and a lot of my friends and colleagues who assisted me with this, and the last person named
there is my wife, and I can't tell you the number of times that she read, reread, and reread again, and then offered feedback on both the paper and these slides before I got here. So I want to make sure I take the opportunity to thank those folks as well. As I mentioned, that's totally not a credential harvesting attack, so feel free to scan that QR code. I'd love to connect with you, give you the opportunity to read some of my research, and maybe collaborate on stuff in the future. So with that, I'd love to take any questions. Thanks so much. [Applause]
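The confidence-based weighting described above can be sketched in a few lines of Python. This is an illustrative reconstruction of the idea only, not the paper's actual FCSWV formula; the function name and inputs are assumptions made here for demonstration.

```python
def fcswv_sketch(scores, accuracies):
    """Illustrative reconstruction of the ensemble idea described in the
    talk -- NOT the paper's actual FCSWV formula.

    scores:     per-classifier malicious probabilities in [0, 1]
    accuracies: each classifier's validation accuracy (its base weight)
    """
    # Confidence is 0 at p = 0.5 and 1 at p = 0 or 1; less confident
    # predictions get MORE weight, filtering out overconfident ones.
    weights = [acc * (1 - abs(p - 0.5) * 2) for p, acc in zip(scores, accuracies)]
    total = sum(weights) or 1.0  # guard against all-overconfident inputs
    malicious = sum(w * p for w, p in zip(weights, scores)) / total
    return malicious, malicious > 0.5  # label malicious if score > 0.5

# An overconfident 0.9 vote is down-weighted; the votes near 0.5 dominate.
score, label = fcswv_sketch([0.9, 0.55, 0.6], [0.95, 0.90, 0.92])
```

Note how the classifier reporting 0.9 contributes little despite its high accuracy, because its confidence discount is large; that is the "filter out overconfident predictions" behavior the talk describes.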
>> Hi. I was just wondering, does that change over time? So is this actively monitoring traffic? Does this, in real time, start changing those weights depending on, hey, it started out looking benign, but as I receive more and understand more, it's starting to look more malicious? >> That's a great question. I think what you're referring to, and I don't want to put words in your mouth, is a retraining mechanism for those classifiers as they observe more traffic. Does it get better over time? Is that the question? Yeah. In an ideal scenario where we would actually deploy something like this in production, yes, that would be the goal. As it stands, the classifiers that I used for the
experiment were static, so the only thing they know is the data they've been trained on. If I were actually able to turn this into something we could deploy in real life, outside of the little Raspberry Pi I've got set up in the hotel room, then yes, that's absolutely one of the things I'd want to build into any platform that would do this, because it's only going to get better by seeing more data. >> Right. And then for a live packet, so a TCP connection, you're intercepting the packets, man-in-the-middling it. Is it classifying once, and then it's done, or is it actively monitoring that connection and then
reclassifying based on the parameters? >> Yeah. So, without giving too much away, because obviously this is something I cobbled together in the last week just to see if we could actually test this with live traffic, that absolutely would be part of the idea. Right now, the way it works based on the feature extraction framework is that it's looking at a sliding window of 10 seconds: 10 seconds before and 10 seconds after, so 30 seconds' worth of traffic. That's obviously a window that could be changed based on performance and things of that nature. But the cool thing is that it works really well on a Raspberry Pi 4. So the ability
to do this and get a classification decision in under 30 milliseconds is something that absolutely could be feasible in a production scenario. >> Thank you. >> Yeah, you bet.
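The sliding-window feature extraction discussed in this exchange might look roughly like the following. The packet fields and feature names here are illustrative assumptions, not the author's actual framework, but they mirror the flow-level statistics named in the talk (forward/backward counts, interarrival times, duration).

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Pkt:
    ts: float       # arrival time in seconds
    forward: bool   # True for client -> server packets
    ttl: int
    size: int       # payload bytes

def window_features(packets, t, window=10.0):
    """Flow statistics over a sliding window around time t, in the spirit
    of the features described in the talk (names are illustrative)."""
    # 10 s before and 10 s after the reference time, as in the Q&A.
    win = [p for p in packets if t - window <= p.ts <= t + window]
    if not win:
        return None
    times = sorted(p.ts for p in win)
    gaps = [b - a for a, b in zip(times, times[1:])]
    return {
        "fwd_pkts": sum(p.forward for p in win),
        "bwd_pkts": sum(not p.forward for p in win),
        "mean_ttl": mean(p.ttl for p in win),
        "mean_iat": mean(gaps) if gaps else 0.0,  # interarrival time
        "bytes": sum(p.size for p in win),
        "duration": times[-1] - times[0],
    }

pkts = [Pkt(0.0, True, 64, 60), Pkt(4.0, False, 128, 1500), Pkt(25.0, True, 64, 60)]
feats = window_features(pkts, t=5.0)  # the packet at t=25 falls outside the window
```

Features like these would then be fed to the trained classifiers for each window, which is what makes a sub-30 ms decision plausible on small hardware.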
>> Thank you. Great presentation. Can you share any results or findings, you know, what kind of stuff did it detect? And then a second question would be, how do you deal with encrypted traffic? Can it detect behaviors and patterns even though the traffic's encrypted? >> Yeah, two great questions. I'll try to touch on the first one as best I can. Again, as I mentioned to the other individual that asked a question, all of the testing up to this point has been based on static traffic: we're actually looking at pcaps, not live traffic. So the experiment
that we did as part of the research was against a corpus of traffic that was known, already labeled benign and malicious. The idea would be that if we turn this on live traffic, you would have to decide what things you want to capture, outside of the metadata being extracted, to then make a determination. I can give you a score, I can tell you it's 57% confident that this is malicious, but the next phase of that, the "well, what is it then?", I think would be the next thing in terms of developing this and actually turning it into a tool
that could be used by others. And I apologize, repeat the second part of your question for me. >> I was curious how it handled encrypted traffic, whether it detects any patterns, and then a third question that occurred to me: cost, in terms of the tokens and the >> Yeah. So the little experiment that I've been running this week at summer camp is only looking at open Wi-Fi networks, so we're not inspecting any encrypted traffic. I think in an ideal world, for what I had aimed to create as part of this project, this would actually be something you would plug into a trunk port on, you know,
your backplane, as opposed to doing this over Wi-Fi. That's not to say you couldn't; there are certainly interesting things to observe over Wi-Fi. I can tell you it's really interesting being in a casino full of hackers and observing everything that those who choose to connect to the open Wi-Fi actually put out there. So, food for thought. In terms of cost, because we're using what I'd put in the category of traditional ML classifiers, this is something you can run off of a 3060 in your desktop PC. There's nothing in terms of training that would be prohibitively
expensive for an undergraduate college student to do, probably on either their own PC or one they could grab out of the lab. That being said, the rig I've got at home is a little bit bigger; the classifiers trained in under 30 minutes in most cases on the training data set. >> Thank you. >> Yeah, you bet. Any other questions? I know we're right at time here. Okay. Well, I'd love to connect with any of you if you have other questions; I'll be hanging around for a little bit. Thank you all so much again for being my first BSides audience. I appreciate it.
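The benign-only training approach the talk described for Isolation Forest can be demonstrated with scikit-learn. The feature values below are made up for illustration; the point is only that the model is fitted on "good" traffic alone and flags anything that doesn't resemble it.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Pretend these are flow features (packet count, mean interarrival time)
# for known-benign traffic only -- no malicious samples needed to train.
benign = rng.normal(loc=[50.0, 0.2], scale=[5.0, 0.05], size=(500, 2))

clf = IsolationForest(contamination=0.01, random_state=0).fit(benign)

# Anything far from what the model has seen is flagged as an anomaly (-1),
# while traffic resembling the training data is an inlier (+1).
print(clf.predict([[52.0, 0.21]]))   # an in-distribution flow
print(clf.predict([[500.0, 5.0]]))   # e.g. a packet flood
```

This one-class setup is why the talk notes Isolation Forest is popular in NDR/UEBA products: labeled malicious traffic is scarce, but benign baselines are easy to collect.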
This is customer self-assessment data: a long list of hundreds of questions that, when a customer comes to us to apply for cyber insurance, we want them to fill out, saying how they're doing in areas like multifactor authentication and so on. And then we also have insurance portfolio data: since we also do reinsurance, we get the reinsurers' portfolio data as well, so we're kind of unique in that. So, on to some trends and statistics. In the risk environment this year, data encryption and business interruption continue to be
one of the most impactful types of losses, and one of the things I do want to focus on this year is privacy regulations and compliance. This is becoming a pretty big thing from year to year, a bigger risk applying to more of the data. In terms of data, it's questions like: how do you use the data? How are you transparent about how you're going to use it? What are you collecting, and how long are you keeping the data? And in terms of data brokers, we're all doing AI now; sometimes you get data from your
data brokers, and what kind of laws do they follow? Those are the kinds of things that continue to be a challenge. We also see increased frequency and sophistication of attacks; this year we definitely saw more events. However, the good news is that extortion events, as a percentage of all cyber events (business interruption, privacy breach, and so on), have gone down: ransomware events have actually decreased percentage-wise. Another thing was a decrease in companies paying ransom demands: last year it was 23-some percent, and this year it went down to 18%. So it's
good news. What that's also saying is that a lot of our customers are getting better at managing their controls, managing their environment, doing backups and doing exercises, so they don't have to pay the ransom. That's a nice thing to see. Wire fraud, we see that growing; this one was from BakerHostetler, and we see a more than 200% increase. The industry sectors where we saw more events this year are communications, media and technology, healthcare, and retail and wholesale; those are the top three industries among our clients. Just for some background, we usually
have the largest companies that we broker for; Marsh is the biggest insurance broker in the world. In terms of global trends, we see a lot of zero-day vulnerabilities being exploited, and we see more cloud events as well. Everybody has seen that one CrowdStrike event, and depending on who you ask, that's anywhere from 1.5 to five billion dollars; that's a big number. Delta Airlines lost $500 million for that one event, over five days. And then Change
Healthcare, that was another big one, also in the billions. We also see a lot of supply chain events, Silk Python, as well as Scattered Spider. We see Scattered Spider everywhere: last year a lot in Caesars Entertainment and MGM, and also a lot of European retail. This is becoming pretty big; it connects to the Snowflake events too, and of course Microsoft, we've seen that as well. So you see a lot more events that impact multiple companies, quite widespread types of events. Before I say anything about this
one: this is by country, the top 10 countries, and there's definitely data bias in this one. There are certainly more events in the US; it depends on the reporting requirements. You can see the US ranked pretty much number one, then Canada, Great Britain, and so on. But some of that is also due to language barriers: maybe India has a lot more events, but our data sources may not get translated. So we see the English-speaking countries reporting more events than the non-English-speaking ones, and the countries using the English alphabet, we see more of that kind of
reporting as well. If you drill down to the US, 2024 is still not a complete data set, because it takes about 30-some days to find incidents, report them, and get the reports out, and then there are data delays and so forth. I picked up data through the end of January 2025, so the reporting isn't complete. It looks like 2025 didn't do that well, but that's just because it's a partial data set; we're not done with 2025 yet. In general, we see probably six to nine months
of data delays. In terms of cyber incidents, you can see finance and insurance, public administration, and professional services being the top three, but in general finance started high and is slowing down, and that band is narrowing. The next one is public administration, which was on top at first, around 2013 to 2016, and that's also narrowing. And the third one is healthcare, which continues to be a pretty focused target for the bad actors; we've seen some pretty big events in healthcare this year as well. In terms of ransomware, it's a little
different, with the highest one being manufacturing. You can see the manufacturing band widening quite a lot. The next one is professional services, that's this band right here, and you can see that band widening too. And then the next one is healthcare: while the frequency looks like it's decreasing, the severity of healthcare events is pretty high. So different industries experience different types of cyber events. And if you look at incident response time, detection has gotten a lot better; companies are using different tools to try to figure things out. So on the
average, they improved by 10 days, from 36 to 26, for network intrusions; across all incidents, from 42 to 31. That's quite a lot of improvement. Containment is about the same, didn't change a whole lot; analysis to completion has also improved some; and notifications, maybe not that much of a change, a little bit longer. Cost components: I did this one two different ways. The first time I ran the numbers, the 1-in-100 was huge, because I have some really large breach response data points in there, so I said, okay, let me just take
those guys out, given that that's not typical. So in the second graph I took out the outliers, and it went down a lot, but you can still see the average went from 14.8 million to 8.6 million. I do wonder whether, for most companies, there could be pretty huge numbers for breach response and litigation. By the way, for breach response we had almost 500 data points; that's how I ran those numbers. And for litigation we also have about 400-something data points. So you
can see that, on average, litigation is about 2.4 million, but the 1-in-100 is about $35 million, so it's pretty expensive in terms of cost components. And then for recovery, the average is about 2 million and the 1-in-100 is about 16 million. Some of those clients actually took a long time: in terms of duration of data recovery, we have clients that are still recovering data a year or more later. So that's a pretty expensive number. Yeah. >> Can you give examples of how you came up with that number? Like, are you calculating people's
salaries and... >> No, no. This is a conditional severity: given there's an event, and given that they told us what their number is, here's what the lognormal distribution looks like. That's what that is. You can read off the different points: on average they're here, the median's here, the 1-in-50 is this, and the 1-in-100 is that. When we actually do this severity analysis and build the model, we associate revenue and employee count; those are pretty standard parameters we use to do the severity for a particular client. But for these numbers, for illustration
purposes, this is just saying that, given all the events we have numbers on, here's what the average looks like. Okay. >> Yeah. Sure. Good news: cyber market conditions are great; look at the numbers. One of the things we're seeing is that a lot of our clients, last year and this year, are spending more on actually transferring the risk, so increasing the limits and decreasing the deductibles and retentions, given how this environment is evolving. If you look at the total program and the primary layer, it's pretty consistent: prices have gone down a lot, so this is good news for our
clients. Privacy risk: I'm going to go through some risks and trends. It's still one of the most impactful of all the cyber perils. The things that are challenging to our customers: where is the data, and where do you store it? Is it in the public cloud, the private cloud, on prem, or with a partner? Do you have data coming from a data broker? Is there an SDK integrated with your environment, embedded in an application, that consumes data? Does the SDK collect data? If it collects data, where does the data come from? And
who has access? Is it employees, business partners? Business partners being one of the big ones we want to call out: sometimes you have a company with thousands of business partners, and how much of your data do they have? We've seen companies that got breached through a business partner, the business partner went out of business, and the company actually ended up having to cover those kinds of losses, because it's their data. And then how do you provide all the transparency and user controls and things like that? Can you keep up with all of that? And then,
there's record of consent: certain consents you have to have. Where is it needed? How do you get those? And how is the data being used? If you're doing machine learning or targeted advertising, can you do that? Did they provide consent for those kinds of things? Those are the questions we see challenging a lot of companies. Then there are your data assets. We've been talking for a long time about PII, PCI, and PHI; this year it's starting to get more complicated. I mentioned SDKs and application data collection: do they
collect unnecessary datas and who's receiving those datas? Um APIs API API that works with SDKs may allow other party to obtain data unintentionally. Uh what about data brokers? Uh smart connections, variable device, uh a lot of the California CCPA and privacy, what's this? CIPA California Invasion of Privacy Act. Uh there are sweep stakes, there's contests, surveys, uh any sorts of reward program, biometrics. So those kind of thing like Texas just sue Facebook for biometric data violation. It end up to be hundreds of millions of dollars. So I think it's billions, 1 point something billions. So regula regulatory in this area has gotten to be a little crazy uh from a state level. The BIPA, the C
CCPA, the CIPA, with California kind of leading here. The federal government is now enforcing too: rules built for surveillance technology are being repurposed to cover pixels, session replay, and those kinds of things, and they require compliance. From a global point of view there's the GDPR, with 144-plus countries participating in something like it, and the UK, France, and Australia, the top three, have gotten a lot more aggressive about enforcement. There are 140-plus privacy regulations globally, and they keep changing, so this is a tough one to manage. There are tools, though. One study, by Prey, found that 98% of
companies fail cookie audits against regulatory requirements, and Allianz had a report last year saying that the average privacy litigation cost for data violations is now at $30 million per incident. So this is a pretty big risk that requires some attention. If you look at the number of claims we see, 2023 sort of exploded and 2024 went down some, with the CIPA, the California Invasion of Privacy Act, being the one on the rise, and the light blue one being the VPPA and website tracking. So it looks like website tracking
had gone down a bit, but you can see it has been climbing: from 2020 and 2021 to 2022 it got a lot higher. Here are some of the losses. This is a chart that I update yearly, actually regularly, for our report and for this conference. You can see healthcare at $2.3 to $2.87 billion in damages, so some of the numbers here are pretty substantial; by comparison, the bottom entry that fits on this page is $101 million on the privacy side. In terms of fines and penalties, it's also in the billions. The
second one was the Facebook one at $1.4 billion: the state of Texas sued Facebook and got a $1.4 billion settlement out of it. Those are the kinds of large fines and penalties that companies should pay attention to. Now, I didn't mean to throw out all those bad numbers without telling you what to do about them. In terms of data minimization for new data, you can tag a lot of your data, and there are what are called privacy enhancement technologies, PETs. With AI, federated learning is one of them: you
basically train the model on decentralized data without directly sharing the data. There's also differential privacy: you add random noise to your data so that the statistics don't change much but individual records are obscured. And there's anonymization and pseudonymization. Those are some of the PETs you can use. For old data, I know this is something I'm always guilty of: oh, I want to keep it, maybe I can use it for something later. That's not necessarily a bad thing, but you should prioritize: by keeping it, what kind of risk do you incur? Make that a conscious decision. Then, notice and consent: you
need to keep records of all of that collection and understand how your data is being used. And of course, for any sort of disclosure requirement, keep track of it: if the regulator asks for all the records of requests and documentation, how do you provide them? So actually have a process, and there are tools: privacy program tools, built for privacy setup rather than risk monitoring, that handle notices, consents, and DSR implementations, and there are also data discovery and bulk-scanning tools. I don't want to name vendors here, but if you want, it's actually in the notes of my
presentation. Of course, map your internal data too. Some insurers' risk management offerings actually provide these tools; I shouldn't say for free, but they're included in the risk management option they offer, so if your premium is above a certain point, some of these things you just get. Then it's all about monitoring, automating, and testing to see where you're failing and what you've got to do. If you actually have a process and tools in place, you can do all of this without having to keep up with the 140-plus regulations
worldwide by hand; that's impossible. Next, I want to talk about business interruption. One of the big ones last year was the CrowdStrike incident. Depending on who you ask, that was $400 million to $1.5 billion in insured losses, though one estimate of the total cyber loss is about $5.4 billion, and Delta's CEO said they lost $500 million in five days due to cancellations and such. UnitedHealthcare had two-point-some billion dollars of losses; $2.87 billion was the high end, and about $800-some million of that was business interruption loss. So again, this is
actually a pretty big bucket. I know I've talked about this before, but I wanted to flag the different types of business interruption people should be aware of. The first type everybody knows about: the malicious hacking kind, DDoS and so on; that's network interruption from security failures. The second type is system failure, even in software you use. That's where MOVEit comes in: you use somebody else's software and it has a bug in it. The Microsoft SharePoint
vulnerabilities are another example; those can also cause business interruption, and that's not always called out in the insurance policy. So if that's something the company is concerned about, make sure it's included in the policy. The third type is contingent business interruption, caused by a third party, which could be an IT provider or a non-IT provider: your critical supplier. If you don't have batteries to put into electric cars, you can't produce cars and have no cars to sell; that happened during the pandemic, for example.
I also wanted to go through some of the risk areas for business interruption, and third-party risk is the one I want to focus on a bit. 60% of organizations work with more than a thousand partners, and you can imagine: you have partners, partners have consultants, consultants have contractors, and so on down a long chain. 73%, more than two-thirds, have experienced a significant disruption caused by a third party; that's a pretty big number. And 27% of incidents involve a vendor. Then of course there are the supply chains. So this is an area
where I think you should focus, build some awareness and understanding, and make sure you know what kind of interruption you could experience. And then there's the AI stuff: the access, the plug-ins, the design, the data. That could be poisoned data, that could be how your model was trained, who knows; those kinds of things can cause interruption as well. Here are some of the large business interruption events, with CrowdStrike at roughly $400 million to $1.5 billion, not including lawsuits. And that one isn't done yet; it's still working its way through, now involving over a hundred companies. One of the things I want to
spend a little time on: one of the troubles with business interruption claims is that we don't always see the data, partly because it's so hard to make a claim. So I want to spend a little time on what's acceptable. The length of the claim and adjustment process is 12 to 18 months on average; we see a lot of those. The proof of loss is complicated, and everybody argues about their forensic accounting: I have mine, you have yours, how do we manage that? And then there's what's covered and what's not. If you have an environment that's breached and
you upgrade your computers and your software, how much of that counts as betterment? It's not replacement, it's betterment, so how much of it should be covered? On the left is coverage best practice that is widely accepted by the market. For proof of loss, you want to make sure that whatever proof of loss you put together consists of covered items. And get agreement ahead of time on the forensic accounting: you can have your own forensic accountant, the insurer will have theirs, and they sometimes just agree to compare notes. One
of the things we've seen is that the insured, the company, submits its forensic accounting and the insurer says, no, our forensic accounting doesn't agree with that. If you have a broker who can shepherd you through and get the forensic accountant approved ahead of time, that will save you a lot of time. It's also okay for you to ask: if I submit this, I want a response within 30 days, and whatever is agreed to, pay me now; don't wait until the very end. So, this is all standard.
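As a toy illustration of the arithmetic a proof-of-loss submission ultimately boils down to (every figure below is invented, and a real adjustment involves far more, which is exactly why the forensic accountants argue):

```python
def bi_loss_estimate(daily_revenue: float, daily_saved_expenses: float,
                     outage_days: float, extra_expenses: float = 0.0) -> float:
    """Toy business-interruption loss: income lost during the outage
    (revenue minus the expenses you avoided while down), plus any extra
    expenses paid to keep operating or to recover."""
    lost_income = (daily_revenue - daily_saved_expenses) * outage_days
    return lost_income + extra_expenses

# Invented figures at roughly the scale of the Delta example above:
# $100M/day revenue, $20M/day avoided costs, 5 days down, $25M recovery spend.
bi_loss_estimate(100e6, 20e6, 5, extra_expenses=25e6)  # 425,000,000.0
```

The fight in a real claim is over every one of those inputs: which expenses truly count as "saved," which recovery spend is a covered extra expense versus betterment.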
Okay, accepted by the market. The second area, which can make things faster, is simultaneous review; that's on the right side (I was just on the left). You can have your own forensic accountant and the insurer has theirs, and each one, given the same data set, calculates the loss, and you compare notes at the end. Or you can get a joint umpire: agree up front that so-and-so will be the umpire to reconcile the two. Do this up front and it will make the claim a lot faster; you could reduce your
12 to 18 months that way. Next, ransomware, and I'll start with the losses. Healthcare shows up in both places here, as does retail; those are some of the big ones that came up in 2024. If you look at the event count by year, from 2022 to 2023 it increased quite a lot, and quarter over quarter, 2023 Q2 jumped right up. So ransomware is still out there, even if percentage-wise it's smaller compared to other perils. In terms of total loss
year over year: 2022 was $1.1 billion, 2023 almost doubled that, and 2024 is 50% more than 2023, so it's jumping; 2024 had big ransomware losses. As for the percentage of victims that pay the ransom, a lot of companies ask: what are other people doing? Are they paying or not? So here's the graph of who's paying and who's not: from 88% paying in 2021, by 2024 it's only 18%. So it is getting better;
companies are not paying because their controls are getting better. This is demand versus payments, and you can see the demand amounts went up a lot in 2024; payments went up too. The first side is the demand, the second is the payment, by year. In the beginning, in 2019, the demands were just wild: they'd demand whatever. But they've gotten more precise toward the end, and the gap between the ransom demand and the payment has decreased. It's still pretty big, but not as big as it was. For payments, I took the 95th
percentile, so the most extreme losses aren't shown. Next, fraudulent funds transfer. I'll just mention this because it's a category that's up and coming: we do see jumps in this one, and it often occurs through business email compromise. Here are some of the large fraudulent funds transfers, the big one being about $600-some million at the top, from 2022. In 2023 the numbers seem smaller, but we still see plenty of them: $100 million, $120-some million, $100 million in 2023 and 2024.
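That 95th-percentile trimming is easy to reproduce; a nearest-rank sketch with invented payment amounts:

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: the smallest sample value at or below
    which p percent of the sorted sample falls."""
    s = sorted(values)
    k = max(0, math.ceil(p / 100 * len(s)) - 1)
    return s[k]

# Invented payment amounts (in $M); cap reporting at the 95th percentile
# so a handful of extreme losses don't dominate (or identify) the chart.
payments = list(range(1, 101))
percentile(payments, 95)  # 95
```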
So here's the fun one: how do you improve your odds, using what we've learned? On an annual basis we collect a few thousand insurance applications and self-assessments. The headline is still the same: for 2025 we still recommend the 12 cybersecurity controls. There's a paper coming out in a couple of weeks that drills down into which kind of multi-factor authentication you should use, and, for employee data protection, how much is good enough by implementation. I'll go through some of the details here.
Asset management matters. You need to understand where the data is and where your assets are. Is it in the cloud, a public cloud? How is data transferred, and what are the transfer restrictions? What laws does the data you received have to comply with? And on the asset side, is there any way to identify rogue assets? In terms of identity management, controlling privileged access is number one in return on investment, and not just on Windows but also in the cloud, in software-as-a-service applications, and in various other applications. Also:
process coverage and cloud-aware pen testing, meaning not just your on-prem environment but also things that happen in the cloud, plus validation when the help desk resets passwords, those kinds of things. I want to talk a little about MFA. Of the clients we surveyed, about 90 to 100% have MFA, which is a pretty good number. So which kind of MFA pays off? One of the things we studied is phishing-resistant MFA, and it's about 3x better if you have that implemented; biometrics and endpoint certificates also have the highest value, another
roughly 3x better. So how far you take your MFA implementation makes a huge difference. Of course EDR too: a 25% increase in EDR coverage correlates with a 2 to 3% decrease in breach likelihood, and about 75% coverage is where we see the saturation point. So the degree to which you implement certain things makes a difference. And then of course, have a plan: it's usually about $2.2 million less in breach response costs if you've deployed security AI and automation across your SOC. Plan for zero-
day software. Educate your staff. Integrate your insurance into your incident response and have that conversation ahead of time; on our website we actually have a list of vendors that have been vetted by us, approved vendors that would reduce your risk, so you can go look through those and see which ones the insurance broker endorses. Also, secure your AI, have very clear roles and responsibilities, and know where the data is and how it's used; we spent a lot of time on that, and the average breach cost for
data is $4.88 million. Also confirm your process, and of course run tabletop exercises. Optimize your resources: delete your unnecessary data, and weigh the value of the data against the risk of keeping it; make that a conscious decision. Assess impact against well-defined rules set up ahead of time, and classify your incidents by impact: is it an escalation, do I need to notify, do I need to involve legal, what documentation is required? Those are the kinds of things you
should have decided, including, if it's material, do I need to file an 8-K? Last but not least, back to reducing your data assets: automate deletion, limit the data provided to suppliers and make sure they have deletion requirements, keep an inventory of your assets, and classify your data. As new data comes in, classify it and apply privacy enhancement technologies to it; set it up right and you will mitigate your costs. Understand where your data is stored, whether on the LAN or in the cloud, public cloud, private cloud, software-as-a-service, and any sort of third party, both suppliers and any
third-party consumers of your data. Understand what they're doing, why, and how, and what laws you need to comply with in order to supply that data to those partners. I've got four minutes for questions. >> Yep.
>> First, thank you, that was a great talk; maybe a round of applause real quick. >> Thanks. Thank you. >> You mentioned a study coming out soon with the details about MFA. Where will that study come out? Where can we find it? >> Oh, wait, let's see, there's a QR code right there; you can scan it, I forgot about that one. It's coming out in a couple of weeks and you'll see all the details; it's public, we will publish it. There are a lot of papers there talking about what we learn from
our clients and from studies correlating the data. There's also a physical loss study coming out, on physical damage from cyber, in a few weeks as well. So go there, and if you want the presentation, you can get it there too; I'll send it to you as well. >> Maybe a more practical question: it looked like ransomware went up around 2019, 2020, 2021, with big losses, and then dropped back down. >> Yep. >> And now it looks like a very similar trend to that previous peak, with a doubling, tripling, quadrupling in losses, but we haven't seen the insurance market
respond with premiums; you showed a minus-3% rate. Do you have any idea why we're seeing those increases in losses but not a response from the industry? >> I think part of it could be, and this is my speculation now, that the attacks have become a lot more complicated, but our customers have also gotten a lot better at defending themselves, and the insurance companies have too. They have a lot more tools; some of those, like the privacy scanning tools, come with the risk management option, so given your premium you get some of those tools and
you don't have to pay; it's part of your package. So they are helping customers manage those risks, with a lot more of it integrated into the insurance and into the customer's environment. And there are a lot more, I wouldn't say restrictions, but they go a lot deeper and want to know more about your environment when you apply for insurance. If something doesn't look right: fix it before you come back. That's why there are some 300 questions you have to answer when you apply, including the whole list you saw, the 12 controls, plus a whole
bunch more. That list is the set of risk items we see as basically required, especially the top six; if you don't have those done, that's a problem. And that's why, when we say 90 to 100% of our customers have MFA, to what degree they have it is a different story. >> Yeah. Last question.
>> Thank you. Jumping back up to the trends and stats section of the presentation: I noticed you spoke a lot about the source of the incidents and who perpetrated them, on the attribution side. Is that data critical to your loss evaluations, or is it just a nice-to-have that you study? >> We haven't gotten to that part yet. And who attacked your environment changes all the time, so it's hard to track. >> It seemed like a lot of the incident data you were talking about, as far
as the trends, you were specifically attributing to various threat actors. Am I perceiving that incorrectly? >> They change quite a lot. A threat actor like Scattered Spider was one of them; that was this year, but before this year they weren't around. >> Gotcha. So it's less important to the covered loss and more just something you're tracking broadly. >> Yeah. We keep track of the bad actors, but often they disintegrate, split into something separate, and call themselves something else, so it's harder for us to track. >> How much asset management is enough? [Laughter] >> Repeat the question? Oh, okay: how much asset management is
enough? Oh, god. I think you should go look at the paper we have on that one. >> I'm sorry guys, we have to end the session; it's past time. I'm sure she will be free to answer questions outside. >> Thank you. [Applause]
>> If you have any questions... oh, right: also, data science meetup, 7 p.m. at the Tuscany pool. And I think that's it, right? Cool. With that, introducing Brennan. >> Awesome, thank you. Can you guys hear me all right in the back? Cool. I know we're like a minute early, but I think I'm holding you back from the happy hour, right? So hopefully I'll make this entertaining and quick. Awesome to be here; it's my first time at BSides Las Vegas, though I've done a few BSides in other places. I'm from New York, live in Manhattan, and, yeah, made
the mistake most Manhattanites make of trying to walk around Vegas. Come on, it's usually one road you walk on. And I'm realizing there are three other types of people who walk around Vegas: the drunks; the bums, who aren't really doing much walking; and the zombies like myself, who think and walk like this, brains fried by the time we get to the destination. But lo and behold, we're here at Hacker Summer Camp. Really excited to be here; I'll be here all week, so feel free to come up and talk to me afterwards. I've got
some fancy new stickers to share, fresh from the pack, so you guys are very lucky; they kind of smell new as well. But anyways: Rage Against the Machine. Let's rage on. A quick whois lookup: Brennan Lodge. I've got a few roles. First off, I've been in the financial industry, fortunately or unfortunately, for the last 18 years. I'm from Philly, live in Manhattan, and I've worked at a few of the banks: JPMorgan, the Federal Reserve Bank, Bloomberg, Goldman, and HSBC. I'm a professor at NYU, where I teach
IT, management, and analytics. With that teaching, I've gone down the rabbit hole of doing some research, using my students as lab rats for some of my AI work; no bad testing or anything like that, they willingly help. And I've won a few awards for the research using RAG, aka "rage," for cybersecurity. Anybody familiar with Kaggle, the data science competitions? Awesome. I won an ML use cases Kaggle competition at Oxford, and then there was US Cyber Command back in 2022, I think pre-ChatGPT, before AI was cool
and sexy, when the terms were machine learning, big data, and analytics (yeah, they're still around). I was using it back then for alert fatigue and finding needles in the haystack, so there were some really cool opportunities there. I've also got a LinkedIn Learning class. Anybody done LinkedIn Learning? Really good; get your company to pay for it. I've got a class on using RAG for cybersecurity use cases, and one coming out August 16th on MCP, the Model Context Protocol, which I'll get into as well. Let's do it. All views are my own and not my employer's. I do
the fractional CISO stuff, and I'm an employee at the Manhattan Institute, a really great place where I get to use this stuff to help protect our company. So what are we going to talk about? The good, the bad, and the ugly of AI and cybersecurity; hopefully it's not repetitive, since I'm sure almost every talk has sprinkled in AI. We'll get into "rage," or RAG, against the machine and some use cases in GRC and in the SOC. I got my start in the SOC. Anybody a SOC analyst? Yeah, it's tough, it's grunt work, right? We follow the sun. So I've been there, I've been in your shoes, and when I'm implementing and researching
this stuff, I think of junior analyst Brennan Lodge in the SOC, coming in loaded with caffeine and just trying to take the queue down to zero. I think we're all familiar with that; it's an endless goal, and that doesn't seem to be stopping. So with that in mind, how can we help those folks? I'll go through some of the data workflow architecture, and heck, roll your own, right? I'll give you all the open source tools I've used, and even the costs associated with them. That should be the takeaway: you should all be
retrieval-augmented generation experts by the end of this talk; if not, check out LinkedIn Learning. And that's the takeaway: bring it back, try it out, experiment. We can't sit on the sidelines anymore with AI; the stakes are too freaking high, with breach after breach after breach. Come on, let's do better. All right, so the good. We've got information overload; is that good? I don't know. We are facing a deluge of threats and data, there's a 24/7 need, we've got talent gaps, we've got burnout. AI is not the silver bullet, but it can help, and I'll get into the details of why it can help.
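As one tiny, concrete example of the kind of help even simple statistics can offer against that deluge (a toy baseline check, not any vendor's detection logic):

```python
import statistics

def is_anomalous(history, new_value, threshold=3.0):
    """Flag a new observation (e.g. today's alert count from one log
    source) if it sits more than `threshold` standard deviations from
    the historical baseline."""
    mu = statistics.mean(history)
    sigma = statistics.pstdev(history) or 1.0  # avoid division by zero
    return abs(new_value - mu) / sigma > threshold

# A normally quiet log source that suddenly spikes:
baseline = [10, 11, 9, 10, 12, 10]
is_anomalous(baseline, 11)   # ordinary day -> False
is_anomalous(baseline, 100)  # spike -> True
```

A scorer like this never replaces the analyst; it just decides which of ten thousand counters deserves a human look first, which is exactly the "guide, not guard" framing later in the talk.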
There's the rising attack volume, right? I don't have to tell you guys this. We've got complex integration challenges that are all over the place. So, AI to help; but if we've got AI and it works, don't touch it. Like this old network router here, collecting a bunch of dust but still with the blinky lights working. Still working, right? Do we clean it off? Do we touch it? Do we leave it in the closet? If it's still working, and here's new technology, what do we do? So, the bad. Let's not be like Salt Bae and sprinkle AI into everything. What are we going to do with AI? Is it
SOAR 2.0? Is it our copilot? Can we be more creative with it, with new detections? Can it answer annoying third-party risk reviews? Can it be the automation? But is it automation for the sake of automation? At the end of the day, the human is still in the loop. I'll repeat that: the human is still in the loop. It's not going to take away from that expertise. One thing I experienced as a CISO is not Shadow IT but Shadow AI. Anybody familiar with the annoying Read AI app? Yes? Okay, so I'm not the only one. Quick story: Read AI entered our Teams through somebody clicking and filling it out. It's
kind of addictive, it being a worm: if you sign up, it sends an email, hey, do you want to read this transcript? It'll summarize it and give you really good metrics, analytics, content; it'll tell you who's talking. Don't do it, because then it hooks into your browser extensions, it hooks into your Teams, and yeah, it's a mess. This is not the first, and it won't be the last, great business use case for an app spreading itself, but it's going to annoy a lot of CISOs and security folks. So just be aware of that; that's the bad of AI. But what about the other
thing: costs. This is Jensen Huang sprinkling in AI and tokens; in his big keynote at GTC, he mentioned tokens 85 times. We're all familiar with rising cloud costs and the lack of transparency. How am I supposed to make sense of cloud costs, and now they're layering AI on top? What is a token? The definition varies: it could be a phrase, it could be words, it could be whatever the AI model provider comes up with to charge you mysterious fees to use that model. So just be aware of that and push the vendors:
ask for transparency. What is this going to cost me? What are the expectations? Can you tell me how many tokens I'm using with your AI? Last but not least is the ugly. Does anybody remember Clippy? Why are we back to Clippy all over again with AI and chatbots? "Would you like help?" No. Go away. I can do this. All right, so the AI-and-the-CISO dilemma. I've been speaking to a lot of CISOs in the industry, and some of the complaints are: hey, we're scrambling to come up with an internal AI policy, we don't have one yet, and all of our
employees are using ChatGPT and we can't stop them. We can implement it, but how do we explain it? It's the mysterious black box again: a lack of transparency about how the AI actually works. A lot of the data scientists at OpenAI who are leading and building these models have no clue how it all works in the back end. Some idea, right? But how it understands our English language, how it's able to translate rap lyrics and write them back in email form, like, how? And if you can't understand it and it's making decisions, can you trust it? Okay. But in security we need a secure
AI solution, we need to cut down the time it takes to make those decisions, and we need it to be cheap. So my take here: AI as a guide, not necessarily a full-on guard. We need to understand it. These models analyze and generate text, but does that text make sense from a threat perspective? How many alphabet-soup acronyms are there in cybersecurity? It's endless, right? Can it understand them, translate them, and give us that context back? We need an ally, we need help, we need an assistant; not Clippy, but somewhere in between. And I think we need to be more
transparent with our tools and democratize tools. Huge fan of open source. Uh, but open source is really tough with LLMs right now. And we need cheaper, faster uh, products and rapid analysis. So, can we be data driven? I think you know measure what matters. Uh can we measure ROI for cyber security especially with AI? Yes, no, maybe so, right? Um and we need to integrate and scale. Um there's some really cool tools out there. Uh my friends at Deep Tempo have, you know, a really awesome product of integration at scale and it just gets out of the way, right? it runs in the the background in your your data lake and just does what it's really good at
doing, finding anomalies, finding things that you know your standard static rules can't find. And I think that is the the next step of hey, let it run in the background, get some transparency, get some understanding, get some context, and let it do its thing and have those be high fidelity. uh the first time it's low fidelity. It's, you know, really tough to to prove that that value. Um we're dealing with a crap ton of false positives in the in the sock. And how do we cut that down? And more and more vendors are giving us more and more alerts, right? Like that's the opposite of what we want. So, we need to get higher fidelity and get, you know, some
snipers in there. Those snipers can be AI models that can do anomaly detection on our big data, but they've got to have really good high accuracy rag. Okay. So, what is it? Retrieval augmented generation. Uh, the analogy I like to use is, you know, think of a library. You go up to the librarian. Um, yes, you should still go to the library and read books. They're free. Um, remember the simple system of a library, right? You've got like three components. It's very similar to how rag works. You have a sentence embedding model, a vector database, and a large language model, some AI, you know, hoopla jumbled words out there, but let me make it clear for you. So, your sentence
embedding model, Dewey decimal system. Who remembers that from elementary school? Nice. Cool. All right. translating your ISBN numbers or the authors or numbers like that number in the the back of the book to a genre to a specific section in your library that is over to the the vector database. So the ingest is hey take my query take my question let's translate that into some topical terms that can then be uh semantically searched within the vector database. Our vector database is that stack of books that stack the the layers, the genres, the sections uh that are hopefully organized in a you know easy um manner and understand place. And then we've got the LLM or our librarian.
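That embed-then-search step can be sketched in a few lines. This is a toy illustration, not the real thing: a hashed bag-of-words stands in for a proper sentence embedding model, and a plain dict stands in for the vector database; the section names and their texts are made up for the sketch.

```python
import math
import re
import zlib

DIM = 64  # toy embedding size; real sentence embeddings use hundreds of dims

def embed(text: str) -> list[float]:
    """Toy 'sentence embedding': hash each word into a fixed-size vector.
    A real pipeline would call a sentence embedding model here instead."""
    vec = [0.0] * DIM
    for word in re.findall(r"\w+", text.lower()):
        vec[zlib.crc32(word.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# The "stacks": each library section, embedded once at ingest time.
sections = {
    "threat intel": embed("CISA advisory vulnerability patch exploit threat"),
    "detections": embed("splunk sigma detection rule alert query log"),
    "policy": embed("policy control compliance audit SOC 2 framework"),
}

def librarian(question: str) -> str:
    """Dewey decimal in code: embed the question, return the closest section."""
    q = embed(question)
    return max(sections, key=lambda name: cosine(q, sections[name]))
```

Swap `embed` for a real sentence embedding model and `sections` for a vector database, and this is the skeleton of the whole pipeline.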
So your librarian takes that question, translates it with the Dewey decimal system (the sentence embedding model), finds the right book, the right page, the right paragraph, speeds down the aisle, and brings it back. English translation: "Is this what you need? Here's a quick blurb, and here's a link to where you can find it." That is all packaged up in a nice little gem for you through RAG.

But why should we use this? Well, we've got Kermit the Frog hallucinating here, like many static LLMs are doing these days. Don't get me wrong, they're getting better. But with RAG you can ingest new data from real-time feeds. With a static LLM, yes, they keep releasing different versions, different flavors, but once a model is released, it is a binary. It is a snapshot in time. It's not going to learn from any new future data, because the model was released and done. With RAG, you can keep ingesting that data, upserting it into your vector database, and it stays live and fresh. The LLM can incorporate that latest data and cite up-to-date sources. You with me so far? Cool.

And it can be private data. Private data that you do not have to share with OpenAI, even though they've scanned the internet and have many lawsuits against them. I tend to pick on OpenAI (I use it), but there are other models under the same legal cloud. You can keep it on-prem. You can use your own local LLM; there are really good open-source ones, and I'll get into some of those in a minute.

So, roll your own. How are we going to do this? We want to funnel in some of our data: MITRE ATT&CK, CISA advisories, policies, the endless amounts of internal private data that can remain internal and private. When I was building this, I set a benchmark.
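That ingest-and-upsert loop is what keeps a RAG store fresh, and it's worth making concrete. A minimal sketch with an in-memory stand-in for the vector database (real stores expose an upsert with the same insert-or-update semantics; the advisory feed and ids here are made up):

```python
# Toy vector store: id -> (document, embedding). A real vector database
# keys on ids the same way, so re-ingesting a feed updates in place.
store: dict[str, tuple[str, list[float]]] = {}

def toy_embed(text: str) -> list[float]:
    # Stand-in for a sentence embedding model.
    return [float(len(text)), float(text.count(" "))]

def upsert(doc_id: str, document: str) -> None:
    """Insert-or-update: the same id overwrites, so the store stays current."""
    store[doc_id] = (document, toy_embed(document))

# First ingest of a (hypothetical) advisory feed.
upsert("adv-001", "Advisory: patch pending for CVE-XXXX")
# Later feed run: same id, fresher text -> updated in place, not duplicated.
upsert("adv-001", "Advisory: patch released for CVE-XXXX")
```

The point is the id: re-running the feed refreshes the document instead of piling up stale duplicates, which is exactly what a static, snapshot-in-time LLM can't do.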
Okay, I wanted it to be relatively cheap, $500 or less for the processing and resources (and yes, this was two years ago now that I rolled this), and I wanted it to respond within 10 seconds. At the time I tested some AMD GPU models. Anybody tested AMD? No, nobody tests AMD anymore, because it's so hard to freaking use. And that's why Nvidia is Nvidia and the stock price is sky-high. If you're building any hardware, build something for developers: easy to use and, go figure, easy to integrate with Python. So that's what I went with, and here are the steps to do it.

To get to RAG, we've got to do the data ingestion. There are lots of open-source tools out there to parse, scrape, and ingest those PDFs and Word documents. And there's this concept called chunking, where you've got to do a lot of trial and error. I said, hey, let's chunk by sentence. Let's chunk by paragraph. Let's chunk by page. To get good efficacy, you've got to figure out what works best for you. If you've got a couple of terabytes of data, it's probably best by page, or maybe a couple of pages. If you're working with a small amount of data, maybe by the sentence. Mileage is going to vary. But you've got to do that splitting, and then create the embeddings. The embeddings are there to create similar vector representations of the words and terms you're using; again, you use a sentence embedding model for that. And then LangChain is the glue that does the orchestration between the vector database, your sentence embedding model, and the chunking, making sense of the question you ask and bringing the answer back, all through LangChain. And then you ask a question
and then you've got an LLM. I highly recommend you start to visualize your vectors, what's actually in the vector database. Again, this gives you transparency into what data you're on, and you can see the sections. You can also search in there. So if you're getting bad results out of your RAG, just take out some of the documents that are throwing it off, the outliers. And on the right side are some of the open-source tools: Hugging Face, the GitHub of open-source LLMs, and the sentence embedding model. I'm using all-MiniLM-L12-v2. I'm still using it; it still seems good. And check out Nomic. I've got no association with them; I just think they're a really cool company about transparency. They've got a tool called Atlas where they visualize models, and they keep an inventory of models that they continue to update, so you can search for models by the subject you want to use them for. It can be a little tricky, though. When I was first experimenting, I wanted to build phishing detection using RAG, so I searched Nomic for "phish", and the band Phish came back a couple of times. So watch out for that. Then I got some spam examples, and then I went down the rabbit hole: hey, does it have cybersecurity-related things? I'm sure that, if there aren't already, there are going to be LLMs on Hugging Face built for our particular domains, whether it's GRC or cybersecurity. They're probably already there, but more to come on that. Oh, and then Chroma. ChromaDB is the vector database that I use. Free, really easy to use, really good Python support.

This is the workflow. Starting with the data sources and the chunking: I was stripping CISA advisories out of Gmail, and I got a really nice CSV of the MITRE ATT&CK techniques. Chunk them, upsert them into the vector database, and get your embeddings right with the sentence embedding model; I talked about LangChain. Then moving over to the LLM services: doing the search, putting it on an endpoint, and then having a little Slackbot. That's where I initially tested it out. Then I built a UI, and lo and behold, got some RAG.

So this is a visualization, and I've got two here, of the document embeddings with the different types of clusters. I think this one is a little better; we'll play it here. On the left-hand side I've got some open-source detections from Sigma, a couple thousand of them. And on the right-hand side are my threat intel advisories. It's cool because you can see, row by row, what's in there. And then, given the searches you do and the results that come out, it'll make a reference, and it puts them into certain topics. So on the right-hand side I've got my threat intel, and you can go through that. It's a really good way to
provide the transparency that I think is lacking in the AI space.

On to security: security of AI and RAG. Why do it? Our data is on its way out to OpenAI with the API services, and that's not good. Let's put it in a container instead. Keep it on-prem, in a VPC, and it stays locked up and safe in there. Again, mileage is going to vary: if you're going on-prem you've got to pay up for the resources, the GPUs, depending on how many queries you're doing.

And this is how I benchmarked it. I think it's about average for what I've seen with RAG in cybersecurity use cases. I used my lab rats, a.k.a. my students, to test this out. So around five analysts, if you will, asking 20 queries per day. Again, a token is like a phrase, a short phrase, or words. Do the math, and we're getting about 100,000 tokens each day for these questions. If you do this in a VPC, the cheapest option at the time was a g4dn.4xlarge GPU instance. It met the benchmark of responses within five seconds, and it's around $500 a month. If we do it with ChatGPT instead, making those OpenAI API queries at 100,000 tokens a day, it was about $100 more at the time (I think the prices have come down since, especially for the early models), and our data was leaving the network, so we're giving up some of the privacy there as well.

Has anybody read the book The Phoenix Project? Yeah? Nice. All right. Cool. I feel like we're at the Phoenix Project again with AI and the evolution of integration. I followed along with the CISO. Does anybody remember the CISO in the
Phoenix Project? Tons of anxiety, under a ton of stress, and also a drunk, right? Kind of checks out for a lot of CISOs these days. So what do we do? How do we integrate and help the business? Let's go back to our old-school methods of implementing projects with some cybersecurity. Or is this new? Is it earth-shattering? Do the old methods (user authentication, guardrails, input encoding, encryption) still hold up? Well, I think so. It is a new vector, if you will, but a lot of the old-school methods can be applied to AI. So let me tell you how.

User input: we had SQL injection; now we're working with prompt injection. Sanitizing the user input is very similar, so let's do the same freaking thing: check whether the input is trying to trick the AI model. Who's heard about the grandmother trick, talking to the AI as your grandmother to get it to give up, what was it, the Windows keys? Brilliant. Look, the methods of attack are going to be endless, but go figure: all it takes is one little hole in the network for attackers these days. We need to start thinking along those same lines.

So let's put in those guardrails. Let's encrypt our database in transit and at rest; that's our vector database. Let's put in output guardrails. Let's log freaking everything, and send events to the SOC when something violates the rules. Those are the guardrails, in red here. Really cool stuff from Matt at Galileo; I think they're still in the AI space, but you get the picture. And then the subjective part: how do we evaluate the results from LLM models? It's next to impossible, because it's very subjective. There is a method called HyDE, hypothetical document embeddings, and yes, it's freaking hypothetical, but it's another LLM that tests against the documents and against the queries to either validate or invalidate the answers you're getting back. So there are some methods out there.

All right, who came here for MCP, Model Context Protocol? All right, you're in luck. This is how I think of it: a USB-C port, like the one here on my MacBook. It is the USB-C port for AI. It connects the models on the host, through a client (your MCP "port"), to your servers: your Slack, your Gmail, your calendar, your local data sources, making sense out of it
all. And running through the wires is the AI model. So, agentic? Yes. A protocol? We need one, to make these communications happen and to log it all with a standard: one port, endless tools, and zero spaghetti code. Under the hood it is JSON-RPC 2.0 over standard I/O, in a nice little JSON format to structure it, and it's pretty good. The architecture, as I mentioned, is the host, the client, and the server. Think of it like this: the host is me asking a question, "Show me the current weather." The client knows that I'm in New York City, and the server has the API, the feed, and the data to make that connection happen and route the answer back to the host.

All right, who's ready for some demos? This is my open-source project. It's called Arsenal Forge, and it uses MCP and RAG. Why did I do this? Not to be the next SOAR, but for orchestration, and for making sense of some of the alerts with some transparency. So I embedded, similar to what you saw for RAG, MITRE, a detection inventory, and some advisories, the threat intel, the typical SOC data we're dealing with. And the solution is: I want something fast, I want something smart, something that's going to help me make more informed decisions without the crazy price tag of some SOAR 2.0 product that's going to be really expensive, really slow, and take me a year to freaking implement and integrate, because I've got 1,500 Splunk detections, I've got two terabytes' worth of threat intel, and I want to make sense of MITRE and inform my analysts, all at the same time.

So this is the data workflow, taking those data sources through the ETL: extract, transform, load. Behind the scenes, a bit of secret sauce: I use a schema we're all familiar with, which is email. Within email we have a subject, we have a body, we have a source (usually an IP address), and we have a timestamp. Using that as your schema for how you parse and collect your data goes really far, almost as far as the prompt engineering, so I'm just giving you the heads-up for when you're building this on your own.

An example here: we've got this lengthy Splunk detection rule, and we want to know which MITRE ATT&CK technique is associated with it. Many times we're doing manual labeling of these Splunk alerts, usually a junior analyst or an engineer. So,
are they right or are they wrong? And don't get me wrong, there are multiple techniques with some overlap, where it may be one or the other, but this is at least going to be a best effort. It goes through the library trifecta I mentioned for RAG and outputs a mapping to the technique. In this case it was Remote Desktop Protocol, I'm sure because it picked up on some of the network indicators here. Oh, and RDP: we never actually spelled out "remote desktop protocol", we used the acronym, and it picked up on the acronym. Good job, RAG. And it's pretty cool: it tells us which technique it is and then creates the link. So why is this good? Well, this is the end result a cybersecurity analyst should see, with the ability to dive a little deeper, get some more context, and maybe even get a playbook to help them do a good job.

So, let's play this. All right. This is what's going on behind the scenes to set up your MCP. I upload the data to ChromaDB; this is all in an open-source project, and I'll share the link at the end. Let's get those scripts, upload the MITRE techniques, upload the CISA advisories and the detection inventory (come on, come on), and get them up and running on a server. So now we've got our data complete, and we want to start some MCP servers. We start a memory server. Why a memory server? It's going to log all of the queries, everything the Arsenal Forge MCP tool is doing, to give some transparency, and it's also going to log those conversations and, hopefully, learn what the analyst is asking and provide better context. We fire up a Streamlit app, and welcome to Arsenal Forge. We're taking a look at the MOVEit vulnerability detection; that's what came up. It first goes through OpenAI and gives a little runbook of what the MOVEit vulnerability is. We've got the MITRE ATT&CK technique, which is some scanning, with MOVEit. We've also got our detection inventory and a link to that particular detection for additional details. So this could be your back-end database of all your logic, with a description here to help the analyst. They want to dive deeper? Great. They want a quick summary? We've got that within Arsenal Forge. Let's check out the threat intel related to MOVEit: we've got a CISA advisory, Progress Software releases a patch for the MOVEit Transfer vulnerability. And then, last but not least, the memory: the full result, the timestamp, and the text that came through with it. Thank you.
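Since MCP came up: under the hood, each of those host-to-server calls is a JSON-RPC 2.0 message, as mentioned earlier. Here's a sketch of a request/response pair for the weather example; `tools/call` is the MCP method for invoking a tool, but the tool name and arguments (`get_weather`, the city) are made up for illustration.

```python
import json

# A JSON-RPC 2.0 request, as an MCP client would send to a server.
# The tool name and arguments here are hypothetical.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"city": "New York City"}},
}

# The server's JSON-RPC 2.0 response: same id, and either "result" or "error".
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": "72F and sunny"}]},
}

# Over the stdio transport, each message is serialized as one line of JSON.
wire = json.dumps(request)
```

The `id` field is what pairs a response back to its request, which is also what makes the "log everything" memory-server idea easy: every call and its answer share an id.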
>> All right, let's pause that. Go to the next slide. Shake and bake. Oh yeah. We've got some security detections, we've got MITRE, we've got CISA advisories, and it's doing the automatic mapping, all from MCP.

Okay, how about some more cybersecurity use cases? Governance, risk, and compliance. Ah, cringe, right? Sorry, we'll go fast, but when I think of GRC, it's the ugly baby that is really hard to talk about with the parents. We've all got some skeletons in the closet: lack of policies, lack of controls. And as a fractional CISO, this is one of the most annoying tasks that I have. God bless the souls doing GRC full-time; it's a tough job.

And, data nerd that I am, I wanted to dig into it: all right, is this getting worse? We hear from the current administration, "We're going to get rid of all the laws and regulations, and the companies are going to do business themselves, and we're not going to get in their way." That's not what's going on. We see a huge rise in laws, and when we break this down by the laws we've got to abide by in our industry, they're on the rise. Lo and behold, AI bot rules are up and up, the privacy laws within the states are staying flat, and the cybersecurity laws per state are going up. So how do we stay abreast of these laws? Do they impact us, and what do they mean? Here are the numbers: they're on the rise, and around the world, too.

I worked at the big banks, and this was one of the first use cases I used RAG for. We were in 100-plus countries, and our head of GRC could not figure out what we needed to do when we were going into business and banking in, say, Iceland. What do we have covered? "Here, Brendan, here's the Iceland law. Does it map?" Well, then we've got to hire the lawyers, who are super expensive, to do that. And just some really good stats from the UN: 158 countries have e-transaction laws, 156 have cybercrime laws, and consumer protection is on the up and up. And I thought the US was kind of lax on this. Not the case.

So how do we make sense of it? We talked about RAG and the vector database. There are some other tools for classification. I used a foundation model called BERT. Anybody familiar with BERT? Yeah. It's like the sentence completion on our texts and our email, you know,
figuring out the next word. And go figure, that turns out to be really good at classification and legal kinds of things. So when it comes to that gap analysis, why not get AI to do it for us? I mean, yes, we still need the human in the loop. We still need that approval, and we still need to understand the spirit of the law and the regulation. But why not get some visibility into it? That visibility, that transparency, is key here: a percentage match. So you can use BERT, use classification, upload your policies, and ask: how far along am I with SOC 2? How far along am I with ISO or CMMC?

I've got an open-source project on GitHub with these starter packs: 50-plus policies that served me well in getting to SOC 2. Hopefully they can help you out as well. And then there's a tool we're releasing called Audit Caddy that can do all this for us. What I'm calling compliance notebooks do that comparison, and they'll tell me which policies I either have or need to write. But what if we don't have a SOC 2 or a framework, like with a new regulation? Fundamentally, a lot of the regulations being written have something related to SOC 2, or have some domains or controls we need to focus on. So putting it in a nice tabular format, similar to SOC 2, can be done, and we can still do that comparison and policy mapping.

So, the open-source project: check it out on GitHub. It's called Open Audit Caddy. I've got those policy templates, the starter packs, a bunch of different state regulations and frameworks, everything from FERPA to CMMC, and then the classification models. So I'm starting to build foundational BERT models focused just on getting you to SOC 2. Let's not use the crazy OpenAI models trained on all of the internet; how about a small language model that can help us focus and keep on task with the specific things we need to stay on task with? And if you ask it why the sky is blue, it's going to blurt out something stupid, because you shouldn't be asking it that question. "What is the control domain for my code of conduct policy?" is what I should be asking. So let's use this type of framework. So, let's check it out.
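The shape of that classification step can be sketched in a toy form. Audit Caddy uses a fine-tuned BERT model; here a keyword-scoring stand-in shows the same idea, mapping a policy document to a control domain with a confidence score. The domain names and keyword lists are illustrative, not the actual SOC 2 criteria text.

```python
import re

# Stand-in for a fine-tuned BERT classifier: score a policy document
# against a few (illustrative) control domains by keyword overlap.
DOMAIN_KEYWORDS = {
    "Control Environment": {"conduct", "ethics", "board", "responsibility"},
    "Access Management": {"password", "access", "identity", "mfa"},
    "Risk Assessment": {"risk", "threat", "assessment", "likelihood"},
}

def classify(policy_text: str) -> tuple[str, float]:
    """Return (best-matching domain, rough confidence in [0, 1])."""
    words = set(re.findall(r"\w+", policy_text.lower()))
    scores = {d: len(words & kws) for d, kws in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    total = sum(scores.values()) or 1
    return best, scores[best] / total

domain, conf = classify(
    "Our code of conduct policy sets ethics standards and board responsibility."
)
```

A real model replaces the keyword overlap with learned semantics, but the output contract is the same: a domain plus a confidence you can surface to the auditor for transparency.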
Come on, come on. I'm seeing what you see. Here's the GitHub page for Audit Caddy. Highly recommend you check it out. I had a friend working on a startup, trying to sell to schools; he needed to get through the regulations from the Department of Education in New York. I sent him here, and he just loved it, because he could go through the templates, use those, fill in his company name, and get up and running and certified.

This is the tool. The dashboard will tell you how far along you are. I mentioned you upload those policies and point it at SOC 2. We ask a quick survey, and that gives the AI model some context about what your company is, to help frame and customize things: maybe, hey, what state you're in and which regulations you need to abide by. And then these are the snap-and-play compliance notebooks that break out each of the domains, and in each domain we can ask specific questions about a particular control. So in this case we've got CC1.1, and we've also got our files in here. Again, transparency is key. In this case we've got an information security management policy that was automatically classified: hey, with high confidence, it's probably best suited to this domain, risk and identification management. We've got all the actions here and when it was uploaded. And yes, your auditor, who should be happy to use this, can log in, and you won't be annoyed by them pestering you with questions. Like my friend Joe in the back, who's an auditor, and did an audit for me. Yeah, he was annoying, but he said, "Brendan, let's do this with AI," and here we are. Also, we can export this in a nice little report, and we can export JSON downstream. And this is what that report looks like. It's English. It's an export. It explains the domains for me, and it helps me with the findings. It's also friendly for the auditor: clean, clear, and there's the assessment. Hey, I'm incomplete, I'm red on a lot of these things, but I've got some confidence from the documents and the evidence I've uploaded. So hopefully this is no longer the ugly baby we have to deal with. It helps the business, it helps you understand what risk you have, it helps you get that stamp of approval from the auditor, and hopefully it unlocks some business deals for you. You can even take the compliance part away and just have it help you out and save a lot of time and a lot of money.
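Tying back to the cost point: the token math from the earlier RAG benchmark can be written down explicitly. The per-query token count below is my assumption to make the daily total from the talk come out; the dollar figures are the circa-two-years-ago ones quoted earlier, so treat this as the method, not the current answer.

```python
# Back-of-the-envelope token math from the earlier benchmark slide.
analysts = 5
queries_per_analyst_per_day = 20
tokens_per_query = 1000          # assumption: rough average per Q&A round trip

tokens_per_day = analysts * queries_per_analyst_per_day * tokens_per_query

vpc_cost_per_month = 500         # g4dn.4xlarge in a VPC, as quoted
api_cost_per_month = 600         # "about $100 more" via the API at the time

print(f"{tokens_per_day} tokens/day; "
      f"VPC ${vpc_cost_per_month}/mo vs API ${api_cost_per_month}/mo")
```

The point of doing this math yourself is the same point as pushing vendors on token pricing: you can't compare on-prem versus API costs until you know your daily token volume.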
All right, let's wrap up here. Beer soon. You guys ready? Who's getting thirsty? I am. So, last but not least: if you're not first, you're last in AI. You cannot sit on the sidelines anymore. You cannot wait for the newest model to come out. You've got to test this out, for the sake of your company, for the sake of security, for the sake of our industry. We need to use it. Mileage is going to vary. Cost is going to vary. Push the vendors for compliance, push them for transparency, and test this out. There are use cases in cybersecurity defense, and we saw them in GRC. It doesn't have to be the ugly baby anymore, and we can use it to help.

All right, last but not least, some links. Scan my QR code here. You guys are the first to know about Audit Caddy, so welcome to the community. Feel free to reach out, check out LinkedIn, and come grab some stickers. Thank you.

Oh yeah, questions. Who's got them?
>> Are you going to walk around?
>> Yeah.
>> Okay.
>> You'll have to use your mic.
>> Yeah, I can do that.
>> Mic.
>> Testing. Yep.
>> Over here.
>> Okay. So, I have a few questions, I guess. Was this done on, like, NYU's compute and all that?
>> Yeah, so I used a sandbox environment in Amazon, just an EC2 instance. I pay for it, and I did all the testing, and my students were able to link up and use the RAG there.
>> Yeah. So, I guess adding on to that, I don't know if you guys can do local LLMs and stuff, but I know Dartmouth has an AI lab. So could you alleviate cost, and also add some extra reassurance for your team, by running your own LLM, like the new open-source GPT models that OpenAI just dropped today?
>> Yeah, yeah. Check those out. Put them in a container and run them locally if you've got a fast machine and a GPU. All power to you. There are some freaking LLMs out there that can run on CPUs. So experiment, check them out, shut your internet off, and see how it goes.
>> I just got that idea because OpenAI literally just released them this morning, so I have to go check it out myself. And then, I guess, one more question, more in general: does this take any substance away from those getting into GRC, or does it just augment junior associates with, like, power?
>> Yeah, I think it's a tide that raises all boats. We need help. It's a good selling point; I'm not going down that route with CISOs and my products. It's just a helper: faster, cheaper ways to get your freaking certification. But also learn, right? And I can't force that. Like, I'm a teacher. I'm like, hey, you need to educate yourself on what this control domain is. And the auditors are going to audit whatever policies you write; you're liable for putting those controls in place. So the GRC field, man, it's a weird vicious circle, but they need some help.
>> Okay, thank you so much. I'm trying to learn GRC, and it's what I've got at Dartmouth, so I'm trying to figure my way out.
>> Cool. Next question.
>> Thank you. Yeah. So, if you're a SOC manager, what would be your top best-return-on-value SOC use case for this?
>> Yeah. I think the threat intel enrichment, what you got with Arsenal Forge. Start small, though. Start within your own container. Don't try to do it all; try to solve one quick use case. MITRE, right? Just upsert some data into a vector database, hook up a RAG, and see if it can automagically tag your security detections, your logic. Start there and then build. So get your threat intel, hook it up to Slack, inform your analysts, and really, the possibilities are endless. Now, when it comes to agentic stuff: stay away. Don't automagically press the easy button and have it do everything. I mean, maybe test out some of the email, spam, and phishing use cases, for awareness, but start small first.
>> Hey guys, this is the last question.
>> Okay, my last question. So, that slide, the roll-your-own slide: did any of the stuff you were talking about (I missed it) have to do with building and training your own models, or was it just using models plus RAG?
>> Yeah. So, you can build; there are some really good foundation models out there, and they're always changing. Just try the off-the-shelf stuff first, and then augment: use a foundation model, and then train. And yeah, it can pick up on learning. I get a lot of value from the prompt engineering, putting in some of the custom acronyms your company may use so it picks up on those, and then using the history functionality so it learns as it goes along. Another downstream thing you can do is reinforcement learning from human feedback; that's the thumbs up, thumbs down. But hey, that's all manual effort, and humans are gonna human, so take it for what it is. I think that's the last question. But yeah, feel free to come grab some stickers, ask questions, and thanks again, guys.
Um, so I'm going to introduce you and I'll let you go off from there. >> Cool. Sounds good. >> Is this working? Yes. I'll take that microphone from you. I'll just stay on the side. Test, test, test, test, test. All right. Hi everybody. My name is Noey. I'm the room host in here. Um, good afternoon and welcome to BSides Las Vegas. Um, this is uh Proving Ground, right? No >> Ground Truth. >> Ground Truth. Ah, um, the talk is >> Ariana. >> Yep. Ariana, and Predicting the Lifespan of Internet Services, Falling Down the ML Rabbit Hole. Um, and this is given by Ariana Mirian. A few announcements before we begin. Uh, want to thank our
sponsors. Um, diamond sponsors Adobe and Akido, and our gold sponsors Formal and runZero. Um, this is supported, along with our other sponsors, donors, and volunteers, who make our event possible. Um, the talks are being streamed live, and as a courtesy to our speaker and our audience, we want you to make sure that your cell phones are down and silent. Um, if you have any questions, I'm going to be using this microphone and you will speak into it so our audience on YouTube can hear you. And uh, yeah, final announcement, um, data science meetup at 7 p.m., and it's going to be by the pool by the entrance. All right, and with that, I'll leave the
floor to you. Thank you so much. Thanks for being here, everyone. I appreciate it. Okay, this has got to get closer. Um, hi everyone. Thanks for coming. I know it's 5:00 PM on Tuesday. We've all got long weeks ahead of us. I appreciate you all being here in person and online. My name is Ariana and today I'm going to talk about predicting the lifespan of internet services. Uh, falling down the ML rabbit hole and what we learned from the thud. Um, just a quick disclaimer. If someone, maybe Christian, could take a photo of me giving the talk at some point when it looks good, that would be great. Uh, because everyone is busy and I need
photo proof, for proof of life. Thank you, sir, for that photo. Okay, great. Before I jump into this really long title and what it actually means, I'm just going to go over a little bit about who am I. Like I said, my name is Ariana. I'm currently working as a senior security researcher at Censys. Um, my job is kind of twofold. First, I focus on data quality. So, how do we have the absolute best data so that you folks doing security investigations also have the absolute best data? Um, but I also combine the domains of internet measurement and security, or you can think about this as data science and security, to answer interesting research questions about the
world and the internet. Um, before this I was at the University of California, San Diego. My PhD was, unsurprisingly, in driving security decisions via internet measurement. I'm also really into birds. So, if you like internet measurement, security, or birds, come talk to me. I am really into all three of those things. Ask me about them later. Um, great. So I'm super excited to be here today, specifically in Ground Truth, to talk about something really near and dear to my heart. Um, often when we talk about research you are hearing about the end result, right? You read a blog, you see a paper, you read about the methods that were successful. Today I am giving partly a failure talk.
So I want to talk about what happened in between the start and the end of our research project. What went wrong? How we reconfigured our solutions, how we reconfigured our thinking in order to get to a correct solution. Um, and also specifically talk about, you know, what happens when you join two subjects that require a lot of domain expertise, cough cough, security and ML, together. Um, and what can go wrong and also what can go right. And so as you probably gleaned, I am approaching this from the perspective of a security researcher, right? Like I have been doing security for 10 years. I'm pretty proficient in networking as well, but this was the first time I was
kind of dipping my toes into the ML space with the help of some ML scientist colleagues. So, I wasn't doing this alone. Um, but it was a really eye-opening experience and, you know, we don't often get to talk about the outcomes and the methods, and so I'm super excited to be at Ground Truth to talk about both of those things. Um, and so every time I talk about methodology, I actually put a little light bulb on the top of the slide to just indicate, like, this is a methodology point, and I'll also call it out um as much as I can. So today we're going to talk about it all and specifically I'm going to
just go over a couple points in the next 40 minutes. Um, I'm going to give you a quick background about the internet and why internet-wide scanning matters. Uh, talk about the various iterations of the problem statement and how it changed from problem V1 to problem V2. Talk about the various solutions that we tried. Um, and then finally what we learned, both for the project and for the future. So I'm going to really take you, you know, through us falling down the rabbit hole, which was, okay, trying to find an answer to this project, but also the thud at the end, which was us kind of looking back up at the process and being
like, wow, that was kind of painful. Uh, what can we learn from that? What can we share with the world about what we learned from that? And that's why the title is so intricate, because it's been a really good learning experience. So, a little bit of background. Um, what is the internet and why does internet-wide scanning matter? I always put the internet in quotes because it is fake. It is just a strange entity that rules a large portion of my and others' lives. Um, but jokes aside, you know, the internet is a very fraught and dangerous place. There's a lot of bad actors out there.
And just to make sure we're all on the same page, when I say the internet, I think of the collection of hosts that make up this networking entity, right? And when I say a host, I think of servers or network devices that are typically comprised of an IP, port, protocol, and autonomous system. Um, and for those that aren't super familiar with networking, an autonomous system is like the routing policy of the specific device. So, like, who controls how it gets routed to the rest of the internet. Um, the hosts can tell us really interesting information about themselves. Not only are the port and protocol that they are using to speak with the rest of the
internet inherently interesting, um, but we can also collect other interesting metadata, like what vulnerabilities are they purporting? Um, what software or hardware version are they? And internet-wide scanning is a way that we can get all of that data in one place. And so if you look at this graphic on the left, um, right, we have our set of devices, and then a shop like Censys, which is where I currently work, um, will go out and scan all of these devices to collect as much public data about those devices as possible, um, again to enable research about various vulnerabilities, um, various devices, software, hardware, products, etc. And the caveat here is that the more accurate your data is, the
more accurate your insights are going to be, right? If you have scan data that is a week or a month old, your insights into that data are going to be a week or a month old. Um, but if you have really, really accurate data that is super up-to-date, your insights will also be correspondingly up-to-date. And one thing that I have realized in my tenure um with internet-wide scanning is that many hosts are ephemeral. So imagine, like, a faulty Christmas light that's blinking in and out. There's a lot of hosts on the internet where we scan them and they are responsive and they're like, "Hey, here's who I am and
here's what I talk and here's all the vulnerabilities I have." And then we scan them a couple hours later and they are just gone, or they appear to be gone, and then like four hours later they're back again. And I'm like, "Come on, just make up your mind. This would make my life a lot easier." Um, these really ephemeral hosts complicate this accurate picture that we are trying to depict. And so the first version of the problem is that we want an accurate representation of the internet at any given point, but we have these highly ephemeral hosts that make up a minority of the internet that are kind of mucking up the data. And so
we're like, well, okay, what do we do about this? Well, you know, I'm a scientist by training, so I'm like, let's go run an experiment. And so in doing so, we set out to quantify the ephemerality of the internet: we scanned a representative set of services every hour for a week. And if you're like, "Hey, Ariana, this is sounding really familiar," um, that's because this is the talk I gave last year in Ground Truth um at BSides Las Vegas, um, where we scanned the internet every hour and showed a bunch of, you know, interesting graphs and behaviors um and examined the internet at these different service level attributes. So
again, port, protocol, autonomous system. And I'm not going to go over that talk again, but the biggest takeaway from that descriptive analysis and from the talk about a year ago is that the internet is highly ephemeral. Um, it's a minority of services, and also the internet is not uniform in its ephemerality. While we might find that, like, a portion of hosts that are over port 443 are super ephemeral, that can vary widely depending on the autonomous system. That can vary widely depending on the protocol you're looking at. And so we were doing all this descriptive analysis to try and glean, you know, insights into the ephemeral internet. And the biggest thing we could
take away is like, wow, this is a mess. This is really noisy. Um, and so, you know, I presented this work. We continued to do some of the research, and then at some point we were like, man, the outcomes are not clear. We are having a difficult time figuring out the takeaways for us to use internally, for us to go publicize. Um, and so at this point, and this is the little methodology light bulb, we're here. This is, you know, failure step number one. Um, it's taking a step back, right? It's like, okay, the data is messy. We're having a hard time with, you know, simple descriptive analyses, trying to figure
out what are the takeaways and outcomes for us. And so we took a step back and said, what are our end goals with this data set with this project? And you know, we came up with two very specific end goals. One is to come up with a method um that allows us to identify which services should be scanned at a faster way rate that takes into account this noisiness. Right? If one autonomous system has mail servers that should be scanned at a faster rate, I don't want to scan all the mail servers at a faster rate. I don't want to scan that entire autonomous system at a faster rate. It is a specific subportion of a subpopul
of the internet. So, we wanted a method that would more easily help us identify these different small portions of the internet that we should be scanning faster. We also wanted an understanding of what affects internet service lifespans. In other words, we wanted to um to to not deduce, we wanted to figure out the feature importance of what causes a service to be ephemeral. Is it just its port? Is it its protocol? Is it a combination of those three? And this feature importance is more of an underlying understanding of the behavior of these ephemeral hosts. But it allows us as an internetwide scanning um entity to go and then figure out okay does this change our internal scanning policies.
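The first end goal described here, scan different slices of the internet at different rates based on how ephemeral they are, can be sketched in a few lines. This is a toy illustration, not Censys's actual scanning logic; the thresholds and intervals are hypothetical.

```python
# Toy sketch: map a predicted service lifespan to a rescan interval.
# The thresholds below are hypothetical, not a real scanning policy.

def scan_interval_hours(predicted_lifespan_hours: float) -> int:
    """Shorter-lived services get rescanned more often."""
    if predicted_lifespan_hours < 12:
        return 6    # highly ephemeral: rescan quickly
    if predicted_lifespan_hours < 24:
        return 12
    return 24       # stable hosts keep a daily cadence

# Two hosts with different predicted lifespans land in different
# scanning buckets, instead of everything sharing one cadence.
print(scan_interval_hours(8))    # 6
print(scan_interval_hours(100))  # 24
```

The point of the sketch is that the decision is per service, not per port or per autonomous system, which is exactly the granularity the talk is asking for.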
Do we uh alter the way we think about ephemerality and how we scan internet services based on the model's outcome? So we took a step back, made sure we were all on the same page with our end goals, and we're like, okay, well, let's reframe the problem. Because the original problem, which is like, we want an accurate picture of the internet, is great. That's like our guiding light, but that's not actually helping us solve the data problem. And so the problem then evolved into: how can we predict the lifespan of ephemeral services? And then it was at this point that ML entered the room. It is so hot right now. It just
felt very timely given all the conversations of ML. Um, and I should say that, you know, uh, I was very hesitant to dip my feet into the uh domain that is machine learning, because, right, like I said, I am a security expert. I am a networking person, and I was like, trying to figure out how to marry two different domains is going to be a challenge and a journey in and of itself. And that is why we're here, because it was in fact a journey. Um, so many of you may be asking, okay, but like, why prediction? Why didn't you reframe, you know, your V2 of the question to be something different? Um,
prediction allows us to have smarter and more accurate scanning. So, just to really concretize this, if we can say host A is going to change its state in 8 hours and host B is going to change its state in 12 hours, those need to go into different scanning buckets that are scanned at different rates. Again, that helps us get more accurate data. So this is where prediction is really useful. Um, but also, you know, my background is also more in statistics. I actually tried a lot of statistical models. Uh, and that is just a completely different talk for another day. That's like a whole other 45 minutes. Um, the
issue with a lot of the statistical models is that they had really strict assumptions that internet-wide scanning data breaks. Um, and while they show relationships between variables, it was much harder to glean that feature importance, which was, you know: what are the service level attributes that help predict ephemerality? And then at the end of the day, we wanted both prediction and feature importance, right? Like I didn't want to be running a ton of different models and then trying to pull it all together. Um, spoiler, I did actually try that. It really didn't work well. And so we're like, okay, let's dip our toes into the ML space, because it seems like there are some machine learning models
that can help us get both prediction and relational information, um, which are going to solve both of our end goals. And so this is an explanation for why we wanted to use prediction. And then the next step was like, okay, great. Well, which model are we going to use? Because there's a lot, right? It's not just like, ah, ML, throw it in. It's like, oh, this is an entire field where people have spent years training and understanding these models and what's going on under the hood, much like me and my colleagues have spent years understanding the internet and security and whatnot. And so, um, again, we're like, okay, well, what do we
have and what are we trying to get? Uh, our input features: for each service, for each host, we have uh port, protocol, autonomous system, which again is like the routing policy. Um, yeah, we'll just leave it at that for lack of time. And whether the scan was successful or not. And so you can imagine, and this picture shows, you know, if we have the host, we have its port, protocol, and also, like, kind of who it's getting routed by, and then also, like, was it responsive or not, because we were doing these hourly scans. So there were times where hosts just went away, and I was like, well, we'll
check again in an hour and see if they're back. Um, given these, we wanted to predict the lifespan of a service. Again, is it going to change state in 1 hour? Is it going to change state in 6, 12, 24, in a week? I want to know. I want to know about all the services. Um, and so we were looking at the inputs and we're like, okay, well, lifespan of services is a numerical output that's in hour chunks. And so looking at all the different model families, and also again consulting with the ML scientists that we have in house, um, I was like, okay, well, let's try regression, right? Regression handles
numerical output, and bonus, regressions don't always assume linear relationships, because there's no way that there are linear relationships in internet-wide scanning data. And so solution V1: let's use a regression. Seems super simple. My life is going to be great and we're going to have all the answers to the problem. No, that is not what happened. We actually ran into a lot of issues um with trying to use regression for this problem. And this is where I'm going to walk through, again, um, the difficulties, like the failure cases, of what exactly was going on, because we wanted to understand why this was failing so spectacularly, to figure out: is there a potential solution, or do we
just, like, abandon ship and move on to the next, you know, problem in the internet-wide scanning space? And so the first big issue was with our input variables. So we had port, protocol, autonomous system. They're categorical. Lots of models require numerical input. Even if they take in categorical input, they're still, like, converting it into numerical under the hood. But that's okay, because we have a way to handle this. Um, you can essentially encode these variables, right? And there's a popular technique called one-hot encoding. This was one of the many that we tried. But just to walk you through it: on the left-hand side is the original data set. So each row is a
specific host and what port it had open and what port we were scanning it on. And so when you one-hot encode, you're essentially taking each of the possible values, converting them into their own uh column, their own variable name, and then listing the true and false values. And so if we convert the left-hand side into the right-hand side, we'll see that port 80 and port 443 are now their own columns. Um, and since the first row had port 80 open, there is a one in the first row for port 80, a zero for port 443, and then the converse for the
second row. This is a super straightforward technique. There's other ways to encode things. Um, but this is just, you know, the one that is most easy for me to graphically explain in the time that I have. Um, seems like this should work, right? Nope. Uh, the issue is that a lot of our variables had really high cardinality. They had a lot of potential categories they could be in. And just to put some numbers behind this, we had uh 40 potential ports, uh 300 potential protocols, and 19,000 potential autonomous systems. And so imagine that each of those is becoming a column. Your matrix all of a sudden is so sparse the model, uh, spoiler, can't do much with it.
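The one-hot step and the cardinality blow-up described above can be sketched in a few lines. The toy rows are hypothetical (a real pipeline would more likely use something like pandas' `get_dummies`), and the column counts are the approximate figures from the talk.

```python
# Minimal one-hot encoder over dict-shaped rows (toy sketch).

def one_hot(rows, column):
    """Expand one categorical column into 0/1 indicator columns."""
    categories = sorted({row[column] for row in rows})
    return [
        {f"{column}_{c}": int(row[column] == c) for c in categories}
        for row in rows
    ]

hosts = [{"port": 80}, {"port": 443}]
print(one_hot(hosts, "port"))
# [{'port_80': 1, 'port_443': 0}, {'port_80': 0, 'port_443': 1}]

# With roughly 40 ports, 300 protocols, and 19,000 autonomous systems,
# the encoded matrix gains on the order of 19,340 indicator columns,
# nearly all zero for any given row: the sparsity the regression choked on.
print(40 + 300 + 19_000)  # 19340
```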
And on the right-hand side I just have, for those who are unfamiliar with autonomous systems, some examples of autonomous systems. So you can have universities, financial institutions, you know, these are large organizations that own IP space um and then route that IP space. And so, um, this is where, and again, we have some great ML scientists in house and we were chatting constantly about this, this is when one of our ML scientists was like, well, Ariana, just get rid of autonomous system, that's such a problem feature, you don't need that. And this is where having domain expertise in both domains is so helpful, because me as the networking security person was like,
wait, autonomous system is so critical to the underlying nature of a host, we can't just throw it out. Um, and again, my ML scientist colleague, fantastic person, did not have the networking background. So, they were just like, "Oh, yeah. We'll just get rid of it." And I was like, "No, no, no. We're not getting rid of that. No, no, we cannot." Um, and so we reached a compromise where we're like, "Okay, let's reduce the cardinality. Instead of 19,000 potential autonomous system options, what if we boil down the autonomous system feature into its category?" So, you know, we say this is a, uh, educational autonomous system, a financial autonomous system. That dropped that feature down to about
200 options. Uh, and it was still a super sparse matrix with, like, slightly less abysmal results. And so even with this compromise, we weren't getting great outcomes. And just to put some numbers behind what I mean by not great outcomes: um, when you run a regression, one of the key metrics of success is your R-squared. And R-squared is essentially how well the model is explaining the variance in the data. It is on a scale of zero to one. And the best thing I could get was about 0.3. Most of the time it was below 0.2. And so that means that the model was taking the data in um and only able to
describe at best 30% of the variance based on the features that we were putting into it. Um, in the words of Gordon Ramsay, that was just not good enough. And, you know, this is where the practicality of industry kind of butts heads with the science of academia and these mathematical models, because in a lot of cases, you know, an R-squared of 0.3 actually is quite sufficient. But one of the other difficulties is that we were trying a lot of different regression models to see if they would at least have the same output. Right? So again, the feature importance was a secondary goal for us, and these models were outputting
feature importance that was all over the place. I mean, like, one would say autonomous system was important, one would say port was important, one would say protocol was important. And again, in consulting with um my ML domain expert uh colleagues, our understanding is that the models just weren't really grasping what was happening underneath. They were not converging in a way that made sense to any of us, and we also couldn't explain it. Right? Imagine going into a meeting with a bunch of your other colleagues who are expecting you to come with outcomes, and you're like, so one model said this, and the other one said this, and the third said this, and, uh, yeah, like,
what do you take from that? Not great. Um, the light at the end of the tunnel, or the slightly depressing thing, is that this is obviously not a unique problem to us. Um, I started looking at uh published research, actually in other domains. Um, like, human computer interaction has a lot of really um peer-reviewed published work where the authors were like, yeah, we have this high-cardinality data set, uh, and we tried a regression and it was not great. Um, and so we moved on, and I was like, okay, that's encouraging, because maybe there is a solution here. Um, yeah, a lot of research was using high-cardinality data that had
the exact same results, and the results were abysmal. It was really discouraging. And so we were like, okay, let's take another step back, our light bulb is back. Um, how can we reframe the research question to then use a model that actually works with the data, that is something that we can act on, that we can explain to, like, the data engineers, that we can explain to people who are like, okay, so why do you think we should change our scanning in the way that you're proposing? And when we were poking around at some of the data, we realized, you know, we revisited this facet of the data set, which is that the output variable had
an incredibly bimodal output. So remember, our output is um the lifespan of a host. So that can be anywhere from an hour to a week, because that's how long we ran our experimental data set. Um, and when we plotted the distribution of lifespans, it is uh a heavy concentration on the left side. So again, this small minority of hosts that are pretty ephemeral, and then a huge chunk that are just online. They are homies. They are online. They are just consistently responding positively. And so we were looking at this and we're like, okay, well, can we use this to our advantage, right? And I'm going to revisit our end goals, because we wanted a method to identify
which services should be scanned at a faster rate, which really meant these services should be scanned in X hours, these should be scanned in Y, etc. What if we just reframed it and made our lives easier and said, hey, these are the less ephemeral hosts and these are the more ephemeral hosts? So taking these two obvious buckets and instead treating our problem like a bucketing problem, instead of a regression or a very exact precision prediction problem. Um, and so that led us to solution V2, which is, hey, let's use a classification. Again, this may not be the most academically rigorous thing. If I tried to submit this to a research conference, I'm sure they would
be like, wait, but, like, regression is the thing that you should definitely use for this problem. But practically for us, if we want more accurate data, um, and we're scanning most hosts at roughly a daily cadence, even if we can say, okay, this set of hosts should be scanned at a 12-hour cadence or a six-hour cadence and then everything else remains the same, that is still an improvement for us. That is still improving our accuracy. And again, this is kind of the joys of being able to work in a more practical industry setting, where I don't need to worry about uh the woes of reviewer 2. I can be like, what is the thing that is pract-
I'm seeing some people laugh. Um, what is the thing that's practically going to help us? A classification will practically get us to an answer um to our problem. Classification went great. It was fantastic. I was like, "Thank God, I can do my job," um, because there were some dark moments in this. So we divided our output into highly ephemeral hosts, so hosts that had a lifespan of less than 12 hours, and low ephemeral hosts, which is everything that was 12 hours plus, right? And again, that's because our scanning cadence is roughly daily. There's some nuance there. And so if we can even say, hey, this 10% we should be scanning at a 12-
hour rate, again, that is still going to increase our accuracy and get us um closer to the goalpost of having the most accurate representation of the internet. Um, we also stumbled on this model called CatBoost, which natively handles categorical data. Again, it is still converting it into numerical inputs under the hood. But what was really um solidifying about this decision is that we tried CatBoost, we tried random forests, we tried a couple different classifiers. And though they were performing at slightly different accuracies and slightly different ROCs, the feature importance was always the same. They were converging on what they thought were the features that most predicted ephemerality. This was not
happening with regressions, lol, sob. Um, and so I was like, okay, there's some slight differences, but they're all mostly behaving the same way, which is what we would expect um in a well-defined research problem with a well-defined data set. Um, we also didn't really need to reduce the cardinality of the data either. Um, I just threw in all the autonomous systems. I didn't need to do any of this categorical stuff. That's really helpful for us, because now I can be like, these hosts in these autonomous systems need to be scanned uh faster or not as quickly. And the results weren't abysmal. And so when I say they weren't abysmal: um, we were able to get an
accuracy of 87% and an ROC AUC of 94%, which is, like, a night and day difference from what we were seeing with the regression. Um, we also got a feature importance that was successfully ranked and reproducible across different models, and this is really interesting to me. So autonomous system is the highest ranked feature when we are talking about the lifespan or the ephemerality of a service, and that is super interesting to me. And now we are having a lot of internal discussions and ongoing research about how that changes some of our base assumptions about how we've been scanning the internet um to then get more accurate data. There were also some other features that were
important, like number of services on a host, the port, the protocol, but it was really autonomous system that stood out above the rest. Um, and we had some really good precision and recall. So overall, this was a great success compared to the results that we were dealing with before. Um, and then we were like, hey, that worked really well. What if we modified our classification buckets slightly and actually expanded this into a three-classification problem? So now it's like we scan certain hosts at 12 hours, we scan certain hosts at 24, and then everything else we can scan at more than a 24-hour cadence, which I think practically we'll still probably scan at
24. But this was a nice little "let's expand this and see if it continues to work," right? In the words of Moto Moto from Madagascar 2: it's so nice, you've got to say it twice. Anyone? No? Great movie, highly recommend. Thank you. Again, our accuracy and ROC AUC were really good. These runs are with hyperparameter tuning and whatnot; I don't have the parameters up here because they didn't seem as relevant for this discussion. But I would run the base model with just some really basic input parameters, and the accuracy would come out at something like 75%, and I was like, am I dreaming or does this
work? In this three-class problem, we again investigated our feature importance. Autonomous system was still highest by far, but then port and number of services flipped. And so that's something we're investigating: why that is, what it means for us, and whether the two-class or the three-class formulation is better suited to our needs. So there are some interesting open questions; as with all research, there is always more to dig into, but this was still a great success. So at this point, this was, you know, the entire evolution of the research project. We're taking a lot of this, and we're digging in further.
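The modeling loop described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, assuming scikit-learn; the feature names, cadence thresholds, and label construction are invented stand-ins, not the real dataset or tuned models from the talk.

```python
# Illustrative sketch of the three-class scan-cadence pipeline described
# above, on synthetic data. Feature names, thresholds, and the label
# construction are invented stand-ins, not the real dataset or model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 6000

# Network-level features; ASNs go in directly, no cardinality reduction.
asn = rng.integers(0, 420, size=n)
port = rng.choice([22, 80, 443, 8080], size=n)
proto = rng.integers(0, 3, size=n)            # encoded protocol
n_services = rng.integers(1, 30, size=n)      # services observed per host

# Synthetic service lifespans, loosely driven by the AS so the feature
# importance ranking has a signal to recover (illustration only).
base = np.array([4.0, 8.0, 16.0, 20.0, 28.0, 36.0, 48.0])[asn // 60]
lifespan_hours = np.clip(base + rng.normal(0.0, 4.0, size=n), 0.5, None)

# Bucket lifespans into the three cadence classes: <12h, 12-24h, >24h.
y = np.digitize(lifespan_hours, [12.0, 24.0])

X = np.column_stack([asn, port, proto, n_services])
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_tr, y_tr)

acc = accuracy_score(y_te, clf.predict(X_te))
auc = roc_auc_score(y_te, clf.predict_proba(X_te), multi_class="ovr")
print(f"accuracy={acc:.2f}  ROC AUC (one-vs-rest)={auc:.2f}")

# Rank features by importance; here ASN dominates by construction.
names = ["asn", "port", "proto", "n_services"]
for name, imp in sorted(zip(names, clf.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name:11s} {imp:.3f}")
```

The two-class version is the same pipeline with a single 24-hour threshold in `np.digitize`; swapping between the two formulations is a one-line change, which is part of why expanding to three classes was cheap to try.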
I'm pretty sure I won't be giving a follow-up talk on this next year, but who knows? Never say never. And so at this point, I want to take a step back and talk about the two different sets of lessons we learned. First, the lessons from falling down the rabbit hole, the project-specific lessons. Using network-level features, we were able to train a classifier with high accuracy and predict service-level ephemerality with reproducible features. Autonomous system is the feature of highest importance, which, like I said, helps us understand how we should address internet ephemerality better. It's also helping us again with some of
these internal conversations that are ongoing. Maybe that'll be the third version of this talk: how we changed internet-wide scanning. Probably not. The one thing I really want to point out... oh no, I think that's on the next slide. I got too excited. So it was like, great, this is a really helpful project for us internally. We got our findings; let's move forward with it and make our accuracy better, make our data quality better. But what were the lessons we learned from the thud, right? This was a very long process. I distilled it into roughly 35 minutes, but this was not a 35-minute task, and it took a lot of
domain experts coming together. First off, domain expertise in both areas is super critical. If I had been tackling this on my own, or if my ML colleagues had been tackling this on their own, I think it would have taken more time; it may not have been successful, and we may have come to different outcomes. Being able to have both kinds of people in the room was super, super useful. So I would highly encourage you, if you are ever dipping your toes into a different domain, find someone who works in that domain. It's also fun, you know, working with good people, doing good research. I had a great time, even though
there were some days where I was banging my head against the metaphorical wall. Second, rethinking your analysis is necessary, even in the ML domain. I've touched on this before, but one of the biggest things for us was saying, hey, how do we address the practical issue, not just the scientific one? Like, yes, I would love to get a research paper out of this, but the reality is we're not doing anything novel with the ML models. What we really need is a practical understanding of what dictates internet service ephemerality, such that it helps us internally with some policy decisions. And so that means we converted a regression to a classification because
of various factors, and that gave us really good performance, which is helpful for us both practically and scientifically. So being able to take a step back and rethink what is important, what isn't, what you can leave on the table, and what you need to take with you was really critical here. And I will say this process was not foreign to me. In my 10 years of doing research, there have been so many times where we have needed to take a step back and rethink the problem: what are we doing, what are the analyses? This was just the most recent leg of that journey. Third, internet-level data is
super noisy. Surprise. And network-level features perform much better in these models than software-level features. One thing I didn't include a slide on is that we were trying different features. We had the network-level features, because those are inherent to the internet-wide scan, but we could also add some software-level features: what product is this, what vendor, what version? We tried including some of those, and even though the model was still performing well in terms of accuracy and ROC AUC, the software-level features were at the bottom of the feature-importance list. They were just so obviously not
important to the model output. And this is really important to us, because now, moving forward, we'll focus on network-level features. That doesn't mean I've sworn off software-level features, but this is very useful for us in other endeavors going on internally. Science be sciencing. That's been great. With that, I want to thank you folks for your time, and I would be remiss if I didn't thank all the fantastic fol