
RAG Against the Machine: Using Retrieval-Augmented Generation & MCP to Fortify Cybersecurity Defense

BSides Las Vegas · 50:47 · 48 views · Published 2025-12 · Watch on YouTube ↗
About this talk
Identifier: TKNLJQ

Description:
- "RAG Against the Machine: Using Retrieval-Augmented Generation and MCP to Fortify Cybersecurity Defenses"
- Introduces CADDIE, a RAG engine driven by Model Context Protocol (MCP).
- Demonstrates injecting real-time policy, threat intel, and log data into LLMs.
- Enables automation for gap analysis, alert triage, and regulatory mapping.
- Focused on SOCs, auditors, compliance teams, and AI practitioners.

Location & Metadata:
- Location: Ground Truth, Siena
- Date/Time: Tuesday, 15:00–15:45
- Speaker: Brennan Lodge
Transcript [en]

introducing Brennan. >> Awesome, thank you. Can you guys hear me? All right, in the back? Cool. I know we're a minute early, but I think I'm holding you back from happy hour, so hopefully I'll make this entertaining and quick. Awesome to be here. First time here at BSides Las Vegas; I've done a few BSides in other places. I'm from New York, live in Manhattan, and I made the mistake most Manhattanites make of trying to walk around Vegas. Right? Like, come on. There's usually only one road you walk on, and I'm realizing there are three other types of people that walk around Vegas: the drunks; the bums, who aren't really doing much walking; and the zombies like myself, who think and walk like this, and whose brains are fried by the time they get to their destination. But lo and behold, we're here at Hacker Summer Camp. Really excited to be here. I'll be here all week, so feel free to come up and talk to me afterwards. I've got some fancy new stickers to share, and they are fresh from the pack, so you guys are very lucky. They kind of smell new as well.

Anyways, all right. So, RAG Against the Machine. Let's RAG on. Quick whois lookup: Brennan Lodge. I've got a few roles. First off, I've been in the financial industry, fortunately or unfortunately, for the last 18 years. I'm from Philly originally, live in Manhattan, and I've worked at a few of the banks: JPMorgan, the Federal Reserve Bank, Bloomberg, Goldman, HSBC. I'm also a professor at NYU, where I teach IT management and analytics. With that teaching, I've gone down the rabbit hole of doing some research, using my students as lab rats for some of my AI work. No bad testing or anything like that; they willingly help. And I've won a few awards for the research using RAG, a.k.a. "rage," for cybersecurity. Anybody familiar with Kaggle competitions? The data science ones. Awesome. I won an ML use cases Kaggle competition at Oxford, and another with US Cyber Command. That was back in 2022, I think, pre-ChatGPT, before AI was cool and sexy. If you guys remember terms like machine learning, big data, analytics: yeah, they're still around. I was using them back then for alert fatigue and finding needles in the haystack.

Really cool opportunities there. I've also got a LinkedIn Learning class. Anybody done LinkedIn Learning? Really good; get your company to pay for it. I've got a class on using RAG for cybersecurity use cases, and I've got another one coming out August 16th on MCP, the Model Context Protocol, and I will get into that as well. Let's do it. All views are my own and not my employer's. I do the fractional CISO stuff, and I'm employed at the Manhattan Institute. Really great place, and I get to use this stuff to help out and protect our company. So what are we going to talk about? The good, the bad, and the ugly of

AI and cybersecurity. Hopefully it's not repetitive; I'm sure almost every talk has sprinkled in AI. We'll get into RAG, "RAG Against the Machine," and some use cases in GRC and in the SOC. I got my start in the SOC. Anybody a SOC analyst? Yeah, it's tough. It's grunt work, right? We follow the sun. So I've been there, been in your shoes, and when I'm implementing and researching this stuff, I think of junior-analyst Brennan Lodge in the SOC, coming in with a ton of caffeine and just trying to take the queue down to zero. I think we're familiar with that. It's an endless goal, and that doesn't seem to be stopping. So with that in mind: how can we help those folks? I'll go through some of the data workflow architecture, and hell, roll your own, right? I'll give you all the open source tools I've used, and even the costs associated with them. That should be the takeaway: you should all be retrieval-augmented generation experts by the end of this talk. If not, check out LinkedIn Learning. And yeah, that's the takeaway: bring it back, try it out, experiment. We can't sit on the sidelines anymore with AI. The stakes are too freaking high with the cybersecurity attacks: breach after breach after breach. Come on, let's do better.

All right. So, the good. We've got information overload. Is that good? I don't know. We are facing a deluge of threats and data. There's a 24/7 need. We've got talent gaps. We've got burnout. And AI is not the silver bullet, but it can help, and I'll get into the details about why. The rising attack volume, right? I don't have to tell you guys this. We've got complex integration challenges that are just all over the place. So, AI to help. But if we've got AI and it works, don't touch it.

It's like this old network router here, collecting a bunch of dust, but it's still got the blinky blinks. Still working, right? Do we clean it off? Do we touch it? Do we leave it in the closet? If it's still working, and new technology is here, what do we do?

So, the bad. Let's not be like Salt Bae and sprinkle AI into everything. What are we going to do with AI? Is it SOAR 2.0? Is it our copilot? Can we be more creative with it, with new detections? Can we answer annoying third-party risk reviews? Can it be the automation? But is it automation for the sake of automation? At the end of the day, the human is still in the loop. I'll repeat that: the human is still in the loop. It's not going to take away from that expertise. One thing I experienced as a CISO: not Shadow IT, but Shadow AI. Anybody familiar with the annoying Read AI app? Yes? Yes? Okay, so I'm not the only one. Quick story: Read AI entered our Teams through somebody clicking and filling it out. It's kind of addictive, it being a worm. If you sign up, it sends an email: hey, do you want to read the transcript of the meeting? It'll summarize it, give you really good metrics, analytics, content, tell you who's talking. Don't do it, because then it hooks into your browser extensions, it hooks into your Teams, and yeah, it's a mess. This is not the first and it won't be the last. It's a great business use case for an app to spread itself, but it's going to annoy a lot of CISOs and security folks. So just be aware of that. That's the bad of AI.

But what about the other thing: costs? This is Jensen sprinkling in AI and tokens. In the big keynote speech he gave at GTC, he mentioned tokens 85 times. We're all familiar with rising cloud costs and the lack of transparency. How the heck am I

supposed to keep track of cloud costs, and now they're going to put this onto AI? What is a token? I don't know; the definition differs. It could be a phrase, it could be words, it could be whatever the AI model wants it to be in order to charge you mysterious funds to use that model. So just be aware of that, and push the vendors. Ask for transparency: what is this going to cost me? What are the expectations? Can you tell me how many tokens I'm using with your AI?

Last but not least, the ugly. Does anybody remember Clippy? Yes. Why are we back to Clippy all over again with AI and chatbots? "Would you like help?" No. Go away. I can do this.

All right. So, the AI and CISO dilemma. I've been speaking to a lot of CISOs in the industry. Some of the complaints: hey, we're scrambling to come up with an internal AI policy, so we don't have one yet. All of our employees are using ChatGPT, and we can't stop them. We can implement it, but how do we explain it? The mysterious black box, the lack of transparency again, with how the heck AI is working. A lot of the data scientists on the OpenAI team who are leading and building these models have no freaking clue how they work on the back end. Some idea, right? But how it understands the English language, how it's able to take rap lyrics and write them back in email form, like, how? And if you can't understand it and it's making decisions, can you trust it?

Okay, so in security we need a secure AI solution. We need to cut down on the time it takes to make those decisions, and we need it to be cheap. So my take here: AI as a guide, not a full-stop guard. We need to understand it, and these models

analyze and generate text, but does that text make sense from a threat perspective? Does it make sense with our alphabet soup? How many acronyms are there in cybersecurity? It's endless, right? Can it understand them, translate them, and give us that context back? We need an ally. We need help. We need an assistant, not Clippy, but somewhere in between. And I think we need to be more transparent with our tools and democratize them. Huge fan of open source, but open source is really tough with LLMs right now. And we need cheaper, faster products and rapid analysis. So can we be data-driven? I think so; measure what matters. Can we measure ROI for cybersecurity, especially with AI? Yes, no, maybe so, right? And we need to integrate and scale. There are some really cool tools out there. My friends at Deep Tempo have a really awesome product for integration at scale, and it just gets out of the way. It runs in the background in your data lake and does what it's really good at: finding anomalies, finding things your standard static rules can't find. And I think that's the next step: hey, let it run in the background, get some transparency, get some understanding, get some context, and let it do its thing, and have the results be high fidelity. Because if the first time out it's low fidelity, it's really tough to prove that value. We're dealing with a crap ton of false positives in the SOC. How do we cut that down? More and more vendors are giving us more and more alerts, which is the opposite of what we want. So we need higher fidelity; we need to get some snipers in there. Those snipers can be AI models that do anomaly detection on our big data, but they've got to have really high accuracy.

RAG. Okay. So, what is it? Retrieval-augmented generation. The analogy I

like to use is a library. You go up to the librarian. And yes, you should still go to a library and read books; they're free. Remember the simple system of a library? You've got three components, and it's very similar to how RAG works: you have a sentence embedding model, a vector database, and a large language model. Some AI hoopla, jumbled words out there, but let me make it clear for you. Your sentence embedding model is the Dewey Decimal System. Who remembers that from elementary school? Nice. Cool. It translates the ISBN, the author, that number on the back of the book, to a genre, to a specific section in your library. That section is the vector database. So the ingest is: hey, take my query, take my question, and translate it into topical terms that can then be semantically searched within the vector database. Our vector database is that stack of books, the layers, the genres, the sections, hopefully organized in an easy-to-understand way. And then we've got the LLM, our librarian. Your librarian is going to take that question, translate it with the Dewey Decimal System (our sentence embedding model), find the right book, find the right page, find the right paragraph, speed down the aisle, bring it back, give you the plain-English translation: is this what you need? Here's a quick blurb, and here's a link to where you can find it. So that's all packaged in a nice little gem for you through RAG.

But why should we use this? Well, we've got Kermit the Frog hallucinating here, like many static LLMs are doing these days. Don't get me wrong, they're getting better, right? But with RAG, you can ingest new data from real-time feeds. With a static LLM, yes, they're upgrading, different versions, different flavors, but once a model is released, it is a binary. It is a snapshot in time. Any

new future data is not going to be learned, because the model was released and done. With RAG, you can continue to ingest that data and upsert it into your vector database, so it stays live and fresh, and the LLM can incorporate that latest data and ensure up-to-date sources. You guys with me so far? Cool. And it can be private data, private data that you do not have to share with OpenAI, even though they've scanned the internet and have many lawsuits against them. I tend to pick on OpenAI; I use it, but there are other models under the same legal cloud. And you can keep that data on-prem. You can use your own local LLM; there are really good open source ones, and I'll get into some of those in a minute.

So, roll your own. How are we going to do this? We want to funnel in some of our data: MITRE ATT&CK, CISA advisories, policies. The endless amounts of internal private data can remain internal and private. When I was building this, I set a benchmark. I wanted it to be relatively cheap, $500 or less as far as processing and resources (and this was two years ago now when I rolled this), and I wanted it to respond

within 10 seconds. At the time, I tested some AMD GPU models. Anybody tested AMD? No? Nobody tests AMD anymore because it's so hard to freaking use. That's why Nvidia is Nvidia and the stock price is so sky-high. If you're building any hardware, build something for developers: easy to use and, go figure, easy to integrate with Python. So that's what I went with, and here are the steps to do it.

To get to RAG, we've got to do the data ingestion. There are lots of open source tools out there to parse, scrape, and ingest those PDFs and Word documents. And there's this concept called chunking, where you've got to do a lot of trial and error. I said, hey, let's just chunk by sentence. Let's chunk by paragraph. Let's chunk by page. To get good efficacy, you've got to figure out what works best for you. If you've got a couple terabytes of data, it's probably best by page, or maybe a couple of pages. If you're working with a small amount of data, maybe by the sentence. Your mileage is going to vary. But you've got to do that splitting, and then create embeddings. The embeddings are there to create similar vector representations of the words and terms you're using. Again, you've got to use a sentence embedding model. And then LangChain is the glue that does the orchestration between the vector database, your sentence embedding model, the chunking, making sense of the question you ask, and bringing the answer back, all through LangChain. Then you ask a question, and you've got an LLM. I highly recommend you start to visualize your vectors, what's in the vector database. This again gives you that transparency into, you know, what data

you're working with, and you can see the sections. You can also search in there. So if you're getting bad results from your RAG, just take out some of the documents that are throwing it off, maybe the outliers. And on the right side are some of those open source tools: Hugging Face, the GitHub of open source LLMs, and the sentence embedding model. I'm using all-MiniLM-L12-v2. I'm still using it; it still seems good. And check out Nomic. I've got no association with them; I just think they're a really cool company focused on transparency. They've got a tool called Atlas where they visualize models, and they have an inventory of models that they continue to update, and you can search the models for context on what subject you want to use a model for. It can be a little tricky, though. When I was first experimenting, I wanted to build phishing detection using RAG, so I searched Nomic for "phish," and the band Phish came back a couple of times. So just watch out for that. Then I got some spam examples, and I went down the rabbit hole: hey, does it have cybersecurity-related things? I'm sure, if there aren't already, there are going to be LLMs on Hugging Face built for our particular domains, whether it's GRC or cybersecurity. They're probably already there, but more to come on that. Oh, and then Chroma: ChromaDB is the vector database I use. It's free, really easy to use, with really good Python support.

This is the workflow. Starting with the data sources and the chunking: I was just stripping out some CISA advisories from Gmail, and I got a really cool CSV of the MITRE ATT&CK techniques. Chunk them, upsert them into the vector database. Get your

embeddings right with the sentence embedding model. I talked about LangChain. Then, moving over to the LLM services: doing the search, putting it on an endpoint, and then, hey, having a little Slackbot. That's where I initially tested it out. Then I built a UI, and lo and behold, got some RAG. So this is a visualization, and I've got two here of the document embeddings, with different types of clusters. I think this one is a little better; we'll play it here. On the left-hand side I've got some open source detections from Sigma, a couple thousand. And on the right-hand side are my threat intel advisories. It's cool because you can just see, row by row, what's in there. And then, given the searches you do and the results that come out, it'll make a reference, and it puts them into certain topics. So on the right-hand side I've got my threat intel to go through, and it's a really good way to provide the transparency that I think is lacking in the AI space.

So, on to the security of AI and RAG. Why do it? Our data is on its way out to OpenAI with

the API services, and that's not good. Let's put it in a container. We can keep it on-prem in a VPC, and it stays locked and safe in there. Again, your mileage is going to vary: you've got to jack up the resources if you're going to do on-prem and pay for the GPUs, depending on how many queries you're doing. And this is how I benchmarked it; I think it's about average from what I've seen for RAG within cybersecurity use cases. I used my lab rats, a.k.a. [clears throat] my students, to test this out. So around five analysts, if you will, are each asking 20 queries per day. If we do the math there, we're getting about 100,000 tokens each day for these questions. Again, a token is like a phrase, a short phrase, or words. If you do this in a VPC, the cheapest at the time was a g4dn.4xlarge GPU instance. It met the benchmark of responses within five seconds, and it's around $500 a month. If we do it with ChatGPT, making those queries out to OpenAI at 100,000 tokens a day (I think the prices have come down, especially for the early models), at the time it was $100 more, and our data was leaving the network, so we were giving up some of the privacy there as well.

Has anybody read the book The Phoenix Project? Yeah? Nice. All right. Cool. I feel like we're at the Phoenix Project again with AI and the evolution of integration. I followed along with the CISO. Does anybody remember the CISO in The Phoenix Project? Tons of anxiety, under a ton of stress, and also a drunk, right? Kind of checks out for a lot of CISOs these days. And yeah, so what do we do? Like, how

do we integrate and help the business? Let's go back to our old-school methods of implementing projects with some cybersecurity. Or is this new? Is it earth-shattering? Do the old methods of user authentication, guardrails, encoders, encryption help? I think so. It is a new vector, if you will, but some of the old-school methods can be applied to AI. So let me tell you how. User input: SQL injection, right? We're working with prompt injection now. Sanitizing the user input is very similar, so let's do the same freaking thing. Check whether it's trying to trick the AI model. Who's heard about talking to the grandmother, right, to trick the AI into, what was it, giving up the Windows keys? Brilliant. Look, the methods of attack are going to be endless, and go figure: all it takes is one little hole in the network for attackers these days. But we need to start thinking along those same lines. So let's put in those guardrails. Let's encrypt our database in transit. Let's encrypt it at rest; that would be our vector database. Let's put in some output guardrails. Let's log freaking everything and send some of it to the SOC in the event something violates the rules. Those are, you know, sort of the guardrails here. Really cool stuff from Matt at Galileo; I think they're still in the AI space, but you get the picture.

And then the subjective part: how do we evaluate the results from LLM models? It's next to impossible, because it's very subjective. There is a method called HyDE, hypothetical document embeddings, and it's freaking hypothetical, right? It's another LLM that tests against the documents; it tests against

the queries to either validate or invalidate the answers you're getting back. So there are some methods out there.

All right, who came here for MCP, the Model Context Protocol? All right, you're in luck. This is how I think of it: a USB-C port. I've got one here on my MacBook. MCP is the USB-C port for AI. It connects the models on the host, through a client (your MCP "port"), to your servers: your Slack, your Gmail, your calendar, your local data sources, to make sense out of it all. And running within the wires is the AI model. So, agentic? Yes. A protocol? We need one, to make these communications happen and log them, with a standard: one port, endless tools, and zero spaghetti code under the hood. It's JSON-RPC 2.0, in a nice little JSON format to structure it, and it's pretty good. On the architecture, I mentioned the host, the client, and the server. Think of it like this: on the host, I'm asking a question, "show me the current weather." The client knows that I'm in New York City, and the server has the API, the feed, and the data to make that connection happen and route the data back to the host. All right, who's ready for some demos?
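As a sketch before the demos: that weather round trip is framed as JSON-RPC 2.0 messages on the wire. The `tools/call` method name follows the MCP spec, but the tool name, arguments, and result payload below are illustrative, not any real server's API.

```python
import json

# Host side: invoke a tool on an MCP server. MCP frames every call as a
# JSON-RPC 2.0 message; "tools/call" is the spec's method for tool
# invocation. The tool name and arguments here are made up.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_weather",                   # hypothetical server tool
        "arguments": {"city": "New York City"},  # context the client filled in
    },
}
wire_request = json.dumps(request)

# Server side: run the tool and answer under the same id, so the client
# can match the response back to the request.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": "88F and sunny in NYC"}]},
}

assert json.loads(wire_request)["method"] == "tools/call"
assert response["id"] == request["id"]
```

One standard envelope is what makes the "one port, endless tools" pitch work: a memory server can log every request/response pair the same way, whatever tool sits on the other end.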

This is my open source project. It's called Arsenal Forge, and it uses MCP and RAG. Why did I do this? Not to be the next SOAR, but for orchestration and making sense, with some transparency, of some of the alerts. So I embedded, similar to what you guys saw for RAG: I've got MITRE, I've got a detection inventory, I've got some advisories and threat intel, typical SOC data that we're dealing with. And the solution: I want something fast, I want something smart, something that's going to help me make more informed decisions without the crazy price tag of some SOAR 2.0 product that is going to be really expensive, really slow, and take me a year to freaking implement and integrate, because I've got 1,500 Splunk detections, I've got two terabytes' worth of threat intel, and I want to make sense out of MITRE and inform my analysts all at the same time.

So this is the data workflow, taking those data sources through ETL: extract, transform, load. Behind the scenes, some secret sauce: I use a schema that we're all familiar with, which is email. Within email, we have a subject, we have a body, we have a source, usually an IP address.
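That email-style schema can be sketched as a small record type. This is my own sketch, not Arsenal Forge's actual code; the field names and the raw-alert keys are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Record:
    """Email-style schema: subject, body, source (usually an IP), timestamp."""
    subject: str
    body: str
    source: str
    timestamp: datetime

def normalize(raw: dict) -> Record:
    """Map a raw alert/advisory dict onto the shared schema.
    The input keys here are illustrative."""
    return Record(
        subject=raw.get("title", "(untitled)"),
        body=raw.get("description", ""),
        source=raw.get("src_ip", "unknown"),
        timestamp=datetime.now(timezone.utc),
    )

rec = normalize({"title": "Suspicious RDP logons", "src_ip": "203.0.113.7"})
assert rec.subject == "Suspicious RDP logons"
```

The point is that Splunk alerts, CISA advisories, and MITRE rows all funnel into one shape before chunking and upserting.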

And we have a timestamp date. Using that as your schema for how you parse and collect your data goes really far, almost as far as the prompt engineering. Just giving you the heads-up for when you're building this on your own. So, an example here: we've got this lengthy Splunk detection rule, and we want to know what MITRE ATT&CK technique is associated with it. Many times we're doing the manual labeling of this Splunk alert, usually a junior analyst or an engineer. Are they right? Are they wrong? And don't get me wrong, there are multiple techniques with some overlap where it may be one or the other. But this is going to give at least a best effort. It goes through the trifecta of the library I mentioned for RAG and outputs a mapping to the technique. In this case, it was Remote Desktop Protocol, I'm sure because it picked up on some of the network indicators here. Oh, and RDP: we didn't actually spell out "remote desktop protocol," we used the acronym, and it picked up on the acronym. Good job, RAG. And it's pretty cool: it tells us what technique it is and then creates the link. So why is this good? Well, this is the end result that a

cybersecurity analyst should see, with the ability to dive a little deeper, get some more context, and maybe even a playbook to help them through and do a good job. So, let's play this.

All right. This is what's going on behind the scenes to set up your MCP. I upload the data to ChromaDB. This is all in an open source project, and I'll share the link at the end. Let's get those scripts. Let's upload the MITRE techniques. Let's upload the CISA advisories and the detection inventory. Come on, come on. And get them up and running on a server. So now we've got our data, and that's complete, and we want to start some MCP servers. We start a memory server. Why a memory server? It's going to log all of the queries. It's going to log everything the Arsenal Forge MCP tool is doing, to give some transparency, and it's also going to log those conversations and, hopefully, learn about what the analyst is asking and provide better context. We fire up a Streamlit app, and welcome to Arsenal Forge. We're taking a look at the MOVEit vulnerability detection; that's what came up. It first goes through OpenAI and gives a little runbook of what the MOVEit vulnerability is. We've got the MITRE ATT&CK technique, which is some scanning related to the MOVEit technology. We've also got our detection inventory and a link to that particular detection for additional details, so this could be your backend database of all your logic, with a description here to help the analyst. They want to dive deeper? Great. They want a quick summary? We've got that within Arsenal Forge. Let's check out the threat intel related to MOVEit: we've got a CISA advisory, Progress Software releases a fix for the MOVEit Transfer vulnerability. And then, last but not least, the memory: the full result, the timestamp, and the text that came through with it. Thank you.
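Under the hood, that detection-to-ATT&CK mapping is a semantic search against the vector database. Here's a toy version: bag-of-words cosine similarity stands in for the sentence embedding model, and a two-entry dict stands in for the ChromaDB collection. The technique snippets are abbreviated paraphrases, not real ATT&CK text.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag of words. The real pipeline calls a
    # sentence embedding model such as all-MiniLM-L12-v2 instead.
    return Counter(text.lower().split())

def norm(c: Counter) -> float:
    return math.sqrt(sum(v * v for v in c.values()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    return dot / (norm(a) * norm(b)) if a and b else 0.0

# Stand-in for the vector database of MITRE ATT&CK snippets.
techniques = {
    "T1021.001": "remote services remote desktop protocol rdp lateral movement",
    "T1566":     "phishing spearphishing email attachment malicious link",
}

alert = "splunk rule fired on repeated rdp logons from an external source"
query = embed(alert)
best = max(techniques, key=lambda t: cosine(query, embed(techniques[t])))
assert best == "T1021.001"  # matched via the 'rdp' acronym, as in the talk
```

The real system returns the top-scoring chunks plus their metadata (the link to the technique page), which is what lands in the analyst's UI.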

All right, let's pause that and go to the next slide. Shake and bake. Oh yeah, we've got some security detections, we've got MITRE, we've got CISA advisories, and it's doing the automatic mapping, all from MCP.

Okay, how about some more cybersecurity use cases? Governance, risk, and compliance. Ah, cringe, right? >> Yes. Sorry, we'll go fast, but when I think of GRC, [laughter] it's the ugly baby that's really hard to talk to the parents about. We've all got some skeletons in the closet: a lack of policies, a lack of controls. And as a fractional CISO, this is one of the most annoying tasks that I have. God bless the souls that are doing GRC full-time; it's a tough job. And data nerd that I am, I wanted to get into: all right, is this getting worse? We hear from the current administration that we're going to get rid of all the laws and regulations, companies are going to do business themselves, and we're not going to get in their way. That's not what's going on. We see a huge rise in laws, and when we break this down by the laws we've got to abide by in our industry, they're on the rise, and lo and behold, AI bot rules are up and up. The privacy laws within the states are staying flat, and we've also got cybersecurity laws per state that are going up. So how do we stay abreast of these laws? Do they impact us, and what do they mean? Here are the numbers: they're on the rise, and around the world too.

So, I worked at the big banks, and this is one of the first use cases I used RAG for. We were in 100-plus countries, and our head of GRC could not figure out what we needed to do, because we were going into business and banking in, say, Iceland. What do we have covered? Here, Brennan, here's the Iceland

law. Does it map? Well, we've got to hire the lawyers, who are super expensive, to do that. And yeah, there are some really good stats from the UN: 158 countries have e-transaction laws, 156 have cybercrime laws, and consumer protection is on the up and up. I thought the US was kind of lax on this. Not the case. So how do we make sense out of it? We've talked about RAG; we've talked about the vector database. There are some other tools for classification. I used a foundation model called BERT. Anybody familiar with BERT? Yeah. It's like the sentence completion in our texts and email, figuring out the next word, and go figure, that's really good at classification and legal types of things. So when it comes to that gap analysis, why not get AI to do it for us, right? I mean, yes, we still need the human in the loop. We still need that approval. We still need to understand the spirit of the law, of the regulation. But why not get some visibility into it? That visibility, that transparency, is key here: a percentage match. You can use BERT, you can use classification, to upload your policies and tell me how

far along am I with SOCK 2. Uh how far along am I with ISO CMMC? Right. Um I've got an open source project on GitHub. I have these starter packs. I've got 50 plus policies that serve me well in getting to sock 2. Hopefully can help you guys out as well. Um, and then uh a tool that we're releasing called audit caddy that can do all this for us. But the the what I'm calling compliance notebooks are that comparison and it'll tell me you know which policies I either have or need to write. But what if we don't have uh a sock 2 or a framework like a new regulation uh and fundamentally a lot of the regulations

that are written have something related to sock 2 or have some domains or controls that we need to to focus on. So putting it in a nice little tabular format um similar to the sock 2 can be done and we can still do that comparison and policy mapping. So the open source project um check it out on GitHub. It's called open uh audit caddy. Uh, I've got, you know, those policy templates, the starter packs, a bunch of different state regulations, uh, and frameworks, everything from FURPA to CMMCC, CMMC, and then the the classification models. So, I'm starting to build foundational BERT models that will get you SOCK 2 that are focused just on that. Let's not use the crazy open AI

that are trained on all of the internet. Uh, how about a small language model that can help us focus and keep on task with certain things that we need to stay on task with? Um, and if you ask the question of why the sky blue, it's going to blurt out something stupid, right? Because you shouldn't be asking that question. What is the control domain for my code of conduct policy is what I should be asking. So, let's use this type of of framework. So, let's check it out.

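The core move described here (score a policy excerpt against each control domain and report a percentage match) can be sketched in a few lines of Python. This is a toy stand-in: the talk's actual system uses a fine-tuned BERT model, while this sketch uses bag-of-words cosine similarity, and the domain labels and descriptions are illustrative, not Audit Caddy's real taxonomy.

```python
# Sketch: classify a policy excerpt into a control domain with a
# confidence percentage. Bag-of-words cosine similarity stands in for
# the fine-tuned BERT classifier; domains below are illustrative.
import math
import re
from collections import Counter

def vectorize(text: str) -> Counter:
    """Bag-of-words term counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Illustrative control-domain descriptions (not Audit Caddy's taxonomy).
DOMAINS = {
    "Control Environment": "code of conduct ethics training board oversight",
    "Access Control": "authentication passwords least privilege provisioning",
    "Risk Assessment": "risk identification analysis threat likelihood impact",
}

def classify(policy_text: str):
    """Return (best-matching domain, percentage-match score)."""
    v = vectorize(policy_text)
    scores = {d: cosine(v, vectorize(desc)) for d, desc in DOMAINS.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

domain, score = classify(
    "All employees must acknowledge the code of conduct "
    "and complete ethics training."
)
print(f"{domain} ({score:.0%} match)")
```

A real deployment would swap `vectorize`/`cosine` for model embeddings or a trained classifier head, but the shape of the answer (domain plus confidence) is the same.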
Come on. Come on. I'm seeing what you see. Here's the GitHub page for Audit Caddy. Highly recommend you check it out. I had a friend who's working on a startup, trying to sell to schools; he needed to get through regulations with the Department of Education in New York. I sent him here, and he just loved it, because he could go through the templates, use those, fill in his company name, and get up and running and certified. This is the tool. The dashboard will tell you how far along you are. I mentioned you upload those policies and point it at SOC 2. We ask a quick survey, and this gives the AI model some context about what your company is, to help frame and customize things: maybe what state you're in and what regulations you need to abide by. And then these are the snap-in compliance notebooks that break out each of the domains, and within each domain we can ask specific questions about the particular control. In this case we've got CC1.1, and we've also got our files in here. Again, transparency is key. Here we've got an information security management policy that was automatically classified: hey, with high confidence it's probably best suited to be part of this domain, risk and identification management. We've got all the actions here, when it was uploaded. And yes, your auditor should be happy to use this, can log in, and you won't be annoyed by them pestering you with questions, like my friend Joe in the back, who's an auditor and did an audit for me. Yeah, he was annoying, but he said, "Brennan, let's do this with AI." And here we are.

We can also export this as a nice little report, and we can export it downstream as JSON. This is what that report looks like. It's English. It's an export. It explains the domains for me, and it helps me with the findings. And it's also friendly for the auditor: clean, clear, and there's the assessment. Hey, I'm incomplete, I'm red on a lot of these things, but I've got some confidence with the documents and the evidence that I've uploaded. So hopefully it's not an ugly baby anymore that we have to deal with. This helps the business. It helps you understand what risk you need to address. It helps you get that stamp of approval from the auditor and hopefully unlocks some business deals for you. You could take the compliance part away, have this help you out, and save a lot of time and a lot of money.

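The dashboard's percent-complete number and the exported JSON report can be sketched together. Everything here is an assumption for illustration: the control texts, the 0.5 match threshold, and the field names are my guesses at the shape of such a report, not Audit Caddy's actual schema, and simple keyword overlap stands in for the classifier-driven matching.

```python
# Sketch: score uploaded policies against framework controls, then emit
# a findings report as JSON. Keyword overlap stands in for the real
# classifier; control texts, threshold, and field names are illustrative.
import json

CONTROLS = {
    "CC1.1": "integrity ethical values code of conduct",
    "CC6.1": "logical access security software authentication",
    "CC7.2": "monitor system components anomalies security events",
}

def match_score(policy: str, control_desc: str) -> float:
    terms = set(control_desc.split())
    return len(terms & set(policy.lower().split())) / len(terms)

def build_report(policies, framework="SOC 2") -> dict:
    domains = []
    for cid, desc in CONTROLS.items():
        best = max((match_score(p, desc) for p in policies), default=0.0)
        domains.append({
            "id": cid,
            "status": "covered" if best >= 0.5 else "incomplete",  # the "red" items
            "confidence": round(best, 2),
        })
    covered = sum(d["status"] == "covered" for d in domains)
    return {
        "framework": framework,
        "percent_complete": round(100 * covered / len(CONTROLS)),
        "domains": domains,
    }

policies = [
    "Our code of conduct sets integrity and ethical values for all staff.",
    "Logical access to security software requires authentication.",
]
print(json.dumps(build_report(policies), indent=2))
```

The same dictionary feeds both views: render it for the dashboard, pretty-print it for the English report, or ship the raw JSON downstream.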
All right, let's wrap up here. Beer soon. You guys ready? Who's getting thirsty? I am. So, last but not least: if you're not first, you're last in AI. You cannot sit on the sidelines anymore. You cannot wait for the newest model to come out. You've got to test this out, for the sake of your company, for the sake of security, for the sake of our industry. We need to use it. Mileage is going to vary. Cost is going to vary. Push the vendors for compliance, push them for transparency, and test this out. There are use cases in cybersecurity defense, and we saw it in GRC. It doesn't have to be the ugly baby anymore, and we can use it to help.

All right, last but not least, some links. Scan my QR code here. You guys are the first to know about Audit Caddy, so welcome to the community. Feel free to reach out, check out LinkedIn, and come grab some stickers. Thank you, BSides. Yeah. [applause] Oh, yeah. Questions. Who's got them? Are you gonna walk around?
>> Okay.
>> You'll have to use your mic.
>> Yeah. Yeah, I can do that.
>> Mic on?
>> Testing. Yep.

>> Over here.

>> Okay. So, I have a few questions, I guess. Was this done on, like, NYU's compute and all that?
>> Uh, yeah. So I used a sandbox environment in Amazon, just an EC2 instance. I pay for it and did all the testing, but my students were able to link up and then use the RAG there.
>> Adding on to that: I don't know if you guys can do local LLMs and stuff, but I know Dartmouth has an AI lab. So could you alleviate cost, and also add some extra reassurance for your team, by running your own LLM, like the new open-source GPT models that OpenAI just dropped today?
>> Yeah. Yeah, check those out. Put them in a container and run them locally if you've got a fast machine and a GPU. All power to you. There are some freaking LLMs out there that can run on CPUs. So experiment, check them out, and yeah, shut your internet off and see.
>> I just got that idea because OpenAI literally just released them this morning, so I have to go check them out myself. And then one more question, more in general: does this take any substance away from those getting into GRC, or does it just augment junior associates?
>> Yeah, I think it's a tide that raises all boats, right? Like, we need help.
>> Yeah, it's a good selling point.
>> I'm not going down that route with CISOs and my products. It's just a helper: faster, cheaper ways to get your freaking certification. But also learn, right? And I can't force that. I'm a teacher. I'm like, "Hey, you need to educate yourself on what this control domain is." And the auditors are going to audit whatever policies you write. You're liable for putting those controls in place. So the GRC field, man, is a weird vicious circle, but they need some help.
>> Okay, thank you so much. I'm trying to learn GRC, you know, it's what I got at Dartmouth, so I'm trying to figure my way out.
>> Cool. Next question. Thank you. Yeah.
>> So if you're a SOC manager, what would be your top best-return-on-value SOC use case for this?

>> Yeah, I think the threat intel enrichment, what I've got with the Arsenal Forge. Start small, though. Start within your own container. Don't try to do it all; try to solve a quick use case. MITRE, right? Just upsert some data into a vector database, hook up a RAG, and see if it can automagically tag your security detections, your logic. Start there and then build. Then get your threat intel, hook it up to Slack, inform your analysts, and really, the possibilities are endless. Now, when it comes to agentic: stay away. [laughter] Don't automagically press the easy button and have it do everything. I mean, maybe test out some of the email, the spam, the phishing, for awareness, but start small first.
>> Hey guys, the last question.
>> Okay, my last question. So that slide, the roll-your-own slide: did any of the stuff you were talking about have to do with (I missed it) building and training your own models, or was it just using models plus RAG?
>> Yeah. So you can build; there are some really good foundation models out there, and they're always changing. Just try the off-the-shelf stuff first, and then augment: use a foundation model and then train. And yeah, it can pick up on learning. I get a lot of value from the prompt engineering, putting in some of those custom acronyms your company may use so it picks up on them, and then using the history functionality so it learns as it goes along. Another downstream thing you can do is reinforcement learning from human feedback. That's the thumbs up, thumbs down. But hey, that's all manual effort, and humans are going to human, so take it for what it is. I think that's the last question, but feel free to come grab some stickers, ask questions, and thanks again, guys.
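The "start small" SOC recipe from the Q&A (upsert MITRE ATT&CK data into a vector database, hook up retrieval, and see if it can auto-tag your detections) might look like this toy sketch. An in-memory dictionary and bag-of-words cosine similarity stand in for a real vector database and embedding model, and the technique summaries are paraphrased, not official ATT&CK text.

```python
# Toy version of "upsert MITRE data, auto-tag detections": a dict with
# bag-of-words cosine similarity stands in for a real vector database
# and embedding model. Technique summaries are paraphrased.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norms = math.sqrt(sum(v * v for v in a.values())) * \
            math.sqrt(sum(v * v for v in b.values()))
    return dot / norms if norms else 0.0

store = {}  # technique id -> "embedding"

def upsert(technique_id: str, description: str) -> None:
    store[technique_id] = embed(description)

def tag_detection(rule_text: str) -> str:
    """Return the nearest technique for a detection rule description."""
    v = embed(rule_text)
    return max(store, key=lambda tid: cosine(v, store[tid]))

# Paraphrased ATT&CK technique summaries (not official text).
upsert("T1110 Brute Force",
       "repeated password guessing attempts to gain account access")
upsert("T1566 Phishing",
       "malicious email messages with links or attachments to lure users")
upsert("T1059 Command and Scripting Interpreter",
       "execution of powershell or shell commands and scripts")

print(tag_detection(
    "Alert on 50 failed password attempts for one account in 5 minutes"
))  # -> T1110 Brute Force
```

From here, swapping the dict for a real vector database and adding a Slack webhook on the output is the incremental build-up the answer describes; the agentic "easy button" stays out of the loop.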