Using LLMs To Accelerate Threat Detection - Richard Finlay Tweed

Name: Using LLMs To Accelerate Threat Detection - Richard Finlay Tweed
Uploaded: 2025-01-21
Duration: 15 min 45 s

BSides Bristol · 2025 · 15:45 · 142 views · Published 2025-01

Speakers

Richard Finlay Tweed

Tags

Category: Technical

Topic: AI Security, Detection Engineering

Style: Talk

Mentioned in this talk

Tools used

Ollama, Open WebUI

About this talk

Security operations center analysts often need to quickly assess exposure to newly discovered vulnerabilities across heterogeneous infrastructure. This talk demonstrates how retrieval-augmented generation with large language models can accelerate threat detection by converting existing queries and runbooks across different data sources and platforms, while emphasizing the importance of human oversight and verification.

Transcript [en]

Right, who knows what a large language model is? Hands, please. One, two, three... okay, a couple. Retrieval-augmented generation? Okay, one. And who, entirely legitimately mind you, is actually using language models in their day job? Okay, that's more people than I thought. All right, so this is what you are going to take away today. This talk is not designed to tell you what to do; you folks are experts in your fields. It is to dispel some of the magic aura around these systems and show you a technique that I've seen used during several high-pressure incidents. You can also use this to get rid of a bit of the drudgery of making detections and making them

work with your various data sources. We'll start with the scenario. You're a security operations center analyst who has just found out that a widely deployed system has a debug API that's unauthenticated and on by default if it was turned on in the last month. You're going to need to know how many instances of this are affected. You've got a few tools at your disposal: you've got a CDE asset inventory system, but you haven't used it before; you've got some runbooks with sample queries for that system; and you've got an approved language model. If you're wondering how to approve a language model, chat to me afterwards, that's a whole other thing. So

how would you start analyzing your exposure to this threat? Everyone has a different approach. For this scenario, I'm aiming to equip you with another tool in your arsenal that you can use in situations like this. But before that, I will describe the building blocks of the tool. I'm glossing over a few details, but this is all the language model is: that next-word completion that you've got on your smartphone's keyboard. It's a souped-up version of that. Now for another word salad. Let's say you want to make some banana bread. You're going to ask some language model service, whatever it is, "How do I make banana bread?" It's going to go over to some sort of database, usually a

vector database, which contains cookbooks. It'll have various different recipes, and it will attempt to find you the most relevant ones. Maybe instead of banana bread it's fig bread or something. It'll return that to whatever service you're using, and then that service will wrap it all up as "How do I make banana bread? For context, here's a bunch of other recipes," and send that over to the language model. The language model will then send back something like, "You will need the following ingredients: bananas, bread, hopefully not glue." Does anybody have any questions on this? All right. So we've got these new tools; let's actually work

out how badly affected we are. For that, we're going to want to know how many systems are running the vulnerable version, and how many of those were started in the last month, because that's when the API got turned on by default. When you look through the runbooks, you're able to find a query for the orange vendor that checks for vulnerable versions and "was started last week". You can easily change this to the past month and the relevant version. But you're not just on orange vendor, you're also on red vendor. So how do you change it to work against

both of them? You can actually use the retrieval-augmented architecture, without extra databases or anything else, to do that conversion for you. Now for a live demo. This could go horribly wrong, but we'll risk it anyway. In my previous role I had a variety of excellent tools at my disposal. That's when I submitted this talk. I've since changed job and I don't have those tools, so I had to build my own synthetic version. Because of that, we have a database. Please do not try and read this stuff; it's all synthetic data. The key takeaway is that for this example I have 25,000 simulated machines, booted from 173,000 different disk images. I know,

these laptops are powerful, right? Anyway, we are not going to be able to go through by hand to find out how exposed we are to this threat. Now we need to try and open the other tab. All right, here we go. Here is a language model. It is running on this laptop, because I do not trust conference Wi-Fi to actually work in my time of need. This is running Llama 3 via a tool called Ollama, and Open WebUI. There are loads of different tools for this; which one doesn't really matter, it's just one example. As you can see, it's feeling really cooperative this morning.
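The demo drives the local model through the Open WebUI chat window. As a minimal sketch of scripting the same kind of local model instead, the snippet below posts a prompt to Ollama's HTTP API. It assumes Ollama is listening on its default local port (11434) and that a model tagged `llama3` has been pulled; the helper names `build_request` and `ask` are illustrative and do not come from the talk.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3"):
    """Build the POST request for a single, non-streamed completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="llama3", timeout=120):
    """Send the prompt to the local model and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Requires a running Ollama server; the output varies from run to run.
    print(ask("Reply with the single word: ready"))
```

Keeping the request builder separate from the network call means the payload can be inspected or logged before anything is sent, which is useful when prompts contain internal schema details.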

Anyway, I have a prepared prompt which I'm going to attempt to use to convert the queries, and while it runs I'm going to show you folks what it is. This is what I was describing earlier. I'm telling it: using the data below, make me a query for the type of database that I'm using; here's the affected version; here's the time frame I want you to use. Please only use fields that exist. This will not stop it making stuff up, but it does sometimes help. Please don't actually use a real vendor's name, because I'm trying to avoid that in this talk, as I don't know who's in the audience. And I'm telling it: here is an

example SQL query for the other vendor, please make it work with this. Then I'm telling it: here is the SQL table of instances and here are the fields that you have available, and the same thing for the machine images. It also sometimes got a bit confused about how to do dates in SQLite, so I've included a bit of the man page, like, "Here, please read this." Anyway, it has now come out with a query, and we're going to try it out and see whether it works. So I go over here, I paste this in, I click run. It doesn't work, because of course it

doesn't, and I cannot see for the life of me what it actually is doing on screen. So I'm just going to run the SQL query that it made last night, which did work, and it did find 1,785, which is roughly the amount of synthetically generated real data that's in here. Now that I've run the live demo, let's go back to my slides, which of course are nowhere to be found anymore, but that's fine. Oh, whatever, it doesn't usually matter. The next slide is supposed to be next steps, but again it's not showing up. While I try and fix that, who has questions that I can use to dig me out

of this hole? There we go. All right, so you've now seen this. What do I suspect some of you are going to do off the back of this? I'm hoping that you're going to use techniques like this to create updated detections. You're going to take your existing runbooks and go, "Oh, I want this slight flavor on it," particularly if you're not great at SQL, like me. As an example, you might have detections for Microsoft-like, process-based things and you want to change them over to be Linux-based. You might also have queries for Splunk and you want them to work against your Kibana instance, or vice versa.

That tends to work fairly well. I'm hoping that you'll be able to use these techniques to accelerate time-pressured investigations as well. Sometimes you're just like, "For Pete's sake, I can't get this join to work," and in that case this will help. You may not get it right first time, but at least it gets you closer to something working. You're also going to find out these tools are extremely limited. They are not all-knowing; please don't assume that they are. They will not understand your infrastructure, they will not know what data you have available, and they sure as heck won't know what you're calling it. Give it the information: detail what

it can use and what you're expecting, and you're going to get much better results out of this. For example, tell it the table schemas and tell it what this stuff means. As you saw in the prompt, it didn't necessarily understand that instances and machines are synonyms in this context, but by adding that into the prompt you can get better quality out of it. Also, language models present text with confidence, inevitably. They're a bit like your drunk uncle. Take care with them and verify what they say. The best way to use these tools is to have a human over the loop, like this: get it to spit something out, try running it, and check whether

it did what you expected it to do. If it didn't, then you can refine it. If you want to know some of the limitations of language models, OWASP already have a top 10, as they always do, but it is actually quite good, so have a look at that. Just to recap: this was a toy example to show you how you can use your existing detections to make new ones, and how to quickly widen your detection coverage with that. There are some other useful resources. This is a blog post I wrote where I go into more depth on roughly what I showed here, but using CloudQuery's data sets, before they, well, stopped being any kind of open source, which

is really annoying, but there is a fork anyway, and using that with ChatGPT, which will actually do the whole retrieval-augmented generation thing for you rather than you doing it by hand. And on the right, if you are considering making tools involving language models, I talk through some of the architectures and pitfalls that you might encounter. Who has questions? Yes? "Have you considered putting a bunch of stuff into the system prompt, so that you can reduce how large, you put restrictions etc. into yours, you put across all the queries?" Just to repeat the question for the recording: the question was, rather than putting everything into the prompt, why

not add it to the system prompt, which is the thing the language model gets told before your actual prompt? The performance of that depends on the model. As I said, it's essentially trying to guess the next word, then present that, then run itself again, and do that over and over and over again. For some models, putting stuff into the system prompt works better; others just don't care. Some models aren't actually trained for conversations, they're just trained on text completion, so in that case it doesn't matter whether it's "You are a system that should only respond with the truth" or just "Based on this information, tell me the truth". To

the model, it's the same thing. But yeah, it's a technique that you can use, and I would recommend it. Depending on the model, people have found things that work well for those models, and yes, putting those into system prompts is a good idea. Any other questions? All right. "Have you looked at agents [unclear] as well, so that you can start chaining them together to generate your output, your Splunk output, KQL output, all that?" Yes, I have looked into that one. There are some really cool pieces of research on agent and tool-based workflows, where essentially you can just [unclear], you can do the equivalent of strapping guns to [unclear] with this stuff. One example I saw at BSides Prague,

there's a recording, and I thoroughly recommend people look at it. They built essentially a dojo, which is like a simulated lab environment where you hack stuff, and they made it so that humans can click buttons and hack the system, or you can just get a language model to do it. The behavior of the people and the language models was pretty similar, and the models generally managed to navigate their way through, including a fun incident where they didn't set the IP range right and it was Nmap-scanning a bunch of stuff outside, and they had to quickly shut it down. There are plenty of things there. I would recommend reading up on the limitations of this

stuff. There are some really cool things that can be done, but to put it another way: for a long time, a lot of the documentation on the internet, if you searched for how to unmarshal JSON in JavaScript, would just tell you to execute it, and stuff like that is in the bowels of these models. Treat them with a bit of care. But yeah, you can do some really advanced stuff with it as well, especially if you want to do it for fun. Right, time for one final question. Yeah, if you want to try... got one more? One more? Well, if anybody wants to ask me further questions, I'm going to be about all day. Come over, ask around,

especially around using models in production; I have some experience of that now. Thank you very much.
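The hand-built prompt and the "run it and check" loop described in the talk can be sketched roughly as follows. This is an illustration under stated assumptions, not the speaker's actual prompt or database: the table and column names (`instances`, `machine_images`, `software_version`, `started_at`), the version string `2.4.1`, and the helpers `build_prompt`, `make_demo_db`, and `try_query` are all hypothetical.

```python
import sqlite3

# Hypothetical schema; the real tables from the demo are not shown in the transcript.
SCHEMA = """
CREATE TABLE machine_images (id INTEGER PRIMARY KEY, software_version TEXT);
CREATE TABLE instances (
    id INTEGER PRIMARY KEY,
    image_id INTEGER REFERENCES machine_images(id),
    started_at TEXT
);
"""


def build_prompt(schema, example_query, affected_version, window_days):
    """Wrap the request with the context the model needs (retrieval done by hand)."""
    return "\n\n".join([
        "Using only the data below, write one SQLite query.",
        f"Count instances running version {affected_version} "
        f"that were started in the last {window_days} days.",
        "Only use tables and fields that exist in the schema. "
        "'Instances' and 'machines' mean the same thing here.",
        "Example query written for another vendor's system:\n" + example_query,
        "Schema:\n" + schema.strip(),
    ])


def make_demo_db():
    """Create a tiny in-memory database standing in for the asset inventory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO machine_images VALUES (1, '2.4.1'), (2, '2.3.0')")
    conn.execute(
        "INSERT INTO instances VALUES "
        "(1, 1, datetime('now', '-1 day')), "    # vulnerable and recent
        "(2, 1, datetime('now', '-90 days')), "  # vulnerable but too old
        "(3, 2, datetime('now', '-1 day'))"      # recent but not vulnerable
    )
    conn.commit()
    # Generated SQL is untrusted, so block writes before running any of it.
    conn.execute("PRAGMA query_only = ON")
    return conn


def try_query(conn, sql):
    """Run a candidate query; return (True, rows) or (False, error message)."""
    try:
        return True, conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        return False, str(exc)


if __name__ == "__main__":
    conn = make_demo_db()
    # A made-up column name, the kind of mistake the talk warns about.
    print(try_query(conn, "SELECT COUNT(*) FROM instances WHERE version = '2.4.1'"))
    # A query that uses only fields from the schema.
    print(try_query(conn, """
        SELECT COUNT(*) FROM instances i
        JOIN machine_images m ON m.id = i.image_id
        WHERE m.software_version = '2.4.1'
          AND i.started_at >= datetime('now', '-30 days')
    """))
```

When a candidate fails, the error message can be pasted back into the next prompt, which keeps a person reviewing every step. A query that runs without error still needs a human check that it counts the right thing.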
