
hello everybody and welcome to bsides um for this session we will have Scott J Roberts talking about building effective machine augmented intelligence uh slides will be collected over slido uh you can get details for slido at bsides SF uh.com qna uh submit questions there we'll be uh taking them once the talk is over all right take it away thanks Devin uh Daman I was gonna get that right eventually um I was just actually telling Damen I finally found a way to not be nervous before a talk which is be doing your last minute review and realize you forgot a slide it's on your laptop that's back in your hotel so I have to turn around run over to your hotel get
your laptop come back fix it and then move forward so uh that's where we're at right now if that wasn't enough I've got this light in my face and I can't see any of you so as far as I'm concerned it's just Damen and I having a nice conversation so uh this is Skynet the CTI intern I'll talk about why I got to that in a second uh first of all the obligatory who am I uh my name is Scott Roberts I am the head of threat research at interpress uh no it's not interpress it's not interpre if you're French I'll let you get away with inter pre but that's about it uh I've had a 20-year
career spanning um incident response threat detection and cyber threat intelligence uh I've worked some places you might have heard heard of isn't that great uh I'm also getting a master's in uh anticipatory intelligence at Utah State University so for the bsides folks the reason my slides were late sorry finals uh and at the same time they also let me teach cyber security at the same time at Utah State which is a lot of fun so I want to start by actually setting expectations and I'm going to start with what this talk is not so uh I am not dropping any new cool found found AAL models for cyber security uh I asked my bosses for a few hundred million dollar
in A1 100s they wouldn't give them to me so I can't generate any new models uh I'm not releasing any software or products and I'm also not going to do a series of perfect prompts so I'm I'm guessing there's some folks who are working with AI in the audience you may see some things where you go hey there was a way easier way to have done that or you could have done this or that please come talk to me afterwards this was an experiment this is a series of experiments and I'm still going through them so I'd really like it if uh we could have that conversation because I want to do these things better so what we are going to talk
about though is the pros and cons of trying to take all these new tools that are going on and integrate them into real world workflows that I was dealing with and I think that's actually a really good metaphor for kind of the entire set of talks which is this this idea of setting expectations um I have the weird place um again being around students I get to see how they use large language models a lot and I get to kind of um be a voice in the room and it's it's funny because I feel like half the time I'm the guy who's going it's not everything you want it to be it's not going to solve all your
problems it's not perfect and then at the same time I end up being the guy who's going well no actually it can do a whole lot more than you think it can and take take some time and work with it because you know I I find the people who think that these tools are going to do everything or can't do anything are generally people who haven't spent a lot of time with them and actually use them so that's what we're going to do um most of the stuff I'm doing I just happen to use open AI because it's what my company uses nothing we're doing is particularly specific to one so if you're saying hey
I'm an anthropic shop uh I think a lot of this stuff should still apply so we're going to talk about the problem we're going to talk about kind of the use cas I have and then walk through what got us there and you know we'll have a little conclusion at the end so the problem um interpress this is not an interpress ad um is a continuous threat exposure management platform uh if you want to know what that means come find me afterwards uh but the result is as the head of threat research my job is to try to understand every threat actor out there for those of you who've been following the news there's a lot of them
right now it's a very busy job and so so I did the default thing which was I sat in a corner and cried for a while at how much reading I had to do and then I pulled out a python prompt and started automating things and that's what got us to where we're at now and in particular a lot of what I'm trying to do is build miter attack representations of content miter hasn't gotten to yet um love the miter attack team it's a fantastic project they've done so many cool things but they also have to read all this stuff and they also have a lot of time to to go through it and so I need to do
it a little bit faster than they do so I'm automating but before I started automating I had Chandler um just so you know it's not the costume character dog up in the top Chandler's the guy down here at the bottom um so he was my former uh threat engineering intern at interpress uh there a Utah State senior runs our cyber Security Org absolutely fantastic and and honestly having him around as I was getting a lot of this stuff going was super useful because there were tasks I understood but didn't have time to do or was getting pulled a different direction and I could say hey Chandler can you go to XYZ or can you double check me on
this or can you work through it and so when he moved on to do other things as you know college kids are want to do I had to find some solutions so again uh being in a master's program I defaulted to a literature review and and I found three resources I'd really point you at so not surprisingly most of the cool work that's that I'm seeing at least from you know llm providers are are coming from Google and Microsoft and so uh Google Cloud put out this fantastic uh post about the five phases of threat intelligence and AI um came up with some good ideas based on that but not a lot that was really like directly
applicable uh Google uh again put out this supercharging security with generative AI again an interesting read but it was more for the links than it was for any particular like Insight I took out of it but this one uh from Thomas I'm never going to get his last name right so I'm not going to try um applying llms to threat intelligence an absolutely fantastic uh blog post and when I say it was fantastic I mean it was on medium behind a pay wall and I signed up for a medium account and paid for it just so I look at this post again so at the very least it was worth five bucks uh but he does a
really great job of going into uh a bunch of these you know really key ideas prompt engineering few shot prompting uh rag in in actual code and he does show you what he actually did so you can recreate it and build off of that so super super useful so now that I had some ideas I kind of looked at what were the problems I was having what were the things I was doing what was the stuff I used to ask Chandler to do and I started coming up with use cases and the first one was was pretty simple summary generation and and so in each of these cases we we kind of came up with a theory and said hey we
think AI might be able to help us with this problem what success criteria looked like what's the problem and then we experimented with it so in this case pretty straightforward we're pulling in a lot of news articles from different vendors maybe some of you are in this room in that case thank you um but often you know the hard part with these things is you don't know if they're going to be relevant to you until you've read all the way through them so we wanted to put better summaries of these articles out there so that you know we and our customers could evaluate whether or not a particular uh threat or campaign was relevant to them
so Microsoft put out this post a couple year ago or so about uh the zerob bot um stressor super cool and then we ran it through our original nltk based uh summarization and got that now I don't know if you've ever disappointed a designer but it's bad um they they they don't love it when you decide to take their nice you know half page text block and instead put in a bunch of sha 256s so okay perfect example of where we could we could go with this so uh I was I was actually really excited when I got to give this talk knowing it was a movie theater because I could actually put up screenshots people
will be able to read uh so in this case uh we we use a a pretty easy little scraping Library uh to to get the article we pull the text out uh and we use nltk to generate the summary this is the same way we did it previously and as you can see it's giving us sha 256s all over the place not really ideal so little bit of basic you know straight off their main page um walk through of how to do prompting for open AI just in Python and so in this case we're giving it a couple you know really specific things so we're giving it the prompts summarize this into one paragraph ignore
indicators of compromise security vendor tools and contributors by name sorry if you were one of the people who wrote the zero shot post the other thing that helped a lot was was tuning around this idea of these system messages so in this one we say you are a cyber threat intelligence analyst with a concise Bluff writing style I hope everybody knows what I mean by Bluff bottom line up front that is the way you want to read the write these exact kind of things it makes your reading easier it makes customers happier fantastic so go ahead let's generate the summary I'm not going to read this to you out loud but as you can see at the
very least it's not going to make my designer cry so big win all around so what's our take on the first one well we experimented using programmatic summary generation from the open AI conversational endpoint and to be honest this was an easy yes we moved all of our automation over to using this for summary generation so big win right there oneoff data generation so what do I mean by that uh if you're doing cyber threat intelligence or or really I mean any type of security you're going to come ac across a time when you need uh a particular piece of data that's you know different or a oneoff uh in particular when you're automating these things you
know you may need a new definitions file uh a new way of you know a list that you're going to scrape against things like that and those were the exact kind of things I'd go to Chandler and say hey I need all of the names of you know Microsoft's new threat groups because I need to map those over to something great in this particular case we ran into what are called Denon demonyms demonyms uh which I did not know was a name for a thing that exists but that is the idea of what you call somebody from a particular place and so in this case we were looking for examples where uh a particular you know
country was being mentioned and that worked really well as long as people said China but if they say Chinese we were missing it so we needed a way to map those things back to the iso 3166 3 standard that we were using to note references to countries in content so this is a one-off I only need to do it once I'm not going to go you know build a whole automation around it so I went straight to security Co or to co-pilot and said hey uh generate this for me and it did it right for five okay not ideal um Can can I have more please and it generated a lot more great everything looks good I scroll down to
the bottom uh and it stopped at M okay can I have some more please and eventually we get to you all right no please I would like them all and I finally get all the way to Zimbabwean so okay um you know throw that at a text editor it's a one-time thing uh I go through double check make sure the ones that I'm looking for are are actually there and actually match up with the value on the iso uh 3166 standard and we we've got what we need so it got integrated in was used as an automation somewhere else so for one-off stuff do these tools really do what you want them to do sometimes sometimes with work in this
case did it speed me up absolutely I would not have gotten this done nearly as fast if I was trying to look all these up and as far as I could tell the best place to get this information was literally going to each Wikipedia page I was not going to go to the world's Wikipedia pages and pull out these values and line them up so I'd count this as a win but let's do something trickier so I told you right off the bat we are doing everything with miter attack and so one of the interesting tricks and one of the things that I know miter team spends a whole lot of time doing is identifying miter attack
techniques that they see in documents okay so let's give it a try because right now what we're doing is a very regular expression based approach and so in this case you know we'll say um gatekeeper bypass and we're looking for examples of gatekeeper bypass in the text that we're we're analyzing so can we do something better than that uh Folks at Huntress actually do a really nice job of giving some relatively clean ones that make a good example for for this and in this case they actually uh give you the mapping although not like in a table or something you could easily download or input so let's see see if we can clean it up this
way so back to our our open AI prompt but at this point we've changed a couple things so the prompt now says extract the miter attack techniques from the given text in a Json format of and I kind of gave an example of what I wanted it to look like and then I changed the you know system Ro to you are a cyber threat intelligence automation outputting Json data and we get some stuff back okay so let's run a test against it and see if they uh nope they are not in there it is it is coming up with with with things that we are not finding the ones that we're looking for so I I
thought it would be easier to do this and give it as a table so in this case I have on the you know left hand side here the tech attack techniques it said were there so in two cases we had uh one and a half true positives it identified uh t11 05 correctly and it called out t159 but if you go back and look at it we didn't actually have T1 059 we actually had the sub techniques of T10 59 instead so uh 1059 003 so not quite what we were looking for and then it kind of threw in a couple other techniques that we've never seen before okay not not ideal but it was
kind of an unstructured approach I've wanted to play with Lang chain which uh is just a a automation tool for working with these llms let's see if that can get us closer to what we want so in this case I move into Lang chain which you know I was excited about because it makes you kind of large language model agnostic you can swap in and out different ones and I give it this idea of of what's called a a tool mode where you can essentially structure out what you want it to Output which will be more useful anyway cuz taking arbitrary text and kind of scraping it to get the techniques I needed was kind
of undoing everything I was just trying to do anyway so in this case I say all right I'm going to create an object I want it to have the technique name the sentence that it matched on and I want to have the technique ID and then Lang chain changes a little bit about how you give direction in terms of what you do so I say you are an expert attack extraction algorithm only extract relevant information from the text if you don't know the value you're asked to abstract just return null I'd rather it not give me bad data and then I give it the format they should be in and I don't know how to spell
should I got into computers because spelling was so bad so you know I'm just leveraging the computer and so in this case uh this is what I get back well that doesn't work now again this is where I say maybe you're like the developer of Lang chain and you have some ideas on what I did wrong please come talk to me but that's not going to work for me at this point so let's try again because I I it seems like the automation might be the thing getting in my way so I go back and start prompting directly and so in this case I literally just paste in the uh information from the post and I say extract the miter
attack techniques I give it the format that I want oh hey that's actually correct so okay what if I ask you to give it to me as Json that that works and so if I do this uh same thing again you'll see my uh found and my true positive are in perfect alignment that's pretty good now what do I take away from this well clearly I'm doing something wrong and how I'm prompting it or or how I'm doing it at least programmatically um I I go back and change the formatting a little bit just for my own uh edification but I'm going to count this one a success because you know ultimately speaking we ended up getting in the end
what we were looking for it just didn't do it quite the way we wanted it to so uh rather than claim you know I'm calling it a mixed success if you're a board gaming or a a tabletop RPG fan that might mean something to you you um and I'm going to say this one needs some continued work you know the original Call of the open AI conversation API didn't work Lang chain tool mode I don't think I was using right but we did show that in prompting it actually did work as we expected so we're going to have to keep working on it but let's do something even more complicated so in this case sticks to
merging so we very often run across situations where um we are creating miter attack or a content for attack that miter hasn't gotten to yet well here's the fun part miter eventually catches up and so we end up in this situation where we'll have two objects one that includes our data one that includes their data that are supposed to mean the same thing and so something I've honestly been putting off for a long time is the idea that eventually we're going to need to reconcile this data together we're going to need to put these things into the same form at so that's really the original problem we we've had to do this in a handful of
cases and the answer unfortunately is just a lot of manual effort so it's a lot of readjusting relationships it's a lot of deprecating old things adding new things uh but the worst part is basically taking one intrusion set object and another intrusion set object and going okay which thing do we want from which one and how do we put it together and which one's more accurate and all of that so I go into my trusty jupyter notebook and I create uh two intrusion set objects that's what miter attack calls a threat group and I do one for ap28 have a lot of good information about what that is and then I create one for Forest blizzard because Microsoft
just had to go and change all of their threatened actor names and now I've got to handle that so same format mostly the same amount of information you'll see that the ap28 one has some things like secondary motivations that the um micro or Microsoft based one doesn't there's slightly different sets of dates different goals okay so I go in create another prompt and in this case I can't quite read it because I left it a little bit too small uh but ultimately I essentially say I'm going to give you both of these objects I want you to merge them together and it turns out it does okay this this was this one actually really shocked me I did not
expect this was going to work quite as well as it did so it turns out it actually picked up a couple the values incorrectly it it it it duplicated some things but it at least merged them together into a valid sticks bundle close not what exactly what I'm looking for so let's keep playing with it so in this case we come through and uh prompted Again by tweaking to say uh I only want one object and I need you to put the um values together a little bit more efficiently at which point it gives it back to me now I'm skeptical because if you've ever played with miter attack it is incredibly finicky it is a Json
standard that's based on an XML standard if that doesn't scare you right off the bat you haven't played with XML that much but the result was the the first thing I did was basically say can I parse this as valid Json because that would be the first thing that would get wrong and it turns out absolutely we get valid Json out of it then the part I go to is what I'm doing right here which is I say I want you to parse this into actual sticks too so now not just is it valid Json but is it also valid sticks too and it turns out it is now Microsoft's gone and solved this problem for me since then but the point
of being able to show that we can take two really complex abstract uh pieces of information and have it correctly put them back together in a way that's still valid and Parable at the end I got to be honest I did not expect this one was going to work and it did so the the experimentation was doing Pro programmatic merging of sticks two objects using open Ai and it seems to work I don't know if it works enough yet to be able to put it into production but I I was really surprised at how effective it was uh I'll give you a bonus one that really kind of is throughout this entire thing um another really great AI use
case if you're not already playing with it is code assistant so uh I built almost all of this using uh GitHub co-pilot uh including the fact that those two stick two objects I just showed you I didn't write any content for it gener ated all of it for me and it's pretty good I'm a big believer that while not everyone in security necessarily needs to be a developer being a developer is kind of like a superpower you can do everything faster you can get more done you can be more verifiable all of those things and so if these types of tools like GitHub co-pilot are able to make mediocre developers like me able to be a lot more
efficient and effective that's a really great place to be so uh highly recommend playing with that as well so what's our summary summary generation was successful and is now operational oneoff data generation was mostly successful and I'm still using it as necessary uh attack technique extraction was a mixed success we're going to keep experimenting with and sticks two object merging was unexpectedly successful and might get operationalized relatively soon so my point with all this is this didn't even meet my initial expectations uh and I'm really glad it didn't because what it showed was the experimentation was worth the time again I I've I've had people all over the place who think they know what these tools are going to do
but it's honestly so much easier and so much more effective to just take the time pick a workflow and see if it'll be effective rather than speculate because the speculation on one hand is that we're never going to do anything with this and the other one is that it's already taken over the world and the answer I'm sorry somewhere in between so what about Chandler he's still at Utah State uh he has another internship this summer uh I I tried to replace him by AI I don't think it necessarily worked um but the other thing it did show me and something I want to just say to to this audience is he is a senior in College we do
thread intelligence work you know generally speaking that is not a place you see straight out of of school kids going into and being effective and I just want to put out there that Chandler was remarkably effective to the point that I just spent three months trying to replace him and kind of failed and I say that just to say that like this next generation is really doing some amazing things they've got some incredible skills um and I highly recommend taking the time to you know work with students or you know bring in uh someone new understanding that you know they might not have all the skills you expect right off the bat but the amount they can pick up and the amount
they can do is impressive so in conclusion llms can't really do everything but they are great for certain specific tasks uh I really recommend you know avoiding speculation and focusing on experimentation uh and if at first you don't succeed try different stuff thank you
you all right everybody we have three minutes left uh we have Q&A available through slido you can go to bides sf.org qna to submit your questions uh I'll be relaying them to our wonderful speaker Scott here uh and if there's anything we can't cover you will be able to catch up with Scott up in the mezzanine after this talk is over first question we have uh through sidu is if you had to do this again would you use an llm to manipulate Json or use a separate process to combine or format llm output yes I I mean I think it depends on the the given task for for what I was doing I think again I was I would have
normally said I didn't think it would have worked as effectively as it did so given the fact that it did work I guess I'll say I'll let the llm do it CU why reinvent the wheel if it seems to be working again every every time I tested the output I kind of was going this is where it's going to break and I I kept being wrong in this the sticks 2 merging example so kind of depends on the task next question um have you tried automatically extracting ioc's from threat Intel reports with llm so I have not done that uh only because ioc's tend to be across really defined formats and so you know looking
for a sha 256 like that's an easy thing to put a Rex in a address is a good thing to put a Rex in you know where I was really going ultimately with the attack technique extraction is the fact that you know you can describe a fishing email as a fishing email a malicious email an email with a malicious attachment all of those things are trying to describe the same thing and I'm trying to find ways to extract those types of relationships so um at the same token you know every time you introduce a new llm element you're adding latency you're adding time you're adding cost cuz you know to pay for your number of
tokens and all that so that is the exact kind of thing I would instead really focus on using something that would be a lot faster like a regx ah our last question in from roit car uh what's going to be your favorite demo of 2025 if this keeps going as well as you have so far wow that is a great question um so I I'm really fascinated with the idea of trying to use these types of tools to build uh like breach and attack simulation stuff so if I could take something like this and take a threat you know report and then be able to say here's the atomic red team data to be able to run a simulation of this
particular thing without having to wait for someone to generate it for me uh I think that would be a really cool thing to be able to show all right uh I think that's it for the Q&A uh everyone please give Scott a wonderful Round of Applause fantastic top