
Advanced Threat Modeling with GenAI

BSides Philly · 2023 · 20:43 · Published 2024-01
About this talk
Explores how generative AI can accelerate threat modeling by automating data-flow diagram generation, triaging tool outputs with context, and improving mitigation recommendations. Covers risk-based threat modeling challenges in agile development and demonstrates proofs of concept using GPT-4 and PyTM to reduce analyst effort while maintaining actionable security insights.
Original YouTube description
Vladimir Fedotov Advanced Threat Modeling with GenAI The presentation aims to showcase how generative AI can help application security experts overcome challenges in threat modeling. These challenges include limited time for comprehensive threat modeling due to agile development and the overwhelming outputs from threat modeling tools that lack context. The presentation will cover the essence of risk-based threat modeling, the challenges faced by modern threat modeling, and how generative AI can provide a solution. It will also include a POC. Bsides Philly 2023
Transcript [en]

Okay. Good. Awesome. Hi colleagues. I hope you can hear me well, because I cannot even hear myself, so it's a little bit challenging, but I hope we will have fun. Okay, I see your hands. So let's run a quick test: who has been doing threat modeling in the last couple of years? Yeah, you, if you cannot hear me, you'd better grab a headset.

Yeah, unfortunately. So, okay, let's get back: who has done threat modeling in the last couple of years, at least once? Okay, that's awesome, great audience, so I hope you'll have some fun. Let's go. Today we'll be talking about several things. We'll set up some expectations about the talk, do a quick recap for people not familiar with threat modeling, discuss advanced threat modeling, and then jump to the interesting stuff: what generative AI can do to help us with threat modeling. So first of all, expectations and a disclaimer, guys: don't be too excited. This talk is not aiming to deliver a production-ready solution for you. It was mostly done with "what if" in mind, and just for fun. It might end up in production eventually, but so far just take it as a what-if solution.

Second, uploading confidential data to any kind of provider is your own security risk, so evaluate it. I have been using GPT-4 via chat; the same functionality is available via the API, and eventually it will be available via Azure, so it's somewhat safer, but still evaluate your own risks before using anything in this deck. And finally, "advanced" is subjective, so don't judge me. And very finally, I'm really looking forward to your criticism. Please prove me wrong: catch me after the talk and tell me why I did bad things. I'm really excited to hear critique from you. Anyway, so what is threat modeling? There are a bunch of definitions for this, but I mostly like the definition from the OWASP cheat

sheet. It basically says that threat modeling is a process to gain actionable insights into the security state of your application. The main thing is that it's actionable and repeatable, so I like it. There are a lot of methodologies, tools, processes, and frameworks you can use for threat modeling. Some of them are on the slide, but in general we're answering four main questions. What are we assessing? That's the scoping, decomposition, and so on. What can go wrong? That's the threat identification part: threats, vulnerabilities, validation, etc. Then preparing mitigations: what can we do to improve our security and address what can go wrong? And finally, did we do a good job? That's the feedback loop, re-evaluation and reassessment. This comes from the Threat Modeling Manifesto. Very common stuff, and it applies across all threat models you may potentially do; if you're not doing this, you probably have to improve your process.

So, what is advanced threat modeling? Based on my experience using all these processes and methodologies, you can move along several dimensions: by amount of effort, from complex to something simple and efficient, and by result, from something noisy, useless, irrelevant, and generic to something useful for developers, because the whole goal of our work is to deliver results to developers and be useful. Our target, in general and for this presentation, is to be somewhere in the top right corner, to

do something simple but at the same time useful for the developers. Now, there are challenges with threat modeling. There are a lot of them, but two stand out for me. First, there's limited time and resources to deliver threat modeling due to the speed of development, you know. The developers, architects, all the teams have limited time to deliver their work, and now we add threat modeling on top of that, plus other scans and all the other application security work. So it's a lot of pressure, and the time is limited. That forces us and other people to use more tools, and the tools have their own challenge: they're too generic, they usually lack context, and they deliver, you know,

kilograms (or what is it here, ounces?) of irrelevant threats, so you spend all your time triaging rather than threat modeling. At this moment the question arises: can generative AI help us? So, no, it cannot. I'm just kidding. It's obviously coming after you and it will take your job eventually, probably very soon. But jokes aside,

let's test some theories. What can generative AI do for us? It's a pity you cannot see the cool pictures on the right; they are there, believe me, just thumbs up, thumbs down, and so on. Basically, there are several theories for each phase of threat modeling: that an LLM can generate a description of the architecture for us based on a picture; that it can generate a data flow diagram based on a description or on another picture; that an LLM can help us triage auto-generated findings, fine-tune them for our needs and our context, or generate them for us completely. The same with mitigations: it can take whatever is out there and tune

them for our context, or generate completely new mitigations for our specific case. And finally, it can help us evaluate a threat model. I ran all these examples, so let's review them together and you can judge whether they're reasonable or not. I'm personally excited, you know, it's crazy. So the first example:

generative AI generates a description of a data flow diagram. I spent one minute on this: I just picked a data flow diagram from some threat modeling document, dropped it into ChatGPT, and asked it to generate the description. It picked up all the elements and described what they are doing. It's insane, but you know, it's not very helpful. What is really helpful is to generate data flows. I spent a lot of time doing threat models by just taking different architecture diagrams and putting them together, and with more and more automation, diagram-driven threat modeling is going away. We need a more robust way to store information about the threats, and that's not a data flow picture; it's some kind of

tree or whatever, so we need data flow as code. I picked the PyTM framework as an example of threat modeling automation, but you can use any other tool and it will work more or less the same. Oops, this diagram should have been up while I was talking. Anyway, you can pick up a random diagram, like your own AWS diagram, put it in the chat, and ask it to generate PyTM code. On the bottom you can see it: the AWS diagram goes to the GPT-4 model, which generates PyTM code, which produces the data flow plus all the threats, and it would work with any other tool. For me, it's exciting. Yes, it made a few mistakes. If you take a look,

it assumed a trust boundary wrongly, and it didn't pick up a reverse data flow. But still, I spent only a few minutes doing this, and tweaking this kind of diagram is much faster than producing it from scratch, even in Python. So it's a big thumbs up. The next one is threat triage. Okay, I received this PyTM code and, guess what, it produced tons of threats. For example, it recognized Cognito on the diagram, defined it as a process, and based on the rules, I received 34 threats identified for Cognito. As a security expert, you can say they're irrelevant, but come on, you don't want to spend time describing why they're irrelevant. So why

don't you ask generative AI to do it for you? And you know, it did a pretty decent job. If you take a look on the right, it picked up each threat, helped define whether it's applicable or not, and provided an explanation. Yes, it's a question of prompting, but it still did a pretty decent job. Basically, 34 threats became 1 applicable, 14 partially applicable, and 19 not applicable. Actually, it picked the applicable one wrong: you can see it is flooding, and I'm not sure how flooding applies to Cognito, which is a managed service. But again, it's easier to check this than to start from a blank page. And the beauty of this is that

you definitely don't need to use ChatGPT or anything with a chat interface. You can use the API directly and integrate it into your tool, and it will happen in an automated way. So, big thumbs up. Another example is a single threat: the previous one was a batch review, but you can do basically the same with individual threats. Just throw in some context about your application, throw in the definition of the threat, and ask it to define whether it is applicable or not. A pretty decent job, again. The next one: contextualization. PyTM is a great tool, but it produced some generic threats for me. Let me explain. This is not the AWS example; it was some random process element. I just

put a description of a Spring Boot application and an API specification into the description in the PyTM code, and then asked the ChatGPT API: hey, can you please contextualize this generic threat based on the description of the element? And bam, look what you get. Instead of a generic "okay, you have a problem with authentication and authorization," it's "okay, you have a problem in your Spring Boot process with authentication and authorization; an attacker can compromise these endpoints (which were in the description) and can do this, this, and that." Yes, it's still a little bit generic, but you know, without access to the code and with limited time, I'm not sure an application security engineer in general would do better than this in a

limited timeframe. This one was generated in about a minute. Again, with some prompt engineering it can be improved. From my perspective it works very well, so it's worth developing. Then, the crazy example: what about not using any tool at all and just dumping in the description of your application? Again, it was generated from a diagram and just tweaked a little bit, and I asked GPT to threat model the application. From my perspective, the result is too generic, you know. Some threats, okay, "an attacker might tamper with the database, DynamoDB." Yeah, they might, but still, I wouldn't say it's too useful; it's somewhat useful. And again, if we're talking about minutes of spent

work, it's very good stuff. Pretty cool. The next one. So, we have tried generating diagrams; we have tried generating threats; we have tried generating mitigations along with threats. And now, evaluation. Let's just dump our whole PyTM-produced report into ChatGPT, with all the fancy details you may see on the diagram. I added the data definitions and a description of each service, so it was a pretty comprehensive threat model, again produced in a short time frame. And GPT-4 came back with a very decent review. Going one by one: oops, wrong slide. It highlighted the clarity and details and suggested improving them. Yes, they were too generic.
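The triage flow from a few minutes ago (throw application context plus a threat definition at the model, get back an applicability verdict, and tally the verdicts) can be sketched roughly like this. Everything here is illustrative, not a real API: `fake_llm` is a stub that a real pipeline would replace with an actual chat-completions call, and `build_triage_prompt` is just one possible prompt shape.

```python
# Sketch of the batch threat triage idea from the talk: send each generated
# threat, plus application context, to an LLM and tally the verdicts.
# build_triage_prompt and fake_llm are illustrative names; in a real pipeline
# fake_llm would be replaced by an actual chat-completions API call.
from collections import Counter

CONTEXT = "Amazon Cognito used as a managed authentication service."

def build_triage_prompt(threat: str, context: str) -> str:
    # Ask for a machine-parseable verdict plus a short justification.
    return (
        "You are an application security expert.\n"
        f"Application context: {context}\n"
        f"Threat: {threat}\n"
        "Answer with exactly one of APPLICABLE, PARTIALLY or NOT_APPLICABLE, "
        "then a colon and a one-sentence explanation."
    )

def fake_llm(prompt: str) -> str:
    # Stand-in for the model; always returns a canned verdict for the demo.
    return "NOT_APPLICABLE: Cognito is a managed service, so AWS handles this."

def triage(threats, context, ask=fake_llm):
    # Count how many threats land in each verdict bucket
    # (in the talk: 34 threats -> 1 applicable, 14 partial, 19 not applicable).
    verdicts = Counter()
    for threat in threats:
        reply = ask(build_triage_prompt(threat, context))
        verdicts[reply.split(":", 1)[0].strip()] += 1
    return verdicts

threats = [
    "Spoofing of the Cognito process",
    "Flooding the process with excessive requests",
]
print(triage(threats, CONTEXT))  # Counter({'NOT_APPLICABLE': 2})
```

Because the verdict format is fixed, the same loop works for a batch review or a single threat, and the explanations can be kept alongside each verdict for the analyst to spot-check.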

It suggested improving the risks, because, okay, fair: we had only medium and high, with no explanation and no summarization, so I agree. It's kind of tentative, but it might be really good. Also, real-world scenarios: again, the examples in this threat model were not polished and were not consistent, maybe for lack of context. It picked up on that and said, okay, it seems your cases are not relevant. So, pretty cool. You can see the rest of the comparison on the slide, so I won't read it all, but in summary, it did really good work. If I were the one doing peer review of this threat model, I would love to have this starting point prepared

for me to then tweak, rather than starting from a blank page. So, big thumbs up. The next one you might have seen: STRIDE GPT. It's pretty cool stuff, and I really appreciate the author of this tool. The general idea is that you dump in some description of your application and it produces a full threat model. I tried it out in a different way, different scenarios: I uploaded the same description I had used to generate the PyTM threat model, and it actually didn't work that well. It produced product-level threats; basically it picked up STRIDE and tweaked it a little bit to the context of my application, so I didn't find it very helpful. But the idea of generating

attack trees, again generated as code, is pretty good, and I believe that with more effort on tweaking and prompt engineering, good results can be achieved. So, let's refine our theories. Basically, applying generative AI to your threat modeling delivered on most of the points. It produced data flow diagrams: not the best ones, but something. It added context to your threats and mitigations, and that's super valuable, because that's where I spend most of my time doing threat models. And finally, it provided a good review. So, potentially, if we apply the full feedback loop and ask it to improve the threat model it has been producing, it can lead us to a cool future.
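Since a transcript cannot show the PyTM code from the demo, here is a dependency-free sketch of the "data flow as code" idea. Real PyTM defines elements such as Actor, Server, Datastore, Dataflow, and Boundary and ships its own rule engine; this stand-in only mimics that shape with standard-library dataclasses and one toy rule, so none of it is PyTM's actual API.

```python
# A minimal stand-in for PyTM-style "data flow as code": elements, flows,
# and a toy rule that flags plaintext flows crossing a trust boundary.
# The class names and the rule are illustrative, not PyTM's real API.
from dataclasses import dataclass, field

@dataclass
class Element:
    name: str
    boundary: str = "Internet"  # trust boundary this element lives in

@dataclass
class Dataflow:
    source: Element
    sink: Element
    data: str
    is_encrypted: bool = False

@dataclass
class Model:
    name: str
    flows: list = field(default_factory=list)

    def find_threats(self):
        # Toy rule in the spirit of PyTM's rule engine: unencrypted data
        # crossing a trust boundary is worth flagging.
        return [
            f"Unencrypted '{f.data}' from {f.source.name} to {f.sink.name}"
            for f in self.flows
            if not f.is_encrypted and f.source.boundary != f.sink.boundary
        ]

user = Element("Browser", boundary="Internet")
api = Element("API Gateway", boundary="AWS")
tm = Model("Sample app", flows=[Dataflow(user, api, "credentials")])
print(tm.find_threats())
```

The payoff of the as-code form is exactly what the talk describes: the model lives in version control next to the application, so an LLM can generate a first draft from a diagram and an engineer only has to tweak it instead of drawing from scratch.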

And based on this, some conclusions. The challenges we named, like limited time, are addressed, because this stuff definitely improves the productivity of a security engineer. Again, "advanced" means something different for everyone. Yes, you can do mind-blowing threat models using PASTA; they are very heavy and very thorough. I love them, I use PASTA all the time, but you don't usually have time for them on projects. You need something fast, and for that this worked super well.

Tools, processes, and people are still needed. The tool cannot comprehend the whole application, the whole architecture; it still needs some structured place to store information, and it still needs some initial seed to produce good threats or to tweak them. So you are still needed, but I don't know for how long, to be honest. Probably at some point the capabilities of the models, the appetites of security engineers, and the intent to apply them will meet, and, you know, we'll delegate most of the work there. I know it sounds like the usual talk, "okay, AI will take your job," but no, it's really happening. I've been doing the same experiments earlier this year and the results were worse

than now. So it's improving, and it's improving very fast, so we have to look into it and try to use it to improve our efficiency and focus on the really important stuff. Finally, there are challenges in applying generative AI itself. You definitely cannot come to a customer, or to your own organization, and just say, okay, let's upload all our stuff to ChatGPT or GPT-4 or whatever; you would be a bad security professional if you did that. There are challenges, because the secure deployment options for these models are not so advanced yet, but again, it's a matter of time, and I believe it will be fixed soon. The future is coming, so please look into this. This talk

was intended to give you some ideas of what's possible and to suggest you try it out if you have not yet. So yeah, that's basically it. Thank you so much for being here. I'm super excited to be with this audience, and I hope you enjoyed the talk.

Cool. I was told that we have five minutes for questions. So in case you have any, let's discuss them now; if not, feel free to catch me after the talk, anywhere around.

Okay cool.