
Role Of Security Expert During Cognitive Revolution - Dmitrijs Trizna

BSides Prague 2026 · 29:23 · 100 views · Published 2026-06
About this talk
Dmitrijs Trizna reflects on how the AI revolution is reshaping the cybersecurity profession: which skills are becoming obsolete, which are gaining value, and how to cultivate good taste as an engineer when agents can do much of the work. He draws on experience building AI security products, covering supply chain threats, evaluation-driven development, and context engineering for security agents.

Thank you, Martin, that's a super warm welcome. I'll try to live up to the expectations. I like the event, it has an amazing, cozy feeling, so thank you guys for organizing it. Let me bootstrap with this broader discussion. It will not be specifically about a product or something; it's more like, guys, what are we living through, what is happening around us? A few words about me, for those who don't know: I was part of a blue team and part of a red team for several years,

heavily in infosec, but for maybe the last 5 years I've been switching to AI a lot. We could see the potential coming even before LLMs. I worked at Microsoft, worked at Avast, and right now I'm at AISLE, from the first days. I've talked at Black Hat and DEF CON, so it's quite an experience. Let me share what I see happening, at least at AISLE. I believe we grew the team from a few folks to 60 right now, so we see what works and what doesn't, which skill sets have potential and which are maybe not needed as we go on. And it

all starts with the paradigm shifts that all of us have observed in the last, I don't know, 6, 8, 10 months. I know every one of us feels it, but I will make it explicit. Then the skills: some of them become obsolete, some of them actually become more valuable. And then we'll dive more specifically into cybersecurity applications and how this link is valuable. So, seismic shifts: what do we see? Before that, who of you knows what this command would do if I type it? Nice, I like to see it. Guys, congrats, we possess some

knowledge that's not relevant anymore, right? Because this can be done with something like this. And what to do about it? For some reason it's stuck in my head that I can do it that way, but I don't really need it anymore. This is observation number one: the shelf life of knowledge goes to, I don't know where it goes. So do you invest in learning something or not? How deep do you go when you teach yourself? And this applies to many things; it's not one specific example, it's across the

fields. Do you need to know some details? I will argue about that, but that's the first observation. One of the examples comes from the AI world, since I've now shifted heavily towards AI, how to use agents, how to make them useful for others: people say that the harness is what matters. What's a harness? It's the complementary engine around the model. For example, it's the set of skills and instructions that you give to the model, or it's the wiring of different agents together. For example, Anthropic has this amazing blog post called "Building Effective Agents". It has so many architectures for how you can wire

them together. For example, this approach with a main agent and then a critic agent is something we used heavily in prod. But then we've seen it become obsolete. And that's back to the point of knowledge shelf life: many things that we learn become irrelevant after just a few months, because the models themselves become better and better, cleverer and cleverer. And basically two weeks ago, this guy... who knows what this guy is called? Nice. So Carlini is a super prominent person. If you don't know him, I advise you to follow his research. He's one of the main people at the intersection of AI

and security. So he basically said: guys, we don't have a harness at all. That's all our harness. We just give it an instruction, and the model is so good that it solves the tasks. So is the harness what matters? Maybe not. But the thing is, there's other evidence that that's not the case. For example, the Claude Code leak two weeks ago: if you look at it, it's such a heavy system, it uses so many memory mechanisms and different approaches. So the harness is still important, and that's what we see as well at AISLE. For example, Stan published, again, how just the model is not enough; there are so

many things, and we believe the system is still the ultimate paradigm. So again, so many things are changing so fast. Another observation: slop is everywhere, right? If you open a LinkedIn page... I like this blog post by Florian Roth. He basically said: hey guys, these are the categories where I detect slop right away. "This is not opinion, this is math." "It's not a feature, it's a paradigm shift." "Not hype, reality." When I open LinkedIn, I see it; every second post has something like this. And I feel like a blind minesweeper here,

right? Do you invest in reading this stuff? It seems legit, but after you read it for 10 minutes you feel, okay, that was empty stuff. So you hit a mine. But some of the things are still relevant, and you want to keep consuming information that's valuable. It's tough. And observation number three is that code is cheap. Right now, producing something is so easy. You can vibe-code a UI just for your needs in, I don't know, 30 minutes, 2 hours. I've seen folks who created a ticketing system just for themselves, for the specific task they are working on, and it works

nicely just for them. And this has some interesting consequences. This is the chief operating officer of GitHub. He says: guys, we had 1 billion commits in 2025; this year we're seeing 250 million, like 25% of that, in just a week, and they expect 14 billion commits in the whole year, without assuming any further growth. That's wild. People are coding so much thanks to agents. And this resulted in the emergence of what I call empirical coders: you prompt, you run stuff, you observe results, you prompt again, and you just keep looping, and you basically don't

really understand what happens under the hood. You maybe don't even read the code sometimes. We might say, dude, what are you doing? But the truth is, we actually don't know; maybe it's a winning strategy. One thing it allows is iterating as fast as possible. You can go through ideas really, really quickly. The thing here, again, is to figure out when you have to go deep and when not. Another interesting observation is that it enables non-coders to create. I've seen product managers creating amazing prototypes, solutions that you can take and expand on. One of the

examples is the CEO of Y Combinator, basically a VC group: he started maintaining a library just this year, since January, thanks to this, and that's amazing. I believe this will enable so many people to create something, to build, and that can manifest in good solutions. So given these three significant observations, these changes in how we do stuff, it's so significant that in the moment we lose sight of the trajectory. We have to zoom out and see that this really changes a lot. So what are the skills that survive these, I

don't know, significant shifts in how things happen? I'd say one goes back to the point about the empirical coder: how do you actually decide when to go deep? I like this type of meme, it's my favorite, and it applies to this right level of abstraction really well. The naive user says, "I just vibe-code." Then the majority of us think, "No, no, the agent needs our guidance." But the most enlightened one says, "No, no, come on, just let the agents do it." And that's the example of maybe Karpathy as well,

trying to push some of these ideas. Of course not blindly, not radically, but saying that sometimes we as humans, already in this age, are the bottleneck in the implementation space. For example, one of his latest creations is basically an autoresearch prototype, where he shows that if you point agents at a well-defined metric, in this case validation loss, they actually figure out techniques for optimizing further that he wouldn't have come up with himself, in this case on a small GPT model. And I would say that, broadly, this falls under the category of the skill set of a tech lead.

In this case, good tech leads know when they have to trust and delegate, to both humans and agents. You can say, "Hey, create this, and I will not interfere with you. Just produce results for me." And again, this can work with humans, but more and more it works with agents: you have to trust them to create stuff, and when you need to, you go deep. At the same time, as a tech lead you still possess broad knowledge that is deep enough to allow you to think about the concepts. That's why the things we discussed in the beginning, nmap or SSH and

others, are still relevant, because they allow you to think about these things: you have to know that SSH can do tunneling in order to reason about, for example, an attack chain. And this applies to basically any domain you're an expert in: reversing, SOC, and so on. So it's still worth investing in fundamentals. And back to the point: you can float at a higher level of abstraction, but when it's needed you can go deep and solve it all the way down to the crux of the problem. The second skill that I'd say is distinguishing is taste. It's often not about answering how to do

things, but what to do. Usually, if you have good taste, you pick problems that matter, or approaches that will lead to solutions. And if you don't (of course that's radical, it's always a spectrum), you may spend months on something that is not relevant to the industry or to the company, or, for example, you pick a method that will eventually not work, one that's not the best for this problem. Let me show you two examples, of good taste and bad taste, both coming from me. So in this case, one of the

good-taste examples: I had a talk at BSides New York last October, and these are slides from that talk. It was basically, hey guys, code and language automation... and back at the time it still wasn't such a big deal. We'd seen, for example, Sonnet 3.7 doing stuff, but not so autonomously. But there was already a feeling that, damn, supply chain attacks are becoming a problem, the whole ecosystem is not ready, and we see more and more threat actors automating their modus operandi in this space. So basically I was saying, hey guys, we actually detected a few of these things quite easily, and it will

become worse. That was October, and then in November or December we saw the second manifestation of Shai-Hulud, which compromised thousands of npm libraries, right? CrowdStrike and so many others. And now TeamPCP is basically rampaging around: LiteLLM, Trivy, KICS. Just yesterday or the day before yesterday we had the second compromise of Checkmarx libraries. It's wild, and I don't think this will end soon, because they are sitting on so many tokens right now. It's just a question of when they weaponize the next wave with the existing accesses they already have. So,

the problem is not the users. The problem is not, I'd say, Checkmarx or Aqua Security with Trivy. It's the whole ecosystem; it's really not ready. The only way to protect yourself is pinning the hashes, and that's it. And a lot of people dismiss this as irrelevant. It's like having an Active Directory ecosystem in 2010, right? Hashes are flying everywhere in the network; you just pick them up and use them. So many protections still need to be added to make this ecosystem richer. And again, good taste here resulted in a good direction for AISLE: for example, we preemptively started working

on a module that scans not only first-party code but also third-party code, and considers supply chain attacks in your dependencies. That's a module we're already rolling out to the first early adopters, and I feel that good taste here allowed us to be preemptive about this rise in attacks. The second example is maybe bad taste. Back to the reference that Martin made in the beginning: at the first BSides I actually talked about pre-training a malware detector GPT-style, and it resulted in this small language model. And back at the time people thought,

damn, small language models can outcompete the large ones. For example, significant people in the industry, like Clem, who's the CEO of Hugging Face, or LeCun, who invented CNNs, basically said: hey, in 2024 smaller, cheaper models will be used for 99% of AI tasks. In hindsight we obviously say no, they aren't. And I believe that was bad taste on my part: I invested time, created this nice malware detector, published the paper, but it's not resulting in anything, right? Maybe the world is not ready to adopt those small language models; maybe it's not a valuable direction at all. And in this

case, how do you actually invest in and train good taste? I believe there are multiple directions you can systematically explore to make your judgment better. One is, again, back to fundamentals. Nowadays we ask ourselves: do we have to invest in learning linear algebra if we want to dig into AI, or learn the networking stack, TCP, UDP, the OSI layers? And yes, you have to invest, because this gives you the elements in your mind, the atoms to pick and chain together when thinking. That's why you have to know those

fundamentals, at least at some level of abstraction; maybe not all the way down, but at least what they mean. What is Sysmon? What do its events look like? Or again, for npm: what are pre-install scripts, what are post-install scripts? This allows you to think in terms of threat matrices and so on. The second direction is that you have to read a lot, and in the current world it's both shallow and deep reading. Shallow because, right, observation number two: slop is everywhere. You want to filter this out and not spend your time on it. It's

like, it looks legit, it looks good, until you read for 20 minutes and figure out that it doesn't teach you anything. So you need shallow reading to filter this out. But then, when you find those gems, you go deep, especially if it relates to your topic, comes from a good author, and touches on the really good cruxes. So go deep, understand the gaps, and this allows you to feel where the status quo is and what the weaknesses in the current ecosystem are. And this gives you this taste. And the last one, and that's why these events are amazing: talking to people, asking questions, and listening to what

they say. You only have as much awareness as you can, and other people have their own bubbles they live in; figuring out what happens in their lives is basically touching and feeling the cybersecurity world. So these are the three directions I would advise you to deliberately invest in to have good taste in a world where agents can do everything for you. And maybe the last skill is more mundane, but I think it's what actually distinguishes real results: agency. The coolest part is that agency is something for which you don't need to have, I don't know, 10 years at Meta.

I've seen people coming out of universities with high agency who are amazing collaborators. They're the best people to work with, because they have this high agency and they can make it happen. They don't care, in a sense; they dig deeper, they understand, they can filter out the three other tasks on their list, prioritize properly, do proper time management, and deliver stuff: deliver a good PR, deliver a good feature. So this is something for you: don't make excuses for yourself. If you want something to happen, it can happen, and it's only in your hands. So how do you

apply all these shifts and what matters to specific use cases? Let me talk about, not test-driven development, that was 10 years ago, but eval-driven development; that's the thing of today. This is how you actually develop anything agentic: basically, whenever you use agents for something, it can fall under the paradigm of eval-driven development. Let me show you on a simple case, really simple: zero-day discovery with fuzzing. It's a super simple topic, right? Which is sarcasm, yeah. You take some code, and the goal of fuzzing is to produce a bug, or that's one of the ways to do it, leading to some vulnerability.

But the way the conventional flow does it is that you take a human who writes a harness, a seed, and a build, basically the components needed to bootstrap fuzzing. Then you feed this into a fuzzer, which runs for many, many hours or days, with smoke coming out of your machine, and finally it spits out: okay, I have a crash, that's probably something worth attention, explore it. And what people do, in many directions and specifically with fuzzing, is plug in an agent to generate these things for you, and you just observe. But what are the problems that arise? You figure out, okay, I

want to tweak the system. Okay, it's still kind of good, but I want to tweak it more. Okay, I ask: please, for God's sake, produce a good harness for me. You've created version 1.2, and how do you know whether it helps, whether those tweaks result in anything? And that applies to any problem with agents, whether you're solving some reporting, some scraping, anything that an AI component, a nondeterministic component, is used for: how do you know whether a tweak to the system helps? That's where evals come in. You basically pin down inputs and outputs in a

well-defined scope. They're called benchmarks or evals in the public space. You say, okay, the inputs here are the source code and the target logic that I want to fuzz, and the output here is, for example, the coverage that this set of settings resulted in, or any bugs. You keep iterating on this, you keep running these experiments, and you watch the results, like the amount of coverage and bugs. If you tweak something and it increases, it helped; if not, it didn't. For example, let me show you an amazing example we did here with Luigi.

We took a well-defined project that has fuzzing harnesses, thousands of them. We took the human-written harnesses, on this slide they are the gold ones, and the LLM-produced ones, which are labeled generated. Take a look at this example: the gold, human-written harness results in 20% coverage of the lines in the code base, but the LLM-produced harness results in 39%. So it bumps the coverage of this fuzzer so much. You see right away that, okay, this one was a good example. Another one here, for the if conditions it reached: it's 4,000

here, and you get 10,000 with the LLM-generated one. So it surpasses the human-produced results heavily. But at the same time you can take a look and see that, for example, some of them are failing, or others do super poorly, and here you can get a feel for why they fail and how to improve. So we floated at quite a high level of abstraction before, but for these cases, okay, I have to go deep, and that's where your deep expertise matters. And again, back to code being cheap: Luigi coded a viewer just for himself to compare two harnesses and see, deeply enough, what happens, why it fails,

and how to tweak the system further. So that's the crux, I believe, of what a lot of well-defined AI applications need: how do you develop them to become better? You use this approach. You define a quick, fast set of maybe 12 examples; you don't need hundreds, 10 to 12 examples. You iterate over them quickly, you see results on many cases at once, you see when they fail, you understand the weak points of your system, and you can add examples as you go, solving them one by one. And on those cases that matter, you can dive deep. Maybe the second thing that

kind of applies to any problem that you solve with agents or with AI is that everything is a context problem. I like to think about this in terms of a thought experiment. You have this dude who is the president of an unnamed country. He's sitting and thinking: should I invade another unnamed country, or attack it? And basically the answer is never in his head. The answer is always in the papers in front of him: the intelligence officer comes and puts down all this information, the number of troops, where they're stationed, and the answer to whether it will succeed or not is in the intelligence. And in

this sense it applies to anything. For example, a problem we are trying to solve at AISLE is: hey dude, do you think this CVE needs attention? You send it to your security analyst, and that's a tough question, so he thinks a lot. What does he consume? Obviously the code base: you see where the vulnerability lives, its reachability, maybe part of the context. But it's never just the code base; it's always something else as well. If you have a CVE, you maybe check public sources like CISA KEV or GitHub: does it have exploits? Then you maybe check your infra: where does the application reside? Obviously, it's in many cases

in your head already, but it got there somehow, right, from some sources, and agents have to get this information somehow too. It might be wikis, ticketing systems, anywhere; any information that matters for answering this question can be added to the solution space. And the subtlety here is not to add a lot, but to add what matters. You have to filter the noise and keep good context hygiene: filtering out the really irrelevant pieces and amplifying the relevant ones. This allows you to solve the issue. That's at least what we're trying to solve at AISLE now, but again it applies to any domain. I don't

know, whether it's reversing with agents, network detection, triage, or something else, it all depends on the context that you put in. So these are the ideas, the observations that I see mattering today: the skills, the solution space. Will they matter next year? I don't know. The most frightening part for me is that things move so fast that keeping pace is hard. We are actually at the moment when AI can do a lot. They are a lot better than us experts in some domains; they still miss on others, and

soonish they will obviously become better. Whether we will reach this last stage, I don't know, but at least we see from some good sources, from METR, that the length of tasks that AI can do doubles every seven months. I believe that the skill set we defined, basically taste, being a technical lead, and having high agency, will survive no matter how good agents are. These are so fundamental to delivering any task that they will survive. But still, I don't know, guys. For example, I wanted to show where we are in cybersecurity, with AI handling some tasks like exploitation but not really, for example,

reversing. I created this presentation two weeks ago, and, back to the point that things move quickly, they have released a new plot with Mythos on it now, and basically Mythos is cracking some reverse engineering tasks end to end. So we will probably get this amplitude much higher. Where will we be in 6 months, in one year? We'll see. We have to be vigilant, although I think this is an amazing advantage for us, and although it obviously gives some benefits to threat actors, maybe even more than to defenders. And just as light reading for the, I don't

know, three evenings, there's this AI 2027 piece. It's actually from good sources; the guys are ex-OpenAI researchers and so on. They wrote the article a year ago, 3rd April 2025, and they predicted that by May 2026, so basically now, we'll have reliable agents. Back at the time agents were garbage: they produced something one time and didn't another. So by this time we'd have reliable agents, and hacking capabilities would approach human professional level. And I would say they are approaching it: if you drop Claude Code into a network and say, hey dude, explore it,

report the findings to me, it will actually produce a decent pentest report for this case. Of course it's limited, so it's not really there yet, and that's what they say as well. But the thing is, again, for October 2027, according to their estimates, things will go wild. Look, it will be super asymptotic; you will get superhuman researchers. So far they have been accurate. Where will we be in a year and a half if they're still accurate? I don't know. It might be something like this, right? And back to the main topic:

I imagine if we had a time machine and dropped back in time to, I don't know, London in eighteen-something, and walked down the street asking people: hey dudes, how is it, living through the industrial revolution? It's amazing, guys, you'll come up with medicine, you'll come up with so many things. And people would say: come on, we want to destroy these machines, they're taking our jobs. And I would say, let's not be pessimistic. The thing is, we are living through a cognitive revolution right now, and that's us, right now. So you don't need a time machine to feel this. But it feels for

me like this meme on the right: damn, it's so unstable. Living through revolutions is not easy. So many things change so fast. And yeah, let's see what happens. That's it.
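
The eval-driven loop described in the talk (a small fixed set of cases, a measured output such as coverage, and a re-run after every tweak) could be sketched roughly as follows. This is only an illustration under stated assumptions: `EvalCase`, `run_eval`, `compare`, and the case names are hypothetical, and the scores are placeholders standing in for real fuzzing runs. Only the 20% versus 39% pair echoes the slide mentioned in the talk.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List


@dataclass(frozen=True)
class EvalCase:
    """One fixed example in the eval set (hypothetical structure)."""
    name: str


# A "system" is anything that maps a case to a score, e.g. line coverage in [0, 1].
System = Callable[[EvalCase], float]


def run_eval(system: System, cases: List[EvalCase]) -> Dict[str, float]:
    """Score every case with the given system."""
    return {case.name: system(case) for case in cases}


def compare(baseline: System, candidate: System, cases: List[EvalCase]) -> dict:
    """Run both systems on the same cases and summarize the difference."""
    base = run_eval(baseline, cases)
    cand = run_eval(candidate, cases)
    return {
        "baseline_mean": mean(base.values()),
        "candidate_mean": mean(cand.values()),
        # Cases where the tweak made things worse: the ones worth a deep dive.
        "regressions": [name for name in base if cand[name] < base[name]],
    }


# Placeholder scores standing in for real fuzzing runs (illustrative only).
GOLD = {"parser": 0.20, "decoder": 0.35, "codec": 0.50}
GENERATED = {"parser": 0.39, "decoder": 0.30, "codec": 0.60}

cases = [EvalCase(name) for name in GOLD]
report = compare(lambda c: GOLD[c.name], lambda c: GENERATED[c.name], cases)
print(report)
```

The design choice worth noting is that both versions run on exactly the same cases, so an improved average cannot hide a per-case regression; those regressions are the cases where, as the talk puts it, you go deep.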

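
The "pinning the hashes" defense mentioned in the supply chain part of the talk boils down to refusing any artifact whose digest differs from one recorded when the dependency was reviewed. A minimal sketch of the idea, assuming SHA-256 and in-memory bytes; `sha256_hex` and `verify_pinned` are hypothetical helper names, and the artifact contents are made up for the example:

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_pinned(data: bytes, pinned_sha256: str) -> bool:
    """True only if the artifact matches the digest that was pinned earlier."""
    return hmac.compare_digest(sha256_hex(data), pinned_sha256.lower())


# Made-up artifact; in practice the pin is recorded at review time and stored
# in a lockfile, not computed from whatever was just downloaded.
artifact = b"example package contents"
PINNED = sha256_hex(artifact)

print(verify_pinned(artifact, PINNED))                 # unchanged artifact
print(verify_pinned(artifact + b" tampered", PINNED))  # modified artifact
```

In real projects this check is usually delegated to the package manager rather than hand-written, for example the `integrity` field in npm lockfiles or pip's `--require-hashes` mode.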