Pitfalls of Poor Remediation: How Companies Sabotage Incident Response Efforts

BSides Dublin · 38:38 · 36 views · Published 2025-10

Speakers

Giorgio Perticone

Tags

Category: War Stories

Style: Talk

Okay. So, hi everyone, and thank you for joining today. First of all, what you're going to hear today is just some thoughts and research I've been doing while helping customers and companies all over the world, across multiple industries, in facing cyber security incidents. Basically it's just going to be some random thoughts. Nothing very new. Just my point of view on what companies are doing well, what companies could do better, and what my suggestions are to leverage what we already have to improve on incident response. So, without waiting, if I'm able to switch my slides.

Perfect. This is the boring part. So I'm Giorgio. I'm from Italy. I'm currently working as a team lead in incident response for Treyi. A few of my colleagues are here today, so hopefully, if the talk is good, I will still have my job after this. I've also been helping the SANS Institute as a teaching assistant for their incident management course. I have a little podcast of mine on YouTube, which you could search for if you're interested. I have been a contributor to quite a few different communities around Italy. Maybe you don't know them; if you do, that's great. And a little fun

fact: I'm also a digital nomad, which is an interesting thing to say during a security conference, because if any of you comes to me after the talk and the first thing you say to me is "oh, nice, being a digital nomad", then I know whether you liked my talk or not. Let's go to the hopefully interesting part. This is just a very general agenda of my talk today. I'm going through three main points. The first is what I think companies have been doing much better in the last few years in terms of incident response, then what they are not doing as well as they could, and finally what are

my suggestions in order to improve on those topics. First of all, let's go with what the problem is. I've been, as you do, reading a lot of reports from multiple companies. This specific information is taken from the Mandiant M-Trends report from last year. I didn't read the one from this year; I didn't have the time. It is basically telling us that the industry has got much, much better at identifying incidents. Just in the last 12 years, we have gone from more than 400 days to 10 days in identifying threats in our environment. So this is technically a good thing. We

have more than 90% improvement here. Even though I agree with you that this is probably still not enough, you have to admit that this is a pretty good improvement. We're probably going in the right direction, and we are just going to get better and better at detection, essentially. Now, the reason why I personally think this is happening is that, well, we have a lot of technology that we can leverage, both commercial and open source, and a lot of knowledge that people are building inside companies, also building internal tools as well. We have a lot of people working in security

operations, which when I started was honestly not that normal. Now every company has their own SOC, or they're paying for some kind of outsourcing or managed services. I do think this is a great thing that we kind of take for granted right now. And finally, the bad thing is that one of the main reasons why this metric is going down is that a lot of the incidents we are facing are perpetrated by ransomware gangs, which means they are basically telling us that an incident is going on, because eventually they are going to ask for money to give our data back to us or

maybe just to decrypt our own systems. So there are different reasons why, but still, the fact that we are at least aware of the incidents going on in our environment is, I think, a good thing. Now, the other metric that I want to briefly speak about is what CrowdStrike calls the breakout time: essentially the time that it takes an attacker to go from the initial foothold in your environment, so the very first system they compromise, to eventually moving laterally to something else inside your network. Now, why is this important? Because, first of all, the moment they have already got access in

your network is probably, let's say, too late. You already want to be quick in identifying and responding to the threat. But the moment they start to move laterally, the risk that you are facing is much higher than before. It's not just a single system that you have to, I don't know, rebuild from scratch or fix or whatever. You don't know how many systems they have already compromised. You don't know how many parts of the network you're going to have to fix and rebuild. And the effort that is going to take you is much, much higher than before. So ideally we would really like to intervene somewhere in the middle, between

those two steps of the kill chain. Now, once again, I think the main reason why this is currently just around 1 hour is probably that most of the attacks we were investigating and analyzing something like 10 years ago were still attacks where the attacker really wanted not to be detected. They were waiting a lot of time from one step to the next so that they avoided possible detection. While today, if we have a cyber gang whose final goal is actually to tell you "I breached you and I want some money from you", they want to be as fast

as possible. They want to reach their goal. They want to gather your important data and eventually ransom you because of that. Now you can see how the correlation between these parameters is a little concerning. It takes us as an industry 10 days to detect the average incident, and it takes an attacker just around 1 hour to, let's say, cause substantial damage to your company. So even though we really like to show good metrics, and there was a talk this morning that I really liked that was talking about metrics and was mentioning this, we should focus on

something that lets us improve and do better. The reason I'm mentioning this is that I feel like the industry is not focusing as much as it should on remediation, on actually doing something when we detect the incident. Are we actually thinking about how to contain the incident? Are we actually thinking and focusing on how to eradicate the attacker in a comprehensive way after we detect it? I don't think so. And in my experience, a lot of companies out there are not doing this in the best way possible, even though the tools, the resources, and the knowledge are already there. It's not something that we still

need to develop. It's just a problem of processes in place, of people not knowing exactly what to do, or just feeling pressure from other aspects of the daily company work. Now, that being said, let's go through some of the options that, as I said, we already have and can use. This is going to be for both detection and containment, and I'm going to focus especially on containment, because I think that's a crucial part that we're missing most of the time. So first of all, detection-wise, we can have multiple sources of detection, and I'm going to divide them into two

macro categories: internal and external sources. We have internal sources when, let's say, an employee or a user of the company reaches out to the security team because something odd is happening on their device. I don't know, maybe they received a sketchy email. Maybe they found a black screen popping up when they switch on their device, or something else is not really working as expected. And I mean, that's a good thing if we trained our users to actually let us know about it. It's a good thing, but most of the time it's not really the most reliable source of detection. And for that

reason, probably all of you have a lot of different systems inside your environment which aim to give you a list of different alerts, to let you know about any malfunction or suspicious activity or behavior in your systems. We have XDRs, NDRs, EDRs, SIEMs, antiviruses, you name it. And these are, most of the time, pretty good in my opinion. It takes a lot of time and effort to actually deploy them well and to be able to use them in the best way possible, but they are still, generally speaking, the most reliable source of detection. Now, as for the second

macro category, what I called external sources: we may have third-party organizations reaching out to us because something has been identified on our systems, on our data, or just on our attack surface. We could have a partner of ours letting us know that they received some weird data, or once again some kind of error on their side, and they think that we may have been compromised. That's the case of what we call supply chain breaches and incidents. Or we may also have law enforcement reaching out to us, maybe when data has already been exfiltrated.

Law enforcement is monitoring the internet for company data, and they eventually identify it for us. Now, once again, it is a good thing if someone else is also contributing to our overall detection system. The thing is that if we are waiting for that, it's probably already too late: data has already been exfiltrated, or our system was already compromised so much that the attackers moved laterally to further environments outside ours, and that is, I'm sorry, too bad. Finally, the worst-case scenario is that our source of detection is actually the attacker itself. We mentioned it before: ransomware gangs want to ask for a ransom to obtain

that money that they want. So that could be our last opportunity to detect an incident, but that's really the moment where we cannot do much about it. This was just to give an idea of the fact that we will probably rely mostly on the internal sources compared to the external ones. And the reason is pretty simple: we want to be as fast as possible in detecting the incident, because the quicker we are in detecting the incident, the more time we will have to actually analyze and scope the incident, and eventually contain and eradicate. Now, moving to the main topic of the

talk. Talking about containment, we also have multiple options. Again, nothing new. Something that multiple different tools and brands out there let you do is use automation. By automation I mean setting up thresholds, rules, and parameters in order to automatically apply some actions that will contain the overall incident. We're talking about isolating hosts, isolating networks or servers, or just removing individual files if they are suspicious for some reason. And we also have, of course, manual intervention, when we don't want to or cannot rely on automation and every time we need to ask an individual operator in our security team to go in, log into the system, make sure

to comprehend what is really going on, and eventually take a decision and remove the file, disconnect the host from the network, or whatever. This was in terms of velocity. Of course, I think you agree with me that automation is going to be much faster than the manual activity most of the time. We're going to talk about what the issues with it are and why we are not all using just automation every time. When it comes to precision, we mostly have two big options. The first one: most of the time we have a lot of

agents installed in our systems that let us do endpoint isolation or lockdown, which means we identify that a specific device has been compromised and then we go and precisely isolate only that device, so that only that device is impacted by our isolation. So we are not messing with too many users in our environment, and not too many people will be mad at us because of that. Otherwise, when it's too late, when that breakout time has already been reached, the attacker is already jumping across multiple systems. Therefore, just isolating a single host is not enough, and we are forced to isolate multiple hosts individually or, at some

point, completely disconnect an entire subnet or an entire network. We all know that in the worst-case scenario they may compromise our domain controller or something. If they got access and we need to isolate the domain controller, then a lot of users are being impacted anyway, even if we are acting on a single device. Now, once again, just to summarize: we have multiple options, but in terms of velocity we would and should prefer, in my opinion, to use automation as much as possible, and we also want to be as precise as possible, so as to, let's say, limit the impact that we

are having on our own systems. So now you're probably asking: yeah, we know about this, what can we do about it? Why isn't everyone doing it? I don't know if you ever asked yourself the question. Maybe in your company you are great, you're automating everything, you're very fast, and you're identifying every incident out there. The thing is that in my experience, and maybe I'm biased, hundreds of companies out there are not doing it, and I think I've identified a few issues and a few reasons why that is. One of these is the fact that we

are treating security and business continuity as two different things that are not going in the same direction. Generally speaking, the business continuity teams in companies don't want us security people to isolate things and shut down systems very easily, because it's a risk: of course we are limiting our company's business time and revenue if we are shutting down services that we are providing to our customers. And this is indeed a risk that we need to take into consideration. Nevertheless, I think that we need to have this kind of communication

inside our company and make everyone understand that security and business continuity have the same final goal, which is to make sure that the company has no big issues in terms of continuity of our systems. Because if we are isolating something today to prevent that tomorrow I will spend weeks and weeks rebuilding everything, still having the business down for a while, I think everyone will be happy once everyone understands this. Now, what's the solution I'm personally suggesting for this, which I don't see companies doing even if probably everyone is aware of it? It is to implement something

that I called a business criticality rating. This is called multiple different things out there, depending on the framework you're following, the person you're speaking with, or the law or certification you're going through with your specific company in its industry. But basically what we're talking about is just identifying systems in our environment and identifying the risk of those systems being down for one hour, one week, potentially multiple weeks, and how that is going to impact our environment and our business. So, basically speaking, how much money is our company losing if that specific system goes down for a

certain amount of time? I'm not saying that this is a simple thing to do, but if you start, eventually you'll have a decent number of systems identified and scored; if you never start, you'll never have it. If we have that kind of information, then we can separate the systems that are very critical for our business from the systems that are not actually that impactful if they are down for a few hours. If we have that kind of distinction in our systems, then we can decide to apply automation on those systems that are less critical, therefore the risk in using that

automation is much lower. And maybe even more important, we can let our security team and our professionals actually spend their precious time on the most critical systems, so that they are not wasting much time investigating every single laptop that has suspicious behavior going on. I think this is theoretically a very simple thing that no one, or almost no one, is really implementing out there. But if you start from identifying the most critical systems, especially if you're not too big as a company, I think this is going to make a big difference and will let you

leverage all of those great tools that everyone is building but no one is really using. Now, the other problem, I think, in many, many cases and scenarios, is the lack of authority inside the security team. What I'm talking about is that I've been working on countless cases with companies where we actually identify the incident, we reach out to the customer, we speak with them, we determine what the next steps are, and then no one does anything. So why is that? The reason is that if I'm talking just with technical people, most of the time they are just scared to do

anything, because: okay, what if this is a false positive? What if I'm scolded tomorrow by my CEO because I did something and that was actually, I don't know, an executive's laptop? I'm going to be pointed at as the bad guy here because I did something. And this is really happening very, very often. So no one is taking the decision. And the other problem is: what if you cannot actually take the decision? What if you are the analyst inside your environment but you don't have any practical means to go in there and take the actions, and the only thing you can do is to escalate this, reach out

to your SOC manager, who has to reach out to your CISO, who has to speak with the executives, and eventually go down to the IT team to actually identify the system and then do something about it? Maybe they have to travel to another site to do the activity, and that takes forever, and then the ransomware is already there. So these few things are basically everywhere, in every company I've been working with. And one thing that theoretically is already there, and a few companies are already doing, is appointing what I call an incident commander role. Not because I invented it; it's already there. But it's mostly used

by companies in relation to IT incidents, not security incidents. So, whenever your systems are down because there's a bug or something that was put in production, they have a full team dedicated to it, but this is not as often being replicated in the security team. This kind of role, this kind of person, I believe, should hopefully not be a people manager, because people managers have a lot of things to do, a lot of meetings to run, and people to manage, so they don't have much time to dedicate to every single incident that is happening in the company. It should be

someone technical enough to understand what's going on, but not technical in the sense that they have to run the incident response itself. That takes a lot of time, and you have operators doing that. You need someone who is going to supervise the incident and help take the decisions that have to be taken. They have to act as a relay between the technical team and the CEO or the executives, to make them understand what risk the company is facing and why a decision was taken. Because that's the other thing, the last thing: I believe this role should be nominated and appointed from the very beginning with the authority

to take the decision of shutting down that central domain controller when it's too late. Because if it's late and you're still waiting much longer to take a decision, rebuilding that Active Directory is just going to be worse for your company. So you need someone who can take the decision and can then justify it later on with the executives, explaining: this was the risk we were running, of potentially having the company down for countless days, and by just putting that system down for a few hours, we made sure that nothing was wrong with it, or we actually caught the bad guy before he did even more damage to the company. So once again, it

might seem like this is something very clear to everyone, but that's not what I see on a daily basis working with a lot of companies. If you are lucky in your organizations and you're already doing all of this, kudos to all of you. But if you are doing it in your company, then we should probably put more effort into helping the other organizations do something like this, because they are not doing it as well. That was really it. Thank you so much for taking the time to listen to me. If you have any questions right now, I'm happy to take them and

hopefully be able to answer them, and if not, I'll be around for this conference. So feel free to reach out. Thank you.
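The business criticality rating described in the talk can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `System` record, the `hourly_loss` figure, the threshold value, and the host names are all hypothetical and not from the talk, which does not prescribe a scoring scheme. What it shows is the routing idea: less critical systems go to automated isolation, while critical ones go to a person who has the authority to decide.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    AUTO_ISOLATE = "auto_isolate"        # automation contains the host right away
    ESCALATE = "escalate_to_commander"   # a person with authority decides


@dataclass(frozen=True)
class System:
    name: str
    # Hypothetical rating: estimated money lost per hour of downtime.
    hourly_loss: float


def containment_action(system: System, auto_isolate_max_loss: float) -> Action:
    """Route a suspicious system to automation or to a human decision."""
    if system.hourly_loss <= auto_isolate_max_loss:
        return Action.AUTO_ISOLATE
    return Action.ESCALATE


# Hypothetical inventory; a real one would come from an asset register.
inventory = [
    System("sales-laptop-17", hourly_loss=50.0),
    System("build-server", hourly_loss=400.0),
    System("domain-controller", hourly_loss=25_000.0),
]

decisions = {
    s.name: containment_action(s, auto_isolate_max_loss=500.0) for s in inventory
}
for name, action in decisions.items():
    print(f"{name}: {action.value}")
```

The single threshold is the simplest possible policy; an organization could just as well use tiers or different thresholds per outage duration, as long as the decision is agreed with business continuity before an incident rather than during one.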

Any questions? >> Yes. [Music] >> So there are actually a few of them. While doing this research I found them. I don't personally have any special suggestion, and that is especially because if you are building this from scratch, as a role in your environment, you want to understand what works best inside your specific organization. If there's a tool out there that already fits exactly what you're doing, that's great. But if you need to start from "what tool do I need?", then that's probably not the right thing to do. Either you're already doing incident management and you want a tool to simplify it, and that's a thing; but if you don't have

incident management in your environment, and you're just doing either technical stuff or management stuff, then you start with the process and with the people, and then, if you want to optimize it, my suggestion would be to search for a tool later. >> Yes.

right?

>> So, if I got the question right: who is the right person to actually run that incident management program? That's the point of my talk: I don't see that person being there at all. I think most of the time that's someone you need to hire on purpose, or identify on purpose in your organization, who can completely switch from what they were doing before and do something new that wasn't there before. And then, once again, it depends on your organization, because many companies have CISOs and many companies have no CISO at all. In some companies the CISO is also doing some technical stuff and is maybe also the SOC

manager. In other companies they are just talking with the C-levels and they don't have time to talk with technical people. So maybe you need, let's say, someone in between; maybe you already have someone who has the time to dedicate to it. Most of the time I've seen companies where the SOC manager is also the incident commander. But the thing is, what if the SOC manager is already busy doing something else? Because if you look at the agenda of the average SOC manager, it is already full; they need to cancel things to start doing something

else. What I'm talking about here is having someone who actually has the time, who for each incident of a certain priority and criticality in your environment can review it, understand what's going on, and decide: okay, this is something that doesn't need any containment and we can go on, or this is something where I should start, maybe, supervising something. So once again, I know it's not really satisfactory, but it depends. >> Yes, please. >> Do you have any tips or advice for metrics collection in incident response, like any that are important, or tools that you use to collect things like time? >> So, I mean, once again, I'm not inventing anything from scratch, but in terms of

what you are already measuring, or if you're not measuring anything so far, you probably need to start with how much time it takes you to actually identify an incident, and what percentage of incidents you are actually identifying yourself versus just passively being told about. Then, after that, as I mentioned earlier, you should focus on containment and remediation. Because, once again, I think pretty much all the companies out there are making some effort in identifying incidents, maybe because of compliance, whatever; the important thing is that you're actually doing it. But then you need to measure your capability to actually do

something to slow down the attacker and eventually completely eradicate it. Because I've seen the same thing over and over: you rush into "I want to say to my CEO that there's no problem anymore", so you take that first machine, format it, and rebuild it from scratch. Okay, now you feel safe, but have you really scoped and investigated the issue, so that you are 100% sure there are no other 10 systems in your environment that the attacker moved laterally to? So maybe you then have to rush to those, and at the same time you want to do some containment, something that slows the attacker down while you

have the time to investigate. And then, if you measure the time to detect, the time to containment, and eventually the time to eradication, you can work on those to get better and better every time. But first of all you need to make sure you are going through all of the individual steps one after the other, and you're not rushing into "okay, I want to come to the end meeting saying: oh, I already solved the issue". Because this does not work. I hope that answers the question. >> Yes. If you go back
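The three measurements named in the answer above (time to detect, time to containment, time to eradication) could be tracked with something as small as the following Python sketch. The `Incident` record, its field names, the sample timestamps, and the choice to measure each phase from the end of the previous one are assumptions made for illustration; the talk does not define them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass(frozen=True)
class Incident:
    compromised_at: datetime  # estimated initial compromise
    detected_at: datetime
    contained_at: datetime
    eradicated_at: datetime


def phase_durations(incident: Incident) -> dict[str, timedelta]:
    """Duration of each phase, measured from the end of the previous one."""
    return {
        "time_to_detect": incident.detected_at - incident.compromised_at,
        "time_to_contain": incident.contained_at - incident.detected_at,
        "time_to_eradicate": incident.eradicated_at - incident.contained_at,
    }


def median_durations(incidents: list[Incident]) -> dict[str, timedelta]:
    """Median of each phase across incidents, to track improvement over time."""
    if not incidents:
        raise ValueError("at least one incident is required")
    per_incident = [phase_durations(i) for i in incidents]
    return {
        phase: timedelta(
            seconds=median(d[phase].total_seconds() for d in per_incident)
        )
        for phase in per_incident[0]
    }


# Hypothetical sample data.
incidents = [
    Incident(
        compromised_at=datetime(2025, 1, 1, 0, 0),
        detected_at=datetime(2025, 1, 3, 0, 0),
        contained_at=datetime(2025, 1, 3, 6, 0),
        eradicated_at=datetime(2025, 1, 5, 6, 0),
    ),
    Incident(
        compromised_at=datetime(2025, 2, 1, 0, 0),
        detected_at=datetime(2025, 2, 5, 0, 0),
        contained_at=datetime(2025, 2, 5, 2, 0),
        eradicated_at=datetime(2025, 2, 6, 2, 0),
    ),
]

result = median_durations(incidents)
for phase, duration in result.items():
    print(f"{phase}: {duration}")
```

Keeping the phases separate is the point: a short time to detect says nothing about whether containment and eradication were actually carried out as distinct steps.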

[Music]

[Music] action reaction whatever. But I think that it's betterational

more. [Music]

[Music] So, well, let's start with: I kind of agree with you, right? But once again, it depends on which organization we're talking about. I know why you're asking this: in a specific region we could have kind of the same hierarchy in most of the companies, and then you have the security silos and the IT silos, and the operations are done by IT and the detection part is done by security. What I discovered is that it's not always like that, even just with the security operations terminology. You could have security teams where there are two sub-teams, some engineers applying things and some people just looking at detections, and then somewhere in the middle you

have the incident responders doing something in the middle. So it really depends. I mean, I agree with you: I don't want the same person doing everything. That's not going to work. I know this is very general; it must be applied to the specific context and scenario. But my real suggestion here is to introduce a new person who is inside the security team as an organization; not the detection team, but inside security as an organization. This kind of role is not really in daily operations: not looking at detections, not doing the actual investigation, not applying the remediation, but actually being there, so that the technical people can

go to him or her and ask: okay, I think this is bad, what do we do about this? Okay, let's review this. What do you think? I'll get your input as a technical person. Maybe I'll talk to the person or the employee who reported this in the first place, and I'll consider what the risk at stake is. Should I start the process and involve the people, talk to legal, talk to communications and marketing, and start everything? Or do I think: you know what, let's do some more investigation first, and then we'll circle back to that in one hour or tomorrow or whatever. It's more that I want them to be free to work

with the security team on a daily basis, but not to be the security person running security operations, if that makes sense. >> Right. Right. >> Yeah. [Music]

>> Yes.

which is what I

[Music]

>> think is the power to do... I mean, it should be. That's the point of all this. >> That's the point of all of this. It's good if you have the role, but if that person is not autonomous in taking the decision at the right moment, then it still makes sense but it is not really using all its potential. Because while you, as the head of security or CISO, are away doing something else, and you are going to be away doing something else, can that person take the decision by themselves? That's the real point here. Because if you need to wait for the person to come back from the C-level

meeting that will last, I don't know, one week at a kickoff or whatever, that's too late. [Music] I think some lows down being outside uh the security is there's a lot usually of cross functional skill set so if you are the a die all the part you want and I come from wherever I'm trying to face something in order for you apply the change you have to understand me then you got to have someone that is on both you know with one foot of both shoes that it's forever I can't people don't just trust you to make the decision on a segment is not owned by >> That's the thing: do they need to trust you, or are they entitled to

say to you "you need to do this now"? >> I understand, this isn't simple. This takes a lot of work, a lot of effort. But that's the thing: if you need to discuss and convince people to do something at the moment you think it needs to be done now, it's too late to have that kind of discussion. You should have the discussion first. You should clarify for everyone what the role of that person is, because then you don't have to discuss with the head of IT whether to do something that they are asking you to do, because they're already entitled to do so, right? I'm sorry. We can discuss it.

[Music] >> They... just because they are looking at me very mad. They are helping, but they're usually too slow. We cannot wait for that. We cannot wait for regulations and then start something. [Music] >> Yeah. Yeah, but that's the thing: that's actually slowing things down during the incident response process. I have to alert someone else. I want to do operations in my company first and then communicate that to third parties. I mean, they help because they are making me rush in doing this, but that's actually slowing things down a little bit. Thank you so much.
