
Ready For (Nearly) Anything: Preparing Your Organisation For A Cyber Incident

BSides London · 2022 · 25:06 · 137 views · Published 2022-01
Category: Career
Style: Talk
About this talk
Every organisation has experienced, or will experience, a cyber security incident; depending on how you define the term, most have multiple every day. Increasingly punitive data protection law, plus increasing public awareness and scrutiny of organisations’ responses, means that it’s more important than ever to respond effectively. However, many cyber security teams still struggle to do so. In this talk, Gabriel Currie covers the five key things that cyber security teams should have in place to prepare for an incident, which will improve the efficiency and effectiveness of their response. (1) Documented processes with the considerations, decisions and actions to be taken in an incident (2) Skilled and experienced people to lead and deliver the response (3) Logs to gain an understanding of what has happened, when, and how (4) Containment and eradication technology to take actions that mitigate risk (5) Coordination technology to communicate and collaborate, delegate and track actions, and manage delivery
Transcript [en]

cool thank you very much uh so hi everyone my name is gabriel currie and today i'm going to be talking to you about how to prepare for a cyber security incident uh so first of all just a super quick introduction so as mentioned i work for the uk government's cabinet office where i lead the cyber defence team we're responsible for threat intelligence threat detection incident response and a couple of other things for the kind of internal infrastructure of the cabinet office and our citizen facing infrastructure like gov.uk notify pay all of the kind of central government services that things like the government digital service provide previously i worked for pwc doing incident management and over my time in

security i've worked on building security operations capabilities in government uh protecting organizations from and helping them to respond to human operated ransomware attacks i was also part of the team that led the uh investigation into apt10's targeting of managed it service providers uh called cloud hopper and are on ransomwareresponse.org so today i'm going to be talking about cyber incidents as i said and how you can prepare for them so first of all just to kind of set the scene um obviously i think we'll all be pretty familiar with cyber security incidents as a concept and it's all over the press at the minute and so every single organization out there will experience a

cyber security incident uh kind of almost every day depending on what you particularly define the term as and they're kind of becoming more frequent more sophisticated and having uh higher impact and alongside all of that and there's a lot of increasing expectations about how organizations respond to those incidents that might be expectations from the public who are now kind of quite aware of what organizations should be doing and shouldn't be doing after a response and will kind of happily take you to task on twitter for what they perceive to be a poor response to things like the regulator so the ico has obviously got the gdpr and can take quite punitive enforcement action against organizations who don't do this

well so uh ba being fined 20 million for their handling of the uh of the data breach they had in 2018 and the kind of underlying causes for that is perhaps the most notable example so i suppose with all of this right that just means it's really important to be ready to respond as and when this happens so i reckon that there's five things that every organization needs to do in order to effectively prepare so it's having documented processes having the right people who've got the right skills and experience having logs in place to enable the investigation having containment and eradication technology to help you actually manage the risk and coordination technology that's going to

help you to manage the response to that incident and over the next kind of 20 25 minutes i'm going to go through each of those and tell you a little bit more about what i think that means for the things that you actually need to do so first of all talking about arguably the coolest and sexiest of all of the things in cyber security processes so i think there are three types of processes that any incident response team needs to make sure they've got down so the first of those is an incident response plan so it's just an overarching document that tells the team or the organization that is responsible for that plan what they need to do in

the event of a cyber security incident now that plan shouldn't exist in isolation and that needs to work with all of the other stuff that exists in that organization so it might be a crisis management framework it might be an it incident response plan might be all kinds of kind of other policies and governance it can't exist in isolation and it needs to work with those then sitting underneath that there are technical run books so things that actually tell you what to do in the event of a specific incident scenario and that will sit underneath your incident response plan and kind of go into more detail so your incident response plan might tell you that when

you're dealing with an incident you need to categorize and prioritize your incident and then a technical run book will be specific to a scenario like a malware outbreak and it'll tell you how to do that in the event of that particular scenario so help you by saying okay when you do categorize the incident this is what your category probably is if you do prioritize the incident these are the things that you want to consider so you know how many endpoints has that piece of malware infected what type of malware is it what do we see the attacker doing with that malware kind of helping you to actually apply that incident response plan in that scenario
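
The runbook considerations just listed for a malware outbreak (how many endpoints are infected, what type of malware it is, what the attacker is seen doing) could be turned into a simple prioritisation rule. The following is only a minimal sketch: the priority labels, the threshold of 10 endpoints, the set of high-impact malware types and all names are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass

# Hypothetical values: the talk names the considerations, not these labels or thresholds.
HIGH_IMPACT_TYPES = {"ransomware", "wiper"}
WIDESPREAD_THRESHOLD = 10  # number of infected endpoints treated as "widespread"


@dataclass
class MalwareOutbreak:
    infected_endpoints: int  # how many endpoints has the malware infected
    malware_type: str        # what type of malware is it
    attacker_active: bool    # is an attacker seen doing things with the malware


def prioritise(incident: MalwareOutbreak) -> str:
    """Return an illustrative priority label for a malware outbreak."""
    # An active attacker or a destructive malware family is treated as most urgent.
    if incident.attacker_active or incident.malware_type in HIGH_IMPACT_TYPES:
        return "P1"
    # Otherwise the spread of the infection decides between the remaining levels.
    if incident.infected_endpoints >= WIDESPREAD_THRESHOLD:
        return "P2"
    return "P3"


if __name__ == "__main__":
    print(prioritise(MalwareOutbreak(3, "adware", False)))
```

A real runbook would set these rules from the organisation's own risk assessment; the point of the sketch is only that the runbook supplies scenario-specific inputs to the generic "categorise and prioritise" step of the plan.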

and lots of organizations do run books or playbooks whatever you want to call them in different ways so in some organizations you'll see that they have a specific runbook for every single detection use case that their soc has that tells the analyst exactly what to do or in other organizations you might just have like five or ten different overarching run books that kind of provide general guidance and when that's the case when you don't have kind of all the documentation in the world you need to decide what you're going to document and generally that's going to be decided based on risk so either the highest impact incidents that you're most worried about to ensure

that when something really bad happens your team has got a process that they can follow that's telling them what to do or the really likely things that they're going to be dealing with day in day out making sure they're doing the right things and then finally sitting underneath that you've got knowledge articles so it's detailed guidance for completing specific tasks and those tasks will underpin the actual things that the team is going to need to do in one or more different incident response scenarios and while all of the other things that we talked about might be quite controlled and governed and might go through lots of levels of review knowledge articles are typically just a

wiki right that everyone in the team can contribute to and you know people can spin up a new article or kind of change things if things aren't particularly working that might be something like uh how to analyze an ip address or how to review an alert firing in the antivirus provider whatever and that will help to enable the response to lots of different runbook scenarios potentially so this is what it looks like right so you've got your incident response plan at the top then hanging off the bottom of that you've got all of your different runbooks however many you decide and those will feed into the incident response plan and then you've got lots of knowledge based articles that all sit

under that which support one or more run books so that's it for processes now thinking about people so when you're resourcing an incident response function so when you're thinking about who's actually going to be in that function there are a couple of different models that you can choose and that's essentially determined by how big your organization is what the risk is and how much money you've got and so at the kind of the far end of the spectrum you might have a dedicated internal incident response team whose job is to respond to incidents and help do incident response related stuff all the time in the middle you might have a virtual incident response team so it might be a

smaller organization you don't have enough incidents or you don't have enough budget to justify having someone on incident response all the time but you know who the team is that's going to be formed in the event of there being an incident so it might be your kind of head of cyber security and a soc analyst or a threat intelligence handlers that can come together and form that incident response function if you need it and then you might not even have an incident response function internally at all you can outsource that and there's loads of different ways you can do that so you might have an mssp um and you could just hand over your incident

response function to them i think a lot of organizations are moving away from that model at the minute because especially when you're doing real hands-on incident response it requires quite an in-depth knowledge of your environment or indeed if you're really really small you might just have that kind of expertise on retainer now there are lots of different things to think about once you've decided your resourcing model and if you're insourcing it then you might want to think about the knowledge skills and behaviors that you want that team to have so what do you actually want them to know and be able to do then you can take those things and you can think about how that turns into

actual like roles that they're going to do so what's their job going to be and into job descriptions that you can recruit against and then once you've got those people into the organization what's the framework that you can give them for learning and development so that their careers can actually progress and even if you're outsourcing you're going to want to think about some of those same things as well and so you want to be able to give your supplier requirements for the kind of knowledge and skills that you're going to want available to you as and when you call for it and some of it also you

might just want to ensure that your provider's thinking about it so the service that they deliver to you is effective and a lot of the resources available to help you do this uh already exist so uh in the cabinet office for example we use the government security career framework that gives us loads and loads of information about the specific roles that the government thinks exists within security both within cyber security and also kind of wider things and then says you know what are the kind of key skills and behaviors we'd expect each of those roles to be able to do say an incident handler against all the different levels and grades that we have and what does

the training pathway look like and you can just take that and you can use that and you can apply it to your organization and tweak as needed and there's loads of other frameworks out there that you can use to help you do that kind of thing so nist from the us government has got another one that goes into huge amounts of detail the australian signals directorate has got another one which is kind of quite high level and then the skills framework for the information age is one that's kind of not aligned to any particular government it's a uh a not-for-profit organization that does that and so those are things that you can use to help you define and structure the

progression and uh the training for those people so now moving on to talk about logs i think it's basically two questions when we're thinking about logs what logs do we actually want to store and how long do we want to store them for so first of all thinking about what logs we're going to store so we need to come up with actual logging requirements for those systems that we're protecting to tell them to generate them and send them to the siem that we're then going to store i think in my mind there's three things that we want to think about there it's the real world threat so what is the attacker actually likely to be doing or trying to

do in my environment and so kind of taking those kind of threat models and then actually applying them so using things like threat intelligence and threat modeling to work that out then what are the investigative questions that we're going to have in the event of an incident so once i've found out that there's an incident on my environment what do i actually want to know about what the attacker has done or maybe hasn't done on that environment and that can drive what specific logs i want then also there's probably going to be external requirements of some form and that's going to help shape what logs you want and also maybe what logs you don't want so for example the gdpr talks

about the need to like minimize the amount of personal data or have a kind of reason for holding personal data so for example if there's logs with huge amounts of personal data in them and they don't really have a huge amount of security benefit then maybe let's not keep them or we might have a regulation that tells us what we must capture so for example if you're handling payment card data then pci dss might tell you exactly what logs they need you to store there might also be internal policies that you need to think about as well and then how long are you going to store those logs for so coming up with a retention period you obviously don't

want to keep those logs indefinitely so you can go back to things like the gdpr and also cost and so i think there's three things here one is real-world threat and also the capability that we have to detect those threats the investigative requirements so pretty much similar as last time and then also the cost that we're gonna incur by virtue of having those logs so um when we were doing this internally uh we had a look at mandiant's m-trends report and so that talked about the kind of dwell time for an attack so how long it takes for an attacker to get into the environment versus people detecting it and they found that for non-ransomware

attacks the average dwell time was 45 days and 25 percent of non-ransomware attacks had a dwell time of over 200 days so if we say that we're going to store logs for 14 days but most attacks are only found after 45 days that's not a whole load of use so we can use that to try and drive our logging requirements and also to justify it you can also use of course data from your own internal systems for that as well and your own internal cases so yeah if incidents are going to be investigated that occurred 45 or 200 days ago then the logs should be available to enable that but you also need to balance that against business

pressures right because you know storage isn't necessarily free and you can think about ways to try and reduce that cost so it might be that some logs have greater security benefit than others so you might want to store some logs for a short amount of time because they have a lot of value but they're very verbose but other logs aren't that verbose and incredibly useful and so you might keep them for say a year or you might think about moving things into cold storage and then if you do that there's a lot of things to think about as well so how do you transition them over how do you make sure that you delete them out of cold

storage at the right time etc etc so those are the two questions i see around logs now talking about containment eradication so i think there's probably three different types of containment and eradication action that we want to take so host based network based and identity and i'm just going to step through some of the different actions that i think you're probably going to want to be able to take and also some of the technology that you can put in place to enable you to do that so first of all thinking about host-based containment and eradication so the kind of actions that you might want to take here are switching off systems or restarting them isolating

your host from the environment identifying and removing files from hosts whether that's based on a file name or a hash or something else blocking files from executing on hosts again by a file name or a hash or removing persistence mechanisms like registry keys or scheduled tasks from your environment you're going to want to be able to do all of those things and technology like classic signature based av might help in some ways but typically we'd use something like edr or xdr to help to enable that and there are also often tools built into operating systems that can help you do this to an extent as well or existing management tools that exist on the infrastructure next thinking about

network-based containment and eradication so kind of the two key things you're going to want to think about here blocking known iocs on your external infrastructure so stopping malware from calling home if you've got an infection on your environment or stopping it from kind of connecting back or the attacker from connecting again so kind of blocking things at the perimeter you also might want to think about how you can kind of pull the plug on all or bits of your network i think that's especially relevant at the moment when we're seeing this huge rise in human operated ransomware attacks where if you see a ransomware attack impacting one area of your estate you might want to think about

isolating that area of the estate as quickly as possible so that it can't spread into other areas especially more critical ones and so in this way you can uh you can use your existing security tooling so things like firewalls ids ips you can also use the network infrastructure that you've got in place your kind of switches and routers uh and then there's also the kind of good old-fashioned option of just pulling the plug on stuff finally thinking about identity based containment and eradication so the things you might want to think about doing in the event of an incident changing account permissions access or privileges so if the attacker's provisioned themselves with unnecessary or unauthorized privileges

given themselves a domain admin account how can you take that away how can you reset individual credentials uh including both user credentials and like service accounts and service accounts especially can be really really hard to reset especially when those credentials are hard coded into applications so making sure that you're able to do that and then also how you can do that but do it at scale so um i remember seeing i think last year or the year before a university in germany had to reset credentials and uh all the students had to turn up to the service desk all like 45 000 of them and had their password printed out for them and

handed to them and so that's obviously not sustainable and that doesn't scale so how can you make sure that you can actually reset credentials at scale and then how can you actually disable and delete accounts if the attacker is going to provision themselves with new accounts so lots of things to do there and i think all of those really are going to be within your existing identity systems so typically using active directory and then any kind of additional functionality that you might want to put into that or any scripts that you want to build an app to enable you to do those things quickly and at scale and especially around the kind of resetting credentials a lot of that is around like

tying in with the service desk and talking to users and telling them about things and having mechanisms that you can for example securely tell them about their new password so finally coordination technology so i think there's four things that we want to do here and i think we need to be able to ensure we can do those effectively and also ensure that we can do them in a resilient manner and in a secure manner so for example if our email environment is completely compromised are we still able to ensure that we can communicate as a team or if we don't have an email environment anymore how can we ensure that we can still communicate you know

so after a human-operated ransomware attack uh often that will take down huge amounts of internal infrastructure so uh i've worked with organizations before who in the aftermath of a human-operated ransomware attack haven't had email their internal phone directories have all been wiped and so nobody knew how to get in contact with each other unless they happen to know each other's kind of personal mobiles and were all talking together on whatsapp and that just doesn't work so how can you ensure you do all of these in a resilient and secure manner so we've talked about communication and both like synchronous and asynchronous communication within your team and also with external partners so that might be your retained incident

response partner or could be the government if you're working with us tracking tasks so how can you actually make sure that you document the things that you need to do to respond to the incident and kind of do that basic project management so tracking the due dates of things the level of effort that's going to be required the status of those tasks the people who are gonna be working on them and the next steps because an incident is a project like any other and we can't just have chaos we need to make sure that we're all working together collaboration so we might need to be collaborating on actual analysis so there might be kind of a big spreadsheet

which has exported all of the forensic artifacts from a system how can we all work on that together um or there might be response and documentation tasks so we might have to write a report how can we all do that together and then off the back of all of that reporting so both within an incident and across multiple incidents how can you capture key statistics within the incident so how many systems are compromised how many users are compromised what's the state of each compromised system or across multiple incidents so how many incidents have we had this year what were the actors involved what were their motivations what was the impact how long did it take

us to respond to remediate all of that kind of thing and then output those to enable management reporting so that's us finished five things i think you need to prepare for a cyber incident documented processes the right people with the right skills and experience logs containment and eradication technology and coordination technology so that's me done um before i say uh before i open the floor to questions just let you know the cabinet office is hiring across red and blue team roles we're a growing team working at the heart of government doing some really cool stuff dm me on twitter my dms are open if you want to find any more information or come and chat to me afterwards that's me

done any questions

[Applause]
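
The retention argument from the logs section of the talk (14 days of retention is of little use if most attacks are only found after 45 days) can be checked with a short calculation. This is a minimal sketch in Python: the function name and the sample dwell times are invented for illustration and are not figures from the talk or from any report; in practice the input would come from your own incident records or published dwell-time statistics.

```python
from typing import Sequence


def retention_coverage(dwell_days: Sequence[float], retention_days: float) -> float:
    """Fraction of incidents whose whole dwell time fits inside the retention window."""
    if not dwell_days:
        raise ValueError("need at least one dwell time")
    covered = sum(1 for days in dwell_days if days <= retention_days)
    return covered / len(dwell_days)


# Hypothetical dwell times in days, made up purely for this example.
SAMPLE_DWELL_DAYS = [3, 10, 21, 45, 45, 60, 120, 210, 250, 400]

if __name__ == "__main__":
    for retention in (14, 45, 200, 365):
        share = retention_coverage(SAMPLE_DWELL_DAYS, retention)
        print(f"{retention:>3} days of retention covers {share:.0%} of sample incidents")
```

Comparing the coverage at each candidate retention period against its storage cost is one way to make the trade-off described in the talk explicit and defensible.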

[Music] how are you going to change the appetite within government type organizations for spending on it equipment tools cyber rather than the other priority investment areas and staff and upgrading their pay that's a very good question and a very hard question to answer thank you um so yeah i think security is obviously a problem and that is exemplified by some recent incidents which have notably impacted government but i think the government as a whole is doing a whole load of really really cool stuff to try and improve security and dedicating a lot of time and effort to doing that um so you've got organizations which are kind of relatively new and are kind of increasingly delivering a lot of uh

value in terms of security so the ncsc i think is really ramped up over the past couple years and is doing some really cool stuff to engage with industry and also within government itself within government we've also got what's called the government security group which sits within cabinet office which is all around like actually kind of doing that hands-on securing of government and getting people to buy into security um and uh i think that is in many ways working i think we've got a lot i think we've got a long way to go difficult for me to say a huge amount more without getting fired um so hopefully that answers your question but i think yeah

it's about talking to people it's getting buy-in for security and it's about uh showing uh what happens when it perhaps doesn't go quite as we'd hope hello great talk regarding the containment of like um hosts that have been compromised and stuff like that you mentioned kind of pulling the plug and yeah that's more colloquially or literally but obviously uh what did you mean by that because i imagine if you actually pull the plug you stand the chance of losing a lot of potential evidence that might exist in memory and other such kind of unfair sorry i should clarify not necessarily power plug the network plug um but but yes i think um but there might be situations right

where you do want to pull the power plug and so you know if for example you have a human-operated ransomware actor on your environment who is actively deploying ransomware and it's the case of either you switch off a system to prevent them from accessing that any further or the rest of your environment gets owned and in the process of doing so you lose some evidence then i think it's a case of weighing up the pros and cons of that particular decision and so yeah absolutely i think the priority has always got to be enabling the investigation and you know i talked in the beginning about the like scrutiny that regulators are increasingly putting on the response

i think there have been a couple of instances where the ico has criticized organizations for not taking and maintaining uh forensic copies of compromised systems and so obviously you've got to kind of balance the pressures of that and but i think in my mind it's just about being able to take a decision that's defensible so to show that you've kind of considered the risks and rewards of you doing that particular thing and being able to justify it at a later date um so yeah i agree that's something you need to be very careful of but there are situations where absolutely you might want to pull the plug if anyone's seen the scene in

ncis where gibbs pulls the plug on the monitor very similar thing so yeah um good thank you um so there is an ins so sometimes it happens that there is a incident which we do not have a run book for and you you might always run into such kind of situations what are the things that we can do to say that despite not having a run book we ran through a minimum standard of that incident essentially uh i'm not sure i quite understand the question are you able to clarify sorry so for example if you don't have a run book for it um what what kind of other things that can we can do so that the incident is handled

well despite not having a run book ah sure okay yeah so i mean i think run books are an interesting one and processes i think can be um like can vary in their usefulness right so if you're a really experienced team then i think a documented process is just like a kind of handrail for you right and makes sure you're going in the right direction and but if you're a less experienced team or you're doing something that you're not experienced in then those documented processes can be absolutely vital and i'd go back to that hierarchy right so we spoke about having that incident response plan at the top and i think in my mind right that's a non-negotiable

now it's probably unlikely that you're going to have a run book for every single incident scenario that you're ever going to come across right and if you do have that it just turns into a cottage industry and actually you end up in a situation where you either spend all of your life updating run books that you never use or you write run books and by the time you actually come to use them they're pointless so i think it's having the right documentation at multiple levels so if you don't have a runbook then your incident response plan should still be able to tell you the kind of things that you need to be doing and the questions

that you should be asking um and then you know if there's and so you should kind of be able to work it out right alongside all of the other training that you've got um so hopefully it should still work out ideally you'd want a run book there right yeah do we have any other questions no okay then please show our appreciation for gabriel
