
Before the Breach: The Security Essentials

BSides Charlotte · 2025 · 30:08 · 75 views · Published 2025-05 · Watch on YouTube ↗
About this talk
Peter Ukanov, a senior consultant at Google Public Sector Mandiant, shares practical security essentials developed from frontline incident response investigations. The talk covers fundamental hardening measures—Windows event log sizing, Linux bash history configuration, web server privilege separation, and log aggregation—that organizations consistently overlook but that dramatically improve breach detection and response capabilities. Through real-world case studies, Ukanov demonstrates how these basics, when implemented, prevent attackers from operating undetected for extended periods.
Original YouTube description
“Don’t wait for the siren - fortify your defenses now.” Whether you're a sprawling enterprise or a nimble small organization, the basics of positioning oneself to effectively respond to a cyber incident often end up being overlooked. In this talk, we’ll look at several simple security essentials, developed from countless front-line investigations by Mandiant into real-world breaches, that when implemented would give you a significant leg up in detecting and responding to the eventual breach. As each of the essentials is discussed, examples from specific investigations where they were found missing will be highlighted.
Transcript [en]

Welcome, everybody, to my talk, Before the Breach: The Security Essentials. In this talk, we'll briefly look at some simple security essentials, developed from countless frontline investigations conducted by Mandiant into various real-world security breaches, that when implemented would give you a significant leg up in detecting and responding to the eventual breach. A quick disclaimer for this presentation: I might be talking about some case studies or examples. These are drawn from various engagements across my professional career and do not reflect any specific client that I worked with.

From an agenda perspective, the talk is broken into a few parts: a bit about Windows, one thing to talk about in the Linux world regarding security essentials, web servers in general, the procedural aspect, and then everybody's favorite, what to do beyond the essentials. That last part is the long-term look at how incident response and the cybersecurity program in your organization might be developed.

A quick intro: I'm Peter Ukanov. I am a senior consultant on the Google Public Sector Mandiant incident response team. Before joining Mandiant, and then Google, I worked at Dragos, as well as the Defense Information Systems Agency, which is an agency within the Defense Department. Outside of work, I enjoy mountain biking,

doing DIY, both artsy and house stuff, and cooking. On one of the slides, I'll be going over some scripts and configurations that will be available on my GitHub, so if you want to take a look, copy them, and play around with them, feel free to download them.

All right, jumping into the security essentials. The first one is Windows, and here we're talking about Windows event log sizes. Everybody has probably heard this a million times in their career or in school, but a lot of organizations still do not increase the default size of Windows event logs. By default, the critical event logs are limited to roughly 20 megabytes in size, which means that as you enable various auditing, you're not going to keep data on the host for long.

From a recommendations perspective, this is one of the most common recommendations we give after an incident, as well as in proactive work: leverage Group Policy to increase the size of the critical event logs to at least a gigabyte. This can be done multiple ways. For some of the core event logs that have existed since the Windows NT days, there are Group Policy templates in place where you just enable the setting, type in the new value, and you're in business. For some of the newer event logs, you have to use the registry and set the specific channel names to their desired maximum sizes.

The core event logs we always recommend increasing are Security, System, and Application, which exist across all your Windows operating systems, plus the Windows PowerShell and PowerShell Operational logs, where various script logs get recorded, so whenever somebody executes PowerShell on the command line, you get a copy of the payload that was executed.

From a pros and cons perspective, the pros should be pretty obvious: longer retention. A bigger size means you have a lot more data to work with should an incident occur. I've seen one-gigabyte logs, especially for Security, give you over 30 days of process creation events if that auditing is configured on your systems. The con is that bigger log sizes mean more disk usage, but with how big integrated storage is nowadays, that usually isn't a concern.

One quick example we've seen in incidents: an organization had all the various security auditing enabled, so process creations were being audited and various file access audits were being logged; there were a lot of events being generated. However, they forgot to increase the log size. So when an incident did occur, the Security event log on the compromised hosts literally had 60 minutes' worth of data in it, nowhere near covering how long the attacker had been living in that environment.

Now, a quick look at how to actually configure these logs using Group Policy. There are two screenshots here. The one on the left shows the four core critical event logs in Group Policy, where it's literally a toggle: enable, enter the size, and you're in business for that log. The one on the right shows the view in Group Policy where you have to create a new registry object, specifying the full registry path to the channel whose size you're increasing, then hit OK and apply it to your Group Policy. Usually this is applied across all computer objects in an environment, but you can always tweak it, for example if your servers have longer retention requirements.

The other Windows essential: enabling 4688 events. By default, for whatever reason, Windows does not log process creation events (event ID 4688) except when it boots up. So if you do look at the security event

log and you see 4688 events, you might get all excited, but then it's actually only about five events, which just happen whenever the Windows logging system starts up. Additionally, even when process creation auditing is enabled, 4688 events don't log the command line of the executing process by default. So again, from a recommendations perspective, leverage Group Policy to enable process creation logging, and also enable the secondary setting so the command line gets logged with it.

The pros: if you enable both of those features, the full process path, as well as any command-line arguments passed to that process, gets logged to the event log. During an incident, this makes identifying malicious activity so much easier; you're no longer guessing what parameters were passed into a binary, for example. The con is that, again, this is more data being written to the event log, so you need to increase the event log size; it ties into the first essential. And something to consider should this get enabled: sometimes people put sensitive arguments like passwords into their command lines. Not best practice, but it still happens.

Configuration is pretty straightforward. In Group Policy under Detailed Tracking, there is an option for process creation auditing; you just enable it for Success, and boom. If you look at the screenshot on the right, it shows me executing ping, and one thing you'll notice is that the Process Command Line value is empty. This can be remediated by enabling the "include command line" feature, and all of a sudden you would see that I was actually pinging my local gateway. That makes life a lot simpler from a defender's perspective, and if you're a sysadmin trying to troubleshoot why an application isn't working, so not even security related, this also benefits you.

Now, the third security essential for Windows is monitoring Defender alerts. By default, Microsoft Defender is enabled in all Windows installations, and even if you have a third-party

AV solution or EDR, oftentimes Defender will go into what Microsoft calls passive mode, where it just monitors what's happening on the system and alerts rather than blocking outright. When Defender detects a malicious file or process, that gets written to the Microsoft-Windows-Windows Defender/Operational event log. The events you would be interested in are 1116 and 1117, which get recorded whenever there's a detection, plus the various 5000-series events that say things like "Defender is online" or "something got disabled."

From a monitoring perspective, you might be thinking that Microsoft doesn't provide any tools out of the box to monitor this. Well, actually, they kind of sort of do. It requires leveraging built-in scheduled tasks: you can create a scheduled task that fires when a certain event is generated, and tie that to a PowerShell script that sends a notification, maybe to an email or to a Slack or Teams channel, saying, "Hey, this box had a detection; you should go remediate it." The pro of having this enabled is that you get notified whenever Defender fires on a malicious file. The downside is a bunch of potential admin overhead; you have to keep testing it and making sure it works.

A perfect example of why to monitor Defender: I recently worked a case where an organization's public web server got compromised, and Defender started alerting pretty much right off the bat, as soon as the attacker dropped their first web shell. That server was compromised for over a year, and throughout that year, Defender was continuously alerting on various web shells and attacker binaries being deployed. Nobody was actually monitoring those alerts. What ended up happening is that the attackers leveraged this web server to pivot further into the environment, and in the end, they ransomwared the entire environment. Had the client been monitoring this web server and the alerts from Defender, they would have potentially known much earlier that somebody bad was in the environment.

Again, this scheduled task, as well as the PowerShell script, will be available on my GitHub. It shows how to build out a query in the scheduled task to monitor for a specific event ID, pull out the various parameters in that event, and pass them off to a PowerShell script for whatever you want to do with them. And here I have an actual example of the whole thing set up: I'm downloading the EICAR test file, with my scheduled task and PowerShell script hooked up, so whenever Defender

actually alerts that a malicious file was downloaded, I get a notification in my Slack channel saying Defender detected the EICAR file at this path on this host.

Now, from a Linux essentials perspective, there's really only one I want to hit on, and that is enabling timestamps in bash history. Bash history is typically a very valuable source of forensic evidence in an investigation, assuming the attacker didn't go in and delete it; it records the various commands that were executed in a shell. However, by default, a lot of the bash history settings are pretty simplistic, both in what gets logged and in how commands get logged.

From a recommendations perspective, the first thing I definitely recommend is enabling the HISTTIMEFORMAT setting for all users on a system. This allows bash to actually record the timestamp when a command occurred, so you're no longer guessing whether, for example, that ps command was executed by an admin eight months ago or by an attacker a couple of weeks ago. Additionally, once you have timestamps enabled, you can also adjust the configuration to increase the number of commands that get recorded, as well as how big the history file can grow. One more thing to look into is the PROMPT_COMMAND variable: by modifying it a little, each command gets written directly to the history file as it is issued, instead of waiting for the user's shell session to terminate before it gets written out.

The pros of having this enabled: you get much better visibility into what commands are getting executed and when, potentially keeping them longer as well. The con, again, is more admin overhead; it requires testing the configuration and validating the changes to make sure you're not negatively impacting the systems. And here's an example of how enabling timestamping in bash history would show up.
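As a concrete sketch, the settings described above might be dropped into a system-wide profile script. The path, format string, and sizes here are illustrative choices, not the speaker's published configuration (his actual scripts are on his GitHub):

```shell
# Illustrative bash history hardening, e.g. saved as
# /etc/profile.d/history-hardening.sh (path and values are examples).

# Record a timestamp (date, time, timezone) with every history entry
export HISTTIMEFORMAT='%F %T %z '

# Keep far more than the default few hundred entries, in memory and on disk
export HISTSIZE=100000
export HISTFILESIZE=100000

# Append each command to the history file as it is issued, instead of
# waiting for the shell session to exit
export PROMPT_COMMAND='history -a'
```

In an interactive shell with these settings, `history` prints each command with its timestamp, and the on-disk history file gains an epoch-timestamp comment line ahead of each command.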

The screenshots on the left with the red arrow show the default history, just the commands. As soon as I've enabled time logging, we actually see the specific timestamp, with the system's time zone, showing when each command was executed. And when we just cat the bash history file, we see epoch timestamps, prepended with a pound sign, marking when the next command was fired off.

A perfect example of why timestamps are critical in bash history: I worked an investigation focusing on network appliances that had been exploited, and on some of them the client had enabled bash history timestamps. That is, they went above and beyond what was required in setting them up, and there we had a full, by-the-second activity log of the attacker doing recon in the environment and running commands to deploy their pivoting tools. On a couple of other appliances of the same brand, for whatever reason, the client didn't make this change, and in those instances it was really difficult to figure out when exactly certain commands were executed, or whether they were even executed by an attacker, because at that point the attacker was running commands very similar to what an admin troubleshooting an offline device would run.

Now, pivoting over to web server essentials. We have two of them

here. The first one: stop running your web servers as admin. There have been so many investigations where a publicly facing web server gets compromised, whether through unpatched software, a zero day, or just bad programming, and the application that gets compromised is running as root or SYSTEM, or in the worst case, even as domain admin. There are a couple of ways to make sure your web servers aren't running as admin: you can use service accounts or restricted user accounts to run those applications. In the Windows world, Microsoft has something called group managed service accounts, or gMSAs. These are special accounts that automatically rotate their passwords and let you scope permissions to that specific service account.

The pros should be pretty straightforward: if a web server gets compromised, the attacker doesn't instantaneously have access to domain admin, for example. And that does happen. We had an investigation where a WordPress site got popped, and for whatever reason, whether it was left over from troubleshooting or the typical "I need this website stood up really quickly," the client had their WordPress site running as domain admin. Pretty much instantaneously after the attacker got remote access to that box, they were domain admin, and you can imagine they just went to

town, without having to spend extra time performing recon and escalating privileges. From a con perspective, depending on your workflows, this could be a bit of extra work: going through all your public websites and web servers to make sure the user each application runs as has only the necessary permissions. It can be tedious if it hasn't been done, but it pays dividends in the event that a publicly facing application gets exploited, and realistically, it's probably going to get exploited at some point.

All right, the next security essential for web servers focuses on the X-Forwarded-For header. It is best practice not to expose your web servers directly to the public internet, where they'll get scanned and potentially exploited, but instead to deploy them behind a web application firewall or a load balancer, for performance or security reasons. So from a recommendations perspective, one of the very first things for web servers, after making sure you're not running as admin, is to enable this header on both the web server and any network devices in front of it that handle those web requests.

The pro of having this enabled is that the originating client IP from outside your network gets logged in the web server's request logs, instead of just the IP of the upstream network device. The values of this header are comma-delimited, with one entry for every network device that the request passes through. The con, again: enabling more logging just increases the log size.

A perfect example for this header: I worked with an organization that had a really robust security stack, and they were concerned about why their web server was logging all these failed web requests and failed exploit attempts. We were looking at the various logs, and in this case the client had the web servers sitting behind a load balancer and had forgotten to enable the X-Forwarded-For header. So all we saw in the web logs was the load balancer performing request after request; you couldn't even see who the originating client was.

This slide shows the two different ways of enabling the header. The top image is an Apache configuration: you go into one of the Apache configuration files, either for the site or globally, and add the X-Forwarded-For field, the bit in curly braces, to the log format. The other one is for IIS, where you go into Logging, select the fields, and then you

actually have to type in X-Forwarded-For manually as a custom field and save it. Once that's done, this is what starts happening in your logs. The first record is generic: a request came in from this 10.0.0.1 IP, and who knows what the originating client IP is. Once the request has X-Forwarded-For headers, we actually see the originating client IP at the very end. So even though the log still shows the request coming in from 10.0.0.1, you'll also have the 100.64 address in this example, so you'll be able to pivot on it and identify any other potentially malicious web requests that originated from the original client IP.
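To make the comma-delimited behavior concrete, here is a minimal sketch of pulling the originating client out of an access-log line once the header is being logged. The log line and IP addresses are made up for illustration:

```shell
# Hypothetical access-log record where the final quoted field is the
# X-Forwarded-For chain: originating client first, then each proxy hop.
line='10.0.0.1 - - [01/May/2025:12:00:00 +0000] "GET /login HTTP/1.1" 200 512 "203.0.113.7, 10.0.0.1"'

# Extract the last quoted field (the X-Forwarded-For chain)...
xff=$(printf '%s\n' "$line" | sed 's/.*"\([^"]*\)"$/\1/')

# ...and take the first comma-delimited entry: the originating client IP
client=${xff%%,*}
echo "$client"
```

Here the request traversed one load balancer, so the chain has two entries and the first one, 203.0.113.7, is the outside client worth pivoting on.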

So those were a couple of security essentials for Windows, Linux, and web servers. The next things are more procedural, plus what to explore in the future once everything is robust.

From a procedural aspect, the big thing is to develop a basic incident response plan. Organizations will have this at various levels of maturity. Sometimes you might not have one at all, and it's all institutional knowledge: Bob over in the XYZ department knows everything about everything, and when an incident happens, you go to him. But what happens if Bob is on vacation? Then you're just flailing around.

When developing an incident response plan, leverage an existing model, something like NIST, or PICERL (people say it differently), or the DAIR model, which is up and coming. PICERL is a pretty linear, static model; it focuses on each phase of an incident: preparation, identification, containment, eradication, recovery, and lessons learned. It's very similar to a response plan you might have outside the cyber realm, like a business-impact or there's-a-hurricane-coming kind of plan. The other model, DAIR, however you pronounce it, is the Dynamic Approach to Incident Response, a relatively new model that focuses on milestones, activities, and outcomes, so it's cyclical rather than linear and static.

As you're developing this plan, you want to involve the key stakeholders in your organization: senior leadership, sysadmins, maybe application owners. You would want to write down who the system owners are and identify critical business processes, so you know, for example, that you need to prioritize restoring your payroll system, or your paycheck processing if you handle paychecks for third-party organizations. You also want to include a listing of all the internal stakeholders as well as third-party vendors. This could include internal and external counsel: do you want to do the investigation under privilege so that it's a little easier to protect your sensitive information? Include your public relations team: if your organization goes down, all your clients are going to ask what's going on, and you want your PR team aware of the situation and how you're trying to remediate the issue. Also include any other third-party providers you might have, whether it's your incident response retainer or your SaaS security provider, and so on. A simple Excel spreadsheet for this works quite well most of the time.

Now, beyond the basics. This is once you have the above set up, you think you're good, and you're ready to move on to the next step: log aggregation. I don't like to say that a brand-new security program should start ingesting logs right off the bat; you want the basics in place before you even get to collecting logs. From a collection perspective, there are probably two camps: start collecting just the critical logs, or collect all the logs and figure out what you're doing with them later. From a log

perspective, you should prioritize what you collect, potentially depending on your regulatory requirements. That can range from AV logs, which are relatively low volume, all the way to host logs and file share access, if monitoring access to your sensitive file shares is something you want. And from a prioritization perspective, again, prioritize the core critical servers over workstations. Then, once you're collecting from workstations, start with the sensitive ones: say you're a power plant and you have HMIs, collect from those human-machine interface computers that are super vital for the plant to work before the regular users' machines.
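One lightweight way to capture that prioritization, sketched here with entirely made-up asset names and tiers, is a sortable source list that yields a collection rollout order:

```shell
# Hypothetical onboarding list in "tier,asset group,log source" form.
# Tier 1 = critical servers and HMIs, 2 = member servers, 3 = workstations.
sources='2,member servers,Windows event logs
1,domain controllers,Security/System/Application event logs
3,general workstations,Windows event logs
1,public web servers,web access and error logs
1,HMI workstations,process creation events'

# Stable numeric sort on the tier column gives the rollout order
rollout=$(printf '%s\n' "$sources" | sort -t, -k1,1n -s)
printf '%s\n' "$rollout"
```

The point is only that the order is explicit and reviewable; a real program would hang retention and regulatory notes off each row as well.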

And the other thing is to actually test those logs to make sure they're really making it into your SIEM or log aggregator. There have been multiple investigations where the client says they have, let's say, Splunk or some other SIEM in place, and when you actually log in, the data that's there is not the data you're looking for. Finally, from a log aggregation perspective, know your retention for your logs. As part of the log collection program you're implementing, identify how long various logs get stored in your environment, whether on disk or in your SIEM, and if you're relying on a vendor, do you know how long your vendor stores your security logs? The vendor could be an EDR provider, a SaaS provider, or some third-party hosting solution: do they even store logs in a way that's readily accessible?

And once you have log aggregation working and you have a relatively mature security program, and this is well beyond the basics, develop a collection management framework. A collection management framework is, say, an Excel spreadsheet that lists all the various applications, tools, and technologies in the environment and what kinds of logs they might be generating, logs that can be used for incident response, for threat hunts, or by your sysadmins to make sure systems are working as intended. This is a living document that gets regularly updated, maybe quarterly, maybe whenever a new tool gets deployed, to address business needs, newly emerged threats, or new tech. Say you replaced EDR vendor XYZ with EDR vendor XYB; see what kinds of logs can now be ingested into your SIEM.

The collection management framework helps you answer questions like: what data do we even have in the environment? Modern security stacks have so much data; this lets you narrow down what is critical for investigative purposes. Are the fields normalized? A lot of the time, when you're just throwing logs into a SIEM, nobody's normalizing fields, which makes pivoting between data sources harder. How long is data actually available? You can see your retention periods in real time. And it identifies where logs live: maybe for whatever reason you don't collect a source into your SIEM or log aggregator. Perfect, then you have a note saying that if you need those specific sensitive logs, you

go directly to the device. And finally, from a maturity perspective, once you have these logs documented, you can potentially build a coverage map against, say, MITRE ATT&CK, to see which areas you might be lacking from a detection perspective, or where you're really strong in one category, so that if something happens using that MITRE technique, you will always detect it.

As an example, here's a quick, simple look at what a collection management framework could look like. You identify the devices or assets in the environment and highlight the various data types present on them. For example, we have a Windows server with Windows event logs, specifically process creation. From a MITRE perspective, this would log anything regarding execution, persistence, and lateral movement. Based on the size of your event log and the enabled audits, the data retention might be 10 days, and you don't actually have a SIEM in this instance, so all your data is local. That means an analyst, whether an internal analyst or an external one you brought in, would have to manually go to every single server and collect that data. As a follow-on, you can identify that you're also interested in Windows Defender logs, so that becomes another column in your collection management framework, and whenever an incident does occur, analysts can quickly take a look at what logs are actually available. This can be done for all sorts of things: your VPN servers (how would I take a look at VPN sessions?), firewalls, intrusion detection alerts, various other alerts, and so on. It can be expanded both vertically and horizontally, so you could create a collection management framework for every single type of device, or just keep something at a high level.

And with that, these are probably the top things that we've always run into during investigations. If more people enabled even just the first three things in the Windows essentials, it would make life a little easier for both investigators and sysadmins. And with that, thank you, everybody.
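For reference, the three Windows essentials covered in the talk map roughly to the following local commands. This is a hedged sketch: the talk applies the same settings centrally via Group Policy, the speaker's actual scripts live on his GitHub, and the 1 GB size is an illustrative value. Run from an elevated PowerShell or cmd prompt, and on a test machine first:

```shell
# 1. Grow the critical event logs to ~1 GB (value in bytes)
wevtutil sl Security /ms:1073741824
wevtutil sl System /ms:1073741824
wevtutil sl Application /ms:1073741824
wevtutil sl "Microsoft-Windows-PowerShell/Operational" /ms:1073741824

# 2. Enable 4688 process creation auditing...
auditpol /set /subcategory:"Process Creation" /success:enable

# ...and include the command line in 4688 events
reg add "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System\Audit" /v ProcessCreationIncludeCmdLine_Enabled /t REG_DWORD /d 1 /f
```

Monitoring Defender alerts (the third essential) has no one-line equivalent; it is the scheduled-task-plus-script pattern the speaker describes.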