Incomplete Views: Network Incident Response in a Data-Poor Environment

Name: Incomplete Views: Network Incident Response in a Data-Poor Environment
Uploaded: 2019-02-10
Duration: 54 min 31 s

BSides PDX · 2018 · 54:31 · 150 views · Published 2019-02

Speakers

Malcolm Heath

Tags

Category: Technical

Style: Talk

About this talk

Malcolm Heath: Incomplete Views: Network Incident Response in a Data-Poor Environment. Quite often, when doing network incident response, we find that we either don’t have, or can’t get, adequate information to determine what the actual situation is. While it would be great if we lived in a world where we had all the information all the time, the fact of the matter is that we often need to take action based on an incomplete picture. This talk will focus on what sorts of network data and data collection systems you might want to have, how to analyze the data you do have, and how to use some innovative techniques to mine data you already collect for interesting, and actionable, items. Malcolm is a security engineer with a company that makes networking and security products. He does incident response, PSIRT, and some research.

Transcript [en]

All right, so hopefully everyone can hear me okay, or should I grab the mic? It's good? Okay. So, yeah, this is titled Incomplete Views. I'm gonna be talking about network incident response sort of as it actually happens, rather than maybe how we think it might. Who am I? Well, I'm a senior security engineer with F5 Networks, and I'm also a member of the F5 Security Incident Response Team, which is essentially a group within our support organization that helps customers when they are under attack. It's actually something that we just offer to everybody; it's not a charged-for service. They can call in and they can say, help, we need some assistance, and

we try to help them out as best as we can. Because of that, I end up talking to a lot of different customers under a lot of different kinds of attacks, with a lot of different strategies and capabilities in terms of network incident response. So this is essentially me trying to tell you some things that I've seen and learned from that experience of being a fly on the wall in a bunch of different kinds of contexts. And I should say this is not a vendor presentation. I'm not going to be mentioning specific vendor technologies, certainly not those of the company that I work for, nor anybody else's. This is going to

be relatively high level and process focused to a great extent, although we will get into some specifics of broad tactical tool sets and so forth that you might want to use. The next question that I have is: I don't actually really know who attends BSides. I mean, I know we're all interested in information security in some way or another, so I want to just ask a couple of questions to get a sense of where you're all at and what you're interested in. How many of you have done, or do, incident response in some way or another? Great. How many of you think you do it really, really, really well?

Oh come on, are you now? Okay. And let's see, what's another good question to ask? How many of you feel that the biggest issue that you have with the way that you do incident response is your tool set, like what products you have or what capabilities you have at a technical level? Okay, one or two, right. And how many of you feel that it's much more a matter of communication and, how do I want to put it, communication between groups, essentially human resource stuff? Okay, yeah. Okay, so we got the right audience, that's great. So what I'm going to be talking about is not forensics. I'm not gonna be going into

anything about determining what happened after the attack. I'm not gonna be talking about attribution, and I'm not going to be talking about detecting implants or threat hunting or any of these sorts of things, because the sort of incidents that I've typically been involved in don't really go into those areas; those happen off scene. The main thing that I've been able to do is help customers respond to attacks while they are happening, and the goal of that is essentially just emergency response, right? They are suffering some sort of a situation that is inhibiting their ability to perform their business: their site's down, their database is slow, their accounts are getting locked out, whatever

it happens to be, it's a direct and impactful situation for them. So the goal is just to get them to the point where they can stop the bleeding, take a breath, and then figure out what to do next, right? That's really the scope of the incident response. There are of course lots of other parts of incident response that you have to go through, but the talk is going to be focused on this, and a lot of it has to do with preparation. This is the sort of thing that I've had the chance to see. Some of these probably everybody's seen: SYN floods, UDP floods, ICMP floods, GET

floods, right? Pseudo-random subdomain attacks against DNS, that's another good one. But we've also seen things like massive SQL injection attempts against web apps. We see a lot of brute-force authentication attacks, certainly against SSH. I mean, anybody who runs an exposed SSH server gets brute-forced, like, constantly, but also against web apps and pretty much anything that you can possibly stick a username and password into. We've seen some very interesting things with computational attacks against databases. In case anybody doesn't quite know what I mean by a computational attack, I mean a denial-of-service attack that leverages the ability to make something do more

computation than you wanted it to, right? A really good example of this was a company that had a web-based API for finding a provider. I'm sure you've seen this sort of thing: type in your zip code, type in the kind of provider you want, like a medical provider or whatever, and we'll show you the doctors' offices in your area. Well, whoever was attacking them, and I don't know who it was, figured out that they could use coordinates in the API call to determine where you were. They set the coordinates to be in the middle of the Indian Ocean, right, and

they said the kind of provider that they wanted to get was, like, everybody: show me everything, show me all of them. So one little GET request to that API, and suddenly their database was having to scan every single row, and their database blew up. Okay, that's a really good example of a computational attack. We also occasionally see essentially vulnerability-based denial of service. Obviously we're usually thinking about vulnerabilities in terms of remote code execution, but there is a huge class of vulnerabilities that people have to deal with that are basically just: I can send you a particular kind of network flow and your stuff will break. Your switch will fall over, your firewall will reboot,

your proxy device will crash, whatever it happens to be. These are really great from an attacker perspective, because even if you have high-availability infrastructure where you've got failovers and all these sorts of things going on, I can just send one here, and send one there, and send one there, and send one there, and just hold everything down for as long as I want. Also, interestingly enough, and this one I was actually kind of surprised by, web scraping is kind of a thing now. Competitors scrape each other's websites to find out how much the opposition is charging for things, or for a whole variety of reasons. In fact there's an

entire, I think I'll call it a sneaker botnet, and some people actually call it sneaker net, that exists solely to trawl sneaker seller sites to find the most awesome sneakers to buy. It's all very, very automated, because the sneakerheads love their sneakers. It is legitimate in the sense that they just want to get the sneakers before everybody else does, right? They're customers. But a misconfigured sneaker net can certainly take down a website if the traffic level is high enough, or if they trip over some sort of vulnerability. We've also seen some stuff that's been in the news, like GRE flooding with the Mirai botnet. That was what hit Krebs on Security. I

don't know if you all heard about that; it was like 600 gigabits a second or something insane like that. And a lot more. So this is the obligatory warning slide. No one in this room needs to hear this, but I'm gonna say it anyway, just in case: if you haven't been hit by some of these things, you're probably gonna get hit by some of these things at some point. For some of the companies that I work with, this is just a cost of business; they are under constant attack. I had a kind of fun discussion with one of them. They sent me a packet capture and they said, can you tell me what's wrong and tell me

what's going on? I looked at it and immediately I was like, oh, you're under a massive SYN flood. And they're like, oh yeah, we know about that. Okay? Yeah, we don't care; like 40% of our network traffic is floods, but it doesn't impact our business, so we don't care. We have some mitigations, whatever. What we actually want to know is what this particular HTTP request is doing. Okay. Well, for most companies this doesn't happen all the time, right? It happens every so often, and it's that lack of predictability that is really one of the key things that I want to talk about, because it

makes it really hard to prepare for. If you don't have a plan in place to address these sorts of attacks, every time they hit you it's gonna be a mad scramble, right? Everyone's gonna freak out and have to run around and figure out what to do. What I'm trying to say is that there's probably a better way to do it. One of the things that surprised me, though, about some of these things is the lack of detection. Detection is apparently really hard; people don't do detection super well. Part of it is because we are so operationally focused that until something becomes a problem, we don't

really notice or care about it in some ways, right? So we don't necessarily put the effort into trying to detect the slow and incremental rise in the number of authentication attempts; we don't think about that. Or we may not care that our network bandwidth is slowly creeping up over the course of several hours, until it actually gets to the point where the servers are getting slow. For example, I've seen numerous brute-force attacks that were detected when the authentication systems stopped working. That is probably not when you want to detect the brute force. I have seen many web scraping situations. In this particular case there was a company that was providing a service that leveraged

another service, a back-end database or something that they depended on. That back-end company called the first company and said, why are you DoSing us? And they were like, what do you mean? Turns out that somebody was web scraping through this first company, and that was generating just a ton of traffic on the back end. The first company was better set up to handle that level of load; they didn't really notice that there was a problem. But the back end fell over, and so it rapidly became a problem, because the back-end company was gonna be like, unless you stop this, we are cutting off your contract and you are going to go out of business. Okay, maybe

it's an emergency now. Generally speaking, detection comes after, or because, there's some kind of a problem, as I said. This tends to lead to a mad scramble, and we should probably do better than that. The other thing that surprised me is what happens once I get on the phone. I should say that there are quite a few customers who really know what they're doing, and they call up my company because they have one specific question about how to configure one of our devices to do a particular thing. Those are great, those are super easy: we just hand them the documentation or tell them what to do, and then they go away, and it's lovely, and

they know exactly what they're doing. But there are an awful lot of other companies where I go onto a conference call with like 30 or 40 people all at once, including other vendors, IT staff, security staff; occasionally there's even a CFO or a CEO who's actually on the call yelling at people, because there's some site-down situation that they're handling, and it's a complete mess, right? It's a very strange experience to be a vendor who is helping out a company and being kind of the calm one who can try to advise what steps we should be taking, because, to be honest with you, it's not that big of a deal to me. I want to help

them, but it's not my job on the line. I guess maybe that's one way I can be calmer than everybody else. But again, it doesn't really seem like the way we should be doing it. Another thing we run into all the time is that in order to help out the customers, we need data, right? We need packet captures, we need configurations, we need some sort of information to base our recommendations on, and it is surprisingly difficult for people to get this to us. They don't have pre-existing capabilities to get onto a switch or a router or a proxy or even a host to get a pcap. They don't have log aggregation of any kind set up. They don't

they... and especially, a lot of times, the reason is that I'm talking to the security person, and the security person has to go over and ask the IT person, who has to go ask the network ops person to get the data for them. So it's this long chain of trying to get things, and it slows us down. Because of that, more often than I want to admit, the attack is over before we've actually been able to get any data about the attack. In other words, by the time we get to the point where we can get a pcap,

the attack's already gone, so the pcap shows nothing, right? The attack's done, and we have no evidence, and we have no way to say anything about it. We also have a lack of analysis capabilities. We have a lack of knowledge in environments about how to remediate things, like what points in their network they can actually implement controls at. Sure, a firewall, but you gotta get ahold of the firewall team, and the firewall team may not know how to do the specific kind of remediation you want. I'm not really blaming anybody for this. This is hard, I get it; organizations are complex. But I am trying to say we should probably try to improve it. One way in

the industry we have typically tried to improve it is by providing super-duper vendor tools to do a lot of this for us, right? And I am very aware, being a vendor, that our tools are expensive and they're hard to configure. I'm not speaking about F5 in particular, I'm speaking in general about vendor stuff: they are expensive, they're hard to figure out how to use, and you often have to change things to integrate them into your environment. You can also try to shift the responsibility for dealing with attacks to things like SOC as a service or scrubbing services or things like that. Those can be very expensive. However, of course, if you are actually going to get

subjected to these attacks, they are costly. They can be costly in terms of downtime, lost revenue, stress on your employees. I have been in several situations where the people on the phone that I was talking to had literally been up for 36 hours straight. Those people are not going to be any good for anything else for a week after. So can we prepare ourselves without having to spend a ton of money or commit ourselves to very specific vendor models or what have you? I think we can. So what do we have to help ourselves out as incident responders? We may not have expensive tools, and we may not have deep security knowledge in our teams. We may

have very complex networks, very complex organizations. But what we do have, or can get, is data. We don't always have it at our fingertips, and I think that's one of the biggest problems, because what we need to do is get the right data (I'll go into this a little bit later; the rightness of the data depends on the context in which it's going to be used), we need to get it fast, we need to get it to people who can understand it and do something about it, and then we need to take action, right? So we have the ability to generate a lot of data if we want. We all

know this, right? I mean, you could, and some companies in fact do, capture every single packet that goes in or comes out of their network. They do this sometimes with things like sFlow, which is more of a metadata thing about source IP, destination IP, ports, and so forth. Some people do actual full-on, everything on the wire gets logged. That's great for forensics. It is probably less good for incident response than you might imagine, mainly because a 10 gigabit per second connection, fully utilized, will generate 1.25 gigabytes of capture file per second, at least. I have been handed 80 gig capture files in my job, and the biggest problem was finding a machine that I could open them on. Double-click

on it, open it up in Wireshark, come back some time the next day when it's finished loading, right? Along with that, we can capture debug logging from every single process if we want. That might not do great things for application performance, but we could get it. Almost every system you're dealing with has SNMP or could be made to use it; we could capture and graph every single MIB. We can do all the logging from our APs and firewalls and IDS systems and everything else. Okay, we don't collect all of it, because it's way too much, and we don't have the room to store it, and we don't have any analysis capabilities for dealing with it. Maybe you have a SIEM,

and maybe your SIEM is awesome, and maybe you can take all this in. Then you still need to analyze it, and you still need to figure out how to interpret it. So what we're trying to look for, really, is the sweet spot for incident response. I'm not saying that gathering all that information isn't useful in a whole bunch of different contexts, right? But we need to narrow down the focus of what information we're going to gather to deal with incidents, and have that at our fingertips. I should say, too, that mitigation tools or mitigation points are something that we need to think about. Obviously you don't want to block...

Well, sometimes you do, actually. I had one customer who was a vendor in Canada, and for some reason or another they only sold to Canada; that was their customer base. They could literally say, we are only for Canadians. They were under attack from Pakistan and southern Sudan and Italy, I don't know, something like that. They were actually able to make the decision: we can use geolocation to just block anything that isn't coming from Canada. That was acceptable to them. Most people are not going to be in that situation, clearly, so we need to have good mitigation techniques that will allow us to block just the stuff that we want to block and not interrupt any other business.
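As a rough illustration of the geolocation block described above, here is a minimal Python sketch of a country allowlist. Everything in it is an assumption made for illustration: the prefix table is hypothetical (it uses reserved documentation address ranges with made-up country codes), and the names `PREFIX_COUNTRY`, `country_for`, and `should_block` are not from the talk. A real deployment would most likely rely on a maintained geolocation database and enforce the decision on a firewall or similar device rather than in application code.

```python
import ipaddress

# Hypothetical prefix-to-country table. The prefixes are reserved
# documentation ranges and the country codes are made up for the example.
PREFIX_COUNTRY = [
    (ipaddress.ip_network("192.0.2.0/24"), "CA"),
    (ipaddress.ip_network("198.51.100.0/24"), "PK"),
    (ipaddress.ip_network("203.0.113.0/24"), "IT"),
]

# The only country this (hypothetical) business serves.
ALLOWED_COUNTRIES = {"CA"}


def country_for(ip):
    """Return the country code for an address, or None if it is unknown."""
    addr = ipaddress.ip_address(ip)
    for network, country in PREFIX_COUNTRY:
        if addr in network:
            return country
    return None


def should_block(ip, allowed=ALLOWED_COUNTRIES):
    """Block anything not positively identified as coming from an allowed country."""
    return country_for(ip) not in allowed
```

Note the design choice: an address with no known country is blocked too. That matches "block anything that isn't coming from Canada", but it is only acceptable when the business really has no customers elsewhere.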

But sometimes we just have to bring the hammer down and say, these are bad IPs and we're gonna block them, or whatever. So we can mitigate at different layers. This is based on what you know, like, I know all my customers are Canadians; that's a good example of that. Sometimes it's what you have access to. Clearly there are tons of places within a network infrastructure that you can block things. You as a security person may not have access to all of those, and I'll get into a little bit more about how we can fix that later, but you use what you can, right? And you generally want to block as far forward as

you can, because, and this is pretty basic, if you block it on a firewall it's gonna be more efficient than trying to block it on your 50 web servers individually, right? Okay. So one of the funny things about data is that it's not value neutral in general. One might argue that a wide-open Wireshark capture is really just the bytes on the wire, and there's no inherent meaning in that. One might say that SNMP counter data could be considered the same way; it's really just a fact that you can look at. But almost all the other logging that we

do is done for some specific purpose, and it is filtered in some way by the people who created that logging, right? I hope I'm making sense with this. Applications, for example, log things that are useful to application developers; those may not be the things that security engineers want to see. People who log network performance data log it to tune the performance of their applications; they don't necessarily log it to detect attacks. Quite often applications just simply say, I'm working within parameters, and maybe, oh, I didn't handle one request right, and they'll log that. That's great. It doesn't say, I'm misbehaving because I'm under attack, or because I'm receiving a hundred thousand

more GET requests than I usually am. That sort of thing does not get logged. So: helpful for developers, not so much for incident response. We have to take into account what kind of data we're gonna look at. Moreover, interpretation is actually a pretty specialized task. Network admins hopefully know how to interpret traffic graphs and pcaps and logs from their switches and routers. Sysadmins know how to deal with system logs. Devs and app owners should know what their applications are saying. Architects (so now we're gonna get into some other stuff) should know when major changes have been planned. Vendors should at least know the log files that their devices are generating. And all

of these folks can dig deeper into logs and have expert knowledge to be able to interpret them that may not be present on your security team. So you're gonna want to leverage all of these people, I think, to get the best bang for your buck, because they know what normal looks like. Now, you have a choice here. You can either have your security team be isolated and try to collect all the data themselves and keep it all secret and not tell anybody and say, we're in charge of security, or you can reach out and try to make relationships with all these other teams. Now Esteban, who's in the audience, gives a really interesting

talk about this very thing. I'm not gonna quote Esteban's talk extensively, but I will point out that one of the things that he brings up is that security is often this sort of walled-off little group of people who do security stuff, and we don't talk or communicate as much as we should with the various other groups within our companies. This is kind of a bad idea, in my opinion, because if I really want to know how an app works and how that app can be exploited, or how that app can be endangered in some way, I need to talk to the application developer. I need to have a relationship with that person, and we need to have a

conversation, and I need to be able to say, hey, we've got this really cool control system, like this WAF or whatever; let's work together to figure out how to configure the WAF so that it protects your app. And likewise, I need to know from you what kind of things I should be watching out for. If you see your app start misbehaving, tell me, because then maybe we can find an attack earlier, right? Okay. There are some other people, too, that you might want to talk to: marketing people. And I should say, all of these are based on things that I've actually seen, right? The marketing people just did a big campaign, a big sales campaign,

and I get a call from the network ops people saying, our utilization on our websites has gone up by 45%, what the hell is going on? Turns out, not an attack, just a good day. PR knows when they just did a press release, and sometimes traffic goes up because people are interested in your company, and sometimes it's because they've actually managed to anger somebody. Sales knows when they just did a big sale, right? Finance: earnings reports, etc. All of these different groups have situational awareness that might be useful for you as an incident responder to know about, so that you can accurately assess whether or not something's an attack. Or, even better, you might actually

find out about these things ahead of time, so that you can get your monitoring and your scaling ready to go in case something happens. Oh, and of course the most important one here is down at the bottom: executives really do like to be kept in the loop. I think we all know that. It avoids having them come onto your conference calls and yell at you a lot. Okay, so how do we make this happen? The first step is clearly communication: who is the security team, what kind of attacks have they seen? Do a little bit of education. I would recommend not doing it in such a way that scares people, but just say, hey, we

live in a world where we get attacked a lot, and we need to prepare for it; let's calmly and rationally figure out how we're gonna handle this. Also, not blaming anybody, like, your web app is terrible and we're gonna get you. No, just: we're all trying to improve the business, we're all trying to support business goals, let's work together to get this done. And mainly, saying the security team will need your help. That's something else that I see with a lot of security teams, and in fact security teams that I've been on in my career: we have a tendency to kind of stride into the room and say, we're the security team, and you're gonna do what

we say. Nobody likes that; nobody likes being told what to do. Make it a cooperation, make it a partnership. And of course, pre-establish ways that you can get ahold of people. This has been a really big thing in the incident responses that I've been involved in. I tell somebody, okay, it would be really great to get a packet capture from the switch that's upstream from the device that we're looking at. Oh, that's the network ops team. Okay, well, can you get ahold of them? Maybe, just a second. Ten minutes go by. The one guy I know is on vacation; I don't know who else to call, right? Or it's 3:00 in the

morning and they're all at home, or whatever, right? Working those things out ahead of time with your staff, with your non-IT staff, with your vendors, with your ISP (especially for volumetric attacks), is going to be really important. Depending on what kind of stuff your company does, maybe even law enforcement; you might want to have the phone number of the local FBI agent who might need to know about this. Coordination is super important too, and I would recommend, and I've seen some success with this, playing through attack scenarios on paper. Wargaming it, basically. I hate that term; let's call it board gaming. Yeah, okay, so basically it is attack simulation... yeah, it's still a little aggressive. A

tabletop? Okay, we got that: tabletop. Tabletop the game out, and try to figure out what kinds of capabilities you need but don't have, and then how you can get them. Try to figure out what kind of data you'd want to see but maybe don't have quite at your fingertips yet, and figure out how to get it. Figure out what devices you don't have access to as a security team, and maybe get that access, or at least coordinate with somebody who can get the data off of them that you need. And then, of course, document, document, document, document, and then everybody gets a copy of it, and everybody knows what's supposed to go on. That would be

super helpful, I think, because different groups of people need to be involved in different things. A SYN flood: okay, your ISP and your net ops are probably the people who are going to be able to handle it; no need to necessarily involve the app people in that. But if it's a sophisticated attack against your customized web app, you might need to get everybody involved. You might need to get the PR group involved, because your site's down and they've gotta figure out how they're gonna respond to the media requests about why your super cool site is down, right? I mean, I don't know, if you work for Twitter, everyone's gonna freak out if Twitter goes down, and

the third step, of course, is practice, and this is one of the things that is actually a little bit difficult sometimes to sell. A useful metaphor here is the way that emergency first responders deal with fires specifically. Now, EMTs and firefighters respond to calls constantly, right? They're very, very busy. But you might be surprised to find out that only 1.58 percent of them per year are actually structure fires. Firefighters don't spend a great deal of time fighting fires; they are mostly doing other things. But when a structure fire does happen, they really need to know what they're doing, because the downside is that maybe a whole block burns down, right? It's bad.

So they spend a great deal of time drilling on how to fight fires. And I would say that drilling how to fight fires in your network, drilling how to do incident response, is really, really critical to making it something that you can just do, that you don't have to freak out about, that doesn't have to be super stressful, so that you can respond better and faster, and you have already worked out the kinks before you get to this point. It makes it a lot easier to deal with. Now, how often? Well, two things, actually. One thing is that this involves going to your boss and saying, I want to DoS our network, can I do that?

Okay, that can be kind of a hard sell. Some companies are better about this than others, but I think there's ways to justify it, or going to app owners and saying the same thing. But hopefully most of us have test and dev environments that we could maybe use for this kind of thing once in a while; that might be helpful. So how often? Often enough to make it routine, right? I think doing it once a year probably is not enough. Doing it once a week is probably too much, although maybe not for your actual incident response team, like your security team; but if you're gonna involve everybody else, maybe too much. Once a month, once a

quarter, I don't know, something, but enough so that people don't forget between times what they're supposed to do. There's actually some really interesting research, mainly in the linguistic world, about memory retention. It turns out that you can try to memorize, say, a bunch of words, and you'll remember them all in the short term very well, and then you'll forget almost all of them. But if you then refresh your memory, you'll remember them a little bit longer. So there's this idea of spaced repetition: if you space it out, if you do it enough, you will actually memorize something. But it takes like six iterations, usually, before you do that, and you can't do them all at once. You

have to do them once, and then wait like a day or two, and then do it like a week later, and then do it a month later, and that's how you get things baked into your memory. I would suggest that doing this sort of thing is probably amenable to that as well. Do it frequently enough so that it doesn't completely fall out of people's brains, but not so much that they get overwhelmed by it. Just as a side note on working with vendors: vendors can be critical partners. One big thing that we run into a lot with our customers is that some customers have security protocols that say, we can't share data with you. That makes my

job hard and I understand I mean if you're you know some kind of black ops site somewhere in the government or something like that maybe I don't know for whatever reason some companies don't feel comfortable or can't legally share data with me because it might contain personally identifiable information or IP or whatever they're protecting and that's fine um that does put me into a difficult position so if you are in this particular place work out ahead of time how you can scrub or redact data so that you can share it with your vendors you may not be able to give them full packet captures but maybe you can give them NetFlow data which

is just you know IP addresses and ports if you want to get sophisticated there are a bunch of tools that can edit pcaps and like say you can give it a list of stuff that you want torn out of it regexes to clean it right you can do some cool stuff with that um maybe you can get your legal department to help you get NDAs so that you can share something with us that'd be useful as well and also and I'll just appeal to everybody here we're supposed to have SLAs as vendors hold us to them if we say we're gonna respond in an hour make sure that we respond in an hour it's our job to do that
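One way the scrub-before-sharing idea could look for text flow records is sketched below. It is only an illustration under stated assumptions: the record format, the `SECRET_KEY` value, and the function names `pseudonymize_ip` and `scrub_line` are all hypothetical, and it only handles IPv4 addresses. Each address is replaced with a stable pseudonym so that a vendor can still see which flows belong together without seeing the real addresses.

```python
import hashlib
import hmac
import re

# Hypothetical secret; keep it private so pseudonyms cannot be reversed by guessing.
SECRET_KEY = b"replace-with-a-random-secret"

# Simple IPv4 matcher; it does not validate that each octet is <= 255.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def pseudonymize_ip(ip: str, key: bytes = SECRET_KEY) -> str:
    """Map an IPv4 address to a stable fake address inside 10.0.0.0/8.

    Only 24 bits of the digest are used, so collisions are possible
    on very large data sets.
    """
    digest = hmac.new(key, ip.encode(), hashlib.sha256).digest()
    return "10.{}.{}.{}".format(digest[0], digest[1], digest[2])

def scrub_line(line: str, key: bytes = SECRET_KEY) -> str:
    """Replace every IPv4 address in a text record with its pseudonym."""
    return IPV4_RE.sub(lambda m: pseudonymize_ip(m.group(0), key), line)

if __name__ == "__main__":
    # Example record in a made-up "src:port -> dst:port proto size" layout.
    print(scrub_line("203.0.113.7:53 -> 198.51.100.20:44312 udp 512 bytes"))
```

Because the mapping is keyed and deterministic, the same real address always becomes the same fake one, so ports, protocols, and traffic patterns stay usable for analysis.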

so just hold our feet to the fire about that please okay so the goal in case I haven't been clear enough we want to try to make sure that handling network incidents gets easier and the way that we do that is by making it something that is more than just the security operations team or the incident handling team or the security team or just the network team or what-have-you more hands are going to make lighter work what we are striving for here is quicker identification quicker resolution and making things not so much of an emergency now that's kind of the high-level basic stuff so let's dig into a little

bit more interesting things because running scenarios will help you get the basics sorted and if you're doing that you're gonna be in a pretty good spot but what about the weird things what about the sophisticated attackers and in that kind of situation you have to leverage something which is honestly difficult to train for this is where quality people on your teams are going to really make a difference because you need to start getting clever with the data that you have maybe you don't have data that shows exactly what's going on but maybe you can infer what kind of attack it is by what little data you do have so I'm gonna talk about

a couple of different case studies here the customer called they said they were under attack that's usually how the calls start for me now this is also something that tends to happen the attack had already happened and was done with it had subsided before they had actually gotten anything to take any action on the attack was gone and this happens for a bunch of reasons but um it's happening I think rather more and more we're seeing an increase in spiky attacks they hit you for a little bit and then they go away and they hit you for another while and go away um this may be an actual technique to try to screw

around with incident response it may also be due to some technical things which I can get into in a second but the customer needed to prepare for the predicted return of the attack and that's a pretty good thing I mean generally speaking they are going to come back this was a very very very big high-traffic network and it was what we would generally classify as a service provider of some kind anyways giant network tons of connections and they actually did have packet capturing capabilities they had some central system that was going to capture packets on demand and they seemed like they were pretty well set up except that it was

down during the attack they didn't say why maybe it was maintenance I don't know but it was down Murphy's law plays a big role in these sort of things so they didn't have any of the attack traffic captured and I said did you have anything like is there any data that you can share with us that took place during the time of the attack so I can help you out and they said actually you know what we're testing this new monitoring system that uses ICMP we have full ICMP captures during the time and I was like wow really nobody uses ICMP cool okay send me them so I took a look at them it turns

out and that was actually really cool because the dumps that they gave me were full of ICMP error messages just reams of them right IPv4 and IPv6 and it was a large capture file and was almost entirely ICMP errors and more importantly it was almost entirely host unreachable port unreachable errors so at this point I started thinking about what could possibly cause that in a network well those errors are generated when you get some kind of traffic that comes in and it's trying to go to a particular IP and port that doesn't have a listener on it right so okay that gives me one little bit of information the funny thing or the cool thing about

ICMP error messages of this type is that not only do they include the ICMP error message and of course all of the IP headers that go along with that but they also include the IP header of the traffic that stimulated the error and they include at least 576 bytes of the data portion of that as well not sure about that number [unclear] I don't know there's some amount so what did I know so far well I've got traffic from port 53 UDP going to some random high port in their network oh I see here it says eight bytes okay I should probably read my slides more carefully anyways it can be a variable

amount so what did I do well because I'm old I fired up Perl and I loaded a bunch of Net:: modules and I tore apart the packet capture and I got rid of the IP header and I got rid of the ICMP header and what was I left with well I was left with the actual attack traffic cool um I had enough it wasn't full I couldn't reconstruct a pcap from it but I had enough of the attack traffic to be able to say this is DNS traffic okay moreover these were A and quad-A records and it was DNS responses that were coming out so I was able to conclude that what I was seeing

was evidence of a reflected DNS attack in other words the attacker was sending a request from an IP that was spoofed to be the target to a DNS server that was then sending a response to hit the target right and this showed up in the ICMP error logs it's a very common attack technique you see it all the time but it did give us some ability to be able to say okay well what can we do to help with this right well we can block DNS responses from coming into your network unless they're actually going to your DNS server that has a legitimate reason for making them or we can improve your

stateful firewall config to you know just drop packets that don't have some kind of a connection table entry that would indicate that you know it was an actual legitimate response I was pretty happy about that that one worked out pretty well um it's actually gotten me to the point to say you know ICMP that's been around since the beginning of the internet why are we turning that off everywhere well there are some good reasons to not permit ICMP to be emitted from your edge except for maybe doing path MTU discovery but inside your LAN it can be very useful you might want to think about not just turning it off the next one was a fun one I was on the
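The ICMP trick from the reflected DNS case can be sketched in a few lines. The speaker did this in Perl; the version below is an independent Python illustration, not a reconstruction of that script. It relies on the fact that an ICMPv4 destination-unreachable message quotes the IP header of the offending datagram plus at least its first eight bytes, which is enough to recover the UDP or TCP ports. The function name `parse_unreachable` and the returned dictionary layout are assumptions made for this sketch, and it handles IPv4 only.

```python
import socket
import struct

ICMP_DEST_UNREACHABLE = 3  # ICMPv4 type; code 1 = host unreachable, 3 = port unreachable

def parse_unreachable(icmp: bytes):
    """Summarize the datagram quoted inside an ICMPv4 destination-unreachable error.

    `icmp` must start at the ICMP header (outer IP header already stripped).
    Returns None if the message is not a usable destination-unreachable error.
    """
    if len(icmp) < 8 + 20 or icmp[0] != ICMP_DEST_UNREACHABLE:
        return None
    quoted = icmp[8:]                    # skip type, code, checksum, unused field
    if quoted[0] >> 4 != 4:              # quoted packet must be IPv4
        return None
    ihl = (quoted[0] & 0x0F) * 4         # quoted IP header length in bytes
    if ihl < 20 or len(quoted) < ihl + 8:
        return None
    info = {
        "icmp_code": icmp[1],
        "proto": quoted[9],              # 17 = UDP, 6 = TCP
        "src": socket.inet_ntoa(quoted[12:16]),
        "dst": socket.inet_ntoa(quoted[16:20]),
    }
    if info["proto"] in (6, 17):         # TCP and UDP both begin with src/dst port
        info["sport"], info["dport"] = struct.unpack("!HH", quoted[ihl:ihl + 4])
    return info
```

Tallying the results by `proto` and `sport` over a whole capture would show the pattern from the case study: many quoted UDP datagrams with source port 53 aimed at random high ports on the target.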

phone with the customer and they were like our switches and the proxy devices that I support were just kind of blowing up they were having to reboot them all the time the load was super high what's going on we think we're under a massive DDoS help help help okay this was affecting their production this was affecting their ability to serve up the webpages they needed to serve up the customer really did not know how to go about handling it at all but they knew at least that the main indicator that they had was that their switches and this proxy was just freaking out right so that's where they focused their attention so what I ended up

doing was checking around the problem essentially right because it's good to get enough awareness of what's going on throughout the entire network to try to figure out how to deal with it um the network device load was super high but the inbound edge of their network was just normal they had graphs for that it was just the same kind of traffic that they always saw and their application servers weren't heavily loaded either even though people weren't able to get to the web pages so that gave me a little bit of a tip that maybe it was actually constrained to just these two devices right or at least somewhere in between the edge and the core so I managed to

get a quick pcap and I managed to see that the TCP sequence numbers were the same a lot even though there was you know from inbound and outbound but the MAC source addresses were swapping back and forth well okay so anybody care to guess what this was kind of routing loop well so broadly speaking a routing loop I guess essentially what was happening was the switch was sending to the proxy which was sending to the switch which was sending it back to the proxy and so on and so forth we corrected the configuration issue that was causing that and everything went away all right okay I'm gonna tear through the rest of this some
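The symptom in the loop case, the same TCP sequence number showing up repeatedly while the source and destination MAC addresses swap, can be checked mechanically once frames have been decoded. The sketch below assumes the frames are already parsed into simple records; the `Frame` type, the `find_bouncing_segments` name, and the `min_hops` threshold are all hypothetical choices for illustration.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple

class Frame(NamedTuple):
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str
    sport: int
    dport: int
    seq: int

def find_bouncing_segments(frames: Iterable[Frame], min_hops: int = 3):
    """Return TCP segments seen at least `min_hops` times with swapped MAC pairs.

    A plain retransmission repeats the sequence number but keeps the same
    MAC direction, so it is not flagged; a frame bouncing between two
    devices shows both (a, b) and (b, a) for the same segment.
    """
    seen = defaultdict(list)
    for f in frames:
        key = (f.src_ip, f.dst_ip, f.sport, f.dport, f.seq)
        seen[key].append((f.src_mac, f.dst_mac))
    suspects = {}
    for key, hops in seen.items():
        pairs = set(hops)
        swapped = any((dst, src) in pairs for src, dst in pairs)
        if swapped and len(hops) >= min_hops:
            suspects[key] = hops
    return suspects
```

A non-empty result points at a forwarding problem between the two MAC addresses involved rather than at an external attack, which matches how the case was resolved.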

general discussion of specific issues these are a few little techniques that I found useful in doing these sort of things if you do not have the budget to do full-on packet captures for everything or you don't have a ton of ability to do this or actually this is really specifically useful when you're doing a lot of cloud deployments you can do constant packet captures into essentially circular buffers tshark supports this you can also just have it configured to dump into a bunch of files you can limit the file size automatically so you can have six files that are

all ten megs or whatever if you have that constantly running and you suddenly think oh I'm under attack you've actually got attack data now you don't actually have to go and start the capture and get it you actually have the data that you need so you just copy that over SCP it over whatever you need to do to get it it can be very useful and simple counters are often overlooked but they are often the way that we diagnose these things how many HTTP requests are you getting how many TCP requests are you getting you can graph it see trending data right because if all of a sudden HTTP traffic jumps right now you know it's an HTTP
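A simple counter only helps if something compares it against what is normal. The sketch below is one possible way to do that, assuming per-interval request counts have already been collected; the function name `find_spikes`, the window size, and the threshold are illustrative assumptions rather than recommended values.

```python
from statistics import mean, pstdev

def find_spikes(counts, window=5, threshold=3.0):
    """Return indexes where a count exceeds the trailing baseline.

    The baseline is the mean of the previous `window` samples, and a sample
    is flagged when it is more than `threshold` standard deviations above it.
    """
    spikes = []
    for i in range(window, len(counts)):
        base = counts[i - window:i]
        mu = mean(base)
        sigma = pstdev(base) or 1.0   # avoid a zero band on perfectly flat traffic
        if counts[i] > mu + threshold * sigma:
            spikes.append(i)
    return spikes

if __name__ == "__main__":
    # Hypothetical HTTP requests per minute.
    http_per_minute = [100, 102, 98, 101, 99, 100, 5000, 103]
    print(find_spikes(http_per_minute))
```

Running the same check separately on HTTP, TCP, UDP, and ICMP counters gives a quick first answer about which kind of traffic changed.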

attack it seems really simple but the number of times that I see people struggling to get this kind of data ahead of time is pretty big other things like average request size I mean you can go really down the rabbit hole with this depending on what your applications are and what your sort of things are doing but this sort of simple counters and graphing can be super super useful a general methodology of how to assess network attacks that at least I've developed when I'm talking to people is I ask the following questions in the following order are your inbound or outbound pipes saturated that tells me a bunch that tells me whether or not I'm dealing

with a giant volumetric attack or am I dealing with something else attacks can be both volumetric and something else like I've seen brute force attacks against web servers that have actually filled network pipes not all attackers understand that they need to kind of keep it low and slow some people just blast you um also volumetric attacks are a great way to conceal other things because you have the network ops team running around trying to deal with the massive UDP flood and they're not going to see the attempts to compromise the web server using an 0-day or whatever so it's a useful thing to know but it gives you some chance to prioritize what's

important then I move up a level is it TCP is it UDP is it ICMP is it GRE whatever and then if that doesn't give me enough information then I say what application protocol is it web is it SSH is it and then finally if I get to that point do I need to actually get into encrypted traffic to figure out what's going on so essentially what I'm doing here is I'm going from the easiest to the hardest if at any point during this I can say bingo that is traffic that I can identify as being hostile in a very positive sense like I can distinguish that traffic from legitimate

traffic that's where I stop that's where I start looking for mitigations right because I'm not gonna go and do a full-on investigation of this if what I really want to do is just make it stop so if I can get to the point where I can say okay we have something we can hang our hat on now let's immediately go and try to figure out what the best way to remediate this is this is where I stop and usually that's by basing it on some kind of unique characteristic maybe all the attack traffic is coming from one IP we actually see that that's kind of weird but we see that maybe it has to

do with all of the attack traffic is HTTP and the Host header is blank we see that or it has a bizarre User-Agent string or whatever you can generally eventually get to the point where you can figure out some way to block it and if you can't keep digging I know this is a big thing encrypted traffic everybody wants to encrypt everything I am very much a positive fan of that I think that's great but you do actually need the ability to look into it sometimes you can do that in a number of different ways depending on your security policy this is where application logging if you happen to run custom apps this

might actually be super useful if you have some logging capability in your app to actually dump out payloads of the requests that it's handling that way you don't have to like mess around with decrypting and re-encrypting to satisfy those requirements you can use proxies such as the one that I work for to do that though and get insight into that traffic or you can do things which I wouldn't recommend which is like downgrading the crypto so that you can capture the whole handshake and use the RSA keys to get into it everyone should really be using ephemeral Diffie-Hellman or ECC or something along those lines that offers perfect forward secrecy so let's try to avoid doing that
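If the application can log the requests it handles, looking for the kind of unique characteristic mentioned earlier (a blank Host header, a strange User-Agent) becomes a counting exercise. The sketch below assumes each logged request is available as a dictionary of header values; the field names, the `dominant_values` function, and the 50 percent cutoff are hypothetical.

```python
from collections import Counter

def dominant_values(requests, field, min_share=0.5):
    """Return values of `field` that account for at least `min_share` of requests.

    Requests that lack the field are counted under the empty string, so a
    missing or blank header shows up as "".
    """
    counts = Counter(r.get(field, "") for r in requests)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: n / total for value, n in counts.items() if n / total >= min_share}

if __name__ == "__main__":
    # Hypothetical log sample: most requests arrive with a blank Host header.
    sample = [{"host": "", "user_agent": "x"}] * 8 + [{"host": "www.example.com"}] * 2
    print(dominant_values(sample, "host"))
```

A value that dominates during the attack window but not during normal operation is a candidate for a blocking rule; comparing against a quiet period first helps avoid blocking legitimate clients.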

along with that um cloud issues everybody's moving to the cloud the cloud is just a bunch of computers that you don't control but they do have dashboards and stats and you should know what those are and be able to leverage them one of the funny things about the cloud is that mitigations that you might have in your network are things that they offer as services hey you might need to pay some money for those there's basic firewalling obviously there's some other things like that but I know Amazon and other cloud providers are providing web application firewalls those may or may not meet your needs but you should definitely check into them and figure out whether they're worth the money

especially in DevOps environments which I'll get to a little bit more in a second um you may not know where your cloud stuff is deployed depending on your development model and they may or may not know how to turn on logging or do packet captures on those but if you can figure that out script it use orchestration use Kubernetes use Ansible whatever to be able to like push a config that says turn on debug logging and get me a packet capture that would be useful right you can just do it like that DevOps or as I've recently heard it called DevNetSecOps apparently we're converging all of us is great but it's

increasingly replacing traditional functions of system administration and network administration but it's replacing it with people who may not have the full set of skills in either one of those things right I mean we have devs and devs are able to now spin up instances in the cloud and do a bunch of networking with software-defined networking and so forth and it works generally really really well and they don't have to worry about it and that's great because we can reduce the amount of money that we're spending on network teams and sysadmin teams and so forth ok but what happens when it's a security thing they don't necessarily know enough about it they know how to

spin them up they know how to auto scale them and that can be useful during a DDoS but what happens if somebody is actually triggering a vulnerability in your app and taking down all your instances who's the person that you go to talk to about that how do you get access to the configs how do you make the changes that you need to do quickly that's the question that essentially that I'm asking and sophisticated attacks here really it's just distinguishable at L7 is how I would describe that and able to change quickly oh this is what I was going to say so what we're seeing a lot of is wave attacks there is some debate

in the community about whether or not this is actually a technique or whether this is just a side effect if it's a technique what the attackers are doing is they're blasting you super hard there's no ramp up it's just like from 0 to 100 boom for a short period of time and then they go away and then 6 hours later they come back boom go away it's super disruptive to your business it reduces the amount of time that you have to do anything it's actually a pretty effective attack technique some people who study this stuff think that it's an actual technique some people think it's because the botnets that are doing this are just blasting giant net blocks and

when they sweep around to you for a little bit you see it and then they go on to the next part and it goes away don't know either way the effect is the same though but in these situations especially and especially attackers who are actually trying to trigger vulnerabilities you absolutely need to have tight relationships with your app teams to be able to address these ok that was a little bit unfocused and all over the place but I hope you enjoyed it essentially the main things that I'd like you to take away from it are that there aren't any vendor magic bullets for this stuff there are tools that can be helpful there are definitely tools that can give

you insight and the ability to manipulate traffic in your networks in ways that can be very useful but you still obviously need people who know how to use them and can do this sort of thing and that incident response absolutely needs to be a function not only of your security team but also of just about everybody else that analysis and response is better done by the subject matter experts than the security team so if you work in cooperation you're going to get better results and that the means to do this is by communicating ahead of time establishing relationships with those groups and doing some we're gonna pretend we're under attack while not being under attack scenarios so that you

can get all the bugs worked out early you can deal with the easy to deal with attacks quicker and better you can be less stressed out and then you'll still have the energy to be able to deal with the really sophisticated attacks that you might get so that's what I've got to say we have two minutes for questions if anybody has any or you can all go out and get some coffee or whatever you want so thanks very much for coming [Applause] anybody have a question yes um I can get you these slides if you found them useful and we are also being recorded so depending on how well the recording turns out I probably could get you a

copy of that too yeah and if you want to give me your card or whatever I'll make sure I can get you this material anybody else no all right cool thanks

