
All right, last presentation of the day here, and then we can go to the afterparty. We've got Jameson Budaki, and he's going to be talking about NetFlow for incident response. [Music] Can anybody hear me? Good. Good. Good. More. Better. Good. All right, good enough.

So obviously I'm going to be talking about NetFlow for incident response. My name is Jameson Budaki. I'm a senior information security analyst with Erie Insurance. Before that, I worked five years at Cisco Systems, where I was an information security engineer on their incident response team. My primary role there was the CERT's NetFlow deployment. So let's get started.

So, today's agenda. First I'm going to
talk about what NetFlow is, give a general overview, and cover how it fits into incident response. Then we're going to move on to collecting flows. This will include choosing a collector, where to collect the flows in the network, an example of a NetFlow cache, and the performance impact of turning on NetFlow and exporting it. Lastly, we'll discuss various example cases with NetFlow, and we'll finish off with a few lessons learned from past experiences.

So what is NetFlow? NetFlow is a form of telemetry that is pushed by various networking devices. It was first developed by Cisco in '96 and embedded in
most IOS platforms. It was first used for monitoring traffic and network planning, primarily by network service providers for QoS. But NetFlow can also be used for security monitoring, and it is a great way to gain situational awareness within your environment. The best analogy is that NetFlow is like a phone bill, whereas traffic capture is more like a wiretap. We can learn a lot from studying a network phone bill: who's talking to whom, over what ports and protocols, how much data was transferred, at what speed, and for what duration.

So over the years Cisco has created
nine or, depending on who you ask, 10 versions of NetFlow. V1 was the original implementation. V2, V3, and V4 were never released. V6, V7, and V8 were never really adopted either. The two key ones are V5 and V9. The most common is V5, but its one drawback is that it's restricted to IPv4. V9 is template based, it supports IPv6, and it's actually the basis for NetFlow's cousin, IPFIX.

NetFlow can come in sampled or unsampled. Sampled is basically a subset of the conversation. It's like reading every 15th word on a page. Unsampled is full collection. It's like reading every word on the page.
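To make the every-15th-word analogy concrete, here is a minimal Python sketch of deterministic 1-in-N packet sampling. The packet list, the flow names, and the helper functions are all made up for illustration; real routers sample in the forwarding path and may sample randomly rather than on a fixed count.

```python
from collections import Counter

def sample_packets(packets, n):
    """Keep every n-th packet (deterministic 1-in-n sampling)."""
    return [pkt for i, pkt in enumerate(packets, start=1) if i % n == 0]

def packets_per_flow(packets):
    """Count how many of the given packets belong to each flow."""
    return Counter(pkt["flow"] for pkt in packets)

# Hypothetical traffic: a bulk transfer with a 3-packet probe in the middle.
packets = (
    [{"flow": "bulk"}] * 20
    + [{"flow": "probe"}] * 3
    + [{"flow": "bulk"}] * 22
)

unsampled = packets_per_flow(packets)
sampled = packets_per_flow(sample_packets(packets, 15))

print("unsampled:", dict(unsampled))
print("sampled 1-in-15:", dict(sampled))
```

In this toy data the short probe never lands on a sampled position, so it disappears from the sampled view while the bulk transfer is still visible.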
Sampled is good for network performance monitoring. Unsampled is ideal for security monitoring.

NetFlow works best in combination with other technologies such as IDS, IPS, vulnerability scanners, event logs, antivirus, and even full packet capture. A typical flow of events would be that you get an alert from your IDS. It's a primary alert source, but it's passive and signature based. The same is true for AV to some extent. You get an alert, then you take the alert and run a NetFlow report for it, and we'll get into that later on. During the investigation you also figure out what hosts are affected, and you can
go into your event logs as an investigation tool. They can be filtered down as needed. Vulnerability scanners can give you an idea about any potential attack vectors or unpatched systems. Full packet capture can be used for any deeper analysis. Other vendors have flow-based technologies too: Juniper has J-Flow; Huawei, HP, and 3Com use NetStream; and even Citrix has AppFlow.

Now we'll move on to collecting the flows. In basic terms, flows are exported from a router and captured by a collector. They're sent over UDP, commonly on port 2055. They can be analyzed on the fly or stored in binary files. Flows can also be forwarded, or fanned out, to other
collectors. Most routers have a limit on the number of destinations you can export to. On Cisco, in most cases, it's two. You can also send multiple flows through a single IP. This is also a good method for improving redundancy in flow collection. Data files, like I alluded to earlier, can be analyzed by various tools. They can be loaded into a database, or you can run ad hoc queries against them.

So the first step is choosing the system itself. Some concerns around it are scalability: what's your flow rate per minute in your network? Maintenance: how supportable is your solution? Are you going to get any vendor help, or are you on your own? Say, for example, you
have a collector that goes down. Are you going to have a vendor that overnights you a new collector that you can just place in your data center that day, or are you going to waste time finding a new host, compiling the code, and setting everything up? Another thing around maintenance: in my previous experience, there were only one or two people who knew the tool inside and out. A commercial solution offers you the ability to spread the knowledge a little better. Another key concern is data retention. What's the number of days you want to look back on? Ideally, you want to have at least 90 days of retention for incident response.
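As a rough way to connect flow rate and retention, the sketch below estimates the raw volume of flow records for a given number of days. It assumes fixed 48-byte records, which is the size of a NetFlow v5 flow record on the wire; the 100,000 flows per minute figure is a hypothetical input, and real collectors compress, index, or aggregate, so actual disk use will differ.

```python
V5_RECORD_BYTES = 48  # one NetFlow v5 flow record, excluding the packet header

def raw_retention_bytes(flows_per_minute: int, retention_days: int) -> int:
    """Raw bytes of flow records produced over the retention period.

    Ignores export packet headers, compression, and any collector
    metadata, so treat the result as a ballpark figure only.
    """
    minutes = retention_days * 24 * 60
    return flows_per_minute * V5_RECORD_BYTES * minutes

# Hypothetical network exporting 100,000 flows per minute, kept for 90 days.
total = raw_retention_bytes(100_000, 90)
print(f"{total / 1e9:.1f} GB of raw records")
```

Running the same function with your own measured flow rate gives a quick sanity check on whether a candidate collector has enough disk for the look-back window you want.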
Performance is another concern. How long does it take for your queries to run? And does it work when you need it most? If you're spending 15 to 20 minutes waiting for queries to come back, you probably need to either fix your current solution or find a better one. And of course, the big one is cost. Since disk space is a big concern, a general rule of thumb is that you need one meg of storage for every two gig of network traffic.

Another thing you're going to want to do is consider your network topology. Since NetFlow is UDP based, you're not going
to want to send flows from India all the way back to Cleveland, because some flows may not make it back. You also need to fan out flows if needed, as I alluded to earlier.

So, like I said, it really comes down to a freeware solution or a commercial solution. Some great freeware solutions are listed. OSU flow-tools is a great one. The only drawback is that it only supports V5, so you're going to want to use nfdump and NfSen to fill that V9 hole. A great front end for OSU flow-tools is FlowViewer and FlowGrapher. Some other tools are SiLK, which was created by the Carnegie Mellon CERT team. It
supports both V5 and V9. ntop is another one. It's known to be limited out of the box. But the great news about most of these tools is that they're free, they're fairly easy to use, and they're well documented.

Commercial solutions are led by Lancope and Plixer. Lancope has more of a security focus, while Plixer is known for traffic analysis, though Plixer has really been starting to move toward a security focus. And of course, Arbor Networks is the 500-pound gorilla, with more of a DDoS focus. Interestingly enough, this is the exact position that I'm in now with my current employer. I'll probably have to prove NetFlow's
worth with a freeware solution, then convince them to pony up the money to buy a commercial solution with added benefits.

So, where in the network do you want to collect flows? You want to collect flows at the choke points, much like an IDS deployment: at the ingress and egress points of the network, at the furthest aggregation point from the corporate network. Those are clouds, but you can't really see them that well. So ideally, the internet gateways and the data center gateways.

This is a NetFlow cache example. It's a basic representation of a flow cache on a router. Data comes into a router, a NetFlow cache entry is created, and
the data keeps moving out. In step one, you can see that the entry is created, and it holds the who, from where to where, the ports, the time, all that good stuff. And then a flow will expire. There are two primary ways a flow will expire. One is the inactive timer: if a flow stops sending information for, I think the default is 15 seconds, the flow will be stopped. The other is the active timer, where the default is 30 minutes. Say a traffic flow is 60 minutes long: after 30 minutes the flow is going to be stopped and then either
put in the cache or finally exported out. So really the key here is having a collector that is going to stitch the flows back together into a useful conversation. Also, once the NetFlow cache is full, the oldest flow is going to be deleted first.

The next step is really exporting the flow. You call out the version, the destination, and the port, and then the UDP packets are sent off to the collector. There are about 30 flows in a 1500-byte packet.

Exporting NetFlow is actually pretty simple. It basically consists of telling the router which version of NetFlow to use. Well, first of all, you have to make sure NetFlow is enabled on your router. We like to suggest to the networking team that they have NetFlow turned on, so you avoid the whole conversation about a performance hit and all that. Then you're going to want to export the flows. Exporting basically consists of setting the version that you want, the interface that you want flows to go out of, and the destination and the UDP port that you need. The optional items are where you can set the active and inactive timers. And you can always verify your device configuration
by running show ip flow export and show ip cache flow.

So say you have no gear in your network that supports NetFlow. Is that a problem? Are you screwed? No. There are various freeware and commercial solutions that will capture Ethernet frames and process them into NetFlow records. These devices are usually connected to a SPAN, tap, or mirror port on a router or switch. Off the top of my head, one of the freeware solutions is nProbe, and I know that Lancope has something called a FlowSensor on the commercial side.

So, the performance impact of NetFlow. The processing overhead for NetFlow is fairly minimal. It's a
very efficient protocol. As you can see, the red line is the router CPU and the green line is the router CPU with NetFlow turned on. An interesting fact here: on a router outputting 20 gigs of NetFlow data a day with 800-plus active flows, the CPU consumed by the NetFlow process is less than that of the SSH process during a five-minute interval. So that kind of puts it into a picture for you.

All right. Now we're going to talk about some example cases, and then we'll follow up with some lessons learned. A great case scenario here is detecting a host that's scanning.
Basically, you're going to run a NetFlow report for hosts with connections to many destinations over a short period of time. You can exclude ports on specific hosts, such as 53 on your DNS server. This will help to identify a machine or a malicious user on the network. In all of these case scenarios, the key thing to remember is that a lot of these tools will allow you to set up jobs that run reports. These reports can be emailed to you, and they can be run on a certain interval at any time of the day. Some of the commercial solutions will actually send alerts as they're happening, to your email or
however you want to see an alert.

Detecting a botnet. Say you have a host that is identified as infected through either IDS or AV. You're going to want to run a NetFlow report to determine where the infected host was connecting, what the command and control server was, and the IP and ports that were being used. In this case, you can see the infected host is connecting to the command and control server over a certain port. Part two of that: since we've identified the command and control server as a bad guy, we're going to want to run a NetFlow report to see all the
connections that are outbound from your organization to the command and control server over a specific port. This is very helpful in finding other compromised hosts on your network, and it helps you reduce the mean time to know about any other incidents.

Tracing spoofing. This is one that can be run on the router itself, in the cache. Basically, you're going to run a query directly on the router to determine the source interface of the traffic. You'd do that with show ip cache flow and an include filter on the IP address, and you can do this on the next-hop router too, assuming that CDP is enabled so you can find the neighbor. Actually, Team Cymru has a great
walkthrough on their website about how to do this. I believe I have it listed in my resources at the end.

Another example is finding a host that is being NATed. In this situation, a router is providing NAT services to the internal hosts. The external IP is the 64.1 address listed. You receive a complaint from an external organization claiming that one of your hosts, coming from the external-facing IP, is downloading some copyright-protected material, and the xlate tables show no concurrent connections to the victims. NetFlow can show all the connections to the destination before NAT. This can help you mitigate the issue on the
correct host and tie it down to the correct user as well.

Firewall and ACL audit. NetFlow can be used to validate traffic flow from one area of your network to another for policy reasons. Think about data-center-to-internet traffic, or traffic that is subject to regulatory compliance such as HIPAA or PCI. Obviously, a firewall is configured to only allow specific services on specific ports. So you're going to want to run a NetFlow report showing all traffic, excluding the traffic that you allow. In this example, you can see the ACL is letting a host talk over port
19504. This is good for detecting compromised hosts establishing connections. It also helps you determine how an ACL will affect traffic.

Monitoring a darknet. Assuming your organization has a darknet of unallocated IP space set up, you can run a query for all connections communicating to your darknet. In this example, there's a host communicating over ICMP. This is great for detecting probes on your network, and it's really easy if you have a darknet set up.

Bad IP list. Say you get external intelligence from a trusted source, such as law enforcement or a colleague. They send you some information about an IP, and you don't really want any of your
hosts communicating with it. What you can do is run a NetFlow report for all connections to the external IP. Then you can dig further into your investigation as to why these hosts were communicating with that bad IP.

NetFlow can also help you discover a possible tunnel. Imagine HTTP traffic where the client is sending more traffic than the server. This is contrary to how the protocol normally works. The client usually sends a small amount, a GET, a HEAD, or a POST, to the server, and the
server usually sends a larger amount of data back. This unusual activity, in this case over port 80, is likely to indicate possible data exfiltration or unauthorized use.

Large file transfers can be detected by NetFlow as well. You can set up a threshold for whatever you want it to be, in this case 100 meg. So you can have a report that comes back and says, hey, this user is sending over the threshold amount of data. This is another indication of possible data loss.

Long-running sessions. This is the low-and-slow data exfiltration or communication to a command and control
server. NetFlow can identify a long-running session to the internet from a data center host. In this example, the data center host has a flow that is lasting five hours and only sending five packets per second. This is anomalous behavior in the network, and something you're going to want to investigate further.

Protocol anomaly. NetFlow can determine when a host has a TCP session with fewer than three packets and the TCP flags are set incorrectly. In this example, the infected host is sending 50 packets, but all of them are reset packets, and the traffic is not sourced from a standard
application. This could possibly be the work of a packet generator, or communications from botnet clients out to their respective hosts. This is another one of those things that you're going to want to look further into.

Peer command and control connections. This goes along with finding botnets. You run a report and you find out that you have five hosts that are communicating to a command and control server. You take it a step further and run reports for each of the hosts and all their outbound connections. In this example, say all four hosts first connected to a download server. And once you sift through all
the other ports, you find out that there's actually another server in common, and this server is actually a secondary command and control server.

So, a few lessons learned. Setting the time zone to UTC on all your networking devices and using NTP helps out greatly. It really prevents having to reconcile times during an investigation. Freeware solutions are difficult in larger organizations, mostly because of scalability issues, data retention issues, and, as I alluded to, the time spent maintaining the solution. In virtualization environments, multi-tenant hosts usually seem to need to be on their own
VLAN in order to capture flows effectively. This is where you're going to want to lean on your vendor to explore VM-based collectors. A lot of them are moving into this area as well.

Also, after an investigation, there are a couple of questions you should ask yourself. Do we have any gaps in coverage? Do we not have NetFlow deployed in certain areas? Do we have the right kind of data feeds coming in? And were there any NetFlow reports that were particularly useful? If there were, you should think about scheduling them to run on a routine basis and getting reports directly from the collector as needed.
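As an illustration of the kind of report that might be worth scheduling, here is a hedged Python sketch of two of the example cases from the talk: the scanning-host report and the lookup of every host that contacted a known bad IP. The Flow record, its field names, the thresholds, and the sample addresses are all assumptions made for this sketch, not any collector's actual schema or query language, and the fixed time buckets are a simplification of a true sliding window.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Flow:
    """Minimal flow record; field names are illustrative only."""
    src: str
    dst: str
    dst_port: int
    start: int  # flow start time, seconds since epoch

def scanning_hosts(flows, window_s=60, min_dests=50, exclude=frozenset()):
    """Sources that reach at least min_dests distinct destinations
    inside one fixed bucket of window_s seconds.

    exclude holds (src, dst_port) pairs to skip, for example a DNS
    server talking on port 53.
    """
    dests = defaultdict(set)  # (src, bucket) -> distinct destinations
    for f in flows:
        if (f.src, f.dst_port) in exclude:
            continue
        dests[(f.src, f.start // window_s)].add(f.dst)
    return {src for (src, _), seen in dests.items() if len(seen) >= min_dests}

def hosts_contacting(flows, bad_ip, port=None):
    """Sources with any flow to bad_ip, optionally limited to one port."""
    return {
        f.src
        for f in flows
        if f.dst == bad_ip and (port is None or f.dst_port == port)
    }

# Hypothetical data: a host sweeping many addresses, a busy DNS server,
# and one host talking to a suspected command and control server.
flows = [Flow("10.0.0.5", f"192.0.2.{i}", 445, 1020 + i % 30) for i in range(1, 101)]
flows += [Flow("10.0.0.53", f"198.51.100.{i}", 53, 1020) for i in range(1, 101)]
flows += [Flow("10.0.0.7", "203.0.113.9", 6667, 1030)]

print(scanning_hosts(flows, exclude={("10.0.0.53", 53)}))
print(hosts_contacting(flows, "203.0.113.9", port=6667))
```

A scheduler such as cron could run functions like these against each day's exported flow data and mail the results, which is the same idea as the recurring report jobs the collectors offer.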
So in summary, NetFlow is a powerful investigative tool that allows detailed accounting of network traffic. It is very powerful on its own, but it works better with other data sources to help you paint a full picture of an incident. We have seen that enabling NetFlow really doesn't impact the performance of the router. And lastly, NetFlow can be used as a primary investigative data source. Without it, we may not be able to paint the full picture of a network compromise that we encounter.

Some of the references here are pretty good. The first one is actually from a former coworker. It has more details about setting up OSU flow-tools for NetFlow and incident detection. The Cisco NetFlow technical overview is really good. There's the Team Cymru walkthrough on tracking spoofed IP addresses, and there's also a pretty good SANS paper on using NetFlow and applying it to an investigative model.

So, I guess I kind of breezed through that. A little nervous, but, any questions? Okay. Thank you.
Absolutely. This is very awkward.