NetFlow’s Dirty Little Secret
By Mark Evans, VP Marketing, Endace
Many organizations assume their security tools can see everything that happens across the network to detect potential threats. Unfortunately, that’s not the case, for two reasons.
Firstly, if security tools are only analyzing network flow data (NetFlow) then they can’t analyze the actual content of network transactions. NetFlow can show that a conversation happened between host A and host B, what time that conversation happened, what port it happened on, how long the conversation took and perhaps even how much data was exchanged and what applications were involved. But without looking at the actual packet data from that conversation it’s impossible to know what data was exchanged. We’ll come back to this issue later.
Secondly, there’s an “inconvenient truth” about NetFlow generation: many NetFlow generators analyze only some – not all – of the traffic on the network. Often NetFlow data is based on sampling traffic and using statistical analysis to “estimate” what’s happening across the network. This is because the computational overhead of analyzing every packet traversing the network is heavy. Since NetFlow data is often generated by network appliances such as switches and routers, sampling is often used to reduce the load on those devices. This helps ensure their primary role of routing or switching traffic isn’t compromised by the overhead of analyzing that traffic to generate NetFlow data.
Sampling works by taking a sample set of packets (packet sampling) or network flows (flow sampling) and using statistical analysis of that sample set to model the traffic flowing across the network. This approach is often sufficient for what NetFlow was originally designed for: generating traffic flow information for managing the network, identifying congestion points or outages, and forecasting network demand. Unfortunately, it just doesn’t cut it when it comes to security monitoring.
Effective network security monitoring relies on being able to see all activity on the network. If security analytics tools are relying on just a sample set of network data, they’re bound to miss crucial details. For example, the packets or flows relating to a specific threat may simply not be part of the sample set the NetFlow data was generated from. This creates a massive “blind spot” – the smaller the sample size, the bigger the blind spot.
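To get a feel for the size of that blind spot, here is a minimal sketch in Python. It assumes each packet is sampled independently with probability 1/N (random 1-in-N packet sampling); real devices may use deterministic or flow-based sampling, which behaves differently. The function name and parameters are illustrative and are not taken from any NetFlow product.

```python
def miss_probability(packets_in_flow: int, sampling_rate_n: int) -> float:
    """Chance that a flow contributes no packet at all to the sample set.

    Assumption (not from any vendor specification): every packet is
    sampled independently with probability 1 / sampling_rate_n.
    """
    if packets_in_flow < 0:
        raise ValueError("packets_in_flow must be non-negative")
    if sampling_rate_n < 1:
        raise ValueError("sampling_rate_n must be at least 1")
    # Each packet escapes the sample with probability (1 - 1/N); the flow
    # is invisible only if every one of its packets escapes.
    return (1.0 - 1.0 / sampling_rate_n) ** packets_in_flow
```

Under these assumptions, a 10-packet flow on a device sampling 1 in 1,000 packets goes entirely unseen roughly 99% of the time, while with sampling turned off (N = 1) the chance of missing it drops to zero.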
There is a simple solution. You can turn sampling off (assuming that’s an option on the switches and routers generating NetFlow on your network). This ensures every packet that traverses the network is accounted for in the flow data. However, the problem is that you are then placing a potentially unsustainable load on the appliances that are generating NetFlow. When those appliances are overloaded, both the accuracy of the NetFlow data and the performance of their core routing and switching functions are impacted.
The solution to this issue is to decouple the task of NetFlow generation from core network appliances by deploying standalone NetFlow generators that can generate unsampled NetFlow (where every packet is analyzed to produce the NetFlow metadata).
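As a rough illustration of what “unsampled” means, the sketch below folds every packet into a flow record keyed by the usual 5-tuple. It is a simplified model in Python, not how any particular generator is implemented: the `Packet`, `FlowKey` and `FlowRecord` types are hypothetical, and real exporters also apply active and inactive timeouts and emit records in a wire format such as NetFlow v9 or IPFIX.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, NamedTuple


class Packet(NamedTuple):
    """Hypothetical, already-decoded packet header fields."""
    timestamp: float
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int
    length: int  # bytes on the wire


class FlowKey(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int


@dataclass
class FlowRecord:
    first_seen: float
    last_seen: float
    packets: int = 0
    octets: int = 0


def build_flows(packets: Iterable[Packet]) -> Dict[FlowKey, FlowRecord]:
    """Account for every packet (no sampling) in a per-flow summary."""
    flows: Dict[FlowKey, FlowRecord] = {}
    for pkt in packets:
        key = FlowKey(pkt.src_ip, pkt.dst_ip, pkt.src_port,
                      pkt.dst_port, pkt.protocol)
        record = flows.get(key)
        if record is None:
            record = FlowRecord(first_seen=pkt.timestamp,
                                last_seen=pkt.timestamp)
            flows[key] = record
        # Update the summary counters; note that no payload is retained.
        record.first_seen = min(record.first_seen, pkt.timestamp)
        record.last_seen = max(record.last_seen, pkt.timestamp)
        record.packets += 1
        record.octets += pkt.length
    return flows
```

Even in this toy form the cost is visible: the loop body runs once per packet, which is exactly the work that sampling avoids and that a dedicated generator has to sustain at line rate.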
On small, lightly loaded networks this can potentially be done using software-based NetFlow generators and standard NICs. But today’s high-volume, high-speed enterprise networks require purpose-built hardware that can capture and analyze every packet to create 100% accurate, unsampled NetFlow metadata. Only then can you be confident your security tools can see all the flow data related to all threats on the network.
I promised I’d circle back to the first issue: even with 100% accurate NetFlow data you still can’t see the actual content of transactions that happen on the network. For that, you need full packet data. Without the packets, security teams (and their tools) can’t see the detail necessary to quickly – and more importantly definitively – investigate and remediate advanced threats on the network. This is another big blind spot.
Widespread vulnerabilities such as SolarWinds and Log4j 2, and high-profile attacks such as the one against Colonial Pipeline, have highlighted the importance of full packet data for threat detection and investigation. In response to these increasing threats, the White House issued a broad-ranging Cybersecurity Mandate (Executive Order 14028) which explicitly includes a requirement for all Federal agencies, and their suppliers, to continually record and store a minimum of 72 hours of full packet data that can be provided to the FBI and/or CISA on request for cyber investigations. This mandate takes effect from February 2023.
That the White House has seen fit to mandate this requirement highlights the value it places on full packet capture as a critical resource for enabling government agencies to defend against threats, including nation-state attacks. Full packet data provides the only definitive evidence of network activity. It is also a key resource for the effective implementation of Zero Trust and other important Government cybersecurity initiatives.
As Shamus McGillicuddy, VP of Research at Enterprise Management Associates, suggests in this whitepaper, rather than viewing the mandate as an unwelcome compliance headache, agencies and suppliers should welcome it as an opportunity to implement an infrastructure that enables resiliency in the face of ever-increasing cyber threats. Indeed, ensuring this level of visibility into network threats should be seen as a best practice blueprint for public and private sector enterprises around the world.
The gold standard for security teams (and their tools) is to have access to both a complete record of unsampled NetFlow data and as much full packet data as possible – ideally weeks to months, but a minimum of several days.
NetFlow provides high-level visibility into network activity. Because it is metadata, it is relatively compact, making it possible to store months or years of data. It is also easily searchable, allowing analysts to quickly find anomalous flows that are detected by their security tools. On the downside, it provides only a summary of network activity, not the entirety.
Full packet data, on the other hand, gives security teams the ground-truth about exactly what took place in every network conversation. It enables accurate threat reconstruction of any detected threat activity and provides absolute proof of what took place. Because full packet data contains the entire payload of network conversations, data volumes are significantly larger than the equivalent NetFlow data. Nevertheless, it is still quite feasible to record weeks, or even months, of full packet data cost-effectively.
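To put rough numbers on those volumes, the following Python sketch estimates the raw storage needed for a given retention window. It is back-of-the-envelope arithmetic only: the function name is illustrative, the average link utilization is something you would have to measure on your own network, and the estimate ignores capture-file overhead, compression and any packet truncation.

```python
def capture_storage_bytes(avg_bits_per_second: float, hours: float) -> float:
    """Raw bytes needed to retain `hours` of traffic at an average bit rate.

    Assumptions: a constant average rate, no per-packet capture overhead
    and no compression.
    """
    if avg_bits_per_second < 0 or hours < 0:
        raise ValueError("rate and retention must be non-negative")
    bytes_per_second = avg_bits_per_second / 8
    return bytes_per_second * hours * 3600


def to_terabytes(num_bytes: float) -> float:
    """Convert bytes to decimal terabytes (1 TB = 10**12 bytes)."""
    return num_bytes / 1e12
```

For example, a link averaging 1 Gbit/s would need about 32.4 TB to hold the 72 hours of packets mentioned earlier, and proportionally more for a multi-week retention window.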
By combining accurate NetFlow with full packet data, security teams gain uncompromised visibility into activity on their network. When used together, these two sources of evidence let analysts quickly get to the flows relating to the alerts their security tools detect, or that they identify through threat hunting activity. They can then analyze the actual packets to see precisely what took place. The combination of both sources of data speeds the investigation process and makes it possible to reach definitive conclusions about what happened and the best remediation actions to take.
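The pivot from a flow to its packets can be sketched as follows. This is a simplified Python illustration rather than any product’s query interface: the `Packet` and `FlowSummary` types are hypothetical, the recorded packets are assumed to be already decoded and held in memory, and a real packet capture system would use indexed storage instead of a linear scan.

```python
from typing import Iterable, List, NamedTuple


class Packet(NamedTuple):
    """Hypothetical decoded packet from a capture store."""
    timestamp: float
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int
    payload: bytes


class FlowSummary(NamedTuple):
    """Hypothetical flow record an analyst pivots from."""
    start: float
    end: float
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int


def packets_for_flow(flow: FlowSummary,
                     packets: Iterable[Packet]) -> List[Packet]:
    """Return recorded packets belonging to `flow`, in either direction."""
    forward = (flow.src_ip, flow.src_port, flow.dst_ip, flow.dst_port)
    reverse = (flow.dst_ip, flow.dst_port, flow.src_ip, flow.src_port)
    matched: List[Packet] = []
    for pkt in packets:
        if pkt.protocol != flow.protocol:
            continue
        # Restrict the search to the time window reported by the flow.
        if not (flow.start <= pkt.timestamp <= flow.end):
            continue
        endpoints = (pkt.src_ip, pkt.src_port, pkt.dst_ip, pkt.dst_port)
        if endpoints in (forward, reverse):
            matched.append(pkt)
    return matched
```

The flow record narrows the search to a handful of endpoints and a time window; the matched packets then supply the payloads the flow record never contained.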
If your organization relies solely on endpoint data, log files and NetFlow as the evidence that your security tools analyze to detect threats, and that your security teams depend on to investigate and respond quickly and accurately, then you need to be aware of the risk this presents.
So, in summary, check if your NetFlow generators are generating sampled NetFlow. If they are, then there’s a lot your network security tools won’t be able to analyze and it will be difficult or impossible for your security team to investigate issues where there are holes in the evidence. Can you turn off sampling without degrading network performance? If not, look to offload NetFlow generation to a dedicated solution.
And if you are not recording packet data, be aware that without the packets it’s impossible to determine exactly what data was exchanged during network conversations. Did a user enter their credentials on that phishing site? Was data exfiltrated and if so, what data was taken? Is there command-and-control traffic on your network and what is it doing? If it’s important for your security team to be able to answer these sorts of questions, then you really need to be looking to deploy a packet capture solution.
About the Author
Mark Evans is Vice President, Marketing at Endace. He has been involved in the technology industry for more than 30 years. He started in IT operations, systems and application programming and held roles as IT Manager, CIO, and CTO, at technology media giant IDG Communications, before moving into technology marketing and co-founding a tech marketing consultancy. Mark now heads up global marketing for Endace, a world leader in packet capture and network recording solutions.