SD-WAN needs a dose of AIOps to deliver automation
Software-defined WAN (SD-WAN) is getting a big boost from AIOps as vendors look to simplify operations, lower costs, and optimize WAN performance in the modern cloud era.
SD-WAN decouples the control aspect of a network from the hardware to create a virtualized network overlay, while AIOps applies machine learning and data analytics to IT operations to automate processes. The convergence of the two – a.k.a. AI-driven WAN – promises to usher in a new era of WAN networking that enables IT to go beyond optimizing network and application experiences to delivering the best experiences to individual users.
SD-WAN has been one of the hottest areas of networking over the past five years. It was overdue. More than 20 years ago, I was a network engineer working with frame-relay, ATM and the tried-and-true MPLS. Even in the mid-’90s, I and other network professionals wanted to move away from the rigid hub-and-spoke type of networks, but with no viable alternatives, we stuck with what we knew.
The WAN was in dire need of an evolutionary change, and along came SD-WAN. It brought greater agility to the network and enabled organizations to leverage low-cost Internet instead of higher-cost telco broadband services. Plus, SD-WAN improves network resiliency with automated failover. In an era of “everything being connected,” network uptime is crucial to keeping business operations going.
Adoption of SD-WAN was already strong leading up to 2020, and the COVID-19 pandemic has spurred even faster adoption. In the 2020 ZK Research Work From Anywhere Study, 46% of respondents stated that the pandemic has accelerated their SD-WAN deployments.
Despite the rapid uptake, however, SD-WAN doesn’t solve all network woes. The challenge with network operations, software-defined or otherwise, is that policy configuration and ongoing management and troubleshooting are done manually. All of the SD-WAN vendors have done a good job simplifying deployment through zero-touch provisioning, but that addresses only day-zero operations. Once the network is up and running, finding the source of WAN outages still requires a lot of manual heavy lifting. Isolating problems across the LAN, WLAN, and WAN domains is harder still and all but impossible without an automated model for intelligent event correlation.
In some ways, SD-WAN exacerbates the troubleshooting problem. It adds a level of resiliency to the network via multi-path networking that can hide outages. This leads to a situation where the network operations dashboard can show everything is “green,” but apps are performing poorly. Network performance issues have become glaringly obvious with the rise of video, and they are causing network engineers to constantly scramble to remediate issues.
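To make the “green dashboard” problem concrete, here is a minimal sketch in Python. It assumes a site with two WAN paths and uses made-up path names, field names, and thresholds; it is not how any particular SD-WAN product reports health. It only illustrates why a reachability-only view can miss trouble that a per-path check would catch.

```python
# Hypothetical per-path measurements for one branch site (illustrative values).
paths = [
    {"name": "mpls", "up": False, "latency_ms": None, "loss_pct": None},
    {"name": "broadband", "up": True, "latency_ms": 220.0, "loss_pct": 2.5},
]

def site_status(paths):
    """Reachability-only view: the site is 'green' if any path is up."""
    return "green" if any(p["up"] for p in paths) else "red"

def hidden_problems(paths, max_latency_ms=150.0, max_loss_pct=1.0):
    """Return names of paths that are down or exceed the assumed thresholds."""
    problems = []
    for p in paths:
        if not p["up"]:
            problems.append(p["name"])
        elif p["latency_ms"] > max_latency_ms or p["loss_pct"] > max_loss_pct:
            problems.append(p["name"])
    return problems

print(site_status(paths))      # the site still looks healthy after failover
print(hidden_problems(paths))  # but both paths need attention
```

In this example the failover keeps the site reachable, so the simple status stays green even though one path is down and the surviving path is too slow and lossy for video.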
Here is where AI can make a difference. AI systems can ingest the massive amounts of data provided by network infrastructure (LAN, WLAN and WAN) to “see” things that even the savviest network engineer can’t see. At one time, when networks were fairly simple and traffic volumes were lower, it was possible for a seasoned network professional to “know” a network and quickly find the root of problems through a combination of domain knowledge and rapid inspection of traffic. That is no longer the case, as the number of devices and applications and the volume of information have skyrocketed. One of the big changes is that periodic polling data has been replaced by real-time streaming telemetry that increases data by an order of magnitude or more.
AI systems can see even the smallest changes in the network and predict things that may not be discernible to the human eye. A good analogy is how AI is used in the medical profession in radiology. AI systems can detect the smallest anomalies in an MRI, enabling doctors to treat patients earlier than they would have without AI. The same holds true for network professionals. AI systems can spot small problems in the network that might cause an irregularity in an application that’s not noticeable to an end user but will cause bigger issues later. Network engineers can use the output from AI systems to proactively fix issues before they become business impacting, which is the idea behind self-driving networks.
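As a rough illustration of spotting a small change that a person would likely miss, the Python sketch below flags latency samples that deviate sharply from a short trailing window. This is a simple rolling z-score, a far more basic technique than what commercial AIOps platforms use, and the sample values, window size, and threshold are assumptions chosen for the example.

```python
from statistics import mean, stdev

def find_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that sit more than `threshold` standard
    deviations away from the mean of the preceding `window` samples."""
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change at all as a deviation.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Hypothetical streaming latency readings in milliseconds.
latency_ms = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 23.5, 20.1, 20.0]
print(find_anomalies(latency_ms))  # [6]
```

The jump from roughly 20 ms to 23.5 ms is far too small for a user to notice, yet it stands out clearly against the recent baseline, which is the kind of early signal the paragraph above describes.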
When it comes to AI initiatives, there’s an axiom used by data scientists that states “good data leads to good insights.” This is certainly true, but it is also true that partial data leads to partial insights, which can be one of the limitations of an AI-driven WAN offering. More specifically, if a solution only looks at the network without understanding the impact on actual user experiences and applications, it’s missing a big part of the equation. If there are multiple network issues, priority should be given to the ones that are impacting critical applications and/or users. If some issues don’t impact application performance at all, then those can be placed on the back burner and fixed at a later time.
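The prioritization described above can be sketched in a few lines of Python. The issue names, the `critical_app` flag, and the `users_impacted` count are hypothetical fields invented for this example; a real system would derive them from application and user telemetry.

```python
from dataclasses import dataclass

@dataclass
class Issue:
    name: str
    critical_app: bool   # assumed flag: does it touch a business-critical app?
    users_impacted: int  # assumed count of users seeing degraded experience

def triage(issues):
    """Split issues into those to fix now (ordered by impact) and those to defer."""
    fix_now = [i for i in issues if i.critical_app or i.users_impacted > 0]
    deferred = [i for i in issues if not (i.critical_app or i.users_impacted > 0)]
    # Critical applications first, then the largest number of affected users.
    fix_now.sort(key=lambda i: (not i.critical_app, -i.users_impacted))
    return fix_now, deferred

issues = [
    Issue("jitter on guest Wi-Fi", critical_app=False, users_impacted=12),
    Issue("unused port flapping", critical_app=False, users_impacted=0),
    Issue("packet loss on voice path", critical_app=True, users_impacted=40),
]
fix_now, deferred = triage(issues)
print([i.name for i in fix_now])
print([i.name for i in deferred])
```

The ordering only works if the system has application and user context in the first place; with network data alone, all three issues would look equally urgent.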
In addition, if a solution only sees WAN data without correlating it with other areas of the network (i.e. LAN and WLAN), it can create inefficiencies that drive up network costs. Worse yet, this can lead to incorrect assumptions or conclusions about network performance issues and how to fix them. In this respect, AI-driven WANs need to be viewed holistically, as a key part of a larger end-to-end story for AI-driven networking.
AIOps is key to the evolution of SD-WAN, bringing much-needed automation and insight to a critical part of the network. However, AI-driven WANs cannot be deployed in a vacuum and must go beyond the network to deliver meaningful insight (and actions) at the application and user level. This not only leads to a better-performing, lower-cost network but also ensures user productivity and customer service remain high.
Copyright © 2020 IDG Communications, Inc.