Should AI initiatives change network planning?

That’s probably unnecessary, and possibly downright wrong. Enterprises don’t need to crawl the Internet for training data for their models. Enterprises don’t need to support mass-market use of their AI, and if they did, for applications like chatbots in customer support, they’d likely use cloud hosting, not in-house deployment. That means that AI to the enterprise is really a form of enhanced analytics. Widespread use of analytics has influenced data center network planning for access to the databases, and AI would likely increase database access if it’s widely used. But even given all of that, there’s no reason to think that Ethernet, the dominant data center network technology, wouldn’t be fine for AI. So forget the notion of an InfiniBand technology shift. But that doesn’t mean that AI won’t need to be planned for in the network.

Think of an AI cluster as an enormous virtual user community. It has to collect data from the enterprise repository, all of it, both to train and to get the latest information it needs to answer user questions. That means it needs a high-performance data path to this data, and that path can’t be allowed to congest other traditional workflows within the network. The issue is acute for enterprises with multiple data centers or multiple complexes of users, because it’s likely that they won’t want to host AI in every location. If the AI cluster is separated from some applications, databases, and users, then data center interconnect (DCI) paths might have to be augmented to carry the traffic without congestion risk.

According to those eight AI-hosting enterprises, the primary rule for AI traffic is that you want the workflows to be as short as possible, over the fastest connections you have. Pulling or pushing masses of AI data over widespread connections could make it almost impossible to prevent random massive movements of data from interfering with other traffic. It’s particularly important to ensure that AI flows don’t collide with other high-volume data flows, like conventional analytics and reporting. One approach is to map AI workflows and augment capacity along the path, and the other is to shorten and guide AI workflows by properly placing the AI cluster.
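To make the first approach concrete, here is a minimal sketch of mapping workflows onto links and flagging where AI traffic would collide with other flows. Everything in it is an assumption for illustration: the site names, the link capacities, the flow volumes, and the 80% utilization threshold are hypothetical, not figures from the enterprises cited.

```python
from collections import defaultdict


def overloaded_links(flows, capacity_gbps, threshold=0.8):
    """Return {link: utilization} for links whose summed load exceeds
    threshold * capacity. The threshold is an assumed planning margin."""
    load = defaultdict(float)
    for flow in flows:
        # Each flow adds its full rate to every link on its path.
        for link in flow["path"]:
            load[link] += flow["gbps"]
    return {
        link: load[link] / capacity_gbps[link]
        for link in load
        if load[link] > threshold * capacity_gbps[link]
    }


# Hypothetical DCI links and their capacities in Gbps.
capacity_gbps = {("dc1", "dc2"): 100.0, ("dc2", "dc3"): 40.0}

# Hypothetical workflows: one AI data pull and one conventional analytics job.
flows = [
    {"name": "ai-training-pull", "gbps": 30.0,
     "path": [("dc1", "dc2"), ("dc2", "dc3")]},
    {"name": "analytics-report", "gbps": 10.0,
     "path": [("dc2", "dc3")]},
]

hot = overloaded_links(flows, capacity_gbps)
print(hot)
```

In this made-up case the AI pull and the analytics job share the smaller link, which is the kind of collision the enterprises warn about. The candidates for added capacity are whatever links show up in `hot`.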

Planning for the AI cluster starts with the association between enterprise AI and business analytics. Analytics uses the same databases that AI would likely use, which means that placing AI where the major analytics applications are hosted would be smart. Remember that this means placing AI where the actual analytics applications are run, not where the results are formatted for use. Since analytics applications are often run proximate to the location of the major databases, this will put AI in the location most likely to generate the shortest network connections. Run fat Ethernet pipes within the AI cluster and to the database hosts, and you’re probably in good shape. But watch AI usage and traffic carefully, particularly if there aren’t many controls on who uses it and how much. Rampant, and largely unjustified, use of self-hosted AI was reported by six of the eight enterprises, and that could drive costly network upgrades.
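The placement reasoning can be sketched as a simple scoring exercise: pick the site where the data AI needs has the shortest distance to travel. This is only an illustration under stated assumptions. The site names, database volumes, and hop counts are hypothetical, and "volume times hops" is a stand-in cost, not a method the enterprises described.

```python
def best_site(sites, db_volume_tb, hops):
    """Pick the site with the lowest volume-weighted hop count to the
    databases. The cost model is an assumption made for illustration."""
    def cost(site):
        return sum(vol * hops[site][db_site]
                   for db_site, vol in db_volume_tb.items())
    return min(sites, key=cost)


# Hypothetical data centers.
sites = ["dc1", "dc2", "dc3"]

# Hypothetical volume (TB) of analytics databases hosted at each site.
db_volume_tb = {"dc1": 50.0, "dc2": 400.0, "dc3": 20.0}

# Hypothetical network hop counts between sites.
hops = {
    "dc1": {"dc1": 0, "dc2": 2, "dc3": 3},
    "dc2": {"dc1": 2, "dc2": 0, "dc3": 1},
    "dc3": {"dc1": 3, "dc2": 1, "dc3": 0},
}

choice = best_site(sites, db_volume_tb, hops)
print(choice)
```

With these numbers the site holding the bulk of the analytics data wins, which matches the advice to put AI where the major analytics applications and databases already live.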

The future of AI networking for enterprises isn’t about how AI is run; it’s about how it’s used. While AI usage will surely drive additional traffic, it’s not going to require swapping out the entire data center network for hundreds of gigabits of Ethernet capacity. What it will require is a better understanding of how AI usage connects with AI data center clusters, cloud resources, and perhaps some generative AI as well. If Cisco, Juniper, or another vendor can provide that, they can expect a happy bonus in 2024.


