VMware, Nvidia offer GPU-powered AI in virtual machines


VMware and Nvidia have expanded their alliance to support Nvidia GPU-based applications on VMware’s new vSphere 7 Update 2. The upgraded version of vSphere 7 will support the new Nvidia AI Enterprise offering, a suite of enterprise-grade AI tools and frameworks that enables GPU-accelerated applications to run in VMware virtual machines and containers.

VMware’s vSphere 7 U2 adds support for Nvidia’s A100 Tensor Core GPU and its multi-instance GPU (MIG) feature, which partitions a single A100 into as many as seven isolated GPU instances for use by multiple users, much as VMware parcels out CPU cores to multiple users.
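To make the partitioning idea concrete, here is a deliberately simplified Python sketch. It assumes only that an A100 exposes seven compute slices that can be handed out to tenants; the `ToyGpu` class, its method, and the tenant names are hypothetical and are not part of any Nvidia or VMware API. Real MIG also partitions memory and imposes placement rules that this toy model ignores.

```python
from dataclasses import dataclass, field

# Assumption for this sketch: one A100 offers seven compute slices in total.
TOTAL_COMPUTE_SLICES = 7


@dataclass
class ToyGpu:
    """Hypothetical bookkeeping model of a partitioned GPU (not a real API)."""
    free_slices: int = TOTAL_COMPUTE_SLICES
    assignments: dict = field(default_factory=dict)

    def assign(self, tenant: str, slices: int) -> bool:
        """Reserve `slices` compute slices for `tenant` if enough remain."""
        if tenant in self.assignments:
            return False  # one instance per tenant in this toy model
        if slices < 1 or slices > self.free_slices:
            return False  # request cannot be satisfied
        self.assignments[tenant] = slices
        self.free_slices -= slices
        return True


gpu = ToyGpu()
ok_training = gpu.assign("training-vm", 4)    # large instance for training
ok_inference = gpu.assign("inference-vm", 3)  # smaller instance for inference
ok_extra = gpu.assign("extra-vm", 1)          # no slices left, so refused
```

The point of the sketch is only that each tenant receives a fixed, dedicated share of the GPU rather than competing for the whole device.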

This means that AI workloads can now run on VMware’s virtualized platform. Until now, AI workloads have largely run on bare-metal servers. AI is nothing if not performance-intensive, and a bare-metal environment delivers the full power of the hardware rather than sharing it in a virtual, multi-tenant scenario.

Nvidia claims in a blog post announcing the new software that AI Enterprise enables virtual workloads to run at near bare-metal performance on vSphere. AI workloads will be able to scale across multiple nodes, allowing even the largest deep-learning training models to run on VMware Cloud Foundation.

With this capability, developers can achieve scale-out, multi-node performance for CUDA applications, AI frameworks, models and SDKs on the vSphere platform. The AI Enterprise platform is designed to be deployed on Nvidia-certified systems from Dell Technologies, Hewlett Packard Enterprise (HPE), Supermicro, Gigabyte, and Inspur.

In addition to the A100 support, vSphere 7 U2 adds the ability to employ vSphere Lifecycle Manager to see images and manage instances of vSphere running with Tanzu, VMware’s distribution of Kubernetes. vSphere 7 U2 comes with integrated application load balancing as well as better support for private and third-party container registries.

vSAN 7 U2 enhancements

In addition to the vSphere upgrades, VMware announced the availability of VMware vSAN 7 Update 2 with several new and enhanced features. First up is a new version of its hyperconverged infrastructure (HCI) software, HCI Mesh. The new release builds upon the software-based approach to disaggregating compute and storage resources that was first released in vSAN 7 Update 1.

The new release supports a broader set of use cases, particularly for customers looking to increase resource efficiency beyond their existing vSAN environment. It enables compute clusters, or non-HCI clusters, to remotely use storage from a vSAN cluster within the data center, allowing customers to scale compute and storage independently.

vSAN 7 Update 2 also introduces new capabilities to better support various physical topologies. This includes integrated DRS awareness of stretched cluster configurations for more consistent performance in failback, as well as vSAN file services support for stretched clusters and 2-node clusters.

Also, VMware continues to deliver capabilities that drive better performance of vSAN, including vSAN over Remote Direct Memory Access (RDMA) and enhancements to RAID 5/6 erasure coding that improve CPU utilization and app performance for certain workloads.
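For readers unfamiliar with erasure coding, the following Python sketch shows the single-parity idea behind RAID 5: a parity block is the XOR of the data blocks, so any one lost block can be rebuilt from the survivors. This is a generic textbook illustration, not vSAN's implementation; the function names and sample data are made up for the example, and RAID 6 adds a second, differently computed parity block that is not shown here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Compute the parity block for a stripe of data blocks."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks, parity):
    """Recover the single missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Illustrative stripe of three equal-length data blocks.
data = [b"vSAN", b"7 U2", b"RAID"]
parity = make_parity(data)

# Pretend the middle block was lost and rebuild it from the other two.
recovered = rebuild_missing([data[0], data[2]], parity)
```

Computing and checking parity costs CPU cycles on every write, which is why efficiency improvements in this code path can show up as lower CPU utilization.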

Finally, vSAN 7 Update 2 includes FIPS 140-2 validation of the cryptographic module for data-in-transit encryption to meet strict government requirements.


Copyright © 2021 IDG Communications, Inc.


