‘Don’t be surprised if I am an AI,’ says Nvidia’s CEO
Nvidia used to be just a graphics chip vendor, but CEO Jensen Huang wants you to know that the company is now a full-stack computing service provider, and that he may be an artificial construct.
With such lofty ambitions, Nvidia is moving into the cloud, delivering both hardware and software as-a-service. At the company’s GTC Fall conference last week, Huang showed off a few new toys for gamers, but he spent most of his keynote speech outlining the tools Nvidia offers CIOs to accelerate computing in the enterprise.
There was hardware for industrial designers in the new Ada Lovelace RTX GPU; a chip to steer self-driving vehicles while entertaining passengers; and the IGX edge computing platform for autonomous systems.
But it wasn’t only hardware. Software (for drug discovery, biology research, language processing, and building metaverses for industry) and services including consulting, cybersecurity, and software- and infrastructure-as-a-service in the cloud were there too.
Huang punctuated his keynote with demos of a single processor performing photo-realistic, real-time rendering of scenes with natural-looking lighting effects, an AI that can seamlessly fill in missing frames to smooth and speed up animation, and a way of training large language models so they respond to prompts in context-dependent ways. The quality of those demos made it at least somewhat plausible when, in a videoconference with journalists after the keynote, the on-screen Huang quipped, “Don’t be surprised if I’m an AI.”
Joking aside, CIOs will want to pay serious attention to Nvidia’s new cloud services play, as it could enable them to deliver new capabilities across their organizations without increasing equipment budgets. In an age when hardware costs are likely to climb and the industry’s ability to pack more transistors into a given area of silicon is stalling, that is a proposition many will find hard to ignore.
“Moore’s law is dead,” said Huang, referencing Gordon Moore’s 1965 prediction that the number of transistors on microchips would double roughly every two years. “And the idea that a chip is going to go down in cost over time, unfortunately, is a story of the past.”
Many factors are contributing to the troubles of chip makers like Nvidia, including difficulty obtaining vital tooling and the rising cost of raw materials such as neon gas (supplies of which have been affected by the war in Ukraine) and the silicon wafers chips are made from.
“A 12-inch wafer is a lot more expensive today than it was yesterday,” Huang said. “And it’s not a little bit more expensive, it is a ton more expensive.”
Nvidia’s response to those rising costs is to develop software optimized to get the most out of its processors, helping to redress the price-performance balance. “The future is about accelerated full stack,” he said. “Computing is not a chip problem. Computing is a software and chip problem, a full stack challenge.”
Fine-tuning NeMo
To underline that point, Nvidia announced it’s already busy optimizing its NeMo large language model training software for its new H100 chip, which has just entered full production. The H100 is the first chip based on the Hopper architecture that Nvidia unveiled at its Spring GTC conference in March. Other deep learning frameworks being optimized for the H100 include Microsoft DeepSpeed, Google JAX, PyTorch, TensorFlow, and XLA, Nvidia said.
NeMo also has the distinction of being one of the first two Nvidia products to be sold as a cloud-based service, the other being Omniverse.
The NeMo Large Language Model Service enables developers to train large language models built by Nvidia, or tailor their responses, for processing and generating human language and computer code. The related BioNeMo LLM Service does something similar for protein structures, predicting their biomolecular properties.
Nvidia’s latest innovation in this area is to enable enterprises to take a model built from billions of parameters and fine-tune it using a few hundred data points, so a chatbot can provide responses more appropriate to a particular context. For example, if a chatbot were asked, “What are the rental options?” it might respond, “You can rent a modem for $5 per month,” if it were tuned for an ISP; “We can offer economy, compact and full-size cars,” for a car rental company; or, “We have units from studios to three bedrooms,” for a property management agency.
Such tuning, Nvidia said, can be performed in hours, whereas training a model from scratch can take months. Tuned models, once created, can also be called up using a “prompt token” combined with the original model. Enterprises can run the models on premises or in the cloud or, starting in October, access them in Nvidia’s cloud through an API.
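The mechanics behind that “prompt token” are worth unpacking. What Nvidia describes resembles prompt tuning, in which the multi-billion-parameter base model stays frozen and only a short sequence of learned soft-prompt embeddings is trained on the customer’s few hundred examples. The sketch below illustrates the idea in plain PyTorch; the class, names, and dimensions are illustrative assumptions, not NeMo’s actual API.

```python
import torch
import torch.nn as nn

class SoftPromptWrapper(nn.Module):
    """Freezes a pretrained base model; trains only a short soft prompt.
    An illustrative sketch of prompt tuning, not NeMo's API."""
    def __init__(self, base_model: nn.Module, embed_dim: int, prompt_len: int = 20):
        super().__init__()
        self.base_model = base_model
        for p in self.base_model.parameters():
            p.requires_grad = False            # the billions of base weights stay fixed
        # the trainable "prompt token" embeddings, one row per virtual token
        self.soft_prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # prepend the learned prompt to every embedded sequence in the batch
        prompt = self.soft_prompt.unsqueeze(0).expand(input_embeds.size(0), -1, -1)
        return self.base_model(torch.cat([prompt, input_embeds], dim=1))

# Stand-in for a pretrained LLM: a tiny transformer encoder.
base = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True), num_layers=2)
tuned = SoftPromptWrapper(base, embed_dim=64)
out = tuned(torch.randn(8, 16, 64))            # 8 sequences of 16 embedded tokens
print(out.shape)                               # torch.Size([8, 36, 64]): 20 prompt + 16 input
```

During tuning, only the soft-prompt parameters receive gradient updates, which is why a few hundred examples and a few hours can suffice where training from scratch takes months; at inference, the learned prompt is simply combined with the user’s input before it reaches the unchanged base model.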
Omniverse Cloud
Nvidia’s Omniverse platform is the foundation of the other suite of cloud services the company offers.
Huang described the platform as having three key features. One is the ability to ingest and store three-dimensional information about worlds: “It’s a modern database in the cloud,” Huang said. Another is its ability to connect devices, people or software agents to that information and to one another. “And the third gives you a viewport into this new world, another way of saying it’s a simulation engine,” Huang said.
Those simulations can be of the real world, in the case of enterprises creating digital twins of manufacturing facilities or products, or of fictional worlds used to train sensor networks (with Omniverse Replicator), robots (with Isaac Sim), and self-driving vehicles (with Drive Sim) by feeding them simulated sensor data.
There’s also Omniverse Nucleus Cloud, which provides a shared Universal Scene Description store for 3D scenes and data that can be used for online collaboration, and Omniverse Farm, a scale-out tool for rendering scenes and generating synthetic data using Omniverse.
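The scene data Nucleus stores and synchronizes uses Pixar’s open-source Universal Scene Description (USD) format, which Omniverse adopts as its common language. As a rough illustration, here is a minimal sketch that authors a trivial USD scene with the open-source pxr Python bindings (the usd-core package); the file name and prim paths are illustrative, and connecting to an actual Nucleus server, which goes through Nvidia’s Omniverse client libraries, is not shown.

```python
# Minimal USD authoring sketch using Pixar's open-source bindings
# (pip install usd-core). File name and prim paths are illustrative;
# Nucleus connection code via Omniverse's client libraries is omitted.
from pxr import Usd, UsdGeom, Gf

stage = Usd.Stage.CreateNew("factory_twin.usda")        # a new USD layer on disk
world = UsdGeom.Xform.Define(stage, "/World")           # root transform prim
robot = UsdGeom.Cube.Define(stage, "/World/RobotBase")  # cube as a stand-in mesh
robot.GetSizeAttr().Set(2.0)
UsdGeom.XformCommonAPI(robot.GetPrim()).SetTranslate(Gf.Vec3d(0.0, 1.0, 0.0))
stage.SetDefaultPrim(world.GetPrim())
stage.GetRootLayer().Save()   # collaborators open and edit this same scene file
```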
Industrial giant Siemens is already using the Omniverse platform to develop digital twins for manufacturing, and Nvidia said Siemens is now working on delivering those services to its customers using Omniverse Cloud.
Omniverse Farm, Replicator and Isaac Sim are already available in containers for enterprises to deploy on Amazon Web Services’ compute cloud instances equipped with Nvidia GPUs, but enterprises will have to wait for general availability of the other Omniverse Cloud applications as Nvidia managed services. The company is now taking applications for early access.
Nvidia is also opening up new channels to help enterprises consume its new products and services. Management consulting provider Booz Allen Hamilton offers enterprises a new cybersecurity service it calls Cyber Precog, built on Nvidia Morpheus, an AI cybersecurity processing framework, while Deloitte will offer enterprise services around Nvidia’s Omniverse software suite, the companies announced at GTC.
Nvidia is working with consultants and systems integrators to roll out its SaaS and hardware rental offerings, but that doesn’t mean it’s going to stop selling hardware outright. Huang noted that some organizations, typically startups or those that use their infrastructure only sporadically, prefer to rent, while large, established enterprises prefer to own their infrastructure.
He likened the process of training AI models to operating a factory. “Nvidia is now in the factory business, the most important factory of the future,” he said. Where today’s factories take in raw materials and put out products, he said, “In the future, factories are going to have data come in, and what comes out is going to be intelligence or models.”
But Nvidia needs to package its hardware and software factories for CIOs in different ways, Huang said: “Just like factories today, some people would rather outsource their factory, and some would rather own it. It just depends on what business model you’re in.”