Google Cloud Run now allows AI inferencing on Nvidia GPUs
The combination of GPU support and the serverless nature of the service should benefit enterprises running AI workloads, experts say: with Cloud Run they don't need to buy and house compute hardware on-premises, and they avoid the higher cost of keeping a typical cloud instance running around the clock.
“When your app is not in use, the service automatically scales down to zero so that you are not charged for it,” Google wrote in a blog post.
The company claims that the new feature opens up new use cases for developers, including performing real-time inference with lightweight open models such as Google's Gemma (2B/7B) or Meta's Llama 3 (8B) to build custom chatbots or perform on-the-fly document summarization, while scaling to handle spiky user traffic.
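To make that concrete, here is a minimal sketch of the kind of container an enterprise might deploy to Cloud Run for this use case. It assumes the Hugging Face transformers library and a Flask HTTP server; the model checkpoint (google/gemma-2b-it, which requires accepting Google's license on Hugging Face), route name, and overall structure are illustrative, not Google's published sample code.

```python
# Illustrative Cloud Run inference container, assuming transformers + Flask.
import os
from flask import Flask, request, jsonify
from transformers import pipeline

app = Flask(__name__)

# Load the model once per container instance, not per request, so a
# GPU-backed instance can serve many requests after a single warm-up.
generator = pipeline(
    "text-generation",
    model="google/gemma-2b-it",
    device_map="auto",  # places the model on the attached Nvidia GPU if present
)

@app.route("/generate", methods=["POST"])
def generate():
    prompt = request.get_json().get("prompt", "")
    result = generator(prompt, max_new_tokens=128)
    return jsonify({"completion": result[0]["generated_text"]})

if __name__ == "__main__":
    # Cloud Run tells the container which port to listen on via PORT.
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
```

Once deployed, a client would simply POST a JSON prompt to the /generate endpoint; when traffic stops, the instance scales to zero as described above.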
Another use case is serving custom fine-tuned gen AI models, such as image generation tailored to your company’s brand, and scaling down to optimize costs when nobody’s using them.
Additionally, Google said that the service can be used to speed up compute-intensive Cloud Run services, such as on-demand image recognition, video transcoding and streaming, and 3D rendering.
But are there caveats?
To begin with, enterprises may worry about cold start, a common phenomenon with serverless services. Cold start refers to the delay while a scaled-to-zero service spins up a fresh instance, and, for AI workloads, loads the model, before it can handle requests.
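Developers often try to soften that penalty by caching expensive initialization for the lifetime of the instance, so only the very first request pays the load cost. A small sketch of that lazy-initialization pattern is below; the _expensive_load function is a stand-in for a real model load, which for multi-billion-parameter models can take tens of seconds.

```python
# Sketch of lazy initialization to amortize cold starts; names are illustrative.
import time

_model = None

def _expensive_load():
    # Stand-in for downloading weights and moving them onto the GPU.
    time.sleep(2)
    return object()

def get_model():
    """Load on first use, then reuse for the life of the container instance."""
    global _model
    if _model is None:
        t0 = time.perf_counter()
        _model = _expensive_load()
        print(f"cold start penalty: {time.perf_counter() - t0:.1f}s")
    return _model
```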