Run Generative AI on-premises, with a cloud experience

IT leaders are grappling with a critical question as they seek to deploy generative AI workloads today: Is it better for my business to run GenAI applications in the public cloud or on-premises?

The question inspires spirited debate from both sides of the hosting aisle. Most IT leaders say, “It depends.” True, but that answer deserves some unpacking.

As you prepare to run a new workload, your first inclination may be to build, test, and launch it in a public cloud. And why not? The approach has probably helped you reduce time to deployment and even accelerated innovation.

So naturally, as you consider rolling out a GenAI service, you may be tempted to build and launch it in your preferred public cloud. You believe it will offer greater agility and speed than building it in your corporate datacenter or anywhere else.

Normally nobody would blink, blame you, or tell you to think twice. Except this workload is a bit different.

As always, you’ll base your workload placement decision on security, performance, latency, cost, and other variables, including the size and complexity of the large (or small) language model you plan to run, as well as the environments you plan to deploy it to.

Yet given the myriad known unknowns of deploying GenAI models, and the fact that the value you derive from them may be intrinsically linked to your corporate data, your ability to control this new technology might trump all other factors.

Start your GenAI journey in your datacenter

By using an off-the-shelf or open source model as you build, test, and tune your app on-premises, you can bring the AI to your data. That affords you greater processing efficiency while you retain control over that data.

Say you work in a regulated sector such as finance and you wish to create a GenAI service that surfaces product information. Strict data security and privacy mandates may govern whether and how you work with AI services in the public cloud. Running a GenAI app on-premises keeps all data within the organization’s environment, reducing the risk of data breaches while respecting regulatory requirements.

Plus, your ability to control access to the GenAI instance could help alleviate “shadow AI” concerns, which are growing among organizations. Protecting your IP while preventing a Wild West of unsanctioned AI tools is good governance.

Some scenarios require real-time interactions with the AI model, such as chatbots that support sales or customers. Running the LLM on-premises can minimize latency since data doesn’t have to travel to remote cloud servers and back. This can result in faster response times while enabling you to better monitor latency and throughput, as well as the accuracy of your model. Fifty-five percent of IT decision makers cited performance as a top reason for running GenAI workloads on-premises, according to a Dell survey of IT managers.¹
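As a rough illustration of what monitoring latency and throughput can look like in practice, here is a minimal Python sketch. It is not tied to any particular product: `measure` and `fake_generate` are hypothetical names, and `fake_generate` is only a stand-in for whatever call reaches your locally hosted model.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def measure(generate: Callable[[str], str], prompts: List[str]) -> Dict[str, float]:
    """Time each call to `generate` and summarize latency and throughput."""
    if not prompts:
        raise ValueError("prompts must not be empty")

    latencies: List[float] = []
    start = time.perf_counter()
    for prompt in prompts:
        t0 = time.perf_counter()
        generate(prompt)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    latencies.sort()
    # Nearest-rank 95th percentile of the sorted latencies.
    p95_index = max(0, math.ceil(0.95 * len(latencies)) - 1)
    return {
        "requests": float(len(latencies)),
        "median_s": statistics.median(latencies),
        "p95_s": latencies[p95_index],
        "requests_per_s": len(latencies) / elapsed if elapsed > 0 else 0.0,
    }


def fake_generate(prompt: str) -> str:
    # Hypothetical stand-in for a request to an on-premises model.
    time.sleep(0.001)
    return prompt.upper()


if __name__ == "__main__":
    print(measure(fake_generate, ["What is our refund policy?"] * 20))
```

Swapping `fake_generate` for a real client call would let you compare the same numbers across on-premises and cloud deployments before committing to either.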

Costs present another tricky variable. Operating a GenAI app in the public cloud can yield sticker shock as usage grows, or if the implementation isn’t properly scoped. Maybe you’re looking to stand up a pair programming environment in which humans write code while GenAI puts it through the QA wringer, or vice versa.

You get greater control over how many resources you consume on-premises, which will help you curb costs. That’s no small consideration, as 35% of IT leaders Dell surveyed cited cost as a key reason for deploying their GenAI workload on-premises.²

The cloud experience delivered on-premises

Maybe your GenAI journey starts on-premises, but once you’ve trained and tested your app, checking it for performance, bias, and other issues, you decide to also launch it in a public cloud. Eighty-two percent of IT decision makers indicated they were most interested in taking an on-premises or hybrid approach to building their GenAI solution, according to a Dell Generative AI Pulse survey.³

Hybrid cloud models naturally provide more choices. In that vein, did you know there are other ways to enjoy a cloud experience in-house? You can build a bridge between your on-premises estate and public clouds to get the best of both operating environments.

Dell APEX Cloud Platforms enable you to enjoy the agility and flexibility of cloud services, with the security, performance, and control of an on-premises solution. These platforms, which include offerings for Microsoft Azure, VMware, and Red Hat OpenShift, provide a unified cloud experience, allowing you to procure more infrastructure as required while enabling optimal deployment of GenAI apps, such as digital assistants and other tools that surface business information.

That way you can spend more of your time and energy accelerating your GenAI journey to achieve business outcomes that will help you drive digital transformation.

Learn more about Dell APEX Cloud Platforms.

¹Dell internal survey of IT decision makers, August 2023

²Dell internal survey of IT decision makers, August 2023

³Generative AI Pulse Survey, Dell Technologies, Sept. 2023
