This AI summer is abloom with smaller models, on more devices

A long-held opinion about AI capabilities is that bigger and more are better. Bigger models, with more data, invariably equal better AI experiences.

Today’s market reality is starkly different. It turns out that companies adopting generative AI don’t need the 1 trillion parameters, or even the hundreds of billions of parameters, that frontier LLMs contain. Instead, many organizations are adopting small language models (SLMs) customized to handle specific tasks.

These models range in size from one hundred million to one hundred billion parameters; many can run on PCs or even smartphones. Who knows? SLMs may eventually power virtual reality headsets.
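
As a rough illustration of why parameter count determines where a model can run, the sketch below estimates the memory needed just to hold a model's weights. It rests on a simplifying assumption that is not from the article: memory is approximated as parameter count times bytes per parameter, ignoring activations, caches, and runtime overhead. The function name, the precision table, and the 7-billion-parameter midpoint are illustrative choices.

```python
# Back-of-the-envelope estimate of the memory needed to store model weights.
# Assumption: weights dominate, so memory ~= parameters * bytes per parameter.
# Activations, caches, and runtime overhead are deliberately ignored.

BYTES_PER_PARAM = {
    "fp32": 4.0,  # 32-bit floating point
    "fp16": 2.0,  # 16-bit floating point
    "int8": 1.0,  # 8-bit quantized
    "int4": 0.5,  # 4-bit quantized
}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Return the approximate weight storage in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    # The 100M and 100B endpoints come from the article; 7B is an illustrative midpoint.
    for label, params in [("100M", 1e8), ("7B", 7e9), ("100B", 1e11)]:
        sizes = ", ".join(
            f"{precision}: {weight_memory_gb(params, precision):.2f} GB"
            for precision in BYTES_PER_PARAM
        )
        print(f"{label:>5} parameters -> {sizes}")
```

Under these assumptions, a model at the low end of the range needs well under a gigabyte for its weights, which is consistent with phone-class hardware, while a model at the high end needs tens to hundreds of gigabytes depending on precision.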

The test-and-learn arc typically goes like this: Organizations used LLMs to implement proofs of concept but over time realized they could achieve similar outcomes at a lower cost using smaller models from Microsoft, Meta, and Google, as well as startups such as Hugging Face, Mistral, and Anthropic.

Using SLMs in conjunction with methods such as fine-tuning and retrieval-augmented generation (RAG) to refine outputs using corporate data, companies are beginning to automate document retrieval and analyze customer service data to keep up with consumer habits, according to The Wall Street Journal. And the use cases continue to stack up.

Going small for big advantages

Cost is surely a big factor for IT leaders weighing how to invest in emerging technologies such as GenAI. Even OpenAI, the king of frontier models whose ChatGPT kicked off the GenAI frenzy, recently issued a smaller, lower-cost model.

Lower cost isn’t the only advantage of using SLMs.

Boost Speed and Efficiency. LLMs typically have to be spread across multiple GPUs, and coordinating them can slow down inference. SLMs, however, can run on local machines and return rapid results to prompts. And they don’t need to connect to the cloud.

Reduce Latency. Fewer parameters to process often means quicker prompt response times. SLMs are unlikely to outclass GPT-4o anytime soon, but depending on corporate needs or use cases, they might not need to.

Domain Specificity. Because SLMs are typically trained on specific domains, they may provide more relevant results than LLMs, which aim for broader generalization. This lends itself well to use cases where corporate IP is included as part of the model.

Limit Errors. Hallucinations remain a fear for organizations concerned about models producing inaccurate or biased information. Industry consensus holds that using a smaller set of training data, along with RAG and human-in-the-loop approaches, may reduce inaccuracies in results. This can protect your IP and corporate reputation.

Increase Sustainability. As it turns out, downsizing AI models by opting for SLMs over LLMs is also better for the environment. AI consumes a tremendous amount of power to produce tokens. Lowering that consumption, and thus reducing an organization’s AI footprint, is essential for meeting corporate sustainability standards.

Practitioners watching the growing thirst for SLMs have made some high-level observations about the current downsizing trend.

Models needed to get larger before they could get smaller, according to Andrej Karpathy, a former OpenAI engineer: Larger models help mold available training data into synthetic formats, which in turn yields smaller models that will be able to think reasonably well. Developer Drew Breunig notes that model makers are becoming more selective about the training data they use to create smaller, more effective models.

It’s just as well when you consider the growing scarcity of remaining data available to train LLMs. Smaller models, it seems, are as much a practical necessity as they are a boon for IT budgets and efficiency.

Getting the Dell AI Factory to work for you

Regardless of the model path you choose, getting a proof-of-concept up and running can feel daunting. The good news is that it doesn’t have to be.

Dell Technologies offers a roadmap for AI model and infrastructure deployment—whether you’re using an SLM, LLM, or something in between.

The Dell AI Factory helps organizations work through these challenges, offering guidance for preparing your corporate data and choosing AI-enabled infrastructure. The Dell AI Factory can connect you with partners from the open ecosystem, as well as with professional services, that can provide you with the use cases and tools to help facilitate your AI deployment.

The AI era comprises model types of all shapes and sizes. Just remember, you don’t have to go big to get the business outcomes you seek.

Learn more about the Dell AI Factory in this webinar.
