This AI summer is abloom with smaller models, on more devices
A long-held opinion about AI capabilities is that bigger and more are better. Bigger models, with more data, invariably equal better AI experiences.
Today’s market reality is starkly different. It turns out companies adopting generative AI don’t need the 1 trillion parameters, or even the hundreds of billions of parameters, that frontier LLMs are trained on. Instead, many organizations are adopting small language models (SLMs) customized to handle specific tasks.
These models range from roughly one hundred million to one hundred billion parameters, and many can run on PCs or even smartphones. Who knows? SLMs may eventually power virtual reality headsets.
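To see why those sizes put models within reach of local devices, here's a back-of-envelope sketch. The 3-billion-parameter figure and precision levels are illustrative assumptions, and it counts only model weights, not activations or runtime overhead:

```python
# Rough memory math for model weights: parameters x bytes per parameter.
# Illustrative only; ignores activations, KV cache, and runtime overhead.

def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate weight memory in gigabytes (1e9 params * bytes / 1e9 bytes per GB)."""
    return params_billions * bytes_per_param

# A hypothetical 3B-parameter SLM at common precisions:
for label, bytes_per_param in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    print(f"3B params @ {label}: ~{weight_memory_gb(3, bytes_per_param):.1f} GB")
# fp16 -> ~6.0 GB (a desktop GPU); int4 -> ~1.5 GB (smartphone territory)
```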
The test-and-learn arc typically goes like this: Organizations used LLMs to implement proofs of concept but over time realized they could achieve similar outcomes at lower cost using smaller models from Microsoft, Meta, and Google, as well as startups such as Hugging Face, Mistral, and Anthropic.
Using SLMs in conjunction with methods such as fine-tuning and retrieval-augmented generation (RAG) to refine outputs with corporate data, companies are beginning to automate document retrieval and analyze customer service data to keep up with consumer habits, according to The Wall Street Journal. And the use cases continue to stack up.
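As one illustration of the RAG half of that pattern, the minimal sketch below embeds a few company snippets, retrieves the closest match to a question, and grounds the prompt in it before any model is called. It assumes the sentence-transformers library is installed; the embedding model, documents, and question are placeholders:

```python
# Minimal retrieval-augmented generation (RAG) sketch: embed corporate
# documents, retrieve the closest match to a question, and prepend it
# to the prompt. Model name and snippets are illustrative placeholders.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small embedding model

docs = [
    "Refunds are processed within 5 business days of approval.",
    "Support tickets are triaged by severity, then by age.",
]
doc_vecs = embedder.encode(docs, convert_to_tensor=True)

question = "How long do refunds take?"
q_vec = embedder.encode(question, convert_to_tensor=True)

# Retrieve the most similar document by cosine similarity.
scores = util.cos_sim(q_vec, doc_vecs)[0]
context = docs[int(scores.argmax())]

# Ground the prompt in retrieved company data before calling any SLM.
prompt = f"Answer using only this context:\n{context}\n\nQ: {question}\nA:"
print(prompt)
```

A production pipeline would swap the in-memory list for a vector store and pass the assembled prompt to whichever SLM you've deployed.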
Going small for big advantages
Cost is surely a big factor for IT leaders weighing how to invest in emerging technologies such as GenAI. Even OpenAI, the king of frontier models whose ChatGPT kicked off the GenAI frenzy, recently released a smaller, lower-cost model.
Lower cost isn’t the only advantage of using SLMs.
Boost Speed and Efficiency. LLMs typically run on clusters of GPUs in the cloud, and every prompt makes a network round trip. SLMs, however, can run on local machines and return results quickly, with no cloud connection required (see the sketch after this list).
Reduce Latency. Fewer parameters to process often means faster responses to prompts. SLMs aren't likely to outclass GPT-4o anytime soon, but depending on corporate needs or use cases, they may not have to.
Sharpen Domain Specificity. Because SLMs are typically trained on specific domains, they can produce more relevant results than LLMs, which aim for broader generalization. That makes them a good fit for use cases where corporate IP is included as part of the model.
Limit Errors. Hallucinations remain a fear for organizations concerned about models producing inaccurate or biased information. Industry consensus holds that a smaller, curated set of training data, combined with RAG and human-in-the-loop review, may reduce inaccuracies in results. This can protect your IP and corporate reputation.
Increase Sustainability. Downsizing AI models also benefits the environment. AI consumes a tremendous amount of power to produce tokens; lowering that consumption and shrinking an organization's AI footprint is essential for meeting corporate sustainability standards.
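As promised above, here's a minimal sketch of running an SLM entirely on a local machine using the Hugging Face transformers pipeline. The model ID is just one example of a freely available sub-billion-parameter instruct model, and the prompt is a placeholder:

```python
# Local SLM inference: no cloud round trip, no GPU cluster required.
# The model name is an example; swap in whatever fits your hardware.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-0.5B-Instruct",  # ~0.5B params; runs on a CPU
)

out = generator(
    "Summarize: our refund policy allows returns within 30 days.",
    max_new_tokens=60,
)
print(out[0]["generated_text"])
```

Swapping the model ID is the only change needed to trade response quality against memory footprint and speed.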
Practitioners watching the growing thirst for SLMs have made some high-level observations about the current downsizing trend.
According to Andrej Karpathy, a former OpenAI engineer, models had to get larger before they could get smaller: frontier-scale models are needed to refine raw training data into synthetic formats, which in turn yield smaller models that can think reasonably well. Developer Drew Breunig notes that model makers are becoming more selective about the training data they use to create smaller, more effective models.
That's just as well, given the dwindling supply of data left to train LLMs on. Smaller models, it seems, are as much a practical necessity as they are a boon for IT budgets and efficiency.
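To make the "larger before smaller" observation concrete, here's a hedged sketch of synthetic-data generation: a large "teacher" model drafts question-and-answer pairs that can later fine-tune a small model. It assumes the OpenAI Python client and an API key in the environment; the model name, topics, and output file are placeholders, and a real pipeline would validate the generated JSON before use:

```python
# Sketch: use a large "teacher" model to draft synthetic training
# examples, saved as JSONL to seed fine-tuning of a small model.
# Model, topics, and filename are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

topics = ["refund policy", "ticket triage", "warranty claims"]
with open("slm_train.jsonl", "w") as f:
    for topic in topics:
        resp = client.chat.completions.create(
            model="gpt-4o",  # the large teacher model
            messages=[{
                "role": "user",
                "content": f"Write one customer question and a correct "
                           f"answer about our {topic}, as a single line "
                           f"of JSON with keys 'question' and 'answer'.",
            }],
        )
        f.write(resp.choices[0].message.content.strip() + "\n")
# The resulting JSONL becomes fine-tuning data for a much smaller model.
```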
Getting the Dell AI Factory to work for you
Regardless of the model path you choose, getting a proof-of-concept up and running can feel daunting. The good news is that it doesn’t have to be.
Dell Technologies offers a roadmap for AI model and infrastructure deployment—whether you’re using an SLM, LLM, or something in between.
The Dell AI Factory helps organizations work through these challenges, offering guidance on preparing your corporate data and choosing AI-enabled infrastructure. It can also connect you with partners from the open ecosystem, as well as professional services that provide the use cases and tools to facilitate your AI deployment.
The AI era comprises model types of all shapes and sizes. Just remember, you don’t have to go big to get the business outcomes you seek.
Learn more about the Dell AI Factory in this webinar.