Introducing the GenAI models you haven’t heard of yet
S&P Global is testing Llama 2, Biem says, as well as other open source models on the Hugging Face platform.
Many companies start out with OpenAI, says Sreekar Krishna, managing director for data and analytics at KPMG. But they don’t necessarily stop there.
“Most of the institutions I’m working with are not taking a single vendor strategy,” he says. “They’re all very aware that even if you just start with OpenAI, it’s just a starting gate.”
Most often, he sees companies look at Google’s Bard next, especially if they’re already using Google cloud or other Google platforms.
Another popular option is Databricks, a widely used data pipeline platform for enterprise data science teams. The company introduced Dolly, its open source LLM, in April, licensed for both research and commercial use, and in July added support for Llama 2.
“The Databricks platform is capable of consuming large volumes of data and is already one of the most widely used open source platforms in enterprises,” says Krishna.
The Dolly model, as well as Llama 2 and the open source models from Hugging Face, will also become available on Microsoft's platform, Krishna says.
“It’s such a fast-evolving landscape,” he says. “We feel that every hyperscaler will have open source generative AI models quickly.”
But given how fast the space is evolving, he says, companies should focus less on what model is the best, and spend more time thinking about building flexible architectures.
“If you build a good architecture,” he says, “your LLM model is just plug-and-play; you can quickly plug in more of them. That’s what we’re doing.”
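The plug-and-play architecture Krishna describes can be sketched as a thin abstraction layer, where each vendor's SDK sits behind one common interface. A minimal illustration in Python follows; the provider names and the `complete()` signature are hypothetical, not any vendor's actual API, and real implementations would call the respective SDKs where the stubs are.

```python
# A minimal sketch of a provider-agnostic LLM layer. Each backend
# implements the same interface, so swapping models is a registry
# change rather than a rewrite. Provider classes here are stubs.
from abc import ABC, abstractmethod


class LLMProvider(ABC):
    """Common interface every model backend must implement."""

    @abstractmethod
    def complete(self, prompt: str) -> str: ...


class OpenAIProvider(LLMProvider):
    def complete(self, prompt: str) -> str:
        # A real implementation would call the vendor SDK; stubbed here.
        return f"[openai] {prompt}"


class Llama2Provider(LLMProvider):
    def complete(self, prompt: str) -> str:
        return f"[llama2] {prompt}"


# Adding another model means adding one entry, nothing more.
PROVIDERS: dict[str, LLMProvider] = {
    "openai": OpenAIProvider(),
    "llama2": Llama2Provider(),
}


def ask(provider_name: str, prompt: str) -> str:
    return PROVIDERS[provider_name].complete(prompt)
```

With this shape, application code depends only on `ask()`, so moving a workload from one model to another never touches business logic.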
KPMG is also experimenting with building systems that can use OpenAI, Dolly, Claude, and Bard, he says. But Databricks isn’t the only data platform with its own LLM.
John Carey, MD of the technology solutions group at global consulting firm AArete, uses Document AI, a new model from Snowflake, now in early release, that lets people ask questions about unstructured documents. Most importantly, it allows AArete to provide security for its enterprise clients.
“They trust you with their data that might have customer information,” says Carey. “You’re directly obligated to protect their privacy.”
Snowflake’s Document AI is an LLM that runs within a secure, private environment, he says, without any risk that private data would be shipped off to an outside service or wind up being used to train the vendor’s model.
“We need to secure this data, and make sure it has access controls and all the standard data governance,” he says.
Beyond large foundation models
Using large foundation models and then customizing them for business use by fine-tuning or embedding is one way enterprises are deploying generative AI. But another path some companies are taking is to look for narrow, specialized models.
“We’ve been seeing domain-specific models emerging in the market,” says Gartner analyst Arun Chandrasekaran. “They also tend to be less complex and less expensive.”
Databricks, IBM, and AWS all have offerings in this category, he says.
There are models specifically designed to generate computer code, models that can describe images, and those that perform specialized scientific tasks. There are probably a hundred other models, says Chandrasekaran, and several different ways companies can use them.
Companies can use public versions of generative AI models, like ChatGPT, Bard, or Claude, when there are no privacy or security issues, or run the models in private clouds, like Azure. They can access the models via APIs, augment them with embeddings, or develop a new custom model by fine-tuning an existing model via training it on new data, which is the most complex approach, according to Chandrasekaran.
“You have to get your data and annotate it,” he says. “So you now own the model and have to pay for inference and hosting costs. As a result, we’re not seeing a lot of fine-tuning at this point.”
But that will probably change, he says, as new, smaller models emerge that make the additional training easier and cheaper for companies to do and deploy.
There’s one other option for companies, he adds.
“That’s where you build your own model from scratch,” he says. “That’s not something a lot of enterprises are going to do, unless you’re a Fortune 50 company, and even then, only for very specific use cases.”
For many companies, using off-the-shelf models and adding embeddings will be the way to go. Plus, using embeddings has an extra benefit, he says.
“If you’re using the right architecture, like a vector database, the AI can include references with its answers,” he says. “And you can actually tune these models not to provide a response if they don’t have reference data.”
That’s not usually the case with public chatbots like ChatGPT.
“Humility is not a virtue of the online chatbots,” says Chandrasekaran. “But with the enterprise chatbots, it would say, ‘I don’t know the answer.’”
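The pattern Chandrasekaran describes, answers grounded in retrieved references and a refusal when nothing relevant is found, can be sketched in a few lines. The toy embeddings, documents, and similarity threshold below are all illustrative; a production system would use a real vector database and a learned embedding model.

```python
# A minimal sketch of retrieval with a refusal threshold: the system
# returns an answer plus its source document, or declines when no
# stored document is similar enough to the query.
import math

# Tiny stand-in for a vector database: (text, embedding) pairs.
DOCS = [
    ("Expense reports are due on the 5th.", [1.0, 0.0, 0.0]),
    ("VPN access requires a hardware token.", [0.0, 1.0, 0.0]),
]

THRESHOLD = 0.8  # below this cosine similarity, refuse to answer


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


def answer(query_embedding):
    """Return (answer_text, reference); reference is None on refusal."""
    best_doc, best_sim = None, -1.0
    for text, emb in DOCS:
        sim = cosine(query_embedding, emb)
        if sim > best_sim:
            best_doc, best_sim = text, sim
    if best_sim < THRESHOLD:
        return ("I don't know the answer.", None)
    return (f"Based on our records: {best_doc}", best_doc)
```

The refusal branch is the "humility" Chandrasekaran points to: because every answer must clear the similarity bar, the system says "I don't know" instead of improvising.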
Going small
Smaller models aren’t just easier to fine-tune; they can also run in a wider variety of deployment options, including on desktop computers or even mobile phones.
“The days of six-plus months of training and billions of parameters are gone,” says Bradley Shimmin, chief analyst for AI platforms, analytics, and data management at tech research and advisory group Omdia. “It now takes just hours to train a model. You can iterate rapidly and improve that model, fine-tune it, and optimize it to run on less hardware or more efficiently.”
A company can take open source code for a model such as Llama 2—which comes in three different sizes—and customize it to do exactly what it wants.
“That’s going to cost me phenomenally less than using GPT-4’s API,” says Shimmin.
The smaller models also make it possible for companies to experiment, even when they don’t know much about AI when they’re starting out.
“You can stumble around without having a lot of money,” he says, “And stumble into success very rapidly.”
Take Gorilla, for example. It’s an LLM based on Llama, fine-tuned on 1,600 APIs.
“It’s built to learn how to navigate APIs,” Shimmin adds. “Use cases include data integration in the enterprise. You’ll no longer have to maintain a pipeline, and it can do root cause analysis, self-heal, build new integrations rapidly—your jaw will drop.”
The challenge, he says, is to figure out which model to use where, and to navigate all the different license terms and compliance requirements. Plus, there’s still a lot of work to do when it comes to operationalizing LLMs.
Gen AI isn’t just about language
Language models are getting most of the attention in the corporate world because they can write code, answer questions, summarize documents, and generate marketing emails. But there’s more to generative AI than text.
Several months before ChatGPT hit the news headlines, another generative AI tool made waves: Midjourney. Image generators have evolved quickly, to the point where the images they produce can be indistinguishable from human work, even winning art and photography awards.
DeadLizard, a boutique creative agency that counts Disney among its clients, uses not only Midjourney but several other image tools, including Stable Diffusion and ClipDrop for image editing, and Runway for adding motion.
The images are used in the company’s own branded social media content, but also as part of the idea-generation and creative development process.
“By adding an open generative AI toolset, it’s the equivalent of opening an entire Internet’s worth of brains and perspectives,” says DeadLizard co-founder Todd Reinhart. “This helps accelerate ideation.”
Even weird or illogical suggestions can be helpful at this stage, he says, since they can inspire solutions outside the usual comfort zones. In addition, new generative AI tools can dramatically improve photo editing capabilities. Previously, the company had to do custom shoots, which are usually prohibitively expensive for all but the biggest projects, or use stock photography and Photoshop.
“We find entirely new workflows and toolsets coming to light on nearly a weekly basis,” he says.