Know before you go: 6 lessons for enterprise GenAI adoption
In 1895, Mary Lathrap penned a poem that inspired the quote, “You can’t really understand another person’s experience until you’ve walked a mile in their shoes.” That quote aptly describes what Dell Technologies and Intel are doing to help our enterprise customers quickly, effectively, and securely deploy generative AI and large language models (LLMs).

Many organizations know that commercially available, “off-the-shelf” generative AI models don’t work well in enterprise settings because of significant data access and security risks. As a result, organizations like Apple, Samsung, Accenture, Microsoft, Verizon, Wells Fargo, and others [1] have banned the use of commercial large language models.
Given the importance of being able to control data access and respect privacy and regulatory concerns while harnessing GenAI’s tremendous potential, Dell Technologies and Intel have been investigating GenAI implementations, open-source models, and alternatives to trillion-plus parameter models. We’re using our own databases, testing against our own needs, and building around specific problem sets. In other words, we are walking a mile in our customers’ shoes.
Walking a mile taught us 6 lessons
After extensive exploration, we learned 6 important lessons that illuminate the challenges and opportunities on the enterprise generative AI path forward. Knowing these lessons before adopting generative AI will likely save time, improve outcomes, and reduce risks and costs.
(Here’s a quick read about how enterprises put generative AI to work).
Lesson 1: Don’t start from scratch to train your own LLM
Massive amounts of data and computational resources are needed to train an LLM, which makes training one from scratch impractical for most enterprises. Training GPT-3 was heralded as an engineering marvel: it is rumored to have used 1,024 GPUs, taken 34 days, and cost $4.6 million in compute alone [2]. Speculation about GPT-4 indicates it is 1,000 times larger than GPT-3 [3] and took months and much more investment to complete. These are notable investments of time, data, and money.
Instead, a more viable option is to fine-tune a pre-trained, general-purpose model. Approaches such as parameter-efficient fine-tuning (PEFT) and low-rank adaptation (LoRA) can make this process less expensive and more feasible. However, these methods can still become costly, especially if constant updates are required.
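To make the fine-tuning path concrete, here is a minimal sketch of what LoRA-style parameter-efficient fine-tuning looks like with the Hugging Face transformers and peft libraries; the base model name and hyperparameters are illustrative assumptions, not recommendations from our testing.

```python
# Minimal LoRA sketch using Hugging Face transformers + peft.
# The base model and hyperparameters below are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "meta-llama/Llama-2-7b-hf"  # example checkpoint; any causal LM works
model = AutoModelForCausalLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# LoRA freezes the original weights and trains small low-rank adapter matrices.
config = LoraConfig(
    r=8,                                   # adapter rank
    lora_alpha=16,                         # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of total weights
```

Because only the adapter weights are updated during training, the compute and storage footprint stays far smaller than full retraining, though repeated re-tuning still adds up.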
A better approach is to use prompt engineering techniques, where specific knowledge and custom instructions are supplied as input to a pre-trained LLM. Retrieval Augmented Generation (RAG), which provides a way to optimize LLM output without altering the underlying model, seems to be the most practical framework for doing so.
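As a rough illustration of the RAG pattern, the sketch below retrieves the passages most relevant to a question and passes them to a pre-trained LLM as context; embed() and llm_generate() are hypothetical stand-ins for whichever embedding model and LLM endpoint an enterprise actually deploys.

```python
# RAG sketch: retrieve relevant passages, then prompt a pre-trained LLM with them,
# leaving the model's weights untouched.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical: return a dense vector for the text."""
    raise NotImplementedError

def llm_generate(prompt: str) -> str:
    """Hypothetical: call a pre-trained LLM and return its completion."""
    raise NotImplementedError

def rag_answer(question: str, documents: list[str], top_k: int = 3) -> str:
    doc_vecs = np.stack([embed(d) for d in documents])
    q_vec = embed(question)
    # Cosine similarity between the question and every document.
    scores = doc_vecs @ q_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec) + 1e-9
    )
    context = "\n\n".join(documents[i] for i in np.argsort(scores)[::-1][:top_k])
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return llm_generate(prompt)
```

The retrieval step is where enterprise data access controls can be enforced, since only approved documents ever reach the model.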
Lesson 2: LLMs are not just for text generation
In addition to text generation, LLMs are state-of-the-art for most natural language processing (NLP) tasks, such as identifying user intent, classification, semantic search, and sentiment analysis. Language models also power the text understanding at the heart of text-to-image generators such as DALL-E and Stable Diffusion. For enterprises, being creative with LLMs and applying them across different tasks helps ensure a robust solution for all potential use cases.
For example, in customer support, you’ve likely heard “This call may be recorded for training purposes.” Telecommunications companies are using NLP to analyze ways to improve customer experiences. In addition, enterprises use automated systems that direct customers to the proper support representative based on verbal prompts—that’s also NLP in action.
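As a small illustration of these non-generative uses, the sketch below applies pre-trained models to sentiment analysis and intent routing using Hugging Face transformers pipelines; the sample text and candidate labels are invented for illustration.

```python
# Sentiment analysis and zero-shot intent routing with pre-trained language models.
from transformers import pipeline

# Sentiment analysis on a support transcript snippet.
sentiment = pipeline("sentiment-analysis")
print(sentiment("I've been on hold for an hour and nobody can fix my router."))

# Zero-shot intent classification: route the caller without training a custom model.
router = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
print(router(
    "I want to dispute a charge on last month's bill",
    candidate_labels=["billing", "technical support", "cancellation", "sales"],
))
```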
Lesson 3: Open-source LLMs are limited
There are more than 300,000 models and counting on HuggingFace.co, most of them openly available and backed by an active developer community. Despite rapid development and improvement, open-source LLMs, while sophisticated, still have limitations, and with both open-source and proprietary models you must do your due diligence. Because LLMs are built to handle broad, complex tasks, inherent limitations, such as bounded context windows, can surface when working with large data volumes.
One workaround is to build a system of multiple LLMs that divide the work, using pre-processing techniques and standard machine learning (ML) approaches wherever possible to limit and manage the scope of each LLM task. At the same time, orchestrating several LLMs requires care: if each model leans too heavily on the output of another, errors can accumulate across the chain.
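A sketch of that division of labor, assuming a few hypothetical components (a lightweight topic classifier, a small summarization LLM, a larger answering LLM, and a grounding check), might look like this:

```python
# Multi-model pipeline sketch: cheap pre-processing and standard ML narrow each
# task before an LLM is invoked, and outputs are validated between stages so one
# model's mistake doesn't compound in the next. All helpers are hypothetical
# stand-ins for whatever components an enterprise actually deploys.

def classify_topic(text: str) -> str:
    """Hypothetical lightweight classifier (rules or classic ML, not an LLM)."""
    return "billing" if "invoice" in text.lower() else "general"

def small_llm_summarize(text: str) -> str:
    """Hypothetical call to a small, cheap LLM that condenses the input."""
    raise NotImplementedError

def large_llm_answer(topic: str, summary: str) -> str:
    """Hypothetical call to a larger LLM, scoped to a single topic."""
    raise NotImplementedError

def is_grounded(answer: str, source: str) -> bool:
    """Hypothetical check that the answer is supported by the source text."""
    raise NotImplementedError

def handle_request(text: str) -> str:
    topic = classify_topic(text)                # cheap pre-processing first
    summary = small_llm_summarize(text)         # stage 1: condense the input
    answer = large_llm_answer(topic, summary)   # stage 2: answer within scope
    # Validate between stages so errors don't accumulate silently.
    return answer if is_grounded(answer, text) else "ESCALATE_TO_HUMAN"
```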
Lesson 4: Input data sources are as important as output
At Dell Technologies and Intel, we are focused on improving customer outcomes. Generating high-quality LLM output depends on reliable, well-formatted, and relevant input data when customizing LLMs. In practice, more time should be spent organizing and preparing data sources than adjusting model parameters.
Leveraging structures that improve data representation, such as knowledge graphs, advanced parsing, and entity recognition, can significantly improve results. LLMs should be used not only to produce better output but also to understand and refine the input.
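As one example of this input-side work, the sketch below uses spaCy’s pre-trained entity recognizer to pull named entities out of raw text so downstream prompts receive organized, relevant context rather than an unstructured blob; the sample sentence is illustrative.

```python
# Extract named entities as lightweight structure for downstream prompts.
# Assumes spaCy with its small English model (python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")

def extract_entities(text: str) -> dict[str, list[str]]:
    """Group named entities by type (organizations, places, dates, ...)."""
    doc = nlp(text)
    entities: dict[str, list[str]] = {}
    for ent in doc.ents:
        entities.setdefault(ent.label_, []).append(ent.text)
    return entities

print(extract_entities(
    "Dell Technologies and Intel announced new AI offerings in Austin in 2023."
))
# e.g. {'ORG': ['Dell Technologies', 'Intel'], 'GPE': ['Austin'], 'DATE': ['2023']}
```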
Lesson 5: Cost is an important, but manageable, part of the equation
As noted above, training GPT-3 and GPT-4 is rumored to have required very expensive, supercomputer-class infrastructure and lengthy training runs. This highlights one of the major constraints facing LLMs and generative AI: cost.
Training LLMs is expensive and energy-intensive, and running inference on models with 100+ billion parameters is also very costly. A query to ChatGPT takes far more energy and compute than a typical search engine request. Few enterprises can afford to buy a supercomputer, or use one as a service, to develop their own LLMs.
There are ways to run AI services, even generative AI, on less-expensive cloud instances and in on-premises or co-located data centers. Fine-tuning a smaller model on your own data for your specific application can produce a model that is more accurate for that task and performs well with far less computing power.
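A rough back-of-envelope calculation shows why smaller, task-specific models change the economics. It uses the common approximation that generating one token costs about two floating-point operations per model parameter; real costs also depend on memory bandwidth, batching, and hardware, so treat the numbers as indicative only.

```python
# Back-of-envelope inference cost comparison: ~2 FLOPs per parameter per token.
# Indicative only; actual cost depends on hardware, batching, and memory bandwidth.

def flops_per_response(params: float, tokens: int = 500) -> float:
    return 2 * params * tokens

big = flops_per_response(100e9)    # ~100B-parameter general model
small = flops_per_response(7e9)    # ~7B-parameter model tuned for one task

print(f"100B model: {big:.2e} FLOPs per 500-token response")
print(f"  7B model: {small:.2e} FLOPs per 500-token response")
print(f"Ratio: ~{big / small:.0f}x less compute for the smaller model")
```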
Lesson 6: Use your unique problem to your advantage
Using custom, open-source, and on-premises generative AI and LLM models is an opportunity: enterprises can build tailor-made solutions based on their specific demands. Another tip is to invest in a good user interface that captures rich input, guides the user through the system, and evaluates the output to ensure it is meaningful and relevant. Much of the LLM development and deployment work involves experimentation and creative use of prompts.
It is also important to understand that not every problem needs a generative AI solution or even an AI solution. Focusing on specific, unique needs creates opportunities to match models to the application, retrain on precise data sets, and craft tailor-made applications. At Dell Technologies and Intel, we’ve learned not to be constrained by traditional uses and to be open to a world of possibilities when exploring generative AI models.
Walking forward together
Generative AI and LLMs promise to bring incredible transformation to the enterprise world. To embrace this power and potential, enterprises must customize approaches and tailor LLMs with new ways of doing and thinking. Based on our hands-on experience at Dell Technologies and Intel, we are well-positioned to walk along with our customers on their generative AI journey.
See “Putting AI to Work: Generative AI Meets the Enterprise.”
View “Building the Generative AI-Driven Enterprise: Today’s Use Cases.”
Read more about Dell AI solutions and the latest Intel MLPerf results here.
[1] https://jaxon.ai/list-of-companies-that-have-banned-chatgpt/
[2] https://medium.com/codex/gpt-4-will-be-500x-smaller-than-people-think-here-is-why-3556816f8ff2#:~:text=The%20creation%20of%20GPT%2D3,GPUs%2C%20would%20take%2053%20Years.
[3] https://levelup.gitconnected.com/gpt-4-parameters-explained-everything-you-need-to-know-e210c20576ca