The Right Stuff: The Role of MLOps in AI Success
Great teams incorporate a variety of skill sets. For example, a football team consisting of 11 quarterbacks would get crushed in a game against talented linemen, running backs and receivers. It’s no different when building a team for an enterprise AI project; you can’t just throw a bunch of data scientists into a room and expect them to come up with a revenue-generating or efficiency-improving project without support from other members of the enterprise.
Interestingly, many companies do just that, creating a disconnect between data science teams and IT/DevOps when it comes to AI development. This gap is a significant reason why AI pilot projects fail.
“AI projects are a team sport and should include a multidisciplinary team spanning business analysts, data engineering, data science, application development, and IT operations and security,” according to Moor Insights & Strategy in a September 2021 report titled “Hybrid Cloud is the Right Infrastructure for Scaling Enterprise AI.”
The biggest divide between data scientists and IT often centers around the tools necessary to develop AI models.
“Many IT organizations try to build a killer, one-stop solution that fits all needs,” says Michael Balint, principal product architect at NVIDIA. “For example, many prefer to develop with deep learning frameworks such as PyTorch on a dedicated system, while others schedule their work using Slurm or Kubeflow. IT is often left scratching their heads about how they can consolidate everything into one solution.”
Yet forcing everything into a single solution can be a disaster for AI projects, Balint warns. “This is such a nascent area that if you’re in IT and you try to pull the trigger on one solution, you might be missing out on functionality that a data scientist or data engineer might need to get their job done. Data scientists would really love to just build models and do real core data science. They get frustrated when they don’t have the tools to do that, and the blame gets put on IT.”
MLOps to the rescue
The better approach is to have IT work with the data science groups on bridging the gap through processes and tools such as MLOps. These can provide enterprises with governance, security and collaboration through features such as tracking and repeatability. MLOps platforms can orchestrate the collection of artifacts, compute infrastructure and processes that are needed to deploy and maintain AI-based models. Many MLOps systems can also evaluate the accuracy of models in order to retrain and redeploy as needed.
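To make the evaluate-and-retrain idea concrete, here is a minimal Python sketch of the kind of check an MLOps pipeline might run against a deployed model. It is an illustration only, not any particular platform's API: the names `EvaluationResult` and `evaluate_model` and the 0.9 accuracy threshold are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class EvaluationResult:
    """Outcome of one accuracy check on a deployed model (hypothetical type)."""
    accuracy: float
    needs_retraining: bool


def evaluate_model(
    predictions: Sequence[int],
    labels: Sequence[int],
    min_accuracy: float = 0.9,  # assumed threshold; a real team would pick its own
) -> EvaluationResult:
    """Compare predictions with ground-truth labels and flag the model
    for retraining when accuracy falls below the threshold."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot evaluate a model on an empty sample")

    # Fraction of predictions that match the true label.
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    accuracy = correct / len(labels)

    return EvaluationResult(
        accuracy=accuracy,
        needs_retraining=accuracy < min_accuracy,
    )


if __name__ == "__main__":
    # Three of four predictions are right, so accuracy is 0.75,
    # which is below the assumed 0.9 threshold.
    result = evaluate_model([1, 0, 1, 1], [1, 0, 0, 1])
    print(result)
```

In practice the retraining flag would feed a scheduler or pipeline step rather than a print statement, and accuracy is only one of several metrics a team might monitor.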
“Organizations can increase the percentage of models that are successfully deployed in production by implementing MLOps tooling, which aids in managing data science users, data, model versions, and experiments,” says Moor Insights. “The tooling should also allow IT teams to manage the develop-to-deploy cycle with the same DevOps rigor as traditional enterprise apps.”
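As a rough illustration of what managing model versions and experiments can mean at the smallest scale, the Python sketch below records each training run with a version number and a fingerprint of its configuration. The `ExperimentLog` class, its field names, and the example model names are hypothetical, invented for this sketch; real MLOps tools offer far richer tracking.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ExperimentLog:
    """In-memory record of training runs (hypothetical, for illustration)."""
    runs: List[Dict[str, Any]] = field(default_factory=list)

    def record(
        self,
        model_name: str,
        params: Dict[str, Any],
        metrics: Dict[str, float],
    ) -> Dict[str, Any]:
        # Hash the sorted parameters so that two runs with the same
        # configuration get the same identifier, whatever the key order.
        config_id = hashlib.sha256(
            json.dumps(params, sort_keys=True).encode("utf-8")
        ).hexdigest()[:12]

        # Version numbers count up separately for each model name.
        version = 1 + sum(1 for r in self.runs if r["model_name"] == model_name)

        run = {
            "model_name": model_name,
            "version": version,
            "config_id": config_id,
            "params": dict(params),
            "metrics": dict(metrics),
        }
        self.runs.append(run)
        return run


if __name__ == "__main__":
    log = ExperimentLog()
    first = log.record("churn-model", {"lr": 0.1, "depth": 3}, {"accuracy": 0.81})
    second = log.record("churn-model", {"lr": 0.05, "depth": 3}, {"accuracy": 0.84})
    print(first["version"], second["version"])
```

Keeping the configuration fingerprint next to the metrics is what makes a run repeatable: anyone on the data science or IT side can see exactly which settings produced which result.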
This approach can help companies bridge the divide between the data and IT sides.
“A few years ago there was emphasis on deep learning engineers and data scientists as the heroes of the industry,” says Balint. “I think the unsung heroes are the DevOps and MLOps engineers that sit in the IT group, because you need to build the right solutions and stacks for everybody else to do their job. If you don’t have that, you can’t move very quickly.”
Go here to get more information about AI model development using DGX™-Ready Software on NVIDIA DGX Systems, powered by DGX A100 Tensor Core GPUs and AMD EPYC™ CPUs.