How open source is steering AI down the high road


Linux Foundation’s Jim Zemlin at Open Source China 2024.

Steven Vaughan-Nichols/ZDNET

HONG KONG — At the Open Source Summit China, Jim Zemlin, the Linux Foundation’s executive director, said that everyone he’s been talking to in China wants to talk about artificial intelligence (AI). Why should China be different from anywhere else? 

Zemlin went on to highlight his organization’s significant contributions to AI development through open-source software initiatives. He pointed out several key areas where open-source principles are enhancing AI development: 

Also: Can AI even be open source? It’s complicated

Fine-tuned specialized models: The Linux Foundation is actively working on projects like the Open Platform for Enterprise AI, which aims to create standards for deploying specialized AI models in enterprise settings. This initiative seeks to facilitate collaboration and streamline the deployment of AI technologies.

“In Beijing,” Zemlin said, “I saw a talk from Alibaba [which] was creating an AI application for early detection of pancreatic cancer. This application is already saving lives in China by helping to detect pancreatic cancer as early as possible.” Now, that’s impressive.

Large language models (LLMs): Semi-open-source models such as Mistral and Llama 3 are rapidly evolving and often rival their purely proprietary counterparts. The Foundation supports these developments, allowing organizations to leverage powerful AI tools without the constraints of closed systems.

“Platforms such as Hugging Face,” Zemlin continued, “are the clear leaders here. There’s a whole ecosystem of open models that people can download and utilize. These enable developers to access and utilize a wide range of AI applications.”

AI safety: “Open-source development’s transparent nature is particularly beneficial for addressing AI safety concerns,” Zemlin noted. “The Linux Foundation aims to combat issues such as content authenticity, privacy, and algorithmic bias by fostering collaboration on tools and standards.” 

Linux Foundation’s AI initiatives

The Linux Foundation, he continued, is spearheading several initiatives that underscore its commitment to fostering open-source AI. These include:

Open Model Initiative (OMI): This project promotes the development of AI models under irrevocable open licenses, removing barriers to enterprise adoption and encouraging widespread use. OMI is also meant to stop companies from closing off once-open models. 

Acumos AI: An open-source platform designed for building, sharing, and deploying AI applications, Acumos standardizes the infrastructure necessary for AI development, making it easier for developers to innovate.

Also: A new White House report embraces open-source AI

PyTorch: As one of the Foundation’s fastest-growing projects, PyTorch is the preferred tool for creating machine learning models and LLMs, further solidifying the Foundation’s role in AI development.

Unified Acceleration Foundation: This initiative aims to create a common acceleration API that can be utilized across various silicon architectures, promoting competition and simplifying development for AI applications.

Coalition for Content Provenance and Authenticity: This effort focuses on ensuring content authenticity through digital watermarking, a crucial aspect in a world increasingly influenced by generative AI technologies.

Zemlin also emphasized the importance of establishing a clear definition of “open” in the context of AI. While the Open Source Initiative (OSI) is doing the yeoman’s work of defining open-source AI, the Linux Foundation has developed the Model Openness Framework (MOF).

“MOF,” Zemlin explained, “is a way to help evaluate if a model is open or not open. It allows people to grade models. People always ask, ‘Is Llama 3 really open? Is this particular model really open? I don’t get the data. I’m not sure, really, how it was trained.'”

Also: Sonos is failing and millions of devices could go with it – why open-source audio is our only hope

MOF provides an open framework to help answer those questions — no easy task given the many moving parts in LLM production and deployment. The Linux Foundation has created a grading system to help understand which components are open and included in a model. 

Zemlin continued, “We agreed on three different classes of openness. The highest level, level one, is an open science definition where the data, every component that was used, and all of the instructions needed to actually go and create your own model the exact same way are open. Level two is a subset of that, where not everything is actually open, but most of it is. Then, on level three, you have areas where the data may not be available, but the data that describes the data sets would be available. And you can kind of understand that even though the model is open, not all the data is available.”

He concluded, “This is a great way for you to all take a risk-based approach, a more nuanced approach to understanding what is open and not. It enables you to evaluate the openness of any particular model based on various components, including data access, model architecture, and training processes. This framework allows practitioners to assess models’ transparency and make informed decisions about their use.”
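To make the three-class scheme concrete, here is a minimal, hypothetical sketch of how such a grading might be modeled in code. This is not the Linux Foundation’s official MOF tooling; the component names and the classification rules are assumptions drawn only from Zemlin’s description above.

```python
# Hypothetical sketch of MOF-style openness grading (NOT the official tool).
# Class 1: everything needed to recreate the model is open.
# Class 2: most, but not all, components are open.
# Class 3: the raw training data is unavailable, though descriptions
#          of the data sets are open.

# Component names are illustrative assumptions, not MOF's actual taxonomy.
COMPONENTS = (
    "training_data",
    "data_description",
    "weights",
    "code",
    "training_instructions",
)

def mof_class(open_components):
    """Return an MOF-style class (1 is highest) for the set of open components."""
    open_set = set(open_components)
    if open_set >= set(COMPONENTS):
        return 1  # open science: fully reproducible from what is published
    if "training_data" not in open_set and "data_description" in open_set:
        return 3  # model is open, but only dataset descriptions are available
    return 2  # a subset: most components open, some withheld

# Example: a model shipping weights, code, and dataset descriptions only
print(mof_class({"weights", "code", "data_description"}))  # prints 3
```

A tool like this would let a practitioner turn “is this model really open?” into a checklist of which specific components were published.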

Also: The best Linux laptops: Expert tested and reviewed

Having spent a lot of time talking to experts about open source and AI, I expect this graded model will become the standard in the years to come. 

Put it all together, and the Linux Foundation’s initiatives are not only advancing AI technologies but also ensuring that these advancements are made ethically and responsibly. By promoting open-source collaboration, the Foundation is creating an inclusive environment where everyone, not just huge companies with budgets to match, can contribute to and benefit from AI innovations. 




