UK AI Safety Institute: A Blueprint for the Future of AI?
The UK Frontier AI Taskforce, a government-funded initiative launched in April 2023 as the Foundation Model Taskforce, is evolving to become the UK AI Safety Institute.
British Prime Minister Rishi Sunak announced the creation of the Institute during his closing speech at the AI Safety Summit, held in Bletchley Park, England, on November 2, 2023.
He said the UK government’s ambition for this new entity is to make it a global hub tasked with testing the safety of emerging types of AI.
“The Institute will carefully test new types of frontier AI before and after they are released to address the potentially harmful capabilities of AI models, including exploring all the risks, from social harms like bias and misinformation, to the most unlikely but extreme risk, such as humanity losing control of AI completely,” said the UK government in a public statement.
To pursue this mission, the UK AI Safety Institute will partner with domestic organizations like the Alan Turing Institute, Imperial College London, TechUK and the Startup Coalition. All have welcomed the launch of the Institute.
It will also engage with private AI companies both in the UK and abroad. Some of them, such as Google DeepMind and OpenAI, have already publicly backed the initiative.
Confirmed Partnerships with the US and Singapore
Sunak added that the Institute will be at the forefront of the UK government’s AI strategy and will be tasked with cementing the country’s position as a world leader in AI safety.
In undertaking this role, the UK AI Safety Institute will partner with similar institutions in other countries.
The Prime Minister has already announced two confirmed partnerships to collaborate on AI safety testing with the recently announced US AI Safety Institute and with the Government of Singapore.
Ian Hogarth, chair of the Frontier AI Taskforce, will continue as chair of the Institute. The External Advisory Board for the Taskforce, made up of industry heavyweights in fields ranging from national security to computer science, will now advise the new global hub.
Eight AI Firms Agree to Pre-Deployment Testing of Their Models
Additionally, Sunak announced that several countries, including Australia, Canada, France, Germany, Italy, Japan, Korea, Singapore, the US and the UK, as well as the EU delegation, signed an agreement to test leading companies’ AI models.
To help with this mission, eight companies involved in AI development – Amazon Web Services (AWS), Anthropic, Google, Google DeepMind, Inflection AI, Meta, Microsoft, Mistral AI and OpenAI – have agreed to “deepen” access to their future AI models before they go public.
The Prime Minister closes out the AI Safety Summit by announcing a landmark agreement with eight major AI companies and likeminded countries on the role of government in pre-deployment testing of the next generation of models for national security and other major risks
— Matt Clifford (@matthewclifford) November 2, 2023
On X, the non-profit PauseAI, which actively calls for banning all AI models without proper legislative safeguards in place, called this agreement “a step in the right direction.”
However, it added that relying solely on pre-deployment testing is dangerous.
The reasons outlined are:
- Models can be leaked (e.g. Meta’s LLaMA model).
- Testing for dangerous capabilities is difficult. “We don’t know how we can (safely) test if an AI can self-replicate, for example. Or how to test if it deceives humans,” said PauseAI.
- Bad actors can still build dangerous AIs – and pre-deployment testing cannot prevent it from happening.
- Some capabilities are dangerous even inside AI labs. “A self-replicating AI, for example, could escape from the lab before deployment,” wrote PauseAI.
- Capabilities can be added or discovered after training, for example through fine-tuning, jailbreaking, and runtime improvements.