Google DeepMind Proposes AI ‘Monitors’ to Police Hyperintelligent Models

Google DeepMind has introduced a new approach to securing frontier generative AI in a paper released on April 2. The paper identifies four key risk areas, "misuse, misalignment, mistakes, and structural risks," and DeepMind's proposals focus on the first two.
DeepMind is looking beyond current frontier AI to artificial general intelligence (AGI): AI with human-level capabilities, which could revolutionize healthcare and other industries or trigger technological chaos. There is some skepticism over whether AGI of that magnitude will ever exist.
Asserting that human-like AGI is imminent and must be prepared for is a hype strategy as old as OpenAI, which started out with a similar mission statement in 2015. Although panic over hyperintelligent AI may not be warranted, research like DeepMind’s contributes to a broader, multipronged cybersecurity strategy for generative AI.
Preventing bad actors from misusing generative AI
Misuse and misalignment are the two risk factors that would arise on purpose: misuse involves a malicious human threat actor, while misalignment describes scenarios where the AI follows instructions in ways that make it an adversary. “Mistakes” (unintentional errors) and “structural risks” (problems arising, perhaps from conflicting incentives, with no single actor) complete the four-part framework.
To address misuse, DeepMind proposes the following strategies:
- Locking down the model weights of advanced AI systems
- Conducting threat modeling research to identify vulnerable areas
- Creating a cybersecurity evaluation framework tailored to advanced AI
- Exploring other, unspecified mitigations
DeepMind acknowledges that misuse already occurs with today's generative AI, from deepfakes to phishing scams. It also cites the spread of misinformation, manipulation of popular perceptions, and "unintended societal consequences" as present-day concerns that could scale up significantly if AGI becomes a reality.
SEE: OpenAI raised $40 billion at a $300 billion valuation this week, but some of the money is contingent on the organization going for-profit.
Preventing generative AI from taking unwanted actions on its own
Misalignment could occur when an AI conceals its true intent from users or bypasses security measures as part of a task. DeepMind suggests that "amplified oversight" — testing an AI's output against its intended objective — might mitigate such risks. Implementing this is challenging, however: what types of example situations should an AI be trained on? DeepMind is still exploring that question.
One proposal involves deploying a "monitor," another AI system trained to detect actions that don't align with DeepMind's goals. Given the complexity of generative AI, such a monitor would need precise training to distinguish acceptable actions from harmful ones and to escalate questionable behavior for human review.
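To make the monitor idea concrete, here is a minimal sketch of that routing pattern in Python. Everything in it is illustrative, not from DeepMind's paper: the `toy_score` function stands in for a trained monitor model, and the thresholds are arbitrary. The point is only the three-way outcome the article describes: clearly acceptable actions pass, clearly misaligned ones are blocked, and ambiguous ones are escalated to a human.

```python
# Hypothetical sketch of the "monitor" pattern: a second system screens a
# model's proposed actions and escalates uncertain cases for human review.
# Names, scores, and thresholds are illustrative assumptions.

from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    action: str
    score: float   # monitor's confidence the action is acceptable (0 to 1)
    decision: str  # "allow", "escalate", or "block"


def monitor(
    action: str,
    score_fn: Callable[[str], float],
    allow_above: float = 0.9,
    block_below: float = 0.3,
) -> Verdict:
    """Route a proposed action based on a monitor model's acceptability score.

    score_fn stands in for a trained monitor; here it is any callable that
    returns a confidence that the action matches the intended objective.
    """
    score = score_fn(action)
    if score >= allow_above:
        decision = "allow"      # clearly aligned with the objective
    elif score <= block_below:
        decision = "block"      # clearly misaligned
    else:
        decision = "escalate"   # ambiguous: send to a human reviewer
    return Verdict(action, score, decision)


# Toy stand-in for a trained monitor: flags actions mentioning credentials.
def toy_score(action: str) -> float:
    return 0.1 if "password" in action.lower() else 0.95


print(monitor("summarize the quarterly report", toy_score).decision)  # allow
print(monitor("email me the admin password", toy_score).decision)     # block
```

The middle band is the design choice the article hints at: rather than forcing a binary allow/block call, the monitor reserves a band of uncertainty where questionable behavior goes to a person instead of being decided by the model alone.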