OpenAI used to test its AI models for months – now it's days. Why that matters

On Thursday, the Financial Times reported that OpenAI has dramatically shortened its safety testing timeline.
Eight people, either staff at the company or third-party testers, told the FT that they had “just days” to complete evaluations of new models, a process for which they say they would normally be given “several months.”
Competitive edge
Evaluations are the process that can surface model risks and other harms, such as whether a user could jailbreak a model into providing instructions for creating a bioweapon. For comparison, sources told the FT that OpenAI gave them six months to review GPT-4 before it was released, and that they only uncovered concerning capabilities two months into that process.
Sources added that OpenAI’s tests are not as thorough as they used to be, and that testers lack the time and resources needed to properly catch and mitigate risks. “We had more thorough safety testing when [the technology] was less important,” one person currently testing o3, the full version of o3-mini, told the FT. They described the shift as “reckless” and “a recipe for disaster.”
The sources attributed the rush to OpenAI’s desire to maintain a competitive edge, especially as open-weight models from competitors, like Chinese AI startup DeepSeek, gain ground. OpenAI is rumored to be releasing o3 as early as next week, a deadline that the FT’s sources say has compressed the testing timeline to under a week.
No regulation
The shift underscores the fact that there is still no government regulation of AI models, including any requirement to disclose model harms. Companies including OpenAI signed voluntary agreements with the Biden administration to conduct routine testing with the US AI Safety Institute, but those agreements have quietly fallen away as the Trump administration has reversed or dismantled Biden-era AI initiatives.
However, during the open comment period for the Trump administration’s forthcoming AI Action Plan, OpenAI advocated for a similar arrangement in order to avoid navigating a patchwork of state-by-state legislation.
Outside the US, the EU AI Act will require companies to risk-test their models and document the results.
“We have a good balance of how fast we move and how thorough we are,” Johannes Heidecke, head of safety systems at OpenAI, told the FT. Testers themselves seemed alarmed, though, especially given other holes in the process: evaluations are sometimes run on earlier, less advanced versions of a model than the one ultimately released to the public, or a new model’s risks are inferred from an earlier model’s capabilities rather than tested directly.