OpenAI slashes AI model safety testing time

CALIFORNIA, UNITED STATES — OpenAI has reduced the time and resources allocated to safety testing its most powerful artificial intelligence (AI) models, sparking internal and external concern that its latest releases are being rushed without adequate safeguards.
The San Francisco-based company, valued at $300 billion, is reportedly pushing to launch its next-generation model, dubbed “o3.” But according to multiple sources familiar with the process, safety evaluators—who previously had months to conduct thorough testing—now often receive just days to flag potential risks.
“We had more thorough safety testing when [the technology] was less important,” said one current evaluator, calling the process “a recipe for disaster.”
Competitive pressures driving shortcuts
The accelerated schedule appears to be fueled by mounting pressure to stay ahead in the AI arms race, as OpenAI faces stiff competition from Big Tech rivals such as Google and Meta, as well as Elon Musk’s xAI.
One person involved in testing o3 said the model is designed for high-level reasoning and problem-solving tasks—capabilities that carry greater risk of misuse.
“Because there is more demand for it, they want it out faster,” the person said. “I hope it is not a catastrophic mis-step, but it is reckless. This is a recipe for disaster.”
OpenAI acknowledged making its testing processes more efficient through automation but insisted the models are still thoroughly evaluated for catastrophic risks.
Former OpenAI staff and experts warn of risks
Critics argue OpenAI is not living up to its own safety standards. Former OpenAI safety researcher Steven Adler warned that the company may be underestimating the worst risks by skipping key evaluations, such as fine-tuning tests on its most advanced models to simulate misuse scenarios—like engineering a more transmissible virus.
“It is great OpenAI set such a high bar by committing to testing customised versions of their models. But if it is not following through on this commitment, the public deserves to know,” Adler said.
Meanwhile, former staff members have raised concerns about OpenAI’s practice of testing earlier “checkpoints” rather than final versions of models released to the public. “It is bad practice to release a model which is different from the one you evaluated,” said a former technical staff member.
Looming AI regulation may change the game
There are currently no global standards for AI model testing. However, the EU’s AI Act will require mandatory safety checks for high-risk systems. Until then, most safety efforts remain voluntary.
“We have a good balance of how fast we move and how thorough we are,” said Johannes Heidecke, head of OpenAI’s safety systems.