OpenAI unveils AI model o1 with self-fact-checking, advanced reasoning skills
CALIFORNIA, UNITED STATES — OpenAI, the company behind ChatGPT, has unveiled its latest innovation in the field of artificial intelligence: the o1 model.
This new family of AI models, including o1-preview and o1-mini, represents a significant leap forward in generative AI technology, particularly in its ability to fact-check itself and engage in complex reasoning.
Enhanced reasoning capabilities of OpenAI o1
The o1 model stands out for its ability to “think” before responding to queries.
According to Noam Brown, a research scientist at OpenAI, “o1 is trained with reinforcement learning.” This approach teaches the system to develop a “private chain of thought” before providing an answer, resulting in more accurate and well-reasoned responses.
In a post dated September 12, 2024, Brown (@polynoamial) wrote: “o1 is trained with RL to ‘think’ before responding via a private chain of thought. The longer it thinks, the better it does on reasoning tasks. This opens up a new dimension for scaling. We’re no longer bottlenecked by pretraining. We can now scale inference compute too.”
OpenAI claims that o1 excels in tasks requiring the synthesis of multiple subtasks, such as detecting privileged emails or brainstorming marketing strategies.
The model’s performance improves with additional processing time, allowing it to reason through problems holistically.
Impressive performance metrics of OpenAI o1
In various tests, o1 has shown remarkable improvements over its predecessors:
- Solved 83% of problems in an International Mathematical Olympiad qualifying exam, compared to GPT-4o’s 13%
- Reached the 89th percentile in Codeforces programming challenges
- Demonstrated enhanced multilingual skills, particularly in Arabic and Korean
In a post dated September 12, 2024, OpenAI (@OpenAI) showed o1 coding a video game from a prompt.
These advances come at a price: o1-preview costs $15 per 1 million input tokens and $60 per 1 million output tokens, significantly more than previous models.
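To put those per-token prices in concrete terms, here is a minimal Python sketch that estimates the cost of a single request. The two prices come from the figures quoted above; the function name `estimate_cost` and the token counts in the example are hypothetical and chosen only for illustration.

```python
# Prices quoted in this article for o1-preview, in US dollars per 1 million tokens.
INPUT_PRICE_PER_MILLION = 15.00
OUTPUT_PRICE_PER_MILLION = 60.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD of one request (illustrative helper)."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical request: 2,000 input tokens and 5,000 output tokens.
print(f"${estimate_cost(2_000, 5_000):.2f}")  # prints $0.33
```

In this made-up example, the output tokens account for $0.30 of the $0.33 total, because each output token costs four times as much as an input token.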
The model is also slower to respond, taking over 10 seconds for complex queries. Some testers reported that o1 may confidently provide incorrect information more often than GPT-4o.
OpenAI is not alone in its efforts to improve AI reasoning. Google DeepMind researchers have published studies suggesting that providing models with more time to process requests can significantly enhance performance. The competition is fierce, and OpenAI faces the challenge of making o1 more widely accessible while addressing its current limitations.