OpenAI Announces Next-Generation AI Models O3 and O3 Mini

This is Hamamoto from TIMEWELL

OpenAI Announces O3 and O3 Mini on the Final Day of Its 12-Day Event

On December 21, 2024 (around 3:00 AM Japan time), the final day of its 12-day event for announcing new features and models, OpenAI unveiled the next-generation AI models O3 and O3 Mini. According to OpenAI, these models significantly outperform the preceding O1 model, with particularly strong results in programming and mathematics. OpenAI hopes these models will mark the dawn of a new era in artificial intelligence.

This article provides a detailed breakdown of the performance of the next-generation AI models O3 and O3 Mini.

Contents:
O3 and O3 Mini's Remarkable Performance
Record-Breaking Results on the ARC AGI Benchmark
What Comes Next
Summary

O3 and O3 Mini's Remarkable Performance

The next-generation AI models O3 and O3 Mini have delivered stunning results across a variety of benchmark tests, particularly in programming and mathematics — far surpassing the performance of the predecessor O1 model.

Strong Performance on Software-Style Benchmarks

On the SWE-bench Verified benchmark, which consists of real-world software engineering tasks, O3 achieved 71.7% accuracy, an improvement of 22.8 percentage points over O1. In competitive programming, it reached a Codeforces Elo rating of 2,727, surpassing the 2,665 rating of OpenAI's Chief Scientist. In mathematics, O3 achieved a 96.7% correct answer rate on AIME 2024 (the American Invitational Mathematics Examination, a qualifying exam for the USA Mathematical Olympiad), far exceeding O1's 83.3%.
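To put these figures in context, here is a minimal Python sketch that works through the arithmetic. It uses only the numbers quoted above. The O1 SWE-bench score is derived from the stated gain rather than quoted directly, the function names are illustrative, and the win-probability calculation assumes the standard Elo expected-score formula, which is only an approximation of how Codeforces ratings behave.

```python
# Figures quoted in the article. O1's SWE-bench score is derived, not quoted.
O3_SWE_BENCH = 71.7   # percent accuracy on SWE-bench Verified
GAIN_OVER_O1 = 22.8   # improvement in percentage points


def implied_baseline(score: float, gain_points: float) -> float:
    """Return the baseline implied by a score and a gain in percentage points."""
    return round(score - gain_points, 1)


def relative_gain(score: float, baseline: float) -> float:
    """Return the relative improvement of score over baseline, in percent."""
    return round((score - baseline) / baseline * 100, 1)


def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of player A against player B (assumption)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


o1_swe_bench = implied_baseline(O3_SWE_BENCH, GAIN_OVER_O1)
print(f"Implied O1 SWE-bench score: {o1_swe_bench}%")
print(f"Relative gain over O1: {relative_gain(O3_SWE_BENCH, o1_swe_bench)}%")
print(f"Expected score, 2727 vs 2665: {elo_expected_score(2727, 2665):.2f}")
```

Under these assumptions, a gain of 22.8 percentage points corresponds to a relative improvement of a little under half over O1, while the 62-point Elo gap translates to only a modest edge in a head-to-head contest.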

Furthermore, O3 recorded over 25% accuracy on Epoch AI's FrontierMath benchmark, currently considered the most difficult mathematics benchmark. This is an impressive result given that other AI models had scored below 2%.

O3 Mini likewise performed strongly, matching or exceeding O1 Mini in both programming and mathematics at significantly lower cost.

Record-Breaking Results on the ARC AGI Benchmark

O3 also set a new record on the ARC AGI benchmark, a test that AI models had long struggled with. ARC AGI is designed to measure progress toward AGI (Artificial General Intelligence) using tasks that are easy for humans but difficult for AI. Until now, humans had averaged around 85% correct, while the best AI scores hovered around 30%.

O3 Surpasses Human-Level Performance on ARC AGI

On the ARC AGI semi-private evaluation set, O3 achieved 75.7% accuracy under low-compute settings, placing first on the public leaderboard. Under high-compute settings, it reached 87.5%, roughly three times the previous best AI scores of around 30% and above the human average of 85%.
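The comparison can be checked with a short Python sketch. It assumes the reference points quoted in this article (a prior best AI score of about 30% and a human average of 85%); the constant and function names are illustrative only.

```python
# Reference points quoted in the article (approximate values).
PRIOR_BEST_AI = 30.0    # percent, previous best AI score
HUMAN_BASELINE = 85.0   # percent, human average

O3_SCORES = {"low_compute": 75.7, "high_compute": 87.5}


def compare(score: float,
            prior_best: float = PRIOR_BEST_AI,
            human: float = HUMAN_BASELINE) -> dict:
    """Compare one ARC AGI score against the prior AI best and the human average."""
    return {
        "ratio_to_prior_best": round(score / prior_best, 2),
        "points_vs_human": round(score - human, 1),
        "meets_human_baseline": score >= human,
    }


for setting, score in O3_SCORES.items():
    print(setting, compare(score))
```

With these inputs, only the high-compute result clears the human baseline, by 2.5 points, while the low-compute result falls 9.3 points short of it.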

This marks the first time an AI model has achieved human-level performance on ARC AGI. Greg Kamradt, President of the ARC Prize Foundation, stated that this result is an important milestone toward AGI and said he looks forward to further collaboration with OpenAI.

What Comes Next

O3 and O3 Mini are not yet publicly available. OpenAI is conducting internal safety testing and giving external researchers access to verify safety before proceeding with a broader release.

Safety and security researchers can apply for early access. By filling out an application form on OpenAI's website, they can participate in safety testing of O3 and O3 Mini and be among the first to evaluate these next-generation models. (Applications were accepted through January 10.)

Public Release Timeline

OpenAI has announced a plan to release O3 Mini to the general public at the end of January, with O3 to follow shortly after. However, the release schedule is subject to change depending on the results of safety testing.

OpenAI has also published a report on a new safety technique called "Deliberative Alignment." Traditional safety approaches train a model on examples of safe and unsafe prompts so that it learns the boundary between acceptable and unacceptable content. The new technique instead uses the model's reasoning capabilities to judge the safety of a prompt more accurately. This enables a better tradeoff between safety and performance, paving the way for AI models that are both safer and more capable.

Summary

The next-generation AI models O3 and O3 Mini announced by OpenAI have demonstrated remarkable performance in programming and mathematics, achieving human-level performance on the ARC AGI benchmark — a potential harbinger of a new era in artificial intelligence.

OpenAI is taking careful measures to ensure safety, conducting both internal testing and external researcher evaluations. The company is also working on the new safety technology "Deliberative Alignment," aiming to realize AI models that are both safer and more capable.

Public Launch Dates Subject to Safety Results

O3 Mini is expected to be released publicly at the end of January and O3 shortly after, though timing may change depending on safety test outcomes. These efforts by OpenAI represent an important step in advancing artificial intelligence while ensuring its safety.

Reference: OpenAI official website, "Day 12 — o3 preview & call for safety researchers"
