This Wednesday, OpenAI introduced the next installment in its o-series of models: OpenAI o3 and o4-mini. OpenAI calls these models "the smartest models [they]’ve released to date". Additionally, the startup is making its newly released models even smarter by enabling them to "agentically" use and combine the tools available in ChatGPT, including web search, file and data analysis using Python, reasoning based on visual understanding, and image generation.

Broadly put, what OpenAI means when it says its models can use the tools available in ChatGPT "agentically" is that o3 and o4-mini are trained to determine when and how to utilize these tools to produce detailed responses, rather than having the user specify which tools the model should use with each query. OpenAI says this enables both models to deliver correctly formatted results to complex problems, usually taking less than a minute. In addition to making o3 and o4-mini into strongly performing models that score highly in many popular benchmarks, this approach also paves the way for truly agentic versions of ChatGPT, which can perform tasks on behalf of users.

OpenAI o3: Smarter and more affordable than its predecessor

OpenAI o3 sets the new standard for powerful reasoning, pushing boundaries in coding, mathematics, scientific understanding, and visual perception. Notably, the model has achieved state-of-the-art performance on several competition math and coding benchmarks, including Codeforces, SWE-bench, and AIME 2024/25, without using additional tools.

OpenAI says that external experts found that o3 makes 20% fewer major errors than its predecessor, OpenAI o1, on difficult real-world tasks. According to the startup, hese evaluations also found that o3 is particularly strong performing in areas like programming, business and consulting, and creative ideation. Additionally, early testers have reportedly remarked on the model's analytical capabilities as a thought partner and its ability to generate and critically evaluate novel hypotheses, especially in biology, mathematics, and engineering contexts.

OpenAI o4-mini: Efficient Reasoning

Like most models on the smaller side, o4-mini model offers optimized for fast, cost-efficient reasoning without significant losses in performance. Despite its smaller size, it performs remarkably in mathematics, coding, and visual tasks. Notably, o4-mini is especially proficient at competition math problems, achieving higher scores than o1, o3-mini and o3 on the AIME 2024 and 2025 benchmarks without using additional tools.

According to OpenAI, expert evaluators found that o4-mini outperforms o3-mini, on non-STEM and data science tasks. Moreover, its efficiency supports significantly higher usage limits than o3; this makes o4-mini well-suited for high-volume applications that benefit from quick processing and sophisticated reasoning.

More generally, OpenAI claims the external expert evaluations reveal that both models exhibit improved instruction following capabilities and deliver more useful and verifiable answers thanks to their improved intelligence and the addition of web search. Both o3 and o4-mini are expected to benefit from the recently introduced memory feature in ChatGPT that enables the assistant to reference past conversations.

Visual Reasoning Breakthroughs

The two advancements o3 and o4-mini make in the realm of visual understanding is that both models can integrate images directly into their chain of thought and can manipulate them to make the necessary adjustments to make low-quality, reversed or blurry images readable. These and other improvements, says OpenAI, enables the models to be more accurate on visual understanding tasks and to answer questions that were inaccessible to their predecessors.

Availability and Access

ChatGPT Plus, Pro, and Team users can access o3, o4-mini, and o4-mini-high starting immediately, as the new models are intended to replace o1, o3-mini, and o3-mini-high, respectively. Access for ChatGPT Enterprise and Edu users will be ready in about a week. Free users can try o4-mini by selecting 'Think' in the composer before submitting queries. ChatGPT Pro users will also gain access to o3-pro in the coming weeks as a replacement for o1-pro.

Both models are also available to developers via the Chat Completions API and Responses API. The latter is compatible with reasoning summaries and preserving reasoning tokens around function calls for improved performance. The Responses API is also expected to gain full support of built-in tool use in the near future.