o3
o3 is OpenAI's advanced reasoning model that succeeds o1, delivering stronger chain-of-thought performance on mathematical, scientific, and coding problems with improved efficiency and full tool support.
```typescript
import { streamText } from 'ai'

const result = streamText({
  model: 'openai/o3',
  prompt: 'Why is the sky blue?',
})
```
Playground
Try out o3 by OpenAI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
About o3
o3 was released on April 16, 2025 as the successor to o1 in OpenAI's reasoning model series. It advances the chain-of-thought paradigm introduced by o1-preview: the model generates internal reasoning tokens, working through problems step by step, checking its work, and trying alternative approaches before producing a final answer.
o3 improves on o1 across key reasoning benchmarks while using reasoning tokens more efficiently. It launches with the full set of production API features: function calling for tool use, structured outputs via JSON schema constrained decoding, vision input for reasoning over images and diagrams, and developer system messages for behavioral control.
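As a sketch of structured outputs through the AI SDK's `generateObject` helper, which constrains decoding to a supplied schema (the schema and prompt here are illustrative, not from the source):

```typescript
import { generateObject } from 'ai'
import { z } from 'zod'

// Illustrative schema: ask o3 for a verdict plus a confidence score,
// and let schema-constrained decoding guarantee the output shape.
const { object } = await generateObject({
  model: 'openai/o3',
  schema: z.object({
    answer: z.string(),
    confidence: z.number().min(0).max(1),
  }),
  prompt: 'Is 997 prime? Answer with a confidence score.',
})

// `object` is typed and validated against the schema above.
console.log(object.answer, object.confidence)
```

Function calling follows the same pattern via the SDK's `tools` option; the point in both cases is that o3 supports these features at launch rather than in a later revision.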
The context window of 200K tokens accommodates the lengthy inputs that complex reasoning tasks demand. The model also supports the reasoning_effort parameter, letting you control how deeply it thinks on a per-request basis for efficient handling of mixed-difficulty workloads.
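One way to use the per-request effort control is to route easy and hard prompts differently. A minimal sketch, assuming the AI SDK passes `reasoningEffort` through `providerOptions` (option names may vary by SDK version):

```typescript
import { generateText } from 'ai'

// Cheap, fast path for a simple query: minimal internal reasoning.
const quick = await generateText({
  model: 'openai/o3',
  prompt: 'What is 12 * 8?',
  providerOptions: { openai: { reasoningEffort: 'low' } },
})

// Deliberate path for a hard problem: maximum internal reasoning.
const deep = await generateText({
  model: 'openai/o3',
  prompt: 'Prove that the square root of 2 is irrational.',
  providerOptions: { openai: { reasoningEffort: 'high' } },
})
```

Batching requests by difficulty this way keeps cost and latency proportional to how much deliberation each request actually needs.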
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
Per-provider metrics shown on this page (visit the docs for more info):
- P50 throughput on live AI Gateway traffic, in tokens per second (TPS).
- P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds.
- Direct request success rate on AI Gateway and per-provider.
What To Consider When Choosing a Provider
- Latency: o3 generates internal reasoning tokens that work through problems step by step before producing a visible response. This trades latency for accuracy on hard problems.
- Feature support: Unlike earlier reasoning previews, o3 ships with function calling, structured outputs, vision, and system messages from day one.
- Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
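A minimal sketch of the credential setup, assuming the gateway client reads its key from an environment variable (the variable name is illustrative; check the gateway docs for the exact name your SDK expects):

```shell
# One gateway credential covers every provider routed through it;
# no OpenAI key is embedded in the application.
export AI_GATEWAY_API_KEY="<your-gateway-key>"
```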
When to Use o3
Best For
- Advanced mathematical reasoning: Competition-level math, proofs, and quantitative analysis
- Complex coding problems: Algorithm design, optimization, and architectural reasoning
- Scientific analysis: Multi-step derivations in physics, chemistry, and biology
- Agentic reasoning: Agent backbones that need deep deliberation before acting
- Hard problem solving: Any task where extended chain-of-thought produces measurably better results
Consider Alternatives When
- General-purpose tasks: GPT-5 or GPT-5.2 for conversational and generative workloads that don't need chain-of-thought
- Cost-sensitive reasoning: o4-mini for reasoning at a lower price point
- Maximum reasoning compute: o3-pro for the hardest problems that benefit from extended computation
- Fast responses: GPT-5.1 instant or GPT-4o when latency matters more than reasoning depth
Conclusion
o3 advances reasoning model capability beyond o1, delivering stronger performance and greater efficiency with full production API features. For the hardest analytical, mathematical, and coding problems routed through AI Gateway, it is the standard reasoning model.
Frequently Asked Questions
How does o3 improve over o1?
It delivers stronger performance on reasoning benchmarks while using reasoning tokens more efficiently, resulting in better accuracy at comparable or lower cost per request.
Does o3 support function calling?
Yes. It ships with function calling, structured outputs, vision input, and system messages: the full production API feature set.
What is the reasoning_effort parameter?
It controls how deeply the model reasons per request. Low effort on simple queries saves cost; high effort on hard problems enables maximum deliberation.
What context window does o3 support?
200K tokens, supporting lengthy inputs for complex reasoning tasks.
How does AI Gateway handle authentication for o3?
AI Gateway accepts a single API key or OIDC token for all requests. You don't embed OpenAI credentials in your application; AI Gateway routes and authenticates on your behalf.
When should I use o3 versus GPT-5?
Use o3 for problems that benefit from extended chain-of-thought reasoning (math, science, hard coding). Use GPT-5 for general-purpose tasks, creative writing, and conversational workloads.
What are typical latency characteristics?
This page shows live throughput and time-to-first-token metrics measured across real AI Gateway traffic.