o3-mini
o3-mini is a cost-efficient reasoning model in the o3 family, delivering strong chain-of-thought performance on math, code, and science at a fraction of full o3's cost, with configurable reasoning effort for flexible cost-quality tradeoffs.
```typescript
import { streamText } from 'ai'

const result = streamText({
  model: 'openai/o3-mini',
  prompt: 'Why is the sky blue?',
})
```
What To Consider When Choosing a Provider
Zero Data Retention
AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
Authentication
AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
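As a minimal sketch of what that single credential looks like on the wire, the helper below builds an `Authorization` header from an environment variable. It assumes the gateway accepts a standard Bearer token and that the key lives in `AI_GATEWAY_API_KEY`; in practice the AI SDK handles this for you, so the function name and shape here are illustrative only.

```typescript
// Illustrative only: builds request headers from a single gateway credential.
// Assumes a standard `Authorization: Bearer` scheme and the
// AI_GATEWAY_API_KEY environment variable; the AI SDK normally does this
// for you, so you rarely need to construct headers by hand.
function gatewayHeaders(
  apiKey: string | undefined = process.env.AI_GATEWAY_API_KEY,
): Record<string, string> {
  if (!apiKey) {
    throw new Error('Set AI_GATEWAY_API_KEY (or use an OIDC token) before calling the gateway')
  }
  return {
    Authorization: `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
  }
}
```

Because the gateway authenticates on your behalf, no OpenAI key ever appears in this code, only the single gateway credential.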
o3-mini makes chain-of-thought reasoning affordable enough to run on every request rather than reserving it for the hardest problems. The reasoning_effort parameter enables further cost optimization.
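One way to exploit that parameter on mixed-difficulty traffic is to pick the effort level per request. The sketch below assumes a caller-supplied difficulty score in [0, 1] with illustrative thresholds, and that the AI SDK forwards the setting to OpenAI via `providerOptions` (shown in the comment) as `reasoningEffort`; verify the exact option name against your SDK version.

```typescript
// Map a rough task-difficulty score in [0, 1] to an o3-mini reasoning
// effort level. The three-level scale matches the model's low/medium/high
// settings; the 0.3 / 0.7 thresholds are illustrative, not prescriptive.
type Effort = 'low' | 'medium' | 'high'

function pickEffort(difficulty: number): Effort {
  if (difficulty < 0.3) return 'low'
  if (difficulty < 0.7) return 'medium'
  return 'high'
}

// With the AI SDK, the chosen level would be passed through provider
// options (assuming the gateway forwards it as OpenAI's reasoning_effort):
//
//   const result = streamText({
//     model: 'openai/o3-mini',
//     prompt,
//     providerOptions: { openai: { reasoningEffort: pickEffort(difficulty) } },
//   })
```

Routine requests then run at low effort while the occasional hard problem gets the deeper (and more expensive) deliberation.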
Like o1-mini before it, o3-mini concentrates its reasoning capability on structured problem domains rather than broad general knowledge.
When to Use o3-mini
Best For
Math and science reasoning:
Competition-level problems, derivations, and quantitative analysis at accessible cost
Code reasoning:
Algorithm analysis, debugging, and optimization with step-by-step deliberation
High-frequency reasoning pipelines:
Per-request chain-of-thought on technical workloads at scale
Education platforms:
Tutoring and problem-solving assistance with visible reasoning steps
Cost-optimized reasoning:
Tasks that benefit from deliberation but don't justify full o3 pricing
Consider Alternatives When
Maximum reasoning quality:
Full o3 for the hardest problems where every increment of accuracy matters
Broader knowledge needed:
Full o3 or GPT-5 for tasks requiring wide-ranging factual recall
Fastest reasoning:
o4-mini for a newer cost-efficient reasoning option with vision support
General-purpose tasks:
GPT-5 mini for workloads that don't benefit from chain-of-thought
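The tradeoffs above can be collapsed into a small routing sketch. Only `openai/o3-mini` appears on this page; the other model slugs are assumptions for illustration and should be checked against the gateway's model list.

```typescript
// Route a workload category to a model slug per the tradeoffs above.
// Slugs other than 'openai/o3-mini' are assumed for illustration;
// confirm them against the gateway's model catalog.
type Workload = 'math' | 'code' | 'hardest' | 'broad-knowledge' | 'general'

function chooseModel(workload: Workload): string {
  switch (workload) {
    case 'hardest':
      return 'openai/o3'         // maximum reasoning quality
    case 'broad-knowledge':
      return 'openai/gpt-5'      // wider factual recall
    case 'general':
      return 'openai/gpt-5-mini' // no chain-of-thought needed
    default:
      return 'openai/o3-mini'    // cost-efficient reasoning for math/code
  }
}
```

The returned slug drops straight into the `model` field of a `streamText` call, so routing stays a one-line change at the call site.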
Conclusion
o3-mini makes chain-of-thought reasoning broadly accessible by bringing o3-family performance to a cost tier that scales. For technical workloads on AI Gateway where per-request reasoning is desirable but full o3 pricing is not, it provides the right balance.
FAQ
How does o3-mini compare to o1-mini?
o3-mini is the next generation of cost-efficient reasoning, delivering stronger performance on key benchmarks while maintaining the affordability that makes per-request reasoning practical.
Can I control how much reasoning o3-mini does?
Yes. You can control reasoning depth per request, enabling cost optimization across mixed-difficulty workloads.
What is o3-mini's context window?
200K tokens, matching the o3 family.
When should I use full o3 instead?
When the hardest problems require maximum reasoning depth and the quality gap between mini and full is consequential for your application.
Do I need my own OpenAI API key?
No. AI Gateway accepts a single API key or OIDC token for all requests. You don't embed OpenAI credentials in your application; AI Gateway routes and authenticates on your behalf.
This page shows live throughput and time-to-first-token metrics measured across real AI Gateway traffic.