o3
o3 is OpenAI's advanced reasoning model that succeeds o1, delivering stronger chain-of-thought performance on mathematical, scientific, and coding problems with improved efficiency and full tool support.
import { streamText } from 'ai'

const result = streamText({
  model: 'openai/o3',
  prompt: 'Why is the sky blue?',
})

// Stream the response text as it is generated
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}

Frequently Asked Questions
How does o3 improve over o1?
It delivers stronger performance on reasoning benchmarks while using reasoning tokens more efficiently, resulting in better accuracy at comparable or lower cost per request.
Does o3 support function calling?
Yes. It ships with function calling, structured outputs, vision input, and system messages: the full production API feature set.
What is the reasoning_effort parameter?
It controls how deeply the model reasons per request. Low effort for simple queries saves cost; high effort for hard problems enables maximum deliberation.
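For example, you might choose an effort level per request and pass it through provider options. A minimal sketch: the chooseEffort helper and its difficulty labels are illustrative, not part of any API, and the providerOptions shape shown is the AI SDK's OpenAI provider convention.

```typescript
// Effort levels accepted by the reasoning_effort parameter
type ReasoningEffort = 'low' | 'medium' | 'high'

// Illustrative heuristic: map a rough task difficulty to an effort level
function chooseEffort(difficulty: 'simple' | 'moderate' | 'hard'): ReasoningEffort {
  switch (difficulty) {
    case 'simple':
      return 'low'
    case 'moderate':
      return 'medium'
    case 'hard':
      return 'high'
  }
}

// The chosen value would then be passed per request, e.g.:
// streamText({
//   model: 'openai/o3',
//   prompt,
//   providerOptions: { openai: { reasoningEffort: chooseEffort('hard') } },
// })
console.log(chooseEffort('hard')) // high
```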
What context window does o3 support?
200K tokens, supporting lengthy inputs for complex reasoning tasks.
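A rough way to guard against overflowing the window before sending a request, sketched with the common chars/4 token estimate. The estimate is approximate; exact counts require a tokenizer, and the reserved output budget is an arbitrary example value.

```typescript
// o3's context window size in tokens
const CONTEXT_WINDOW = 200_000

// Very rough estimate: ~4 characters per token for English text.
// For exact counts, use a real tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

// Check whether a prompt plus a reserved output budget fits the window
function fitsContext(prompt: string, maxOutputTokens = 8_000): boolean {
  return estimateTokens(prompt) + maxOutputTokens <= CONTEXT_WINDOW
}

console.log(fitsContext('Why is the sky blue?')) // true
```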
How does AI Gateway handle authentication for o3?
AI Gateway accepts a single API key or OIDC token for all requests. You don't embed OpenAI credentials in your application; AI Gateway routes and authenticates on your behalf.
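In practice that is a single environment variable read by the SDK at request time. A config sketch; the variable name follows the AI Gateway convention, and the value is a placeholder.

```shell
# One key for every model routed through AI Gateway;
# no OpenAI credentials are embedded in your application.
export AI_GATEWAY_API_KEY="your-gateway-api-key"
```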
When should I use o3 versus GPT-5?
Use o3 for problems that benefit from extended chain-of-thought reasoning (math, science, hard coding). Use GPT-5 for general-purpose tasks, creative writing, and conversational workloads.
What are typical latency characteristics?
Latency varies with prompt size and reasoning effort; this page shows live throughput and time-to-first-token metrics measured across real AI Gateway traffic.