o3

o3 is OpenAI's advanced reasoning model that succeeds o1, delivering stronger chain-of-thought performance on mathematical, scientific, and coding problems with improved efficiency and full tool support.

File Input · Reasoning · Tool Use · Vision (Image) · Implicit Caching
index.ts
import { streamText } from 'ai';

const result = streamText({
  model: 'openai/o3',
  prompt: 'Why is the sky blue?',
});

// Consume the stream as text chunks arrive
for await (const textPart of result.textStream) {
  process.stdout.write(textPart);
}

Frequently Asked Questions

  • How does o3 improve over o1?

    It delivers stronger performance on reasoning benchmarks while using reasoning tokens more efficiently, resulting in better accuracy at comparable or lower cost per request.

  • Does o3 support function calling?

    Yes. It ships with the full production API feature set: function calling, structured outputs, vision input, and system messages.
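
    A minimal tool-calling sketch with the AI SDK. The tool name, schema, and stubbed result below are illustrative, not part of any real API; `inputSchema` is the AI SDK v5 field name (earlier versions call it `parameters`). Running it requires gateway credentials, so treat this as a configuration sketch:

    ```typescript
    import { streamText, tool } from 'ai';
    import { z } from 'zod';

    // Hypothetical tool: the model can call getWeather during reasoning.
    const getWeather = tool({
      description: 'Get the current weather for a city',
      inputSchema: z.object({ city: z.string() }),
      // Stubbed result for illustration; a real tool would hit a weather API.
      execute: async ({ city }) => ({ city, tempC: 21 }),
    });

    const result = streamText({
      model: 'openai/o3',
      prompt: 'What is the weather in Paris right now?',
      tools: { getWeather },
    });
    ```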

  • What is the reasoning_effort parameter?

    It controls how much reasoning the model performs per request: low effort saves cost on simple queries, while high effort allows maximum deliberation on hard problems.
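
    With the AI SDK, reasoning effort is passed through provider options rather than as a top-level parameter. A sketch, assuming the OpenAI provider's `reasoningEffort` option (requires gateway credentials to actually run):

    ```typescript
    import { streamText } from 'ai';

    const result = streamText({
      model: 'openai/o3',
      prompt: 'Prove that the square root of 2 is irrational.',
      providerOptions: {
        // 'low' | 'medium' | 'high'; pick per task complexity
        openai: { reasoningEffort: 'high' },
      },
    });
    ```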

  • What context window does o3 support?

    200K tokens, supporting lengthy inputs for complex reasoning tasks.

  • How does AI Gateway handle authentication for o3?

    AI Gateway accepts a single API key or OIDC token for all requests. You don't embed OpenAI credentials in your application; AI Gateway routes and authenticates on your behalf.
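
    In application code this typically means configuring only the gateway key, never an OpenAI key. A sketch, assuming the `createGateway` helper from `@ai-sdk/gateway` and an `AI_GATEWAY_API_KEY` environment variable (both names are assumptions; check your deployment's docs):

    ```typescript
    import { streamText } from 'ai';
    import { createGateway } from '@ai-sdk/gateway';

    // Assumed helper: the gateway authenticates with its own key or an
    // OIDC token; no OpenAI credentials appear anywhere in the app.
    const gateway = createGateway({
      apiKey: process.env.AI_GATEWAY_API_KEY,
    });

    const result = streamText({
      model: gateway('openai/o3'),
      prompt: 'Why is the sky blue?',
    });
    ```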

  • When should I use o3 versus GPT-5?

    Use o3 for problems that benefit from extended chain-of-thought reasoning (math, science, hard coding). Use GPT-5 for general-purpose tasks, creative writing, and conversational workloads.

  • What are typical latency characteristics?

    Latency varies with reasoning effort and prompt length, since more reasoning tokens mean longer time to first visible output. This page shows live throughput and time-to-first-token metrics measured across real AI Gateway traffic.