
o4-mini

openai/o4-mini

o4-mini advances OpenAI's compact reasoning model line with stronger performance and greater efficiency than o3-mini, adding native tool use and image reasoning.

File Input · Reasoning · Tool Use · Vision (Image) · Implicit Caching
index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'openai/o4-mini',
  prompt: 'Why is the sky blue?',
})

for await (const part of result.textStream) {
  process.stdout.write(part)
}

What To Consider When Choosing a Provider

  • Zero Data Retention

    AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, see the documentation.

  • Authentication

    AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
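To make the single-credential model concrete, here is a minimal sketch of a raw request through AI Gateway's OpenAI-compatible endpoint. The endpoint URL and the key placeholder are illustrative assumptions; in practice the AI SDK builds this request for you.

```typescript
// Sketch of a raw AI Gateway request (endpoint URL is an assumption
// for illustration; the AI SDK normally handles this for you).
const apiKey = '<AI_GATEWAY_API_KEY>' // one key covers every provider

const request = {
  url: 'https://ai-gateway.vercel.sh/v1/chat/completions', // assumed endpoint
  method: 'POST',
  headers: {
    Authorization: `Bearer ${apiKey}`, // no OpenAI credential required
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/o4-mini',
    messages: [{ role: 'user', content: 'Why is the sky blue?' }],
  }),
}
```

The point is that only the gateway credential appears anywhere in your application; the provider-specific keys stay on Vercel's side.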

o4-mini incorporates advances beyond o3-mini, including native vision support. It's a strong option for projects that need affordable chain-of-thought reasoning.

Unlike earlier mini reasoning models, o4-mini natively supports vision input, enabling reasoning over images, diagrams, and documents.
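As a sketch of what vision input looks like, the message below mixes a text part with an image part in the AI SDK's multimodal message format; the image URL is a hypothetical example.

```typescript
// Hedged sketch of a multimodal message for o4-mini; the image URL
// is a hypothetical placeholder.
const messages = [
  {
    role: 'user' as const,
    content: [
      { type: 'text' as const, text: 'What trend does this chart show?' },
      {
        type: 'image' as const,
        image: new URL('https://example.com/q3-revenue-chart.png'),
      },
    ],
  },
]
```

Such a `messages` array can be passed to `streamText` in place of a plain `prompt`, letting the model reason over the image alongside the text.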

When to Use o4-mini

Best For

  • Affordable chain-of-thought reasoning:

    Per-request deliberation on technical tasks at scale

  • Visual reasoning:

    Analyzing diagrams, charts, mathematical notation, and screenshots with step-by-step thinking

  • Tool-using agents:

    Lightweight reasoning backbone for agents that call external tools and APIs

  • Math and code reasoning:

    Competition-level problems and algorithmic analysis at accessible cost

  • Mixed-difficulty pipelines:

    Using reasoning_effort to optimize cost across varied query complexity
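A mixed-difficulty pipeline can route each query to an effort level before calling the model. The heuristics and thresholds below are illustrative assumptions, and the `reasoningEffort` provider option name is based on the AI SDK's OpenAI provider options.

```typescript
type ReasoningEffort = 'low' | 'medium' | 'high'

// Hypothetical router: the keyword heuristic and length threshold
// are illustrative only, not a recommended policy.
function pickEffort(prompt: string): ReasoningEffort {
  const looksHard = /prove|optimi[sz]e|debug|complexit/i.test(prompt)
  if (looksHard) return 'high'
  if (prompt.length > 500) return 'medium'
  return 'low'
}

// The chosen effort is then passed per request, e.g. via the AI SDK's
// provider options:
const providerOptions = {
  openai: { reasoningEffort: pickEffort('Why is the sky blue?') },
}
```

Routing easy queries to low effort keeps the average cost per request close to a non-reasoning model while reserving deep deliberation for the queries that need it.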

Consider Alternatives When

  • Maximum reasoning depth:

o3 or o3-pro for the hardest problems requiring exhaustive deliberation

  • General-purpose tasks:

    GPT-5 mini for workloads that don't benefit from chain-of-thought

  • Coding agent workflows:

    Codex models for autonomous software engineering

  • Non-reasoning speed:

GPT-5.1 Instant for the fastest possible general-purpose responses

Conclusion

o4-mini combines stronger reasoning performance than o3-mini with native vision and tool use at an affordable price point. For technical workloads on AI Gateway that need per-request reasoning with multimodal support, it advances the cost-efficient reasoning tier.

FAQ

How does o4-mini improve on o3-mini?

It delivers stronger reasoning performance with greater efficiency, adds native vision support, and includes improved tool use capabilities.

Does o4-mini support vision input?

Yes. Unlike earlier mini reasoning models, it natively processes images, diagrams, and visual content as part of its chain-of-thought reasoning.

What does reasoning_effort control?

It controls how deeply the model reasons per request. Low effort for simple queries saves cost; high effort for hard problems enables thorough deliberation.

What is o4-mini's context window?

200K tokens, providing ample capacity for complex reasoning tasks.

How does authentication work through AI Gateway?

AI Gateway accepts a single API key or OIDC token for all requests. You don't embed OpenAI credentials in your application; AI Gateway routes and authenticates on your behalf.

When should I choose o3 instead of o4-mini?

When the hardest problems require maximum reasoning depth and the quality gap between o4-mini and o3 is consequential for your application.
