Skip to content
Dashboard

GPT-5 mini

GPT-5 mini delivers GPT-5 family intelligence at a reduced cost tier, making advanced reasoning, coding, and multimodal capabilities accessible for high-volume production workloads where full GPT-5 pricing is impractical.

File InputReasoningTool UseVision (Image)Implicit Caching
index.ts
import { streamText } from 'ai'
const result = streamText({
model: 'openai/gpt-5-mini',
prompt: 'Why is the sky blue?'
})

Playground

Try out GPT-5 mini by OpenAI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

openai logo
openai logo

Ask GPT-5 mini anything to try it out.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider
Context
Latency
Throughput
Input
Output
Cache
Web Search
Per Query
Capabilities
ZDR
No Training
Release Date
Azure
400K
5.6s
95tps
$0.25/M$2.00/M
Read:$0.03/M
Write:—
$14/K
+ input costs
—
+3
08/07/2025
OpenAI
400K
5.0s
89tps
$0.25/M$2.00/M
Read:$0.03/M
Write:—
$10.00/K
+ input costs
—
+4
08/07/2025
Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.

More models by OpenAI

Model
Context
Latency
Throughput
Input
Output
Cache
Web Search
Per Query
Capabilities
Providers
ZDR
No Training
Release Date
1M
0.8s
32tps
$5.00/M
$30.00/M
Read:
$0.5/M
Write:
—
$10.00/K
+ input costs
—
+4
azure logo
bedrock logo
openai logo
04/24/2026
400K
0.8s
265tps
$0.75/M$4.50/M
Read:$0.07/M
Write:—
$10.00/K
+ input costs
—
+4
azure logo
openai logo
03/17/2026
400K
0.6s
106tps
$0.20/M$1.25/M
Read:$0.02/M
Write:—
$10.00/K
+ input costs
—
+4
azure logo
openai logo
03/17/2026
1.1M
0.9s
73tps
$2.50/M
$15.00/M
Read:
$0.25/M
Write:
—
$10.00/K
+ input costs
—
+4
azure logo
openai logo
03/05/2026
131K
0.1s
325tps
$0.35/M$0.75/M
Read:$0.25/M
Write:—
——
baseten logo
bedrock logo
cerebras logo
+5
08/05/2025
128K
0.5s
71tps
$0.15/M$0.60/M
Read:$0.07/M
Write:—
$14/K
+ input costs
—
+2
azure logo
openai logo
07/18/2024

About GPT-5 mini

GPT-5 mini launched on August 7, 2025 alongside GPT-5, GPT-5 nano, and GPT-5 pro as part of the GPT-5 model family. It occupies the cost-performance tier that typically handles the majority of production API traffic: capable enough for most tasks, affordable enough to scale.

The model inherits the GPT-5 family's architectural improvements in reasoning, coding, and instruction following while operating at reduced compute requirements. It supports a context window of 400K tokens, multimodal input, function calling, and structured outputs, providing the full feature set developers need for production applications.

If you're migrating from GPT-4o mini or GPT-4.1 mini, GPT-5 mini is the next generation of the mid-tier model. It improves quality across the board while maintaining the economics that make high-volume deployment practical.

What To Consider When Choosing a Provider

  • Configuration: GPT-5 mini is a strong choice for most production traffic in the GPT-5 family. It provides enough capability for the vast majority of tasks while keeping per-request costs manageable at scale.
  • Configuration: It sits between GPT-5 nano (fastest, cheapest) and full GPT-5 (most capable), covering the middle ground where most real-world applications operate.
  • Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

When to Use GPT-5 mini

Best For

  • Production chat interfaces: Fast, capable responses for customer-facing conversational products
  • Code assistance: Strong coding support for development tools at sustainable per-request costs
  • Document processing: Analyzing and summarizing documents with GPT-5 family instruction following
  • Agentic workflows: Cost-effective backbone for multi-step agent pipelines with many sequential calls
  • Content generation: Marketing copy, technical writing, and editorial assistance at volume

Consider Alternatives When

  • Maximum capability needed: Full GPT-5 for the highest quality on complex tasks
  • Minimal cost required: GPT-5 nano for classification, routing, and simple extraction
  • Deep reasoning: O3 for problems requiring extended chain-of-thought deliberation
  • Legacy compatibility: GPT-4o mini if you need to maintain existing integrations without migration

Conclusion

GPT-5 mini is the default production model in the GPT-5 family, balancing capability and cost for the workloads that make up the bulk of real-world API traffic. Available through AI Gateway, it is the natural upgrade path from GPT-4o mini and GPT-4.1 mini.