
GPT-5

GPT-5 is the standard tier of the GPT-5 model family, unifying advanced reasoning, coding, and multimodal capabilities in a single architecture that surpasses its predecessors across benchmarks while maintaining broad general-purpose utility.

Capabilities: File Input, Reasoning, Tool Use, Vision (Image), Image Gen, Implicit Caching
index.ts
import { streamText } from 'ai'

const result = streamText({
  model: 'openai/gpt-5',
  prompt: 'Why is the sky blue?'
})

Playground

Try out GPT-5 by OpenAI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

About GPT-5

GPT-5 launched on August 1, 2025 as a generational leap over the GPT-4 family. Available through AI Gateway, it brings together advances in reasoning depth, coding proficiency, instruction following, and multimodal understanding in a single model.

The model supports a context window of 400K tokens, enabling developers to pass extensive codebases, long documents, or rich conversation histories in a single request. Combined with improved instruction adherence, GPT-5 follows complex multi-constraint specifications more reliably than its predecessors.
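As a rough illustration of budgeting against the 400K-token window, the sketch below checks whether a prompt plus a reserved output budget fits. Two things here are assumptions rather than facts from this page: that prompt and output tokens share the same window, and the 4-characters-per-token estimate, which is a crude heuristic and not a real tokenizer. The function names are hypothetical.

```typescript
// Context window size as stated on this page.
const GPT5_CONTEXT_WINDOW = 400_000;

// Assumption: prompt tokens and reserved output tokens count against one shared window.
function fitsContext(
  promptTokens: number,
  reservedOutputTokens: number,
  windowSize: number = GPT5_CONTEXT_WINDOW
): boolean {
  return promptTokens + reservedOutputTokens <= windowSize;
}

// Crude heuristic (assumption): about 4 characters per token for English text.
// Use a real tokenizer when an exact count matters.
function roughTokenCount(text: string): number {
  return Math.ceil(text.length / 4);
}

// Example: a 350K-token prompt leaves room for at most 50K output tokens.
const ok = fitsContext(350_000, 50_000);
const tooBig = fitsContext(390_000, 20_000);
```

A check like this is most useful before sending very large inputs such as whole repositories, where a rejected request is expensive to retry.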

As the standard tier of the GPT-5 family, it sits alongside GPT-5 mini, GPT-5 nano, and GPT-5 pro, each targeting different points on the capability-cost spectrum. GPT-5 is the default choice when full GPT-5 family capability is the priority.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider | Context | Latency | Throughput | Input | Output | Cache Read | Cache Write | Web Search (per query) | Release Date | Legal
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
Azure | 400K | 5.7s | 274tps | $1.25/M | $10.00/M | $0.13/M | - | $14/K + input costs | 08/01/2025 | Terms, Privacy
OpenAI | 400K | 1.4s | 169tps | $1.25/M | $10.00/M | $0.13/M | - | $10.00/K + input costs | 08/01/2025 | Terms, Privacy
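To make the per-token rates concrete, here is a minimal cost-estimate sketch using the GPT-5 input, output, and cache-read prices listed on this page. The `Usage` shape and the function name are hypothetical, and the sketch assumes cached tokens are a subset of input tokens billed at the cache-read rate; web search fees are not included.

```typescript
// Per-million-token rates in USD, copied from the GPT-5 provider rows on this page.
const GPT5_RATES = {
  inputPerM: 1.25,
  outputPerM: 10.0,
  cacheReadPerM: 0.13,
} as const;

interface Usage {
  inputTokens: number; // all prompt tokens, cached or not
  outputTokens: number;
  cachedInputTokens?: number; // assumption: subset of inputTokens served from cache
}

function estimateCostUsd(usage: Usage): number {
  // Never count more cached tokens than there are input tokens.
  const cached = Math.min(usage.cachedInputTokens ?? 0, usage.inputTokens);
  const uncached = usage.inputTokens - cached;
  const microDollars =
    uncached * GPT5_RATES.inputPerM +
    cached * GPT5_RATES.cacheReadPerM +
    usage.outputTokens * GPT5_RATES.outputPerM;
  return microDollars / 1_000_000;
}

// 100K prompt tokens and 2K output tokens: roughly $0.145 without caching.
const uncachedCost = estimateCostUsd({ inputTokens: 100_000, outputTokens: 2_000 });

// Same request with 80K of the prompt read from cache: roughly $0.055.
const cachedCost = estimateCostUsd({
  inputTokens: 100_000,
  outputTokens: 2_000,
  cachedInputTokens: 80_000,
});
```

Because both listed providers share the same token rates, this estimate does not depend on which provider serves the request.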
Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.
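The latency and throughput figures can inform a provider preference. The sketch below ranks the two GPT-5 providers by the numbers listed on this page. The slugs `azure` and `openai` and the `rankProviders` helper are assumptions for illustration; how a preferred order is passed to AI Gateway is covered in the linked docs and is not shown here. The metrics are live, so the hard-coded values are only a snapshot.

```typescript
interface ProviderStats {
  slug: string; // assumed slug, copy the real one from the providers table
  ttftSeconds: number; // P50 time to first token
  tokensPerSecond: number; // P50 throughput
}

// Snapshot of the GPT-5 provider rows on this page.
const GPT5_PROVIDERS: ProviderStats[] = [
  { slug: 'azure', ttftSeconds: 5.7, tokensPerSecond: 274 },
  { slug: 'openai', ttftSeconds: 1.4, tokensPerSecond: 169 },
];

// Returns provider slugs ordered best-first for the chosen metric.
function rankProviders(
  stats: ProviderStats[],
  prefer: 'latency' | 'throughput'
): string[] {
  const sorted = [...stats].sort((a, b) =>
    prefer === 'latency'
      ? a.ttftSeconds - b.ttftSeconds // lower time to first token is better
      : b.tokensPerSecond - a.tokensPerSecond // higher throughput is better
  );
  return sorted.map((p) => p.slug);
}

const forChat = rankProviders(GPT5_PROVIDERS, 'latency'); // ['openai', 'azure']
const forBatch = rankProviders(GPT5_PROVIDERS, 'throughput'); // ['azure', 'openai']
```

Interactive chat tends to favor a low time to first token, while long batch generations benefit more from high throughput, which is why the two rankings differ here.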

More models by OpenAI

Model | Context | Latency | Throughput | Input | Output | Cache Read | Cache Write | Web Search (per query) | Providers | Release Date
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
(name not captured) | 1M | 3.4s | 111tps | $5.00/M | $30.00/M | $0.5/M | - | $10.00/K + input costs | Azure, OpenAI | 04/24/2026
(name not captured) | 400K | 2.5s | 255tps | $0.75/M | $4.50/M | $0.07/M | - | $10.00/K + input costs | Azure, OpenAI | 03/17/2026
(name not captured) | 400K | 0.6s | 120tps | $0.20/M | $1.25/M | $0.02/M | - | $10.00/K + input costs | Azure, OpenAI | 03/17/2026
(name not captured) | 1.1M | 0.6s | 60tps | $2.50/M | $15.00/M | $0.25/M | - | $10.00/K + input costs | Azure, OpenAI | 03/05/2026
(name not captured) | 128K | 0.7s | 90tps | $1.25/M | $10.00/M | $0.13/M | - | $10.00/K + input costs | Azure, OpenAI | 11/12/2025
(name not captured) | 131K | 0.1s | 269tps | $0.35/M | $0.75/M | $0.25/M | - | - | Baseten, Bedrock, Cerebras, +5 | 08/05/2025

What To Consider When Choosing a Provider

  • Configuration: GPT-5 provides the full capability of the GPT-5 architecture. It's the right choice when you need high overall quality across reasoning, coding, and creative tasks and cost is secondary to capability.
  • Configuration: For cost-sensitive deployments, consider GPT-5 mini or GPT-5 nano, which deliver much of the same architectural improvement at lower price points.
  • Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
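The authentication point above can be sketched as a request builder that attaches a single gateway key. Everything specific in this example is an assumption, not something this page confirms: the placeholder URL, the bearer-token header, and the chat-style request body. Check the AI Gateway docs for the real endpoint and payload format.

```typescript
// Hypothetical endpoint: replace with the URL from the AI Gateway docs.
const GATEWAY_URL = 'https://gateway.example.com/v1/chat/completions';

interface GatewayRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

// Builds (but does not send) a request authenticated with one gateway key.
// No provider credentials appear anywhere in the request.
function buildGatewayRequest(apiKey: string, prompt: string): GatewayRequest {
  if (!apiKey) {
    throw new Error('Missing AI Gateway API key');
  }
  return {
    url: GATEWAY_URL,
    method: 'POST',
    headers: {
      // Assumption: the key is sent as a bearer token.
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    // Assumption: a chat-style body; the model ID matches the one used on this page.
    body: JSON.stringify({
      model: 'openai/gpt-5',
      messages: [{ role: 'user', content: prompt }],
    }),
  };
}

const request = buildGatewayRequest('test-key', 'Why is the sky blue?');
```

Keeping the key as a function argument rather than a hard-coded string makes it straightforward to load it from an environment variable or secret store at the call site.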

When to Use GPT-5

Best For

  • Complex multi-step reasoning: Research analysis, strategic planning, and problems requiring extended deliberation
  • Advanced code generation: Full-repository comprehension, architectural decisions, and complex refactoring
  • Multimodal workflows: Processing images, documents, and text together in sophisticated analysis pipelines
  • High-stakes content generation: Legal drafting, technical documentation, and precision-critical writing
  • Agentic systems: Backbone model for autonomous agents that need the full capability of the GPT-5 family

Consider Alternatives When

  • Cost-sensitive workloads: GPT-5 mini offers strong capability at a lower price point
  • Lightweight tasks: GPT-5 nano or GPT-4.1 nano handle classification and routing more efficiently
  • Specialized reasoning: The o-series reasoning models may outperform on pure mathematical and scientific reasoning tasks
  • Speed-critical applications: Smaller models provide faster time-to-first-token for real-time chat
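The guidance above can be condensed into a small lookup that picks a model per workload. This is a sketch only: the workload labels and the `pickModel` helper are hypothetical, and the `openai/gpt-5-mini` and `openai/gpt-5-nano` IDs are assumed by analogy with `openai/gpt-5` and should be checked against the model list.

```typescript
type Workload = 'full-capability' | 'cost-sensitive' | 'lightweight';

const MODEL_FOR_WORKLOAD: Record<Workload, string> = {
  'full-capability': 'openai/gpt-5', // ID used elsewhere on this page
  'cost-sensitive': 'openai/gpt-5-mini', // assumed ID
  lightweight: 'openai/gpt-5-nano', // assumed ID
};

// Maps a workload label to the model suggested for it in the lists above.
function pickModel(workload: Workload): string {
  return MODEL_FOR_WORKLOAD[workload];
}

const agentModel = pickModel('full-capability');
const routerModel = pickModel('lightweight');
```

Centralizing the choice in one table makes it easy to move a workload to a cheaper tier later without touching call sites.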

Conclusion

GPT-5 advanced the state of reasoning, coding, and multimodal capability when the GPT-5 family launched. For applications routed through AI Gateway that need the full capability of the GPT-5 family, it is the standard-tier choice.

Frequently Asked Questions

  • How does GPT-5 compare to GPT-4o?

    GPT-5 represents a generational improvement over GPT-4o across reasoning, coding, instruction following, and multimodal capabilities.

  • What context window does GPT-5 support?

    GPT-5 supports a context window of 400K tokens, enabling full-codebase analysis and extended document processing in a single request.

  • When should I use GPT-5 versus GPT-5 mini?

    Use GPT-5 when you need the highest-tier capability in the GPT-5 family. Use GPT-5 mini when you need strong performance at lower cost, particularly for high-volume production workloads.

  • Does GPT-5 support multimodal input?

    Yes. It accepts text and image inputs, enabling vision-based analysis, document processing with figures, and mixed-modality workflows.

  • How does AI Gateway handle authentication for GPT-5?

    AI Gateway accepts a single API key or OIDC token for all requests. You don't embed OpenAI credentials in your application; AI Gateway routes and authenticates on your behalf.

  • What is the pricing for GPT-5?

    See the pricing section on this page for today's rates. AI Gateway exposes each provider's pricing for GPT-5.

  • What are typical latency characteristics?

    This page shows live throughput and time-to-first-token metrics measured across real AI Gateway traffic.