GPT 5.4 Pro

GPT 5.4 Pro is the highest-capability tier of the GPT-5.4 family, designed for maximum performance on the most complex tasks, with extended reasoning and the family's agentic capabilities.

Reasoning · Tool Use · Vision (Image) · File Input · Implicit Caching · Web Search
index.ts
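import { streamText } from 'ai'

// Start a streaming request to GPT 5.4 Pro through AI Gateway.
const result = streamText({
  model: 'openai/gpt-5.4-pro',
  prompt: 'Why is the sky blue?'
})

// Print the response as it streams in.
for await (const chunk of result.textStream) process.stdout.write(chunk)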

Playground

Try out GPT 5.4 Pro by OpenAI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

| Provider | Legal | Context | Throughput | Input | Output | Web Search (per query) | Release Date |
| --- | --- | --- | --- | --- | --- | --- | --- |
| OpenAI | Terms, Privacy | 1.1M | 76tps | $30.00/M | $180.00/M | $10/K + input costs | 03/05/2026 |
| Azure | Terms, Privacy | 1M | - | $30.00/M | $180.00/M | $14/K + input costs | 03/05/2026 |
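
The per-token prices listed above translate into a per-request cost with simple arithmetic. The sketch below assumes the listed rates ($30.00 per million input tokens, $180.00 per million output tokens) and ignores web search fees and any cache pricing. `estimateCostUsd` is a hypothetical helper, not part of any SDK.

```typescript
// Rates copied from the provider listing on this page (USD per million tokens).
// They may change; treat them as assumptions rather than constants.
const INPUT_USD_PER_MILLION = 30;
const OUTPUT_USD_PER_MILLION = 180;

// Hypothetical helper: estimates the token cost of one request in USD.
// Web search fees (listed per thousand queries, plus input costs) are not included.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  // tokens * (USD per million tokens) gives millionths of a dollar.
  const microDollars =
    inputTokens * INPUT_USD_PER_MILLION + outputTokens * OUTPUT_USD_PER_MILLION;
  return microDollars / 1_000_000;
}

// 10,000 input tokens and 2,000 output tokens:
// (10,000 * 30 + 2,000 * 180) / 1,000,000 = 0.66 USD
console.log(estimateCostUsd(10_000, 2_000).toFixed(2)); // prints "0.66"
```

Because output tokens cost six times as much as input tokens at these rates, long responses tend to dominate the bill.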
Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.

More models by OpenAI

| Context | Latency | Throughput | Input | Output | Cache Read | Web Search (per query) | Providers | Release Date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1M | 1.9s | 111tps | $5.00/M | $30.00/M | $0.5/M | $10.00/K + input costs | Azure, OpenAI | 04/24/2026 |
| 400K | 1.5s | 153tps | $0.75/M | $4.50/M | $0.07/M | $10.00/K + input costs | Azure, OpenAI | 03/17/2026 |
| 400K | 0.6s | 20tps | $0.20/M | $1.25/M | $0.02/M | $10.00/K + input costs | Azure, OpenAI | 03/17/2026 |
| 1.1M | 0.8s | 61tps | $2.50/M | $15.00/M | $0.25/M | $10.00/K + input costs | Azure, OpenAI | 03/05/2026 |
| 128K | 1.0s | 109tps | $1.25/M | $10.00/M | $0.13/M | $10.00/K + input costs | Azure, OpenAI | 11/12/2025 |
| 131K | 0.1s | 1945tps | $0.35/M | $0.75/M | $0.25/M | - | Baseten, Bedrock, Cerebras, +5 | 08/05/2025 |

About GPT 5.4 Pro

GPT 5.4 Pro became available on March 5, 2026 on AI Gateway as the premium tier of the GPT-5.4 model family. It's designed for developers who need maximum performance on the most complex tasks, applying extended compute to produce the deepest reasoning and highest-quality output.

Like the base GPT-5.4, it brings the agentic and reasoning leaps from GPT-5.3 Codex to all domains. The pro tier goes further, handling the subset of problems where the standard model's capability ceiling is a binding constraint. Complex multi-step workflows, advanced research synthesis, and high-stakes analysis all benefit from the additional reasoning depth.

With a context window of 1.1M tokens and the full API feature set, GPT 5.4 Pro handles any task the GPT-5.4 family supports, with more computation applied per request. Route your hardest queries here and use the standard tier for routine traffic.
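
The routing advice above can be sketched as a small selection function. This is a minimal illustration, assuming the standard tier is addressed as `openai/gpt-5.4` (that ID is inferred by analogy with `openai/gpt-5.4-pro` and is not confirmed on this page). The `Task` shape and the `pickModel` name are hypothetical.

```typescript
// Hypothetical description of a request; an application would define its own.
interface Task {
  highStakes: boolean;       // e.g. high-stakes analysis or precision-critical writing
  multiStepAgentic: boolean; // e.g. multi-step workflows with tools
}

const PRO_MODEL = 'openai/gpt-5.4-pro';
// Assumed ID for the standard tier; check the model list before relying on it.
const STANDARD_MODEL = 'openai/gpt-5.4';

// Send only the hardest work to the pro tier; everything else stays on standard.
function pickModel(task: Task): string {
  return task.highStakes || task.multiStepAgentic ? PRO_MODEL : STANDARD_MODEL;
}

console.log(pickModel({ highStakes: true, multiStepAgentic: false }));  // prints "openai/gpt-5.4-pro"
console.log(pickModel({ highStakes: false, multiStepAgentic: false })); // prints "openai/gpt-5.4"
```

The returned string can be passed as the `model` value in a `streamText` call like the one at the top of this page.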

What To Consider When Choosing a Provider

  • Configuration: GPT 5.4 Pro is for developers who need maximum performance on the most complex tasks. It applies the most compute per request in the GPT-5.4 family.
  • Cost: Most teams route only their hardest queries to the pro tier, pairing it with GPT-5.4 or GPT-5.4 mini for routine traffic to manage costs.
  • Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
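
When the choice of provider depends on prompt size, a provider preference can be computed per request. The sketch below rests on several assumptions that should be confirmed against the docs: that the provider slugs are `openai` and `azure`, and that AI Gateway accepts `order` and `only` arrays under `providerOptions.gateway`. The context limits are the rounded figures from the provider listing on this page (1.1M and 1M tokens), and `gatewayOptionsFor` is a hypothetical helper.

```typescript
// Rounded context limits from the provider listing on this page (treated as exact here).
const CONTEXT_LIMITS: Record<string, number> = {
  openai: 1_100_000,
  azure: 1_000_000,
};

// Hypothetical helper: keeps only providers whose listed context window fits
// the prompt, preserving the caller's preferred order. Unknown slugs are dropped.
function gatewayOptionsFor(promptTokens: number, preferred: string[]) {
  const order = preferred.filter(
    (slug) => (CONTEXT_LIMITS[slug] ?? 0) >= promptTokens,
  );
  // `order` and `only` are assumed option names; verify them in the docs.
  return { gateway: { order, only: order } };
}

// A 1.05M-token prompt exceeds the 1M listing, so only "openai" remains.
console.log(JSON.stringify(gatewayOptionsFor(1_050_000, ['azure', 'openai'])));
// prints {"gateway":{"order":["openai"],"only":["openai"]}}
```

If these option names match the docs, the returned object would be passed as `providerOptions` alongside `model` and `prompt`.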

When to Use GPT 5.4 Pro

Best For

  • Complex agentic workflows: Multi-step tasks with tools, research, and multiple sources
  • Expert-level content creation: Technical documentation, legal drafting, and precision-critical writing
  • Maximum GPT-5.4 compute: Applies the most compute per request of any GPT-5.4 variant
  • Inherits GPT-5.3 Codex leaps: Brings the agentic and reasoning gains from GPT-5.3 Codex into the GPT-5.4 pro tier

Consider Alternatives When

  • Standard workloads: GPT-5.4 for most tasks without the pro tier's premium
  • Cost-sensitive applications: GPT-5.4 mini or nano for high-volume production
  • Specialized reasoning: o3-pro for pure mathematical and scientific chain-of-thought
  • Speed-critical responses: GPT-5.4 mini or nano for lower-latency workloads

Conclusion

GPT 5.4 Pro is the highest-capability tier of the GPT-5.4 family and is available through AI Gateway. It is best suited to the most complex tasks, where quality and reliability are the primary metrics.

Frequently Asked Questions

  • How does GPT 5.4 Pro differ from standard GPT-5.4?

    It applies more compute per request, producing deeper reasoning and higher-quality output on hard problems. On simpler tasks the difference may be minimal.

  • When is the pro tier justified?

    For high-stakes analysis, complex multi-step agentic workflows, expert-level content, and tasks where the quality difference over standard GPT-5.4 directly impacts outcomes.

  • What context window does GPT 5.4 Pro support?

    1.1M tokens, matching the GPT-5.4 family. The Azure listing on this page shows 1M.

  • Is GPT 5.4 Pro slower than standard GPT-5.4?

    It may take longer per request due to the additional compute. For latency-sensitive applications, route only specific high-value queries to the pro tier.

  • Does GPT 5.4 Pro share the same agentic improvements as GPT-5.4?

    Yes. It includes all of GPT-5.4's agentic and reasoning advances, plus extended compute for maximum performance on the hardest problems.

  • How does AI Gateway handle authentication for GPT 5.4 Pro?

    AI Gateway accepts a single API key or OIDC token for all requests. You don't embed OpenAI credentials in your application; AI Gateway routes and authenticates on your behalf.

  • What are typical latency characteristics?

    This page shows live throughput and time-to-first-token metrics measured across real AI Gateway traffic.
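
To compare the live figures with your own traffic, time to first token can be measured client-side on any stream of text chunks. The sketch below is a minimal illustration under stated assumptions: `measureStream` is a hypothetical helper, it counts chunks rather than tokens (chunk boundaries need not match token boundaries), and it uses `Date.now()`, so resolution is about one millisecond.

```typescript
interface StreamTiming {
  ttftMs: number | null; // time until the first chunk arrived, null if none did
  totalMs: number;       // time until the stream ended
  chunks: number;        // number of chunks received (not tokens)
}

// Hypothetical helper: consumes a stream of text chunks and records timing.
async function measureStream(stream: AsyncIterable<string>): Promise<StreamTiming> {
  const start = Date.now();
  let ttftMs: number | null = null;
  let chunks = 0;
  for await (const _chunk of stream) {
    if (ttftMs === null) ttftMs = Date.now() - start;
    chunks += 1;
  }
  return { ttftMs, totalMs: Date.now() - start, chunks };
}

// Stand-in stream for demonstration; replace with a real text stream.
async function* fakeStream(): AsyncGenerator<string> {
  yield 'The sky ';
  yield 'is blue.';
}

measureStream(fakeStream()).then((timing) => console.log(timing));
```

With the AI SDK, `result.textStream` from the example at the top of this page could be passed in place of `fakeStream()`, assuming it is an async iterable of strings.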