
GPT 5.1 Thinking

GPT 5.1 Thinking is the reasoning-focused member of the GPT-5.1 family, applying extended chain-of-thought computation to produce more thorough and accurate responses on complex analytical, scientific, and multi-step problems.

Capabilities: Tool Use, Implicit Caching, File Input, Reasoning, Vision (Image), Web Search, Image Gen
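index.ts
import { streamText } from 'ai'

// Stream a response from GPT 5.1 Thinking through AI Gateway.
const result = streamText({
  model: 'openai/gpt-5.1-thinking',
  prompt: 'Why is the sky blue?',
})

// Print the text as it arrives.
for await (const textPart of result.textStream) {
  process.stdout.write(textPart)
}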

Playground

Try out GPT 5.1 Thinking by OpenAI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
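
As a rough sketch of what setting a provider preference could look like, the helper below builds a request object with an ordered list of provider slugs. The `providerOptions.gateway.order` shape and the slugs `'azure'` and `'openai'` are assumptions made for illustration; copy the actual slugs from the table on this page and check the docs for the supported options. `withProviderOrder` and `GatewayRequest` are hypothetical names.

```typescript
// Hypothetical request shape; the `providerOptions.gateway.order` field is an
// assumption about how a provider preference is expressed.
type GatewayRequest = {
  model: string
  prompt: string
  providerOptions: { gateway: { order: string[] } }
}

// Builds a request that asks the gateway to try providers in the given order.
function withProviderOrder(prompt: string, order: string[]): GatewayRequest {
  if (order.length === 0) {
    throw new Error('order must list at least one provider slug')
  }
  return {
    model: 'openai/gpt-5.1-thinking',
    prompt,
    providerOptions: { gateway: { order: [...order] } },
  }
}

// Assumed slugs: prefer Azure, then fall back to OpenAI.
const request = withProviderOrder('Why is the sky blue?', ['azure', 'openai'])
console.log(JSON.stringify(request.providerOptions))
```

If the option shape matches your SDK version, the resulting object can be passed to `streamText(request)` in place of the inline options shown in index.ts.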

| Provider | Context | Latency | Throughput | Input | Output | Cache Read | Cache Write | Web Search (Per Query) | Capabilities | ZDR | No Training | Release Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| OpenAI (Legal: Terms, Privacy) | 400K | 1.5s | 86tps | $1.25/M | $10.00/M | $0.13/M | | $10.00/K + input costs | +1 | | | 11/12/2025 |
| Azure (Legal: Terms, Privacy) | 400K | 0.9s | | $1.25/M | $10.00/M | $0.13/M | | $14/K + input costs | | | | 11/12/2025 |
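
To make the per-token rates concrete, here is a small cost estimate using the OpenAI provider's listed prices ($1.25/M input, $10.00/M output, $0.13/M cache read). It is a sketch under several assumptions: cached tokens are treated as a subset of input tokens billed at the cache-read rate, reasoning tokens are assumed to be counted as output tokens, and the per-query web search fee is left out. The `Usage` field names are hypothetical.

```typescript
// Rates copied from the OpenAI provider row on this page (USD per million tokens).
const RATES = { inputPerM: 1.25, outputPerM: 10.0, cacheReadPerM: 0.13 }

// Hypothetical usage shape; cachedInputTokens is assumed to be part of inputTokens.
type Usage = {
  inputTokens: number
  cachedInputTokens: number
  outputTokens: number // assumed to include reasoning tokens
}

function estimateCostUsd(usage: Usage): number {
  const uncachedInput = usage.inputTokens - usage.cachedInputTokens
  if (uncachedInput < 0) {
    throw new Error('cachedInputTokens cannot exceed inputTokens')
  }
  const total =
    uncachedInput * RATES.inputPerM +
    usage.cachedInputTokens * RATES.cacheReadPerM +
    usage.outputTokens * RATES.outputPerM
  return total / 1_000_000
}

// 100K input tokens (40K served from cache) and 5K output tokens.
const cost = estimateCostUsd({
  inputTokens: 100_000,
  cachedInputTokens: 40_000,
  outputTokens: 5_000,
})
console.log(cost.toFixed(4)) // prints "0.1302"
```

In this example the 5K output tokens account for $0.05 of the roughly $0.13 total, so long reasoning traces can dominate the bill even when prompts are large.
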
Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.
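
For readers who want to compare these figures with their own traffic, the sketch below measures time to first token for any async text stream and takes the median (P50) of a set of samples. It illustrates the metric's definition only and is not how AI Gateway computes its numbers; `timeToFirstToken` and `p50` are hypothetical helpers.

```typescript
// Returns the elapsed time until the first non-empty chunk arrives,
// or null if the stream ends without producing text.
// Note: returning early stops reading the stream.
async function timeToFirstToken(
  stream: AsyncIterable<string>,
  now: () => number = Date.now,
): Promise<number | null> {
  const start = now()
  for await (const chunk of stream) {
    if (chunk.length > 0) return now() - start
  }
  return null
}

// Median of the samples: the middle value, or the mean of the two middle values.
function p50(samples: number[]): number {
  if (samples.length === 0) throw new Error('no samples')
  const sorted = [...samples].sort((a, b) => a - b)
  const mid = Math.floor(sorted.length / 2)
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2
}

console.log(p50([900, 1500, 1200])) // prints 1200
```

With the AI SDK, `result.textStream` from `streamText` is an async iterable of strings, so it could be passed to `timeToFirstToken` directly.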

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.

More models by OpenAI

| Model | Context | Latency | Throughput | Input | Output | Cache Read | Cache Write | Web Search (Per Query) | Capabilities | Providers | ZDR | No Training | Release Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1M | 1.2s | 90tps | $5.00/M | $30.00/M | $0.5/M | | $10.00/K + input costs | | Azure, OpenAI | | | 04/24/2026 |
| | 400K | 1.6s | 292tps | $0.75/M | $4.50/M | $0.07/M | | $10.00/K + input costs | | Azure, OpenAI | | | 03/17/2026 |
| | 400K | 0.4s | 45tps | $0.20/M | $1.25/M | $0.02/M | | $10.00/K + input costs | | Azure, OpenAI | | | 03/17/2026 |
| | 1.1M | 0.6s | 70tps | $2.50/M | $15.00/M | $0.25/M | | $10.00/K + input costs | | Azure, OpenAI | | | 03/05/2026 |
| | 128K | 0.7s | 105tps | $1.25/M | $10.00/M | $0.13/M | | $10.00/K + input costs | | Azure, OpenAI | | | 11/12/2025 |
| | 131K | 0.1s | 1053tps | $0.35/M | $0.75/M | $0.25/M | | | | Baseten, Bedrock, Cerebras, +5 | | | 08/05/2025 |

About GPT 5.1 Thinking

GPT 5.1 Thinking was released on November 12, 2025 as part of the GPT-5.1 model generation on AI Gateway. It occupies the reasoning-depth end of the GPT-5.1 spectrum, complementing the speed-optimized instant variant.

The model uses extended chain-of-thought computation, generating internal reasoning tokens that work through problems step by step before producing a visible response. This approach, proven effective in the o-series models, is here applied within the GPT-5.1 architecture to produce more accurate and thorough answers on complex problems.

The context window of 400K tokens supports the lengthy inputs that complex reasoning tasks often require, whether they are research papers, detailed problem statements, or extensive codebases that need architectural analysis. Longer response times come in exchange for deeper, more reliable reasoning.
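
Before sending a large batch of documents, a rough pre-flight check can estimate whether they fit in the 400K-token window. The sketch below assumes about four characters per token, which is a coarse heuristic for English text and not the model's real tokenizer, and it assumes that the budget reserved for output (including reasoning) shares the same window. `estimateTokens` and `fitsInContext` are hypothetical helpers.

```typescript
const CONTEXT_WINDOW_TOKENS = 400_000 // from this page
const CHARS_PER_TOKEN = 4 // assumption: rough average, not the real tokenizer

// Coarse token estimate based on character count.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN)
}

// True if the estimated input plus the reserved output budget fits in the window.
function fitsInContext(documents: string[], reservedForOutput: number): boolean {
  const inputTokens = documents.reduce(
    (sum, doc) => sum + estimateTokens(doc),
    0,
  )
  return inputTokens + reservedForOutput <= CONTEXT_WINDOW_TOKENS
}

// A 400,000-character document is estimated at 100,000 tokens.
console.log(fitsInContext(['a'.repeat(400_000)], 50_000)) // prints true
```

For borderline inputs, a real tokenizer gives a more reliable count than this character-based estimate.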

What To Consider When Choosing a Provider

  • Configuration: GPT 5.1 Thinking generates internal reasoning tokens before producing its final response, similar to the o-series reasoning models but within the GPT-5.1 architecture. This produces better results on hard problems at the cost of longer response times.
  • Routing: Use GPT 5.1 Thinking for your hardest queries and GPT-5.1 instant for everything else. A routing layer can direct traffic based on query complexity.
  • Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
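
The routing idea in the list above can be sketched as a simple heuristic that picks a model slug per query. This is a minimal illustration, not a recommended classifier: the keyword list and length threshold are arbitrary, `pickModel` is a hypothetical helper, and the slug `'openai/gpt-5.1-instant'` is an assumption that should be checked against the model catalog.

```typescript
const THINKING_MODEL = 'openai/gpt-5.1-thinking'
const INSTANT_MODEL = 'openai/gpt-5.1-instant' // assumed slug, verify before use

// Arbitrary hints that a query may benefit from extended reasoning.
const HARD_QUERY_HINTS = [
  'prove',
  'derive',
  'optimize',
  'algorithm',
  'architecture',
  'step by step',
]

// Routes long or reasoning-heavy queries to the thinking model,
// and everything else to the instant model.
function pickModel(query: string): string {
  const text = query.toLowerCase()
  const looksHard = HARD_QUERY_HINTS.some((hint) => text.includes(hint))
  const isLong = text.length > 2_000
  return looksHard || isLong ? THINKING_MODEL : INSTANT_MODEL
}

console.log(pickModel('Prove that the square root of 2 is irrational'))
console.log(pickModel('Hi there'))
```

The returned slug can be passed as the `model` option of `streamText`. In practice a small classifier or explicit user choice may route more reliably than keyword matching.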

When to Use GPT 5.1 Thinking

Best For

  • Complex analysis: Multi-step research, data analysis, and strategic reasoning that benefits from deliberation
  • Mathematical problem solving: Proofs, derivations, and quantitative analysis requiring verified steps
  • Scientific reasoning: Physics, chemistry, and biology problems with multi-step logical chains
  • Hard coding problems: Algorithm design, optimization, and architectural decisions requiring extended thought
  • Extended chain-of-thought: Tasks that call for the GPT-5.1 generation's deliberation tier rather than the speed-optimized instant variant

Consider Alternatives When

  • Fast responses needed: GPT-5.1 instant for real-time interactions
  • Coding agent workflows: GPT-5.1 codex family for autonomous software engineering
  • Pure STEM reasoning: The o-series reasoning models for the deepest chain-of-thought capability in OpenAI's lineup
  • General chat: GPT-5.1 instant for conversational workloads where reasoning depth is unnecessary

Conclusion

GPT 5.1 Thinking brings extended reasoning to the GPT-5.1 family, producing more thorough and accurate results on complex problems. For analytical, scientific, and multi-step tasks routed through AI Gateway, it is the depth-focused counterpart to the speed-optimized instant variant.

Frequently Asked Questions

  • How does GPT 5.1 Thinking reasoning work?

    It generates internal reasoning tokens that work through the problem step by step before producing a visible response, similar to the approach used in o-series reasoning models.

  • When should I use thinking versus instant?

    Use thinking for complex analysis, math, science, and hard coding problems where accuracy is the priority. Use instant for real-time chat, streaming content, and tasks where speed is the priority.

  • Is GPT 5.1 Thinking slower than GPT-5.1 instant?

    Yes. The extended reasoning process adds time before the first visible output. The tradeoff is deeper, more accurate reasoning on complex problems.

  • What context window does GPT 5.1 Thinking support?

    400K tokens, supporting the lengthy inputs that complex reasoning tasks often require.

  • How does AI Gateway handle authentication for GPT 5.1 Thinking?

    AI Gateway accepts a single API key or OIDC token for all requests. You don't embed OpenAI credentials in your application; AI Gateway routes and authenticates on your behalf.

  • What are typical latency characteristics?

    This page shows live throughput and time-to-first-token metrics measured across real AI Gateway traffic.
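
As an illustration of the authentication answer above, the sketch below resolves a single gateway credential from the environment and builds a request header from it. The environment variable names `AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN` and the `Bearer` header scheme are assumptions for this example; confirm them in the documentation. `resolveGatewayToken` and `authHeaders` are hypothetical helpers.

```typescript
// Assumed environment variable names; confirm them in the AI Gateway docs.
type GatewayEnv = {
  AI_GATEWAY_API_KEY?: string
  VERCEL_OIDC_TOKEN?: string
}

// Prefers the API key and falls back to the OIDC token.
function resolveGatewayToken(env: GatewayEnv): string {
  const token = env.AI_GATEWAY_API_KEY || env.VERCEL_OIDC_TOKEN
  if (!token) {
    throw new Error('Set an AI Gateway API key or OIDC token')
  }
  return token
}

// Builds the header for a gateway request (Bearer scheme is an assumption).
// No provider credentials are involved.
function authHeaders(env: GatewayEnv): { Authorization: string } {
  return { Authorization: `Bearer ${resolveGatewayToken(env)}` }
}

console.log(authHeaders({ AI_GATEWAY_API_KEY: 'example-key' }).Authorization)
```

In a Node.js application, `process.env` would typically be passed as the `env` argument.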