
GPT OSS 120B

GPT OSS 120B is OpenAI's open-source 120-billion parameter language model, offering strong general-purpose capability with the transparency and flexibility of open weights.

Reasoning · Tool Use
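index.ts
import { streamText } from 'ai'

// A plain string model ID is routed through AI Gateway.
const result = streamText({
  model: 'openai/gpt-oss-120b',
  prompt: 'Why is the sky blue?',
})

// Print the response as it streams in.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}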

Playground

Try out GPT OSS 120B by OpenAI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

About GPT OSS 120B

GPT OSS 120B became available on AI Gateway on August 5, 2025, as part of OpenAI's open-source model initiative. At 120 billion parameters, it is the larger of the two initial open-source releases and pairs general-purpose language capability with the transparency of open weights.

Open weights mean the model's parameters are publicly available, enabling inspection, auditing, and customization that closed-source models don't permit. For organizations with requirements around model transparency or regulatory compliance, this is significant.

Through AI Gateway, teams can use GPT OSS 120B as a managed API without handling the infrastructure required to serve a 120B parameter model.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

| Provider | Legal | Context | Latency | Throughput | Input | Output | Cache Read | Release Date |
|---|---|---|---|---|---|---|---|---|
| Baseten | Terms, Privacy | 131K | 0.2s | 201tps | $0.10/M | $0.50/M | - | 08/05/2025 |
| Amazon Bedrock | Terms, Privacy | 128K | 0.3s | 167tps | $0.15/M | $0.60/M | - | 08/05/2025 |
| Fireworks | Terms, Privacy | 128K | 0.3s | 79tps | $0.15/M | $0.60/M | $0.01/M | 08/05/2025 |
| Groq | Terms, Privacy | 131K | 0.1s | - | $0.15/M | $0.60/M | $0.07/M | 08/05/2025 |
| Parasail | Terms, Privacy | 131K | 0.5s | 74tps | $0.10/M | $0.75/M | - | 08/05/2025 |
| Nebius | Terms, Privacy | 131K | 48.9s | 59tps | $0.15/M | $0.60/M | - | 08/05/2025 |
| Together AI | Terms, Privacy | 128K | 1.7s | 66tps | $0.15/M | $0.60/M | - | 08/05/2025 |
| Cerebras | Terms, Privacy | 131K | 0.3s | 1283tps | $0.35/M | $0.75/M | $0.25/M | 08/05/2025 |

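
The provider preference mentioned above could be built from the listed metrics along these lines. This is a minimal sketch, assuming the AI SDK accepts an ordered list of provider slugs under `providerOptions.gateway.order`; the slugs shown and the helpers `rankByThroughput` and `buildGatewayOptions` are hypothetical, so copy the real slugs from the page and confirm the option name in the docs. The numbers are a snapshot of the values listed above and will drift.

```typescript
// Hypothetical helper types; not part of any SDK.
type ProviderStats = { slug: string; ttftSeconds: number; tps: number | null };

// Snapshot of a few rows from the provider list; slugs are assumed.
const providers: ProviderStats[] = [
  { slug: 'baseten', ttftSeconds: 0.2, tps: 201 },
  { slug: 'groq', ttftSeconds: 0.1, tps: null }, // no throughput listed
  { slug: 'cerebras', ttftSeconds: 0.3, tps: 1283 },
  { slug: 'fireworks', ttftSeconds: 0.3, tps: 79 },
];

// Highest throughput first; providers without a value sort last.
function rankByThroughput(stats: ProviderStats[]): string[] {
  return [...stats]
    .sort((a, b) => (b.tps ?? -1) - (a.tps ?? -1))
    .map((p) => p.slug);
}

// Shape assumed for the gateway provider options.
function buildGatewayOptions(order: string[]) {
  return { gateway: { order } };
}

const providerOptions = buildGatewayOptions(rankByThroughput(providers));
console.log(JSON.stringify(providerOptions));
```

The resulting object would be passed as `providerOptions` next to `model` and `prompt` in the `streamText` call. Ranking by latency or price instead only requires a different comparator.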
Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.
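
For readers who want to reproduce comparable numbers from their own request logs, the sketch below shows one way to compute a P50 and a per-request tokens-per-second figure. How AI Gateway computes its published metrics is not stated here; the choice to measure throughput only after the first token arrives, and the sample values, are assumptions.

```typescript
// Median (P50) of a list of samples; returns NaN for an empty list.
function p50(samples: number[]): number {
  if (samples.length === 0) return NaN;
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Tokens per second for one request, counted from the first token onward.
function tokensPerSecond(
  outputTokens: number,
  firstTokenMs: number,
  doneMs: number,
): number {
  const seconds = (doneMs - firstTokenMs) / 1000;
  return seconds > 0 ? outputTokens / seconds : NaN;
}

// Hypothetical TTFT samples in milliseconds; one slow outlier.
const ttftMs = [180, 220, 210, 950, 190];
console.log(p50(ttftMs)); // 210
console.log(tokensPerSecond(500, 200, 2700)); // 200
```

A median is less sensitive to a single slow request than a mean, which is one reason a P50 is a common headline figure for latency.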

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.

More models by OpenAI

| Model | Context | Latency | Throughput | Input | Output | Cache Read | Web Search (Per Query) | Providers | Release Date |
|---|---|---|---|---|---|---|---|---|---|
| (name not captured) | 1M | 2.5s | 97tps | $5.00/M | $30.00/M | $0.5/M | $10.00/K + input costs | Azure, OpenAI | 04/24/2026 |
| (name not captured) | 400K | 1.4s | 219tps | $0.75/M | $4.50/M | $0.07/M | $10.00/K + input costs | Azure, OpenAI | 03/17/2026 |
| (name not captured) | 400K | 0.7s | 55tps | $0.20/M | $1.25/M | $0.02/M | $10.00/K + input costs | Azure, OpenAI | 03/17/2026 |
| (name not captured) | 1.1M | 0.8s | 57tps | $2.50/M | $15.00/M | $0.25/M | $10.00/K + input costs | Azure, OpenAI | 03/05/2026 |
| (name not captured) | 128K | 1.1s | 76tps | $1.25/M | $10.00/M | $0.13/M | $10.00/K + input costs | Azure, OpenAI | 11/12/2025 |
| (name not captured) | 400K | 3.4s | 98tps | $0.25/M | $2.00/M | $0.03/M | $14/K + input costs | Azure, OpenAI | 08/07/2025 |

What To Consider When Choosing a Provider

  • Open weights: GPT OSS 120B ships with open weights, so you can inspect the model, study its behavior, and deploy it in environments where model transparency is required.
  • Infrastructure: Serving a 120B-parameter model is a substantial undertaking. Through AI Gateway you access it as a managed API without running that infrastructure yourself.
  • Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
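
The authentication bullet above can be illustrated with a small credential resolver. This is a sketch under assumptions: the environment variable names `AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN`, the bearer-token header, and the helper names are not confirmed by this page, so check the AI Gateway documentation before relying on them.

```typescript
// Assumed variable names; confirm them in the AI Gateway documentation.
type Env = Record<string, string | undefined>;
type Credential = { kind: 'api-key' | 'oidc'; token: string };

// Prefer an explicit API key, then fall back to an OIDC token.
function resolveCredential(env: Env): Credential {
  const apiKey = env['AI_GATEWAY_API_KEY'];
  if (apiKey) return { kind: 'api-key', token: apiKey };
  const oidc = env['VERCEL_OIDC_TOKEN'];
  if (oidc) return { kind: 'oidc', token: oidc };
  throw new Error('No AI Gateway credential found');
}

// In this sketch either credential is sent as a bearer token (assumption).
function authHeader(cred: Credential): Record<string, string> {
  return { Authorization: `Bearer ${cred.token}` };
}

const cred = resolveCredential({ AI_GATEWAY_API_KEY: 'example-key' });
console.log(cred.kind, authHeader(cred).Authorization);
```

Note that no provider-specific key appears anywhere: the application holds one gateway credential regardless of which provider serves the request.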

When to Use GPT OSS 120B

Best For

  • Transparency-required deployments: Applications where model weights must be inspectable or auditable
  • Research and experimentation: Teams studying large language model behavior with full access to architecture
  • Open-source strategy: Organizations committed to open-source AI infrastructure
  • General-purpose tasks: Chat, content generation, analysis, and coding at a capable model scale

Consider Alternatives When

  • Maximum proprietary capability: GPT-5 or GPT-5.2 for higher closed-source capability
  • Smaller open-source: GPT OSS 20B for lighter-weight open-source deployments
  • Cost optimization: Smaller models for tasks that don't require 120B parameter scale
  • Specialized tasks: Codex models for coding, o-series for reasoning
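
When weighing cost, a rough per-request estimate follows directly from per-million-token prices. The sketch below uses input and output prices listed for three providers on this page; the `estimateCostUsd` helper and the token counts are illustrative, prices may change, and cache or other charges are ignored.

```typescript
// USD per million tokens.
type Price = { inputPerM: number; outputPerM: number };

// Prices as listed on this page at the time of writing; they may change.
const prices = {
  baseten: { inputPerM: 0.1, outputPerM: 0.5 },
  parasail: { inputPerM: 0.1, outputPerM: 0.75 },
  cerebras: { inputPerM: 0.35, outputPerM: 0.75 },
};

// Hypothetical helper: cost of one workload, ignoring cache pricing.
function estimateCostUsd(
  price: Price,
  inputTokens: number,
  outputTokens: number,
): number {
  return (
    (inputTokens * price.inputPerM + outputTokens * price.outputPerM) /
    1_000_000
  );
}

// 2M input tokens and 1M output tokens on Baseten.
console.log(estimateCostUsd(prices.baseten, 2_000_000, 1_000_000).toFixed(2)); // "0.70"
```

Running the same workload through each entry makes the trade-off concrete: a higher-priced provider may still be the better choice when throughput matters more than cost.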

Conclusion

GPT OSS 120B combines substantial language model capability with the openness of public weights. Available through AI Gateway as a managed API, it serves teams that need both capable AI and model transparency.

Frequently Asked Questions

  • What does 'open-source' mean for GPT OSS 120B?

    The model weights are publicly available. You can inspect and audit the model while using it through managed APIs.

  • How does GPT OSS 120B compare to GPT-5?

    GPT-5 is OpenAI's closed-source general-purpose model with higher capability. GPT OSS 120B offers strong general-purpose performance with the advantage of open weights and model transparency.

  • What context window does GPT OSS 120B support?

    Up to 131.1K tokens, providing substantial capacity for document processing and extended conversations. The limit varies by provider: Amazon Bedrock, Fireworks, and Together AI list 128K.

  • How does AI Gateway handle authentication for GPT OSS 120B?

    AI Gateway accepts a single API key or OIDC token for all requests. You don't embed provider credentials in your application; AI Gateway routes and authenticates on your behalf.

  • What are typical latency characteristics?

    They vary by provider. This page shows live P50 throughput (tokens per second) and P50 time to first token for each provider, measured on real AI Gateway traffic.