GPT OSS 20B
GPT OSS 20B is OpenAI's open-weight 20-billion-parameter language model, providing a lightweight yet capable open-weights option suitable for cost-efficient deployment.
import { streamText } from 'ai'

const result = streamText({
  model: 'openai/gpt-oss-20b',
  prompt: 'Why is the sky blue?',
})

What To Consider When Choosing a Provider
Zero Data Retention
AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
Authentication
AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
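A minimal configuration sketch; the variable name below is assumed from the AI SDK's gateway defaults, so check your gateway's documentation for the exact name:

```shell
# .env.local — one key authenticates all models routed through AI Gateway;
# no provider-specific (e.g. OpenAI) credentials are stored in the app.
AI_GATEWAY_API_KEY=your-gateway-api-key
```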
At 20B parameters, GPT OSS 20B is far more practical to deploy on standard infrastructure than the 120B variant, offering a good balance of capability and resource requirements.
Through AI Gateway you can use it immediately without provisioning GPU infrastructure.
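To make the deployment difference concrete, here is a rough back-of-envelope sketch; the helper function and the quantization figures are illustrative assumptions, not official sizing guidance:

```typescript
// Back-of-envelope memory needed just to hold model weights:
// parameter count times bytes per parameter under a given quantization.
// Real serving adds KV cache and activation memory on top of this.
function estimateWeightMemoryGB(paramsBillions: number, bytesPerParam: number): number {
  return (paramsBillions * 1e9 * bytesPerParam) / 1024 ** 3
}

// ~20B parameters at 4-bit quantization (0.5 bytes/param) fits comfortably
// on a single high-memory GPU; the 120B variant at the same precision
// needs several times more.
console.log(estimateWeightMemoryGB(20, 0.5).toFixed(1) + ' GB')  // ≈ 9.3 GB
console.log(estimateWeightMemoryGB(120, 0.5).toFixed(1) + ' GB') // ≈ 55.9 GB
```

Using AI Gateway sidesteps this sizing exercise entirely, but it shows why the 20B model is the practical choice for self-hosting.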
When to Use GPT OSS 20B
Best For
Lightweight open-source deployment:
Open-weight models with reasonable infrastructure requirements
Cost-efficient open-source:
Applications that need open-weight transparency at lower compute cost
Edge deployment research:
Exploring deployment of capable models in resource-constrained environments
General-purpose tasks:
Chat, summarization, and content generation where 20B scale is sufficient
Consider Alternatives When
Higher capability needed:
gpt-oss-120b for stronger open-source performance
Maximum quality:
GPT-5 or GPT-5.2 for higher capability from closed-source models
Specialized tasks:
Codex models for coding, o-series for reasoning
Smallest possible model:
GPT-5 nano or GPT-4.1 nano for minimal-cost inference
Conclusion
GPT OSS 20B provides a practical entry point to open-weight language models from OpenAI, balancing capability with efficiency. Available through AI Gateway, it serves teams that need open weights without the infrastructure demands of larger models.
FAQ
How does GPT OSS 20B compare to GPT OSS 120B?
It's more compact (20B vs 120B parameters), making it cheaper to run and easier to self-host, with correspondingly lower capability on complex tasks.
What is GPT OSS 20B best suited for?
Chat, content generation, summarization, analysis, and other general-purpose language tasks where 20B-parameter scale provides sufficient quality.
What is the context window for GPT OSS 20B?
131.1K tokens.
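Prompt and output tokens share that window, so a request's output budget shrinks as the prompt grows. A minimal sketch; the helper and its reserve parameter are hypothetical, assuming the 131.1K figure corresponds to 131,072 tokens:

```typescript
// Context window shared by prompt and output tokens
// (assumed to be 131,072 for this model).
const CONTEXT_WINDOW = 131072

// Hypothetical helper: tokens left for output after the prompt,
// optionally reserving space (e.g. for a system message).
function remainingOutputTokens(promptTokens: number, reservedTokens = 0): number {
  return Math.max(0, CONTEXT_WINDOW - promptTokens - reservedTokens)
}

console.log(remainingOutputTokens(4000))   // 127072 tokens left for output
console.log(remainingOutputTokens(131072)) // 0 — the prompt fills the window
```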
How does authentication work?
AI Gateway accepts a single API key or OIDC token for all requests. You don't embed OpenAI credentials in your application; AI Gateway routes and authenticates on your behalf.
How are the performance metrics measured?
This page shows live throughput and time-to-first-token metrics measured across real AI Gateway traffic.