GPT OSS 20B
GPT OSS 20B is OpenAI's smaller open-weight model with roughly 21 billion total parameters and 3.6 billion active per token, designed for low-latency, agentic, and on-device workloads.
import { streamText } from 'ai'

const result = streamText({
  model: 'openai/gpt-oss-20b',
  prompt: 'Why is the sky blue?',
})

Frequently Asked Questions
How does GPT OSS 20B compare to gpt-oss-120b?
Both share the same MoE architecture, a 131.1K-token context window, and the Apache 2.0 license. GPT OSS 20B activates fewer parameters per token (about 3.6 billion versus 5.1 billion), runs on a 16 GB device, and matches o3-mini on common benchmarks. gpt-oss-120b approaches o4-mini on harder reasoning tasks but costs more to serve.
What hardware is GPT OSS 20B designed for?
OpenAI designed it to run on a single device with 16 GB of memory, including consumer-grade hardware. Weights ship natively quantized in MXFP4. When used through AI Gateway, you don't manage hardware: requests route to one of the available providers (bedrock, fireworks, groq, deepinfra, togetherai, novita, or parasail).
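As a rough sanity check on the 16 GB figure, the weight footprint can be estimated from the parameter count and the MXFP4 storage cost. The sketch below assumes every one of the roughly 21 billion parameters is stored in MXFP4 (4-bit elements plus one shared 8-bit scale per block of 32 elements, i.e. 4.25 bits per parameter). That is a simplification: a real checkpoint may keep some tensors at higher precision, and activations and the KV cache need memory on top of the weights. The function name mxfp4WeightBytes is hypothetical.

```typescript
// Back-of-the-envelope estimate, ASSUMING all parameters are stored in MXFP4.
const BITS_PER_ELEMENT = 4; // one FP4 element
const SCALE_BITS = 8;       // one shared scale per block
const BLOCK_SIZE = 32;      // elements per MXFP4 block

// Hypothetical helper: bytes needed for `paramCount` MXFP4-encoded parameters.
function mxfp4WeightBytes(paramCount: number): number {
  const bitsPerParam = BITS_PER_ELEMENT + SCALE_BITS / BLOCK_SIZE; // 4.25 bits
  return (paramCount * bitsPerParam) / 8;
}

const totalParams = 21e9; // "roughly 21 billion total parameters"
const weightGB = mxfp4WeightBytes(totalParams) / 1e9;

console.log(`${weightGB.toFixed(1)} GB of weights`); // about 11.2 GB
```

Under these assumptions the weights alone take about 11.2 GB, which leaves some headroom within 16 GB for runtime state.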
Does GPT OSS 20B support tool calling and structured outputs?
Yes. GPT OSS 20B supports native function calling, structured outputs, and long chain-of-thought reasoning. You can also select a reasoning level (low, medium, or high) per request, similar to the o-series.
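To make the shape of such a request concrete, here is a minimal sketch of a Chat Completions-style request body that declares one tool and a reasoning level. It only builds the JSON payload and sends nothing. The tool get_weather and the helper buildChatRequest are hypothetical, and the reasoning_effort field follows OpenAI's Chat Completions convention (low, medium, high); whether AI Gateway forwards that exact field for this model is an assumption to verify against the gateway documentation.

```typescript
type ReasoningEffort = 'low' | 'medium' | 'high';

interface ToolDefinition {
  type: 'function';
  function: {
    name: string;
    description: string;
    parameters: Record<string, unknown>; // JSON Schema describing the arguments
  };
}

interface ChatRequest {
  model: string;
  messages: { role: 'system' | 'user' | 'assistant'; content: string }[];
  tools: ToolDefinition[];
  reasoning_effort: ReasoningEffort; // assumed field name, see lead-in
}

// Hypothetical tool, used only for illustration.
const getWeather: ToolDefinition = {
  type: 'function',
  function: {
    name: 'get_weather',
    description: 'Look up the current weather for a city',
    parameters: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city'],
    },
  },
};

// Hypothetical helper: assembles the request body without sending it.
function buildChatRequest(prompt: string, effort: ReasoningEffort): ChatRequest {
  return {
    model: 'openai/gpt-oss-20b',
    messages: [{ role: 'user', content: prompt }],
    tools: [getWeather],
    reasoning_effort: effort,
  };
}

const body = buildChatRequest('What is the weather in Oslo?', 'low');
console.log(JSON.stringify(body, null, 2));
```

Keeping the payload construction in a pure function makes it easy to inspect or unit-test the request before wiring it to an HTTP client.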
What license is GPT OSS 20B released under?
Apache 2.0. You can inspect, deploy, fine-tune, and redistribute the weights without OpenAI licensing restrictions. Fine-tuning happens outside AI Gateway since AI Gateway serves models as a managed API.
What context window does GPT OSS 20B support?
131.1K tokens, sufficient for long documents, multi-turn agentic sessions, and transcript-heavy workloads.
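For long-running sessions it can help to check a request against the window before sending it. The sketch below assumes that "131.1K" is a rounding of 131,072 (2^17) tokens and that prompt and generated tokens share the same window; both are assumptions, and the helper names fitsContext and maxOutputBudget are hypothetical. Token counts are taken as inputs, since counting them depends on the tokenizer.

```typescript
// ASSUMPTION: "131.1K" is a rounding of 131,072 (2^17) tokens.
const CONTEXT_WINDOW = 131_072;

// Hypothetical helper: true if prompt plus requested output fits in the window.
function fitsContext(
  promptTokens: number,
  maxOutputTokens: number,
  windowSize: number = CONTEXT_WINDOW,
): boolean {
  return promptTokens + maxOutputTokens <= windowSize;
}

// Hypothetical helper: tokens left for generation after the prompt.
function maxOutputBudget(
  promptTokens: number,
  windowSize: number = CONTEXT_WINDOW,
): number {
  return Math.max(0, windowSize - promptTokens);
}

console.log(fitsContext(120_000, 8_000)); // 128,000 tokens requested in total
console.log(maxOutputBudget(130_000));    // room left after a 130,000-token prompt
```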
Can I call GPT OSS 20B through the AI SDK?
Yes. GPT OSS 20B is available through AI Gateway via the AI SDK as well as Chat Completions, Responses, and Messages-compatible API formats. Use openai/gpt-oss-20b as the model identifier.
Is Zero Data Retention available for GPT OSS 20B?
Yes. Zero Data Retention is available for this model and is offered on a per-provider basis. See https://vercel.com/docs/ai-gateway/capabilities/zdr for details.