GPT-5 nano
GPT-5 nano is the fastest and most affordable model in the GPT-5 family, designed for high-throughput, low-latency tasks like classification, routing, autocomplete, and lightweight inference at scale.
import { streamText } from 'ai'
const result = streamText({ model: 'openai/gpt-5-nano', prompt: 'Why is the sky blue?' })
What To Consider When Choosing a Provider
Zero Data Retention
AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
Authentication
AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
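The credential choice described above can be sketched as a small helper. This is a minimal illustration, not the gateway's own logic: the environment variable names `AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN`, the helper names, and the `Bearer` header scheme are assumptions here, so confirm them against the AI Gateway documentation before relying on them.

```typescript
// Sketch: pick the credential to send to AI Gateway.
// ASSUMPTION: the env var names and the Bearer scheme are illustrative.
type Env = Record<string, string | undefined>;

interface GatewayCredential {
  kind: 'api-key' | 'oidc';
  token: string;
}

// Prefer an explicit API key; fall back to an OIDC token if one is present.
function resolveGatewayCredential(env: Env): GatewayCredential {
  const apiKey = (env['AI_GATEWAY_API_KEY'] ?? '').trim();
  if (apiKey !== '') {
    return { kind: 'api-key', token: apiKey };
  }
  const oidcToken = (env['VERCEL_OIDC_TOKEN'] ?? '').trim();
  if (oidcToken !== '') {
    return { kind: 'oidc', token: oidcToken };
  }
  throw new Error('No AI Gateway credential found: set an API key or provide an OIDC token.');
}

// Build the request header from whichever credential was resolved.
function authHeaders(credential: GatewayCredential): Record<string, string> {
  return { Authorization: `Bearer ${credential.token}` };
}
```

In a Node.js process you would typically call `resolveGatewayCredential(process.env)` once at startup. Note that no OpenAI key appears anywhere in the helper, which matches the point that provider credentials are not managed directly.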
GPT-5 nano prioritizes throughput and latency over reasoning depth. It's the right choice when you need fast answers to simple questions at minimal cost.
At its price point, GPT-5 nano is practical as a classifier, router, or preprocessor that runs on every request, deciding which downstream model or action to invoke.
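As a rough sketch of the router pattern described above, the snippet below maps a label produced by a cheap classifier call to a downstream model ID. The intent labels, the `routeRequest` and `modelForLabel` helpers, and the `openai/gpt-5-mini` and `openai/gpt-5` IDs are hypothetical, chosen only to mirror the `openai/gpt-5-nano` ID shown earlier. The classifier is passed in as a function so the sketch stays independent of any particular SDK call.

```typescript
// Hypothetical routing table: label from the nano-tier classifier -> downstream model.
// ASSUMPTION: labels and model IDs other than 'openai/gpt-5-nano' are illustrative.
const ROUTES = new Map<string, string>([
  ['simple', 'openai/gpt-5-nano'],
  ['analysis', 'openai/gpt-5-mini'],
  ['complex', 'openai/gpt-5'],
]);

// Used when the classifier returns a label the table does not know.
const FALLBACK_MODEL = 'openai/gpt-5-mini';

// Normalize the raw label and look it up, falling back on unknown labels.
function modelForLabel(rawLabel: string): string {
  const label = rawLabel.trim().toLowerCase();
  return ROUTES.get(label) ?? FALLBACK_MODEL;
}

// The classifier is injected; in practice it would wrap a GPT-5 nano call.
type Classifier = (input: string) => Promise<string>;

async function routeRequest(input: string, classify: Classifier): Promise<string> {
  const label = await classify(input);
  return modelForLabel(label);
}
```

Keeping the lookup in a pure function (`modelForLabel`) makes the routing decision easy to unit test without calling any model, and the explicit fallback avoids failing a request when the classifier returns an unexpected label.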
When to Use GPT-5 nano
Best For
Real-time classification:
Sentiment analysis, intent detection, and topic labeling at high request volume
Routing and triage:
Deciding which model or workflow handles each incoming request
Autocomplete and suggestions:
Sub-second inline suggestions in editors and search interfaces
Lightweight extraction:
Pulling specific fields from structured or semi-structured text
Cost-sensitive batch processing:
Millions of simple inferences at minimal aggregate cost
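For the classification use case listed above, one common approach is to constrain the model to a fixed label set and validate its reply before trusting it. The sketch below assumes a three-label sentiment task. The label set, prompt wording, and helper names are hypothetical, and the returned prompt string would be passed as the `prompt` in a call like the `streamText` example near the top of this page.

```typescript
// Hypothetical label set for a sentiment task.
const LABELS = ['positive', 'negative', 'neutral'] as const;
type Label = (typeof LABELS)[number];

// Build a short prompt that asks for exactly one label and nothing else.
function buildClassificationPrompt(text: string): string {
  return [
    `Classify the sentiment of the text as one of: ${LABELS.join(', ')}.`,
    'Reply with the label only.',
    '',
    `Text: ${text}`,
  ].join('\n');
}

// Validate the model's reply: strip whitespace, case, and punctuation,
// then accept it only if it is exactly one of the known labels.
function parseLabel(reply: string): Label | null {
  const cleaned = reply.trim().toLowerCase().replace(/[^a-z]/g, '');
  const match = LABELS.find((label) => label === cleaned);
  return match ?? null;
}
```

Returning `null` for anything outside the label set lets the caller decide whether to retry, fall back to a default, or escalate to a larger model, which matters at high request volume where a small fraction of malformed replies is expected.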
Consider Alternatives When
Complex reasoning needed:
GPT-5 mini or GPT-5 for tasks requiring multi-step analysis
Code generation:
Codex mini or GPT-5 codex for coding-specific tasks
Deep deliberation:
o3 or o4-mini for problems that benefit from chain-of-thought reasoning
Rich multimodal analysis:
Full GPT-5 for complex vision and document understanding tasks
Conclusion
GPT-5 nano brings GPT-5 family improvements to the fastest and most affordable tier, making it the right choice for classification, routing, and high-throughput lightweight tasks through AI Gateway.
FAQ
GPT-5 nano is best suited to classification, routing, autocomplete, lightweight extraction, and any high-volume workload where speed and cost outweigh the need for deep reasoning.
GPT-5 nano is the next generation, inheriting GPT-5 family improvements in quality and instruction following while maintaining the speed and cost profile expected of a nano-tier model.
The context window is 400K tokens, which is substantial for a model at this price and speed tier.
It can read long inputs within its 400K-token context window, but it's optimized for short outputs. For detailed analysis of long documents, consider GPT-5 mini or GPT-5.
AI Gateway accepts a single API key or OIDC token for all requests. You don't embed OpenAI credentials in your application; AI Gateway routes and authenticates on your behalf.
This page shows live throughput and time-to-first-token metrics measured across real AI Gateway traffic.