GPT-5 nano
GPT-5 nano is the fastest and most affordable model in the GPT-5 family, designed for high-throughput, low-latency tasks like classification, routing, autocomplete, and lightweight inference at scale.
```ts
import { streamText } from 'ai';

const result = streamText({
  model: 'openai/gpt-5-nano',
  prompt: 'Why is the sky blue?',
});

// Print the response as it streams in.
for await (const textPart of result.textStream) {
  process.stdout.write(textPart);
}
```

Frequently Asked Questions
What tasks is GPT-5 nano designed for?
Classification, routing, autocomplete, lightweight extraction, and any high-volume workload where speed and cost outweigh the need for deep reasoning.
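The routing workload mentioned above can be sketched as a cheap first-pass classification whose label picks the model tier for the real request. The labels, tier mapping, and `routeModel` helper below are illustrative assumptions, not part of any SDK:

```typescript
// Illustrative sketch: route a request to a model tier based on a
// label that a fast classifier (e.g. GPT-5 nano) might produce.
// The labels and the tier mapping are assumptions, not an SDK API.
type TaskLabel = 'classification' | 'autocomplete' | 'deep-analysis';

function routeModel(label: TaskLabel): string {
  switch (label) {
    case 'classification':
    case 'autocomplete':
      // High-volume, latency-sensitive work stays on the nano tier.
      return 'openai/gpt-5-nano';
    case 'deep-analysis':
      // Escalate work that needs deeper reasoning.
      return 'openai/gpt-5';
  }
}

console.log(routeModel('autocomplete')); // → openai/gpt-5-nano
```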
How does GPT-5 nano compare to GPT-4.1 nano?
GPT-5 nano is the next generation, inheriting GPT-5 family improvements in quality and instruction following while maintaining the speed and cost profile expected of a nano-tier model.
What context window does GPT-5 nano support?
400K tokens, which is substantial for a model at this price and speed tier.
Can GPT-5 nano handle long documents?
It can read long inputs within its 400K-token window, but it's optimized for short outputs. For detailed analysis of long documents, consider GPT-5 mini or GPT-5.
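One way to stay inside that window is to pre-chunk long inputs before sending them. A minimal sketch follows; the ~4-characters-per-token heuristic and the budget numbers are rough assumptions, not a real tokenizer:

```typescript
// Rough sketch: split a long document into pieces that fit a token
// budget, using the common ~4 characters/token heuristic (an
// approximation only; a real tokenizer gives exact counts).
const CHARS_PER_TOKEN = 4;

function chunkForWindow(text: string, maxTokens: number): string[] {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}

// By this heuristic, a 400K-token window is very roughly 1.6M characters.
const pieces = chunkForWindow('x'.repeat(10_000), 1_000); // 4,000 chars per chunk
console.log(pieces.length); // → 3
```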
How does AI Gateway handle authentication for GPT-5 nano?
AI Gateway accepts a single API key or OIDC token for all requests. You don't embed OpenAI credentials in your application; AI Gateway routes and authenticates on your behalf.
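A sketch of what that looks like in application code: the only credential the process needs is the single gateway key. The `AI_GATEWAY_API_KEY` variable name and the `gatewayHeaders` helper are assumptions for illustration; check your gateway's docs for the actual configuration:

```typescript
// Sketch: build auth headers from a single gateway credential.
// No provider-specific (e.g. OpenAI) key ever appears in the app.
// The env var name below is an assumption for illustration.
function gatewayHeaders(env: Record<string, string | undefined>) {
  const key = env.AI_GATEWAY_API_KEY;
  if (!key) throw new Error('AI_GATEWAY_API_KEY is not set');
  return { Authorization: `Bearer ${key}` };
}

console.log(gatewayHeaders({ AI_GATEWAY_API_KEY: 'gw_test' }));
// → { Authorization: 'Bearer gw_test' }
```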
What are typical latency characteristics?
Latency depends on prompt length, output length, and current load, so fixed numbers can mislead; as a nano-tier model, GPT-5 nano is tuned for low time-to-first-token and high throughput. This page shows live throughput and time-to-first-token metrics measured across real AI Gateway traffic.