GPT-5 nano

GPT-5 nano is the fastest and most affordable model in the GPT-5 family, designed for high-throughput, low-latency tasks like classification, routing, autocomplete, and lightweight inference at scale.

File Input · Reasoning · Tool Use · Vision (Image) · Image Gen · Implicit Caching
index.ts
import { streamText } from 'ai';

const result = streamText({
  model: 'openai/gpt-5-nano',
  prompt: 'Why is the sky blue?',
});

for await (const text of result.textStream) {
  process.stdout.write(text);
}

Frequently Asked Questions

  • What tasks is GPT-5 nano designed for?

    Classification, routing, autocomplete, lightweight extraction, and any high-volume workload where speed and cost outweigh the need for deep reasoning.

  • How does GPT-5 nano compare to GPT-4.1 nano?

    GPT-5 nano is the next generation, inheriting GPT-5 family improvements in quality and instruction following while maintaining the speed and cost profile expected of a nano-tier model.

  • What context window does GPT-5 nano support?

    400K tokens, which is substantial for a model at this price and speed tier.

  • Can GPT-5 nano handle long documents?

    It can read long inputs within its 400K-token window, but it's optimized for short outputs. For detailed analysis of long documents, consider GPT-5 mini or GPT-5.

  • How does AI Gateway handle authentication for GPT-5 nano?

    AI Gateway accepts a single API key or OIDC token for all requests. You don't embed OpenAI credentials in your application; AI Gateway routes and authenticates on your behalf.

  • What are typical latency characteristics?

    This page shows live throughput and time-to-first-token metrics measured across real AI Gateway traffic.
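The classification and routing tasks described above can be sketched with the AI SDK's `generateText`. The label set, prompt wording, and helper names (`LABELS`, `toLabel`, `classifyTicket`) are illustrative, not part of any API:

```typescript
import { generateText } from 'ai';

// Hypothetical label set for a support-ticket router.
const LABELS = ['billing', 'technical', 'other'] as const;
type Label = (typeof LABELS)[number];

// Normalize a raw completion to one of the known labels,
// falling back to 'other' for anything unexpected.
function toLabel(raw: string): Label {
  const cleaned = raw.trim().toLowerCase();
  return (LABELS as readonly string[]).includes(cleaned)
    ? (cleaned as Label)
    : 'other';
}

// Classify a ticket with a single short completion -- the kind of
// high-volume, low-latency call a nano-tier model is suited for.
async function classifyTicket(ticket: string): Promise<Label> {
  const { text } = await generateText({
    model: 'openai/gpt-5-nano',
    prompt:
      `Classify this support ticket as one of: ${LABELS.join(', ')}. ` +
      `Reply with the label only.\n\nTicket: ${ticket}`,
  });
  return toLabel(text);
}
```

Constraining the model to reply with the label only keeps outputs short, which is where this model tier is strongest.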
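For the long-document case above, one common pattern is to split the input so each request produces a short output per piece. A minimal sketch, using character counts as a rough stand-in for real token counting (a production version would use a tokenizer):

```typescript
// Split a long document into roughly even character chunks so each
// request stays well inside the context window. Chunk size is
// illustrative; tune it to your token budget.
function chunkDocument(text: string, maxChars = 200_000): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}
```

Each chunk can then be summarized or classified independently, with the short per-chunk results combined afterward.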
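The single-key authentication model described above can be sketched with the `@ai-sdk/gateway` provider, assuming an `AI_GATEWAY_API_KEY` environment variable holds your gateway key (passing it explicitly is shown for clarity; no OpenAI credentials appear anywhere):

```typescript
import { createGateway } from '@ai-sdk/gateway';
import { streamText } from 'ai';

// One key for every provider: AI Gateway authenticates with
// downstream providers on your behalf.
const gateway = createGateway({
  apiKey: process.env.AI_GATEWAY_API_KEY,
});

const result = streamText({
  model: gateway('openai/gpt-5-nano'),
  prompt: 'Why is the sky blue?',
});
```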