GPT-5 nano was released on August 7, 2025 as the entry-level tier of the GPT-5 model family. It's optimized for the highest throughput and lowest latency in the family, targeting workloads where speed and cost matter more than reasoning depth.
Despite being the smallest GPT-5 variant, GPT-5 nano benefits from the family's architectural improvements. It handles classification, routing, extraction, and simple generation tasks, with quality a generation ahead of GPT-4.1 nano, the previous model at this tier. Its 400K-token context window is large for a model at this tier, so it can process long inputs even when the output is short, such as a single label or a few extracted fields.
The model is designed to serve as a building block in larger systems: classifying incoming requests, routing them to appropriate handlers, extracting key fields from documents, and providing instant responses for simple queries. Its cost is low enough to make per-request inference viable even for the highest-traffic applications.
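
As a rough illustration of the classify-then-route role described above, the following Python sketch treats the model as an injected function. Everything named here is an assumption made for the example: the label set, the prompt wording, the `complete` callable (a stand-in for whatever client call actually sends a prompt to GPT-5 nano and returns its text), and the handlers. It is a minimal sketch of the pattern, not a description of any official API.

```python
from typing import Callable, Dict

# Hypothetical label set for illustration; a real system defines its own.
LABELS = ("billing", "technical", "other")


def build_prompt(message: str) -> str:
    """Build a short classification prompt that asks for a single label."""
    return (
        "Classify the message into exactly one label from: "
        + ", ".join(LABELS)
        + ".\nReply with the label only.\n\nMessage: "
        + message
    )


def classify(message: str, complete: Callable[[str], str]) -> str:
    """Ask the model for a label and normalize its reply.

    `complete` is an assumed wrapper around the model call. Any reply that
    is not a known label falls back to "other", so routing never fails on
    unexpected model output.
    """
    raw = complete(build_prompt(message)).strip().lower()
    return raw if raw in LABELS else "other"


def route(
    message: str,
    complete: Callable[[str], str],
    handlers: Dict[str, Callable[[str], str]],
) -> str:
    """Classify the message, then dispatch it to the matching handler."""
    return handlers[classify(message, complete)](message)


def fake_complete(prompt: str) -> str:
    """Offline stand-in for the model so the sketch runs without a network.

    It looks only at the text after "Message: " and returns deliberately
    messy replies (capitalized, trailing newline) to exercise normalization.
    """
    message = prompt.rsplit("Message: ", 1)[-1].lower()
    if "invoice" in message:
        return "Billing"
    if "crash" in message:
        return "technical\n"
    return "other"


# Hypothetical handlers; each one just tags the message with a queue name.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "billing": lambda m: f"billing-queue: {m}",
    "technical": lambda m: f"tech-queue: {m}",
    "other": lambda m: f"general-queue: {m}",
}

if __name__ == "__main__":
    print(route("Where is my invoice?", fake_complete, HANDLERS))
    print(route("App crashes on login", fake_complete, HANDLERS))
```

Passing the model call in as a parameter keeps the routing logic testable offline, and the fallback to a default label reflects the fact that a small, fast model's output should be validated before it drives control flow.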