GPT-4.1 mini launched on April 14, 2025 as the middle tier of the GPT-4.1 family. Three advances separate it from its predecessor.
First, the context window expanded from 128K to 1.0M tokens, an 8x increase. An entire codebase, a full conversation history spanning days, or a collection of legal documents all fit in a single request. Combined with the 75% prompt caching discount available across the GPT-4.1 family, long-context workflows that reuse system prompts become very affordable.
Second, instruction following improved materially. OpenAI trained the GPT-4.1 family with a focus on adherence to complex, multi-constraint prompts. For developers building structured pipelines where the model must follow formatting rules, respect output schemas, and handle edge cases in system instructions, this reduces debugging time and increases reliability.
Third, coding capability stepped up. The GPT-4.1 family brought measurable gains on code generation, review, and refactoring benchmarks compared to the GPT-4o generation. GPT-4.1 mini inherits those gains, making it capable enough for code assistance tasks that previously required a full-size model.
The result: GPT-4o-class intelligence at lower cost and nearly half the latency. For most production workloads, GPT-4.1 mini is the right choice.