GPT-4.1 mini launched on April 14, 2025 as the middle tier of the GPT-4.1 family. Three advances separate it from its predecessor.
First, the context window expanded from 128K to 1.0M tokens, an 8x increase. An entire codebase, a full conversation history spanning days, or a collection of legal documents all fit in a single request. Combined with the 75% prompt caching discount available across the GPT-4.1 family, long-context workflows that reuse system prompts become very affordable.
Second, instruction following improved materially. OpenAI trained the GPT-4.1 family with a focus on adherence to complex, multi-constraint prompts. For developers building structured pipelines where the model must follow formatting rules, respect output schemas, and handle edge cases in system instructions, this reduces debugging time and increases reliability.
Third, coding capability stepped up. The GPT-4.1 family brought measurable gains on code generation, review, and refactoring benchmarks compared to the GPT-4o generation. GPT-4.1 mini inherits those gains, making it capable enough for code assistance tasks that previously required a full-size model.
The result: GPT-4o-class intelligence at lower cost and nearly half the latency. For most production workloads, GPT-4.1 mini is the right choice.