GPT-4o mini

GPT-4o mini is OpenAI's cost-efficient multimodal model. At $0.15 per million input tokens it costs less than GPT-3.5 Turbo, yet it outperforms GPT-4 on chat preference benchmarks and supports vision and function calling.

File Input, Tool Use, Vision (Image), Implicit Caching
index.ts
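import { streamText } from 'ai'

const result = streamText({
  model: 'openai/gpt-4o-mini',
  prompt: 'Why is the sky blue?',
})

// Print each text chunk as it arrives.
for await (const chunk of result.textStream) process.stdout.write(chunk)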

About GPT-4o mini

GPT-4o mini launched on July 18, 2024 as OpenAI's cost-efficient model, positioned to replace GPT-3.5 Turbo in cost-sensitive deployments while offering meaningfully higher capability. The pricing stands out: $0.15 per million input tokens and $0.60 per million output tokens, both lower than GPT-3.5 Turbo's rates. It scored 82.0% on MMLU (Massive Multitask Language Understanding), exceeding GPT-3.5 Turbo, and ranked above GPT-4 on the LMSYS Chatbot Arena chat preference leaderboard at release.
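To make the per-token pricing concrete, the following is a minimal sketch of a cost estimate built only from the two prices quoted above. The helper name `estimateCostUsd` and the example token counts are assumptions chosen for illustration; actual billing may differ, for example when cached input tokens are discounted.

```typescript
// Per-million-token prices quoted in this section (USD).
const INPUT_USD_PER_MILLION = 0.15
const OUTPUT_USD_PER_MILLION = 0.6

// Hypothetical helper: estimate the cost of one request from its token counts.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1_000_000) * INPUT_USD_PER_MILLION
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION
  return inputCost + outputCost
}

// Illustrative request: a 2,000-token prompt with a 500-token reply.
const cost = estimateCostUsd(2_000, 500)
console.log(`Estimated cost: $${cost.toFixed(6)}`)
```

At these rates the illustrative request costs roughly $0.0006, so about 1,600 such requests fit in one dollar.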

GPT-4o mini supports vision alongside text, inheriting GPT-4o's multimodal design at the small-model tier. You can run cost-efficient image analysis, document processing, visual classification, and screenshot interpretation without routing to a larger model. Function calling support makes it viable as the reasoning layer in tool-using agents and API-calling pipelines.
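As a rough illustration of a vision request, the sketch below builds a user message that pairs a question with an image. The local types are assumed to mirror the text and image content parts that the AI SDK accepts in its `messages` option; the helper `buildImageQuestion` and the example URL are hypothetical, and the part shapes should be checked against the SDK version in use.

```typescript
// Local types, declared here only so the sketch compiles without the `ai` package.
// They are an assumption about the shape of AI SDK text and image content parts.
type TextPart = { type: 'text'; text: string }
type ImagePart = { type: 'image'; image: URL }
type UserMessage = { role: 'user'; content: Array<TextPart | ImagePart> }

// Hypothetical helper: pair a question with a single image URL.
function buildImageQuestion(question: string, imageUrl: string): UserMessage {
  return {
    role: 'user',
    content: [
      { type: 'text', text: question },
      { type: 'image', image: new URL(imageUrl) },
    ],
  }
}

const message = buildImageQuestion(
  'What error does this screenshot show?',
  'https://example.com/screenshot.png', // placeholder URL
)
console.log(JSON.stringify(message, null, 2))
```

A message built this way would be passed in a `messages` array in place of the `prompt` string used in the text-only example.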

OpenAI highlighted four patterns where GPT-4o mini excels: chaining or parallelizing multiple model calls, passing large volumes of context such as full codebases or conversation histories, serving fast, real-time text responses in customer-facing interfaces, and handling workloads previously blocked by GPT-3.5 Turbo's capability ceiling. Its 128K-token context window gives it substantial headroom for each of these.
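The first pattern, parallelizing calls and then chaining their results, can be sketched as a small map-then-reduce helper. The model call is injected as a plain async function so the example runs offline; `mapThenReduce`, `ModelCall`, and the prompt wording are hypothetical names chosen for this sketch, not part of any SDK.

```typescript
// A model call is abstracted as any async function from prompt to text.
// In a real app this would wrap a request to 'openai/gpt-4o-mini'.
type ModelCall = (prompt: string) => Promise<string>

// Hypothetical helper: summarize chunks in parallel, then merge them in one chained call.
async function mapThenReduce(chunks: string[], callModel: ModelCall): Promise<string> {
  // Fan out: one independent call per chunk, all in flight at once.
  const partials = await Promise.all(
    chunks.map((chunk) => callModel(`Summarize:\n${chunk}`)),
  )
  // Chain: feed the partial results into a final call.
  return callModel(`Combine these summaries:\n${partials.join('\n')}`)
}

// Stub so the sketch runs without network access; it reports the prompt length.
const stubModel: ModelCall = async (prompt) => `[${prompt.length} chars]`

mapThenReduce(['first document', 'second document'], stubModel).then(console.log)
```

Because each per-chunk call is cheap, the total cost of this pattern grows with the number of chunks rather than with a single large-model request; a production version would likely add a concurrency limit and error handling.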