GPT 4o Mini Search Preview

GPT 4o Mini Search Preview combines GPT-4o mini's cost efficiency with built-in web search, grounding responses in real-time information from the web without requiring external search tool integration.

Web Search
index.ts
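import { streamText } from 'ai'

const result = streamText({
  model: 'openai/gpt-4o-mini-search-preview',
  prompt: 'Why is the sky blue?'
})

// Print the response as it streams in.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}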

Playground

Try out GPT 4o Mini Search Preview by OpenAI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

| Provider | Context | Latency | Throughput | Input | Output | Web Search Per Query | Release Date |
| OpenAI (Legal: Terms, Privacy) | 128K | 2.7s | 96tps | $0.15/M | $0.60/M | $10.00/K + input costs | 03/12/2025 |

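
As a rough illustration of how the listed rates combine, the sketch below estimates the cost of a single request. The rates are copied from the provider table on this page; the `Usage` shape and the `estimateCostUsd` helper are hypothetical names introduced for this example, and it assumes that "+ input costs" means retrieved search content is billed as ordinary input tokens.

```typescript
// Rates copied from the provider table on this page (USD).
const INPUT_USD_PER_M_TOKENS = 0.15;
const OUTPUT_USD_PER_M_TOKENS = 0.6;
const SEARCH_USD_PER_K_QUERIES = 10.0;

// Hypothetical usage record for one request.
interface Usage {
  inputTokens: number; // assumed to include any tokens billed as "+ input costs"
  outputTokens: number;
  searchQueries: number;
}

// Sum the three billed components: input tokens, output tokens, search queries.
function estimateCostUsd(usage: Usage): number {
  const input = (usage.inputTokens / 1e6) * INPUT_USD_PER_M_TOKENS;
  const output = (usage.outputTokens / 1e6) * OUTPUT_USD_PER_M_TOKENS;
  const search = (usage.searchQueries / 1e3) * SEARCH_USD_PER_K_QUERIES;
  return input + output + search;
}

console.log(estimateCostUsd({ inputTokens: 2000, outputTokens: 500, searchQueries: 1 }));
```

Under these assumptions, a request with 2,000 input tokens, 500 output tokens, and one search costs roughly $0.0106, with the per-query search fee making up most of the total.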
Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.
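
For readers unfamiliar with these metrics, the sketch below shows one common way to compute a P50 (median) and a success rate from raw samples. How AI Gateway actually aggregates its traffic is not described on this page, so the `p50` and `successRate` helpers and the averaging of the two middle values are assumptions made purely for illustration.

```typescript
// Median of a list of samples (for example TTFT in seconds, or tokens per second).
// For an even number of samples, this sketch averages the two middle values.
function p50(samples: number[]): number {
  if (samples.length === 0) {
    throw new Error("p50 requires at least one sample");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Fraction of requests that succeeded; defined as 0 when there were no requests.
function successRate(succeeded: number, total: number): number {
  return total === 0 ? 0 : succeeded / total;
}

console.log(p50([2.7, 0.6, 1.7]), successRate(99, 100));
```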

More models by OpenAI

| Context | Latency | Throughput | Input | Output | Cache Read | Web Search Per Query | Providers | Release Date |
| 1M | 2.7s | 119tps | $5.00/M | $30.00/M | $0.5/M | $10.00/K + input costs | Azure, OpenAI | 04/24/2026 |
| 400K | 1.7s | 267tps | $0.75/M | $4.50/M | $0.07/M | $10.00/K + input costs | Azure, OpenAI | 03/17/2026 |
| 400K | 0.6s | 125tps | $0.20/M | $1.25/M | $0.02/M | $10.00/K + input costs | Azure, OpenAI | 03/17/2026 |
| 1.1M | 0.6s | 69tps | $2.50/M | $15.00/M | $0.25/M | $10.00/K + input costs | Azure, OpenAI | 03/05/2026 |
| 128K | 0.7s | 97tps | $1.25/M | $10.00/M | $0.13/M | $10.00/K + input costs | Azure, OpenAI | 11/12/2025 |
| 400K | 3.5s | 180tps | $0.25/M | $2.00/M | $0.03/M | $14/K + input costs | Azure, OpenAI | 08/07/2025 |

About GPT 4o Mini Search Preview

GPT 4o Mini Search Preview was introduced on March 12, 2025 as part of OpenAI's expansion of the Responses API with new tools and features. It integrates web search directly into the model's inference pipeline, so the model can retrieve current information from the internet as part of generating a response.

This differs fundamentally from using a separate search API and feeding results into a model's context. With GPT 4o Mini Search Preview, the search happens within the model's reasoning flow: the model decides what to search for, evaluates the results, and synthesizes them into a coherent answer. This produces more naturally grounded responses and reduces the engineering overhead of building and maintaining a search-then-generate pipeline.
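
For contrast, here is a minimal sketch of the kind of search-then-generate pipeline that built-in search replaces. Everything in it is hypothetical: `SearchFn` and `GenerateFn` stand in for whatever search API and model client an application would otherwise wire together, and the prompt template is just one plausible choice.

```typescript
// Hypothetical shapes for an external search API and a text-generation call.
interface SearchResult {
  title: string;
  url: string;
  snippet: string;
}
type SearchFn = (query: string) => Promise<SearchResult[]>;
type GenerateFn = (prompt: string) => Promise<string>;

// Manual pipeline: search first, paste the results into the prompt, then generate.
async function searchThenGenerate(
  question: string,
  search: SearchFn,
  generate: GenerateFn
): Promise<string> {
  const results = await search(question);
  const sources = results
    .map((r, i) => `[${i + 1}] ${r.title} (${r.url})\n${r.snippet}`)
    .join("\n\n");
  const prompt = `Answer using only the sources below.\n\n${sources}\n\nQuestion: ${question}`;
  return generate(prompt);
}
```

In this manual version the application, not the model, chooses the query and decides how results are formatted, which is the glue code the built-in search is meant to remove.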

The model inherits GPT-4o mini's strengths: low cost per token, fast response times, vision support, and function calling. The addition of search makes it particularly valuable for applications where information freshness matters but the engineering investment of a full RAG system isn't justified.

What To Consider When Choosing a Provider

  • Search grounding: GPT 4o Mini Search Preview can query the web mid-response, which means answers about current events, recent releases, or live data are grounded in actual sources rather than training data alone.
  • Cost: Because it builds on the GPT-4o mini architecture, search-augmented responses remain affordable even at scale, making it practical to enable search on every request rather than selectively.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
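
To make the authentication point concrete, the sketch below builds a request that carries only an AI Gateway API key and no OpenAI credential. The endpoint URL, the OpenAI-style request body, and the `buildChatRequest` helper are assumptions for illustration; check the AI Gateway documentation for the exact endpoint and payload before relying on them.

```typescript
// Assumed OpenAI-compatible endpoint; verify against the AI Gateway docs.
const GATEWAY_URL = "https://ai-gateway.vercel.sh/v1/chat/completions";

interface GatewayRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build a request authenticated with a single AI Gateway key (hypothetical helper).
function buildChatRequest(apiKey: string, prompt: string): GatewayRequest {
  return {
    url: GATEWAY_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`, // gateway key, not a provider key
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: "openai/gpt-4o-mini-search-preview",
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

The returned `url` and `init` can be passed straight to `fetch(req.url, req.init)`.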

When to Use GPT 4o Mini Search Preview

Best For

  • Real-time information retrieval: Answering questions about current events, recent product releases, or live data
  • Research assistants: Gathering and synthesizing information from multiple web sources into coherent summaries
  • Customer support bots: Providing answers grounded in current documentation, pricing, or availability
  • Fact-checking pipelines: Verifying claims against live web sources before presenting to users
  • News and content aggregation: Pulling together information from across the web into structured responses

Consider Alternatives When

  • No web access needed: Standard GPT-4o mini is faster and cheaper when responses don't require real-time information
  • Maximum reasoning depth: Full GPT-4o or GPT-4.1 provide stronger analytical capability alongside search
  • Custom search sources: If you need to search specific databases or internal knowledge bases, a RAG pipeline with a standard model gives more control
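
The guidance above can be condensed into a small decision helper. This is only a sketch: the `Needs` flags, the `recommend` function, and the order in which the checks are applied are assumptions for illustration, not rules stated by AI Gateway or OpenAI.

```typescript
// Hypothetical description of what an application needs from the model.
interface Needs {
  webAccess: boolean; // answers must reflect current web information
  deepReasoning: boolean; // maximum analytical depth is required
  customSources: boolean; // must search private databases or knowledge bases
}

type Recommendation =
  | "gpt-4o-mini-search-preview"
  | "gpt-4o-mini"
  | "gpt-4o-or-gpt-4.1"
  | "rag-with-standard-model";

// Apply the "Consider Alternatives When" bullets in one possible priority order.
function recommend(needs: Needs): Recommendation {
  if (needs.customSources) return "rag-with-standard-model";
  if (needs.deepReasoning) return "gpt-4o-or-gpt-4.1";
  if (!needs.webAccess) return "gpt-4o-mini";
  return "gpt-4o-mini-search-preview";
}

console.log(recommend({ webAccess: true, deepReasoning: false, customSources: false }));
```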

Conclusion

GPT 4o Mini Search Preview adds real-time web grounding to the most cost-efficient model in the GPT-4o family. For applications where answers must reflect current information without the complexity of a custom search integration, it provides a streamlined solution through AI Gateway.

Frequently Asked Questions

  • How does the built-in search differ from using a separate search API?

    The model decides what to search for, when to search, and how to synthesize results as part of its reasoning flow. This produces more naturally integrated answers than prepending search results to a prompt.

  • Does GPT 4o Mini Search Preview always search the web?

    No. The model determines whether a web search would improve its response. For questions answerable from training data alone, it may skip the search step.

  • What is the pricing for GPT 4o Mini Search Preview?

    Current pricing is shown on this page. AI Gateway routes across providers, and rates may vary by provider.

  • Can I use GPT 4o Mini Search Preview for real-time customer support?

    Yes. It can ground answers in current documentation, pricing pages, and product information from the web, making support responses more accurate and up-to-date.

  • How does AI Gateway handle authentication for GPT 4o Mini Search Preview?

    AI Gateway accepts a single API key or OIDC token for all requests. You don't embed OpenAI credentials in your application; AI Gateway routes and authenticates on your behalf.

  • What context window does GPT 4o Mini Search Preview support?

    GPT 4o Mini Search Preview supports a context window of 128K tokens, consistent with the GPT-4o mini family.

  • What are typical latency characteristics?

    This page shows live throughput and time-to-first-token metrics measured across real AI Gateway traffic. Search-augmented responses may take slightly longer due to web retrieval.
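
To check these characteristics for your own workload, time to first token can be measured on the client. The sketch below times any async iterable of text chunks; `measureStream` and its injectable clock are hypothetical helpers, and it assumes the stream you pass in (for example the `textStream` from the AI SDK's `streamText`) yields string chunks.

```typescript
interface StreamTiming {
  ttftMs: number | null; // null when the stream produced no chunks
  totalMs: number;
  text: string;
}

// Consume a chunk stream and record when the first chunk arrived.
// The clock is injectable so the timing logic can be tested deterministically.
async function measureStream(
  chunks: AsyncIterable<string>,
  now: () => number = () => Date.now()
): Promise<StreamTiming> {
  const start = now();
  let firstChunkAt: number | null = null;
  let text = "";
  for await (const chunk of chunks) {
    if (firstChunkAt === null) firstChunkAt = now();
    text += chunk;
  }
  const end = now();
  return {
    ttftMs: firstChunkAt === null ? null : firstChunkAt - start,
    totalMs: end - start,
    text,
  };
}
```

Client-side numbers include network time between your application and AI Gateway, so they will generally differ from the P50 values reported on this page.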