GPT-4o

GPT-4o is OpenAI's first natively multimodal "omni" model, unifying text, audio, image, and video processing within a single end-to-end trained architecture and delivering audio response times averaging 320 milliseconds, comparable to human conversational latency.

File Input, Implicit Caching, Tool Use, Vision (Image), Web Search

index.ts

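import { streamText } from 'ai'

// Start a streaming text generation request against GPT-4o.
const result = streamText({
  model: 'openai/gpt-4o',
  prompt: 'Why is the sky blue?',
})

// Print each chunk of generated text as it arrives.
for await (const textPart of result.textStream) {
  process.stdout.write(textPart)
}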


More models by OpenAI

Model	Context	Latency	Throughput	Input	Output	Cache Read	Cache Write	Web Search	Release Date
openai/gpt-5.6-luna	1.1M	1.1s	116tps	$1/M	$6/M	$0.1/M	$1.25/M	$10/K + input costs	07/09/2026
openai/gpt-5.6-sol	1.1M	2.3s	68tps	$5/M	$30/M	$0.5/M	$6.25/M	$10/K + input costs	07/09/2026
openai/gpt-5.6-terra	1.1M	1.9s	130tps	$2.50/M	$15/M	$0.25/M	$3.13/M	$10/K + input costs	07/09/2026
openai/gpt-5.4	1.1M	1.5s	77tps	$2.50/M	$15/M	$0.25/M	—	$10/K + input costs	03/05/2026
openai/gpt-5-nano	400K	6.5s	168tps	$0.05/M	$0.40/M	$0.01/M	—	$14/K + input costs	08/07/2025
openai/gpt-5-mini	400K	4.5s	122tps	$0.25/M	$2/M	$0.03/M	—	$14/K + input costs	08/07/2025
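The per-token prices listed above can be turned into a rough per-request cost estimate. The sketch below is illustrative only: `ModelPricing`, `pricing`, and `estimateCostUsd` are hypothetical names, not part of any SDK. It assumes that "/M" means per one million tokens and that cached input tokens are billed at the cache read rate instead of the regular input rate. It leaves out web search fees and cache write charges.

```typescript
// Prices in USD per one million tokens, copied from the listing above.
interface ModelPricing {
  inputPerMillion: number
  outputPerMillion: number
  cacheReadPerMillion?: number
}

// Hypothetical lookup table; only two of the listed models are included here.
const pricing: Record<string, ModelPricing> = {
  'openai/gpt-5-mini': { inputPerMillion: 0.25, outputPerMillion: 2, cacheReadPerMillion: 0.03 },
  'openai/gpt-5-nano': { inputPerMillion: 0.05, outputPerMillion: 0.4, cacheReadPerMillion: 0.01 },
}

// Hypothetical helper: estimates the token cost of a single request.
// Assumption: cached input tokens are charged at the cache read rate
// in place of the regular input rate.
function estimateCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number,
  cachedInputTokens = 0,
): number {
  const p = pricing[model]
  if (!p) throw new Error(`No pricing entry for ${model}`)
  if (cachedInputTokens > inputTokens) {
    throw new Error('cachedInputTokens cannot exceed inputTokens')
  }
  const uncachedTokens = inputTokens - cachedInputTokens
  const cacheRate = p.cacheReadPerMillion ?? p.inputPerMillion
  const totalMicroUsd =
    uncachedTokens * p.inputPerMillion +
    cachedInputTokens * cacheRate +
    outputTokens * p.outputPerMillion
  return totalMicroUsd / 1_000_000
}
```

For example, `estimateCostUsd('openai/gpt-5-mini', 1_000_000, 500_000)` combines $0.25 of input with $1.00 of output. Actual billing may differ, so treat the result as an approximation rather than an invoice.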
