Gemini 3 Flash

Gemini 3 Flash delivers Gemini 3's pro-grade reasoning at flash-level latency and cost, outperforming Gemini 2.5 Pro across most benchmarks with meaningful gains in token efficiency.

ReasoningTool UseFile InputVision (Image)Web Searchtiered-costImplicit Caching

index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'google/gemini-3-flash',
  prompt: 'Why is the sky blue?'
})

Overview About Providers Throughput Latency Uptime Status Similar FAQ

More models by Google

Model

Context	Latency	Throughput	Input	Output	Cache	Web Search	Per Query	Capabilities	Providers	ZDR	No Training	Release Date

google/gemini-3.5-flash

2.1s

188tps

$1.50/M

$9.00/M

Read:$0.15/M

Write:—

$14.00/K

+ input costs

—

05/19/2026

google/gemma-4-26b-a4b-it

262K

0.7s

55tps

$0.13/M

$0.40/M

Read:$0.01/M

Write:—

—

04/02/2026

google/gemini-3.1-flash-lite-preview

0.5s

195tps

$0.25/M

$1.50/M

Read:$0.03/M

Write:—

$14.00/K

+ input costs

—

03/03/2026

google/gemini-3.1-pro-preview

3.2s

114tps

$2.00/M

$12.00/M

Read:

$0.2/M

Write:

—

$14.00/K

+ input costs

—

02/19/2026

google/gemini-2.5-flash-lite

0.2s

227tps

$0.10/M

$0.40/M

Read:$0.01/M

Write:—

$35.00/K

+ input costs

—

06/17/2025

google/gemini-2.5-flash

0.4s

198tps

$0.30/M

$2.50/M

Read:$0.03/M

Write:—

$35.00/K

+ input costs

—

03/20/2025

Agent Stack

Core Platform

Tools

Learn

Build

Explore

Gemini 3 Flash

More models by Google