Gemini 2.0 Flash

Gemini 2.0 Flash is Google's workhorse model for the agentic era. It delivers low-latency multimodal output, including natively generated images and steerable text-to-speech (TTS) audio, alongside native tool use and a Multimodal Live API for real-time streaming.

File Input, Tool Use, Vision (Image), Web Search

index.ts

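import { streamText } from 'ai'

// Stream a completion from Gemini 2.0 Flash via its model ID.
const result = streamText({
  model: 'google/gemini-2.0-flash',
  prompt: 'Why is the sky blue?'
})

// Print each text chunk as it arrives.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}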


About Gemini 2.0 Flash

Google released Gemini 2.0 Flash on December 11, 2024 as the first model in the Gemini 2.0 generation. It is optimized for high-volume, high-frequency tasks at scale and outperforms Gemini 1.5 Pro on key benchmarks at twice the speed, a significant capability leap over the previous Flash generation.

Gemini 2.0 Flash stands out for its native multimodal output. Beyond accepting text, images, video, and audio as input, it produces natively generated images mixed with text and steerable, multilingual text-to-speech (TTS) audio. This removes the need for separate image-generation or speech-synthesis calls, so a mixed-media response can come from a single request.

Google released the Multimodal Live API alongside 2.0 Flash for real-time, interactive applications. The API adds streaming audio and video input, combined tool use, and the low-latency responses that conversational agents and live-session experiences need. Gemini 2.0 Flash also supports native tool use, including Google Search, code execution, and user-defined functions, for multi-step agentic workflows.
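To make the idea of user-defined functions concrete, here is a minimal sketch of the application side of a tool-use loop: the model asks for a named function with arguments, and the application runs it and returns the result. The `ToolCall` shape, the `tools` registry, the `getWeather` function, and `dispatch` are all hypothetical names chosen for illustration; they are not part of any Google or AI SDK API, and real SDKs define their own types for this.

```typescript
// Hypothetical shape of a tool call emitted by the model (an assumption, not an SDK type).
type ToolCall = { name: string; args: Record<string, unknown> }
type ToolFn = (args: Record<string, unknown>) => string

// Hypothetical registry of user-defined functions, keyed by tool name.
const tools = new Map<string, ToolFn>([
  ['getWeather', (args) => `Sunny in ${String(args.city)}`],
])

// Run one requested tool call and return its result as text.
// In a real agent loop this result would be sent back to the model as the next turn.
function dispatch(call: ToolCall): string {
  const fn = tools.get(call.name)
  if (!fn) return `Unknown tool: ${call.name}`
  return fn(call.args)
}
```

In a multi-step workflow, the application would repeat this dispatch step for each tool call the model emits until the model returns a final answer.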

The 1M-token context window handles tasks that require reasoning over large codebases, lengthy documents, or extended conversation histories in a single pass.
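Before sending a large codebase or document set in one request, it can help to check roughly whether it fits. The sketch below is a rough pre-flight estimate under stated assumptions: the window is taken as 1,000,000 tokens, text is assumed to average about 4 characters per token (a common heuristic for English, not a tokenizer), and 8,192 tokens are reserved for the reply. The helper names `estimateTokens` and `fitsInContext` are hypothetical.

```typescript
// Assumption: the 1M-token window, taken as 1,000,000 tokens.
const CONTEXT_WINDOW_TOKENS = 1_000_000

// Assumption: roughly 4 characters per token; real tokenizers vary by language and content.
const CHARS_PER_TOKEN = 4

// Hypothetical helper: rough token estimate from character count.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN)
}

// Hypothetical helper: true if all documents, plus a reserve for the reply, fit in one request.
function fitsInContext(docs: string[], reservedForOutput = 8_192): boolean {
  const inputTokens = docs.reduce((sum, doc) => sum + estimateTokens(doc), 0)
  return inputTokens + reservedForOutput <= CONTEXT_WINDOW_TOKENS
}
```

An estimate like this only flags inputs that are clearly too large; an exact count would need the provider's own token-counting facility.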
