Gemini 2.0 Flash Lite
Gemini 2.0 Flash Lite is the lowest-cost multimodal model in Google's 2.0 lineup. It accepts text, images, audio, and documents within a context window of 1.0M tokens and produces text output only. It is designed for budget-first workloads where output volume drives infrastructure cost.
import { streamText } from 'ai'
const result = streamText({ model: 'google/gemini-2.0-flash-lite', prompt: 'Why is the sky blue?' })
About Gemini 2.0 Flash Lite
Gemini 2.0 Flash Lite accepts multimodal inputs (text, images, audio, and documents) but produces text output only. This is a deliberate design choice. It targets the class of tasks where understanding rich input and producing structured text (descriptions, labels, or summaries) is the entire job.
At $0.075 per million input tokens, it fits workloads where unit economics drive architecture decisions. Large image batches stay cheap enough for annotation pipelines, moderation queues, accessibility descriptions (such as alt text), and visual extraction at scale.
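To make the unit economics concrete, here is a minimal sketch of the input-side arithmetic. The $0.075 per million input tokens figure comes from the text above. The function name `estimateInputCostUSD`, the daily image count, and the tokens-per-image figure are illustrative assumptions, not published values, and output-token pricing is not modeled.

```typescript
// Input price quoted in the text above; output pricing is not covered here.
const INPUT_PRICE_USD_PER_MILLION_TOKENS = 0.075;

// Estimate the input-side cost in USD for a given number of input tokens.
function estimateInputCostUSD(
  inputTokens: number,
  pricePerMillion: number = INPUT_PRICE_USD_PER_MILLION_TOKENS,
): number {
  if (!Number.isFinite(inputTokens) || inputTokens < 0) {
    throw new RangeError('inputTokens must be a non-negative finite number');
  }
  return (inputTokens / 1_000_000) * pricePerMillion;
}

// Hypothetical workload: both figures below are assumptions for illustration,
// not measured or documented token counts.
const imagesPerDay = 100_000;
const assumedTokensPerImage = 300;

const dailyInputCost = estimateInputCostUSD(imagesPerDay * assumedTokensPerImage);
console.log(`Estimated daily input cost: $${dailyInputCost.toFixed(2)}`);
```

Replace the assumed per-image token count with a measured average from your own data before relying on the estimate.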
The context window of 1.0M tokens accommodates long audio transcripts, multi-page documents, and extended image sequences within a single request. For ETL-style pipelines that process a batch of mixed-modality records and need structured text output from each, Gemini 2.0 Flash Lite provides the input flexibility of a multimodal model at a price closer to a text-only model.
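For a batch pipeline, one way to use the large context window is to group records greedily so that each request stays under a token budget. The sketch below assumes per-record token counts are estimated upstream. The `PipelineRecord` shape, the `packIntoRequests` helper, and the exact budget of 1,000,000 tokens (taken from the "1.0M" figure above) are hypothetical, not part of any SDK.

```typescript
// Hypothetical record shape: token counts are assumed to be estimated upstream.
interface PipelineRecord {
  id: string;
  estimatedTokens: number;
}

// "1.0M" from the text above; treating it as exactly 1,000,000 is an assumption.
const CONTEXT_WINDOW_TOKENS = 1_000_000;

// Greedily pack records, in order, into groups whose total stays within the budget.
function packIntoRequests(
  records: PipelineRecord[],
  budget: number = CONTEXT_WINDOW_TOKENS,
): PipelineRecord[][] {
  const batches: PipelineRecord[][] = [];
  let current: PipelineRecord[] = [];
  let used = 0;

  for (const record of records) {
    if (record.estimatedTokens > budget) {
      throw new Error(`Record ${record.id} exceeds the token budget on its own`);
    }
    // Start a new group when adding this record would overflow the budget.
    if (used + record.estimatedTokens > budget) {
      batches.push(current);
      current = [];
      used = 0;
    }
    current.push(record);
    used += record.estimatedTokens;
  }

  if (current.length > 0) batches.push(current);
  return batches;
}
```

In practice the budget would likely be set below the full window to leave headroom for instructions and any per-request overhead.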