Gemini 3.5 Flash

Gemini 3.5 Flash advances the Flash line with improved coding proficiency, parallel agentic execution, stronger core reasoning, tighter instruction following, and higher-quality reasoning traces in thinking mode.

Reasoning, File Input, Vision (Image), Tool Use, Web Search, Implicit Caching
index.ts
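import { streamText } from 'ai'

const result = streamText({
  model: 'google/gemini-3.5-flash',
  prompt: 'Why is the sky blue?',
})

// Print each text chunk as it arrives.
for await (const chunk of result.textStream) process.stdout.write(chunk)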

Frequently Asked Questions

  • What's new in Gemini 3.5 Flash versus Gemini 3 Flash?

    Gemini 3.5 Flash improves coding proficiency and supports more reliable parallel agentic execution loops. Core reasoning, instruction following, and multi-turn coherence are all stronger, and thinking-mode outputs include higher-quality reasoning traces.

  • How do I control how much Gemini 3.5 Flash thinks before responding?

    Set thinkingLevel (for example 'high') and includeThoughts: true under providerOptions.google.thinkingConfig. This applies whether you call the model through the AI SDK or through the Chat Completions, Responses, or Messages APIs. If you do not set a level, Gemini 3.5 Flash defaults to medium.

  • Which sampling parameters does Gemini 3.5 Flash support?

    Gemini 3.5 Flash does not support temperature, topP, topK, or thinking_budget. If your application depends on those parameters, evaluate a different model before migrating production traffic.

  • Is Gemini 3.5 Flash suitable for agentic coding tasks?

    Yes. Improved coding proficiency and parallel agentic execution make Gemini 3.5 Flash well-suited for refactoring services, running concurrent tool calls, and multi-step code transformation workflows where reliability across steps matters.

  • Does Gemini 3.5 Flash support streaming?

    Yes. With the AI SDK, call streamText with model: 'google/gemini-3.5-flash'. Streaming is also available through the Chat Completions, Responses, and Messages APIs.

  • Do I need a Google Cloud account to use Gemini 3.5 Flash on AI Gateway?

    No. AI Gateway manages provider authentication. Connect using a Vercel API key or OIDC token and AI Gateway handles routing to the underlying provider.

  • How does Zero Data Retention work with Gemini 3.5 Flash through AI Gateway?

    Zero Data Retention is available for this model and is offered on a per-provider basis. See https://vercel.com/docs/ai-gateway/capabilities/zdr for details.

  • When should I use Gemini 3.5 Flash versus Gemini 3.1 Pro?

    Choose Gemini 3.5 Flash when Flash-tier latency and cost matter and the task fits within the Flash quality envelope. Choose Gemini 3.1 Pro for the deepest reasoning, long agentic sessions, or finance and spreadsheet workloads that benefit from pro-tier capability.
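
The answers above on thinking level and unsupported sampling parameters can be combined into one small migration helper. The following is only a sketch: buildFlashSettings and UNSUPPORTED are hypothetical names, not part of the AI SDK or AI Gateway, and the helper assumes settings are passed as a plain object. It removes the parameters the FAQ lists as unsupported and, when a thinking level is given, adds the thinkingConfig shape described above.

```typescript
// Hypothetical helper, not part of the AI SDK: prepares call settings for
// 'google/gemini-3.5-flash' based on the FAQ answers above.

// Parameters the FAQ lists as unsupported by Gemini 3.5 Flash.
const UNSUPPORTED = new Set<string>(['temperature', 'topP', 'topK', 'thinking_budget'])

type Settings = Record<string, unknown>

function buildFlashSettings(
  settings: Settings,
  thinkingLevel?: string, // e.g. 'high'; the model defaults to medium when omitted
): { settings: Settings; dropped: string[] } {
  const dropped: string[] = []
  const kept: Settings = {}

  // Copy supported keys and record the ones that were removed.
  for (const [key, value] of Object.entries(settings)) {
    if (UNSUPPORTED.has(key)) dropped.push(key)
    else kept[key] = value
  }

  const result: Settings = { ...kept, model: 'google/gemini-3.5-flash' }

  // Simplification: this overwrites any providerOptions already present.
  if (thinkingLevel !== undefined) {
    result.providerOptions = {
      google: { thinkingConfig: { thinkingLevel, includeThoughts: true } },
    }
  }

  return { settings: result, dropped }
}

// Example: settings carried over from another model.
const example = buildFlashSettings(
  { prompt: 'Why is the sky blue?', temperature: 0.2, topP: 0.9 },
  'high',
)
console.log('Removed unsupported parameters:', example.dropped)
```

Returning the dropped keys separately makes it easy to log or review what changed before moving production traffic, rather than discarding the parameters silently.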