Gemini 3.1 Flash Image Preview (Nano Banana 2)
Gemini 3.1 Flash Image Preview (Nano Banana 2) improves visual output quality at flash-tier speed, adding Google Image Search grounding, configurable thinking levels, and new resolution and aspect ratio options including 512p and ultra-wide formats.
import { generateText } from 'ai'

const result = await generateText({
  model: 'google/gemini-3.1-flash-image-preview',
  prompt: 'Render a picture of a red balloon.',
})

What To Consider When Choosing a Provider
Zero Data Retention
AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
Authentication
AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
This is a multimodal model: use streamText or generateText and specify responseModalities: ['TEXT', 'IMAGE'] in providerOptions.google to receive image output. You can also set thinkingConfig.thinkingLevel to 'minimal' or 'high' to control reasoning depth per request.
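The shape of these provider options can be sketched as a plain object. This is a minimal illustration of the option names described above (responseModalities and thinkingConfig.thinkingLevel under providerOptions.google); treat it as a sketch of the request configuration, not a complete call:

```typescript
// Provider options as described above: responseModalities requests image
// output alongside text, and thinkingConfig.thinkingLevel controls
// reasoning depth per request ('minimal' or 'high').
const providerOptions = {
  google: {
    responseModalities: ['TEXT', 'IMAGE'],
    thinkingConfig: { thinkingLevel: 'minimal' }, // or 'high'
  },
} as const

// Passed alongside model and prompt, e.g.:
// await generateText({
//   model: 'google/gemini-3.1-flash-image-preview',
//   prompt: 'Render a picture of a red balloon.',
//   providerOptions,
// })
console.log(providerOptions.google.responseModalities.join(','))
```

Without responseModalities including 'IMAGE', the model returns text only.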
When to Use Gemini 3.1 Flash Image Preview (Nano Banana 2)
Best For
Real-world grounded imagery: image generation tasks that require grounding in current subjects, landmarks, or recent events
Technical diagram generation: configurable thinking depth improves spatial accuracy and label placement
Unusual aspect ratios: creative asset production requiring 1:4 or 1:8 ratios or 512p resolution
Multimodal text and image output: single-response workloads at flash-tier cost
Rapid complex visual iteration: using the minimal thinking level to balance speed and reasoning
Consider Alternatives When
Highest image quality required: your workflow supports pro-tier latency and cost (consider google/gemini-3-pro-image)
Pure image generation API: you do not need multimodal text output (consider google/imagen-4.0-generate-001)
Simple prompts: thinking levels and search grounding add unnecessary overhead
Video output required: still images are not sufficient (consider the Veo model family)
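The selection guidance above can be summarized as a small decision helper. This is a hypothetical sketch, not part of any SDK: the function name and criteria flags are invented for illustration, while the model IDs are the ones named on this page (the Veo family is a group of models, so it is represented here by a placeholder string):

```typescript
// Hypothetical helper mirroring the "Best For" / "Consider Alternatives"
// guidance above. Flags and function name are illustrative only.
type Needs = {
  highestQuality?: boolean   // pro-tier latency and cost are acceptable
  textOutputNeeded?: boolean // multimodal text alongside images
  videoNeeded?: boolean      // still images are not sufficient
}

function pickModel(needs: Needs): string {
  if (needs.videoNeeded) return 'veo-family' // placeholder: see the Veo model family
  if (needs.highestQuality) return 'google/gemini-3-pro-image'
  if (!needs.textOutputNeeded) return 'google/imagen-4.0-generate-001'
  return 'google/gemini-3.1-flash-image-preview'
}

console.log(pickModel({ textOutputNeeded: true }))
// → google/gemini-3.1-flash-image-preview
```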
Conclusion
Gemini 3.1 Flash Image Preview (Nano Banana 2) closes the gap between flash-tier generation speed and pro-level visual intelligence by adding search grounding, reasoning control, and broader format support. For teams that need current-event-aware imagery or complex diagrams at flash cost, it provides capabilities that earlier flash-tier models did not offer.
FAQ
How does Google Image Search grounding work?
At generation time, the model can query Google's image index to retrieve live visual data for the subject you describe. This improves rendering accuracy for subjects that may not be well-represented in static training data, such as specific real-world locations or recent events.
What thinking levels are available, and when should I use each?
The model supports two thinking levels: minimal and high. Use minimal when speed is the priority and the prompt is relatively straightforward. Use high when the prompt requires precise spatial reasoning, complex diagram layout, or multi-element compositions where reasoning before rendering reduces errors.
What new aspect ratios and resolutions are supported?
The model adds 1:4 and 1:8 aspect ratios alongside 512p resolution. These expand the model's usefulness for narrow-format creative assets such as web banners, vertical strips, and other non-standard formats.
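The pixel arithmetic for these narrow formats can be sketched as follows, assuming "512p" denotes a 512-pixel short side (an assumption for illustration, not a documented definition):

```typescript
// Sketch: derive output dimensions for a given aspect ratio and short
// side, assuming "512p" means a 512-pixel short side.
function dimensionsFor(ratio: string, shortSide: number): [number, number] {
  const [w, h] = ratio.split(':').map(Number)
  // Scale so the smaller ratio term maps to the short side.
  const scale = shortSide / Math.min(w, h)
  return [Math.round(w * scale), Math.round(h * scale)]
}

console.log(dimensionsFor('1:8', 512)) // → [ 512, 4096 ]
```

Under this assumption, a 1:8 banner at 512p would be 512 by 4096 pixels.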
Can I stream output with the AI SDK?
Yes. Use streamText from the AI SDK with responseModalities: ['TEXT', 'IMAGE'] in providerOptions.google.
Is responseModalities required to receive images?
Yes. Because this is a multimodal model, you must include responseModalities: ['TEXT', 'IMAGE'] in the provider options to receive image output. The model will not emit images without this configuration.
How does this model differ from Gemini 3 Pro Image?
Gemini 3 Pro Image targets professional and creative workflows with higher resolution, higher multi-image input limits, and more advanced compositing support. Gemini 3.1 Flash Image Preview (Nano Banana 2) prioritizes generation speed and cost efficiency while adding grounding and thinking capabilities that were absent from the original flash-tier image model.
Is it suitable for production workloads?
Yes, its flash-tier cost and speed profile are designed for production workloads. Using thinkingLevel: 'minimal' minimizes additional latency from the reasoning step.
What does the streamed thinking output show?
It streams the model's reasoning tokens before the generated image, giving visibility into how the model interpreted the prompt and planned the composition. This is useful for debugging prompts that produce unexpected output.