Gemini 2.5 Flash Preview 09-2025
Gemini 2.5 Flash Preview 09-2025 is Google's September 2025 preview of the next Gemini 2.5 Flash release. It scores 54% on SWE-Bench Verified (up from 48.9%), improves agentic tool use, and produces 24% fewer output tokens when thinking is enabled.
```typescript
import { streamText } from 'ai'

const result = streamText({
  model: 'google/gemini-2.5-flash-preview-09-2025',
  prompt: 'Why is the sky blue?',
})
```

What To Consider When Choosing a Provider
Zero Data Retention
AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.

Authentication
AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
This is a preview model. The 24% reduction in thinking tokens changes cost profiles for reasoning-heavy workloads. Benchmark against the stable 2.5 Flash using AI Gateway's observability tools before committing production traffic.
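To see how the token reduction plays out in a cost profile, here is a minimal sketch of per-request thinking spend. The baseline token count and per-million-token price below are hypothetical placeholders, not published figures; substitute your own observed usage and current pricing.

```typescript
// Estimate thinking-token spend before and after the reported 24% reduction.
// Both inputs are assumptions for illustration, not real pricing data.
function thinkingCost(thinkingTokens: number, pricePerMillionTokens: number): number {
  return (thinkingTokens / 1_000_000) * pricePerMillionTokens
}

const baselineThinkingTokens = 10_000 // assumed tokens per request on stable 2.5 Flash
const pricePerMillionTokens = 2.5     // assumed output price in USD per million tokens

const stableCost = thinkingCost(baselineThinkingTokens, pricePerMillionTokens)
const previewCost = thinkingCost(baselineThinkingTokens * 0.76, pricePerMillionTokens)

console.log(stableCost.toFixed(4))  // cost at the baseline token count
console.log(previewCost.toFixed(4)) // cost with 24% fewer thinking tokens
```

The per-token rate is unchanged in the preview, so the saving scales linearly with how much of your spend is thinking tokens.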
When to Use Gemini 2.5 Flash Preview 09-2025
Best For
Agentic coding pipelines:
The 54% SWE-Bench Verified score represents a concrete improvement over the stable 2.5 Flash
Multi-step tool use applications:
Benefit from improved agentic capabilities across complex workflows
Reasoning-intensive workloads with cost constraints:
A 24% reduction in thinking tokens materially reduces spend
Hybrid reasoning applications:
Toggle thinking on or off per request, with better cost efficiency when thinking is active
Software engineering automation:
Bug triage, code review, and feature implementation
Consider Alternatives When
Production stability required:
Use the stable Gemini 2.5 Flash for production workloads
Simple classification or extraction:
Thinking adds cost without benefit, and Gemini 2.5 Flash Lite is cheaper
Deepest reasoning needed:
Gemini 2.5 Pro targets the hardest problems when cost is not a constraint
Conclusion
Gemini 2.5 Flash Preview 09-2025 delivers two concrete gains over the stable 2.5 Flash: stronger agentic tool use (54% SWE-Bench Verified) and cheaper thinking (24% fewer output tokens). For teams already using 2.5 Flash on reasoning or coding tasks, this preview is worth benchmarking in staging.
FAQ
How much did the SWE-Bench Verified score improve?
The score moved from 48.9% on the stable model to 54% on this preview. SWE-Bench Verified tests a model's ability to resolve real software engineering tasks from production repositories.
How does the thinking-token reduction affect cost?
Google reported a 24% reduction in output tokens compared to the stable model when thinking is active. The per-token rate stays the same, but you consume fewer tokens per reasoning task.
Is this a stable production model?
No. Google released it for developer feedback. It may reach stable later or be superseded. Pin to the explicit model string and monitor deprecation notices.
Does the preview support configurable thinking budgets?
Yes. The hybrid reasoning design carries forward. You can disable thinking entirely or set a budget that controls how much deliberation the model applies per request.
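As a sketch of how a per-request budget might be expressed, the helper below builds a provider-options object in the shape the AI SDK's Google provider uses for thinking configuration. Treat the exact field names (`google`, `thinkingConfig`, `thinkingBudget`) as an assumption to verify against current AI SDK documentation.

```typescript
// Build per-request provider options that disable thinking or cap its budget.
// The nested shape below is assumed from the AI SDK Google provider; verify
// field names against the current docs before relying on it.
type ThinkingOptions = {
  google: { thinkingConfig: { thinkingBudget: number } }
}

function withThinkingBudget(budgetTokens: number): ThinkingOptions {
  return { google: { thinkingConfig: { thinkingBudget: budgetTokens } } }
}

const noThinking = withThinkingBudget(0)        // disable thinking entirely
const cappedThinking = withThinkingBudget(2048) // cap deliberation per request

// Passed alongside the model and prompt, e.g.:
// streamText({ model: 'google/gemini-2.5-flash-preview-09-2025',
//              prompt, providerOptions: cappedThinking })
```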
How do I access this model through AI Gateway?
Use a Vercel API key or OIDC token with AI Gateway. Use the identifier google/gemini-2.5-flash-preview-09-2025 in your API calls. AI Gateway manages routing, retries, and failover across the google and vertex providers.
Should I use an alias instead of the dated model string?
Google offers aliases like gemini-flash-latest that auto-update to the newest preview. These rotate with two-week deprecation notices. Use the explicit gemini-2.5-flash-preview-09-2025 string for reproducible behavior.
How does 2.5 Flash compare to 2.5 Pro?
2.5 Flash (including this preview) sits on the cost-performance frontier, delivering strong reasoning at lower cost than 2.5 Pro. For the hardest problems, where benchmark scores justify the premium, 2.5 Pro remains the top of the 2.5 family.