
Wan v2.6 Image-to-Video Flash

Wan v2.6 Image-to-Video Flash is Alibaba's speed-optimized image-to-video model that animates still images into video clips at up to 1080p, designed for fast iteration and high-throughput animation pipelines.

image-to-video
index.ts
import { experimental_generateVideo as generateVideo } from 'ai';

const result = await generateVideo({
  model: 'alibaba/wan-v2.6-i2v-flash',
  prompt: 'A serene mountain lake at sunrise.',
});

Playground

Try out Wan v2.6 Image-to-Video Flash by Alibaba. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

About Wan v2.6 Image-to-Video Flash

Wan v2.6 Image-to-Video Flash is the speed-optimized member of the Wan 2.6 image-to-video family. It accepts the same inputs as the standard I2V model (a source image plus a motion-guiding text prompt) but prioritizes low generation latency. That makes it well suited to creative iteration, draft reviews, and high-throughput pipelines where many animation variants need to be evaluated quickly.

Despite the speed focus, the Flash model retains the core visual improvements introduced in the 2.6 generation: better temporal consistency between frames, improved instruction-following for motion prompts, and support for the full resolution range from 480p through 1080p. Teams commonly use the Flash variant during the exploration phase of a production workflow and then route finalized prompts to the standard I2V model for polished output.

The Flash architecture also makes it practical to run animation at scale (for example, generating animated thumbnails or preview loops for a large image library) without the queue times that full-quality generation incurs. Optional audio accompaniment is available on the same pass.
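As a rough illustration of the "animation at scale" case, the sketch below runs a per-image animation function over an image library with a cap on concurrent requests. The names `animateLibrary`, `AnimateFn`, and `BatchOutcome`, and the default limit of 4, are hypothetical and not part of any SDK; `animate` stands in for whatever function actually calls the video generation API.

```typescript
// Hypothetical signature: turns one source image into one clip.
// In practice this would wrap a call to the video generation API.
type AnimateFn<T> = (imageUrl: string) => Promise<T>;

type BatchOutcome<T> =
  | { imageUrl: string; ok: true; value: T }
  | { imageUrl: string; ok: false; error: unknown };

// Runs `animate` over every image with at most `limit` calls in flight.
// Failures are recorded per image instead of aborting the whole batch.
async function animateLibrary<T>(
  imageUrls: string[],
  animate: AnimateFn<T>,
  limit: number = 4, // assumed default, tune to your rate limits
): Promise<BatchOutcome<T>[]> {
  const outcomes: BatchOutcome<T>[] = new Array(imageUrls.length);
  let next = 0;

  async function worker(): Promise<void> {
    while (next < imageUrls.length) {
      const index = next++; // claim the next unprocessed image
      const imageUrl = imageUrls[index];
      try {
        const value = await animate(imageUrl);
        outcomes[index] = { imageUrl, ok: true, value };
      } catch (error) {
        outcomes[index] = { imageUrl, ok: false, error };
      }
    }
  }

  // Start between 1 and `limit` workers that share the same queue.
  const workerCount = Math.max(1, Math.min(limit, imageUrls.length));
  const workers: Promise<void>[] = [];
  for (let i = 0; i < workerCount; i++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return outcomes;
}
```

Results come back in the same order as the input list, so each outcome can be matched to its source image by index. Recording failures per image lets a large batch finish even when a few generations fail.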

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider | Legal | Release Date
Alibaba | Terms, Privacy | 12/16/2025
Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.

More models by Alibaba

Context | Latency | Throughput | Input | Output | Cache Read | Cache Write | Providers | Release Date
240K | 1.7s | 82tps | $1.30/M | $7.80/M | $0.26/M | $1.63/M | alibaba | 04/20/2026
1M | 1.0s | 55tps | $0.50/M | $3.00/M | $0.1/M | $0.63/M | alibaba, fireworks | 04/02/2026
1M | 1.1s | 284tps | $0.10/M | $0.40/M | $0.0/M | $0.13/M | alibaba | 02/24/2026
1M | 2.3s | 55tps | $0.40/M | $2.40/M | $0.04/M | $0.5/M | alibaba | 02/16/2026
256K | 0.2s | 143tps | $0.50/M | $1.20/M | - | - | bedrock, togetherai | 07/22/2025
33K | - | - | $0.02/M | - | - | - | deepinfra | 06/05/2025

What To Consider When Choosing a Provider

  • Configuration: When peak visual fidelity for a final deliverable takes priority over turnaround time, evaluate the standard wan-v2.6-i2v model alongside this Flash variant before choosing.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

When to Use Wan v2.6 Image-to-Video Flash

Best For

  • Rapid prompt iteration: Exploring motion ideas before committing to full-quality I2V rendering
  • High-throughput animation workflows: Processing many images in parallel with short turnaround times
  • Draft previews for review: Generating quick outputs for client review or storyboard approval loops
  • Cost-sensitive animation pipelines: Delivering acceptable quality at lower computational expense

Consider Alternatives When

  • Maximum visual fidelity: Use wan-v2.6-i2v for the highest-quality image animation in final deliverables
  • Text-only source: Use wan-v2.6-t2v when generating from a text description rather than an image
  • Consistent character identity: Use wan-v2.6-r2v or wan-v2.6-r2v-flash when the same subject must appear across multiple generated shots
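The guidance in the two lists above can be condensed into a small routing helper. This is only a sketch: `pickWanModel` and `ClipNeeds` are hypothetical names, the `alibaba/` prefix on the non-Flash slugs is assumed by analogy with `alibaba/wan-v2.6-i2v-flash`, and choosing between `r2v` and `r2v-flash` by draft versus final is an assumption that mirrors the I2V split rather than something the page states.

```typescript
// Model slugs; only 'alibaba/wan-v2.6-i2v-flash' appears verbatim in the
// sample code above. The prefix on the others is an assumption.
type WanModelId =
  | 'alibaba/wan-v2.6-i2v-flash'
  | 'alibaba/wan-v2.6-i2v'
  | 'alibaba/wan-v2.6-t2v'
  | 'alibaba/wan-v2.6-r2v'
  | 'alibaba/wan-v2.6-r2v-flash';

// Hypothetical description of what a given clip needs.
interface ClipNeeds {
  hasSourceImage: boolean; // false means text-only generation
  consistentCharacter: boolean; // same subject across multiple shots
  finalDeliverable: boolean; // true when fidelity beats turnaround time
}

function pickWanModel(needs: ClipNeeds): WanModelId {
  if (needs.consistentCharacter) {
    // Assumed: Flash for drafts, standard for finals, as with I2V.
    return needs.finalDeliverable
      ? 'alibaba/wan-v2.6-r2v'
      : 'alibaba/wan-v2.6-r2v-flash';
  }
  if (!needs.hasSourceImage) {
    return 'alibaba/wan-v2.6-t2v';
  }
  return needs.finalDeliverable
    ? 'alibaba/wan-v2.6-i2v'
    : 'alibaba/wan-v2.6-i2v-flash';
}
```

Keeping the decision in one function makes it easy to explore with Flash and later re-run the same prompts against the standard model by flipping `finalDeliverable`.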

Conclusion

Wan v2.6 Image-to-Video Flash makes image animation practical at production scale by reducing generation time relative to the standard I2V model while keeping the resolution range and core motion quality of the Wan 2.6 series. It is the recommended starting point for any iterative or high-volume image-to-video workflow.

Frequently Asked Questions

  • How much faster is I2V Flash compared to the standard I2V model?

    The Flash variant is engineered specifically for faster generation times. Exact speed differences vary by resolution and provider, but Flash is designed for quick-iteration use cases where the standard model's generation time is prohibitive.

  • Does the Flash model support 1080p output?

    Yes. Despite the speed optimization, I2V Flash supports 480p, 720p, and 1080p resolutions.

  • Can I include audio in the generated video?

    Optional audio accompaniment is available on the same generation pass.

  • What is the maximum clip duration?

    Generated clips can be up to 15 seconds long.

  • When should I use Flash versus the standard I2V model?

    Use Flash for drafts, iteration, and high-volume tasks. Use the standard I2V model when final visual quality matters more than turnaround time.

  • Does the Flash model require a different input format than the standard I2V?

    No. Both models accept the same inputs: a source image and a text prompt describing the desired motion.
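The limits stated in the answers above (a source image plus a motion prompt, 480p/720p/1080p output, clips up to 15 seconds) can be checked before a request is sent. The sketch below is a minimal pre-flight check; `FlashClipRequest`, its field names, and `checkFlashClipRequest` are hypothetical and do not reflect the actual request schema of any SDK or provider.

```typescript
// Limits taken from the FAQ answers above.
const WAN_RESOLUTIONS: readonly string[] = ['480p', '720p', '1080p'];
const MAX_CLIP_SECONDS = 15;

// Hypothetical request shape, for illustration only.
interface FlashClipRequest {
  imageUrl: string; // source image to animate
  motionPrompt: string; // text describing the desired motion
  resolution: string;
  durationSeconds: number;
}

// Returns a list of human-readable problems; an empty list means the
// request is within the documented limits.
function checkFlashClipRequest(req: FlashClipRequest): string[] {
  const problems: string[] = [];
  if (req.imageUrl.trim() === '') {
    problems.push('a source image is required');
  }
  if (req.motionPrompt.trim() === '') {
    problems.push('a motion prompt is required');
  }
  if (WAN_RESOLUTIONS.indexOf(req.resolution) === -1) {
    problems.push('resolution must be one of 480p, 720p, 1080p');
  }
  // The negated form also rejects NaN.
  if (!(req.durationSeconds > 0 && req.durationSeconds <= MAX_CLIP_SECONDS)) {
    problems.push('duration must be greater than 0 and at most 15 seconds');
  }
  return problems;
}
```

Because both models accept the same inputs, the same check would apply unchanged when a prompt is promoted from Flash to the standard I2V model.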