Wan v2.6 Reference-to-Video Flash

alibaba/wan-v2.6-r2v-flash

Wan v2.6 Reference-to-Video Flash is Alibaba's fast reference-to-video model that preserves subject identity from video references and generates new scenes at speed, supporting 720p and 1080p output for rapid creative iteration.

reference-to-video
index.ts
import { experimental_generateVideo as generateVideo } from 'ai';

// Request a clip from the Flash reference-to-video model through AI Gateway.
const result = await generateVideo({
  model: 'alibaba/wan-v2.6-r2v-flash',
  prompt: 'A serene mountain lake at sunrise.',
});

What To Consider When Choosing a Provider

  • Zero Data Retention

    AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.

  • Authentication

    AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
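As a minimal sketch of the authentication note above, the helper below reports which credential a process has available. The function name `resolveGatewayAuth` is hypothetical. The environment variable names `AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN`, and the choice to prefer the API key when both are set, are assumptions to verify against the AI Gateway documentation.

```typescript
// Hypothetical helper, not part of the AI SDK: reports which AI Gateway
// credential is present in a given environment object.
type GatewayAuth = { kind: 'api-key' | 'oidc'; token: string };

function resolveGatewayAuth(env: Record<string, string | undefined>): GatewayAuth {
  // Assumption: the API key takes precedence when both credentials are set.
  const apiKey = (env.AI_GATEWAY_API_KEY || '').trim();
  if (apiKey) return { kind: 'api-key', token: apiKey };

  // Assumption: the OIDC token is exposed under this variable name.
  const oidc = (env.VERCEL_OIDC_TOKEN || '').trim();
  if (oidc) return { kind: 'oidc', token: oidc };

  throw new Error('No AI Gateway credential found: set AI_GATEWAY_API_KEY or VERCEL_OIDC_TOKEN');
}
```

In a Node.js process this could be called as `resolveGatewayAuth(process.env)` at startup, so a missing credential fails early rather than on the first generation request.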

When a production deliverable demands maximum identity fidelity from the reference material, evaluate the standard wan-v2.6-r2v model before finalizing your pipeline choice.

When to Use Wan v2.6 Reference-to-Video Flash

Best For

  • Fast scene concept iteration:

    Verifying character identity before committing to full-quality R2V renders

  • High-volume social content:

    Pipelines where volume and speed of character-consistent clips outweigh pixel-perfect fidelity

  • Storyboards and animatics:

    Live-action or animation pre-production using real talent references

  • High-throughput brand content:

    Generating many assets quickly where a mascot or spokesperson must appear consistently

Consider Alternatives When

  • Maximum identity fidelity:

    Use wan-v2.6-r2v for the highest-quality character transfer in final deliveries

  • Still-photo source material:

    Use wan-v2.6-i2v-flash for image-based animation when the source is a still photo

  • No reference subject needed:

    Use wan-v2.6-t2v for purely text-prompted video generation
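The guidance in the two lists above can be sketched as a small routing function. This is an illustration, not an official selection rule: `chooseWanModel`, `SourceKind`, and `Stage` are hypothetical names, and the `alibaba/` prefix on every ID other than `alibaba/wan-v2.6-r2v-flash` is assumed by analogy with this model's ID.

```typescript
// Hypothetical routing helper based on the "Best For" and
// "Consider Alternatives When" guidance.
type SourceKind = 'video-reference' | 'still-image' | 'text-only';
type Stage = 'draft' | 'final';

function chooseWanModel(source: SourceKind, stage: Stage): string {
  switch (source) {
    case 'video-reference':
      // Flash for fast iteration, standard R2V for maximum identity fidelity.
      return stage === 'draft'
        ? 'alibaba/wan-v2.6-r2v-flash'
        : 'alibaba/wan-v2.6-r2v'; // assumed ID prefix
    case 'still-image':
      return 'alibaba/wan-v2.6-i2v-flash'; // assumed ID prefix
    case 'text-only':
      return 'alibaba/wan-v2.6-t2v'; // assumed ID prefix
  }
}
```

Keeping the choice in one function means a pipeline can switch from draft to final renders by changing a single argument instead of editing model strings at each call site.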

Conclusion

Wan v2.6 Reference-to-Video Flash makes identity-consistent video generation fast enough for iterative creative workflows, preserving the reference-to-video capability that makes the R2V series unique while dramatically shortening generation time. It fits naturally into pipelines where R2V Flash handles draft cycles and the standard R2V model handles final output.

FAQ

How does Flash differ from the standard R2V model?

Flash is speed-optimized. It generates reference-consistent video much faster than the standard R2V model, with a potential tradeoff in peak identity fidelity. Use Flash for drafts and iteration, and the standard R2V model for final output.

Does Flash accept the same reference inputs as the standard R2V model?

Yes. Both accept the same reference URL lists and prompt conventions: refer to subjects as character1, character2, and so on in the prompt, in the same order as the reference URLs. Video references should be 2 to 30 seconds long where applicable.

Which output resolutions are supported?

720p and 1080p. The R2V variants don't include a 480p option.

How long can the output video be?

Output duration is 2 to 10 seconds for Wan R2V on AI Gateway. The 15-second option available on some T2V and I2V models does not apply here.

Can one request use multiple reference subjects?

Yes. You can combine several reference URLs in one request (within provider limits) and name them character1, character2, and so on in the prompt.

Are voice and audio carried over from the reference clips?

Voice and audio characteristics captured from the reference clips are part of the identity extraction process. Check provider-level documentation for specific audio output behavior.
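The limits stated in this FAQ can be checked before a request is sent. The sketch below is a hypothetical pre-flight check: `R2VRequest` and `validateR2VRequest` are illustrative names and do not reflect the AI SDK's or the provider's request shape. Requiring every reference to be mentioned in the prompt is an assumption added for illustration; the source only states the character1, character2 naming convention.

```typescript
// Hypothetical request description used only for validation.
interface R2VRequest {
  prompt: string;
  referenceUrls: string[];          // order defines character1, character2, ...
  referenceDurationsSec?: number[]; // optional, when clip lengths are known
  resolution: string;
  durationSec: number;
}

// Returns a list of human-readable problems; an empty list means the
// request satisfies the limits quoted in the FAQ.
function validateR2VRequest(req: R2VRequest): string[] {
  const problems: string[] = [];
  const count = req.referenceUrls.length;

  if (req.resolution !== '720p' && req.resolution !== '1080p') {
    problems.push('unsupported resolution: ' + req.resolution);
  }
  if (req.durationSec < 2 || req.durationSec > 10) {
    problems.push('output duration must be 2 to 10 seconds');
  }
  if (count === 0) {
    problems.push('at least one reference URL is required');
  }
  (req.referenceDurationsSec || []).forEach((sec, i) => {
    if (sec < 2 || sec > 30) {
      problems.push('reference ' + (i + 1) + ' is ' + sec + 's; expected 2 to 30s');
    }
  });

  // Collect every characterN label used in the prompt.
  const mentioned: number[] = [];
  const labelPattern = /\bcharacter(\d+)\b/g;
  let match: RegExpExecArray | null;
  while ((match = labelPattern.exec(req.prompt)) !== null) {
    const n = Number(match[1]);
    if (mentioned.indexOf(n) === -1) mentioned.push(n);
  }

  // Assumption: each reference should be named at least once.
  for (let i = 1; i <= count; i++) {
    if (mentioned.indexOf(i) === -1) problems.push('prompt never mentions character' + i);
  }
  // A label with no URL at that position cannot be resolved.
  mentioned.forEach((n) => {
    if (n < 1 || n > count) problems.push('character' + n + ' has no matching reference URL');
  });

  return problems;
}
```

Returning all problems at once, rather than throwing on the first, lets a batch pipeline log every issue with a request in a single pass.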