Wan v2.5 Text-to-Video Preview
Wan v2.5 Text-to-Video Preview provides early access to Alibaba's text-to-video rendering pipeline, generating clips up to 10 seconds at resolutions from 480p to 1080p with built-in audio synchronization.
import { experimental_generateVideo as generateVideo } from 'ai';

const result = await generateVideo({
  model: 'alibaba/wan-v2.5-t2v-preview',
  prompt: 'A serene mountain lake at sunrise.',
});

Frequently Asked Questions
Why would I choose the v2.5 preview over the newer v2.6 model?
The v2.5 preview supports 480p output (which v2.6 T2V doesn't), making it a lower-cost option for draft-quality renders and prompt experimentation. It also serves as a lighter-weight entry point for teams still evaluating the Wan pipeline.
Can this model generate vertical video for mobile platforms?
Yes. The 9:16 aspect ratio produces portrait-oriented output suitable for platforms like TikTok, Instagram Reels, and YouTube Shorts.
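As a rough sketch, a portrait request could be assembled like this. The `aspectRatio` field name is an assumption (the usage snippet on this page shows only `model` and `prompt`), and `verticalRequest` is a hypothetical helper, so check the SDK reference before relying on either.

```typescript
// Hypothetical request shape: `aspectRatio` is an assumed option name,
// not one confirmed by this page.
interface VideoRequest {
  model: string;
  prompt: string;
  aspectRatio: string;
}

// Builds a portrait-oriented (9:16) request for the v2.5 preview model.
function verticalRequest(prompt: string): VideoRequest {
  return {
    model: 'alibaba/wan-v2.5-t2v-preview',
    prompt,
    aspectRatio: '9:16',
  };
}

const shortsRequest = verticalRequest('A skateboarder lands a trick at golden hour.');
```

The resulting object only describes the request; if the assumed field name holds, it could then be passed to `generateVideo`.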
How does the built-in audio feature work?
Audio is generated in the same rendering pass as the video. The model produces ambient sound, effects, and, if the prompt describes speech, character dialogue with lip-sync, all without requiring a separate audio generation tool.
What kind of text prompts work best with this model?
Descriptive scene prompts that specify setting, action, and mood tend to produce the most coherent output. Including details about lighting, camera angle, and desired audio cues gives the model more information to work with.
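One way to keep prompts consistently detailed is to assemble them from named parts. The sketch below is illustrative only: `SceneDescription` and `buildPrompt` are hypothetical names, and the model does not require this exact wording or ordering.

```typescript
// Hypothetical structure for the details mentioned above; these field
// names are illustrative and not part of any API.
interface SceneDescription {
  setting: string;
  action: string;
  mood: string;
  lighting?: string;
  cameraAngle?: string;
  audioCues?: string;
}

// Joins the provided details into one descriptive prompt string,
// skipping any optional detail that was left out.
function buildPrompt(scene: SceneDescription): string {
  const parts: Array<string | undefined> = [
    `${scene.action} in ${scene.setting}`,
    `Mood: ${scene.mood}`,
    scene.lighting ? `Lighting: ${scene.lighting}` : undefined,
    scene.cameraAngle ? `Camera: ${scene.cameraAngle}` : undefined,
    scene.audioCues ? `Audio: ${scene.audioCues}` : undefined,
  ];
  return parts
    .filter((part): part is string => part !== undefined)
    .map((part) => `${part}.`)
    .join(' ');
}

const prompt = buildPrompt({
  setting: 'a misty mountain lake',
  action: 'A heron wades slowly',
  mood: 'calm',
  lighting: 'soft sunrise glow',
  audioCues: 'gentle water ripples',
});
```

The resulting string can be passed as the `prompt` value in the `generateVideo` call.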
Is there a way to control video duration precisely?
You can request specific durations within the model's range. The maximum is 10 seconds; for longer output, use the Wan v2.6 T2V model.
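A small guard can catch out-of-range durations before a request is sent. This is a sketch under assumptions: `checkDuration` is a hypothetical helper, the `duration` field name is assumed rather than confirmed by this page, and the model may accept only certain values within its range, which this check does not cover.

```typescript
// Stated maximum clip length for the v2.5 preview model, in seconds.
const MAX_DURATION_SECONDS = 10;

// Returns the duration unchanged if it is positive and within the
// maximum; otherwise throws a RangeError describing the problem.
function checkDuration(seconds: number): number {
  if (!Number.isFinite(seconds) || seconds <= 0) {
    throw new RangeError('Duration must be a positive number of seconds.');
  }
  if (seconds > MAX_DURATION_SECONDS) {
    throw new RangeError(
      `Duration ${seconds}s exceeds the ${MAX_DURATION_SECONDS}s maximum; ` +
        'consider the Wan v2.6 T2V model for longer clips.',
    );
  }
  return seconds;
}

// `duration` is an assumed option name; verify it against the SDK reference.
const timedRequest = {
  model: 'alibaba/wan-v2.5-t2v-preview',
  prompt: 'A serene mountain lake at sunrise.',
  duration: checkDuration(8),
};
```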
Do I need special Vercel plan access to use this model?
Access requires either a Pro or Enterprise plan, or paid AI Gateway usage.