Seedance v1.5 Pro
Seedance v1.5 Pro, released December 16, 2025, is ByteDance's first video model with joint audio-visual generation. It produces synchronized dialogue, sound effects, and ambient audio alongside 1080p video in a single generation pass, with multilingual voice and regional dialect support.
```typescript
import { experimental_generateVideo as generateVideo } from 'ai';

const result = await generateVideo({
  model: 'bytedance/seedance-v1.5-pro',
  prompt: 'A serene mountain lake at sunrise.',
});
```

Frequently Asked Questions
What languages does Seedance v1.5 Pro support for voice generation?
Six languages: Chinese, English, Japanese, Korean, Spanish, and Indonesian. Regional dialect coverage includes Sichuanese and Cantonese.
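An application that exposes a language picker can encode this supported set and check a user's choice before sending a request. A minimal sketch; the constant and function names here are illustrative, not part of any SDK:

```typescript
// Languages Seedance v1.5 Pro supports for voice generation,
// plus the regional dialects named above. Names are illustrative.
const SUPPORTED_LANGUAGES = [
  'chinese', 'english', 'japanese', 'korean', 'spanish', 'indonesian',
] as const;

const SUPPORTED_DIALECTS = ['sichuanese', 'cantonese'] as const;

type VoiceLanguage =
  | (typeof SUPPORTED_LANGUAGES)[number]
  | (typeof SUPPORTED_DIALECTS)[number];

// Returns true if the requested voice language or dialect is supported.
function isSupportedVoiceLanguage(lang: string): lang is VoiceLanguage {
  const normalized = lang.trim().toLowerCase();
  return (
    (SUPPORTED_LANGUAGES as readonly string[]).includes(normalized) ||
    (SUPPORTED_DIALECTS as readonly string[]).includes(normalized)
  );
}
```

Validating up front avoids a round trip to the model for a voice language it cannot produce.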
Does Seedance v1.5 Pro require a separate text-to-speech step for audio?
No. Seedance v1.5 Pro generates voice, ambient sound, and sound effects in the same inference pass as the video. You don't need an external audio pipeline.
How does audio-visual synchronization work in Seedance v1.5 Pro?
Seedance v1.5 Pro is trained to align lip movements, intonation, and performance rhythm with the visual content. ByteDance's release documentation reports lower audio-visual misalignment than the baselines listed in its comparison tables. See https://console.byteplus.com/ark/region:ark+ap-southeast-1/model/detail?Id=seedance-1-5-pro.
What video specifications does Seedance v1.5 Pro support?
Resolutions of 480p, 720p, and 1080p at 24 fps; clip durations of 4 to 12 seconds; and seven aspect ratios: 16:9, 9:16, 1:1, 4:3, 3:4, 21:9, and 9:21.
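These constraints can be checked client-side before a generation request is submitted. A hedged sketch assuming the supported values listed above; the function and field names are illustrative, not an SDK API:

```typescript
// Supported output specs for Seedance v1.5 Pro, per the list above.
const RESOLUTIONS = ['480p', '720p', '1080p'] as const;
const ASPECT_RATIOS = ['16:9', '9:16', '1:1', '4:3', '3:4', '21:9', '9:21'] as const;
const MIN_DURATION_S = 4;
const MAX_DURATION_S = 12;
const FPS = 24; // frame rate is fixed, not configurable

interface ClipSpec {
  resolution: string;
  aspectRatio: string;
  durationSeconds: number;
}

// Returns a list of human-readable problems; an empty list means the spec is valid.
function validateClipSpec(spec: ClipSpec): string[] {
  const problems: string[] = [];
  if (!(RESOLUTIONS as readonly string[]).includes(spec.resolution)) {
    problems.push(`unsupported resolution: ${spec.resolution}`);
  }
  if (!(ASPECT_RATIOS as readonly string[]).includes(spec.aspectRatio)) {
    problems.push(`unsupported aspect ratio: ${spec.aspectRatio}`);
  }
  if (spec.durationSeconds < MIN_DURATION_S || spec.durationSeconds > MAX_DURATION_S) {
    problems.push(
      `duration must be ${MIN_DURATION_S}-${MAX_DURATION_S}s, got ${spec.durationSeconds}`,
    );
  }
  return problems;
}
```

For example, `validateClipSpec({ resolution: '1080p', aspectRatio: '21:9', durationSeconds: 8 })` returns an empty list, while a 20-second 4K request would return two problems.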
How does Seedance v1.5 Pro differ from Seedance 1.0 Pro on video quality alone?
Seedance v1.5 Pro adds cinematic camera techniques (dolly zooms, long takes), color-grading controls, and finer facial detail in close-ups, beyond the motion-stability focus of Seedance 1.0 Pro.
Can Seedance v1.5 Pro generate ambient sound without spoken dialogue?
Yes. The audio system generates spatial sound effects and ambient audio that match the visual scene's physical environment, whether or not the scene contains speech.