
Wan 2.7 Reference-to-Video API by Alibaba
Generates character-driven videos from reference images and videos, with multi-subject and voice-cloning support.
What makes it stand out?
- Character consistency: Provide reference images or videos of characters, and the model preserves their appearance across the generated video.
- Multi-subject scenes: Include up to 5 reference materials (images + videos combined) to create scenes with multiple characters interacting.
- Voice cloning: Attach a reference audio clip to transfer a character's voice into the generated video.
- Flexible framing: Five aspect ratios (16:9, 9:16, 1:1, 4:3, 3:4) at 720P or 1080P.
Designed For
- Creators building character-driven stories that need consistent character identity across clips.
- Teams producing multi-character interaction videos from a set of reference assets.
- Anyone who wants to animate a character from a photo or short clip with a specific voice.
How to Use
- Write a prompt using labels like "character1" and "character2" to reference each subject.
- Provide reference materials in order: the first image or video maps to character1, the second to character2, and so on.
- Each reference should contain only one subject (person, animal, or object).
- Optionally add a reference_voice audio URL to give a character a specific voice.
- Set resolution, ratio, and duration for the output video.
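The steps above can be sketched as a small request builder. This is only an illustration under stated assumptions, not the documented API: the function name build_request, the field names prompt, references, resolution, ratio, and duration, the use of seconds for duration, and the example.com URLs are all hypothetical. Only the "characterN" labels, the limit of 5 reference materials, the five aspect ratios, the resolution tiers, and the reference_voice option come from this page.

```python
import re

# Values taken from this page; everything else below is an assumption.
ALLOWED_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}
ALLOWED_RESOLUTIONS = {"720P", "1080P", "1080P-SR", "1440P-SR"}
MAX_REFERENCES = 5


def build_request(prompt, references, resolution, ratio, duration,
                  reference_voice=None):
    """Assemble a hypothetical request body (field names are assumptions)."""
    if not 1 <= len(references) <= MAX_REFERENCES:
        raise ValueError("provide between 1 and 5 reference materials")
    if ratio not in ALLOWED_RATIOS:
        raise ValueError(f"unsupported ratio: {ratio}")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution}")

    # Reference order matters: the Nth reference maps to "characterN",
    # so every label used in the prompt needs a reference at that position.
    labels = {int(n) for n in re.findall(r"character(\d+)", prompt)}
    missing = sorted(n for n in labels if n < 1 or n > len(references))
    if missing:
        raise ValueError(f"no reference supplied for character labels: {missing}")

    payload = {
        "prompt": prompt,
        "references": list(references),  # order preserved
        "resolution": resolution,
        "ratio": ratio,
        "duration": duration,            # assumed to be in seconds
    }
    if reference_voice is not None:
        payload["reference_voice"] = reference_voice
    return payload


if __name__ == "__main__":
    body = build_request(
        prompt="character1 hands a book to character2 in a library",
        references=["https://example.com/alice.png",
                    "https://example.com/bob.mp4"],
        resolution="720P",
        ratio="16:9",
        duration=5,
        reference_voice="https://example.com/alice_voice.mp3",
    )
    print(body)
```

Checking labels against the reference list before sending a request catches the most likely mistake, a prompt that mentions a character with no matching reference.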
Super Resolution
Use 1080P-SR or 1440P-SR when you want sharper detail than the native output. The request first renders a 720P source video, then applies FlashVSR super-resolution to the requested target tier. 1080P-SR is intended as a lower-cost HD option, while 1440P-SR is intended for larger screens or publishing workflows.
For reference-to-video requests, the billable duration of any reference videos can still affect the final charge. Final billing is calculated from the active model pricing configuration for the selected resolution, duration, account, and environment.
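As a rough illustration of the two-stage flow described in this section, the sketch below maps a requested tier to the source render and the upscaling target. The function name plan_render and the returned tuple shape are hypothetical. The page states only that SR tiers render a 720P source and then apply FlashVSR super-resolution, so treating 720P and 1080P as native tiers with no upscaling step is an assumption based on the resolutions listed earlier.

```python
# Target tier for each super-resolution option named on this page.
SR_TARGETS = {"1080P-SR": "1080P", "1440P-SR": "1440P"}
# Assumed native tiers, rendered directly with no upscaling step.
NATIVE_RESOLUTIONS = {"720P", "1080P"}


def plan_render(resolution):
    """Return (source_resolution, upscale_target) for a requested tier.

    upscale_target is None when no super-resolution pass is needed.
    """
    if resolution in SR_TARGETS:
        # SR tiers first render a 720P source, then upscale to the target.
        return ("720P", SR_TARGETS[resolution])
    if resolution in NATIVE_RESOLUTIONS:
        return (resolution, None)
    raise ValueError(f"unsupported resolution: {resolution}")


if __name__ == "__main__":
    for tier in ("720P", "1080P", "1080P-SR", "1440P-SR"):
        print(tier, plan_render(tier))
```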