
Wan 2.7 Image-to-Video API by Alibaba
Animates images into videos with first-frame, first-and-last-frame, video continuation, and audio-driven modes.
Alibaba WAN 2.7 Image-to-Video
Alibaba WAN 2.7 Image-to-Video animates images into videos with multiple generation modes: first-frame, first-and-last-frame, video continuation, and audio-driven animation.
What makes it stand out?
- Multiple animation modes: Start from a single image, control both start and end frames, or extend an existing video clip.
- Audio-driven generation: Provide a driving audio file to generate lip-synced or action-matched video content.
- Multi-shot support: Generate multi-shot narratives with natural transitions and scene variety.
- Up to 15 seconds: Generate videos from 2 to 15 seconds at 720P or 1080P resolution.
- Super Resolution options: Choose 1080P-SR or 1440P-SR when you need a sharper final video with cleaner edges and improved texture detail.
Designed For
- Creators who want to bring still images to life with motion and sound.
- Teams building video content from existing image assets or storyboard frames.
- Anyone who needs controlled video generation with specific start and end states.
- Publishing workflows that need a higher-detail final output from the same source image.
Super Resolution
Set resolution to 1080P-SR or 1440P-SR to use the FlashVSR super-resolution path. The request first generates a source video, then applies video super-resolution before returning the final output.
Use 1080P-SR for sharper HD results when the native 1080P output is not detailed enough. Use 1440P-SR for larger-screen previews, presentation assets, or publishing workflows where extra texture detail matters.
Super Resolution can take longer than native generation because it adds a post-processing step. Final billing depends on the selected model, resolution, and duration, as well as on the account and environment pricing configuration.
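As a rough illustration of the distinction described in this section, the sketch below classifies a resolution value as native or Super Resolution on the client side, for example to warn users about the longer processing time. The four resolution strings come from this page; the helper name uses_super_resolution and the two constant names are hypothetical and not part of the API.

```python
# Resolution values listed on this page.
NATIVE_RESOLUTIONS = {"720P", "1080P"}
SR_RESOLUTIONS = {"1080P-SR", "1440P-SR"}


def uses_super_resolution(resolution: str) -> bool:
    """Return True if the value selects the super-resolution path.

    Hypothetical client-side helper; the service decides the actual path.
    """
    if resolution in SR_RESOLUTIONS:
        return True
    if resolution in NATIVE_RESOLUTIONS:
        return False
    raise ValueError(f"unsupported resolution: {resolution!r}")


if __name__ == "__main__":
    for value in ("720P", "1080P", "1080P-SR", "1440P-SR"):
        path = "super-resolution" if uses_super_resolution(value) else "native"
        print(f"{value}: {path}")
```

A check like this only mirrors the documented options locally; it does not estimate latency or cost.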
How to Use
- First-frame mode: Provide an image URL. The model animates it into a video.
- First-and-last-frame mode: Provide both image (start) and last_image (end). The model generates a transition between them.
- Video continuation: Provide a video clip URL. The model extends the content of this clip.
- Audio-driven: Add an audio URL to any mode. The model matches the video to the audio.
- Add a text prompt to guide the video content and style.
- Choose resolution: 720P or 1080P for native generation, 1080P-SR or 1440P-SR for Super Resolution.
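The steps above can be sketched as a small payload builder. This is a minimal sketch under stated assumptions, not the official client: the field names image, last_image, video, audio, and resolution come from this page, while the prompt and duration field names, the default values, the mode labels, and the rule that video cannot be combined with image inputs are assumptions made for illustration. No endpoint or authentication is shown because this page does not specify them.

```python
from typing import Optional

# Resolution values listed on this page.
ALLOWED_RESOLUTIONS = ("720P", "1080P", "1080P-SR", "1440P-SR")


def infer_mode(image: Optional[str] = None,
               last_image: Optional[str] = None,
               video: Optional[str] = None) -> str:
    """Return a local label for the mode implied by the inputs.

    The labels are hypothetical and are not sent to the API.
    """
    if video:
        if image or last_image:
            # Assumption: this sketch treats video continuation as exclusive.
            raise ValueError("provide either video or image inputs, not both")
        return "video_continuation"
    if image and last_image:
        return "first_and_last_frame"
    if image:
        return "first_frame"
    raise ValueError("provide image, image + last_image, or video")


def build_request(image: Optional[str] = None,
                  last_image: Optional[str] = None,
                  video: Optional[str] = None,
                  audio: Optional[str] = None,
                  prompt: Optional[str] = None,
                  resolution: str = "720P",  # assumed default
                  duration: int = 5) -> dict:  # assumed default and field name
    """Validate inputs and return a request body as a plain dict."""
    infer_mode(image, last_image, video)  # raises on invalid combinations
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if not 2 <= duration <= 15:
        raise ValueError("duration must be between 2 and 15 seconds")
    fields = {
        "image": image,
        "last_image": last_image,
        "video": video,
        "audio": audio,    # optional driving audio, allowed in any mode
        "prompt": prompt,  # assumed field name for the text prompt
        "resolution": resolution,
        "duration": duration,
    }
    # Drop inputs that were not provided.
    return {key: value for key, value in fields.items() if value is not None}


if __name__ == "__main__":
    print(build_request(image="https://example.com/start.png",
                        last_image="https://example.com/end.png",
                        prompt="slow camera push-in",
                        resolution="1080P-SR",
                        duration=8))
```

Validating locally before submitting makes input mistakes visible early; the service's own validation rules may differ from the ones assumed here.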


















