
Vidu Q2 Text-to-Video API by Vidu
Vidu Q2 Text-to-Video is an AI video generation model that turns text descriptions into video. Describe the scene and the motion you want, and the model generates video with smooth motion and optional audio at up to 1080p.
Vidu Q2 Text-to-Video
Vidu Q2 Text-to-Video is a capable AI video generation model that creates videos directly from text descriptions. Positioned between the entry-level Q1 and the professional Q2-Pro, it delivers a strong balance of quality, speed, and affordability — suitable for a wide range of creative and commercial workflows.
Why Choose This?
- Balanced quality and speed: strong visual output without the wait time of higher-tier models.
- High-resolution output: generate videos at 540p, 720p, or 1080p.
- Flexible duration: create videos from 1 to 10 seconds long.
- Audio generation: optional synchronized audio and background music.
- Motion control: adjust movement amplitude for subtle or dynamic animation.
- Prompt Enhancer: built-in tool that automatically improves your video descriptions.
Parameters
| Parameter | Required | Description |
|---|---|---|
| prompt | Yes | Text description of the video scene and action |
| resolution | No | Output quality: 540p, 720p (default), 1080p |
| duration | No | Video length in seconds (1-10, default: 5) |
| aspect_ratio | No | Output ratio: 16:9, 4:3, 9:16, etc. |
| movement_amplitude | No | Motion intensity: auto (default), small, medium, large |
| generate_audio | No | Generate synchronized audio (default: enabled) |
| bgm | No | Add background music (default: enabled) |
| seed | No | Random seed for reproducibility (-1 for random) |
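The table above can be sketched as a small request-building helper. This is only an illustration: `build_payload` is a hypothetical function, not part of any official SDK. The field names and defaults come from the table. Treating `duration` as a whole number of seconds and sending the payload as a flat JSON-style object are assumptions. No endpoint or authentication scheme is shown because the page does not specify one.

```python
from typing import Any, Dict, Optional

# Allowed values taken from the parameter table.
ALLOWED_RESOLUTIONS = {"540p", "720p", "1080p"}
ALLOWED_AMPLITUDES = {"auto", "small", "medium", "large"}


def build_payload(
    prompt: str,
    resolution: str = "720p",
    duration: int = 5,
    aspect_ratio: Optional[str] = None,
    movement_amplitude: str = "auto",
    generate_audio: bool = True,
    bgm: bool = True,
    seed: int = -1,
) -> Dict[str, Any]:
    """Hypothetical helper: validate inputs and return a request body."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")
    # Assumption: duration is an integer number of seconds in the 1-10 range.
    if not isinstance(duration, int) or not 1 <= duration <= 10:
        raise ValueError("duration must be an integer from 1 to 10")
    if movement_amplitude not in ALLOWED_AMPLITUDES:
        raise ValueError(f"movement_amplitude must be one of {sorted(ALLOWED_AMPLITUDES)}")

    payload: Dict[str, Any] = {
        "prompt": prompt,
        "resolution": resolution,
        "duration": duration,
        "movement_amplitude": movement_amplitude,
        "generate_audio": generate_audio,
        "bgm": bgm,
        "seed": seed,  # -1 asks for a random seed
    }
    # The table lists aspect ratios only as "16:9, 4:3, 9:16, etc.",
    # so the value is passed through without validation.
    if aspect_ratio is not None:
        payload["aspect_ratio"] = aspect_ratio
    return payload


if __name__ == "__main__":
    print(build_payload("A red fox runs through fresh snow at sunrise", aspect_ratio="16:9"))
```

Validating on the client side catches out-of-range values before a request is submitted. The service itself may apply different or additional checks.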
How to Use
- Write your prompt — describe the scene, characters, and action in detail.
- Set resolution — higher resolution for better quality, lower for faster processing.
- Adjust duration — set video length up to 10 seconds.
- Configure audio (optional) — enable/disable audio generation and background music.
- Set motion intensity (optional) — control how dynamic the movement is.
- Run — submit and download your video.
Pricing
| Resolution | Cost |
|---|---|
| 540p | Starts at 0.0100/sec |
| 720p | Starts at 0.0250/sec |
| 1080p | Starts at 0.0500/sec |
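As a rough illustration of how the per-second rates combine with duration, the following sketch multiplies the listed starting rate by the clip length. `estimate_min_cost` is a hypothetical helper. The table says "starts at", so the result is a lower bound, not a quote. The page does not state the currency unit, and whether audio or BGM changes the rate is not specified.

```python
# Starting rates per second of video, copied from the pricing table.
# The currency unit is not stated on the page.
RATE_PER_SECOND = {"540p": 0.0100, "720p": 0.0250, "1080p": 0.0500}


def estimate_min_cost(resolution: str, duration_seconds: int) -> float:
    """Hypothetical helper: lower-bound cost for one clip."""
    if resolution not in RATE_PER_SECOND:
        raise ValueError(f"unknown resolution: {resolution}")
    if not 1 <= duration_seconds <= 10:
        raise ValueError("duration must be between 1 and 10 seconds")
    # Round to four decimals to match the precision used in the table.
    return round(RATE_PER_SECOND[resolution] * duration_seconds, 4)


if __name__ == "__main__":
    for res in ("540p", "720p", "1080p"):
        print(res, estimate_min_cost(res, 5))
```

Under these assumptions, a 5-second clip at the default 720p comes to at least 0.125, and a 10-second clip at 1080p to at least 0.5, in the page's unstated currency.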
Best Use Cases
- Social Media Content — Create short-form videos for TikTok, Reels, and Stories.
- Concept Visualization — Bring creative ideas and narratives to life without filming.
- Marketing Videos — Produce promotional content from text descriptions.
- Educational Content — Generate illustrative clips for tutorials and explainers.
- Storytelling — Create narrative scenes for creative and commercial projects.
Pro Tips
- Use the Prompt Enhancer to refine your descriptions automatically.
- Be specific about character actions, emotions, and scene details for best results.
- Set movement_amplitude to "small" for subtle, cinematic motion or "large" for energetic action.
- Enable generate_audio and bgm for a complete video experience with sound.
- Use seed for reproducible results when iterating on prompts.
- Start with 540p to preview your concept, then switch to 720p or 1080p for final output.
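The preview-then-final tip and the seed tip can be combined in a short sketch. `preview_and_final` is a hypothetical helper that builds two request bodies differing only in resolution. The seed range used here is an assumption, and the page does not say whether the same seed yields the same composition at a different resolution, so treat the preview as a guide, not a guarantee.

```python
import random
from typing import Any, Dict, Optional, Tuple


def preview_and_final(
    prompt: str,
    final_resolution: str = "1080p",
    seed: Optional[int] = None,
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical helper: a 540p preview request and a matching final request."""
    if final_resolution not in {"720p", "1080p"}:
        raise ValueError("final_resolution must be 720p or 1080p")
    # Pick a concrete seed up front so both requests share it.
    # Assumption: non-negative 31-bit integers are accepted as seeds.
    if seed is None or seed < 0:
        seed = random.randrange(0, 2**31)

    base = {"prompt": prompt, "seed": seed}
    preview = {**base, "resolution": "540p"}
    final = {**base, "resolution": final_resolution}
    return preview, final


if __name__ == "__main__":
    draft, final = preview_and_final("A paper boat drifts down a rainy street", seed=1234)
    print(draft)
    print(final)
```

Fixing the seed before the first request, instead of sending -1, is what makes the later request repeatable, since a random seed chosen by the service would not be known to the caller.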
Notes
- Maximum video duration is 10 seconds.
- Audio generation adds synchronized sound effects and ambient audio.
- BGM adds background music appropriate to the scene mood.
- Ensure prompts describe clear, visualizable scenes for best results.
Related Models
- Vidu Q2 Reference-to-Video — Generate video using reference images as visual anchors.
- Vidu Q2-Pro Text-to-Video — Higher quality text-to-video with 1080p support and extended duration.
- Vidu Q1 Text-to-Video — Faster, more affordable text-to-video for quick prototyping.