
Vidu Q1 Reference-to-Video API by Vidu
Vidu Q1 Reference-to-Video is an AI video generation model that brings static reference images to life. Upload one or more subject images and describe the motion you want; the model generates a 1080p video with smooth animation and optional audio.
Vidu Q1 Reference-to-Video is an efficient AI video generation model that produces video featuring specific subjects. Provide subject images alongside a motion prompt, and the model generates a 5-second 1080p video that preserves each subject's appearance, style, and identity, quickly and at an accessible price point.
Why Choose This?
- Fast generation: Optimized for quick turnaround with minimal wait time.
- Subject-driven generation: Feature specific characters or objects with consistent appearance across the generated video.
- 1080p output: Generate videos in full 1080p high-definition quality.
- 5-second videos: Produces crisp, fixed-length 5-second videos ready to share.
- Audio generation: Optional audio with configurable type: full audio, speech only, or sound effects only.
- Prompt Enhancer: Built-in tool to automatically improve your motion descriptions.
Parameters
| Parameter | Required | Description |
|---|---|---|
| prompt | Yes | Text description of the desired motion and action |
| subjects | Yes | One or more subject images to feature in the video (URL or upload) |
| resolution | No | Output quality: 1080p |
| duration | No | Fixed video length of 5 seconds |
| aspect_ratio | No | Aspect ratio of the output: 16:9 (default), 9:16, 1:1, 4:3, 3:4 |
| movement_amplitude | No | Motion intensity: auto (default), small, medium, large |
| generate_audio | No | Whether to generate audio for the video (default: true) |
| audio_type | No | Audio type when generate_audio is true: all (default), speech_only, sound_effect_only |
| seed | No | Seed for generation (default: 0); use -1 for a random seed |
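The table above can be turned into a small client-side check before a request is sent. The following is a minimal sketch, not an official client: the helper name `build_payload` and the idea of sending the fields as a flat JSON-style dictionary are assumptions, while the field names, defaults, and allowed values are taken from the table.

```python
from __future__ import annotations

from typing import Any

# Allowed values copied from the parameter table; the helper itself is hypothetical.
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}
MOVEMENT_AMPLITUDES = {"auto", "small", "medium", "large"}
AUDIO_TYPES = {"all", "speech_only", "sound_effect_only"}


def build_payload(
    prompt: str,
    subjects: list[str],
    aspect_ratio: str = "16:9",
    movement_amplitude: str = "auto",
    generate_audio: bool = True,
    audio_type: str = "all",
    seed: int = 0,
) -> dict[str, Any]:
    """Validate inputs against the documented values and return a request body."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if not subjects:
        raise ValueError("subjects needs at least one image")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio}")
    if movement_amplitude not in MOVEMENT_AMPLITUDES:
        raise ValueError(f"unsupported movement_amplitude: {movement_amplitude}")
    if audio_type not in AUDIO_TYPES:
        raise ValueError(f"unsupported audio_type: {audio_type}")

    payload: dict[str, Any] = {
        "prompt": prompt,
        "subjects": list(subjects),
        "resolution": "1080p",  # only documented value
        "duration": 5,          # fixed length in seconds
        "aspect_ratio": aspect_ratio,
        "movement_amplitude": movement_amplitude,
        "generate_audio": generate_audio,
        "seed": seed,           # -1 requests a random seed
    }
    # audio_type only applies when audio generation is enabled.
    if generate_audio:
        payload["audio_type"] = audio_type
    return payload
```

Leaving `audio_type` out when `generate_audio` is false is a design choice in this sketch; the service may also accept and ignore the field.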
How to Use
- Upload your subject images — provide one or more images of the subjects to feature in the video.
- Write your prompt — describe the motion, camera movement, and desired action.
- Configure audio (optional) — enable audio and select the audio type: all, speech_only, or sound_effect_only.
- Set motion intensity (optional) — adjust movement_amplitude for subtle or dynamic animations.
- Run — submit and download your video.
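For programmatic use, the steps above roughly map to building a JSON body and posting it. The sketch below only constructs the request and does not send it. The endpoint URL, the bearer-token header, and the `build_request` helper are placeholders invented for illustration; the real endpoint and authentication scheme come from your provider's API reference.

```python
import json
import urllib.request

# Placeholder endpoint: replace with the URL from your provider's API reference.
API_URL = "https://api.example.com/vidu/q1/reference-to-video"


def build_request(payload: dict, api_key: str, url: str = API_URL) -> urllib.request.Request:
    """Wrap a JSON payload in a POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Bearer auth is an assumption, not a documented requirement.
            "Authorization": f"Bearer {api_key}",
        },
    )


request = build_request(
    {
        # Steps 1 and 2: subject images and a motion prompt.
        "subjects": ["https://example.com/robot.png"],
        "prompt": "The robot turns toward the camera and waves slowly.",
        # Step 3: audio settings.
        "generate_audio": True,
        "audio_type": "sound_effect_only",
        # Step 4: motion intensity.
        "movement_amplitude": "small",
    },
    api_key="YOUR_API_KEY",
)
# Step 5 would be urllib.request.urlopen(request), followed by downloading
# the video from whatever URL the response provides.
```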
Pricing
| Resolution | Cost |
|---|---|
| 1080p | $0.40 |
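For budgeting a batch, the arithmetic is simple. The sketch below assumes the listed price is charged once per generated 5-second video, which the table does not state explicitly; the function names are illustrative.

```python
from decimal import Decimal

# Price from the table above; per-video billing is an assumption.
PRICE_PER_VIDEO = Decimal("0.40")
CLIP_SECONDS = 5  # fixed duration of each video


def estimate_cost(num_videos: int) -> Decimal:
    """Total cost in dollars for a batch of videos."""
    if num_videos < 0:
        raise ValueError("num_videos must be non-negative")
    return PRICE_PER_VIDEO * num_videos


def videos_within_budget(budget: str) -> int:
    """How many whole videos a dollar budget covers."""
    return int(Decimal(budget) // PRICE_PER_VIDEO)


print(estimate_cost(12))                          # 4.80
print(videos_within_budget("10"))                 # 25
print(videos_within_budget("10") * CLIP_SECONDS)  # 125 seconds of footage
```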
Best Use Cases
- Character Consistency — Generate video featuring a specific character or subject across scenes.
- Product Videos — Animate product photos while preserving brand appearance.
- Style-Consistent Content — Produce video that matches the visual aesthetic of existing assets.
- Social Media Content — Create engaging animated clips based on your existing image library.
- Rapid Prototyping — Quickly explore reference-guided video concepts before committing to higher-quality generation.
Pro Tips
- Use the Prompt Enhancer to refine your motion descriptions.
- Provide clear, well-lit subject images for the most consistent visual output.
- Be specific about movement direction, speed, and camera angles in your prompt.
- Use multiple subject images when the scene involves more than one character or object.
- Set audio_type to speech_only when the scene involves dialogue, or sound_effect_only for purely ambient audio.
- Describe environmental elements (lighting, background) in the prompt to guide the scene composition.
Notes
- Both prompt and subjects are required fields.
- Video duration is fixed at 5 seconds.
- When generate_audio is true, audio_type controls what is generated: all produces both speech and sound effects, speech_only produces voice audio only, and sound_effect_only produces ambient and environmental sounds only.
- Ensure uploaded subject image URLs are publicly accessible.
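A quick local check can catch subject URLs that are clearly not publicly accessible before a request is submitted. This is a rough sketch: `looks_publicly_reachable` is a hypothetical helper that only inspects the URL string, so it does not resolve DNS or confirm that the image actually loads.

```python
import ipaddress
from urllib.parse import urlparse


def looks_publicly_reachable(url: str) -> bool:
    """Reject URLs that cannot be fetched from the public internet."""
    parsed = urlparse(url)
    # Only http(s) URLs with a host can be fetched by a remote service.
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    host = parsed.hostname
    if host == "localhost" or host.endswith(".local"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A regular domain name; DNS resolution is not checked here.
        return True
    # Literal IPs must be globally routable (not private or loopback).
    return ip.is_global


print(looks_publicly_reachable("https://example.com/subject.png"))  # True
print(looks_publicly_reachable("http://192.168.1.10/subject.png"))  # False
print(looks_publicly_reachable("file:///tmp/subject.png"))          # False
```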
Related Models
- Vidu Q1 Image-to-Video — Animate a single reference image with text-guided motion.
- Vidu Q2-Pro-Fast Reference-to-Video — Higher quality reference-to-video with adjustable duration and resolution.
