Vidu Q2 Pro Reference-to-Video API by Vidu

vidu/q2-pro/reference-to-video

Vidu Q2-Pro Reference-to-Video is an AI video generation model that animates subjects taken from reference images. Upload one or more reference images and describe the motion you want; the model generates video at up to 1080p with smooth animation and optional audio.

Vidu Q2-Pro Reference-to-Video

Vidu Q2-Pro Reference-to-Video is a professional-grade model for generating video that features specific subjects. Provide subject images alongside a motion prompt, and the model returns video at up to 1080p with rich detail, strict subject fidelity, and smooth, natural motion. It is aimed at high-end creative, brand, and production workflows.

Why Choose This?

  • Professional quality: Cinematic detail and smooth motion with faithful subject preservation at up to 1080p.

  • Subject-driven generation: Feature specific characters or objects with strict visual fidelity throughout the video.

  • Flexible duration: Create videos up to 10 seconds in length.

  • Audio generation: Optional audio with configurable type: full audio, speech only, or sound effects only.

  • Motion control: Adjust movement amplitude for subtle or dynamic animations.

  • Prompt Enhancer: Built-in tool to automatically improve your motion descriptions.

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| prompt | Yes | Text description of the desired motion and action |
| subjects | Yes | One or more subject images to feature in the video (URL or upload) |
| resolution | No | Output quality: 540p, 720p (default), 1080p |
| duration | No | Video length in seconds (1-10, default: 5) |
| aspect_ratio | No | Aspect ratio of the output: 16:9 (default), 9:16, 1:1, 4:3, 3:4 |
| movement_amplitude | No | Motion intensity: auto (default), small, medium, large |
| generate_audio | No | Whether to generate audio for the video (default: true) |
| audio_type | No | Audio type when generate_audio is true: all (default), speech_only, sound_effect_only |
| seed | No | Seed for generation (default: 0); use -1 for a random seed |
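
To make the defaults and allowed values concrete, here is a minimal Python sketch that assembles a request body and rejects values the table does not list. The helper name `build_request` is hypothetical, and passing `subjects` as a list of image URL strings is an assumption; this page does not specify the wire format or the endpoint, so the sketch stops short of sending anything.

```python
from typing import Any, Dict, Sequence

# Allowed values copied from the parameter table.
ALLOWED = {
    "resolution": {"540p", "720p", "1080p"},
    "aspect_ratio": {"16:9", "9:16", "1:1", "4:3", "3:4"},
    "movement_amplitude": {"auto", "small", "medium", "large"},
    "audio_type": {"all", "speech_only", "sound_effect_only"},
}


def build_request(prompt: str, subjects: Sequence[str], **options: Any) -> Dict[str, Any]:
    """Hypothetical helper: apply the documented defaults, then validate."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if not subjects:
        raise ValueError("subjects needs at least one image")

    # Defaults as listed in the table. Assumption: subjects is a list of URLs.
    payload: Dict[str, Any] = {
        "prompt": prompt,
        "subjects": list(subjects),
        "resolution": "720p",
        "duration": 5,
        "aspect_ratio": "16:9",
        "movement_amplitude": "auto",
        "generate_audio": True,
        "audio_type": "all",
        "seed": 0,
    }

    unknown = set(options) - set(payload)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload.update(options)

    for name, allowed in ALLOWED.items():
        if payload[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    if not 1 <= payload["duration"] <= 10:
        raise ValueError("duration must be between 1 and 10 seconds")
    return payload
```

For example, `build_request("The robot waves at the camera", ["https://example.com/robot.png"], duration=8)` returns a dictionary with `duration` set to 8 and every other optional field at its documented default.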

How to Use

  1. Upload your subject images — provide one or more images of the subjects to feature in the video.
  2. Write your prompt — describe the motion, camera movement, and desired action.
  3. Set resolution — higher resolution for better quality, lower for faster processing.
  4. Adjust duration — set video length up to 10 seconds.
  5. Configure audio (optional) — enable audio and select the audio type: all, speech_only, or sound_effect_only.
  6. Set motion intensity (optional) — adjust movement_amplitude for subtle or dynamic animations.
  7. Run — submit and download your video.

Pricing

| Resolution | Cost |
| --- | --- |
| 540p | Starts at $0.1000, +$0.0250/sec |
| 720p | Starts at $0.1500, +$0.0250/sec |
| 1080p | Starts at $0.4250, +$0.0500/sec |
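
The pricing above can be turned into a rough estimate. The sketch below is only that: the page does not say how many seconds the starting price covers, so the hypothetical `estimate_cost` function takes that as an explicit `included_seconds` argument that the caller must supply. Treat the result as an approximation, not a quote.

```python
from decimal import Decimal

# (starting price, per-second rate), copied from the pricing table.
PRICING = {
    "540p": (Decimal("0.1000"), Decimal("0.0250")),
    "720p": (Decimal("0.1500"), Decimal("0.0250")),
    "1080p": (Decimal("0.4250"), Decimal("0.0500")),
}


def estimate_cost(resolution: str, duration: int, included_seconds: int) -> Decimal:
    """Hypothetical estimate: starting price plus the per-second rate for
    every second beyond `included_seconds` (an assumption, not documented)."""
    start, per_second = PRICING[resolution]
    billable_seconds = max(0, duration - included_seconds)
    return start + per_second * billable_seconds
```

Passing `included_seconds=0` models the reading in which every second is billed on top of the starting price; a larger value models a starting price that already covers the first few seconds.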

Best Use Cases

  • Character Consistency — Generate high-quality video featuring a specific character or subject with strict visual fidelity.
  • Brand & Product Videos — Produce professional-grade product animations while preserving brand identity.
  • Film & Narrative Production — Animate reference imagery for previs, concept reels, or final narrative content.
  • Style-Consistent Campaigns — Create multiple video assets that maintain a unified visual style across a campaign.
  • Premium Social Media Content — Publish cinematic, reference-guided video for high-visibility channels.

Pro Tips

  • Use the Prompt Enhancer to refine your motion descriptions.
  • Provide high-resolution, well-composed subject images for the strongest visual consistency.
  • Be specific about movement direction, speed, camera angles, and framing in your prompt.
  • Use multiple subject images to define different characters or scene elements independently.
  • Set movement_amplitude to "small" for precise, controlled motion or "large" for expressive action.
  • Set audio_type to speech_only when the scene involves dialogue, or sound_effect_only for purely ambient audio.
  • Describe lighting, atmosphere, and environmental effects in the prompt for richer scene quality.

Notes

  • Both prompt and subjects are required fields.
  • Maximum video duration is 10 seconds.
  • When generate_audio is true, audio_type controls what is generated: all includes both speech and sound effects; speech_only generates voice audio; sound_effect_only generates ambient and environmental sounds.
  • Ensure uploaded subject image URLs are publicly accessible.
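
As a cheap local pre-check for the last note, the following sketch flags subject URLs that clearly cannot be reached from outside your machine or network. The function name `looks_publicly_reachable` is hypothetical. It only inspects the URL string with the Python standard library, so a `True` result means "not obviously private", not "confirmed accessible"; it makes no network request.

```python
import ipaddress
from urllib.parse import urlsplit


def looks_publicly_reachable(url: str) -> bool:
    """Hypothetical syntactic check; it does not contact the host."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False  # file paths, relative paths, other schemes
    host = parts.hostname
    if host == "localhost" or host.endswith(".local"):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: cannot be judged without a lookup, so let it through.
        return True
    # Literal IP: reject loopback, private, and other non-global ranges.
    return address.is_global
```

A typical use is filtering before submission, for example `bad = [u for u in subjects if not looks_publicly_reachable(u)]`, and reporting `bad` to the user rather than sending the request.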
