Vidu Q2 Reference-to-Video API by Vidu

vidu/q2/reference-to-video

Vidu Q2 Reference-to-Video is an AI video generation model that animates specific subjects from reference images. Provide one or more subject images alongside a motion prompt, and the model produces smooth, natural video that preserves each subject's appearance and identity, with optional audio and output up to 1080p. It offers a balance of quality and cost suited to subject-driven workflows.

Why Choose This?

  • Balanced quality and speed: Solid visual consistency and motion quality at a mid-tier price point.

  • Subject-driven generation: Feature specific characters or objects with consistent appearance throughout the video.

  • High resolution output: Generate videos in 540p, 720p, or 1080p quality.

  • Flexible duration: Create videos from 1 to 10 seconds in length.

  • Audio generation: Optional audio with configurable type: full audio, speech only, or sound effects only.

  • Prompt Enhancer: Built-in tool to automatically improve your motion descriptions.

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| prompt | Yes | Text description of the desired motion and action |
| subjects | Yes | One or more subject images to feature in the video (URL or upload) |
| resolution | No | Output quality: 540p, 720p (default), 1080p |
| duration | No | Video length in seconds (1-10, default: 5) |
| aspect_ratio | No | Aspect ratio of the output: 16:9 (default), 9:16, 1:1, 4:3, 3:4 |
| movement_amplitude | No | Motion intensity: auto (default), small, medium, large |
| generate_audio | No | Whether to generate audio for the video (default: true) |
| audio_type | No | Audio type when generate_audio is true: all (default), speech_only, sound_effect_only |
| seed | No | Seed for generation (default: 0); use -1 for a random seed |
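
The parameter table can be mirrored in a small client-side check before anything is submitted. This is a minimal sketch rather than an official client: the `build_payload` helper is hypothetical, and sending `subjects` as a list of image URLs is an assumption inferred from the table, not a confirmed request schema.

```python
from typing import Any

# Allowed values copied from the parameter table.
RESOLUTIONS = {"540p", "720p", "1080p"}
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}
MOVEMENT_AMPLITUDES = {"auto", "small", "medium", "large"}
AUDIO_TYPES = {"all", "speech_only", "sound_effect_only"}


def build_payload(
    prompt: str,
    subjects: list[str],
    resolution: str = "720p",
    duration: int = 5,
    aspect_ratio: str = "16:9",
    movement_amplitude: str = "auto",
    generate_audio: bool = True,
    audio_type: str = "all",
    seed: int = 0,
) -> dict[str, Any]:
    """Hypothetical helper: validate inputs against the documented values."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if not subjects:
        raise ValueError("at least one subject image is required")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(RESOLUTIONS)}")
    if not 1 <= duration <= 10:
        raise ValueError("duration must be between 1 and 10 seconds")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    if movement_amplitude not in MOVEMENT_AMPLITUDES:
        raise ValueError(
            f"movement_amplitude must be one of {sorted(MOVEMENT_AMPLITUDES)}"
        )
    if generate_audio and audio_type not in AUDIO_TYPES:
        raise ValueError(f"audio_type must be one of {sorted(AUDIO_TYPES)}")

    payload: dict[str, Any] = {
        "prompt": prompt,
        "subjects": list(subjects),
        "resolution": resolution,
        "duration": duration,
        "aspect_ratio": aspect_ratio,
        "movement_amplitude": movement_amplitude,
        "generate_audio": generate_audio,
        "seed": seed,
    }
    # audio_type only applies when audio generation is enabled.
    if generate_audio:
        payload["audio_type"] = audio_type
    return payload
```

Calling `build_payload("A cat stretches and yawns", ["https://example.com/cat.png"])` yields a dictionary filled with the documented defaults; an out-of-range value such as `duration=11` raises `ValueError` before any request is made.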

How to Use

  1. Upload your subject images — provide one or more images of the subjects to feature in the video.
  2. Write your prompt — describe the motion, camera movement, and desired action.
  3. Set resolution — higher resolution for better quality, lower for faster processing.
  4. Adjust duration — set video length up to 10 seconds.
  5. Configure audio (optional) — enable audio and select the audio type: all, speech_only, or sound_effect_only.
  6. Set motion intensity (optional) — adjust movement_amplitude for subtle or dynamic animations.
  7. Run — submit and download your video.
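
For programmatic use, the steps above roughly correspond to building one JSON request. The sketch below only constructs the request and does not send it, because this page does not state the endpoint or authentication scheme: `BASE_URL`, the route layout, the bearer-token header, and the `make_request` helper are all hypothetical placeholders to be replaced with values from the provider's API reference.

```python
import json
import urllib.request

# Hypothetical values: the real base URL, route, and auth scheme are not given here.
BASE_URL = "https://api.example.com/v1"
MODEL_ID = "vidu/q2/reference-to-video"


def make_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the model."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{MODEL_ID}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Steps 1-6: subject images, prompt, resolution, duration, audio, motion intensity.
payload = {
    "prompt": "The character turns toward the camera and waves slowly.",
    "subjects": ["https://example.com/character.png"],
    "resolution": "720p",
    "duration": 5,
    "generate_audio": True,
    "audio_type": "all",
    "movement_amplitude": "auto",
}

# Step 7: submit. Sending is commented out because the endpoint is a placeholder.
request = make_request("YOUR_API_KEY", payload)
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```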

Pricing

| Resolution | Cost |
| --- | --- |
| 540p | Starts at $0.0750, +$0.0250/sec |
| 720p | Starts at $0.1250, +$0.0250/sec |
| 1080p | Starts at $0.3750, +$0.0500/sec |
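
The rates above combine a starting price with a per-second increment. As a rough sketch of the arithmetic, the hypothetical `estimate_cost` helper below assumes the starting price covers the first second and each additional second adds the per-second rate. The page does not spell out this rule, so treat it as an assumption and confirm against actual billing.

```python
from decimal import Decimal

# (starting price, per-second increment) copied from the pricing table.
PRICING = {
    "540p": (Decimal("0.0750"), Decimal("0.0250")),
    "720p": (Decimal("0.1250"), Decimal("0.0250")),
    "1080p": (Decimal("0.3750"), Decimal("0.0500")),
}


def estimate_cost(resolution: str, duration: int) -> Decimal:
    """Estimate the cost of one video.

    ASSUMPTION: the starting price covers the first second, and every
    additional second adds the per-second increment.
    """
    if resolution not in PRICING:
        raise ValueError(f"unknown resolution: {resolution}")
    if not 1 <= duration <= 10:
        raise ValueError("duration must be between 1 and 10 seconds")
    base, per_second = PRICING[resolution]
    return base + per_second * (duration - 1)


print(estimate_cost("720p", 5))    # 0.1250 + 4 * 0.0250
print(estimate_cost("1080p", 10))  # 0.3750 + 9 * 0.0500
```

`Decimal` is used instead of floating point so that the four-decimal prices add up exactly.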

Best Use Cases

  • Character Consistency — Generate video featuring a specific character or subject across multiple scenes.
  • Product Videos — Animate product imagery while maintaining accurate brand appearance.
  • Style-Consistent Content — Produce video that matches the visual aesthetic of existing creative assets.
  • Social Media Content — Create animated clips grounded in your existing image library.
  • Concept Development — Explore reference-guided motion ideas quickly and affordably.

Pro Tips

  • Use the Prompt Enhancer to refine your motion descriptions.
  • Provide clear, well-lit subject images for the most consistent visual output.
  • Be specific about movement direction, speed, and camera angles in your prompt.
  • Use multiple subject images when the scene involves more than one character or object.
  • Set audio_type to speech_only when the scene involves dialogue, or sound_effect_only for purely ambient audio.
  • Start with 540p for previews and switch to 720p or 1080p for final output.
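
The preview-then-final tip can be sketched as deriving two payloads that differ only in resolution. The `preview_and_final` helper and the payload shape are hypothetical. Keeping the other settings identical, including the seed, is only a convenience for comparing runs; the page does not state that a preview and a final render will match.

```python
def preview_and_final(
    payload: dict, final_resolution: str = "1080p"
) -> tuple[dict, dict]:
    """Return (preview, final) payloads that differ only in resolution."""
    if final_resolution not in {"720p", "1080p"}:
        raise ValueError("final_resolution must be '720p' or '1080p'")
    preview = {**payload, "resolution": "540p"}  # cheap, fast iteration
    final = {**payload, "resolution": final_resolution}
    return preview, final


base = {
    "prompt": "The dancer spins once, then bows toward the camera.",
    "subjects": ["https://example.com/dancer.png"],
    "duration": 5,
    "audio_type": "sound_effect_only",
}
preview, final = preview_and_final(base, final_resolution="720p")
```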

Notes

  • Both prompt and subjects are required fields.
  • Maximum video duration is 10 seconds.
  • When generate_audio is true, audio_type controls what is generated: all includes speech and sound effects; speech_only generates voice audio; sound_effect_only generates ambient and environmental sounds.
  • Ensure uploaded subject image URLs are publicly accessible.
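
A quick local check can catch subject image URLs that are obviously not publicly accessible, such as localhost addresses, private network IPs, or file paths. The `looks_publicly_reachable` helper below is a hypothetical heuristic: it inspects the URL only and does not fetch it or resolve DNS, so a URL that passes may still be unreachable.

```python
import ipaddress
from urllib.parse import urlparse


def looks_publicly_reachable(url: str) -> bool:
    """Heuristic check only: it does not fetch the URL or resolve DNS."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    host = parsed.hostname
    if host == "localhost" or host.endswith(".local"):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A regular hostname rather than an IP literal; assume it is public.
        return True
    # Rejects loopback, private, link-local, and other non-global ranges.
    return address.is_global


for candidate in (
    "https://example.com/cat.png",
    "http://192.168.1.10/cat.png",
    "file:///tmp/cat.png",
):
    print(candidate, looks_publicly_reachable(candidate))
```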
