
Kling Video O3 Pro Reference-to-Video API by Kuaishou
Kling Omni Video O3 Reference-to-Video generates creative videos from character, prop, or scene references. It produces professional-quality output from up to 7 reference images and an optional video input.
Kling Video O3 Pro Reference-to-Video
Kling Video O3 Pro Reference-to-Video generates premium video from reference images with optional video guidance. Upload reference images to establish character identity and appearance, optionally provide a reference video for motion guidance, and describe the scene — the model produces top-tier cinematic video with identity consistency.
Why Choose This?
- O3 Pro quality: The highest visual fidelity and motion realism in the Kling family.
- Multi-reference images: Upload up to 7 reference images (or up to 4 with a reference video).
- Video-guided generation: Optional reference video for motion and scene guidance.
- Keep original sound: Preserve the audio from the reference video in the output.
- Sound generation: Optional AI-generated sound effects when no reference video is provided.
Parameters
| Parameter | Required | Description |
|---|---|---|
| prompt | Yes | Text description of the video scene and motion |
| video | No | Reference video for motion guidance |
| images | No | Reference images: up to 4 with video, up to 7 without |
| keep_original_sound | No | Keep audio from the reference video (default: enabled) |
| sound | No | Generate AI audio (only when no reference video, default: disabled) |
| aspect_ratio | No | Output ratio: 16:9 (default), 9:16, 1:1 |
| duration | No | Video length: 3-15 seconds (default: 5) |
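The limits in the table can be checked on the client before a request is submitted. The following is a minimal sketch, not the official client: `build_request` is a hypothetical helper, and it assumes that `images` is a list of URL strings, `video` is a single URL string, and `duration` is given in whole seconds. Only the parameter names, defaults, and limits come from the table above.

```python
from typing import Any, Dict, List, Optional

# Values taken from the parameter table above.
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
MIN_DURATION, MAX_DURATION = 3, 15


def build_request(
    prompt: str,
    images: Optional[List[str]] = None,   # assumed: list of image URLs
    video: Optional[str] = None,          # assumed: a single video URL
    aspect_ratio: str = "16:9",
    duration: int = 5,                    # assumed: whole seconds
) -> Dict[str, Any]:
    """Hypothetical helper: validate inputs and return a request payload dict."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")

    images = list(images or [])
    # Up to 4 reference images with a reference video, up to 7 without.
    max_images = 4 if video else 7
    if len(images) > max_images:
        raise ValueError(f"at most {max_images} reference images allowed, got {len(images)}")

    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")

    if not MIN_DURATION <= duration <= MAX_DURATION:
        raise ValueError(f"duration must be between {MIN_DURATION} and {MAX_DURATION} seconds")

    payload: Dict[str, Any] = {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
    }
    # Optional inputs are included only when provided.
    if images:
        payload["images"] = images
    if video:
        payload["video"] = video
    return payload
```

Calling `build_request("A dancer on a rooftop at dusk", images=["https://example.com/front.png"])` returns a dict with the default `aspect_ratio` and `duration` filled in. How that dict is sent depends on the client or endpoint you use, which is not covered here.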
How to Use
- Write your prompt: describe the scene, characters, and action.
- Upload reference video (optional): provide a video for motion guidance.
- Upload reference images: add character or scene references.
- Configure audio: keep original sound from video, or enable AI sound generation.
- Select aspect ratio: match your target platform.
- Set duration: choose any length from 3 to 15 seconds.
- Run: submit and download your video.
Best Use Cases
- Long-Form Scenes — Up to 15 seconds for extended scene development.
- Storytelling — Produce narrative scenes with consistent character appearance.
- Marketing & Ads — Create promotional videos featuring specific people or products.
- Video Remixing — Use reference video for motion guidance with new characters.
- Character Consistency — Generate videos with identity-consistent characters.
Pro Tips
- Match aspect ratio to your platform: 16:9 for YouTube, 9:16 for TikTok/Reels, 1:1 for Instagram.
- Use shorter durations (3-5s) for testing, longer (10-15s) for final production.
- Sound generation is only available when no reference video is provided.
- Enable keep_original_sound to preserve audio from your reference video.
- When using a reference video, the image limit is 4; without a video, you can use up to 7.
- Use multiple reference images from different angles for better identity preservation.
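The aspect-ratio and duration tips above can be captured in a small lookup. This is an illustrative sketch only: `suggest_settings` and the platform keys are hypothetical names, and the durations (5 seconds for tests, 10 for final renders) are arbitrary picks from the 3-5s and 10-15s ranges suggested above.

```python
# Platform-to-ratio mapping taken from the tips above; keys are hypothetical labels.
PLATFORM_ASPECT_RATIOS = {
    "youtube": "16:9",
    "tiktok": "9:16",
    "reels": "9:16",
    "instagram": "1:1",
}


def suggest_settings(platform: str, final: bool = False) -> dict:
    """Hypothetical helper: pick aspect_ratio and duration for a target platform."""
    # Fall back to the documented default ratio for unknown platforms.
    aspect_ratio = PLATFORM_ASPECT_RATIOS.get(platform.lower(), "16:9")
    # Short clip while iterating on the prompt, longer clip for the final render.
    duration = 10 if final else 5
    return {"aspect_ratio": aspect_ratio, "duration": duration}
```

For example, `suggest_settings("tiktok")` yields a 9:16, 5-second configuration suitable for a quick test run.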
Notes
- Ensure uploaded image and video URLs are publicly accessible.
- When a reference video is provided, sound generation is unavailable; keep_original_sound controls the output audio instead.
- Reference images limit: up to 4 with video, up to 7 without.
- Duration supports any value from 3 to 15 seconds.
- Only prompt is required; other parameters are optional.
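The audio rules in these notes amount to a single branch on whether a reference video is present. The sketch below is one way to express that on the client side. `resolve_audio` is a hypothetical helper, and raising an error when `sound` is requested together with a video is a design choice made here, not documented API behavior.

```python
from typing import Optional


def resolve_audio(
    video: Optional[str],
    keep_original_sound: bool = True,  # documented default: enabled
    sound: bool = False,               # documented default: disabled
) -> dict:
    """Hypothetical helper: return only the audio option that applies."""
    if video:
        # With a reference video, only keep_original_sound is relevant.
        if sound:
            raise ValueError("sound generation is only available without a reference video")
        return {"keep_original_sound": keep_original_sound}
    # Without a reference video, only AI sound generation is relevant.
    return {"sound": sound}
```

Merging the returned dict into the rest of the request keeps the two audio options from being sent together.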
Related Models
- Kling V3.0 Pro Text-to-Video — Pro quality text-to-video.
- Kling V3.0 Pro Image-to-Video — V3.0 Pro quality at lower cost.
- Kling Video O3 Pro Image-to-Video — O3 Pro quality single image to video.


















