kwaivgi/kling-video-o3-pro/reference-to-video

Kling Omni Video O3 Reference-to-Video generates creative videos using character, prop, or scene references. Professional quality with up to 7 reference images and optional video input.

Each request costs $0.204 per run. For $10 you can run this model approximately 49 times.
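
The run count quoted above follows from dividing the budget by the per-run price and rounding down. A minimal sketch of that arithmetic, assuming the price is in US dollars and that partial runs are not billed (the function name `runs_for_budget` is hypothetical, not part of any API):

```python
from decimal import Decimal

# Per-run price as quoted on this page (assumed to be USD).
PRICE_PER_RUN = Decimal("0.204")


def runs_for_budget(budget_usd: str) -> int:
    """Return how many whole runs a budget covers at the quoted price."""
    # Floor division drops any partial run the budget cannot fully pay for.
    return int(Decimal(budget_usd) // PRICE_PER_RUN)


print(runs_for_budget("10"))
```

Decimal is used instead of float so the division is not affected by binary rounding of 0.204.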

Input Schema

The following parameters are accepted in the request body.

Example Request Body

json
{
  "model": "kwaivgi/kling-video-o3-pro/reference-to-video"
}

Kling Video O3 Pro Reference-to-Video

Kling Video O3 Pro Reference-to-Video generates premium video from reference images with optional video guidance. Upload reference images to establish character identity and appearance, optionally provide a reference video for motion guidance, and describe the scene — the model produces top-tier cinematic video with identity consistency.

Why Choose This?

  • O3 Pro quality: The highest visual fidelity and motion realism in the Kling family.
  • Multi-reference images: Upload up to 7 reference images (or up to 4 with a reference video).
  • Video-guided generation: Optional reference video for motion and scene guidance.
  • Keep original sound: Preserve the audio from the reference video in the output.
  • Sound generation: Optional AI-generated sound effects when no reference video is provided.

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| prompt | Yes | Text description of the video scene and motion |
| video | No | Reference video for motion guidance |
| images | No | Reference images: up to 4 with video, up to 7 without |
| keep_original_sound | No | Keep audio from the reference video (default: enabled) |
| sound | No | Generate AI audio (only when no reference video; default: disabled) |
| aspect_ratio | No | Output ratio: 16:9 (default), 9:16, 1:1 |
| duration | No | Video length: 3-15 seconds (default: 5) |
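
The constraints in the table can be checked on the client before a request is submitted. The following is a minimal sketch, not the service's own validation: the function name `validate_params` is hypothetical, and it assumes `images` is a list of URLs and `duration` is a number of seconds.

```python
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}


def validate_params(params: dict) -> list[str]:
    """Return a list of problems; an empty list means the parameters look valid."""
    errors = []

    # prompt is the only required parameter.
    prompt = params.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        errors.append("prompt is required")

    # The image limit depends on whether a reference video is supplied.
    has_video = bool(params.get("video"))
    images = params.get("images") or []
    max_images = 4 if has_video else 7
    if len(images) > max_images:
        errors.append(f"at most {max_images} images allowed, got {len(images)}")

    # AI sound generation is only offered when there is no reference video.
    if has_video and params.get("sound"):
        errors.append("sound is only available without a reference video")

    if params.get("aspect_ratio", "16:9") not in ALLOWED_ASPECT_RATIOS:
        errors.append("aspect_ratio must be one of 16:9, 9:16, 1:1")

    # Assumption: duration is numeric; the page only states the 3-15 range.
    duration = params.get("duration", 5)
    is_number = isinstance(duration, (int, float)) and not isinstance(duration, bool)
    if not is_number or not 3 <= duration <= 15:
        errors.append("duration must be between 3 and 15 seconds")

    return errors


print(validate_params({"prompt": "A knight walks through fog", "duration": 20}))
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem at once.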

How to Use

  1. Write your prompt — describe the scene, characters, and action.
  2. Upload reference video (optional) — provide a video for motion guidance.
  3. Upload reference images — add character or scene references.
  4. Configure audio — keep original sound from video, or enable AI sound generation.
  5. Select aspect ratio — match your target platform.
  6. Set duration — choose any length from 3 to 15 seconds.
  7. Run — submit and download your video.
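
For API use, the steps above amount to assembling a request body. A minimal sketch follows, assuming the parameters sit at the top level of the JSON body next to the `model` key shown in the example request body; that layout, the helper name `build_request_body`, and the example.com URLs are illustrative assumptions, and the submission endpoint is not shown here.

```python
import json

MODEL_ID = "kwaivgi/kling-video-o3-pro/reference-to-video"


def build_request_body(prompt, images=(), video=None, keep_original_sound=True,
                       sound=False, aspect_ratio="16:9", duration=5) -> dict:
    """Assemble a request body (hypothetical helper; flat layout is an assumption)."""
    body = {
        "model": MODEL_ID,
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
    }
    if images:
        body["images"] = list(images)
    if video:
        # With a reference video, only keep_original_sound applies.
        body["video"] = video
        body["keep_original_sound"] = keep_original_sound
    else:
        # Without a reference video, only AI sound generation applies.
        body["sound"] = sound
    return body


body = build_request_body(
    prompt="The character walks across a rainy street at night",
    images=["https://example.com/hero-front.jpg", "https://example.com/hero-side.jpg"],
    video="https://example.com/walk-cycle.mp4",  # placeholder URL
)
print(json.dumps(body, indent=2))
```

Sending only the audio flag that applies keeps the body consistent with the rule that sound generation and keep_original_sound are mutually exclusive.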

Best Use Cases

  • Long-Form Scenes — Up to 15 seconds for extended scene development.
  • Storytelling — Produce narrative scenes with consistent character appearance.
  • Marketing & Ads — Create promotional videos featuring specific people or products.
  • Video Remixing — Use reference video for motion guidance with new characters.
  • Character Consistency — Generate videos with identity-consistent characters.

Pro Tips

  • Match aspect ratio to your platform: 16:9 for YouTube, 9:16 for TikTok/Reels, 1:1 for Instagram.
  • Use shorter durations (3-5s) for testing, longer (10-15s) for final production.
  • Sound generation is only available when no reference video is provided.
  • Enable keep_original_sound to preserve audio from your reference video.
  • When using a reference video, the image limit is 4; without a video, you can use up to 7.
  • Use multiple reference images from different angles for better identity preservation.
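
The platform and duration tips above can be captured as a small lookup. This is only a sketch: the helper `suggest_settings` and the platform keys are hypothetical, and the specific durations (3 seconds for tests, 10 for final renders) are one choice within the ranges the tips suggest.

```python
# Aspect ratios per platform, taken from the tips above.
PLATFORM_ASPECT_RATIO = {
    "youtube": "16:9",
    "tiktok": "9:16",
    "reels": "9:16",
    "instagram": "1:1",
}


def suggest_settings(platform: str, final: bool = False) -> dict:
    """Suggest aspect_ratio and duration for a platform and production stage."""
    key = platform.strip().lower()
    if key not in PLATFORM_ASPECT_RATIO:
        raise ValueError(f"unknown platform: {platform!r}")
    return {
        "aspect_ratio": PLATFORM_ASPECT_RATIO[key],
        # Short clips for test runs, longer ones for final production.
        "duration": 10 if final else 3,
    }


print(suggest_settings("TikTok"))
print(suggest_settings("youtube", final=True))
```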

Notes

  • Ensure uploaded image and video URLs are publicly accessible.
  • When a reference video is provided, sound generation is replaced by keep_original_sound.
  • Reference images limit: up to 4 with video, up to 7 without.
  • Duration supports any value from 3 to 15 seconds.
  • Only prompt is required; other parameters are optional.

Related Models

  • Kling V3.0 Pro Text-to-Video — Pro quality text-to-video.
  • Kling V3.0 Pro Image-to-Video — V3.0 Pro quality at lower cost.
  • Kling Video O3 Pro Image-to-Video — O3 Pro quality single image to video.
