google/veo3.1/reference-to-video

Create richly detailed videos guided by visual references. Veo 3.1 Reference-to-Video preserves characters, style, and composition across scenes for consistent, visually coherent storytelling.

IMAGE-TO-VIDEO | HOT | NEW
Image to Video

Google Veo 3.1 — Reference-to-Video Model

Veo 3.1 Reference-to-Video brings static images to life by combining visual reference consistency with cinematic motion generation. Powered by Google DeepMind’s next-generation Veo 3.1 architecture, this model transforms up to three reference images into coherent 8-second videos with smooth motion, accurate visual alignment, and synchronized native audio.

🌟 Key Features

🧠 Multi-Image Reference Support

  • Accepts up to three reference images to define the subject, environment, or style.
  • Maintains consistent identity, lighting, and appearance across frames.
  • Ideal for animating people, objects, or scenes with reliable fidelity.

🎬 Cinematic Video Generation

  • Produces 8-second motion clips at 720p or 1080p resolution.
  • Adds camera dynamics such as panning, zooming, or subtle perspective drift.
  • Supports synchronized audio generation, matching dialogue or ambient context.

💡 Smart Prompt Adherence

  • Interprets both text instructions and visual cues for precise motion storytelling.
  • Automatically harmonizes character interactions, props, and backgrounds.

⚙️ Capabilities

  • Input:

    • Up to 3 reference images (JPEG / PNG / WEBP)
    • Text prompt describing motion, action, and scene context
  • Output:

    • 8-second MP4 video (720p or 1080p)
    • Optional synchronized audio
  • Negative Prompt (optional):

    • Exclude unwanted artifacts or elements (e.g., “no text”, “no flicker”).
  • Seed (optional):

    • Reproduce specific results for consistent creative control.
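
The inputs listed above can be gathered into a single request object before submission. The following is a minimal sketch, not the provider's actual API: the field names (`images`, `generate_audio`, `negative_prompt`, `seed`), the payload shape, and the requirement of at least one reference image are all assumptions made for illustration. Only the limits themselves (up to 3 images, JPEG / PNG / WEBP, 720p or 1080p) come from this page.

```python
from dataclasses import dataclass
from typing import List, Optional

# Limits taken from the Capabilities list on this page.
ALLOWED_EXTENSIONS = (".jpg", ".jpeg", ".png", ".webp")
ALLOWED_RESOLUTIONS = ("720p", "1080p")
MAX_REFERENCE_IMAGES = 3


@dataclass
class ReferenceToVideoRequest:
    """Hypothetical request container; field names are illustrative only."""
    prompt: str
    images: List[str]
    resolution: str = "1080p"
    generate_audio: bool = True
    negative_prompt: Optional[str] = None
    seed: Optional[int] = None

    def validate(self) -> None:
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        # Assumption: at least one reference image is required.
        if not 1 <= len(self.images) <= MAX_REFERENCE_IMAGES:
            raise ValueError("provide between 1 and 3 reference images")
        for image in self.images:
            # Ignore any URL query string when checking the file extension.
            path = image.split("?", 1)[0].lower()
            if not path.endswith(ALLOWED_EXTENSIONS):
                raise ValueError(f"unsupported image format: {image}")
        if self.resolution not in ALLOWED_RESOLUTIONS:
            raise ValueError("resolution must be 720p or 1080p")

    def to_payload(self) -> dict:
        """Build a plain dict, omitting optional fields that were not set."""
        self.validate()
        payload = {
            "model": "google/veo3.1/reference-to-video",
            "prompt": self.prompt,
            "images": list(self.images),
            "resolution": self.resolution,
            "generate_audio": self.generate_audio,
        }
        if self.negative_prompt:
            payload["negative_prompt"] = self.negative_prompt
        if self.seed is not None:
            payload["seed"] = self.seed
        return payload
```

Validating locally before sending a request catches the most common mistakes (too many images, an unsupported file type) without spending a generation.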

💰 Pricing

Duration  | Resolution | With Audio | Without Audio
8 seconds | 720p       | $3.20      | $1.60
8 seconds | 1080p      | $3.20      | $1.60
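
To estimate the cost of a batch, the table above can be treated as a simple lookup. This is a sketch under the assumption that each generated clip is billed at the listed flat price; the names `PRICE_TABLE`, `clip_price`, and `batch_cost` are made up for the example.

```python
from decimal import Decimal

# Per-clip prices copied from the pricing table: (resolution, with_audio) -> USD.
PRICE_TABLE = {
    ("720p", True): Decimal("3.20"),
    ("720p", False): Decimal("1.60"),
    ("1080p", True): Decimal("3.20"),
    ("1080p", False): Decimal("1.60"),
}


def clip_price(resolution: str, with_audio: bool) -> Decimal:
    """Return the listed price of one 8-second clip."""
    return PRICE_TABLE[(resolution, with_audio)]


def batch_cost(clips) -> Decimal:
    """Sum the listed prices for an iterable of (resolution, with_audio) pairs."""
    return sum((clip_price(res, audio) for res, audio in clips), Decimal("0"))


if __name__ == "__main__":
    planned = [("720p", True), ("1080p", False), ("1080p", False)]
    print(f"Estimated cost: ${batch_cost(planned)}")
```

Per the table, resolution does not change the price; only the audio option does.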

✅ Commercial use allowed

🧩 How to Use

  1. Upload up to 3 reference images — define the subject, object, or visual style.
  2. Write a text prompt — describe the action, setting, and camera motion.
  3. (Optional) Add a negative prompt to remove unwanted details.
  4. Choose resolution (720p or 1080p).
  5. (Optional) Enable audio generation for synchronized sound.
  6. Click Run to generate your 8-second cinematic video.

💡 Best Practices

  • Use clear, well-lit reference images with similar styles and proportions.
  • Keep prompts concise but specific (e.g., “The man in image 1 waves to the penguins in image 2 under bright sunlight”).
  • Avoid overly complex scenarios with many characters or fast movement.
  • Enable audio for more immersive storytelling results.

📝 Notes

  • Ensure reference images are either uploaded locally or provided as valid, publicly accessible URLs.
  • If the output looks unstable, reduce reference count or simplify the prompt.
  • Follow Google’s content safety rules; modify the prompt if flagged.
  • For best performance, prefer portrait-oriented subjects and balanced lighting.
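
The advice to reduce the reference count or simplify the prompt when output looks unstable can be planned ahead as an ordered list of fallback attempts. This is one possible sketch, not a documented workflow: the function name `fallback_attempts` is hypothetical, and it assumes the reference images are listed from most to least important.

```python
from typing import List, Optional, Tuple


def fallback_attempts(
    images: List[str],
    prompt: str,
    simplified_prompt: Optional[str] = None,
) -> List[Tuple[List[str], str]]:
    """Return (images, prompt) pairs ordered from the full request to the simplest.

    Assumption: images are ordered by importance, so trailing ones are dropped first.
    """
    attempts = []
    # First, progressively reduce the number of reference images.
    for count in range(len(images), 0, -1):
        attempts.append((images[:count], prompt))
    # Last resort: keep one reference image and use the simplified prompt.
    if simplified_prompt and simplified_prompt != prompt:
        attempts.append((images[:1], simplified_prompt))
    return attempts


if __name__ == "__main__":
    plan = fallback_attempts(
        ["man.png", "penguins.png", "style.webp"],
        "The man in image 1 waves to the penguins in image 2 under bright sunlight",
        "A man waves at penguins",
    )
    for refs, text in plan:
        print(len(refs), "reference(s):", text)
```

Each attempt is a separate paid generation, so it is worth reviewing the output after every step rather than running the whole list automatically.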

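Detailed Specifications

Overview:

Model provider: GOOGLE
Model type: image-to-video
Deployment: Inference API; Playground
Pricing: $0.1600/second

Key specifications:

Size limit: maximum width × height (user-configurable)
LoRA support: No
Seed option: N/A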
