google/veo3.1/reference-to-video

Create richly detailed videos guided by visual references. Veo 3.1 Reference-to-Video preserves characters, style, and composition across scenes for consistent, visually coherent storytelling.

Each run costs $0.18. With $10, you can run the model about 55 times.

Input Schema

The following parameters are accepted in the request body.

Total: 0 | Required: 0 | Optional: 0

No parameters.

Example request body

json
{
  "model": "google/veo3.1/reference-to-video"
}
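
The example body above contains only the model identifier. As a rough sketch of how the inputs described further down this page (prompt, reference images, resolution, audio, negative prompt, seed) might be assembled into a request body, the helper below builds a dictionary and serializes it as JSON. Only the `model` value comes from this page; every other field name (`prompt`, `images`, `resolution`, `generate_audio`, `negative_prompt`, `seed`), the function `build_request_body`, and the example URLs are hypothetical and should be checked against the actual input schema.

```python
import json

MODEL_ID = "google/veo3.1/reference-to-video"  # taken from this page


def build_request_body(prompt, image_urls, resolution="1080p",
                       generate_audio=True, negative_prompt=None, seed=None):
    """Assemble a request body as a plain dict.

    All keys except "model" are assumed names, not a documented schema.
    """
    body = {
        "model": MODEL_ID,
        "prompt": prompt,                    # hypothetical field name
        "images": list(image_urls),          # hypothetical field name
        "resolution": resolution,            # hypothetical field name
        "generate_audio": generate_audio,    # hypothetical field name
    }
    # Optional inputs are only included when the caller supplies them.
    if negative_prompt is not None:
        body["negative_prompt"] = negative_prompt  # hypothetical field name
    if seed is not None:
        body["seed"] = seed                        # hypothetical field name
    return body


if __name__ == "__main__":
    example = build_request_body(
        prompt="The man in image 1 waves to the penguins in image 2",
        image_urls=["https://example.com/man.png",
                    "https://example.com/penguins.jpg"],
        negative_prompt="no text, no flicker",
        seed=42,
    )
    print(json.dumps(example, indent=2))
```

Leaving optional keys out when they are unset keeps the payload close to the minimal example shown above.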

Google Veo 3.1 — Reference-to-Video Model

Veo 3.1 Reference-to-Video brings static images to life by combining visual reference consistency with cinematic motion generation. Powered by Google DeepMind’s next-generation Veo 3.1 architecture, this model transforms up to three reference images into coherent 8-second videos with smooth motion, accurate visual alignment, and synchronized native audio.

🌟 Key Features

🧠 Multi-Image Reference Support

  • Accepts up to three reference images to define the subject, environment, or style.
  • Maintains consistent identity, lighting, and appearance across frames.
  • Ideal for animating people, objects, or scenes with reliable fidelity.

🎬 Cinematic Video Generation

  • Produces 8-second motion clips at 720p or 1080p resolution.
  • Adds camera dynamics such as panning, zooming, or subtle perspective drift.
  • Supports synchronized audio generation, matching dialogue or ambient context.

💡 Smart Prompt Adherence

  • Interprets both text instructions and visual cues for precise motion storytelling.
  • Automatically harmonizes character interactions, props, and backgrounds.

⚙️ Capabilities

  • Input:
    • Up to 3 reference images (JPEG / PNG / WEBP)
    • Text prompt describing motion, action, and scene context
  • Output:
    • 8-second MP4 video (720p or 1080p)
    • Optional synchronized audio
  • Negative Prompt (optional):
    • Exclude unwanted artifacts or elements (e.g., “no text”, “no flicker”).
  • Seed (optional):
    • Reproduce specific results for consistent creative control.
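
The input limits listed above (at most three reference images, in JPEG, PNG, or WEBP format) can be checked on the client before submitting a job. The following is a minimal sketch, assuming the format can be inferred from the file extension; the function name `check_reference_images` and the extension heuristic are illustrative choices, not part of the service.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

MAX_REFERENCE_IMAGES = 3  # "up to 3 reference images"
# Assumed mapping of the listed formats (JPEG / PNG / WEBP) to extensions.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def check_reference_images(sources):
    """Return a list of human-readable problems; an empty list means no issue found.

    `sources` may hold URLs or local file names. Only the count and the
    file extension are inspected; the image data itself is not opened.
    """
    problems = []
    if len(sources) > MAX_REFERENCE_IMAGES:
        problems.append(
            f"{len(sources)} images given, at most {MAX_REFERENCE_IMAGES} allowed"
        )
    for src in sources:
        # For URLs, .path drops the query string; for plain file names it is the name itself.
        suffix = PurePosixPath(urlparse(src).path).suffix.lower()
        if suffix not in ALLOWED_EXTENSIONS:
            problems.append(f"{src}: unsupported extension {suffix or '(none)'}")
    return problems


if __name__ == "__main__":
    for issue in check_reference_images(["hero.png", "style.gif"]):
        print(issue)
```

An extension check is only a heuristic: a file can carry a `.png` name and still contain something else, so the service may reject inputs that pass this test.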

💰 Pricing

Duration | Resolution | With Audio | Without Audio
8 seconds | 720p | $3.20 | $1.60
8 seconds | 1080p | $3.20 | $1.60
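
To make the table above concrete, here is a small sketch that estimates the cost of a batch of clips and how many clips fit in a given budget. The prices are copied from the table; the names `PRICE_PER_CLIP`, `batch_cost`, and `clips_within_budget` are illustrative, and actual billing may differ.

```python
from decimal import Decimal

# (resolution, with_audio) -> price in USD per 8-second clip, copied from the table.
PRICE_PER_CLIP = {
    ("720p", True): Decimal("3.20"),
    ("720p", False): Decimal("1.60"),
    ("1080p", True): Decimal("3.20"),
    ("1080p", False): Decimal("1.60"),
}


def batch_cost(clips, resolution="1080p", with_audio=True):
    """Total price in USD for `clips` videos at the given settings."""
    return PRICE_PER_CLIP[(resolution, with_audio)] * clips


def clips_within_budget(budget_usd, resolution="1080p", with_audio=True):
    """Largest whole number of clips whose total price does not exceed the budget."""
    price = PRICE_PER_CLIP[(resolution, with_audio)]
    return int(Decimal(str(budget_usd)) // price)


if __name__ == "__main__":
    print(batch_cost(5, "720p", with_audio=False))
    print(clips_within_budget(10, with_audio=False))
```

`Decimal` is used instead of `float` so that currency amounts such as 1.60 are represented exactly.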

✅ Commercial use allowed

🧩 How to Use

  1. Upload up to 3 reference images — define the subject, object, or visual style.
  2. Write a text prompt — describe the action, setting, and camera motion.
  3. (Optional) Add a negative prompt to remove unwanted details.
  4. Choose resolution (720p or 1080p).
  5. (Optional) Enable audio generation for synchronized sound.
  6. Click Run to generate your 8-second cinematic video.

💡 Best Practices

  • Use clear, well-lit reference images with similar styles and proportions.
  • Keep prompts concise but specific (e.g., “The man in image 1 waves to the penguins in image 2 under bright sunlight”).
  • Avoid overly complex scenarios with many characters or fast movement.
  • Enable audio for more immersive storytelling results.

📝 Notes

  • Ensure reference images are either valid, publicly accessible URLs or files uploaded from your device.
  • If the output looks unstable, reduce reference count or simplify the prompt.
  • Follow Google’s content safety rules; modify the prompt if flagged.
  • For best performance, prefer portrait-oriented subjects and balanced lighting.
