google/veo3.1/reference-to-video

Create richly detailed videos guided by visual references. Veo 3.1 Reference-to-Video preserves characters, style, and composition across scenes for consistent, visually coherent storytelling.

Your request will cost $0.18 per run. With $10 you can run this model about 55 times.

Input Schema

The following parameters are accepted in the request body.

Total: 0, Required: 0, Optional: 0

No parameters available.

Example Request Body

json
{
  "model": "google/veo3.1/reference-to-video"
}

Google Veo 3.1 — Reference-to-Video Model

Veo 3.1 Reference-to-Video brings static images to life by combining visual reference consistency with cinematic motion generation. Powered by Google DeepMind’s next-generation Veo 3.1 architecture, this model transforms up to three reference images into coherent 8-second videos with smooth motion, accurate visual alignment, and synchronized native audio.

🌟 Key Features

🧠 Multi-Image Reference Support

  • Accepts up to three reference images to define the subject, environment, or style.
  • Maintains consistent identity, lighting, and appearance across frames.
  • Ideal for animating people, objects, or scenes with reliable fidelity.

🎬 Cinematic Video Generation

  • Produces 8-second motion clips at 1080p or 720p resolution.
  • Adds camera dynamics such as panning, zooming, or subtle perspective drift.
  • Supports synchronized audio generation, matching dialogue or ambient context.

💡 Smart Prompt Adherence

  • Interprets both text instructions and visual cues for precise motion storytelling.
  • Automatically harmonizes character interactions, props, and backgrounds.

⚙️ Capabilities

  • Input:

    • Up to 3 reference images (JPEG / PNG / WEBP)
    • Text prompt describing motion, action, and scene context
  • Output:

    • 8-second MP4 video (720p or 1080p)
    • Optional synchronized audio
  • Negative Prompt (optional):

    • Exclude unwanted artifacts or elements (e.g., “no text”, “no flicker”).
  • Seed (optional):

    • Reproduce specific results for consistent creative control.
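
The limits listed above can be checked before a request is submitted. The following is a minimal sketch, not the platform's own validation: the function name `check_inputs` is hypothetical, the format check only inspects the file extension, and requiring at least one image and a non-empty prompt are assumptions that the page does not state.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}  # JPEG / PNG / WEBP
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
MAX_REFERENCE_IMAGES = 3


def check_inputs(images: list[str], prompt: str, resolution: str) -> list[str]:
    """Return a list of problems; an empty list means the inputs look valid."""
    problems = []
    # Assumption: at least one reference image is required.
    if not 1 <= len(images) <= MAX_REFERENCE_IMAGES:
        problems.append(
            f"expected 1 to {MAX_REFERENCE_IMAGES} reference images, got {len(images)}"
        )
    for image in images:
        # Only the path suffix is inspected, so this accepts file names and URLs.
        suffix = PurePosixPath(urlparse(image).path).suffix.lower()
        if suffix not in ALLOWED_EXTENSIONS:
            problems.append(f"unsupported image format: {image}")
    # Assumption: an empty prompt is treated as an error.
    if not prompt.strip():
        problems.append("prompt must not be empty")
    if resolution not in ALLOWED_RESOLUTIONS:
        problems.append(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")
    return problems


print(check_inputs(["hero.png", "https://example.com/bg.webp"], "A man waves", "1080p"))
```

An extension check is only a cheap first filter; it says nothing about whether the file actually decodes as an image.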

💰 Pricing

| Duration | Resolution | With Audio | Without Audio |
|---|---|---|---|
| 8 seconds | 720p | $3.20 | $1.60 |
| 8 seconds | 1080p | $3.20 | $1.60 |
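
To make the pricing rows above concrete, here is a small budgeting sketch. It uses only the two listed prices (resolution does not change the price); the helper names `clips_within_budget` and `total_cost` are illustrative and not part of any API.

```python
from decimal import Decimal

# Per-clip prices from the pricing rows above; 720p and 1080p cost the same.
# The dictionary key is whether audio generation is enabled.
PRICE_PER_CLIP = {True: Decimal("3.20"), False: Decimal("1.60")}


def clips_within_budget(budget: str, with_audio: bool) -> int:
    """Number of whole 8-second clips that a budget in USD covers."""
    return int(Decimal(budget) // PRICE_PER_CLIP[with_audio])


def total_cost(clips: int, with_audio: bool) -> Decimal:
    """Total price in USD for a given number of clips."""
    return clips * PRICE_PER_CLIP[with_audio]


print(clips_within_budget("10", with_audio=True))   # 3
print(clips_within_budget("10", with_audio=False))  # 6
print(total_cost(5, with_audio=True))               # 16.00
```

`Decimal` is used instead of floats so that currency amounts such as 3.20 are represented exactly.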

✅ Commercial use allowed

🧩 How to Use

  1. Upload up to 3 reference images — define the subject, object, or visual style.
  2. Write a text prompt — describe the action, setting, and camera motion.
  3. (Optional) Add a negative prompt to remove unwanted details.
  4. Choose resolution (720p or 1080p).
  5. (Optional) Enable audio generation for synchronized sound.
  6. Click Run to generate your 8-second cinematic video.
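
For programmatic use, the steps above could map onto a JSON request body roughly as follows. This is a sketch under stated assumptions: only the `model` value comes from the example request body on this page, while the field names `images`, `prompt`, `negative_prompt`, `resolution`, `generate_audio`, and `seed` are hypothetical placeholders because the input schema lists no parameters. Step 6 (submitting the request) is left out since the page does not give an endpoint.

```python
import json


def build_request_body(images, prompt, resolution, negative_prompt=None,
                       generate_audio=False, seed=None):
    """Assemble a request body; all field names except "model" are assumed."""
    body = {
        "model": "google/veo3.1/reference-to-video",  # from the example request body
        "images": list(images),            # step 1: up to 3 reference images
        "prompt": prompt,                  # step 2: action, setting, camera motion
        "resolution": resolution,          # step 4: "720p" or "1080p"
        "generate_audio": generate_audio,  # step 5: optional synchronized sound
    }
    if negative_prompt:
        body["negative_prompt"] = negative_prompt  # step 3: optional exclusions
    if seed is not None:
        body["seed"] = seed  # optional: reproduce a specific result
    return body


body = build_request_body(
    images=["https://example.com/man.png", "https://example.com/penguins.png"],
    prompt="The man in image 1 waves to the penguins in image 2 under bright sunlight",
    resolution="1080p",
    negative_prompt="no text, no flicker",
    generate_audio=True,
)
print(json.dumps(body, indent=2))
```

Optional fields are omitted rather than sent as `null`, which keeps the body minimal if the service rejects unknown or empty values.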

💡 Best Practices

  • Use clear, well-lit reference images with similar styles and proportions.
  • Keep prompts concise but specific (e.g., “The man in image 1 waves to the penguins in image 2 under bright sunlight”).
  • Avoid overly complex scenarios with many characters or fast movement.
  • Enable audio for more immersive storytelling results.
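
The example prompt above refers to references as "image 1" and "image 2". Assuming that numbering follows upload order (the page does not confirm this), a quick check can catch prompts that mention an image that was never supplied. The helper name `unresolved_image_references` is hypothetical.

```python
import re


def unresolved_image_references(prompt: str, image_count: int) -> list[int]:
    """Return image numbers mentioned in the prompt that have no matching upload."""
    mentioned = {
        int(number)
        for number in re.findall(r"\bimage\s+(\d+)\b", prompt, flags=re.IGNORECASE)
    }
    return sorted(n for n in mentioned if not 1 <= n <= image_count)


prompt = "The man in image 1 waves to the penguins in image 2 under bright sunlight"
print(unresolved_image_references(prompt, image_count=2))  # []
print(unresolved_image_references(prompt, image_count=1))  # [2]
```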

📝 Notes

  • Provide each reference image either as a valid, publicly accessible URL or as a file uploaded from your device.
  • If the output looks unstable, reduce reference count or simplify the prompt.
  • Follow Google’s content safety rules; modify the prompt if flagged.
  • For best performance, prefer portrait-oriented subjects and balanced lighting.
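
The advice to reduce the reference count when output looks unstable can be planned ahead of time. The sketch below is only an illustration: it assumes the images are ordered from most to least important, and it reuses one seed across attempts so that the reference count is the only thing that changes. The name `fallback_attempts` and the dictionary keys are hypothetical.

```python
def fallback_attempts(images: list[str], seed: int) -> list[dict]:
    """List progressively smaller reference sets, starting with all images."""
    return [
        {"images": images[:count], "seed": seed}
        for count in range(len(images), 0, -1)
    ]


for attempt in fallback_attempts(["subject.png", "scene.png", "style.png"], seed=42):
    print(len(attempt["images"]), attempt["images"])
```

Each entry describes one retry; the first uses every reference and the last keeps only the most important one.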
