google/veo3.1/reference-to-video

Create richly detailed videos guided by visual references. Veo 3.1 Reference-to-Video preserves characters, style, and composition across scenes for consistent, visually coherent storytelling.



Your request will cost $0.18 per run. With $10, you can run this model about 55 times.


Input Schema

The following parameters are accepted in the request body.


Example request body

```json
{
  "model": "google/veo3.1/reference-to-video"
}
```


Google Veo 3.1 — Reference-to-Video Model

Veo 3.1 Reference-to-Video brings static images to life by combining visual reference consistency with cinematic motion generation. Powered by Google DeepMind’s next-generation Veo 3.1 architecture, this model transforms up to three reference images into coherent 8-second videos with smooth motion, accurate visual alignment, and synchronized native audio.

🌟 Key Features

🧠 Multi-Image Reference Support

  • Accepts up to three reference images to define the subject, environment, or style.
  • Maintains consistent identity, lighting, and appearance across frames.
  • Ideal for animating people, objects, or scenes with reliable fidelity.

🎬 Cinematic Video Generation

  • Produces 8-second motion clips at 1080p or 720p resolution.
  • Adds camera dynamics such as panning, zooming, or subtle perspective drift.
  • Supports synchronized audio generation, matching dialogue or ambient context.

💡 Smart Prompt Adherence

  • Interprets both text instructions and visual cues for precise motion storytelling.
  • Automatically harmonizes character interactions, props, and backgrounds.

⚙️ Capabilities

  • Input:

    • Up to 3 reference images (JPEG / PNG / WEBP)
    • Text prompt describing motion, action, and scene context
  • Output:

    • 8-second MP4 video (720p or 1080p)
    • Optional synchronized audio
  • Negative Prompt (optional):

    • Exclude unwanted artifacts or elements (e.g., “no text”, “no flicker”).
  • Seed (optional):

    • Reproduce specific results for consistent creative control.
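
The input limits listed above can be checked on the client before a request is submitted. The following is a minimal sketch, not the platform's own validation: the function name `validate_inputs` is hypothetical, and requiring at least one reference image is an assumption (the page only states the upper limit of three).

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Limits taken from the Capabilities list; the lower bound of one image is an assumption.
MAX_REFERENCE_IMAGES = 3
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
ALLOWED_RESOLUTIONS = {"720p", "1080p"}


def validate_inputs(images, prompt, resolution="720p"):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    if not 1 <= len(images) <= MAX_REFERENCE_IMAGES:
        problems.append(
            f"expected 1 to {MAX_REFERENCE_IMAGES} reference images, got {len(images)}"
        )
    for ref in images:
        # Only the path component is inspected, so URLs and local paths both work.
        suffix = PurePosixPath(urlparse(ref).path).suffix.lower()
        if suffix not in ALLOWED_EXTENSIONS:
            problems.append(f"unsupported image format: {ref}")
    if not prompt.strip():
        problems.append("prompt is empty")
    if resolution not in ALLOWED_RESOLUTIONS:
        problems.append(f"unsupported resolution: {resolution}")
    return problems
```

This check looks only at file extensions; it does not open the files or confirm that a URL is reachable.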

💰 Pricing

| Duration | Resolution | With Audio | Without Audio |
|---|---|---|---|
| 8 seconds | 720p | $3.20 | $1.60 |
| 8 seconds | 1080p | $3.20 | $1.60 |
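
As a worked example of the table above, the sketch below derives the implied per-second rate and the number of clips a given budget covers. The helper names are hypothetical; the prices and the 8-second duration are the only values taken from the table, and resolution is omitted because both rows list the same prices.

```python
from decimal import Decimal

CLIP_SECONDS = 8
# Price per 8-second clip, keyed by whether audio is enabled (same for 720p and 1080p).
PRICE_PER_CLIP = {True: Decimal("3.20"), False: Decimal("1.60")}


def per_second_rate(with_audio):
    """Implied price per second of generated video."""
    return PRICE_PER_CLIP[with_audio] / CLIP_SECONDS


def clips_within_budget(budget, with_audio):
    """Number of whole clips that fit in the budget (in dollars)."""
    return int(Decimal(str(budget)) // PRICE_PER_CLIP[with_audio])


print(per_second_rate(True), per_second_rate(False))  # 0.4 0.2
print(clips_within_budget(10, True), clips_within_budget(10, False))  # 3 6
```

`Decimal` is used instead of floats so that currency arithmetic stays exact.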

✅ Commercial use allowed

🧩 How to Use

  1. Upload up to 3 reference images — define the subject, object, or visual style.
  2. Write a text prompt — describe the action, setting, and camera motion.
  3. (Optional) Add a negative prompt to remove unwanted details.
  4. Choose resolution (720p or 1080p).
  5. (Optional) Enable audio generation for synchronized sound.
  6. Click Run to generate your 8-second cinematic video.
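
For programmatic use, the steps above roughly correspond to assembling a JSON request body. The sketch below is illustrative only: the `model` value comes from the example request body on this page, but every other field name (`images`, `prompt`, `negative_prompt`, `resolution`, `generate_audio`, `seed`) is an assumption, since the page does not list the actual schema. No endpoint is called.

```python
import json

MODEL_ID = "google/veo3.1/reference-to-video"  # from the example request body


def build_request_body(images, prompt, negative_prompt=None,
                       resolution="720p", generate_audio=False, seed=None):
    """Assemble a request body; all field names except "model" are hypothetical."""
    body = {
        "model": MODEL_ID,
        "images": list(images),            # step 1: up to 3 reference images
        "prompt": prompt,                  # step 2: action, setting, camera motion
        "resolution": resolution,          # step 4: "720p" or "1080p"
        "generate_audio": generate_audio,  # step 5: optional synchronized sound
    }
    if negative_prompt:                    # step 3: optional negative prompt
        body["negative_prompt"] = negative_prompt
    if seed is not None:                   # optional seed for reproducible results
        body["seed"] = seed
    return body


body = build_request_body(
    images=["https://example.com/man.png", "https://example.com/penguins.png"],
    prompt="The man in image 1 waves to the penguins in image 2 under bright sunlight",
    negative_prompt="no text, no flicker",
    resolution="1080p",
    generate_audio=True,
)
print(json.dumps(body, indent=2))
```

Optional fields are left out of the body when unset, so the service's defaults would apply, assuming it defines any.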

💡 Best Practices

  • Use clear, well-lit reference images with similar styles and proportions.
  • Keep prompts concise but specific (e.g., “The man in image 1 waves to the penguins in image 2 under bright sunlight”).
  • Avoid overly complex scenarios with many characters or fast movement.
  • Enable audio for more immersive storytelling results.

📝 Notes

  • Provide reference images either as valid, publicly accessible URLs or as local uploads.
  • If the output looks unstable, reduce reference count or simplify the prompt.
  • Follow Google’s content safety rules; modify the prompt if flagged.
  • For best performance, prefer portrait-oriented subjects and balanced lighting.
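
The advice to reduce the reference count or simplify the prompt when output looks unstable can be turned into an ordered retry plan. This is a minimal sketch under stated assumptions: the function `fallback_attempts` is hypothetical, it assumes references are listed from most to least important, and it leaves the decision of what counts as "unstable" to the caller.

```python
def fallback_attempts(images, prompt, simpler_prompt=None):
    """Yield (images, prompt) pairs, from the full request down to simpler ones."""
    images = list(images)
    # Start with every reference, then drop the last one at each step.
    for count in range(len(images), 0, -1):
        yield images[:count], prompt
    # Last resort: one reference plus a simplified prompt, if one was supplied.
    if simpler_prompt and simpler_prompt != prompt:
        yield images[:1], simpler_prompt


for refs, text in fallback_attempts(
    ["man.png", "penguins.png", "style.png"],
    "The man waves to the penguins while the camera pans across the ice",
    simpler_prompt="The man waves",
):
    print(len(refs), "reference(s):", text)
```

Each yielded pair would be submitted only if the previous result was unsatisfactory, which keeps the number of paid runs low.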
