Gemini Omni Flash Image-to-Video Developer API by Google


Gemini Omni Flash is Google's multimodal video generation model. This image-to-video variant creates subject-consistent videos from up to 7 reference images combined with a text prompt, preserving visual identity across the full generated video.

Gemini Omni Flash — Image to Video (Developer)

Model ID: google/gemini-omni-flash/image-to-video-developer

Gemini Omni is Google's multimodal video generation model designed to create high-quality video content from diverse input types. This variant accepts a text prompt plus up to 7 reference images, enabling subject-consistent video generation where the visual identity of characters, objects, or scenes is anchored by real image references.


Overview

Gemini Omni draws on an understanding of physics, narrative logic, biology, culture, and visual composition to produce contextually coherent videos. Rather than synthesizing isolated clips, the model reasons about scene dynamics, camera language, and temporal flow, so that results feel intentional and cinematic.

With image inputs, the model extracts key visual features — appearance, texture, structure, style — and carries them faithfully into the generated video. This makes it well suited for character animation, product visualization, and style-guided generation.

The developer tier provides direct API access with full control over generation parameters including resolution, aspect ratio, duration, and random seed.


Key Capabilities

  • Image-guided generation — Provide 1 to 7 reference images to anchor subjects, environments, or visual styles.
  • Subject consistency — The model preserves key visual details from the reference images across the full video duration.
  • Rich prompt understanding — Complement image references with a prompt of up to 20,000 characters describing actions, camera movements, lighting, and mood.
  • Multi-resolution output — Generate at 720p, 1080p, or 4K.
  • Flexible aspect ratios — 16:9 landscape or 9:16 portrait.
  • Controllable duration — 4, 6, 8, or 10 seconds per generation.
  • Reproducible results — Set a fixed seed to reproduce or iterate on a specific generation.

Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | google/gemini-omni-flash/image-to-video-developer | Model identifier |
| prompt | string | Yes | - | Text description of the video. Max 20,000 characters. |
| images | array | Yes | - | 1–7 reference image URLs. Supported formats: PNG, JPEG, JPG, WebP. Max 20MB each. |
| duration | integer | No | 8 | Video length in seconds. Enum: 4, 6, 8, 10. |
| aspect_ratio | string | No | 16:9 | Output aspect ratio. Enum: 16:9, 9:16. |
| resolution | string | No | 720p | Output resolution. Enum: 720p, 1080p, 4k. |
| seed | integer | No | -1 | Random seed for reproducibility. -1 uses a random seed. |
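
As an illustration of how the parameters above fit together, the following is a minimal sketch of a client-side helper that assembles a request body and rejects values outside the documented enums. The helper name `build_payload`, the constant names, and the JSON body shape are assumptions made for this example. This page does not specify the endpoint URL, authentication, or response format, so the sketch stops at building the body.

```python
import json

# Values taken from the Input Parameters table.
MODEL_ID = "google/gemini-omni-flash/image-to-video-developer"
ALLOWED_DURATIONS = {4, 6, 8, 10}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16"}
ALLOWED_RESOLUTIONS = {"720p", "1080p", "4k"}
MAX_PROMPT_CHARS = 20_000
MAX_IMAGES = 7


def build_payload(prompt, images, duration=8, aspect_ratio="16:9",
                  resolution="720p", seed=-1):
    """Hypothetical helper: return a request body dict or raise ValueError."""
    if not isinstance(prompt, str) or not 1 <= len(prompt) <= MAX_PROMPT_CHARS:
        raise ValueError("prompt must be a string of 1 to 20,000 characters")
    if not 1 <= len(images) <= MAX_IMAGES:
        raise ValueError("images must contain 1 to 7 URLs")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError("duration must be one of 4, 6, 8, 10")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError("aspect_ratio must be 16:9 or 9:16")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError("resolution must be 720p, 1080p, or 4k")
    if not isinstance(seed, int):
        raise ValueError("seed must be an integer (-1 picks a random seed)")
    return {
        "model": MODEL_ID,
        "prompt": prompt,
        "images": list(images),
        "duration": duration,
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
        "seed": seed,
    }


# Example: one reference image (placeholder URL), defaults for everything else.
body = build_payload(
    "The fox trots through fresh snow, slow dolly-in, golden hour light.",
    ["https://example.com/fox.png"],
)
print(json.dumps(body, indent=2))
```

Passing a fixed non-negative `seed` instead of the default -1 is the documented way to reproduce or iterate on a specific generation.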

Image Input Notes

  • Accepts 1 to 7 images per request.
  • Supported formats: PNG, JPEG, JPG, WebP.
  • Minimum image dimensions: 128×128 pixels.
  • Each image must be under 20MB.
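
The constraints above can be checked before a request is sent. The sketch below is one way to do that, under several assumptions: the function names are hypothetical, the format is inferred from the URL's file extension (a heuristic, since a URL need not carry one), 20MB is read as 20,000,000 bytes, and the caller already knows each image's byte size and pixel dimensions.

```python
import posixpath
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = {".png", ".jpeg", ".jpg", ".webp"}
MAX_BYTES = 20 * 1000 * 1000  # assumption: 20MB means 20,000,000 bytes
MIN_SIDE = 128                # minimum width and height in pixels
MAX_IMAGES = 7


def image_problems(url, size_bytes, width, height):
    """Return a list of human-readable problems for one image (empty if OK)."""
    problems = []
    # Take the extension from the URL path only, ignoring any query string.
    ext = posixpath.splitext(urlparse(url).path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'none'}")
    if size_bytes >= MAX_BYTES:
        problems.append("file is not under 20MB")
    if width < MIN_SIDE or height < MIN_SIDE:
        problems.append("smaller than 128x128 pixels")
    return problems


def batch_problems(images):
    """images: list of (url, size_bytes, width, height) tuples."""
    report = {}
    if not 1 <= len(images) <= MAX_IMAGES:
        report["<request>"] = [f"expected 1 to 7 images, got {len(images)}"]
    for url, size_bytes, width, height in images:
        found = image_problems(url, size_bytes, width, height)
        if found:
            report[url] = found
    return report


# Example with placeholder URLs: the second image fails two checks.
print(batch_problems([
    ("https://example.com/fox.PNG?v=2", 1_000_000, 512, 512),
    ("https://example.com/clip.gif", 1_000, 64, 512),
]))
```

Returning a report rather than raising on the first failure lets a caller show every problem in a batch at once.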

Use Cases

  • Character animation — Bring a character photo or illustration to life with a text description of their action.
  • Product visualization — Animate product images for marketing or e-commerce content.
  • Style transfer — Feed a reference artwork or photograph to define the visual style of the generated video.
  • Scene composition — Combine multiple reference images of different subjects to compose a coherent scene.
  • Storyboard-to-video — Convert static storyboard frames into animated previews.

Pricing

Pricing is based on output resolution and video duration. Image inputs do not incur additional charges.

| Resolution | Formula | Example (8s) |
|---|---|---|
| 720p / 1080p | $0.2 + duration × $0.1 | $1 |
| 4k | $1 + duration × $0.1 | $1.8 |

Formula: (resolution == "4k" ? $1 : $0.2) + duration × $0.1

720p and 1080p are identically priced. The $0.2 / $1 term is a fixed base charge per generation; $0.1 is the per-second rate applied to the requested duration.
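
To make the arithmetic concrete, here is a small sketch that applies the formula above to every documented resolution and duration. The function and constant names are illustrative only, and `Decimal` is used so that amounts such as 0.1 are not subject to binary floating-point rounding.

```python
from decimal import Decimal

# Fixed base charge per generation, by resolution (from the pricing table).
BASE_FEE = {
    "720p": Decimal("0.2"),
    "1080p": Decimal("0.2"),
    "4k": Decimal("1"),
}
PER_SECOND = Decimal("0.1")  # rate applied to the requested duration
DURATIONS = (4, 6, 8, 10)


def generation_cost(resolution, duration):
    """Return the price in USD for one generation as a Decimal."""
    if resolution not in BASE_FEE:
        raise ValueError("resolution must be 720p, 1080p, or 4k")
    if duration not in DURATIONS:
        raise ValueError("duration must be one of 4, 6, 8, 10")
    return BASE_FEE[resolution] + PER_SECOND * duration


# Print the full price grid, one line per resolution.
for resolution in BASE_FEE:
    row = ", ".join(f"{d}s=${generation_cost(resolution, d)}" for d in DURATIONS)
    print(f"{resolution}: {row}")
```

For the 8-second case this reproduces the table: 0.2 + 8 × 0.1 = 1 for 720p and 1080p, and 1 + 8 × 0.1 = 1.8 for 4k.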
