Gemini Omni Flash Reference-to-Video API by Google

A natively multimodal Google DeepMind model that generates cinematic, sound-enabled videos from a text prompt plus 1-5 reference images, carrying a consistent subject, scene, or style across generations.

Code example
import os
import time

import requests

# Read the API key from the environment; Python does not expand "$VAR" inside strings
API_KEY = os.environ.get("ATLASCLOUD_API_KEY")

# Step 1: Start video generation
generate_url = "https://api.atlascloud.ai/api/v1/model/generateVideo"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}",
}
data = {
    "model": "google/gemini-omni-flash/reference-to-video",  # Required. Model name
    "prompt": "A beautiful sunset over the ocean with gentle waves",  # Required. Text prompt for generation
    "images": [
        "https://example.com/image1.jpg"
    ],  # Required. Images to use as character, scene, or style references (1 to 5)
    "duration": 10,  # Duration of the generated video in seconds (min: 3, max: 10)
    "aspect_ratio": "16:9",  # Aspect ratio of the generated video. Options: 16:9 | 9:16
    "resolution": "720p",  # Resolution of the generated video. Options: 720p
    "thinking_level": "default",  # Amount of internal reasoning before generation. Options: default | high | low
    "seed": -1,  # Random seed; -1 means a random seed is used
}

generate_response = requests.post(generate_url, headers=headers, json=data)
generate_response.raise_for_status()  # Stop early on HTTP-level errors
generate_result = generate_response.json()
prediction_id = generate_result["data"]["id"]

# Step 2: Poll for result
poll_url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"

def check_status():
    while True:
        response = requests.get(poll_url, headers=headers)
        result = response.json()
        status = result["data"]["status"]

        if status in ["completed", "succeeded"]:
            print("Generated video:", result["data"]["outputs"][0])
            return result["data"]["outputs"][0]
        elif status in ["failed", "timeout"]:
            # "timeout" is listed as a status in the API schema; treat it as terminal
            raise Exception(result["data"].get("error") or "Generation failed")
        else:
            # Still processing, wait 2 seconds
            time.sleep(2)

video_url = check_status()

Installation

Install the required dependency.

pip install requests

Authentication

All API requests require authentication with an API key. You can get an API key from the Atlas Cloud dashboard.

export ATLASCLOUD_API_KEY="your-api-key-here"

HTTP Headers

import os

API_KEY = os.environ.get("ATLASCLOUD_API_KEY")
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}

Keep your API key secure

Never expose your API key in client-side code or public repositories. Use environment variables or a backend proxy instead.

Send a request

import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateVideo"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; "$ATLASCLOUD_API_KEY" is not expanded inside a Python string
    "Authorization": f"Bearer {os.environ.get('ATLASCLOUD_API_KEY')}"
}
data = {
    "model": "your-model",
    "prompt": "A beautiful landscape"
}

response = requests.post(url, headers=headers, json=data)
print(response.json())

Submit a Request

Submit an asynchronous generation request. The API returns a prediction ID that you can use to check the status and retrieve the result.

POST /api/v1/model/generateVideo

Request Body

import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateVideo"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ.get('ATLASCLOUD_API_KEY')}"
}

data = {
    "model": "google/gemini-omni-flash/reference-to-video",
    "prompt": "A beautiful sunset over the ocean with gentle waves",
    # "images" is a required field for this model (1 to 5 reference images)
    "images": ["https://example.com/image1.jpg"]
}

response = requests.post(url, headers=headers, json=data)
result = response.json()

print(f"Prediction ID: {result['data']['id']}")
print(f"Status: {result['data']['status']}")

Response

{
  "code": 200,
  "data": {
    "id": "pred_abc123",
    "status": "processing",
    "model": "model-name",
    "created_at": "2025-01-01T00:00:00Z"
  }
}

Check Status

Poll the prediction endpoint to check the current status of your request.

GET /api/v1/model/prediction/{prediction_id}

Polling Example

import os
import time

import requests

prediction_id = "pred_abc123"
url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"
headers = {"Authorization": f"Bearer {os.environ.get('ATLASCLOUD_API_KEY')}"}

while True:
    response = requests.get(url, headers=headers)
    result = response.json()
    status = result["data"]["status"]
    print(f"Status: {status}")

    if status in ["completed", "succeeded"]:
        output_url = result["data"]["outputs"][0]
        print(f"Output URL: {output_url}")
        break
    elif status in ["failed", "timeout"]:
        # "timeout" is listed as a status in the API schema; stop polling on it too
        print(f"Error: {result['data'].get('error', 'Unknown')}")
        break

    # Wait 3 seconds before the next poll
    time.sleep(3)

Status Values

processing: The request is still being processed.

completed: Generation is complete. The output is available.

succeeded: Generation succeeded. The output is available.

failed: Generation failed. Check the error field.
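
The status values above can be folded into one polling helper with a deadline, so that a stuck request does not loop forever. This is a minimal sketch, not part of the Atlas Cloud API: `poll_until_done`, its parameters, and the injected `fetch` callable are hypothetical names, and the `timeout` status is taken from the status description in the API schema.

```python
import time

SUCCESS = {"completed", "succeeded"}
FAILURE = {"failed", "timeout"}  # "timeout" appears in the API schema's status description


def poll_until_done(fetch, interval=3.0, max_wait=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until the prediction reaches a terminal status.

    fetch is a hypothetical zero-argument callable that returns the parsed
    JSON body of GET /api/v1/model/prediction/{prediction_id}.
    Returns the list of output URLs on success.
    """
    deadline = clock() + max_wait
    while True:
        data = fetch().get("data") or {}
        status = data.get("status")
        if status in SUCCESS:
            return data.get("outputs") or []
        if status in FAILURE:
            raise RuntimeError(data.get("error") or f"Generation ended with status {status!r}")
        if clock() >= deadline:
            raise TimeoutError(f"Prediction still {status!r} after {max_wait} seconds")
        sleep(interval)
```

With `requests`, `fetch` could be `lambda: requests.get(url, headers=headers).json()`. Injecting `sleep` and `clock` keeps the loop testable without real waiting.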

Completed Response

{
  "data": {
    "id": "pred_abc123",
    "status": "completed",
    "outputs": [
      "https://storage.atlascloud.ai/outputs/result.mp4"
    ],
    "metrics": {
      "predict_time": 45.2
    },
    "created_at": "2025-01-01T00:00:00Z",
    "completed_at": "2025-01-01T00:00:10Z"
  }
}

Upload Files

Upload a file to Atlas Cloud storage and get a URL that you can use in your API requests. Use multipart/form-data for the upload.

POST /api/v1/model/uploadMedia

Upload Example

import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/uploadMedia"
headers = {"Authorization": f"Bearer {os.environ.get('ATLASCLOUD_API_KEY')}"}

# requests sets the multipart/form-data Content-Type when "files" is used
with open("image.png", "rb") as f:
    files = {"file": ("image.png", f, "image/png")}
    response = requests.post(url, headers=headers, files=files)

result = response.json()
download_url = result["data"]["download_url"]
print(f"File URL: {download_url}")

Response

{
  "data": {
    "download_url": "https://storage.atlascloud.ai/uploads/abc123/image.png",
    "file_name": "image.png",
    "content_type": "image/png",
    "size": 1024000
  }
}
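
Since the upload response carries a URL meant for use in API requests, the two steps can be chained. The sketch below turns one or more upload responses into a request body for this model; `payload_from_uploads` is a hypothetical helper name, and it assumes each response has the `data.download_url` shape shown above.

```python
MODEL_ID = "google/gemini-omni-flash/reference-to-video"


def payload_from_uploads(upload_responses, prompt):
    """Build a generateVideo request body from uploadMedia responses.

    upload_responses: parsed JSON bodies shaped like the upload response above.
    """
    urls = [resp["data"]["download_url"] for resp in upload_responses]
    if not 1 <= len(urls) <= 5:
        # The input schema allows between 1 and 5 reference images
        raise ValueError("images must contain 1 to 5 items")
    return {"model": MODEL_ID, "prompt": prompt, "images": urls}
```

The returned dictionary can be passed as `json=` to the `requests.post` call against the generateVideo endpoint.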

Input Schema

The following parameters are accepted in the request body.

Total: 8 | Required: 3 | Optional: 5

model (string, required)

Model name.

Default: "google/gemini-omni-flash/reference-to-video"

prompt (string, required)

Text prompt for generation. Describes the target content, style, camera language, or character actions. Maximum 20,000 characters.

images (array[string], required)

Images to use as character, scene, or style references. Accepts 1 to 5 images. Supported formats: PNG, JPEG, JPG, WebP. Each image is limited to 20 MB. Each item can be either a public URL or a base64-encoded image.

Min items: 1 | Max items: 5

duration (integer)

The duration of the generated video in seconds.

Default: 10 | Min: 3 | Max: 10

aspect_ratio (string)

The aspect ratio of the generated video.

Default: "16:9"

Options: 16:9, 9:16

resolution (string)

The resolution of the generated video.

Default: "720p"

Options: 720p

thinking_level (string)

Controls the amount of internal reasoning the model performs before generating a response. Higher levels may improve quality on complex tasks but increase latency.

Default: "default"

Options: default, high, low

seed (integer)

The random seed to use for the generation. -1 means a random seed will be used.

Default: -1
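
The constraints listed above can be checked on the client before a request is sent, which avoids a round trip for obviously invalid input. This is an illustrative sketch only: `validate_input` is a hypothetical helper, and the server may apply additional or different checks.

```python
# Allowed values copied from the input schema above
ALLOWED = {
    "aspect_ratio": ("16:9", "9:16"),
    "resolution": ("720p",),
    "thinking_level": ("default", "high", "low"),
}


def validate_input(body):
    """Return a list of problems found in a request body (empty if none)."""
    problems = []
    for key in ("model", "prompt", "images"):
        if key not in body:
            problems.append(f"missing required field: {key}")
    prompt = body.get("prompt")
    if isinstance(prompt, str) and len(prompt) > 20000:
        problems.append("prompt exceeds 20,000 characters")
    images = body.get("images")
    if images is not None and not (isinstance(images, list) and 1 <= len(images) <= 5):
        problems.append("images must be a list of 1 to 5 items")
    duration = body.get("duration", 10)
    # bool is a subclass of int in Python, so exclude it explicitly
    if isinstance(duration, bool) or not isinstance(duration, int) or not 3 <= duration <= 10:
        problems.append("duration must be an integer from 3 to 10")
    for key, allowed in ALLOWED.items():
        if key in body and body[key] not in allowed:
            problems.append(f"{key} must be one of {list(allowed)}")
    seed = body.get("seed", -1)
    if isinstance(seed, bool) or not isinstance(seed, int):
        problems.append("seed must be an integer")
    return problems
```

Returning a list rather than raising on the first issue lets a caller report every problem at once.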

Request Body Example

{
  "model": "google/gemini-omni-flash/reference-to-video",
  "prompt": "A beautiful landscape",
  "images": [
    "https://example.com/file.jpg"
  ],
  "duration": 10,
  "aspect_ratio": "16:9",
  "resolution": "720p",
  "thinking_level": "default",
  "seed": -1
}

Output Schema

The API returns a prediction response containing the generated output URLs.

code (integer)

HTTP status code of the response.

message (string)

Human-readable message; non-empty on failure.

data (object)

Response Example

{
  "id": "pred_abc123",
  "status": "completed",
  "model": "model-name",
  "outputs": [
    "https://storage.atlascloud.ai/outputs/result.mp4"
  ],
  "metrics": {
    "predict_time": 45.2
  },
  "created_at": "2025-01-01T00:00:00Z",
  "completed_at": "2025-01-01T00:00:10Z"
}
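
For downstream code it can be convenient to convert the response into a typed object instead of indexing nested dictionaries. The sketch below assumes the full envelope (`code`, `message`, `data`) described in the output schema; the `Prediction` class and `parse_prediction` function are hypothetical names, not part of any Atlas Cloud client library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Prediction:
    id: str
    status: str
    outputs: List[str] = field(default_factory=list)
    error: str = ""

    @property
    def succeeded(self) -> bool:
        # Both values are documented as success statuses
        return self.status in ("completed", "succeeded")


def parse_prediction(response: dict) -> Prediction:
    """Convert a response envelope into a Prediction, tolerating missing fields."""
    data = response.get("data") or {}
    return Prediction(
        id=data.get("id", ""),
        status=data.get("status", ""),
        outputs=list(data.get("outputs") or []),  # outputs may be null before completion
        error=data.get("error") or response.get("message") or "",
    )
```

Falling back to the top-level `message` when `data.error` is empty is a design choice of this sketch; the schema only says that `message` is non-empty on failure.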

Atlas Cloud Skills

Atlas Cloud Skills integrates 400+ AI models directly into your AI coding assistant. Install it with one command, then use natural language to generate images and videos and to chat with LLMs.

Supported Clients

Claude Code

OpenAI Codex

Gemini CLI

Cursor

Windsurf

VS Code

Trae

GitHub Copilot

Cline

Roo Code

Amp

Goose

Replit

40+ supported clients

Installation

npx skills add AtlasCloudAI/atlas-cloud-skills

Set the API Key

Get an API key from the Atlas Cloud dashboard and set it as an environment variable.

export ATLASCLOUD_API_KEY="your-api-key-here"

Capabilities

Once installed, you can use natural language in your AI assistant to access all Atlas Cloud models.

Image Generation: Create images with models such as Nano Banana 2, Z-Image, and others.

Video Generation: Create videos from text or images with Kling, Vidu, Veo, and others.

LLM Chat: Chat with Qwen, DeepSeek, and other large language models.

Media Upload: Upload local files for image editing and image-to-video workflows.

Learn more

github.com/AtlasCloudAI/atlas-cloud-skills

MCP Server

Atlas Cloud MCP Server connects your IDE to 400+ AI models through the Model Context Protocol. It works with any MCP-compatible client.

Supported Clients

Cursor

VS Code

Windsurf

Claude Code

OpenAI Codex

Gemini CLI

Cline

Roo Code

100+ supported clients

Installation

npx -y atlascloud-mcp

Configuration

Add the following configuration to the MCP settings file in your IDE.

{
  "mcpServers": {
    "atlascloud": {
      "command": "npx",
      "args": [
        "-y",
        "atlascloud-mcp"
      ],
      "env": {
        "ATLASCLOUD_API_KEY": "your-api-key-here"
      }
    }
  }
}
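
If the MCP settings file already lists other servers, the entry above has to be merged in rather than pasted over the file. A minimal sketch of that merge follows; `add_atlascloud_server` is a hypothetical helper, and the location of the settings file depends on the IDE and is not covered here.

```python
import json


def add_atlascloud_server(settings, api_key="your-api-key-here"):
    """Return a copy of an MCP settings dict with the atlascloud entry added."""
    merged = dict(settings)
    servers = dict(merged.get("mcpServers") or {})  # keep any servers already configured
    servers["atlascloud"] = {
        "command": "npx",
        "args": ["-y", "atlascloud-mcp"],
        "env": {"ATLASCLOUD_API_KEY": api_key},
    }
    merged["mcpServers"] = servers
    return merged


# Starting from an empty settings file reproduces the configuration shown above
print(json.dumps(add_atlascloud_server({}), indent=2))
```

The function copies the input instead of mutating it, so the original settings can be kept as a backup before the file is rewritten.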

Available Tools

atlas_generate_image: Create images from a text prompt.

atlas_generate_video: Create videos from text or images.

atlas_chat: Chat with large language models.

atlas_list_models: Browse the 400+ available AI models.

atlas_quick_generate: One-step content generation with automatic selection of the best model.

atlas_upload_media: Upload local files for API workflows.

Learn more

github.com/AtlasCloudAI/mcp-server

API Schema

{
  "info": {
    "title": "AtlasCloud API",
    "version": "1.0.0",
    "description": "The AtlasCloud API."
  },
  "openapi": "3.0.0",
  "paths": {
    "/api/v1/model/generateVideo": {
      "post": {
        "requestBody": {
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/Input"
              }
            }
          },
          "required": true
        },
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "The request status."
          }
        }
      },
      "x-api-name": "model_run"
    },
    "/api/v1/model/prediction/{request_id}": {
      "get": {
        "parameters": [
          {
            "in": "path",
            "name": "request_id",
            "required": true,
            "schema": {
              "description": "Request ID",
              "type": "string"
            }
          }
        ],
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "Result of the request."
          }
        }
      },
      "x-api-name": "model_result"
    }
  },
  "components": {
    "schemas": {
      "Input": {
        "properties": {
          "model": {
            "type": "string",
            "description": "model name",
            "default": "google/gemini-omni-flash/reference-to-video"
          },
          "prompt": {
            "description": "Text prompt for generation. Describes the target content, style, camera language, or character actions. Maximum 20,000 characters.",
            "type": "string"
          },
          "images": {
            "description": "Images to use as character, scene, or style references. Accepts 1 to 5 images when combined with a video reference. Supported formats: PNG, JPEG, JPG, WebP. Each image is limited to 20MB. Supports both a public URL and a base64-encoded image for each item.",
            "items": {
              "type": "string",
              "format": "uri"
            },
            "maxItems": 5,
            "minItems": 1,
            "type": "array",
            "x-ui-component": "uploaders"
          },
          "duration": {
            "default": 10,
            "description": "The duration of the generated video in seconds.",
            "maximum": 10,
            "minimum": 3,
            "type": "integer",
            "x-ui-component": "select"
          },
          "aspect_ratio": {
            "default": "16:9",
            "description": "The aspect ratio of the generated video.",
            "enum": [
              "16:9",
              "9:16"
            ],
            "type": "string",
            "x-placeholder": "Select one",
            "x-ui-component": "select"
          },
          "resolution": {
            "default": "720p",
            "description": "The resolution of the generated video.",
            "enum": [
              "720p"
            ],
            "type": "string",
            "x-placeholder": "Select one",
            "x-ui-component": "select"
          },
          "thinking_level": {
            "description": "Controls the amount of internal reasoning the model performs before generating a response. Higher levels may improve quality on complex tasks but increase latency.",
            "default": "default",
            "enum": [
              "default",
              "high",
              "low"
            ],
            "type": "string"
          },
          "seed": {
            "default": -1,
            "description": "The random seed to use for the generation. -1 means a random seed will be used.",
            "type": "integer"
          }
        },
        "required": [
          "model",
          "prompt",
          "images"
        ],
        "type": "object",
        "x-order-properties": [
          "model",
          "prompt",
          "images",
          "duration",
          "aspect_ratio",
          "thinking_level",
          "resolution",
          "seed"
        ]
      },
      "PredictionResponse": {
        "type": "object",
        "properties": {
          "code": {
            "description": "HTTP status code of the response.",
            "type": "integer"
          },
          "message": {
            "description": "Human-readable message; non-empty on failure.",
            "type": "string"
          },
          "data": {
            "type": "object",
            "properties": {
              "id": {
                "description": "Unique identifier for the prediction.",
                "type": "string"
              },
              "model": {
                "description": "Model ID used for the prediction.",
                "type": "string"
              },
              "outputs": {
                "description": "Array of URLs to the generated content. Null when status is not completed.",
                "type": "array",
                "items": {
                  "type": "string"
                },
                "nullable": true
              },
              "urls": {
                "description": "Object containing related API endpoints.",
                "type": "object",
                "properties": {
                  "get": {
                    "description": "URL to poll for the prediction result.",
                    "type": "string",
                    "format": "uri"
                  }
                }
              },
              "has_nsfw_contents": {
                "description": "Array of boolean values indicating NSFW detection for each output. Null if not applicable.",
                "type": "array",
                "items": {
                  "type": "boolean"
                },
                "nullable": true
              },
              "status": {
                "description": "Status of the task: created, processing, completed, timeout, or failed.",
                "type": "string"
              },
              "created_at": {
                "description": "ISO timestamp of when the request was created (e.g., \"2023-04-01T12:34:56.789Z\").",
                "format": "date-time",
                "type": "string"
              },
              "error": {
                "description": "Error message if the task failed, empty string otherwise.",
                "type": "string"
              },
              "error_code": {
                "description": "Error code if the task failed.",
                "type": "integer"
              },
              "executionTime": {
                "description": "Total execution time in milliseconds.",
                "type": "number"
              },
              "timings": {
                "description": "Detailed timing breakdown.",
                "type": "object",
                "properties": {
                  "inference": {
                    "description": "Inference time in milliseconds.",
                    "type": "number"
                  }
                }
              }
            }
          }
        }
      }
    },
    "securitySchemes": {
      "apiKeyAuth": {
        "in": "header",
        "name": "Authorization",
        "type": "apiKey"
      }
    }
  },
  "servers": [
    {
      "url": "https://api.atlascloud.ai"
    }
  ]
}

Prompt Template for LLMs

# google/gemini-omni-flash/reference-to-video

> A natively multimodal Google DeepMind model that generates cinematic, sound-enabled videos from a text prompt plus 1-5 reference images, carrying a consistent subject, scene, or style across generations.


## Overview

- **Submit endpoint (POST)**: `https://api.atlascloud.ai/api/v1/model/generateVideo` — start an async generation; returns a `prediction_id`
- **Poll endpoint (GET)**: `https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}` — poll this until the prediction finishes
- **Model ID**: `google/gemini-omni-flash/reference-to-video`


## API Information

This model can be used via our HTTP API or more conveniently via our client libraries.
See the input and output schema below, as well as the usage examples.


### Input Schema

The API accepts the following input parameters:

- **`model`** (`string`, _required_):
  model name
  - Default: `"google/gemini-omni-flash/reference-to-video"`

- **`prompt`** (`string`, _required_):
  Text prompt for generation. Describes the target content, style, camera language, or character actions. Maximum 20,000 characters.

- **`images`** (`array[string]`, _required_):
  Images to use as character, scene, or style references. Accepts 1 to 5 images when combined with a video reference. Supported formats: PNG, JPEG, JPG, WebP. Each image is limited to 20MB. Supports both a public URL and a base64-encoded image for each item.
  - Min items: 1
  - Max items: 5

- **`duration`** (`integer`, _optional_):
  The duration of the generated video in seconds.
  - Default: `10`
  - Min: 3
  - Max: 10

- **`aspect_ratio`** (`string`, _optional_):
  The aspect ratio of the generated video.
  - Default: `"16:9"`
  - Options: "16:9", "9:16"

- **`thinking_level`** (`string`, _optional_):
  Controls the amount of internal reasoning the model performs before generating a response. Higher levels may improve quality on complex tasks but increase latency.
  - Default: `"default"`
  - Options: "default", "high", "low"

- **`resolution`** (`string`, _optional_):
  The resolution of the generated video.
  - Default: `"720p"`
  - Options: "720p"

- **`seed`** (`integer`, _optional_):
  The random seed to use for the generation. -1 means a random seed will be used.
  - Default: `-1`



**Required Parameters Example**:

```json
{
  "model": "google/gemini-omni-flash/reference-to-video",
  "prompt": "",
  "images": [
    ""
  ]
}
```


**Full Example**:

```json
{
  "model": "google/gemini-omni-flash/reference-to-video",
  "prompt": "",
  "images": [
    ""
  ],
  "duration": 10,
  "aspect_ratio": "16:9",
  "thinking_level": "default",
  "resolution": "720p",
  "seed": -1
}
```


### Output Schema

The API returns the following output format:


- **`code`** (`integer`, _optional_):
  HTTP status code of the response.

- **`message`** (`string`, _optional_):
  Human-readable message; non-empty on failure.

- **`data`** (`object`, _optional_):
  - Properties:
    - **`id`** (`string`, _optional_):
      Unique identifier for the prediction.

    - **`model`** (`string`, _optional_):
      Model ID used for the prediction.

    - **`outputs`** (`array[string]`, _optional_):
      Array of URLs to the generated content. Null when status is not completed.

    - **`urls`** (`object`, _optional_):
      Object containing related API endpoints.
      - Properties:
        - **`get`** (`string`, _optional_):
          URL to poll for the prediction result.


    - **`has_nsfw_contents`** (`array[boolean]`, _optional_):
      Array of boolean values indicating NSFW detection for each output. Null if not applicable.

    - **`status`** (`string`, _optional_):
      Status of the task: created, processing, completed, timeout, or failed.

    - **`created_at`** (`string`, _optional_):
      ISO timestamp of when the request was created (e.g., "2023-04-01T12:34:56.789Z").

    - **`error`** (`string`, _optional_):
      Error message if the task failed, empty string otherwise.

    - **`error_code`** (`integer`, _optional_):
      Error code if the task failed.

    - **`executionTime`** (`number`, _optional_):
      Total execution time in milliseconds.

    - **`timings`** (`object`, _optional_):
      Detailed timing breakdown.
      - Properties:
        - **`inference`** (`number`, _optional_):
          Inference time in milliseconds.





**Example Response**:

```json
{
  "code": 0,
  "message": "",
  "data": {
    "id": "",
    "model": "",
    "outputs": [
      ""
    ],
    "urls": {
      "get": ""
    },
    "has_nsfw_contents": [],
    "status": "",
    "created_at": "",
    "error": "",
    "error_code": 0,
    "executionTime": 0,
    "timings": {
      "inference": 0
    }
  }
}
```


## Usage Examples

### cURL

```bash
# Step 1: Start generation (async)
curl -X POST "https://api.atlascloud.ai/api/v1/model/generateVideo" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "google/gemini-omni-flash/reference-to-video",
  "prompt": "",
  "images": [
    ""
  ],
  "duration": 10,
  "aspect_ratio": "16:9",
  "thinking_level": "default",
  "resolution": "720p",
  "seed": -1
}'

# Response will contain: {"code": 200, "data": {"id": "prediction_id", "status": "processing"}}

# Step 2: Poll for result (replace {prediction_id} with the id returned above)
curl -X GET "https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY"

# Keep polling until status is "completed", "succeeded" or "failed"
# When completed, outputs will contain the generated content URL(s)
```

## Additional Resources

### Documentation

- [Model Playground](https://www.atlascloud.ai/models/google/gemini-omni-flash/reference-to-video)

Gemini Omni Flash — Reference to Video

Model ID: google/gemini-omni-flash/reference-to-video

Gemini Omni Flash is Google DeepMind's high-performance, natively multimodal model built for high-speed video generation, editing, and cinematic control. This variant accepts a text prompt plus one or more reference images, generating a video that carries the referenced subject, scene, or style into a newly described scene.

Overview

Gemini Omni Flash (gemini-omni-flash-preview) was introduced by Google alongside Nano Banana 2 Lite as a new generation of multimodal media models. Unlike traditional pipelines that stitch modalities together, Omni Flash is a single transformer that processes text, images, audio, and video simultaneously, producing output that is more cohesive, consistent, and controllable.

What sets it apart from earlier video models (such as the Veo family) is that it natively generates audio with every video — dialogue, ambience, music, and sound design are produced together with the picture rather than added afterward. The model is grounded in Gemini's real-world knowledge, so it reasons about physics, narrative logic, culture, and visual composition to produce results that feel intentional and cinematic. Generated media carries an invisible SynthID watermark.

AtlasCloud exposes Gemini Omni Flash through four endpoints — text-to-video, image-to-video, reference-to-video, and video-edit. All four route to the same gemini-omni-flash-preview model and differ only by the input modality they accept, corresponding to the model's task parameter (text_to_video, image_to_video, reference_to_video, edit). This endpoint maps to reference_to_video.

Inputs

This variant takes a text prompt and 1–5 reference images. The images are used as character, scene, or style references, and the prompt describes the new scene to build around them. Because Omni Flash maintains subject, object, and style consistency, this is the best choice for keeping a recurring character or a consistent visual identity across generations.

Prompt — Natural-language description of the target scene, action, camera language, mood, and audio (up to 20,000 characters).
Images — 1 to 5 reference images. PNG, JPEG, JPG, or WebP, each up to 20 MB. Supplied as public URLs or base64-encoded images.
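
For local files, the base64 option avoids a separate upload step. The sketch below checks the documented format and size limits and returns a plain base64 string. Several points are assumptions: `encode_reference_image` is a hypothetical helper, 20 MB is read as 20 × 1024 × 1024 bytes, and the page does not say whether the API expects a bare base64 string or a `data:` URI prefix.

```python
import base64
from pathlib import Path

MAX_BYTES = 20 * 1024 * 1024  # assumption: "20 MB" interpreted as 20 MiB
SUPPORTED_SUFFIXES = {".png", ".jpeg", ".jpg", ".webp"}


def encode_reference_image(path):
    """Return the base64 text of a local reference image after basic checks."""
    p = Path(path)
    if p.suffix.lower() not in SUPPORTED_SUFFIXES:
        raise ValueError(f"unsupported image format: {p.suffix or '(none)'}")
    raw = p.read_bytes()
    if len(raw) > MAX_BYTES:
        raise ValueError(f"image is {len(raw)} bytes; the limit is {MAX_BYTES}")
    # Bare base64 without a data-URI prefix; adjust if the API requires one
    return base64.b64encode(raw).decode("ascii")
```

The returned string would go into the `images` array in place of a URL. The check looks only at the file extension, not the file contents.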

Key Capabilities

Subject & style consistency — Carry a referenced character, object, or look across scenes and generations.
Multi-reference conditioning — Blend up to 5 reference images to guide subject, scene, and style at once.
Rich prompt understanding — Direct camera movement, action, mood, style, and audio in a single prompt of up to 20,000 characters.
Native audio generation — Every clip is rendered with a synchronized soundtrack (speech, music, effects) driven by your description.
World-grounded realism — Physics, motion, and scene dynamics informed by Gemini's real-world knowledge.
Adjustable reasoning — The thinking_level control trades latency for quality on complex prompts.
Reproducible results — Set a fixed seed to reproduce or iterate on a specific generation.
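
One way to use the last two capabilities together is to hold the seed constant while varying `thinking_level`, so that the outputs can be compared side by side. The sketch below only builds the request bodies; `thinking_level_variants` is a hypothetical helper, and how closely outputs match across levels for a given seed is not stated on this page.

```python
import copy


def thinking_level_variants(base, seed, levels=("low", "default", "high")):
    """Return one request body per thinking level, all sharing the same seed."""
    variants = []
    for level in levels:
        body = copy.deepcopy(base)  # leave the caller's payload untouched
        body["seed"] = seed
        body["thinking_level"] = level
        variants.append(body)
    return variants
```

Each returned body can be submitted as a separate generateVideo request, and each request is billed separately.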

Input Parameters

Parameter	Type	Required	Default	Description
`model`	string	Yes	`google/gemini-omni-flash/reference-to-video`	Model identifier
`prompt`	string	Yes	—	Text description of the target scene. Max 20,000 characters.
`images`	array of string (uri)	Yes	—	1–5 reference images for character, scene, or style. PNG/JPEG/JPG/WebP, ≤20 MB each. URL or base64.
`duration`	integer	No	`10`	Video length in seconds. Range: `3`–`10`.
`aspect_ratio`	string	No	`16:9`	Output aspect ratio. Enum: `16:9`, `9:16`.
`thinking_level`	string	No	`default`	Internal reasoning effort. Enum: `default`, `high`, `low`.
`resolution`	string	No	`720p`	Output resolution. Enum: `720p`.
`seed`	integer	No	`-1`	Random seed for reproducibility. `-1` uses a random seed.

Use Cases

Consistent characters — Keep the same protagonist, mascot, or presenter across a series of clips.
Brand identity — Reproduce a product, logo, or visual style consistently across marketing videos.
Style transfer — Apply the look and feel of reference art to a newly described scene.
Episodic content — Maintain visual continuity across multiple generations in a storyline.
Personalized media — Generate videos featuring specific subjects supplied as references.

Pricing

Billing is based on the duration of the generated video, charged at a flat per-second rate.

SKU	Rate
Per second of output	$0.135

Formula: max(3, duration) × $0.135

Billing is per second, with a 3-second minimum — durations below 3s are billed as 3s.
Example: a 10-second video costs 10 × $0.135 = $1.35.
Example: a 3-second video costs 3 × $0.135 = $0.405.
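
The pricing formula above is simple to encode for budgeting. The sketch uses `Decimal` so the result is not affected by binary floating-point rounding; `estimate_cost` and the constant names are illustrative, and the rate is the one quoted in this section.

```python
from decimal import Decimal

RATE_PER_SECOND = Decimal("0.135")  # USD per second of output, as listed above
MIN_BILLED_SECONDS = 3


def estimate_cost(duration_seconds):
    """Return the estimated cost in USD: max(3, duration) x $0.135."""
    billed = max(MIN_BILLED_SECONDS, duration_seconds)
    return billed * RATE_PER_SECOND


for seconds in (3, 5, 10):
    print(f"{seconds}s -> ${estimate_cost(seconds)}")
```

The function expects an integer duration, which matches the `duration` parameter's type in the input schema.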

Explore Similar Models

Gemini Omni Flash Image-to-Video Developer

Gemini Omni Flash Text-to-Video Developer

Veo 3.1 Lite Text-to-video

Veo 3.1 Lite Start-End Frame to Video

Veo 3.1 Lite Image-to-video

Veo3.1 Fast Image-to-video

Veo3.1 Fast Text-to-video

Veo3.1 Image-to-video

Veo3.1 Reference-to-video

Veo3.1 Text-to-video

Gemini Omni Flash Image-to-Video

Gemini Omni Flash Video Edit

Gemini Omni Flash Text-to-Video

Gemini Omni Flash Reference-to-Video Developer

Sync.so Lipsync v3

VEED Lipsync
