google/nano-banana-2/reference-to-image

Image-to-image

Nano Banana 2 Reference-to-Image API by Google

Google's advanced AI-powered video-to-image generation model, designed to generate high-quality static images from video clips combined with text instructions.

Each run costs $0.08. $10 covers about 125 runs.
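The pricing arithmetic can be checked with a short sketch. It assumes the flat $0.08 per run quoted on this page and ignores taxes, discounts, or any billing rules not stated here; `runs_for_budget` is a hypothetical helper name, not part of the Atlas Cloud API.

```python
def runs_for_budget(budget_cents: int, cost_per_run_cents: int = 8) -> int:
    """Return how many whole runs a budget covers, working in integer cents."""
    if cost_per_run_cents <= 0:
        raise ValueError("cost_per_run_cents must be positive")
    return budget_cents // cost_per_run_cents


# A $10.00 budget at $0.08 per run
print(runs_for_budget(1000))
```

Working in integer cents avoids the rounding surprises that floating-point division of 10 by 0.08 can produce.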

Code example
import os
import time

import requests

# Read the API key from the environment; Python does not expand "$VAR" inside strings.
API_KEY = os.environ["ATLASCLOUD_API_KEY"]

# Step 1: Start image generation
generate_url = "https://api.atlascloud.ai/api/v1/model/generateImage"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}
data = {
    "model": "google/nano-banana-2/reference-to-image",  # Required. Model name
    "aspect_ratio": "16:9",  # The aspect ratio of the generated media. options: 1:1 | 3:2 | 2:3 | 3:4 | 4:3 | 4:5 | 5:4 | 9:16 | 16:9 | 21:9
    "enable_base64_output": False,  # If enabled, the output will be encoded into a BASE64 string instead of a URL
    "enable_sync_mode": False,  # If set to true, the function will wait for the result to be generated and uploaded before returning the response
    "enable_web_search": False,  # If enabled, the model will use web search to ground the generation with real-time information
    "enable_image_search": False,  # If enabled, the model will use image search to ground the generation with real-time information
    "images": [
        "https://example.com/image1.jpg"
    ],  # List of URLs of input images for editing (at most 10)
    "output_format": "default",  # The format of the output image. options: default | png | jpeg
    "prompt": "A beautiful landscape with mountains and lake",  # Required. The positive prompt for the generation
    "resolution": "1k",  # The resolution of the output image. options: 1k | 2k | 4k
    "media_resolution": "default",  # Controls how input media is processed. options: default | low | medium | high
    "thinking_level": "default",  # Controls the amount of internal reasoning the model performs before generating a response. options: default | high | minimal
    "video_clips": [
        {
            "url": "https://example.com/clip.mp4",  # Placeholder. HTTP URL (video limited to 15MB) or YouTube video URL
            "start": 0,  # Start time in seconds
            "ends": 0,  # End time in seconds; 0 clips the whole video
            "fps": 1  # FPS of the video clip, 0 to 24
        }
    ],  # Required. Exactly one source video clip to use as a reference
}

generate_response = requests.post(generate_url, headers=headers, json=data, timeout=60)
generate_response.raise_for_status()
generate_result = generate_response.json()
prediction_id = generate_result["data"]["id"]

# Step 2: Poll for result
poll_url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"

def check_status():
    while True:
        response = requests.get(poll_url, headers={"Authorization": f"Bearer {API_KEY}"}, timeout=60)
        response.raise_for_status()
        result = response.json()
        status = result["data"]["status"]

        if status in ("completed", "succeeded"):
            print("Generated image:", result["data"]["outputs"][0])
            return result["data"]["outputs"][0]
        elif status in ("failed", "timeout"):
            # The API schema lists "timeout" as a status alongside "failed"
            raise Exception(result["data"].get("error") or "Generation failed")
        else:
            # Still processing, wait 2 seconds
            time.sleep(2)

image_url = check_status()

Installation

Install the required dependency.

pip install requests

Authentication

All API requests must be authenticated with an API key. You can get an API key from the Atlas Cloud console.

export ATLASCLOUD_API_KEY="your-api-key-here"

HTTP headers

import os

API_KEY = os.environ.get("ATLASCLOUD_API_KEY")
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}

Keep your API key safe

Never expose your API key in client-side code or public repositories. Use environment variables or a backend proxy.
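A minimal sketch of the environment-variable approach: build the headers from `ATLASCLOUD_API_KEY` and fail fast when the variable is missing, so that a request is never sent with an empty or literal placeholder token. `build_headers` is a hypothetical helper name; the optional `env` argument exists only to make the function easy to test.

```python
import os
from typing import Mapping, Optional


def build_headers(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build request headers from ATLASCLOUD_API_KEY, failing fast if it is unset."""
    env = os.environ if env is None else env
    api_key = env.get("ATLASCLOUD_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("ATLASCLOUD_API_KEY is not set")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```

Calling `build_headers()` with no argument reads the real process environment.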

Submit a request

import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateImage"
headers = {
    "Content-Type": "application/json",
    # Python does not expand "$VAR" inside strings, so read the key from the environment.
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}
data = {
    "model": "your-model",  # Replace with a model ID
    "prompt": "A beautiful landscape"
}

response = requests.post(url, headers=headers, json=data)
print(response.json())

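Submit a request

Submit an asynchronous generation request. The API returns a prediction ID that you can use to check the status and retrieve the result.

POST /api/v1/model/generateImage

Request body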

import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateImage"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}

data = {
    "model": "google/nano-banana-2/reference-to-image",
    "prompt": "A beautiful landscape with mountains and lake",
    # video_clips is required for this model; the URL below is a placeholder
    "video_clips": [
        {"url": "https://example.com/clip.mp4", "start": 0, "ends": 0, "fps": 1}
    ]
}

response = requests.post(url, headers=headers, json=data)
result = response.json()

print(f"Prediction ID: {result['data']['id']}")
print(f"Status: {result['data']['status']}")

Response

{
  "code": 200,
  "data": {
    "id": "pred_abc123",
    "status": "processing",
    "model": "model-name",
    "created_at": "2025-01-01T00:00:00Z"
  }
}

Check status

Poll the prediction endpoint to check the current status of a request.

GET /api/v1/model/prediction/{prediction_id}

Polling example

import os
import time

import requests

prediction_id = "pred_abc123"
url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"
# Python does not expand "$VAR" inside strings, so read the key from the environment.
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}

while True:
    response = requests.get(url, headers=headers)
    result = response.json()
    status = result["data"]["status"]
    print(f"Status: {status}")

    if status in ["completed", "succeeded"]:
        output_url = result["data"]["outputs"][0]
        print(f"Output URL: {output_url}")
        break
    elif status in ["failed", "timeout"]:
        # The API schema lists "timeout" as a status alongside "failed"
        print(f"Error: {result['data'].get('error') or 'Unknown'}")
        break

    time.sleep(3)

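Status values

processing: The request is still being processed.

completed: Generation finished; the output is available.

succeeded: Generation succeeded; the output is available.

failed: Generation failed; check the error field.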
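The status values can be folded into a small classifier so that polling code has a single place that decides when to stop. This is a sketch under two assumptions: `timeout` (which appears in the API schema's status description but not in the list above) is treated as a failure, and any unrecognized value such as `created` is treated as still pending. `classify_status` is a hypothetical helper name.

```python
TERMINAL_SUCCESS = {"completed", "succeeded"}
TERMINAL_FAILURE = {"failed", "timeout"}  # "timeout" as a failure is an assumption


def classify_status(status: str) -> str:
    """Map a raw status string to 'success', 'failure', or 'pending'."""
    if status in TERMINAL_SUCCESS:
        return "success"
    if status in TERMINAL_FAILURE:
        return "failure"
    # "created", "processing", and unknown values keep the caller polling (an assumption)
    return "pending"
```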

Completed response

{
  "data": {
    "id": "pred_abc123",
    "status": "completed",
    "outputs": [
      "https://storage.atlascloud.ai/outputs/result.png"
    ],
    "metrics": {
      "predict_time": 8.3
    },
    "created_at": "2025-01-01T00:00:00Z",
    "completed_at": "2025-01-01T00:00:10Z"
  }
}

Upload a file

Upload a file to Atlas Cloud storage to get a URL that can be used in API requests. Uploads use multipart/form-data.

POST /api/v1/model/uploadMedia

Upload example

import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/uploadMedia"
# Python does not expand "$VAR" inside strings, so read the key from the environment.
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}

with open("image.png", "rb") as f:
    files = {"file": ("image.png", f, "image/png")}
    response = requests.post(url, headers=headers, files=files)

result = response.json()
download_url = result["data"]["download_url"]
print(f"File URL: {download_url}")

Response

{
  "data": {
    "download_url": "https://storage.atlascloud.ai/uploads/abc123/image.png",
    "file_name": "image.png",
    "content_type": "image/png",
    "size": 1024000
  }
}
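Since this model requires a `video_clips` entry, one plausible use of the upload response is to turn its `download_url` into that entry. The sketch below assumes that `size` is reported in bytes and reads the 15MB limit for HTTP video URLs as 15,000,000 bytes; both are assumptions, as are the helper name `clip_from_upload` and the example `clip.mp4` response, which is modeled on the response shape above.

```python
MAX_HTTP_VIDEO_BYTES = 15 * 1000 * 1000  # assumption: "15MB" read as 15,000,000 bytes


def clip_from_upload(upload_response: dict, start: float = 0,
                     ends: float = 0, fps: float = 1) -> dict:
    """Turn an uploadMedia response into one video_clips entry."""
    data = upload_response["data"]
    size = data.get("size")  # assumed to be a byte count
    if size is not None and size > MAX_HTTP_VIDEO_BYTES:
        name = data.get("file_name", "file")
        raise ValueError(f"{name} exceeds the 15MB limit for HTTP video URLs")
    return {"url": data["download_url"], "start": start, "ends": ends, "fps": fps}


# Hypothetical upload response for a video file
example = {
    "data": {
        "download_url": "https://storage.atlascloud.ai/uploads/abc123/clip.mp4",
        "file_name": "clip.mp4",
        "content_type": "video/mp4",
        "size": 1024000,
    }
}
print(clip_from_upload(example, start=2, ends=6))
```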

Input Schema

The following parameters are accepted in the request body.

Total: 13; Required: 3; Optional: 10

model (string, required)

Model name.

Default: "google/nano-banana-2/reference-to-image"

aspect_ratio (string)

The aspect ratio of the generated media.

Options: 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9

enable_base64_output (boolean)

If enabled, the output will be encoded into a BASE64 string instead of a URL. This property is only available through the API.

Default: false

enable_sync_mode (boolean)

If set to true, the function will wait for the result to be generated and uploaded before returning the response. It allows you to get the result directly in the response. This property is only available through the API.

Default: false

enable_web_search (boolean)

If enabled, the model will use web search to ground the generation with real-time information.

Default: false

enable_image_search (boolean)

If enabled, the model will use image search to ground the generation with real-time information.

Default: false

images (array[string])

List of URLs of input images for editing. The maximum number of images is 10.

Min items: 0; Max items: 10

output_format (string)

The format of the output image.

Default: "default"

Options: default, png, jpeg

prompt (string, required)

The positive prompt for the generation.

resolution (string)

The resolution of the output image.

Default: "1k"

Options: 1k, 2k, 4k

media_resolution (string)

Controls how input media is processed. LOW reduces tokens per image/video, possibly losing detail but allowing longer videos in context. Supported values: HIGH, MEDIUM, LOW.

Default: "default"

Options: default, low, medium, high

thinking_level (string)

Controls the amount of internal reasoning the model performs before generating a response. Higher levels may improve quality on complex tasks but increase latency.

Default: "default"

Options: default, high, minimal

video_clips (array[object], required)

Source video clips to use as references for generation. Supports 1 video clip.

Min items: 1; Max items: 1

video_clips[].url (string, required)

URL of the source video clip. Supports an HTTP URL or a YouTube video URL. A video at an HTTP URL is limited to 15MB.

Format: uri

video_clips[].start (number, required)

Start time in seconds for trimming the video clip.

Default: 0; Min: 0

video_clips[].ends (number, required)

End time in seconds for trimming the video clip. Set to 0 to clip the whole video.

Default: 0; Min: 0

video_clips[].fps (number, required)

FPS of the video clip.

Default: 1; Min: 0; Max: 24
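The constraints in this schema can be checked on the client before a request is sent. The following is a sketch only: `validate_payload` is a hypothetical helper, it covers just the rules listed above (required fields, enum options, item counts, and numeric ranges), and the server remains the authority on what is accepted.

```python
ENUMS = {
    "aspect_ratio": {"1:1", "3:2", "2:3", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"},
    "output_format": {"default", "png", "jpeg"},
    "resolution": {"1k", "2k", "4k"},
    "media_resolution": {"default", "low", "medium", "high"},
    "thinking_level": {"default", "high", "minimal"},
}


def validate_payload(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []
    for field in ("model", "prompt", "video_clips"):
        if field not in payload:
            problems.append(f"missing required field: {field}")
    for field, allowed in ENUMS.items():
        if field in payload and payload[field] not in allowed:
            problems.append(f"{field} must be one of {sorted(allowed)}")
    if len(payload.get("images", [])) > 10:
        problems.append("images accepts at most 10 URLs")
    clips = payload.get("video_clips", [])
    if "video_clips" in payload and len(clips) != 1:
        problems.append("video_clips must contain exactly 1 clip")
    for clip in clips:
        for key in ("url", "start", "ends", "fps"):
            if key not in clip:
                problems.append(f"video_clips item is missing: {key}")
        if not 0 <= clip.get("fps", 1) <= 24:
            problems.append("fps must be between 0 and 24")
        if clip.get("start", 0) < 0 or clip.get("ends", 0) < 0:
            problems.append("start and ends must be >= 0")
    return problems
```

Returning a list rather than raising on the first error lets a caller report every problem at once.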

Example request body

{
  "model": "google/nano-banana-2/reference-to-image",
  "enable_base64_output": false,
  "enable_sync_mode": false,
  "enable_web_search": false,
  "enable_image_search": false,
  "output_format": "default",
  "prompt": "A beautiful landscape",
  "resolution": "1k",
  "media_resolution": "default",
  "thinking_level": "default",
  "video_clips": [
    {
      "url": "example_url",
      "start": 0,
      "ends": 0,
      "fps": 1
    }
  ]
}
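A request body like the one above can be assembled by a small builder that always sets the three required fields and includes optional ones only when the caller supplies them, leaving everything else to the documented defaults. `build_request` and `MODEL_ID` are hypothetical names used for illustration.

```python
MODEL_ID = "google/nano-banana-2/reference-to-image"


def build_request(prompt: str, video_url: str, start: float = 0,
                  ends: float = 0, fps: float = 1, **options) -> dict:
    """Build a request body with the required fields plus any optional ones."""
    body = {
        "model": MODEL_ID,
        "prompt": prompt,
        "video_clips": [{"url": video_url, "start": start, "ends": ends, "fps": fps}],
    }
    # Skip options left as None so that the API's defaults apply to them.
    body.update({k: v for k, v in options.items() if v is not None})
    return body


print(build_request("A beautiful landscape", "https://example.com/clip.mp4",
                    resolution="2k", aspect_ratio=None))
```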

Output Schema

The API returns a prediction response that contains the URLs of the generated outputs.

code (integer)

HTTP status code of the response.

message (string)

Human-readable message; non-empty on failure.

data (object)

Example response

{
  "id": "pred_abc123",
  "status": "completed",
  "model": "model-name",
  "outputs": [
    "https://storage.atlascloud.ai/outputs/result.png"
  ],
  "metrics": {
    "predict_time": 8.3
  },
  "created_at": "2025-01-01T00:00:00Z",
  "completed_at": "2025-01-01T00:00:10Z"
}
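The examples on this page show the prediction both wrapped in a `data` envelope and as a bare object, so a reader may want a helper that accepts either shape. The sketch below is one way to do that; `extract_outputs` is a hypothetical name, and raising `LookupError` for an unfinished prediction is a design choice made here, not API behavior.

```python
def extract_outputs(response: dict) -> list:
    """Return output URLs from a prediction response, or raise if it is not finished."""
    data = response.get("data", response)  # accept the enveloped or the bare shape
    status = data.get("status")
    if status == "failed":
        raise RuntimeError(data.get("error") or "Generation failed")
    outputs = data.get("outputs")
    if status not in ("completed", "succeeded") or not outputs:
        raise LookupError(f"no outputs yet (status: {status})")
    return list(outputs)


bare = {"status": "completed",
        "outputs": ["https://storage.atlascloud.ai/outputs/result.png"]}
print(extract_outputs(bare))
print(extract_outputs({"code": 200, "data": bare}))
```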

Atlas Cloud Skills

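Atlas Cloud Skills integrates 400+ AI models directly into your AI coding assistant. Install it with one command, then generate images and videos and chat with LLMs using natural language.

Supported clients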

Claude Code

OpenAI Codex

Gemini CLI

Cursor

Windsurf

VS Code

Trae

GitHub Copilot

Cline

Roo Code

Amp

Goose

Replit

40+ supported clients

Installation

npx skills add AtlasCloudAI/atlas-cloud-skills

Set the API key

Get an API key from the Atlas Cloud console and set it as an environment variable.

export ATLASCLOUD_API_KEY="your-api-key-here"

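Features

Once installed, you can access all Atlas Cloud models from your AI assistant using natural language.

Image generation: Generate images with models such as Nano Banana 2 and Z-Image.

Video creation: Create videos from text or images with Kling, Vidu, Veo, and others.

LLM chat: Chat with Qwen, DeepSeek, and other large language models.

Media upload: Upload local files for image editing and image-to-video workflows.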

MCP Server

Atlas Cloud MCP Server connects your IDE to 400+ AI models through the Model Context Protocol. It works with any MCP-compatible client.

Supported clients

Cursor

VS Code

Windsurf

Claude Code

OpenAI Codex

Gemini CLI

Cline

Roo Code

100+ supported clients

Installation

npx -y atlascloud-mcp

Configuration

Add the following configuration to your IDE's MCP settings file.

{
  "mcpServers": {
    "atlascloud": {
      "command": "npx",
      "args": [
        "-y",
        "atlascloud-mcp"
      ],
      "env": {
        "ATLASCLOUD_API_KEY": "your-api-key-here"
      }
    }
  }
}
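If an MCP settings file already lists other servers, the entry above has to be merged in rather than pasted over the file. The following sketch shows that merge on an in-memory dictionary; `add_atlascloud_server` is a hypothetical helper, and where the settings file lives (and how it is read or written) depends on the IDE, so that part is left out.

```python
import json


def add_atlascloud_server(config: dict, api_key: str) -> dict:
    """Return a copy of an MCP config with the atlascloud server entry added."""
    merged = json.loads(json.dumps(config))  # deep copy via a JSON round-trip
    servers = merged.setdefault("mcpServers", {})
    servers["atlascloud"] = {
        "command": "npx",
        "args": ["-y", "atlascloud-mcp"],
        "env": {"ATLASCLOUD_API_KEY": api_key},
    }
    return merged


# "other" stands in for a server that is already configured
existing = {"mcpServers": {"other": {"command": "other-server"}}}
print(json.dumps(add_atlascloud_server(existing, "your-api-key-here"), indent=2))
```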

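Available tools

atlas_generate_image: Generate images from a text prompt.

atlas_generate_video: Create videos from text or images.

atlas_chat: Chat with large language models.

atlas_list_models: Browse the 400+ available AI models.

atlas_quick_generate: One-step content creation that automatically selects the best model.

atlas_upload_media: Upload local files for use in API workflows.

Learn more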

github.com/AtlasCloudAI/mcp-server

API Schema

{
  "components": {
    "schemas": {
      "Input": {
        "properties": {
          "model": {
            "type": "string",
            "description": "model name",
            "default": "google/nano-banana-2/reference-to-image"
          },
          "aspect_ratio": {
            "description": "The aspect ratio of the generated media.",
            "enum": [
              "1:1",
              "3:2",
              "2:3",
              "3:4",
              "4:3",
              "4:5",
              "5:4",
              "9:16",
              "16:9",
              "21:9"
            ],
            "type": "string",
            "x-placeholder": "Select aspect ratio"
          },
          "enable_base64_output": {
            "default": false,
            "description": "If enabled, the output will be encoded into a BASE64 string instead of a URL. This property is only available through the API.",
            "disabled": true,
            "type": "boolean"
          },
          "enable_sync_mode": {
            "default": false,
            "description": "If set to true, the function will wait for the result to be generated and uploaded before returning the response. It allows you to get the result directly in the response. This property is only available through the API.",
            "disabled": true,
            "type": "boolean"
          },
          "enable_web_search": {
            "default": false,
            "description": "If enabled, the model will use web search to ground the generation with real-time information.",
            "type": "boolean"
          },
          "enable_image_search": {
            "default": false,
            "description": "If enabled, the model will use image search to ground the generation with real-time information.",
            "type": "boolean"
          },
          "images": {
            "description": "List of URLs of input images for editing. The maximum number of images is 10.",
            "items": {
              "type": "string"
            },
            "maxItems": 10,
            "minItems": 0,
            "type": "array",
            "x-ui-component": "uploaders"
          },
          "output_format": {
            "default": "default",
            "description": "The format of the output image.",
            "enum": [
              "default",
              "png",
              "jpeg"
            ],
            "type": "string"
          },
          "prompt": {
            "description": "The positive prompt for the generation.",
            "type": "string"
          },
          "resolution": {
            "default": "1k",
            "description": "The resolution of the output image.",
            "enum": [
              "1k",
              "2k",
              "4k"
            ],
            "type": "string"
          },
          "media_resolution": {
            "default": "default",
            "description": "Controls how input media is processed. LOW reduces tokens per image/video, possibly losing detail but allowing longer videos in context. Supported values: HIGH, MEDIUM, LOW.",
            "enum": [
              "default",
              "low",
              "medium",
              "high"
            ],
            "type": "string"
          },
          "thinking_level": {
            "description": "Controls the amount of internal reasoning the model performs before generating a response. Higher levels may improve quality on complex tasks but increase latency.",
            "default": "default",
            "enum": [
              "default",
              "high",
              "minimal"
            ],
            "type": "string"
          },
          "video_clips": {
            "description": "Source video clips to use as references for generation. Supports 1 video clip.",
            "type": "array",
            "items": {
              "type": "object",
              "required": [
                "url",
                "start",
                "ends",
                "fps"
              ],
              "properties": {
                "url": {
                  "type": "string",
                  "format": "uri",
                  "description": "URL of the source video clip. Support HTTP URL or YouTube video URL. Video in HTTP URL is limited to 15MB.",
                  "x-ui-component": "uploader"
                },
                "start": {
                  "type": "number",
                  "description": "Start time in seconds for trimming the video clip.",
                  "default": 0,
                  "minimum": 0
                },
                "ends": {
                  "type": "number",
                  "description": "End time in seconds for trimming the video clip. Set 0 to clip the whole video",
                  "default": 0,
                  "minimum": 0
                },
                "fps": {
                  "type": "number",
                  "description": "FPS of the video clip.",
                  "default": 1,
                  "minimum": 0,
                  "maximum": 24
                }
              }
            },
            "minItems": 1,
            "maxItems": 1
          }
        },
        "required": [
          "model",
          "prompt",
          "video_clips"
        ],
        "type": "object",
        "x-order-properties": [
          "model",
          "prompt",
          "images",
          "video_clips",
          "aspect_ratio",
          "resolution",
          "thinking_level",
          "media_resolution",
          "output_format",
          "enable_web_search",
          "enable_image_search",
          "enable_sync_mode",
          "enable_base64_output"
        ]
      },
      "PredictionResponse": {
        "type": "object",
        "properties": {
          "code": {
            "description": "HTTP status code of the response.",
            "type": "integer"
          },
          "message": {
            "description": "Human-readable message; non-empty on failure.",
            "type": "string"
          },
          "data": {
            "type": "object",
            "properties": {
              "id": {
                "description": "Unique identifier for the prediction.",
                "type": "string"
              },
              "model": {
                "description": "Model ID used for the prediction.",
                "type": "string"
              },
              "outputs": {
                "description": "Array of URLs to the generated content. Null when status is not completed.",
                "type": "array",
                "items": {
                  "type": "string"
                },
                "nullable": true
              },
              "urls": {
                "description": "Object containing related API endpoints.",
                "type": "object",
                "properties": {
                  "get": {
                    "description": "URL to poll for the prediction result.",
                    "type": "string",
                    "format": "uri"
                  }
                }
              },
              "has_nsfw_contents": {
                "description": "Array of boolean values indicating NSFW detection for each output. Null if not applicable.",
                "type": "array",
                "items": {
                  "type": "boolean"
                },
                "nullable": true
              },
              "status": {
                "description": "Status of the task: created, processing, completed, timeout, or failed.",
                "type": "string"
              },
              "created_at": {
                "description": "ISO timestamp of when the request was created (e.g., \"2023-04-01T12:34:56.789Z\").",
                "format": "date-time",
                "type": "string"
              },
              "error": {
                "description": "Error message if the task failed, empty string otherwise.",
                "type": "string"
              },
              "error_code": {
                "description": "Error code if the task failed.",
                "type": "integer"
              },
              "executionTime": {
                "description": "Total execution time in milliseconds.",
                "type": "number"
              },
              "timings": {
                "description": "Detailed timing breakdown.",
                "type": "object",
                "properties": {
                  "inference": {
                    "description": "Inference time in milliseconds.",
                    "type": "number"
                  }
                }
              }
            }
          }
        }
      }
    },
    "securitySchemes": {
      "apiKeyAuth": {
        "in": "header",
        "name": "Authorization",
        "type": "apiKey"
      }
    }
  },
  "info": {
    "description": "The AtlasCloud API.",
    "title": "AtlasCloud API",
    "version": "1.0.0"
  },
  "openapi": "3.0.0",
  "paths": {
    "/api/v1/model/generateImage": {
      "post": {
        "requestBody": {
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/Input"
              }
            }
          },
          "required": true
        },
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "The request status."
          }
        }
      },
      "x-api-name": "model_run"
    },
    "/api/v1/model/prediction/{request_id}": {
      "get": {
        "parameters": [
          {
            "in": "path",
            "name": "request_id",
            "required": true,
            "schema": {
              "description": "Request ID",
              "type": "string"
            }
          }
        ],
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "Result of the request."
          }
        }
      },
      "x-api-name": "model_result"
    }
  },
  "servers": [
    {
      "url": "https://api.atlascloud.ai"
    }
  ]
}
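Because the schema above is plain OpenAPI 3.0 JSON, the required fields, defaults, and enum options can be read out of it programmatically instead of being copied by hand. A minimal sketch follows; `summarize_input_schema` is a hypothetical helper, and `SPEC_EXCERPT` is a hand-trimmed excerpt of the schema used only to keep the example short.

```python
def summarize_input_schema(spec: dict) -> dict:
    """Collect required fields, defaults, and enum options from the Input schema."""
    schema = spec["components"]["schemas"]["Input"]
    props = schema["properties"]
    return {
        "required": list(schema.get("required", [])),
        "defaults": {k: v["default"] for k, v in props.items() if "default" in v},
        "enums": {k: v["enum"] for k, v in props.items() if "enum" in v},
    }


# Trimmed excerpt of the schema shown above
SPEC_EXCERPT = {
    "components": {"schemas": {"Input": {
        "properties": {
            "model": {"type": "string",
                      "default": "google/nano-banana-2/reference-to-image"},
            "prompt": {"type": "string"},
            "resolution": {"type": "string", "default": "1k",
                           "enum": ["1k", "2k", "4k"]},
        },
        "required": ["model", "prompt", "video_clips"],
    }}}
}
print(summarize_input_schema(SPEC_EXCERPT))
```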

LLM-friendly prompt template

# google/nano-banana-2/reference-to-image

> Google's advanced AI-powered video-to-image generation model, designed to generate high-quality static images from video clips combined with text instructions.


## Overview

- **Submit endpoint (POST)**: `https://api.atlascloud.ai/api/v1/model/generateImage` — start an async generation; returns a `prediction_id`
- **Poll endpoint (GET)**: `https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}` — poll this until the prediction finishes
- **Model ID**: `google/nano-banana-2/reference-to-image`


## API Information

This model can be used via our HTTP API or more conveniently via our client libraries.
See the input and output schema below, as well as the usage examples.


### Input Schema

The API accepts the following input parameters:

- **`model`** (`string`, _required_):
  model name
  - Default: `"google/nano-banana-2/reference-to-image"`

- **`prompt`** (`string`, _required_):
  The positive prompt for the generation.

- **`images`** (`array[string]`, _optional_):
  List of URLs of input images for editing. The maximum number of images is 10.
  - Min items: 0
  - Max items: 10

- **`video_clips`** (`array[object]`, _required_):
  Source video clips to use as references for generation. Supports 1 video clip.
  - Min items: 1
  - Max items: 1
  - Item properties:
    - **`url`** (`string`, _required_):
      URL of the source video clip. Supports an HTTP URL or a YouTube video URL. A video at an HTTP URL is limited to 15MB.

    - **`start`** (`number`, _required_):
      Start time in seconds for trimming the video clip.
      - Default: `0`
      - Min: 0

    - **`ends`** (`number`, _required_):
      End time in seconds for trimming the video clip. Set to 0 to clip the whole video.
      - Default: `0`
      - Min: 0

    - **`fps`** (`number`, _required_):
      FPS of the video clip.
      - Default: `1`
      - Min: 0
      - Max: 24


- **`aspect_ratio`** (`string`, _optional_):
  The aspect ratio of the generated media.
  - Options: "1:1", "3:2", "2:3", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"

- **`resolution`** (`string`, _optional_):
  The resolution of the output image.
  - Default: `"1k"`
  - Options: "1k", "2k", "4k"

- **`thinking_level`** (`string`, _optional_):
  Controls the amount of internal reasoning the model performs before generating a response. Higher levels may improve quality on complex tasks but increase latency.
  - Default: `"default"`
  - Options: "default", "high", "minimal"

- **`media_resolution`** (`string`, _optional_):
  Controls how input media is processed. LOW reduces tokens per image/video, possibly losing detail but allowing longer videos in context. Supported values: HIGH, MEDIUM, LOW.
  - Default: `"default"`
  - Options: "default", "low", "medium", "high"

- **`output_format`** (`string`, _optional_):
  The format of the output image.
  - Default: `"default"`
  - Options: "default", "png", "jpeg"

- **`enable_web_search`** (`boolean`, _optional_):
  If enabled, the model will use web search to ground the generation with real-time information.
  - Default: `false`

- **`enable_image_search`** (`boolean`, _optional_):
  If enabled, the model will use image search to ground the generation with real-time information.
  - Default: `false`

- **`enable_sync_mode`** (`boolean`, _optional_):
  If set to true, the function will wait for the result to be generated and uploaded before returning the response. It allows you to get the result directly in the response. This property is only available through the API.
  - Default: `false`

- **`enable_base64_output`** (`boolean`, _optional_):
  If enabled, the output will be encoded into a BASE64 string instead of a URL. This property is only available through the API.
  - Default: `false`
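When `enable_base64_output` is set, the output arrives as a BASE64 string rather than a URL. This page does not say whether that string carries a data-URI prefix, so the sketch below tolerates both forms; that tolerance and the helper name `decode_base64_output` are assumptions.

```python
import base64


def decode_base64_output(value: str) -> bytes:
    """Decode a BASE64 output string, tolerating an optional data-URI prefix."""
    if value.startswith("data:") and "," in value:
        value = value.split(",", 1)[1]  # drop a prefix such as "data:image/png;base64,"
    return base64.b64decode(value)


# Round-trip a few bytes to show both accepted forms
encoded = base64.b64encode(b"abc").decode("ascii")
print(decode_base64_output(encoded))
print(decode_base64_output("data:image/png;base64," + encoded))
```

The decoded bytes can then be written to a file opened in binary mode.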



**Required Parameters Example**:

```json
{
  "model": "google/nano-banana-2/reference-to-image",
  "prompt": "",
  "video_clips": [
    {
      "url": "",
      "start": 0,
      "ends": 0,
      "fps": 1
    }
  ]
}
```


**Full Example**:

```json
{
  "model": "google/nano-banana-2/reference-to-image",
  "prompt": "",
  "images": [
    ""
  ],
  "video_clips": [
    {
      "url": "",
      "start": 0,
      "ends": 0,
      "fps": 1
    }
  ],
  "aspect_ratio": "1:1",
  "resolution": "1k",
  "thinking_level": "default",
  "media_resolution": "default",
  "output_format": "default",
  "enable_web_search": false,
  "enable_image_search": false,
  "enable_sync_mode": false,
  "enable_base64_output": false
}
```


### Output Schema

The API returns the following output format:


- **`code`** (`integer`, _optional_):
  HTTP status code of the response.

- **`message`** (`string`, _optional_):
  Human-readable message; non-empty on failure.

- **`data`** (`object`, _optional_):
  - Properties:
    - **`id`** (`string`, _optional_):
      Unique identifier for the prediction.

    - **`model`** (`string`, _optional_):
      Model ID used for the prediction.

    - **`outputs`** (`array[string]`, _optional_):
      Array of URLs to the generated content. Null when status is not completed.

    - **`urls`** (`object`, _optional_):
      Object containing related API endpoints.
      - Properties:
        - **`get`** (`string`, _optional_):
          URL to poll for the prediction result.


    - **`has_nsfw_contents`** (`array[boolean]`, _optional_):
      Array of boolean values indicating NSFW detection for each output. Null if not applicable.

    - **`status`** (`string`, _optional_):
      Status of the task: created, processing, completed, timeout, or failed.

    - **`created_at`** (`string`, _optional_):
      ISO timestamp of when the request was created (e.g., "2023-04-01T12:34:56.789Z").

    - **`error`** (`string`, _optional_):
      Error message if the task failed, empty string otherwise.

    - **`error_code`** (`integer`, _optional_):
      Error code if the task failed.

    - **`executionTime`** (`number`, _optional_):
      Total execution time in milliseconds.

    - **`timings`** (`object`, _optional_):
      Detailed timing breakdown.
      - Properties:
        - **`inference`** (`number`, _optional_):
          Inference time in milliseconds.





**Example Response**:

```json
{
  "code": 0,
  "message": "",
  "data": {
    "id": "",
    "model": "",
    "outputs": [
      ""
    ],
    "urls": {
      "get": ""
    },
    "has_nsfw_contents": [],
    "status": "",
    "created_at": "",
    "error": "",
    "error_code": 0,
    "executionTime": 0,
    "timings": {
      "inference": 0
    }
  }
}
```


## Usage Examples

### cURL

```bash
# Step 1: Start generation (async)
curl -X POST "https://api.atlascloud.ai/api/v1/model/generateImage" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "google/nano-banana-2/reference-to-image",
  "prompt": "",
  "images": [
    ""
  ],
  "video_clips": [
    {
      "url": "",
      "start": 0,
      "ends": 0,
      "fps": 1
    }
  ],
  "aspect_ratio": "1:1",
  "resolution": "1k",
  "thinking_level": "default",
  "media_resolution": "default",
  "output_format": "default",
  "enable_web_search": false,
  "enable_image_search": false,
  "enable_sync_mode": false,
  "enable_base64_output": false
}'

# Response will contain: {"code": 200, "data": {"id": "prediction_id", "status": "processing"}}

# Step 2: Poll for result (replace {prediction_id} with the id returned above)
curl -X GET "https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY"

# Keep polling until status is "completed", "succeeded" or "failed"
# When completed, outputs will contain the generated content URL(s)
```
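The same two-step flow can be wrapped in a polling helper that also gives up after a deadline, which the cURL steps leave to the caller. This is a sketch, not a client library: `poll_prediction` is a hypothetical name, the `fetch` callable is expected to return the parsed JSON of the poll endpoint, the 300-second default is arbitrary, and treating `timeout` as a failure is an assumption based on the status list in the output schema.

```python
import time
from typing import Callable


def poll_prediction(fetch: Callable[[], dict], interval: float = 2.0,
                    timeout: float = 300.0,
                    sleep: Callable[[float], None] = time.sleep) -> list:
    """Call fetch() until the prediction finishes, then return its outputs."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch()["data"]
        status = data.get("status")
        if status in ("completed", "succeeded"):
            return data.get("outputs") or []
        if status in ("failed", "timeout"):
            raise RuntimeError(data.get("error") or f"Generation ended with status {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError("gave up waiting for the prediction")
        sleep(interval)
```

With the `requests` library, `fetch` could be `lambda: requests.get(poll_url, headers=headers, timeout=30).json()`, where `poll_url` and `headers` are built as in the steps above. Passing `fetch` and `sleep` as arguments keeps the loop testable without network access.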

## Additional Resources

### Documentation

- [Model Playground](https://www.atlascloud.ai/models/google/nano-banana-2/reference-to-image)

Generate a poster image that captures the key themes of this video.

Integrate the character from the image into the video's environment, fully rendering the character to ensure their proportions match those of an average male, and place them within the scene in a walking motion; the character's lighting should be consistent with the ambient atmosphere. No text should be included.

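Next-generation image generation

Up to 4K output (four tiers: 512px / 1K / 2K / 4K)
10+ aspect ratios, including 21:9, 1:4, and 8:1
Precise, legible in-image text rendering
Near Pro-level image quality (about 95%) at Flash-level speed

Intelligent editing and consistency

Consistency for up to 5 characters across scenes
Fidelity for up to 14 objects in a single workflow
Precise edits through natural language (remove, replace, adjust pose)
Multi-image fusion and seamless compositing

What's new in Nano Banana 2

3-5x faster than Pro

Built on the Gemini 3.1 Flash architecture, Nano Banana 2 takes only 4-8 seconds for standard image generation, compared with 10-20 seconds for Nano Banana Pro.

Image search grounding

NB2's standout feature: during generation it can retrieve real-world reference images through Google Search, substantially improving accuracy for landmarks, celebrities, and brand logos.

Precise text rendering

Generates accurate, legible text for marketing materials, greeting cards, and localized content, and can even translate and localize text directly within an image.

Multi-character consistency

Maintains visual consistency for up to 5 characters and 14 objects across scenes, which suits storyboards, comic creation, and marketing campaigns.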

Text Rendering

Marketing Mockup with Text

Generate marketing visuals with accurate, legible text — one of NB2's standout improvements

Prompt

A minimalist coffee shop promotional poster with the text 'MORNING BREW — Fresh Roasted Daily' in elegant serif font, warm earth tones, steam rising from a ceramic cup, clean layout with plenty of whitespace

Character Consistency

Multi-Scene Character

Maintain character consistency across multiple scenes — supports up to 5 characters per workflow

Prompt

A young woman with short red hair and freckles, wearing a green jacket, standing in a rainy Tokyo street at night with neon reflections on wet pavement, cinematic lighting, photorealistic

Photo to Action Figure

Person to Action Figure

Transform people from photos into collectible action figures with custom packaging

Prompt

Transform the person in the photo into an action figure, styled after [CHARACTER_NAME] from [SOURCE / CONTEXT]. Next to the figure, display the accessories including [ITEM_1], [ITEM_2], and [ITEM_3]. On the top of the toy box, write "[BOX_LABEL_TOP]", and underneath it, "[BOX_LABEL_BOTTOM]". Place the box in a [BACKGROUND_SETTING] environment.

Search Grounding

Real-World Reference Generation

Leverage Image Search Grounding to generate accurate real-world subjects like landmarks and brands

Prompt

A photorealistic aerial view of the Eiffel Tower at golden hour, with the Seine River winding through Paris below, warm sunset light casting long shadows, high detail, 4K resolution

Product Photography

Product Design Render

Create professional product photography with precise control over lighting and composition

Prompt

A frosted glass perfume bottle with a marble cap on a white marble surface, soft studio lighting from the left, subtle reflections, minimalist luxury aesthetic, product photography style

Style Transfer

Artistic Style Transformation

Apply diverse artistic styles while maintaining subject integrity

Prompt

Transform this photo into Studio Ghibli animation style, keeping the same composition and subjects, lush watercolor backgrounds, soft diffused lighting, whimsical atmosphere

4K Output

Ultra High Resolution Scene

Generate detailed scenes at up to 4K resolution with rich textures

Prompt

A cozy Japanese ramen shop interior at night, steam rising from bowls, warm amber lighting, detailed wooden counter with various condiments, a chef working in the background, 4K, ultra detailed

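Use Cases

🎬 Storyboarding and comic creation
📸 Product photography
📊 Marketing mockups
📱 Social media content
🔤 Text overlay design
👤 Character design
✨ Photo editing and retouching
🎨 Brand visual content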

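Why Choose Nano Banana 2?

⚡ Flash-Level Speed

3-5x faster than Nano Banana Pro, with standard generation taking only 4-8 seconds

🎯 Near-Pro Quality

Reaches roughly 95% of Pro's image quality in most scenarios

💰 Better Value

Costs about half as much as Nano Banana Pro, making high-quality AI image generation more accessible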

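Technical Specifications

Architecture: Gemini 3.1 Flash (GEMPIX2)

Resolution support: 512px to 4K (four tiers: 512px / 1K / 2K / 4K)

Aspect ratios: 1:1, 4:3, 3:4, 2:3, 3:2, 16:9, 9:16, 1:4, 4:1, 8:1, 21:9

Consistency: up to 5 characters + 14 objects per workflow

Content safety: SynthID watermarking, compatible with the C2PA standard

API access: Gemini API, Vertex AI, AI Studio, Gemini CLI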

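Try Nano Banana 2 Now

Pro-level quality at Flash-level speed: easily create polished visuals with character consistency, text rendering, and 4K resolution support.

✨ Start right away with free credits

⚡ Instant API access

🌐 No setup required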

Google Nano Banana 2 Reference to Image

Nano Banana 2 Reference to Image (Gemini 3.1 Flash Image) is Google's advanced AI-powered video-to-image generation model, designed to generate high-quality static images from video clips combined with text instructions. Built on the same cutting-edge model as Nano Banana 2 Edit, it adds the ability to use video content as a rich reference source — extracting visual context, themes, and key frames to synthesize new images with precision and semantic awareness.

This model is ideal for creating thumbnails, posters, promotional artwork, and scene summaries by leveraging the visual richness of existing video content alongside natural language guidance.

Why Choose This?

Video as reference — Provide a video clip (HTTP URL or YouTube URL) and let the model extract its visual context to guide image generation.
Multi-image reference — Optionally upload up to 10 additional reference images to complement the video input for complex compositions.
Natural language control — Describe exactly what you want with a text prompt; the model understands context, themes, and relationships from both the video and text.
Thinking levels — Choose how much internal reasoning the model applies — higher thinking levels improve quality on complex tasks.
Media resolution control — Balance detail and token usage for input video frames with LOW, MEDIUM, or HIGH media resolution modes.
Web & image search grounding — Optionally enable real-time web or image search to enrich generation with current information.
Multi-resolution output — Generate at 1K, 2K, or 4K resolution.
Flexible aspect ratios — Multiple options including 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, and 21:9.
Format choice — Export in PNG or JPEG format.

How It Works

The model analyzes your video clip by sampling frames at the specified FPS rate, then interprets the visual content within its multimodal context window. Combined with your text prompt and any additional reference images, it synthesizes a new image grounded in the video's themes, style, and key visual elements. This makes it especially powerful for creating content that is visually consistent with existing video assets.
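
As a rough illustration of the sampling step, the number of frames the model sees can be estimated from the trimmed duration and the FPS value. The function name and the `video_duration` argument are hypothetical, and the service's exact frame count and rounding are not documented here, so treat the result as an approximation only.

```python
import math


def estimate_sampled_frames(start: float, ends: float, fps: float,
                            video_duration: float) -> int:
    """Estimate how many frames are sampled from a trimmed clip.

    Assumption: frames are taken at a uniform rate over [start, ends];
    the service's real rounding behaviour may differ.
    """
    # Per the field reference, ends == 0 means "use the whole video".
    effective_end = video_duration if ends == 0 else min(ends, video_duration)
    duration = max(0.0, effective_end - start)
    return math.ceil(duration * fps)


# A 60-second segment (30 s to 90 s) of a 300-second video, sampled at 1 FPS.
print(estimate_sampled_frames(start=30, ends=90, fps=1, video_duration=300))
```

Halving the FPS or trimming the clip halves the estimate, which is why both settings are the main levers for keeping token usage down.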

Parameters

Core Inputs

Parameter	Required	Description
prompt	Yes	Text description of the desired output image
video_clips	Yes	Source video clip(s) for reference generation (max: 1, see below)
images	No	Additional reference images (max: 10)

Video Clip Fields

Field	Required	Description
url	Yes	URL of the source video clip. Supports HTTP URL or YouTube video URL. HTTP video is limited to 15MB.
start	Yes	Start time in seconds for trimming the video clip (min: 0)
ends	Yes	End time in seconds for trimming the video clip. Set to 0 to use the whole video.
fps	Yes	Frame sampling rate (FPS) of the video clip. Range: 0–24. Lower values reduce token usage.
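
A minimal sketch of how one video clip entry might be assembled and checked against the limits in the table above. The helper name `make_video_clip` is hypothetical, the dictionary shape is assumed from the field names, and the rule that a non-zero `ends` must be greater than `start` is an assumption rather than documented behaviour.

```python
def make_video_clip(url: str, start: float = 0, ends: float = 0,
                    fps: float = 1) -> dict:
    """Build one video_clips entry, checking the documented field limits."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be an HTTP(S) or YouTube video URL")
    if start < 0:
        raise ValueError("start must be >= 0 seconds")
    # ends == 0 means "use the whole video"; otherwise we assume it must
    # come after start (assumption, not stated in the field table).
    if ends != 0 and ends <= start:
        raise ValueError("ends must be 0 or greater than start")
    if not 0 <= fps <= 24:
        raise ValueError("fps must be within 0-24")
    return {"url": url, "start": start, "ends": ends, "fps": fps}


# Use seconds 5 to 20 of a clip, sampled at one frame per second.
clip = make_video_clip("https://example.com/clip.mp4", start=5, ends=20, fps=1)
print(clip)
```

Checking these values client-side avoids spending a request on input that the field limits would reject. The 15MB limit for HTTP videos is not checked here because it depends on the remote file.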

Output Options

Parameter	Required	Description
aspect_ratio	No	Aspect ratio: 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
resolution	No	Output resolution: 1k (default), 2k, 4k
output_format	No	Output format: png (default), jpeg

Advanced Options

Parameter	Required	Description
thinking_level	No	Reasoning depth: default, high, minimal. Higher levels improve quality on complex tasks but increase latency.
media_resolution	No	How input media frames are processed: default, low, medium, high. The low setting reduces tokens per frame, allowing longer videos.
enable_web_search	No	If enabled, grounds generation with real-time web information.
enable_image_search	No	If enabled, grounds generation with real-time image search results.
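
The parameter tables above can be turned into a small client-side check before a request is sent. The sketch below is illustrative only: `build_payload` is a hypothetical helper, the request body shape (a `video_clips` list of objects next to the other keys) is assumed from the field names, and the allowed values are copied from the tables rather than from an authoritative schema.

```python
MODEL_ID = "google/nano-banana-2/reference-to-image"

# Allowed values copied from the parameter tables above.
ALLOWED = {
    "aspect_ratio": {"1:1", "3:2", "2:3", "3:4", "4:3",
                     "4:5", "5:4", "9:16", "16:9", "21:9"},
    "resolution": {"1k", "2k", "4k"},
    "output_format": {"png", "jpeg"},
    "thinking_level": {"default", "high", "minimal"},
    "media_resolution": {"default", "low", "medium", "high"},
}


def build_payload(prompt, video_clips, images=None, **options):
    """Assemble a request body, rejecting values outside the documented sets."""
    if not prompt:
        raise ValueError("prompt is required")
    if len(video_clips) != 1:
        raise ValueError("exactly one video clip is required")
    images = list(images or [])
    if len(images) > 10:
        raise ValueError("at most 10 reference images are allowed")
    for key, value in options.items():
        # Options not listed in ALLOWED (e.g. boolean flags) pass through.
        if key in ALLOWED and value not in ALLOWED[key]:
            raise ValueError(f"unsupported {key}: {value!r}")
    payload = {"model": MODEL_ID, "prompt": prompt,
               "video_clips": list(video_clips)}
    if images:
        payload["images"] = images
    payload.update(options)
    return payload


payload = build_payload(
    "Create a cinematic poster based on the key scenes in this video",
    [{"url": "https://example.com/clip.mp4", "start": 0, "ends": 0, "fps": 1}],
    resolution="2k",
    thinking_level="high",
    enable_web_search=False,
)
print(payload)
```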

How to Use

Provide a video clip — enter the video URL (HTTP or YouTube) and set start/end times and FPS sampling rate.
Write your prompt — describe the output image clearly (e.g., "Create a cinematic poster based on the key scenes in this video").
Add reference images (optional) — upload additional images to guide composition or style.
Choose aspect ratio (optional) — select a preset or leave empty for default.
Select resolution — choose 1K, 2K, or 4K based on your quality needs.
Choose output format — PNG for transparency support, JPEG for smaller file size.
Adjust advanced settings (optional) — set thinking level, media resolution, or enable search grounding.
Run — submit and download your generated image.

Pricing

The total cost is determined by the output image resolution multiplied by the number of output images, plus optional per-request fees for video clip input, web search, and image search grounding.

SKU Prices

SKU	Description	Unit Price
sku_1k	1K resolution output image	$0.08
sku_2k	2K resolution output image	$0.12
sku_4k	4K resolution output image	$0.16
sku_video_clip	Video clip input (per request)	$0.07
sku_web_search	Web search grounding (per request)	$0.014
sku_image_search	Image search grounding (per request)	$0.014

Pricing Formula

cost = (resolution == "2k" ? sku_2k : (resolution == "4k" ? sku_4k : sku_1k)) * num_output_images
     + (enable_web_search ? sku_web_search : 0)
     + (enable_image_search ? sku_image_search : 0)
     + (len(video_clips) > 0 ? sku_video_clip : 0)

Examples:

Resolution	Video Clip	Web Search	Image Search	Total Cost
1K	Yes	No	No	$0.08 + $0.07 = $0.15
2K	Yes	No	No	$0.12 + $0.07 = $0.19
4K	Yes	No	No	$0.16 + $0.07 = $0.23
1K	Yes	Yes	No	$0.08 + $0.07 + $0.014 = $0.164
1K	Yes	Yes	Yes	$0.08 + $0.07 + $0.014 + $0.014 = $0.178
1K	No	No	No	$0.08
2K	No	No	No	$0.12
4K	No	No	No	$0.16

The video clip fee ($0.07), web search fee ($0.014), and image search fee ($0.014) are each charged once per request when the respective feature is enabled, regardless of content volume.
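
The pricing formula can be expressed as a short Python function for budgeting. This is a sketch using the SKU prices listed above; the function and argument names are hypothetical, and rounding to three decimals is only there to hide floating-point noise.

```python
SKU_PRICES = {"1k": 0.08, "2k": 0.12, "4k": 0.16}  # per output image, USD
SKU_VIDEO_CLIP = 0.07      # flat fee per request
SKU_WEB_SEARCH = 0.014     # flat fee per request
SKU_IMAGE_SEARCH = 0.014   # flat fee per request


def estimate_cost(resolution: str = "1k", num_output_images: int = 1,
                  has_video_clip: bool = True,
                  enable_web_search: bool = False,
                  enable_image_search: bool = False) -> float:
    """Return the estimated request cost in USD."""
    # As in the formula, anything other than 2k or 4k is billed at the 1K rate.
    per_image = SKU_PRICES.get(resolution, SKU_PRICES["1k"])
    cost = per_image * num_output_images
    if has_video_clip:
        cost += SKU_VIDEO_CLIP
    if enable_web_search:
        cost += SKU_WEB_SEARCH
    if enable_image_search:
        cost += SKU_IMAGE_SEARCH
    return round(cost, 3)


# 1K output with a video clip and both search options enabled.
print(estimate_cost("1k", enable_web_search=True, enable_image_search=True))
```

Calling `estimate_cost("4k")` corresponds to the 4K row with a video clip and no search grounding in the examples table.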

Best Use Cases

Video Thumbnail Generation — Automatically create compelling thumbnails that reflect the video's content and mood.
Promotional Posters — Generate movie-style or campaign posters grounded in actual video footage.
Scene Summarization Art — Create visual summaries or highlight artwork from long-form video content.
Brand Content Creation — Produce consistent image assets from brand video campaigns.
Educational Infographics — Transform instructional videos into static visual materials.
Social Media Assets — Generate platform-optimized images (vertical, square, landscape) from video content.

Pro Tips

Use low FPS (0.5–1) for long videos to keep token usage within limits while still capturing key frames.
Set precise start/end times to focus the model on the most relevant segment of your video.
Combine specific text prompts with the video input — vague prompts may produce generic results.
Add reference images alongside the video to guide composition style more precisely.
Use thinking_level: high for complex scene interpretations or when visual fidelity matters most.
Set media_resolution: low when analyzing long videos to allow more frames within the context window.
2K offers excellent quality at a reasonable price — only $0.04 more than 1K per image.
YouTube URLs are supported directly — no need to download and re-upload public videos.
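
To make the FPS tip concrete, the sketch below picks the highest sampling rate that stays within a frame budget. The `max_frames` budget is a hypothetical number you choose yourself; the service's actual token or frame limits are not stated here.

```python
def pick_fps(clip_seconds: float, max_frames: int,
             max_fps: float = 24.0) -> float:
    """Choose the highest FPS that keeps sampled frames within max_frames."""
    if clip_seconds <= 0 or max_frames <= 0:
        raise ValueError("clip_seconds and max_frames must be positive")
    # Cap at 24, the upper end of the documented FPS range.
    return min(max_fps, max_frames / clip_seconds)


# A 10-minute segment with an assumed budget of 300 frames.
print(pick_fps(clip_seconds=600, max_frames=300))
```

For the 10-minute example this yields 0.5, the low end of the range suggested above, while a short clip simply gets capped at 24.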

Notes

Both prompt and video_clips are required fields.
Maximum video clips: 1 per request.
HTTP video URLs are limited to 15MB; use YouTube URLs for larger videos.
Maximum additional reference images: 10.
FPS range: 0–24. Higher FPS captures more frames but consumes more tokens.
The video clip fee ($0.07) is a flat per-request charge, not per frame or per second.
If aspect_ratio is not selected, the model uses a default ratio.
4K resolution costs 2× the standard 1K rate.
Ensure your content and prompts comply with Google's Safety Guidelines.

Related Models

Nano Banana 2 Edit — Edit images using text prompts and reference images (no video input).
Nano Banana 2 Text-to-Image — Generate images from text prompts only.
Nano Banana Pro Edit — Pro tier editing with enhanced quality.
Nano Banana Pro Text-to-Image — Pro tier image generation.

Explore Similar Models

Nano Banana 2 Lite Edit Developer

Google's fastest and most cost-efficient Nano Banana image model for editing, applying natural-language edits and multi-image composition to up to 14 reference images with low latency.

Nano Banana 2 Lite Text-to-Image Developer

Google's fastest and most cost-efficient Nano Banana image model, turning natural-language text prompts into high-quality 1k images in as little as 4 seconds for rapid, high-volume generation.

Nano Banana 2 Lite Edit

Nano Banana Lite is the efficiency-focused model in the image generation family. It offers sub-2-second latency, cost-effective generation and editing, fast multi-turn local edits, and 14 supported aspect ratios.

Nano Banana 2 Lite Text-to-image

Nano Banana 2 Reference to Image Developer

Google's advanced AI-powered video-to-image generation model, designed to generate high-quality static images from video clips combined with text instructions.

Nano Banana 2 Text-to-Image Developer

Google's lightweight yet powerful AI image generation model, built for creators who need fast, high-quality visuals from simple text prompts.

Nano Banana 2 Text-to-Image

Google's lightweight yet powerful AI image generation model, built for creators who need fast, high-quality visuals from simple text prompts.

Nano Banana 2 Edit Developer

Google's advanced AI-powered image editing and generation model, designed to make visual transformation as intuitive as describing it in words.

Nano Banana 2 Edit

Google's advanced AI-powered image editing and generation model, designed to make visual transformation as intuitive as describing it in words.

Nano Banana Pro Text-to-image Ultra

Nano Banana Pro is the next-generation Nano Banana image model, delivering sharper detail, richer color control, and faster diffusion for production-ready visuals.

Nano Banana Pro Edit Ultra

Nano Banana Pro Edit is an image editing tool built on the Nano Banana model family, designed for precise, AI-powered visual adjustments.

Nano Banana Pro Text-to-image

Nano Banana Pro is the next-generation Nano Banana image model, delivering sharper detail, richer color control, and faster diffusion for production-ready visuals.

From $0.14/image