google/nano-banana-2/reference-to-image

Image-to-Image

Nano Banana 2 Reference-to-Image API by Google

Reference-to-image

Google's advanced AI-powered video-to-image generation model, designed to generate high-quality static images from video clips combined with text instructions.

Each run costs $0.08. With $10, you can run the model about 125 times.
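The run estimate is the budget divided by the per-run price. A minimal sketch of that arithmetic, working in integer cents to avoid floating-point rounding; the function name `runs_for_budget` is illustrative, and the $0.08 price is simply the figure quoted above and may change.

```python
def runs_for_budget(budget_cents: int, price_cents: int = 8) -> int:
    """Return how many whole runs a budget covers at a fixed per-run price."""
    if price_cents <= 0:
        raise ValueError("price_cents must be positive")
    return budget_cents // price_cents


# $10.00 at $0.08 per run
print(runs_for_budget(1000))  # 125
```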

Code example
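import os
import requests
import time

# Read the API key from the environment; "$ATLASCLOUD_API_KEY" is not expanded inside a Python string
API_KEY = os.environ["ATLASCLOUD_API_KEY"]

# Step 1: Start image generation
generate_url = "https://api.atlascloud.ai/api/v1/model/generateImage"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}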
data = {
    "model": "google/nano-banana-2/reference-to-image",  # Required. model name
    "aspect_ratio": "16:9",  # The aspect ratio of the generated media. options: 1:1 | 3:2 | 2:3 | 3:4 | 4:3 | 4:5 | 5:4 | 9:16 | 16:9 | 21:9
    "enable_base64_output": False,  # If enabled, the output will be encoded into a BASE64 string instead of a URL
    "enable_sync_mode": False,  # If set to true, the function will wait for the result to be generated and uploaded before returning the response
    "enable_web_search": False,  # If enabled, the model will use web search to ground the generation with real-time information
    "enable_image_search": False,  # If enabled, the model will use image search to ground the generation with real-time information
    "images": [
        "https://example.com/image1.jpg"
    ],  # List of URLs of input images for editing
    "output_format": "default",  # The format of the output image. options: default | png | jpeg
    "prompt": "A beautiful landscape with mountains and lake",  # Required. The positive prompt for the generation
    "resolution": "1k",  # The resolution of the output image. options: 1k | 2k | 4k
    "media_resolution": "default",  # Controls how input media is processed. options: default | low | medium | high
    "thinking_level": "default",  # Controls the amount of internal reasoning the model performs before generating a response. options: default | high | minimal
    "video_clips": [
        {
            "url": "https://example.com/clip.mp4",  # Placeholder. Replace with an HTTP or YouTube video URL
            "start": 0,
            "ends": 0,
            "fps": 1
        }
    ],  # Required. Source video clips to use as references for generation
}

generate_response = requests.post(generate_url, headers=headers, json=data)
generate_result = generate_response.json()
prediction_id = generate_result["data"]["id"]

# Step 2: Poll for result
poll_url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"

def check_status():
    while True:
        response = requests.get(poll_url, headers=headers)
        result = response.json()
        status = result["data"]["status"]

        if status in ("completed", "succeeded"):
            print("Generated image:", result["data"]["outputs"][0])
            return result["data"]["outputs"][0]
        elif status in ("failed", "timeout"):
            # "timeout" is listed as a terminal status in the API schema
            raise Exception(result["data"].get("error") or f"Generation {status}")
        else:
            # Still processing, wait 2 seconds
            time.sleep(2)

image_url = check_status()

Installation

Install the required package for your programming language.

pip install requests

Authentication

All API requests require authentication with an API key. You can get an API key from the Atlas Cloud dashboard.

export ATLASCLOUD_API_KEY="your-api-key-here"

HTTP Headers

import os

API_KEY = os.environ.get("ATLASCLOUD_API_KEY")
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}

Keep your API key secure

Never expose your API key in client-side code or public repositories. Use environment variables or a server-side proxy instead.
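One way to follow this advice is to read the key from the environment at startup and fail fast when it is missing. The sketch below assumes the `ATLASCLOUD_API_KEY` variable name used elsewhere on this page; the helper names `load_api_key` and `auth_headers` are illustrative and not part of any Atlas Cloud library.

```python
import os


def load_api_key(env=None) -> str:
    """Return the API key from the environment, raising if it is missing or blank."""
    env = os.environ if env is None else env
    key = env.get("ATLASCLOUD_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ATLASCLOUD_API_KEY is not set")
    return key


def auth_headers(api_key: str) -> dict:
    """Build the request headers shown in the HTTP Headers section."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real environment.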

Send a request

import os
import requests

url = "https://api.atlascloud.ai/api/v1/model/generateImage"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}
data = {
    "model": "your-model",
    "prompt": "A beautiful landscape"
}

response = requests.post(url, headers=headers, json=data)
print(response.json())

Submit a request

Submit an asynchronous generation request. The API returns a prediction ID that you can use to check the status and retrieve the result.

POST /api/v1/model/generateImage

Request body

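import os
import requests

url = "https://api.atlascloud.ai/api/v1/model/generateImage"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}

data = {
    "model": "google/nano-banana-2/reference-to-image",
    "prompt": "A beautiful landscape with mountains and lake",
    # video_clips is required by the input schema; the URL below is a placeholder
    "video_clips": [
        {"url": "https://example.com/clip.mp4", "start": 0, "ends": 0, "fps": 1}
    ]
}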

response = requests.post(url, headers=headers, json=data)
result = response.json()

print(f"Prediction ID: {result['data']['id']}")
print(f"Status: {result['data']['status']}")

Response

{
  "code": 200,
  "data": {
    "id": "pred_abc123",
    "status": "processing",
    "model": "model-name",
    "created_at": "2025-01-01T00:00:00Z"
  }
}
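Before polling, it helps to confirm that the submit call actually returned a prediction ID. The sketch below assumes that a successful submit carries `code` 200 and a non-empty `data.id`, as in the example response above, and that failures carry a `message`; the exact shape of error responses is an assumption, and `extract_prediction_id` is an illustrative name.

```python
def extract_prediction_id(result: dict) -> str:
    """Return data.id from a submit response, or raise with the API's message."""
    code = result.get("code")
    data = result.get("data") or {}
    if code != 200 or not data.get("id"):
        message = result.get("message") or "no message returned"
        raise RuntimeError(f"Submit failed (code={code}): {message}")
    return data["id"]


example = {"code": 200, "data": {"id": "pred_abc123", "status": "processing"}}
print(extract_prediction_id(example))  # pred_abc123
```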

Check status

Poll the prediction endpoint to check the current status of your request.

GET /api/v1/model/prediction/{prediction_id}

Polling example

import os
import requests
import time

prediction_id = "pred_abc123"
url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}

while True:
    response = requests.get(url, headers=headers)
    result = response.json()
    status = result["data"]["status"]
    print(f"Status: {status}")

    if status in ["completed", "succeeded"]:
        output_url = result["data"]["outputs"][0]
        print(f"Output URL: {output_url}")
        break
    elif status in ["failed", "timeout"]:  # "timeout" is listed as a status in the API schema
        print(f"Error: {result['data'].get('error', 'Unknown')}")
        break

    time.sleep(3)

Status values

processing: The request is still being processed.

completed: Generation has finished. The output is ready.

succeeded: Generation succeeded. The output is ready.

failed: Generation failed. Check the error field.
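The status values can be folded into a small polling helper that stops on any terminal state and gives up after a deadline. This is a sketch under stated assumptions: `wait_for_prediction` is an illustrative name, the default interval and timeout are arbitrary, and `timeout` is treated as a failure because the API schema on this page lists it as a status. The `fetch` argument is any zero-argument callable that returns the parsed JSON response.

```python
import time

TERMINAL_OK = {"completed", "succeeded"}
TERMINAL_FAILED = {"failed", "timeout"}


def wait_for_prediction(fetch, interval=3.0, timeout=300.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until the prediction reaches a terminal status.

    Returns the list of outputs on success; raises on failure or deadline.
    """
    deadline = clock() + timeout
    while True:
        data = fetch().get("data") or {}
        status = data.get("status")
        if status in TERMINAL_OK:
            return data.get("outputs") or []
        if status in TERMINAL_FAILED:
            raise RuntimeError(data.get("error") or f"Prediction ended with status {status!r}")
        if clock() >= deadline:
            raise TimeoutError("Gave up waiting for the prediction")
        sleep(interval)
```

With `requests`, `fetch` could be `lambda: requests.get(url, headers=headers).json()`. Injecting `sleep` and `clock` keeps the loop testable without real waiting.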

Completed response

{
  "data": {
    "id": "pred_abc123",
    "status": "completed",
    "outputs": [
      "https://storage.atlascloud.ai/outputs/result.png"
    ],
    "metrics": {
      "predict_time": 8.3
    },
    "created_at": "2025-01-01T00:00:00Z",
    "completed_at": "2025-01-01T00:00:10Z"
  }
}

Upload files

Upload a file to Atlas Cloud storage and receive a URL that you can use in your API requests. Uploads use multipart/form-data.

POST /api/v1/model/uploadMedia

Upload example

import os
import requests

url = "https://api.atlascloud.ai/api/v1/model/uploadMedia"
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}

with open("image.png", "rb") as f:
    files = {"file": ("image.png", f, "image/png")}
    response = requests.post(url, headers=headers, files=files)

result = response.json()
download_url = result["data"]["download_url"]
print(f"File URL: {download_url}")

Response

{
  "data": {
    "download_url": "https://storage.atlascloud.ai/uploads/abc123/image.png",
    "file_name": "image.png",
    "content_type": "image/png",
    "size": 1024000
  }
}
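When uploading files other than PNG images, the content type in the multipart part should match the file. A small sketch that derives it from the file name with the standard `mimetypes` module; the form field name `file` comes from the upload example above, while the helper name `build_upload_part` and the `application/octet-stream` fallback are assumptions.

```python
import mimetypes
import os


def build_upload_part(path: str, fileobj) -> dict:
    """Build the `files` argument for requests.post from a path and an open file."""
    content_type, _ = mimetypes.guess_type(path)
    return {
        "file": (
            os.path.basename(path),
            fileobj,
            content_type or "application/octet-stream",  # fallback is an assumption
        )
    }
```

The result can be passed as `requests.post(url, headers=headers, files=build_upload_part(path, f))` inside a `with open(path, "rb") as f:` block.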

Input Schema

The request body accepts the following parameters.

Total: 13, Required: 3, Optional: 10

model (string, required)

Model name.

Default: "google/nano-banana-2/reference-to-image"

aspect_ratio (string)

The aspect ratio of the generated media.

Options: 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9

enable_base64_output (boolean)

If enabled, the output will be encoded into a BASE64 string instead of a URL. This property is only available through the API.

Default: false

enable_sync_mode (boolean)

If set to true, the function will wait for the result to be generated and uploaded before returning the response. It allows you to get the result directly in the response. This property is only available through the API.

Default: false

enable_web_search (boolean)

If enabled, the model will use web search to ground the generation with real-time information.

Default: false

enable_image_search (boolean)

If enabled, the model will use image search to ground the generation with real-time information.

Default: false

images (array[string])

List of URLs of input images for editing. The maximum number of images is 10.

Min items: 0, Max items: 10

output_format (string)

The format of the output image.

Default: "default"

Options: default, png, jpeg

prompt (string, required)

The positive prompt for the generation.

resolution (string)

The resolution of the output image.

Default: "1k"

Options: 1k, 2k, 4k

media_resolution (string)

Controls how input media is processed. LOW reduces tokens per image/video, possibly losing detail but allowing longer videos in context. Supported values: HIGH, MEDIUM, LOW.

Default: "default"

Options: default, low, medium, high

thinking_level (string)

Controls the amount of internal reasoning the model performs before generating a response. Higher levels may improve quality on complex tasks but increase latency.

Default: "default"

Options: default, high, minimal

video_clips (array[object], required)

Source video clips to use as references for generation. Exactly 1 video clip is supported.

Min items: 1, Max items: 1

Each clip object has the following fields:

url (string, required)

URL of the source video clip. Supports an HTTP URL or a YouTube video URL. A video fetched over HTTP is limited to 15 MB.

Format: uri

start (number, required)

Start time in seconds for trimming the video clip.

Default: 0, Min: 0

ends (number, required)

End time in seconds for trimming the video clip. Set to 0 to use the whole video.

Default: 0, Min: 0

fps (number, required)

FPS of the video clip.

Default: 1, Min: 0, Max: 24

Example request body

{
  "model": "google/nano-banana-2/reference-to-image",
  "enable_base64_output": false,
  "enable_sync_mode": false,
  "enable_web_search": false,
  "enable_image_search": false,
  "output_format": "default",
  "prompt": "A beautiful landscape",
  "resolution": "1k",
  "media_resolution": "default",
  "thinking_level": "default",
  "video_clips": [
    {
      "url": "example_url",
      "start": 0,
      "ends": 0,
      "fps": 1
    }
  ]
}
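The limits listed in the input schema can be checked on the client before a request is sent. The following is a hedged sketch, not an official client: `build_payload` is an illustrative helper, it covers only a subset of the parameters, and the rule that a non-zero `ends` must exceed `start` is an assumption the schema does not state.

```python
ASPECT_RATIOS = {"1:1", "3:2", "2:3", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"}
RESOLUTIONS = {"1k", "2k", "4k"}
MODEL_ID = "google/nano-banana-2/reference-to-image"


def build_payload(prompt, clip_url, start=0, ends=0, fps=1,
                  images=(), aspect_ratio=None, resolution="1k"):
    """Validate inputs against the documented limits and return a request body."""
    if not prompt:
        raise ValueError("prompt is required")
    if not clip_url:
        raise ValueError("video_clips[0].url is required")
    if start < 0 or ends < 0:
        raise ValueError("start and ends must be >= 0")
    # Assumption (not stated in the schema): a non-zero end must come after start.
    if ends and ends <= start:
        raise ValueError("ends must be greater than start, or 0 for the whole video")
    if not 0 <= fps <= 24:
        raise ValueError("fps must be between 0 and 24")
    if len(images) > 10:
        raise ValueError("at most 10 images are accepted")
    if aspect_ratio is not None and aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution}")

    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "resolution": resolution,
        # Exactly one clip is accepted (min items 1, max items 1).
        "video_clips": [{"url": clip_url, "start": start, "ends": ends, "fps": fps}],
    }
    if images:
        payload["images"] = list(images)
    if aspect_ratio is not None:
        payload["aspect_ratio"] = aspect_ratio
    return payload
```

For example, `build_payload("A poster of the key scene", "https://example.com/clip.mp4", aspect_ratio="16:9")` returns a body that can be passed as `json=` to the submit request; the URL here is a placeholder.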

Output Schema

The API returns a prediction response that includes the generated output URLs.

code (integer)

HTTP status code of the response.

message (string)

Human-readable message; non-empty on failure.

data (object)

Example response

{
  "id": "pred_abc123",
  "status": "completed",
  "model": "model-name",
  "outputs": [
    "https://storage.atlascloud.ai/outputs/result.png"
  ],
  "metrics": {
    "predict_time": 8.3
  },
  "created_at": "2025-01-01T00:00:00Z",
  "completed_at": "2025-01-01T00:00:10Z"
}
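When `enable_base64_output` is set, the entries in `outputs` are Base64 strings rather than URLs. The page does not show the exact encoding, so the sketch below assumes the value is either a bare Base64 string or a `data:` URI, and handles both; `decode_output` is an illustrative name.

```python
import base64


def decode_output(value: str):
    """Return ("url", value) for URL outputs or ("bytes", raw) for Base64 outputs."""
    if value.startswith(("http://", "https://")):
        return "url", value
    # Assumption: Base64 output may arrive as "data:image/png;base64,<payload>".
    if value.startswith("data:") and "," in value:
        value = value.split(",", 1)[1]
    return "bytes", base64.b64decode(value)
```

The decoded bytes can then be written to disk with `open("result.png", "wb")`, using an extension that matches the requested `output_format`.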

Atlas Cloud Skills

Atlas Cloud Skills integrates more than 400 AI models directly into your AI coding assistant. Install it with a single command, then use natural language to generate images and videos and to chat with LLMs.

Supported apps

Claude Code

OpenAI Codex

Gemini CLI

Cursor

Windsurf

VS Code

Trae

GitHub Copilot

Cline

Roo Code

Amp

Goose

Replit

40+ supported apps

Installation

npx skills add AtlasCloudAI/atlas-cloud-skills

Set up your API key

Get an API key from the Atlas Cloud dashboard and set it as an environment variable.

export ATLASCLOUD_API_KEY="your-api-key-here"

Capabilities

Once installed, you can use natural language in your AI assistant to access all Atlas Cloud models.

Image generation: Generate images with models such as Nano Banana 2, Z-Image, and more.

Video generation: Generate videos from text or images with Kling, Vidu, Veo, and others.

LLM chat: Chat with Qwen, DeepSeek, and other large language models.

Media upload: Upload local files for image editing and image-to-video workflows.

Learn more

github.com/AtlasCloudAI/atlas-cloud-skills

MCP Server

Atlas Cloud MCP Server connects your IDE to more than 400 AI models through the Model Context Protocol. It works with any MCP-compatible application.

Supported apps

Cursor

VS Code

Windsurf

Claude Code

OpenAI Codex

Gemini CLI

Cline

Roo Code

100+ supported apps

Installation

npx -y atlascloud-mcp

Configuration

Add the following configuration to your IDE's MCP settings file.

{
  "mcpServers": {
    "atlascloud": {
      "command": "npx",
      "args": [
        "-y",
        "atlascloud-mcp"
      ],
      "env": {
        "ATLASCLOUD_API_KEY": "your-api-key-here"
      }
    }
  }
}

Available tools

atlas_generate_image: Generate images from a text description.

atlas_generate_video: Generate videos from text or images.

atlas_chat: Chat with large language models.

atlas_list_models: Browse more than 400 available AI models.

atlas_quick_generate: One-step content generation with automatic selection of the best model.

atlas_upload_media: Upload local files for API workflows.

Learn more

github.com/AtlasCloudAI/mcp-server

API Schema

{
  "components": {
    "schemas": {
      "Input": {
        "properties": {
          "model": {
            "type": "string",
            "description": "model name",
            "default": "google/nano-banana-2/reference-to-image"
          },
          "aspect_ratio": {
            "description": "The aspect ratio of the generated media.",
            "enum": [
              "1:1",
              "3:2",
              "2:3",
              "3:4",
              "4:3",
              "4:5",
              "5:4",
              "9:16",
              "16:9",
              "21:9"
            ],
            "type": "string",
            "x-placeholder": "Select aspect ratio"
          },
          "enable_base64_output": {
            "default": false,
            "description": "If enabled, the output will be encoded into a BASE64 string instead of a URL. This property is only available through the API.",
            "disabled": true,
            "type": "boolean"
          },
          "enable_sync_mode": {
            "default": false,
            "description": "If set to true, the function will wait for the result to be generated and uploaded before returning the response. It allows you to get the result directly in the response. This property is only available through the API.",
            "disabled": true,
            "type": "boolean"
          },
          "enable_web_search": {
            "default": false,
            "description": "If enabled, the model will use web search to ground the generation with real-time information.",
            "type": "boolean"
          },
          "enable_image_search": {
            "default": false,
            "description": "If enabled, the model will use image search to ground the generation with real-time information.",
            "type": "boolean"
          },
          "images": {
            "description": "List of URLs of input images for editing. The maximum number of images is 10.",
            "items": {
              "type": "string"
            },
            "maxItems": 10,
            "minItems": 0,
            "type": "array",
            "x-ui-component": "uploaders"
          },
          "output_format": {
            "default": "default",
            "description": "The format of the output image.",
            "enum": [
              "default",
              "png",
              "jpeg"
            ],
            "type": "string"
          },
          "prompt": {
            "description": "The positive prompt for the generation.",
            "type": "string"
          },
          "resolution": {
            "default": "1k",
            "description": "The resolution of the output image.",
            "enum": [
              "1k",
              "2k",
              "4k"
            ],
            "type": "string"
          },
          "media_resolution": {
            "default": "default",
            "description": "Controls how input media is processed. LOW reduces tokens per image/video, possibly losing detail but allowing longer videos in context. Supported values: HIGH, MEDIUM, LOW.",
            "enum": [
              "default",
              "low",
              "medium",
              "high"
            ],
            "type": "string"
          },
          "thinking_level": {
            "description": "Controls the amount of internal reasoning the model performs before generating a response. Higher levels may improve quality on complex tasks but increase latency.",
            "default": "default",
            "enum": [
              "default",
              "high",
              "minimal"
            ],
            "type": "string"
          },
          "video_clips": {
            "description": "Source video clips to use as references for generation. Supports 1 video clip.",
            "type": "array",
            "items": {
              "type": "object",
              "required": [
                "url",
                "start",
                "ends",
                "fps"
              ],
              "properties": {
                "url": {
                  "type": "string",
                  "format": "uri",
                  "description": "URL of the source video clip. Support HTTP URL or YouTube video URL. Video in HTTP URL is limited to 15MB.",
                  "x-ui-component": "uploader"
                },
                "start": {
                  "type": "number",
                  "description": "Start time in seconds for trimming the video clip.",
                  "default": 0,
                  "minimum": 0
                },
                "ends": {
                  "type": "number",
                  "description": "End time in seconds for trimming the video clip. Set 0 to clip the whole video",
                  "default": 0,
                  "minimum": 0
                },
                "fps": {
                  "type": "number",
                  "description": "FPS of the video clip.",
                  "default": 1,
                  "minimum": 0,
                  "maximum": 24
                }
              }
            },
            "minItems": 1,
            "maxItems": 1
          }
        },
        "required": [
          "model",
          "prompt",
          "video_clips"
        ],
        "type": "object",
        "x-order-properties": [
          "model",
          "prompt",
          "images",
          "video_clips",
          "aspect_ratio",
          "resolution",
          "thinking_level",
          "media_resolution",
          "output_format",
          "enable_web_search",
          "enable_image_search",
          "enable_sync_mode",
          "enable_base64_output"
        ]
      },
      "PredictionResponse": {
        "type": "object",
        "properties": {
          "code": {
            "description": "HTTP status code of the response.",
            "type": "integer"
          },
          "message": {
            "description": "Human-readable message; non-empty on failure.",
            "type": "string"
          },
          "data": {
            "type": "object",
            "properties": {
              "id": {
                "description": "Unique identifier for the prediction.",
                "type": "string"
              },
              "model": {
                "description": "Model ID used for the prediction.",
                "type": "string"
              },
              "outputs": {
                "description": "Array of URLs to the generated content. Null when status is not completed.",
                "type": "array",
                "items": {
                  "type": "string"
                },
                "nullable": true
              },
              "urls": {
                "description": "Object containing related API endpoints.",
                "type": "object",
                "properties": {
                  "get": {
                    "description": "URL to poll for the prediction result.",
                    "type": "string",
                    "format": "uri"
                  }
                }
              },
              "has_nsfw_contents": {
                "description": "Array of boolean values indicating NSFW detection for each output. Null if not applicable.",
                "type": "array",
                "items": {
                  "type": "boolean"
                },
                "nullable": true
              },
              "status": {
                "description": "Status of the task: created, processing, completed, timeout, or failed.",
                "type": "string"
              },
              "created_at": {
                "description": "ISO timestamp of when the request was created (e.g., \"2023-04-01T12:34:56.789Z\").",
                "format": "date-time",
                "type": "string"
              },
              "error": {
                "description": "Error message if the task failed, empty string otherwise.",
                "type": "string"
              },
              "error_code": {
                "description": "Error code if the task failed.",
                "type": "integer"
              },
              "executionTime": {
                "description": "Total execution time in milliseconds.",
                "type": "number"
              },
              "timings": {
                "description": "Detailed timing breakdown.",
                "type": "object",
                "properties": {
                  "inference": {
                    "description": "Inference time in milliseconds.",
                    "type": "number"
                  }
                }
              }
            }
          }
        }
      }
    },
    "securitySchemes": {
      "apiKeyAuth": {
        "in": "header",
        "name": "Authorization",
        "type": "apiKey"
      }
    }
  },
  "info": {
    "description": "The AtlasCloud API.",
    "title": "AtlasCloud API",
    "version": "1.0.0"
  },
  "openapi": "3.0.0",
  "paths": {
    "/api/v1/model/generateImage": {
      "post": {
        "requestBody": {
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/Input"
              }
            }
          },
          "required": true
        },
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "The request status."
          }
        }
      },
      "x-api-name": "model_run"
    },
    "/api/v1/model/prediction/{request_id}": {
      "get": {
        "parameters": [
          {
            "in": "path",
            "name": "request_id",
            "required": true,
            "schema": {
              "description": "Request ID",
              "type": "string"
            }
          }
        ],
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "Result of the request."
          }
        }
      },
      "x-api-name": "model_result"
    }
  },
  "servers": [
    {
      "url": "https://api.atlascloud.ai"
    }
  ]
}
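Because the schema is plain OpenAPI JSON, the accepted parameters can be read from it programmatically instead of being hard-coded. A minimal sketch, assuming the schema above has been loaded into a dictionary (for example with `json.loads`); `summarize_input_schema` is an illustrative name.

```python
def summarize_input_schema(spec: dict) -> dict:
    """Map each input parameter to its required flag, default, and enum options."""
    schema = spec["components"]["schemas"]["Input"]
    required = set(schema.get("required", []))
    summary = {}
    for name, prop in schema["properties"].items():
        summary[name] = {
            "required": name in required,
            "default": prop.get("default"),
            "options": prop.get("enum"),
        }
    return summary


# A two-property excerpt of the schema above, used only for illustration.
excerpt = {
    "components": {"schemas": {"Input": {
        "required": ["prompt"],
        "properties": {
            "prompt": {"type": "string"},
            "resolution": {"type": "string", "default": "1k", "enum": ["1k", "2k", "4k"]},
        },
    }}}
}
print(summarize_input_schema(excerpt)["resolution"]["options"])  # ['1k', '2k', '4k']
```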

LLM-Friendly Prompt Template

# google/nano-banana-2/reference-to-image

> Google's advanced AI-powered video-to-image generation model, designed to generate high-quality static images from video clips combined with text instructions.


## Overview

- **Submit endpoint (POST)**: `https://api.atlascloud.ai/api/v1/model/generateImage` — start an async generation; returns a `prediction_id`
- **Poll endpoint (GET)**: `https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}` — poll this until the prediction finishes
- **Model ID**: `google/nano-banana-2/reference-to-image`


## API Information

This model can be used via our HTTP API or more conveniently via our client libraries.
See the input and output schema below, as well as the usage examples.


### Input Schema

The API accepts the following input parameters:

- **`model`** (`string`, _required_):
  model name
  - Default: `"google/nano-banana-2/reference-to-image"`

- **`prompt`** (`string`, _required_):
  The positive prompt for the generation.

- **`images`** (`array[string]`, _optional_):
  List of URLs of input images for editing. The maximum number of images is 10.
  - Min items: 0
  - Max items: 10

- **`video_clips`** (`array[object]`, _required_):
  Source video clips to use as references for generation. Supports 1 video clip.
  - Min items: 1
  - Max items: 1
  - Item properties:
    - **`url`** (`string`, _required_):
      URL of the source video clip. Supports an HTTP URL or a YouTube video URL. A video fetched over HTTP is limited to 15 MB.

    - **`start`** (`number`, _required_):
      Start time in seconds for trimming the video clip.
      - Default: `0`
      - Min: 0

    - **`ends`** (`number`, _required_):
      End time in seconds for trimming the video clip. Set to 0 to use the whole video.
      - Default: `0`
      - Min: 0

    - **`fps`** (`number`, _required_):
      FPS of the video clip.
      - Default: `1`
      - Min: 0
      - Max: 24


- **`aspect_ratio`** (`string`, _optional_):
  The aspect ratio of the generated media.
  - Options: "1:1", "3:2", "2:3", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"

- **`resolution`** (`string`, _optional_):
  The resolution of the output image.
  - Default: `"1k"`
  - Options: "1k", "2k", "4k"

- **`thinking_level`** (`string`, _optional_):
  Controls the amount of internal reasoning the model performs before generating a response. Higher levels may improve quality on complex tasks but increase latency.
  - Default: `"default"`
  - Options: "default", "high", "minimal"

- **`media_resolution`** (`string`, _optional_):
  Controls how input media is processed. LOW reduces tokens per image/video, possibly losing detail but allowing longer videos in context. Supported values: HIGH, MEDIUM, LOW.
  - Default: `"default"`
  - Options: "default", "low", "medium", "high"

- **`output_format`** (`string`, _optional_):
  The format of the output image.
  - Default: `"default"`
  - Options: "default", "png", "jpeg"

- **`enable_web_search`** (`boolean`, _optional_):
  If enabled, the model will use web search to ground the generation with real-time information.
  - Default: `false`

- **`enable_image_search`** (`boolean`, _optional_):
  If enabled, the model will use image search to ground the generation with real-time information.
  - Default: `false`

- **`enable_sync_mode`** (`boolean`, _optional_):
  If set to true, the function will wait for the result to be generated and uploaded before returning the response. It allows you to get the result directly in the response. This property is only available through the API.
  - Default: `false`

- **`enable_base64_output`** (`boolean`, _optional_):
  If enabled, the output will be encoded into a BASE64 string instead of a URL. This property is only available through the API.
  - Default: `false`



**Required Parameters Example**:

```json
{
  "model": "google/nano-banana-2/reference-to-image",
  "prompt": "",
  "video_clips": [
    {
      "url": "",
      "start": 0,
      "ends": 0,
      "fps": 1
    }
  ]
}
```


**Full Example**:

```json
{
  "model": "google/nano-banana-2/reference-to-image",
  "prompt": "",
  "images": [
    ""
  ],
  "video_clips": [
    {
      "url": "",
      "start": 0,
      "ends": 0,
      "fps": 1
    }
  ],
  "aspect_ratio": "1:1",
  "resolution": "1k",
  "thinking_level": "default",
  "media_resolution": "default",
  "output_format": "default",
  "enable_web_search": false,
  "enable_image_search": false,
  "enable_sync_mode": false,
  "enable_base64_output": false
}
```


### Output Schema

The API returns the following output format:


- **`code`** (`integer`, _optional_):
  HTTP status code of the response.

- **`message`** (`string`, _optional_):
  Human-readable message; non-empty on failure.

- **`data`** (`object`, _optional_):
  - Properties:
    - **`id`** (`string`, _optional_):
      Unique identifier for the prediction.

    - **`model`** (`string`, _optional_):
      Model ID used for the prediction.

    - **`outputs`** (`array[string]`, _optional_):
      Array of URLs to the generated content. Null when status is not completed.

    - **`urls`** (`object`, _optional_):
      Object containing related API endpoints.
      - Properties:
        - **`get`** (`string`, _optional_):
          URL to poll for the prediction result.


    - **`has_nsfw_contents`** (`array[boolean]`, _optional_):
      Array of boolean values indicating NSFW detection for each output. Null if not applicable.

    - **`status`** (`string`, _optional_):
      Status of the task: created, processing, completed, timeout, or failed.

    - **`created_at`** (`string`, _optional_):
      ISO timestamp of when the request was created (e.g., "2023-04-01T12:34:56.789Z").

    - **`error`** (`string`, _optional_):
      Error message if the task failed, empty string otherwise.

    - **`error_code`** (`integer`, _optional_):
      Error code if the task failed.

    - **`executionTime`** (`number`, _optional_):
      Total execution time in milliseconds.

    - **`timings`** (`object`, _optional_):
      Detailed timing breakdown.
      - Properties:
        - **`inference`** (`number`, _optional_):
          Inference time in milliseconds.





**Example Response**:

```json
{
  "code": 0,
  "message": "",
  "data": {
    "id": "",
    "model": "",
    "outputs": [
      ""
    ],
    "urls": {
      "get": ""
    },
    "has_nsfw_contents": [],
    "status": "",
    "created_at": "",
    "error": "",
    "error_code": 0,
    "executionTime": 0,
    "timings": {
      "inference": 0
    }
  }
}
```
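Since `has_nsfw_contents` holds one boolean per output and may be null, a client can pair each output with its flag before deciding what to display. The helper below is a sketch; `outputs_with_flags` is an illustrative name, and reporting a missing flag as `None` rather than `False` is a design assumption.

```python
def outputs_with_flags(response: dict) -> list:
    """Pair each output with its NSFW flag, using None when no flag is present."""
    data = response.get("data") or {}
    outputs = data.get("outputs") or []
    flags = data.get("has_nsfw_contents") or []
    return [
        (output, flags[i] if i < len(flags) else None)
        for i, output in enumerate(outputs)
    ]
```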


## Usage Examples

### cURL

```bash
# Step 1: Start generation (async)
curl -X POST "https://api.atlascloud.ai/api/v1/model/generateImage" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "google/nano-banana-2/reference-to-image",
  "prompt": "",
  "images": [
    ""
  ],
  "video_clips": [
    {
      "url": "",
      "start": 0,
      "ends": 0,
      "fps": 1
    }
  ],
  "aspect_ratio": "1:1",
  "resolution": "1k",
  "thinking_level": "default",
  "media_resolution": "default",
  "output_format": "default",
  "enable_web_search": false,
  "enable_image_search": false,
  "enable_sync_mode": false,
  "enable_base64_output": false
}'

# Response will contain: {"code": 200, "data": {"id": "prediction_id", "status": "processing"}}

# Step 2: Poll for result (replace {prediction_id} with the id returned above)
curl -X GET "https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY"

# Keep polling until status is "completed", "succeeded" or "failed"
# When completed, outputs will contain the generated content URL(s)
```

## Additional Resources

### Documentation

- [Model Playground](https://www.atlascloud.ai/models/google/nano-banana-2/reference-to-image)

Generate a poster image that captures the key themes of this video.

Integrate the character from the image into the video's environment, fully rendering the character to ensure their proportions match those of an average male, and place them within the scene in a walking motion; the character's lighting should be consistent with the ambient atmosphere. No text should be included.

Next-Generation Image Generation

Output resolution up to 4K (512px / 1K / 2K / 4K tiers)
10+ aspect ratios, including 21:9, 1:4, 8:1, and more
Accurate, legible text rendering in images
Near-Pro quality (~95%) at Flash speed

Smart Editing & Consistency

Character consistency for up to 5 characters across scenes
Object fidelity for up to 14 objects in a single workflow
Targeted edits in natural language (remove, replace, re-pose)
Seamless multi-image blending and composition

Có Gì Mới Trong Nano Banana 2

Nhanh Hơn Pro 3-5 Lần

Được xây dựng trên kiến trúc Gemini 3.1 Flash, Nano Banana 2 tạo hình ảnh tiêu chuẩn trong 4-8 giây — so với 10-20 giây của Nano Banana Pro.

Tăng Cường Bằng Tìm Kiếm Hình Ảnh

Tính năng nổi bật của NB2 — có thể truy xuất hình ảnh tham chiếu thực tế qua Google Search trong quá trình tạo ảnh, cải thiện đáng kể độ chính xác cho các địa danh, người nổi tiếng và logo thương hiệu.

Hiển Thị Văn Bản Chính Xác

Tạo văn bản chính xác, dễ đọc cho mockup tiếp thị, thiệp chúc mừng và nội dung bản địa hóa. Bạn thậm chí có thể dịch và bản địa hóa văn bản ngay trong hình ảnh.

Nhất Quán Đa Nhân Vật

Duy trì tính nhất quán hình ảnh cho tối đa 5 nhân vật và 14 đối tượng xuyên suốt các cảnh — hoàn hảo cho storyboard, truyện tranh và chiến dịch tiếp thị.

Text Rendering

Marketing Mockup with Text

Generate marketing visuals with accurate, legible text — one of NB2's standout improvements

Prompt

A minimalist coffee shop promotional poster with the text 'MORNING BREW — Fresh Roasted Daily' in elegant serif font, warm earth tones, steam rising from a ceramic cup, clean layout with plenty of whitespace

Character Consistency

Multi-Scene Character

Maintain character consistency across multiple scenes — supports up to 5 characters per workflow

Prompt

A young woman with short red hair and freckles, wearing a green jacket, standing in a rainy Tokyo street at night with neon reflections on wet pavement, cinematic lighting, photorealistic

Photo to Action Figure

Person to Action Figure

Transform people from photos into collectible action figures with custom packaging

Prompt

Transform the person in the photo into an action figure, styled after [CHARACTER_NAME] from [SOURCE / CONTEXT]. Next to the figure, display the accessories including [ITEM_1], [ITEM_2], and [ITEM_3]. On the top of the toy box, write "[BOX_LABEL_TOP]", and underneath it, "[BOX_LABEL_BOTTOM]". Place the box in a [BACKGROUND_SETTING] environment.

Search Grounding

Real-World Reference Generation

Leverage Image Search Grounding to generate accurate real-world subjects like landmarks and brands

Prompt

A photorealistic aerial view of the Eiffel Tower at golden hour, with the Seine River winding through Paris below, warm sunset light casting long shadows, high detail, 4K resolution

Product Photography

Product Design Render

Create professional product photography with precise control over lighting and composition

Prompt

A frosted glass perfume bottle with a marble cap on a white marble surface, soft studio lighting from the left, subtle reflections, minimalist luxury aesthetic, product photography style

Style Transfer

Artistic Style Transformation

Apply diverse artistic styles while maintaining subject integrity

Prompt

Transform this photo into Studio Ghibli animation style, keeping the same composition and subjects, lush watercolor backgrounds, soft diffused lighting, whimsical atmosphere

4K Output

Ultra High Resolution Scene

Generate detailed scenes at up to 4K resolution with rich textures

Prompt

A cozy Japanese ramen shop interior at night, steam rising from bowls, warm amber lighting, detailed wooden counter with various condiments, a chef working in the background, 4K, ultra detailed

Use Cases

🎬 Storyboards & Comics
📸 Product Photography
📊 Marketing Mockups
📱 Social Media Content
🔤 Text Overlay Design
👤 Character Design
✨ Photo Editing & Retouching
🎨 Brand Visual Content

Why Choose Nano Banana 2?

⚡ Flash Speed

3-5x faster than Nano Banana Pro, with standard image generation times of 4-8 seconds

🎯 Near-Pro Quality

Reaches roughly 95% of Pro's image quality in most scenarios

💰 Cost Savings

Costs roughly half as much as Nano Banana Pro, making high-quality AI image generation more accessible

Technical Specifications

Architecture: Gemini 3.1 Flash (GEMPIX2)

Supported resolutions: 512px to 4K (tiers: 512px / 1K / 2K / 4K)

Aspect ratios: 1:1, 4:3, 3:4, 2:3, 3:2, 16:9, 9:16, 1:4, 4:1, 8:1, 21:9

Consistency: Up to 5 characters + 14 objects per workflow

Content safety: SynthID watermark, compatible with the C2PA standard

API access: Gemini API, Vertex AI, AI Studio, Gemini CLI

Experience Nano Banana 2

Pro-grade image generation at Flash speed: create striking images with character consistency, text rendering, and 4K resolution support.

✨Free Credits to Get Started

⚡Instant API Access

🌐No Setup Required

Google Nano Banana 2 Reference to Image

Nano Banana 2 Reference to Image (Gemini 3.1 Flash Image) is Google's advanced AI-powered video-to-image generation model, designed to generate high-quality static images from video clips combined with text instructions. Built on the same cutting-edge model as Nano Banana 2 Edit, it adds the ability to use video content as a rich reference source — extracting visual context, themes, and key frames to synthesize new images with precision and semantic awareness.

This model is ideal for creating thumbnails, posters, promotional artwork, and scene summaries by leveraging the visual richness of existing video content alongside natural language guidance.

Why Choose This?

Video as reference — Provide a video clip (HTTP URL or YouTube URL) and let the model extract its visual context to guide image generation.
Multi-image reference — Optionally upload up to 10 additional reference images to complement the video input for complex compositions.
Natural language control — Describe exactly what you want with a text prompt; the model understands context, themes, and relationships from both the video and text.
Thinking levels — Choose how much internal reasoning the model applies — higher thinking levels improve quality on complex tasks.
Media resolution control — Balance detail and token usage for input video frames with LOW, MEDIUM, or HIGH media resolution modes.
Web & image search grounding — Optionally enable real-time web or image search to enrich generation with current information.
Multi-resolution output — Generate at 1K, 2K, or 4K resolution.
Flexible aspect ratios — Multiple options including 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, and 21:9.
Format choice — Export in PNG or JPEG format.

How It Works

The model analyzes your video clip by sampling frames at the specified FPS rate, then interprets the visual content within its multimodal context window. Combined with your text prompt and any additional reference images, it synthesizes a new image grounded in the video's themes, style, and key visual elements. This makes it especially powerful for creating content that is visually consistent with existing video assets.
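
As a rough illustration of the frame sampling described above, the sketch below estimates how many frames a trimmed segment would contribute at a given sampling rate. The function name `estimate_sampled_frames` is hypothetical, and the simple `duration * fps` rule is an assumption; the service's exact sampling behavior is not documented here.

```python
import math


def estimate_sampled_frames(start: float, ends: float, fps: float) -> int:
    """Approximate the number of frames sampled from the [start, ends] segment.

    Assumes uniform sampling at `fps` frames per second (an assumption, not a
    documented guarantee). When the request uses ends=0 ("whole video"), pass
    the real video length here instead.
    """
    if start < 0 or ends <= start:
        raise ValueError("expected 0 <= start < ends")
    if fps < 0:
        raise ValueError("fps must be non-negative")
    return math.ceil((ends - start) * fps)


# A 60-second segment sampled at 1 FPS versus 0.5 FPS
print(estimate_sampled_frames(0, 60, 1))    # 60
print(estimate_sampled_frames(0, 60, 0.5))  # 30
```

Halving the sampling rate halves the estimated frame count, which is why lower `fps` values reduce token usage.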

Parameters

Core Inputs

Parameter	Required	Description
prompt	Yes	Text description of the desired output image
video_clips	Yes	Source video clip(s) for reference generation (max: 1, see below)
images	No	Additional reference images (max: 10)

Video Clip Fields

Field	Required	Description
url	Yes	URL of the source video clip. Supports HTTP URL or YouTube video URL. HTTP video is limited to 15MB.
start	Yes	Start time in seconds for trimming the video clip (min: 0)
ends	Yes	End time in seconds for trimming the video clip. Set 0 to use the whole video.
fps	Yes	Frame sampling rate (FPS) of the video clip. Range: 0–24. Lower values reduce token usage.
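
The field limits in the table above can be checked on the client before a request is sent. The following is a minimal sketch; `build_video_clip` is a hypothetical helper, and the rule that a non-zero `ends` must be greater than `start` is an assumption rather than something the table states.

```python
def build_video_clip(url: str, start: float = 0, ends: float = 0, fps: float = 1) -> dict:
    """Return one `video_clips` entry after checking the documented limits."""
    if not url:
        raise ValueError("url is required")
    if start < 0:
        raise ValueError("start must be >= 0")
    # ends == 0 means "use the whole video"; otherwise we assume it must follow start.
    if ends != 0 and ends <= start:
        raise ValueError("ends must be greater than start, or 0 for the whole video")
    if not 0 <= fps <= 24:
        raise ValueError("fps must be within the range 0-24")
    return {"url": url, "start": start, "ends": ends, "fps": fps}


# Placeholder URL; trims the clip to seconds 5-20 and samples one frame per second.
clip = build_video_clip("https://example.com/video.mp4", start=5, ends=20, fps=1)
print(clip)
```

The returned dictionary can be placed directly in the `video_clips` list of the request body.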

Output Options

Parameter	Required	Description
aspect_ratio	No	Aspect ratio: 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
resolution	No	Output resolution: 1k (default), 2k, 4k
output_format	No	Output format: png (default), jpeg

Advanced Options

Parameter	Required	Description
thinking_level	No	Reasoning depth: default, high, minimal. Higher levels improve quality on complex tasks but increase latency.
media_resolution	No	How input media frames are processed: default, low, medium, high. LOW reduces tokens per frame, allowing longer videos.
enable_web_search	No	If enabled, grounds generation with real-time web information.
enable_image_search	No	If enabled, grounds generation with real-time image search results.

How to Use

Provide a video clip — enter the video URL (HTTP or YouTube) and set start/end times and FPS sampling rate.
Write your prompt — describe the output image clearly (e.g., "Create a cinematic poster based on the key scenes in this video").
Add reference images (optional) — upload additional images to guide composition or style.
Choose aspect ratio (optional) — select a preset or leave empty for default.
Select resolution — choose 1K, 2K, or 4K based on your quality needs.
Choose output format — PNG for transparency support, JPEG for smaller file size.
Adjust advanced settings (optional) — set thinking level, media resolution, or enable search grounding.
Run — submit and download your generated image.

Pricing

The total cost is determined by the output image resolution multiplied by the number of output images, plus optional per-request fees for video clip input, web search, and image search grounding.

SKU Prices

SKU	Description	Unit Price
sku_1k	1K resolution output image	$0.08
sku_2k	2K resolution output image	$0.12
sku_4k	4K resolution output image	$0.16
sku_video_clip	Video clip input (per request)	$0.07
sku_web_search	Web search grounding (per request)	$0.014
sku_image_search	Image search grounding (per request)	$0.014

Pricing Formula

cost = (resolution == "2k" ? sku_2k : (resolution == "4k" ? sku_4k : sku_1k)) * images
     + (enable_web_search ? sku_web_search : 0)
     + (enable_image_search ? sku_image_search : 0)
     + (len(video_clips) > 0 ? sku_video_clip : 0)

Examples:

Resolution	Video Clip	Web Search	Image Search	Total Cost
1K	Yes	No	No	$0.08 + $0.07 = $0.15
2K	Yes	No	No	$0.12 + $0.07 = $0.19
4K	Yes	No	No	$0.16 + $0.07 = $0.23
1K	Yes	Yes	No	$0.08 + $0.07 + $0.014 = $0.164
1K	Yes	Yes	Yes	$0.08 + $0.07 + $0.014 + $0.014 = $0.178
1K	No	No	No	$0.08
2K	No	No	No	$0.12
4K	No	No	No	$0.16

The video clip fee ($0.07), web search fee ($0.014), and image search fee ($0.014) are each charged once per request when the respective feature is enabled, regardless of content volume.
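
The pricing formula can be sketched in Python to estimate a request's cost ahead of time. The prices come from the SKU table; the function name `estimate_cost` is hypothetical, and reading `images` in the formula as the number of output images is an assumption. `Decimal` is used so that sums such as $0.178 are not distorted by floating-point rounding.

```python
from decimal import Decimal

# Unit prices copied from the SKU table
RESOLUTION_SKU = {"1k": Decimal("0.08"), "2k": Decimal("0.12"), "4k": Decimal("0.16")}
SKU_VIDEO_CLIP = Decimal("0.07")
SKU_WEB_SEARCH = Decimal("0.014")
SKU_IMAGE_SEARCH = Decimal("0.014")


def estimate_cost(resolution: str = "1k", num_images: int = 1, video_clip: bool = True,
                  web_search: bool = False, image_search: bool = False) -> Decimal:
    """Estimate the cost of one request in USD."""
    # As in the formula, any resolution other than 2k or 4k is billed at the 1K rate.
    cost = RESOLUTION_SKU.get(resolution, RESOLUTION_SKU["1k"]) * num_images
    if video_clip:
        cost += SKU_VIDEO_CLIP
    if web_search:
        cost += SKU_WEB_SEARCH
    if image_search:
        cost += SKU_IMAGE_SEARCH
    return cost


print(estimate_cost("1k"))                                      # 0.15
print(estimate_cost("4k"))                                      # 0.23
print(estimate_cost("1k", web_search=True, image_search=True))  # 0.178
```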

Best Use Cases

Video Thumbnail Generation — Automatically create compelling thumbnails that reflect the video's content and mood.
Promotional Posters — Generate movie-style or campaign posters grounded in actual video footage.
Scene Summarization Art — Create visual summaries or highlight artwork from long-form video content.
Brand Content Creation — Produce consistent image assets from brand video campaigns.
Educational Infographics — Transform instructional videos into static visual materials.
Social Media Assets — Generate platform-optimized images (vertical, square, landscape) from video content.

Pro Tips

Use low FPS (0.5–1) for long videos to keep token usage within limits while still capturing key frames.
Set precise start/end times to focus the model on the most relevant segment of your video.
Combine specific text prompts with the video input — vague prompts may produce generic results.
Add reference images alongside the video to guide composition style more precisely.
Use thinking_level: high for complex scene interpretations or when visual fidelity matters most.
Set media_resolution: low when analyzing long videos to allow more frames within the context window.
2K offers excellent quality at a reasonable price — only $0.04 more than 1K per image.
YouTube URLs are supported directly — no need to download and re-upload public videos.
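
One way to apply the low-FPS tip is to derive the sampling rate from a target frame count. In the sketch below, `pick_fps` and the notion of a `frame_budget` are hypothetical; the documentation does not publish a frame or token limit, so the budget is something you would choose yourself.

```python
def pick_fps(duration_seconds: float, frame_budget: int, max_fps: float = 24.0) -> float:
    """Return the highest sampling rate that keeps the clip within `frame_budget` frames.

    The result is capped at `max_fps`, the upper end of the documented 0-24 range.
    """
    if duration_seconds <= 0 or frame_budget <= 0:
        raise ValueError("duration_seconds and frame_budget must be positive")
    return min(max_fps, frame_budget / duration_seconds)


# A 10-minute video with a self-imposed budget of 300 frames
print(pick_fps(600, 300))  # 0.5
```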

Notes

Both prompt and video_clips are required fields.
Maximum video clips: 1 per request.
HTTP video URLs are limited to 15MB; use YouTube URLs for larger videos.
Maximum additional reference images: 10.
FPS range: 0–24. Higher FPS captures more frames but consumes more tokens.
The video clip fee ($0.07) is a flat per-request charge, not per frame or per second.
If aspect_ratio is not selected, the model uses a default ratio.
4K resolution costs 2× the standard 1K rate.
Ensure your content and prompts comply with Google's Safety Guidelines.
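
The request-level constraints in these notes can be checked locally before calling the API. This is a minimal sketch: `validate_request` is a hypothetical helper, it covers only the top-level limits (required fields, clip count, image count), and the server may apply further checks that are not listed here.

```python
def validate_request(payload: dict) -> list:
    """Return a list of problems found in a request body; an empty list means none were found."""
    problems = []
    if not payload.get("prompt"):
        problems.append("prompt is required")
    clips = payload.get("video_clips") or []
    if len(clips) == 0:
        problems.append("video_clips is required")
    elif len(clips) > 1:
        problems.append("at most 1 video clip is allowed per request")
    if len(payload.get("images") or []) > 10:
        problems.append("at most 10 reference images are allowed")
    return problems


# Placeholder request body with no prompt and too many reference images
payload = {
    "model": "google/nano-banana-2/reference-to-image",
    "prompt": "",
    "video_clips": [{"url": "https://example.com/video.mp4", "start": 0, "ends": 0, "fps": 1}],
    "images": ["https://example.com/image.jpg"] * 11,
}
print(validate_request(payload))
```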

Nano Banana 2 Edit — Edit images using text prompts and reference images (no video input).
Nano Banana 2 Text-to-Image — Generate images from text prompts only.
Nano Banana Pro Edit — Pro tier editing with enhanced quality.
Nano Banana Pro Text-to-Image — Pro tier image generation.

Explore Similar Models

Nano Banana 2 Lite Edit Developer

Google's fastest and most cost-efficient Nano Banana image model for editing, applying natural-language edits and multi-image composition to up to 14 reference images with low latency.

Nano Banana 2 Lite Text-to-Image Developer

Google's fastest and most cost-efficient Nano Banana image model, turning natural-language text prompts into high-quality 1k images in as little as 4 seconds for rapid, high-volume generation.

Nano Banana 2 Lite Edit

Nano Banana Lite is the efficiency-focused model in the image generation family. It offers sub-2-second latency with cost-effective generation and editing, fast multi-turn local edits, and 14 supported aspect ratios.

Nano Banana 2 Lite Text-to-image

Nano Banana 2 Reference to Image Developer

Google's advanced AI-powered video-to-image generation model, designed to generate high-quality static images from video clips combined with text instructions.

Nano Banana 2 Text-to-Image Developer

Google's lightweight yet powerful AI image generation model, built for creators who need fast, high-quality visuals from simple text prompts.

Nano Banana 2 Text-to-Image

Google's lightweight yet powerful AI image generation model, built for creators who need fast, high-quality visuals from simple text prompts.

Nano Banana 2 Edit Developer

Google's advanced AI-powered image editing and generation model, designed to make visual transformation as intuitive as describing it in words.

Nano Banana 2 Edit

Google's advanced AI-powered image editing and generation model, designed to make visual transformation as intuitive as describing it in words.

Nano Banana Pro Text-to-image Ultra

Nano Banana Pro is the next-generation Nano Banana image model, delivering sharper detail, richer color control, and faster diffusion for production-ready visuals.

Nano Banana Pro Edit Ultra

Nano Banana Pro Edit is an image editing tool built on the Nano Banana model family, designed for precise, AI-powered visual adjustments.

Nano Banana Pro Text-to-image

Nano Banana Pro is the next-generation Nano Banana image model, delivering sharper detail, richer color control, and faster diffusion for production-ready visuals.

From $0.14/image