google/nano-banana-pro/text-to-image-developer


Nano Banana Pro Text-to-Image Developer API by Google


Open and Advanced Large-Scale Image Generative Models.


Each run costs $0.07. $10 covers about 142 runs.
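The per-run price makes budget estimates a short calculation. The sketch below assumes the flat $0.07 quoted here applies to every run; the page does not say whether resolution or web search changes the price. `runs_for_budget` is a hypothetical helper name.

```python
from decimal import Decimal

PRICE_PER_RUN = Decimal("0.07")  # USD per run, as quoted on this page


def runs_for_budget(budget_usd: str) -> int:
    """Return how many whole runs fit in the budget (rounds down)."""
    return int(Decimal(budget_usd) // PRICE_PER_RUN)


print(runs_for_budget("10"))  # 142
```

`Decimal` is used instead of `float` so that amounts such as 0.07 are represented exactly.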


Code example
import os
import time

import requests

# Read the key from the environment. A literal "$ATLASCLOUD_API_KEY" inside a
# Python string is not expanded, so it would be sent verbatim instead of the key.
API_KEY = os.environ["ATLASCLOUD_API_KEY"]
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}",
}

# Step 1: Start image generation
generate_url = "https://api.atlascloud.ai/api/v1/model/generateImage"
data = {
    "model": "google/nano-banana-pro/text-to-image-developer",  # Required. Model name
    "aspect_ratio": "1:1",  # Aspect ratio of the generated image, e.g. 1:1 | 16:9 | 9:16
    "enable_base64_output": False,  # If enabled, the output is a BASE64 string instead of a URL
    "enable_sync_mode": False,  # If true, the request waits for the result before returning
    "enable_web_search": False,  # If enabled, web search grounds the generation with real-time information
    "prompt": "A beautiful landscape with mountains and lake",  # Required. The positive prompt
    "resolution": "1k",  # Resolution of the output image. Options: 1k | 2k | 4k
}

generate_response = requests.post(generate_url, headers=headers, json=data)
generate_response.raise_for_status()  # Surface HTTP errors before reading the body
prediction_id = generate_response.json()["data"]["id"]

# Step 2: Poll for result
poll_url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"

def check_status():
    while True:
        response = requests.get(poll_url, headers=headers)
        result = response.json()
        status = result["data"]["status"]

        # Both "completed" and "succeeded" are documented as success states
        if status in ("completed", "succeeded"):
            print("Generated image:", result["data"]["outputs"][0])
            return result["data"]["outputs"][0]
        elif status == "failed":
            # .get() avoids a KeyError when the error field is absent
            raise Exception(result["data"].get("error") or "Generation failed")
        else:
            # Still processing, wait 2 seconds
            time.sleep(2)

image_url = check_status()

Installation

Install the required dependency.

pip install requests

Authentication

All API requests must be authenticated with an API key. You can get an API key from the Atlas Cloud console.

export ATLASCLOUD_API_KEY="your-api-key-here"

HTTP headers

import os

API_KEY = os.environ.get("ATLASCLOUD_API_KEY")
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}

Keep your API key safe

Never expose your API key in client-side code or public repositories. Use environment variables or a backend proxy.
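One way to follow this advice is to build the headers in a single place and fail early when the key is missing, rather than sending an empty `Bearer` token. This is a minimal sketch; `auth_headers` is a hypothetical helper, not part of any Atlas Cloud library.

```python
import os


def auth_headers(env_var: str = "ATLASCLOUD_API_KEY") -> dict:
    """Build request headers, failing early if the key is not set."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```

Call it once at startup, for example `headers = auth_headers()`, and pass the result to every `requests` call.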

Submit a request

import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateImage"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; "$ATLASCLOUD_API_KEY" is not expanded in Python
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}",
}
data = {
    "model": "google/nano-banana-pro/text-to-image-developer",
    "prompt": "A beautiful landscape"
}

response = requests.post(url, headers=headers, json=data)
print(response.json())

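Submit a request

Submit an asynchronous generation request. The API returns a prediction ID that you can use to check the status and retrieve the result.

POST /api/v1/model/generateImage

Request body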

import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateImage"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; "$ATLASCLOUD_API_KEY" is not expanded in Python
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}",
}

data = {
    "model": "google/nano-banana-pro/text-to-image-developer",
    "prompt": "A beautiful landscape with mountains and lake"
}

response = requests.post(url, headers=headers, json=data)
result = response.json()

print(f"Prediction ID: {result['data']['id']}")
print(f"Status: {result['data']['status']}")

Response

{
  "code": 200,
  "data": {
    "id": "pred_abc123",
    "status": "processing",
    "model": "model-name",
    "created_at": "2025-01-01T00:00:00Z"
  }
}
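Reading `data.id` directly raises an unhelpful `KeyError` when a request is rejected. The sketch below validates the envelope shown above before returning the ID. The shape of an error response is not documented on this page, so the check only assumes that a failed submit lacks `data.id`; `extract_prediction_id` is a hypothetical helper.

```python
def extract_prediction_id(payload: dict) -> str:
    """Return data.id from a submit response, or raise with context."""
    data = payload.get("data")
    if not isinstance(data, dict) or not data.get("id"):
        raise ValueError(
            f"No prediction ID in response (code={payload.get('code')})"
        )
    return data["id"]


sample = {"code": 200, "data": {"id": "pred_abc123", "status": "processing"}}
print(extract_prediction_id(sample))  # pred_abc123
```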

Check status

Poll the prediction endpoint to check the current status of a request.

GET /api/v1/model/prediction/{prediction_id}

Polling example

import os
import time

import requests

prediction_id = "pred_abc123"
url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"
# Read the key from the environment; "$ATLASCLOUD_API_KEY" is not expanded in Python
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}

while True:
    response = requests.get(url, headers=headers)
    result = response.json()
    status = result["data"]["status"]
    print(f"Status: {status}")

    if status in ["completed", "succeeded"]:
        output_url = result["data"]["outputs"][0]
        print(f"Output URL: {output_url}")
        break
    elif status == "failed":
        print(f"Error: {result['data'].get('error', 'Unknown')}")
        break

    # Still processing, wait 3 seconds before the next poll
    time.sleep(3)

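Status values

processing: the request is still being processed.

completed: generation finished; the output is available.

succeeded: generation succeeded; the output is available.

failed: generation failed; check the error field.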
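The status values above map naturally onto a polling loop with a deadline, so that a stuck prediction does not block forever. The sketch below is one way to write it, not the provider's client: `wait_for_outputs` is a hypothetical helper, the 120-second timeout is an arbitrary assumption, and the HTTP call is passed in as `fetch` so the loop can be exercised without network access.

```python
import time
from typing import Callable

DONE = {"completed", "succeeded"}  # both are listed as success states


def wait_for_outputs(fetch: Callable[[], dict], timeout_s: float = 120.0,
                     interval_s: float = 2.0, sleep=time.sleep) -> list:
    """Call fetch() until the prediction finishes or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        data = fetch().get("data", {})
        status = data.get("status")
        if status in DONE:
            return data.get("outputs", [])
        if status == "failed":
            raise RuntimeError(data.get("error") or "Generation failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still '{status}' after {timeout_s} seconds")
        sleep(interval_s)
```

With `requests`, `fetch` could be `lambda: requests.get(url, headers=headers, timeout=30).json()`.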

Completed response

{
  "data": {
    "id": "pred_abc123",
    "status": "completed",
    "outputs": [
      "https://storage.atlascloud.ai/outputs/result.png"
    ],
    "metrics": {
      "predict_time": 8.3
    },
    "created_at": "2025-01-01T00:00:00Z",
    "completed_at": "2025-01-01T00:00:10Z"
  }
}

Upload files

Upload a file to Atlas Cloud storage to get a URL you can use in API requests. Uploads use multipart/form-data.

POST /api/v1/model/uploadMedia

Upload example

import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/uploadMedia"
# Read the key from the environment; "$ATLASCLOUD_API_KEY" is not expanded in Python
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}

# requests sets the multipart/form-data Content-Type header itself
with open("image.png", "rb") as f:
    files = {"file": ("image.png", f, "image/png")}
    response = requests.post(url, headers=headers, files=files)

result = response.json()
download_url = result["data"]["download_url"]
print(f"File URL: {download_url}")

Response

{
  "data": {
    "download_url": "https://storage.atlascloud.ai/uploads/abc123/image.png",
    "file_name": "image.png",
    "content_type": "image/png",
    "size": 1024000
  }
}
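The upload example hard-codes the file name and content type. For arbitrary files, both can be derived from the path with the standard library. This is a sketch under the assumption that the endpoint accepts whatever MIME type the file extension implies; `upload_field` is a hypothetical helper.

```python
import mimetypes
from pathlib import Path


def upload_field(path: str) -> tuple:
    """Return (file_name, content_type) for the multipart `file` field."""
    name = Path(path).name
    content_type, _ = mimetypes.guess_type(name)
    return name, content_type or "application/octet-stream"


print(upload_field("photos/image.png"))  # ('image.png', 'image/png')
```

The result slots into the upload call as `files = {"file": (name, f, content_type)}`.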

Input Schema

The request body accepts the following parameters.

Total: 7 | Required: 2 | Optional: 5

model (string, required)

Model name.

Default: "google/nano-banana-pro/text-to-image-developer"

aspect_ratio (string)

The aspect ratio of the generated media.

Options: 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9

enable_base64_output (boolean)

If enabled, the output will be encoded into a BASE64 string instead of a URL. This property is only available through the API.

Default: false

enable_sync_mode (boolean)

If set to true, the function will wait for the result to be generated and uploaded before returning the response. It allows you to get the result directly in the response. This property is only available through the API.

Default: false

enable_web_search (boolean)

If enabled, the model will use web search to ground the generation with real-time information.

Default: false

prompt (string, required)

The positive prompt for the generation.

resolution (string)

The resolution of the output image.

Default: "1k"

Options: 1k, 2k, 4k
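When `enable_base64_output` is set, the output has to be decoded before it can be written to disk. This page does not show what the encoded value looks like, so the sketch below assumes it is either bare base64 or a `data:` URI and handles both; `decode_base64_output` is a hypothetical helper.

```python
import base64


def decode_base64_output(value: str) -> bytes:
    """Decode a base64 image string, with or without a data-URI prefix."""
    if value.startswith("data:") and "," in value:
        value = value.split(",", 1)[1]  # drop e.g. "data:image/png;base64,"
    return base64.b64decode(value)


encoded = base64.b64encode(b"\x89PNG").decode("ascii")
assert decode_base64_output(encoded) == b"\x89PNG"
assert decode_base64_output("data:image/png;base64," + encoded) == b"\x89PNG"
```

The returned bytes can be saved with `open("result.png", "wb").write(...)`.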

Example request body

{
  "model": "google/nano-banana-pro/text-to-image-developer",
  "enable_base64_output": false,
  "enable_sync_mode": false,
  "enable_web_search": false,
  "prompt": "A beautiful landscape",
  "resolution": "1k"
}
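Because `aspect_ratio` and `resolution` only accept the listed options, it can help to validate a request body locally before spending a run on it. The sketch below uses the option lists from this page; `build_request_body` is a hypothetical helper, and whether the server rejects or silently ignores invalid values is not documented here.

```python
from typing import Optional

MODEL_ID = "google/nano-banana-pro/text-to-image-developer"
ASPECT_RATIOS = {"1:1", "3:2", "2:3", "3:4", "4:3", "4:5", "5:4",
                 "9:16", "16:9", "21:9"}
RESOLUTIONS = {"1k", "2k", "4k"}


def build_request_body(prompt: str, aspect_ratio: Optional[str] = None,
                       resolution: str = "1k",
                       enable_web_search: bool = False) -> dict:
    """Return a request body, raising ValueError on invalid options."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if aspect_ratio is not None and aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"Unsupported aspect_ratio: {aspect_ratio!r}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"Unsupported resolution: {resolution!r}")
    body = {
        "model": MODEL_ID,
        "prompt": prompt,
        "resolution": resolution,
        "enable_web_search": enable_web_search,
    }
    if aspect_ratio is not None:  # optional, so omit it when not chosen
        body["aspect_ratio"] = aspect_ratio
    return body


print(build_request_body("A beautiful landscape", aspect_ratio="16:9"))
```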

Output Schema

The API returns a prediction response containing the URLs of the generated outputs.

created_at (string)

ISO timestamp of when the request was created (e.g., "2023-04-01T12:34:56.789Z").

id (string)

Unique identifier for the prediction, the ID of the prediction to get.

model (string)

Model ID used for the prediction.

outputs (array)

Array of URLs to the generated content (empty when status is not completed).

status (string)

Status of the task: created, processing, completed, or failed.

Example response

{
  "id": "pred_abc123",
  "status": "completed",
  "model": "model-name",
  "outputs": [
    "https://storage.atlascloud.ai/outputs/result.png"
  ],
  "metrics": {
    "predict_time": 8.3
  },
  "created_at": "2025-01-01T00:00:00Z",
  "completed_at": "2025-01-01T00:00:10Z"
}
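The example above shows the prediction fields at the top level, while the submit and polling examples earlier on this page wrap them in a `data` object. A parser that tolerates both shapes avoids depending on which one a given endpoint returns. This is a sketch; `Prediction` and `parse_prediction` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Prediction:
    id: str
    status: str
    outputs: list = field(default_factory=list)


def parse_prediction(payload: dict) -> Prediction:
    """Accept either a bare prediction object or one wrapped in "data"."""
    body = payload.get("data", payload)
    return Prediction(
        id=body.get("id", ""),
        status=body.get("status", ""),
        outputs=list(body.get("outputs") or []),
    )


bare = {"id": "pred_abc123", "status": "completed",
        "outputs": ["https://storage.atlascloud.ai/outputs/result.png"]}
assert parse_prediction(bare) == parse_prediction({"data": bare})
```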

Atlas Cloud Skills

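Atlas Cloud Skills integrates 400+ AI models directly into your AI coding assistant. Install it with one command, then generate images and videos or chat with LLMs in natural language.

Supported clients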

Claude Code

OpenAI Codex

Gemini CLI

Cursor

Windsurf

VS Code

Trae

GitHub Copilot

Cline

Roo Code

Amp

Goose

Replit

40+ supported clients

Installation

npx skills add AtlasCloudAI/atlas-cloud-skills

Set the API key

Get an API key from the Atlas Cloud console and set it as an environment variable.

export ATLASCLOUD_API_KEY="your-api-key-here"

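Features

Once installed, you can access all Atlas Cloud models from your AI assistant using natural language.

Image generation: generate images with models such as Nano Banana 2 and Z-Image.

Video creation: create videos from text or images with Kling, Vidu, Veo and others.

LLM chat: chat with Qwen, DeepSeek and other large language models.

Media upload: upload local files for image editing and image-to-video workflows.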

MCP Server

Atlas Cloud MCP Server connects your IDE to 400+ AI models through the Model Context Protocol. It works with any MCP-compatible client.

Supported clients

Cursor

VS Code

Windsurf

Claude Code

OpenAI Codex

Gemini CLI

Cline

Roo Code

100+ supported clients

Installation

npx -y atlascloud-mcp

Configuration

Add the following configuration to your IDE's MCP settings file.

{
  "mcpServers": {
    "atlascloud": {
      "command": "npx",
      "args": [
        "-y",
        "atlascloud-mcp"
      ],
      "env": {
        "ATLASCLOUD_API_KEY": "your-api-key-here"
      }
    }
  }
}
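If the settings file already lists other MCP servers, the `atlascloud` entry has to be merged in rather than pasted over them. The sketch below does that merge on an in-memory dict. The location of the settings file differs per IDE and is not given on this page, so reading and writing the file is left out; `add_atlascloud_server` and the `other` server entry are hypothetical.

```python
import json


def add_atlascloud_server(config: dict, api_key: str) -> dict:
    """Return a copy of an MCP config with the atlascloud entry added."""
    servers = dict(config.get("mcpServers", {}))
    servers["atlascloud"] = {
        "command": "npx",
        "args": ["-y", "atlascloud-mcp"],
        "env": {"ATLASCLOUD_API_KEY": api_key},
    }
    return {**config, "mcpServers": servers}


# "other" stands in for whatever servers are already configured.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
merged = add_atlascloud_server(existing, "your-api-key-here")
print(json.dumps(merged, indent=2))
```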

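Available tools

atlas_generate_image: generate images from a text prompt.

atlas_generate_video: create videos from text or images.

atlas_chat: chat with large language models.

atlas_list_models: browse the 400+ available AI models.

atlas_quick_generate: one-step content creation that picks the best model automatically.

atlas_upload_media: upload local files for API workflows.

Learn more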

github.com/AtlasCloudAI/mcp-server

API Schema

{
  "components": {
    "schemas": {
      "Input": {
        "properties": {
          "model": {
            "type": "string",
            "description": "model name",
            "default": "google/nano-banana-pro/text-to-image-developer"
          },
          "aspect_ratio": {
            "description": "The aspect ratio of the generated media.",
            "enum": [
              "1:1",
              "3:2",
              "2:3",
              "3:4",
              "4:3",
              "4:5",
              "5:4",
              "9:16",
              "16:9",
              "21:9"
            ],
            "type": "string",
            "x-placeholder": "Select aspect ratio"
          },
          "enable_base64_output": {
            "default": false,
            "description": "If enabled, the output will be encoded into a BASE64 string instead of a URL. This property is only available through the API.",
            "disabled": true,
            "type": "boolean"
          },
          "enable_sync_mode": {
            "default": false,
            "description": "If set to true, the function will wait for the result to be generated and uploaded before returning the response. It allows you to get the result directly in the response. This property is only available through the API.",
            "disabled": true,
            "type": "boolean"
          },
          "enable_web_search": {
            "default": false,
            "description": "If enabled, the model will use web search to ground the generation with real-time information.",
            "type": "boolean"
          },
          "prompt": {
            "description": "The positive prompt for the generation.",
            "type": "string"
          },
          "resolution": {
            "default": "1k",
            "description": "The resolution of the output image.",
            "enum": [
              "1k",
              "2k",
              "4k"
            ],
            "type": "string"
          }
        },
        "required": [
          "model",
          "prompt"
        ],
        "seed": {
          "title": "Seed",
          "type": "integer"
        },
        "type": "object",
        "x-order-properties": [
          "model",
          "prompt",
          "aspect_ratio",
          "resolution",
          "enable_web_search",
          "enable_sync_mode",
          "enable_base64_output"
        ]
      },
      "PredictionResponse": {
        "properties": {
          "created_at": {
            "description": "ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”).",
            "format": "date-time",
            "type": "string"
          },
          "has_nsfw_contents": {
            "description": "Array of boolean values indicating NSFW detection for each output.",
            "items": {
              "type": "boolean"
            },
            "type": "array"
          },
          "id": {
            "description": "Unique identifier for the prediction, the ID of the prediction to get.",
            "type": "string"
          },
          "model": {
            "description": "Model ID used for the prediction.",
            "type": "string"
          },
          "outputs": {
            "description": "Array of URLs to the generated content (empty when status is not completed).",
            "items": {
              "type": "string"
            },
            "type": "array"
          },
          "status": {
            "description": "Status of the task: created, processing, completed, or failed.",
            "type": "string"
          },
          "urls": {
            "description": "Object containing related API endpoints.",
            "type": "object"
          }
        },
        "type": "object"
      }
    },
    "securitySchemes": {
      "apiKeyAuth": {
        "in": "header",
        "name": "Authorization",
        "type": "apiKey"
      }
    }
  },
  "info": {
    "description": "The AtlasCloud API.",
    "title": "AtlasCloud API",
    "version": "1.0.0"
  },
  "openapi": "3.0.0",
  "paths": {
    "/api/v1/model/generateImage": {
      "post": {
        "requestBody": {
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/Input"
              }
            }
          },
          "required": true
        },
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "The request status."
          }
        }
      },
      "x-api-name": "model_run"
    },
    "/api/v1/model/prediction/{request_id}": {
      "get": {
        "parameters": [
          {
            "in": "path",
            "name": "request_id",
            "required": true,
            "schema": {
              "description": "Request ID",
              "type": "string"
            }
          }
        ],
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "Result of the request."
          }
        }
      },
      "x-api-name": "model_result"
    }
  },
  "servers": [
    {
      "url": "https://api.atlascloud.ai"
    }
  ]
}
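Since the schema is machine-readable, the parameter list can be derived from it instead of being maintained by hand. The sketch below walks the `Input` schema and reports which properties are required, their defaults and their allowed values. `summarize_input_schema` is a hypothetical helper, and the `spec` literal is a trimmed copy of the schema above, just enough to run the example.

```python
def summarize_input_schema(spec: dict) -> dict:
    """Map each Input property to its required flag, default and enum."""
    schema = spec["components"]["schemas"]["Input"]
    required = set(schema.get("required", []))
    return {
        name: {
            "required": name in required,
            "default": prop.get("default"),
            "options": prop.get("enum"),
        }
        for name, prop in schema["properties"].items()
    }


spec = {"components": {"schemas": {"Input": {
    "properties": {
        "prompt": {"type": "string"},
        "resolution": {"type": "string", "default": "1k",
                       "enum": ["1k", "2k", "4k"]},
    },
    "required": ["model", "prompt"],
}}}}

summary = summarize_input_schema(spec)
print(summary["resolution"])
# {'required': False, 'default': '1k', 'options': ['1k', '2k', '4k']}
```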

LLM-friendly prompt template

# google/nano-banana-pro/text-to-image-developer

> Open and Advanced Large-Scale Image Generative Models.


## Overview

- **Submit endpoint (POST)**: `https://api.atlascloud.ai/api/v1/model/generateImage` — start an async generation; returns a `prediction_id`
- **Poll endpoint (GET)**: `https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}` — poll this until the prediction finishes
- **Model ID**: `google/nano-banana-pro/text-to-image-developer`


## API Information

This model can be used via our HTTP API or more conveniently via our client libraries.
See the input and output schema below, as well as the usage examples.


### Input Schema

The API accepts the following input parameters:

- **`model`** (`string`, _required_):
  model name
  - Default: `"google/nano-banana-pro/text-to-image-developer"`

- **`prompt`** (`string`, _required_):
  The positive prompt for the generation.

- **`aspect_ratio`** (`string`, _optional_):
  The aspect ratio of the generated media.
  - Options: "1:1", "3:2", "2:3", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"

- **`resolution`** (`string`, _optional_):
  The resolution of the output image.
  - Default: `"1k"`
  - Options: "1k", "2k", "4k"

- **`enable_web_search`** (`boolean`, _optional_):
  If enabled, the model will use web search to ground the generation with real-time information.
  - Default: `false`

- **`enable_sync_mode`** (`boolean`, _optional_):
  If set to true, the function will wait for the result to be generated and uploaded before returning the response. It allows you to get the result directly in the response. This property is only available through the API.
  - Default: `false`

- **`enable_base64_output`** (`boolean`, _optional_):
  If enabled, the output will be encoded into a BASE64 string instead of a URL. This property is only available through the API.
  - Default: `false`



**Required Parameters Example**:

```json
{
  "model": "google/nano-banana-pro/text-to-image-developer",
  "prompt": ""
}
```


**Full Example**:

```json
{
  "model": "google/nano-banana-pro/text-to-image-developer",
  "prompt": "",
  "aspect_ratio": "1:1",
  "resolution": "1k",
  "enable_web_search": false,
  "enable_sync_mode": false,
  "enable_base64_output": false
}
```


### Output Schema

The API returns the following output format:


- **`created_at`** (`string`, _optional_):
  ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”).

- **`has_nsfw_contents`** (`array[boolean]`, _optional_):
  Array of boolean values indicating NSFW detection for each output.

- **`id`** (`string`, _optional_):
  Unique identifier for the prediction, the ID of the prediction to get.

- **`model`** (`string`, _optional_):
  Model ID used for the prediction.

- **`outputs`** (`array[string]`, _optional_):
  Array of URLs to the generated content (empty when status is not completed).

- **`status`** (`string`, _optional_):
  Status of the task: created, processing, completed, or failed.

- **`urls`** (`object`, _optional_):
  Object containing related API endpoints.



**Example Response**:

```json
{
  "created_at": "",
  "has_nsfw_contents": [],
  "id": "",
  "model": "",
  "outputs": [
    ""
  ],
  "status": "",
  "urls": {}
}
```


## Usage Examples

### cURL

```bash
# Step 1: Start generation (async)
curl -X POST "https://api.atlascloud.ai/api/v1/model/generateImage" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "google/nano-banana-pro/text-to-image-developer",
  "prompt": "",
  "aspect_ratio": "1:1",
  "resolution": "1k",
  "enable_web_search": false,
  "enable_sync_mode": false,
  "enable_base64_output": false
}'

# Response will contain: {"code": 200, "data": {"id": "prediction_id", "status": "processing"}}

# Step 2: Poll for result (replace {prediction_id} with the id returned above)
curl -X GET "https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY"

# Keep polling until status is "completed", "succeeded" or "failed"
# When completed, outputs will contain the generated content URL(s)
```

## Additional Resources

### Documentation

- [Model Playground](https://www.atlascloud.ai/models/google/nano-banana-pro/text-to-image-developer)

Create a vibrant and modern magazine cover for Women’s Health, themed for April 2025. The main background is a warm, orange gradient with soft shadows, evoking a fresh spring mood. Centered is a stylish young woman sitting confidently on color-blocked orange cubes. She has long, voluminous, wavy blonde hair and a natural, glowing complexion. She’s dressed in a forest green zip-up windbreaker jacket with loose sleeves and an orange top underneath, paired with white athletic crew socks branded ‘SAMOLA’ and retro-style white sneakers with thick black stripes and tan soles. One leg is propped up, creating a confident, athletic pose. Her expression is calm and poised. Include magazine headlines in stylish fonts, balancing black, white, and lime green text, placed thoughtfully around the subject: • Top left: ‘Covid: five years on’ in pale lime green with subtext in black: ‘Has the pandemic reshaped your identity?’ • Top right: ‘Spring forward’ in bold black with subtext: ‘How to eat, travel and sweat for your healthiest season yet’ • Center right: ‘15 skincare habits beauty founders swear by’ with large lime green ‘15’ • Bottom left: ‘FAKE VIEWS: Inside the scroll holes telling women how to “fix” themselves’ in black and pale pink • Bottom left corner with a green plus sign: ‘The workout that experts are calling a magic pill’ • Bottom right over the box: ‘Em the nutritionist’ in elegant white serif font, with yellow subheading: ‘In the kitchen with wellness’s favourite foodie’ Design should reflect an empowering, clean, editorial style, with an emphasis on health, wellness, and bold femininity. Lighting should be studio-bright, shadows soft and controlled.

Plant vs. Zombies 3D Peashooter Picture

Same female character, 3-angle turnaround (front, side, 3/4), consistent face and lighting, detailed outfit texture, animation design sheet.

Generate a screenshot of a windows 11 desktop, with google chrome open, showing a YouTube thumbnail of Mr. Beast on YouTube.com

Please create a solution layout for a mathematics problem with a paper-texture background. Requirements: split the canvas into left and right sections—\emph{left:} schematic of the plan (arrows/notes, scale, directions); \emph{right:} step-by-step derivation. Use consistent annotations in the figure: known quantities, unknowns, key relations, and coordinate axes or normals. Box the final answer and include a check. \textbf{Problem:} Given $v_0=20\,\text{m/s}$ and $\theta=30^\circ$, find the time of flight, the maximum height, and the range, and output the position at $t=1\,\text{s}$. Take $g=10\,\text{m/s}^2$. draw the question and solution.

A perfectly reflective chrome (Chrome) mirror ball placed on a black and white checkerboard.

High-quality flat lay photography creating a DIY infographic that simply explains how solar energy works, arranged on a clean, light gray textured background. The visual story flows from left to right in clear steps. Simple, clean black arrows are hand-drawn onto the background to guide the viewer's eye from the sun to the house, clearly marking the flow of energy. The overall mood is educational, modern, and easy to understand. The image is shot from a top-down, bird's-eye view with soft, even lighting that minimizes shadows and keeps the focus on the process. Format 16:9

An intense, cinematic 3D animation style render of a boxing match taking place inside a large sizzling frying pan. The main characters are an anthropomorphic French fry wearing a red, white, and blue sweatband and boxing gloves, fighting against an anthropomorphic onion wearing a blue wrestling singlet and a mustache. They are in a dynamic fighting pose with hot oil splashing around their feet and flying vegetable particles. In the background, a crowd of anthropomorphic burgers, potatoes, and hot dogs are watching and cheering. The lighting is professional studio lighting with a kitchen background, high quality, octane render, hyper-realistic food textures, 8k resolution.

actor standing on set surrounded by two large cinema cameras, LED walls behind creating a sci-fi backdrop illusion, crew marking positions on the floor, realistic production lighting, ultra-real cinematic style

behind-the-scenes of a high-end commercial shoot, fashion model under giant soft lights, photographers, stylists fixing details, production assistants holding reflectors, studio filled with pro equipment, crisp realistic image

Generate a black-and-white comic for me about a Japanese high school student being late for school.

a realistic person blended into an artistic collage of textures, shapes, torn paper layers, bold contrasting typography, experimental poster layout, overlapping elements, vibrant color palette, modern graphic design aesthetic, high-resolution details


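Advanced image generation

Multi-image fusion
Character consistency across generations
Style-preserving transformation
High-resolution output up to 4K

Intelligent editing tools

Text-based intelligent editing
Object addition and removal
Background replacement
Style transfer and artistic effects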

Transform to Figure

Photo to Character Figure

Transform any photo into a realistic character figure with packaging and display

Prompt

turn this photo into a character figure. Behind it, place a box with the character's image printed on it, and a computer showing the Blender modeling process on its screen. In front of the box, add a round plastic base with the character figure standing on it. set the scene indoors if possible

Anime to Real

Anime to Cosplay

Transform anime illustrations into realistic cosplay photography

Prompt

Generate a highly detailed photo of a girl cosplaying this illustration, at Comiket. Exactly replicate the same pose, body posture, hand gestures, facial expression, and camera framing as in the original illustration. Keep the same angle, perspective, and composition, without any deviation

Photo to Action Figure

Person to Action Figure

Transform people from photos into collectible action figures with custom packaging

Prompt

Transform the person in the photo into an action figure, styled after [CHARACTER_NAME] from [SOURCE / CONTEXT]. Next to the figure, display the accessories including [ITEM_1], [ITEM_2], and [ITEM_3]. On the top of the toy box, write "[BOX_LABEL_TOP]", and underneath it, "[BOX_LABEL_BOTTOM]". Place the box in a [BACKGROUND_SETTING] environment. Visualize this in a highly realistic way with attention to fine details.
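The bracketed placeholders in this prompt need to be replaced before the prompt is sent. A small substitution helper makes it hard to leave one unfilled by accident. This is a sketch, assuming placeholders are written in upper case inside square brackets as they are here; `fill_prompt` is a hypothetical helper and the sample values are made up.

```python
import re


def fill_prompt(template: str, values: dict) -> str:
    """Replace [NAME] placeholders; raise if any value is missing."""
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"No value for placeholder [{key}]")
        return values[key]

    # Upper-case letters, digits, underscores, spaces and "/" cover
    # names such as [ITEM_1] and [SOURCE / CONTEXT].
    return re.sub(r"\[([A-Z0-9_ /]+)\]", substitute, template)


template = "styled after [CHARACTER_NAME] from [SOURCE / CONTEXT]"
print(fill_prompt(template, {
    "CHARACTER_NAME": "a space explorer",
    "SOURCE / CONTEXT": "a retro sci-fi comic",
}))
# styled after a space explorer from a retro sci-fi comic
```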

Photo to Funko Pop

Person to Funko Pop Figure

Transform photos into Funko Pop style collectible figures with custom packaging

Prompt

Transform the person in the photo into the style of a Funko Pop figure packaging box, presented in an isometric perspective. Label the packaging with the title 'ZHOGUE'. Inside the box, showcase the figure based on the person in the photo, accompanied by their essential items (such as cosmetics, bags, or others). Next to the box, also display the actual figure itself outside of the packaging, rendered in a realistic and lifelike style.

Design to Reality

Product Design to Photorealistic Render

Transform product design sketches into photorealistic renders

Prompt

turn this illustration of a perfume into a realistic version, Frosted glass bottle with a marble cap

Face Reference Control

Transform to Q-Version Character

Create cartoon characters with face shape reference control

Prompt

Transform the person from image 1 into a Q-version character design based on the face shape from image 2

Architecture to Model

Building to 3D Architecture Model

Convert architectural photos into detailed physical models

Prompt

convert this photo into an architecture model. Behind the model, there should be a cardboard box with an image of the architecture from the photo on it. There should also be a computer, with the content on the computer screen showing the Blender modeling process of the figurine. In front of the cardboard box, place a cardstock and put the architecture model from the photo I provided on it. I hope the PVC material can be clearly presented. It would be even better if the background is indoors.

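Technical highlights

Performance

Lightning-fast generation

Optimized for speed: most tasks generate in under 2 seconds, which suits real-time applications and rapid prototyping workflows.

Quality

Exceptional output quality

Built on Google's advanced AI architecture, it generates highly detailed, realistic images with accurate lighting, texture and composition.

Innovation

Novel view synthesis

2D-to-3D conversion that creates multiple viewpoints from a single image, opening new possibilities for content creation.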

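Use cases

📸 Product photography

🎨 Digital art creation

✨ Photo enhancement

📊 Marketing visuals

👤 Character design

👔 Virtual try-on

📱 Social media

🔄 Photo restoration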

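Why choose Nano Banana?

🚀 No setup required

Start creating right away, with no complex configuration or installation.

🎯 Precise control

Fine-tune every aspect of your creation with intuitive text commands.

🔄 Consistent results

Keep characters and style consistent across multiple generations.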

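Technical specifications

Model architecture: powered by Google AI Studio

Processing speed: average generation time < 2 seconds

Resolution support: up to 4096x4096 pixels

Format support: PNG, JPEG and WebP output

Multimodal input: text, image and combined prompts

API integration: RESTful API with full documentation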

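Experience the power of Nano Banana AI

Join thousands of creators and businesses transforming their visual content with Google's most advanced image AI technology.

✨ Free credits to start

⚡ Instant access

🌐 Available anywhere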

Nano Banana Pro: A state-of-the-art, multimodal reasoning and image generation model by Google DeepMind

Model Card Overview

Field	Description
Model Name	Nano Banana Pro (also known as Gemini 3 Pro Image)
Developer	Google DeepMind
Release Date	November 20, 2025
Model Type	Multimodal Reasoning and Image Generation
Related Links	Official Product Page, Model Card (PDF)

Introduction

Nano Banana Pro, officially designated as Gemini 3 Pro Image, represents the next generation in Google's series of highly-capable, natively multimodal models. It is designed for professional asset production, integrating the advanced reasoning capabilities of the Gemini 3 Pro foundation model with a sophisticated image generation engine. The primary goal of Nano Banana Pro is to provide users with studio-quality precision and control, enabling the creation of complex, high-fidelity visuals from textual and image-based prompts. Its core contribution lies in its ability to understand and execute intricate instructions, maintain character and scene consistency, and render legible text directly within generated images, setting a new standard for professional creative workflows.

Key Features & Innovations

Nano Banana Pro introduces several technical breakthroughs that distinguish it from prior models:

Superior Text Rendering: The model excels at generating images that contain clear, accurate, and stylistically coherent text, making it ideal for creating posters, diagrams, and marketing materials.
Advanced Creative Controls: Users can exercise fine-grained control over image outputs, including camera angles, lighting transformations (e.g., day to night), color grading, depth of field, and localized editing.
High-Fidelity Consistency: It can maintain the consistency of up to 14 input images and blend up to 5 distinct characters seamlessly into complex compositions, ensuring visual coherence across a series of generated images.
Deep Real-World Knowledge: Built on Gemini 3 Pro, the model leverages a vast understanding of the world to generate contextually rich and factually grounded visuals, from detailed infographics to historically accurate scenes.
Multilingual Capabilities: The model can accurately render and translate text across multiple languages within an image, facilitating the localization of visual content.
Complex Composition from Multiple Inputs: Nano Banana Pro can synthesize elements from multiple source images and text prompts to create a single, cohesive scene, enabling complex creative concepts.

Model Architecture & Technical Details

Nano Banana Pro's architecture is fundamentally based on the Gemini 3 Pro model. While specific architectural details are not fully disclosed, the following technical information is available:

Foundation Model: Gemini 3 Pro
Inputs: The model accepts text strings and images as input, with a large context window of up to 1 million tokens.
Outputs: It generates high-resolution images (up to 4K) with a 64K token output capacity for handling complex generation tasks.
Training Infrastructure:
- Hardware: The model was trained on Google's custom-designed Tensor Processing Units (TPUs), which are optimized for large-scale machine learning computations and high-bandwidth memory access.
- Software: The training process utilized JAX and ML Pathways, Google's high-performance frameworks for machine learning research.
Knowledge Cutoff: The model's internal knowledge base has a cutoff date of January 2025.

Intended Use & Applications

Nano Banana Pro is intended for professional and creative applications that require a high degree of precision, control, and visual fidelity. It is well-suited for a variety of downstream tasks and application scenarios:

Professional Content Creation: Generating production-ready assets for marketing campaigns, advertising, and branding.
Design and Prototyping: Creating detailed product mockups, storyboards for film and animation, and architectural visualizations.
Informational Graphics: Designing complex and accurate infographics, educational diagrams, and data visualizations.
Artistic and Creative Expression: Enabling artists and designers to explore novel visual styles and create complex, multi-element compositions.

Performance

Nano Banana Pro's performance has been evaluated through extensive human evaluations and benchmarked against other leading image generation models. The results, measured in Elo scores, demonstrate its strong capabilities across a wide range of tasks.

A technical report also notes a performance dichotomy: while the model produces subjectively superior visual quality by hallucinating plausible details, it can lag behind specialist models in traditional quantitative metrics due to the stochastic nature of generative models.

Existing Capabilities (Elo Score Comparison)

Capability	Gemini 3 Pro Image	Gemini 2.5 Flash Image	GPT-Image 1	Seedream v4 4k	Flux Pro Kontext Max
Text Rendering	1198 ± 18	997 ± 10	1150 ± 14	1019 ± 13	854 ± 13
Stylization	1098 ± 11	933 ± 7	1069 ± 9	991 ± 9	908 ± 11
Multi-Turn	1186 ± 19	1045 ± 24	1079 ± 32	990 ± 32	889 ± 37
General Image Editing	1127 ± 13	996 ± 8	1011 ± 13	965 ± 12	902 ± 13
Character Editing	1176 ± 16	1075 ± 8	1016 ± 10	889 ± 10	843 ± 10
Object/Env. Editing	1102 ± 19	1025 ± 9	930 ± 12	983 ± 13	961 ± 10
General Text-to-Image	1094 ± 16	1037 ± 8	1025 ± 9	1011 ± 9	907 ± 9

New Capabilities (Elo Score Comparison)

Capability	Gemini 3 Pro Image	Gemini 2.5 Flash Image	GPT-Image 1	Seedream v4 4k	Flux Pro Kontext Max
Multi-character Editing	1213 ± 16	950 ± 10	997 ± 13	840 ± 19	-
Chart Editing	1209 ± 18	971 ± 10	994 ± 16	934 ± 16	893 ± 15
Text Editing	1202 ± 23	1001 ± 10	996 ± 14	860 ± 15	943 ± 12
Factuality - Edu	1169 ± 25	1050 ± 11	1084 ± 25	969 ± 22	884 ± 26
Infographics	1268 ± 17	1162 ± 11	1087 ± 12	1049 ± 12	824 ± 15
Visual Design	1104 ± 16	1083 ± 7	1028 ± 11	1038 ± 12	907 ± 11
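To read the Elo gaps in these tables, it helps to convert a rating difference into an expected preference rate. The sketch below uses the standard Elo logistic formula with a 400-point scale. The model card does not state which Elo variant or scale was used in the evaluation, so treat the result as an illustration and not as a reported figure.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected preference rate for A over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Text Rendering row: Gemini 3 Pro Image (1198) vs GPT-Image 1 (1150).
print(round(expected_score(1198, 1150), 3))
```

Under that assumption, a 48-point lead corresponds to the higher-rated model being preferred in about 57% of pairwise comparisons.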
