
Nano Banana Pro Text-to-Image Ultra API by Google
Nano Banana Pro is the next-generation Nano Banana image model, delivering sharper detail, richer color control, and faster diffusion for production-ready visuals.
Code Example
import os
import time

import requests

# Read the API key from the environment; Python does not expand "$VAR" inside strings.
API_KEY = os.environ["ATLASCLOUD_API_KEY"]

# Step 1: Start image generation
generate_url = "https://api.atlascloud.ai/api/v1/model/generateImage"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}",
}
data = {
    "model": "google/nano-banana-pro/text-to-image-ultra",
    "prompt": "A beautiful landscape with mountains and lake",
    "width": 512,
    "height": 512,
    "steps": 20,
    "guidance_scale": 7.5,
}
generate_response = requests.post(generate_url, headers=headers, json=data)
generate_result = generate_response.json()
prediction_id = generate_result["data"]["id"]

# Step 2: Poll for result
poll_url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"


def check_status():
    while True:
        response = requests.get(poll_url, headers={"Authorization": f"Bearer {API_KEY}"})
        result = response.json()
        if result["data"]["status"] == "completed":
            print("Generated image:", result["data"]["outputs"][0])
            return result["data"]["outputs"][0]
        elif result["data"]["status"] == "failed":
            raise Exception(result["data"].get("error") or "Generation failed")
        else:
            # Still processing, wait 2 seconds
            time.sleep(2)


image_url = check_status()
Installation
Install the package required for your language.
pip install requests
Authentication
All API requests require authentication with an API key. You can get an API key from the Atlas Cloud dashboard.
export ATLASCLOUD_API_KEY="your-api-key-here"
HTTP Headers
import os

API_KEY = os.environ.get("ATLASCLOUD_API_KEY")
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}",
}
Do not expose your API key in client-side code or public repositories. Use environment variables or a backend proxy instead.
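One way to follow this advice is to load the key once and fail fast when it is missing, rather than sending a request with an empty token. The following is a minimal sketch; the helper names `load_api_key` and `auth_headers` are illustrative and are not part of the Atlas Cloud API.

```python
import os


def load_api_key(env=None, var_name="ATLASCLOUD_API_KEY"):
    """Return the API key from the environment, or raise if it is unset or blank."""
    env = os.environ if env is None else env
    key = (env.get(var_name) or "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before calling the API")
    return key


def auth_headers(api_key):
    """Build the JSON request headers used in the examples on this page."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }


if __name__ == "__main__":
    # Passing a dict instead of os.environ keeps this demo independent of the shell.
    headers = auth_headers(load_api_key({"ATLASCLOUD_API_KEY": "your-api-key-here"}))
    print(headers["Authorization"])
```

In real use, call `load_api_key()` with no arguments so that it reads `os.environ`.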
Submit a Request
import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateImage"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; "$VAR" is not expanded inside Python strings.
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}",
}
data = {
    "model": "your-model",
    "prompt": "A beautiful landscape"
}
response = requests.post(url, headers=headers, json=data)
print(response.json())
Submit a Request
Submits an asynchronous generation request. The API returns a prediction ID that you can use to check the status and retrieve the result.
POST /api/v1/model/generateImage
Request Body
import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateImage"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; "$VAR" is not expanded inside Python strings.
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}",
}
data = {
    "model": "google/nano-banana-pro/text-to-image-ultra",
    "input": {
        "prompt": "A beautiful landscape with mountains and lake"
    }
}
response = requests.post(url, headers=headers, json=data)
result = response.json()
print(f"Prediction ID: {result['id']}")
print(f"Status: {result['status']}")
Response
{
  "id": "pred_abc123",
  "status": "processing",
  "model": "model-name",
  "created_at": "2025-01-01T00:00:00Z"
}
Check Status
Poll the prediction endpoint to check the current status of a request.
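A polling loop usually benefits from an upper bound so that a stuck request does not block forever. The sketch below shows one way to add a timeout; it assumes the status values documented on this page, and the function name `wait_for_prediction` and its `fetch` callable are hypothetical, not part of the API.

```python
import time

TERMINAL_OK = {"completed", "succeeded"}  # both appear as success states on this page
TERMINAL_FAIL = {"failed"}


def wait_for_prediction(fetch, timeout_s=120.0, interval_s=2.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll fetch() until a terminal status is seen or timeout_s elapses.

    fetch is any callable that returns the prediction's "data" dict, for
    example a small wrapper around requests.get(...).json()["data"].
    """
    deadline = clock() + timeout_s
    while True:
        data = fetch()
        status = data.get("status")
        if status in TERMINAL_OK:
            return data["outputs"]
        if status in TERMINAL_FAIL:
            raise RuntimeError(data.get("error") or "Generation failed")
        if clock() >= deadline:
            raise TimeoutError(f"Prediction still '{status}' after {timeout_s} s")
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated responses stand in for real HTTP calls in this demo.
    fake = iter([
        {"status": "processing"},
        {"status": "completed", "outputs": ["https://example.invalid/result.png"]},
    ])
    print(wait_for_prediction(lambda: next(fake), sleep=lambda s: None))
```

Injecting `fetch`, `sleep`, and `clock` keeps the loop testable without network access; the default timeout and interval are arbitrary placeholders.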
GET /api/v1/model/prediction/{prediction_id}
Polling Example
import os
import time

import requests

prediction_id = "pred_abc123"
url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"
# Read the key from the environment; "$VAR" is not expanded inside Python strings.
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}
while True:
    response = requests.get(url, headers=headers)
    result = response.json()
    status = result["data"]["status"]
    print(f"Status: {status}")
    if status in ["completed", "succeeded"]:
        output_url = result["data"]["outputs"][0]
        print(f"Output URL: {output_url}")
        break
    elif status == "failed":
        print(f"Error: {result['data'].get('error', 'Unknown')}")
        break
    # Still processing; wait before polling again
    time.sleep(3)
Status Values
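- processing: The request is still being processed.
- completed: Generation is complete. The output is available.
- succeeded: Generation succeeded. The output is available.
- failed: Generation failed. Check the error field.
Completed Response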
{
  "data": {
    "id": "pred_abc123",
    "status": "completed",
    "outputs": [
      "https://storage.atlascloud.ai/outputs/result.png"
    ],
    "metrics": {
      "predict_time": 8.3
    },
    "created_at": "2025-01-01T00:00:00Z",
    "completed_at": "2025-01-01T00:00:10Z"
  }
}
File Upload
Upload a file to Atlas Cloud storage and receive a URL that you can use in API requests. Uploads use multipart/form-data.
POST /api/v1/model/uploadMedia
Upload Example
import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/uploadMedia"
# Read the key from the environment; "$VAR" is not expanded inside Python strings.
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}
with open("image.png", "rb") as f:
    # Send the file as multipart/form-data while the file handle is still open
    files = {"file": ("image.png", f, "image/png")}
    response = requests.post(url, headers=headers, files=files)
result = response.json()
download_url = result["data"]["download_url"]
print(f"File URL: {download_url}")
Response
{
  "data": {
    "download_url": "https://storage.atlascloud.ai/uploads/abc123/image.png",
    "file_name": "image.png",
    "content_type": "image/png",
    "size": 1024000
  }
}
Input Schema
The schema lists the parameters available in the request body.
No parameters are available.
Request Body Example
{
  "model": "google/nano-banana-pro/text-to-image-ultra"
}
Output Schema
The API returns a prediction response that contains the URLs of the generated outputs.
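The response examples on this page appear in two shapes: with fields at the top level, and with fields nested under a `data` key. As a hedged sketch, a small reader can accept both and derive the wall-clock duration from the two timestamps. The function name `parse_prediction` is hypothetical, and the field names are taken only from the examples in this document.

```python
from datetime import datetime


def parse_prediction(payload):
    """Return (status, outputs, elapsed_seconds) from a prediction response.

    Accepts fields at the top level or nested under a "data" key.
    elapsed_seconds is None when either timestamp is missing.
    """
    body = payload.get("data", payload)
    elapsed = None
    if body.get("created_at") and body.get("completed_at"):
        # The example timestamps are ISO 8601 with a trailing "Z" (UTC).
        start = datetime.fromisoformat(body["created_at"].replace("Z", "+00:00"))
        end = datetime.fromisoformat(body["completed_at"].replace("Z", "+00:00"))
        elapsed = (end - start).total_seconds()
    return body.get("status"), body.get("outputs", []), elapsed


if __name__ == "__main__":
    sample = {
        "id": "pred_abc123",
        "status": "completed",
        "outputs": ["https://storage.atlascloud.ai/outputs/result.png"],
        "created_at": "2025-01-01T00:00:00Z",
        "completed_at": "2025-01-01T00:00:10Z",
    }
    print(parse_prediction(sample))
```

Note that the elapsed time computed this way includes queueing, so it can differ from `metrics.predict_time`.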
Response Example
{
  "id": "pred_abc123",
  "status": "completed",
  "model": "model-name",
  "outputs": [
    "https://storage.atlascloud.ai/outputs/result.png"
  ],
  "metrics": {
    "predict_time": 8.3
  },
  "created_at": "2025-01-01T00:00:00Z",
  "completed_at": "2025-01-01T00:00:10Z"
}
Atlas Cloud Skills
Atlas Cloud Skills integrates more than 300 AI models directly into your AI coding assistant. Install it with a single command, then generate images and videos or chat with LLMs using natural language.
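Installation
npx skills add AtlasCloudAI/atlas-cloud-skills
API Key Setup
Get an API key from the Atlas Cloud dashboard and set it as an environment variable.
export ATLASCLOUD_API_KEY="your-api-key-here"
Features
After installation, you can access all Atlas Cloud models from your AI assistant using natural language.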
MCP Server
Atlas Cloud MCP Server connects your IDE to more than 300 AI models through the Model Context Protocol. It works with any MCP-compatible client.
Installation
npx -y atlascloud-mcp
Configuration
Add the following configuration to your IDE's MCP settings file.
{
  "mcpServers": {
    "atlascloud": {
      "command": "npx",
      "args": [
        "-y",
        "atlascloud-mcp"
      ],
      "env": {
        "ATLASCLOUD_API_KEY": "your-api-key-here"
      }
    }
  }
}
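If a settings file already defines other MCP servers, the entry has to be merged rather than pasted over the file. The sketch below assumes the client stores servers under a top-level `mcpServers` key, as in the snippet above; the function names are hypothetical, and the location of the settings file varies by client and is not covered here.

```python
import json


def atlascloud_mcp_entry(api_key):
    """Build the server entry shown in the configuration snippet."""
    return {
        "command": "npx",
        "args": ["-y", "atlascloud-mcp"],
        "env": {"ATLASCLOUD_API_KEY": api_key},
    }


def merge_mcp_config(existing, api_key):
    """Return a copy of existing with the atlascloud server added or replaced."""
    merged = dict(existing)
    servers = dict(merged.get("mcpServers", {}))
    servers["atlascloud"] = atlascloud_mcp_entry(api_key)
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # "other-server" is a made-up entry standing in for existing configuration.
    current = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
    print(json.dumps(merge_mcp_config(current, "your-api-key-here"), indent=2))
```

The merge copies the top-level dict and the server map, so the input configuration is left unchanged.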
API Schema
Schema not available.
- Multi-image fusion technology
- Character consistency across generations
- Style-preserving transformations
- High-resolution output up to 4K
- Text-based intelligent editing
- Object addition and removal
- Background replacement
- Style transfer and artistic effects
Prompt Examples & Templates
Explore curated prompt templates to unlock the full potential of Nano Banana AI.

Photo to Character Figure
Transform any photo into a realistic character figure with packaging and display.
Prompt: turn this photo into a character figure. Behind it, place a box with the character's image printed on it, and a computer showing the Blender modeling process on its screen. In front of the box, add a round plastic base with the character figure standing on it. set the scene indoors if possible

Anime to Cosplay
Transform anime illustrations into realistic cosplay photography.
Prompt: Generate a highly detailed photo of a girl cosplaying this illustration, at Comiket. Exactly replicate the same pose, body posture, hand gestures, facial expression, and camera framing as in the original illustration. Keep the same angle, perspective, and composition, without any deviation

Person to Action Figure
Transform people from photos into collectible action figures with custom packaging.
Prompt: Transform the person in the photo into an action figure, styled after [CHARACTER_NAME] from [SOURCE / CONTEXT]. Next to the figure, display the accessories including [ITEM_1], [ITEM_2], and [ITEM_3]. On the top of the toy box, write "[BOX_LABEL_TOP]", and underneath it, "[BOX_LABEL_BOTTOM]". Place the box in a [BACKGROUND_SETTING] environment. Visualize this in a highly realistic way with attention to fine details.

Person to Funko Pop Figure
Transform photos into Funko Pop style collectible figures with custom packaging.
Prompt: Transform the person in the photo into the style of a Funko Pop figure packaging box, presented in an isometric perspective. Label the packaging with the title 'ZHOGUE'. Inside the box, showcase the figure based on the person in the photo, accompanied by their essential items (such as cosmetics, bags, or others). Next to the box, also display the actual figure itself outside of the packaging, rendered in a realistic and lifelike style.

Product Design to Photorealistic Render
Transform product design sketches into photorealistic renders.
Prompt: turn this illustration of a perfume into a realistic version, Frosted glass bottle with a marble cap

Transform to Q-Version Character
Create cartoon characters with face shape reference control.
Prompt: Transform the person from image 1 into a Q-version character design based on the face shape from image 2

Building to 3D Architecture Model
Convert architectural photos into detailed physical models.
Prompt: convert this photo into an architecture model. Behind the model, there should be a cardboard box with an image of the architecture from the photo on it. There should also be a computer, with the content on the computer screen showing the Blender modeling process of the figurine. In front of the cardboard box, place a cardstock and put the architecture model from the photo I provided on it. I hope the PVC material can be clearly presented. It would be even better if the background is indoors.
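Some of the templates above contain bracketed placeholders such as [CHARACTER_NAME] or [SOURCE / CONTEXT]. A minimal sketch of filling them in before sending the prompt is shown here; the function name `fill_template` and the sample values are illustrative assumptions, not part of any Atlas Cloud tooling.

```python
import re

# Matches placeholders like [ITEM_1] or [SOURCE / CONTEXT]: uppercase letters,
# digits, underscores, spaces, and slashes inside square brackets.
PLACEHOLDER = re.compile(r"\[([A-Z0-9_ /]+)\]")


def fill_template(template, values):
    """Replace [NAME] placeholders with values; raise if any value is missing."""
    missing = sorted({name for name in PLACEHOLDER.findall(template) if name not in values})
    if missing:
        raise KeyError("Missing values for: " + ", ".join(missing))
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


if __name__ == "__main__":
    template = "styled after [CHARACTER_NAME] from [SOURCE / CONTEXT]"
    values = {"CHARACTER_NAME": "Ada", "SOURCE / CONTEXT": "a retro arcade game"}
    print(fill_template(template, values))
```

Raising on a missing value avoids sending a prompt that still contains a literal bracketed placeholder.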
Technical Highlights
- Optimized for speed, with generation times under 2 seconds for most tasks, which suits real-time applications and rapid prototyping workflows.
- Built on Google's advanced AI architecture to produce highly detailed, photorealistic images with accurate lighting, textures, and composition.
- 2D-to-3D conversion that creates multiple viewpoints from a single image, opening new possibilities for content creation.
Why Choose Nano Banana?
- No Setup Required: Start creating immediately without complex configuration or installation.
- Precision Control: Fine-tune every aspect of your creation with intuitive text commands.
- Consistent Results: Maintain character and style consistency across multiple generations.
Technical Specifications
Nano Banana Pro: A state-of-the-art multimodal reasoning and image generation model by Google DeepMind
Model Card Overview
| Field | Description |
|---|---|
| Model Name | Nano Banana Pro (also known as Gemini 3 Pro Image) |
| Developer | Google DeepMind |
| Release Date | November 20, 2025 |
| Model Type | Multimodal Reasoning and Image Generation |
| Related Links | Official Product Page, Model Card (PDF) |
Introduction
Nano Banana Pro, officially designated as Gemini 3 Pro Image, represents the next generation in Google's series of highly-capable, natively multimodal models. It is designed for professional asset production, integrating the advanced reasoning capabilities of the Gemini 3 Pro foundation model with a sophisticated image generation engine. The primary goal of Nano Banana Pro is to provide users with studio-quality precision and control, enabling the creation of complex, high-fidelity visuals from textual and image-based prompts. Its core contribution lies in its ability to understand and execute intricate instructions, maintain character and scene consistency, and render legible text directly within generated images, setting a new standard for professional creative workflows.
Key Features & Innovations
Nano Banana Pro introduces several technical breakthroughs that distinguish it from prior models:
- Superior Text Rendering: The model excels at generating images that contain clear, accurate, and stylistically coherent text, making it ideal for creating posters, diagrams, and marketing materials.
- Advanced Creative Controls: Users can exercise fine-grained control over image outputs, including camera angles, lighting transformations (e.g., day to night), color grading, depth of field, and localized editing.
- High-Fidelity Consistency: It can maintain the consistency of up to 14 input images and blend up to 5 distinct characters seamlessly into complex compositions, ensuring visual coherence across a series of generated images.
- Deep Real-World Knowledge: Built on Gemini 3 Pro, the model leverages a vast understanding of the world to generate contextually rich and factually grounded visuals, from detailed infographics to historically accurate scenes.
- Multilingual Capabilities: The model can accurately render and translate text across multiple languages within an image, facilitating the localization of visual content.
- Complex Composition from Multiple Inputs: Nano Banana Pro can synthesize elements from multiple source images and text prompts to create a single, cohesive scene, enabling complex creative concepts.
Model Architecture & Technical Details
Nano Banana Pro's architecture is fundamentally based on the Gemini 3 Pro model. While specific architectural details are not fully disclosed, the following technical information is available:
- Foundation Model: Gemini 3 Pro
- Inputs: The model accepts text strings and images as input, with a large context window of up to 1 million tokens.
- Outputs: It generates high-resolution images (up to 4K) with a 64K token output capacity for handling complex generation tasks.
- Training Infrastructure:
  - Hardware: The model was trained on Google's custom-designed Tensor Processing Units (TPUs), which are optimized for large-scale machine learning computations and high-bandwidth memory access.
  - Software: The training process utilized JAX and ML Pathways, Google's high-performance frameworks for machine learning research.
- Knowledge Cutoff: The model's internal knowledge base has a cutoff date of January 2025.
Intended Use & Applications
Nano Banana Pro is intended for professional and creative applications that require a high degree of precision, control, and visual fidelity. It is well-suited for a variety of downstream tasks and application scenarios:
- Professional Content Creation: Generating production-ready assets for marketing campaigns, advertising, and branding.
- Design and Prototyping: Creating detailed product mockups, storyboards for film and animation, and architectural visualizations.
- Informational Graphics: Designing complex and accurate infographics, educational diagrams, and data visualizations.
- Artistic and Creative Expression: Enabling artists and designers to explore novel visual styles and create complex, multi-element compositions.
Performance
Nano Banana Pro's performance has been evaluated through extensive human evaluations and benchmarked against other leading image generation models. The results, measured in Elo scores, demonstrate its strong capabilities across a wide range of tasks.
A technical report also notes a performance dichotomy: while the model produces subjectively superior visual quality by hallucinating plausible details, it can lag behind specialist models in traditional quantitative metrics due to the stochastic nature of generative models.
Existing Capabilities (Elo Score Comparison)
| Capability | Gemini 3 Pro Image | Gemini 2.5 Flash Image | GPT-Image 1 | Seedream v4 4k | Flux Pro Kontext Max |
|---|---|---|---|---|---|
| Text Rendering | 1198 ± 18 | 997 ± 10 | 1150 ± 14 | 1019 ± 13 | 854 ± 13 |
| Stylization | 1098 ± 11 | 933 ± 7 | 1069 ± 9 | 991 ± 9 | 908 ± 11 |
| Multi-Turn | 1186 ± 19 | 1045 ± 24 | 1079 ± 32 | 990 ± 32 | 889 ± 37 |
| General Image Editing | 1127 ± 13 | 996 ± 8 | 1011 ± 13 | 965 ± 12 | 902 ± 13 |
| Character Editing | 1176 ± 16 | 1075 ± 8 | 1016 ± 10 | 889 ± 10 | 843 ± 10 |
| Object/Env. Editing | 1102 ± 19 | 1025 ± 9 | 930 ± 12 | 983 ± 13 | 961 ± 10 |
| General Text-to-Image | 1094 ± 16 | 1037 ± 8 | 1025 ± 9 | 1011 ± 9 | 907 ± 9 |
New Capabilities (Elo Score Comparison)
| Capability | Gemini 3 Pro Image | Gemini 2.5 Flash Image | GPT-Image 1 | Seedream v4 4k | Flux Pro Kontext Max |
|---|---|---|---|---|---|
| Multi-character Editing | 1213 ± 16 | 950 ± 10 | 997 ± 13 | 840 ± 19 | - |
| Chart Editing | 1209 ± 18 | 971 ± 10 | 994 ± 16 | 934 ± 16 | 893 ± 15 |
| Text Editing | 1202 ± 23 | 1001 ± 10 | 996 ± 14 | 860 ± 15 | 943 ± 12 |
| Factuality - Edu | 1169 ± 25 | 1050 ± 11 | 1084 ± 25 | 969 ± 22 | 884 ± 26 |
| Infographics | 1268 ± 17 | 1162 ± 11 | 1087 ± 12 | 1049 ± 12 | 824 ± 15 |
| Visual Design | 1104 ± 16 | 1083 ± 7 | 1028 ± 11 | 1038 ± 12 | 907 ± 11 |
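The tables report Elo scores with ± intervals but do not say how to read the gap between two scores. Assuming the conventional logistic Elo formula with a 400-point scale (an assumption; the source does not state which Elo variant the evaluation used), a rating gap can be converted into an expected preference rate as sketched here:

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Expected score of A against B under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


if __name__ == "__main__":
    # Text Rendering row: Gemini 3 Pro Image (1198) vs GPT-Image 1 (1150).
    p = expected_score(1198, 1150)
    print(f"Expected preference rate: {p:.3f}")
```

Under this assumption, the 48-point gap in the Text Rendering row corresponds to an expected preference of roughly 57% for Gemini 3 Pro Image. The calculation uses only the point estimates and ignores the ± intervals.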






