什麼是 Atlas Cloud API？它是如何運作的？

Atlas Cloud 是一個全模態 AI 推論平台，開發者僅需一組統一的 AI API，即可呼叫全球頂級的影片生成 API、影像生成 API 與 LLM API。無需逐家串接供應商，一次接入即可統一存取 300 多款精選模型，覆蓋全部模態。底層基礎設施、彈性擴容與模型更新皆由 Atlas Cloud 代管，讓你能專注於產品本身。

Atlas Cloud API 與 OpenAI API 相容嗎？

相容。Atlas Cloud 提供與 OpenAI 相容的 API 端點，可作為現有整合的真正無縫替代。若你已在使用 OpenAI SDK，只需替換 base URL 與 API Key，無需更動任何程式碼即可完成從 OpenAI 至 Atlas Cloud 的遷移。對於希望尋找模型覆蓋更廣、成本更低的 OpenAI 替代方案的開發者而言，這是最快的遷移路徑。

如何快速接入 Atlas Cloud API？

幾分鐘即可上手。註冊免費帳號、在控制台產生免費的 AI API 金鑰，並依文件中的開發者快速入門指南操作即可。多數開發者可於 5 分鐘內完成首次 API 呼叫，註冊無需綁定信用卡。

Atlas Cloud 支援哪些 AI 模型？

Atlas Cloud 支援橫跨全模態的 300 多款模型。影片生成：HappyHorse API、Seedance API、Kling API、Wan API、Veo API 與 Runway API；影像生成：Flux API、GPT Image API 與 Nano Banana API；LLM：DeepSeek API、Qwen API、GLM API 與 MiniMax API。我們堅持 Day 0 同步上線最新模型，讓你始終掌握業界前沿能力，無需在多個平台之間切換。

Atlas Cloud API 的計費方式為何？

Atlas Cloud 採用用量計費（pay-as-you-go），無月度最低消費，亦無按席次收費，用多少付多少。同等模型下，我們的 API 價格穩定低於 kie.ai 與 fal.ai，價格頁面亦公開透明的按秒或按 token 計費標準。沒有任何隱藏的基礎設施成本，所見即所付。

Atlas Cloud 是否支援串流與批次處理？

支援。Atlas Cloud 原生提供串流 API 回應（適用於 LLM 即時輸出）、面向高吞吐非同步工作負載的批次推論，以及開箱即用的結構化輸出。無論是打造低延遲對話介面，或運行每秒處理數千請求的非同步 API 管線，同一平台皆可勝任，無需切換設定或引入額外工具。

Atlas Cloud API 適合企業級正式環境嗎？

完全適合。Atlas Cloud 是為企業級 AI API 需求而生的平台，已取得 SOC 2（SOC I & II）認證與 HIPAA 法規遵循認證，資料傳輸與靜態儲存均加密。對於資料合規要求更嚴格的組織，我們亦提供專屬雲端部署與獨立基礎設施方案。平台具備產品級穩定度、高併發承載能力與專屬支援團隊，可承擔任意規模的核心業務負載。

可以將 Atlas Cloud 作為 Replicate 的替代方案嗎？

可以，事實上許多開發者正是因此而轉換。作為 Replicate 替代方案，Atlas Cloud 擁有更廣的模型矩陣、更低的單次產生成本，並提供與 OpenAI 相容的端點以簡化整合。若你亦在評估 fal.ai 或 Together AI 的替代方案，Atlas Cloud 在同一平台、同一 API 金鑰、同一計費帳戶下整合了影片、影像、LLM 與音訊，讓多模態能力一站串接。

Atlas Cloud | 全模態 AI 平台 - 對話、圖像、影片、語音統一 API

模型系列

Seedance 2.0

Seedance 2.0 API 為您提供 ByteDance 多模態影片模型的生產級存取權限——支援四模態輸入（文字、圖像、影片、音訊），並具備業界領先的「通用參考」(Universal Reference)系統，可在不同鏡頭間鎖定構圖、運鏡與角色動作。只需一次 API 呼叫即可整合導演級控制，統一定價 $0.09/秒，即時取得金鑰，無需排隊等待——由企業級高可用性與合規性提供全面保障。

探索

模型分類

Seedance 2.0 Text-to-Video

$0.112/SEC

Seedance 2.0 Image-to-Video

$0.112/SEC

Seedance 2.0 Reference-to-Video

$0.112/SEC

Seedance 2.0 Fast Text-to-Video

$0.09/SEC

Seedance 2.0 Fast Image-to-Video

$0.09/SEC

Seedance 2.0 Fast Reference-to-Video

$0.09/SEC

模型系列

Grok-Imagine

Grok Imagine Image Quality is xAI's latest AI image generation model, delivering studio-grade visuals with up to 2K resolution and razor-sharp detail. It offers best-in-class text rendering across multiple languages, photorealistic outputs with natural lighting, rich textures, and believable physics, plus tighter prompt following and image editing with reference inputs for precise creative control. Ideal for hero images, ad creatives, product renders, and brand-grade visuals.

探索

模型分類

xAI TTS v1

$0.015/PIC

Grok Imagine Video v1.5 Image-to-Video

$0.08/SEC

Grok Imagine Image Quality Text-to-Image

$0.05/PIC

Grok Imagine Image Quality Edit

$0.05/PIC

Grok Imagine Video Text-to-Video

$0.05/SEC

Grok Imagine Video Image-to-Video

$0.05/SEC

Grok Imagine Video Reference-to-Video

$0.05/SEC

Grok Imagine Video Extend

$0.07/SEC

Grok Imagine Video Edit

$0.07/SEC

Grok Imagine Image Edit

$0.02/PIC

Grok Imagine Image Text-to-Image

$0.02/PIC

模型系列

GPT Image 2

GPT Image 2 is a state-of-the-art multimodal foundation model engineered for exceptional text-to-image generation with unprecedented photorealism and creative versatility. Developed by OpenAI as the evolution of the DALL-E lineage, it transforms detailed natural language descriptions into hyper-realistic imagery at up to 4K resolution. With proprietary "Neural Rendering Engine" technology for precise visual control, GPT Image 2 delivers studio-quality results with accurate anatomy, lighting, and composition—making it the premier AI tool for professional creators, enterprises, and developers demanding production-ready visual assets.

探索

模型分類

Openai GPT Image 2 Text-to-Image

$0.009/PIC

Openai GPT Image 2 Edit

$0.01/PIC

GPT Image 2 Developer Edit

$0.01/PIC$0.005/PIC

GPT Image 2 Developer Text-to-Image

$0.009/PIC$0.004/PIC

模型系列

MAI

MAI-Image-2.5 是 Microsoft 最新推出的逼真圖像生成與編輯模型系列，專為商業設計、產品攝影和品牌級內容創作而打造。提供用於文字生成圖像和圖像編輯的標準版與 Flash 版本，以極具競爭力的價格（每張圖像起價 0.03 美元）提供同類最佳的 Arena ELO 得分。憑藉精準的文字渲染、手術刀級的編輯能力以及自然的人像生成，MAI-Image-2.5 專為需要生產級品質視覺效果且無需承擔後製處理成本的團隊而設計。

探索

模型分類

MAI-Image-2.5-Flash Text-to-image

$0.03/PIC

MAI-Image-2.5 Edit

$0.058/PIC

MAI-Image-2.5 Text-to-image

$0.05/PIC

MAI-Image-2.5-Flash Edit

$0.038/PIC

模型系列

Wan 2.7

Launching this March, Wan2.7 is the latest powerhouse in the Qwen ecosystem, delivering a massive upgrade in visual fidelity, audio synchronization, and motion consistency over version 2.6. This all-in-one AI video generator supports advanced features like first-and-last frame control, 3x3 grid synthesis, and instruction-based video editing. Outperforming competitors like Jimeng, Wan2.7 offers superior flexibility with support for real-person image inputs, up to five video references, and 1080P high-definition outputs spanning 2 to 15 seconds, making it the premier choice for professional digital storytelling and high-end content marketing.

探索

模型分類

Wan-2.7 Text-to-video

$0.1/SEC

Wan-2.7 Image-to-video

$0.1/SEC

Wan-2.7 Reference-to-video

$0.1/SEC

Wan-2.7 Video-edit

$0.1/SEC

Wan-2.7 Text-to-image

$0.03/PIC

Wan-2.7 Image-to-image

$0.03/PIC

Wan-2.7 Pro Text-to-image

$0.075/PIC

Wan-2.7 Pro Image-to-image

$0.075/PIC

模型系列

Nano Banana 2

Nano Banana 2 (by Google), is a generative image model that perfectly balances lightning-fast rendering with exceptional visual quality. With an improved price-performance ratio, it achieves breakthrough micro-detail depiction, accurate native text rendering, and complex physical structure reconstruction. It serves as a highly efficient, commercial-grade visual production tool for developers, marketing teams, and content creators.

探索

模型分類

Nano Banana 2 Reference to Image

$0.08/PIC

Nano Banana 2 Reference to Image Developer

$0.08/PIC$0.04/PIC

Nano Banana 2 Text-to-Image Developer

$0.08/PIC$0.04/PIC

Nano Banana 2 Text-to-Image

$0.08/PIC

Nano Banana 2 Edit Developer

$0.08/PIC$0.04/PIC

Nano Banana 2 Edit

$0.08/PIC

模型系列

Hunyuan 3D

Hunyuan3D is a state-of-the-art 3D generative foundation model from Tencent that turns text prompts and single images into high-quality, textured 3D meshes. Built on a two-stage pipeline—Hunyuan3D-DiT for shape generation via flow-matching diffusion and Hunyuan3D-Paint for multi-view texture synthesis—it produces clean geometry with full PBR materials ready for game engines, AR/VR, 3D printing, and DCC tools. Available in Pro (up to 1.5M faces, 4K PBR textures) and Rapid (2–3 minute lightweight generation) tiers, with both Text-to-3D and Image-to-3D entry points, Hunyuan3D is the premier AI 3D toolkit for game developers, e-commerce teams, and 3D content studios. Generations start at $0.02 each.

探索

模型分類

Hunyuan 3D Rapid Image-to-3D

$0.02/PIC

Hunyuan 3D Rapid Text-to-3D

$0.02/PIC

Hunyuan 3D Pro Image-to-3D

$0.02/PIC

Hunyuan 3D Pro Text-to-3D

$0.02/PIC

模型系列

Midjourney

Midjourney is a proprietary AI image and video generation platform developed by Midjourney, Inc. (San Francisco). Founded in 2021 by David Holz, it has become the aesthetic gold standard in generative AI — transforming text prompts into cinematic, painterly visuals at native 2K resolution. The latest V8.1 architecture, rebuilt from scratch on GPU-native PyTorch, delivers 4–5× faster generation, true 2048×2048 output without upscaling artifacts, and a signature visual style that remains unmatched by competitors. With the addition of Video V1, Midjourney extends its aesthetic into motion — animating still images into atmospheric 5-second cinematic clips. From brand campaigns to film pre-visualization to game concept art, Midjourney is the premier AI creative tool for professionals who demand both speed and artistry.

探索

模型分類

Midjourney V8.1 Remove Background

$0.086/PIC

Midjourney V8.1 Style Transfer

$0.129/PIC

Midjourney V8.1 Blend

$0.086/PIC

Midjourney V8.1 Image-to-Image

$0.086/PIC

Midjourney V8.1 Image-to-Video

$0.086/SEC

Midjourney V8.1 Text-to-Image

$0.086/PIC

模型系列

PixVerse

PixVerse, developed by AISphere, is a video generation model series built around one idea: giving creators director-level control over every frame. V6 is the flagship generation model, covering text-to-video, image-to-video, reference-to-video, start-and-end frame control, and video extension in a single cohesive pipeline. C1 takes a different approach — it is a storyboard-native model designed for multi-shot narrative production, where scene continuity and visual consistency across clips matter as much as individual frame quality. Both series are available on Atlas Cloud, starting from $0.025 per second, with no infrastructure setup required.

探索

模型分類

Pixverse v6 Video-Extend

$0.025/SEC

Pixverse c1 Image-to-Video

$0.03/SEC

Pixverse c1 Start-End-to-Video

$0.03/SEC

Pixverse c1 Reference-to-Video

$0.03/SEC

Pixverse v6 Text-to-Video

$0.025/SEC

Pixverse v6 Image-to-Video

$0.025/SEC

Pixverse v6 Start-End-to-Video

$0.025/SEC

Pixverse v6 Reference-to-Video

$0.025/SEC

Pixverse c1 Text-to-Video

$0.03/SEC

模型系列

Veo 3.1

Google DeepMind’s Veo 3.1 represents a paradigm shift in AI video generation, empowering creators with director-level narrative control and cinematic-grade audio quality that seamlessly integrates with its enhanced visual realism. By bridging the gap between imaginative concepts and photorealistic execution, this advanced model offers a transformative solution for a wide range of application scenarios, from professional filmmaking and high-end advertising to immersive digital content creation.

探索

模型分類

Veo 3.1 Lite Text-to-video

$0.05/SEC

Veo 3.1 Lite Start-End Frame to Video

$0.05/SEC

Veo 3.1 Lite Image-to-video

$0.05/SEC

Veo3.1 Fast Image-to-video

$0.08/SEC

Veo3.1 Fast Text-to-video

$0.08/SEC

Veo3.1 Image-to-video

$0.2/SEC

Veo3.1 Reference-to-video

$0.2/SEC

Veo3.1 Text-to-video

$0.2/SEC

模型系列

Seed 3D

Seed3D V2.0 is ByteDance's second-generation 3D generation foundation model, released April 23, 2026. It transforms single images, video, or text into production-ready 3D assets — complete with full PBR material maps (albedo, normal, metallic, roughness) and simulation-compatible formats. Powered by a coarse-to-fine two-stage Diffusion Transformer and unified PBR pipeline, it achieved a 92.8% win rate over Tripo 3.0 in blind evaluations by 60 professional 3D modelers — covering everything from game assets and e-commerce AR previews to robotics simulation via URDF output.

探索

模型分類

Seed3D 2.0 Image-to-3D

$0.353/PIC

模型系列

Seedream 5.0

Seedream 5.0, developed by ByteDance’s Jimeng AI, is a high-performance AI image generation model that integrates real-time search with intelligent reasoning. Purpose-built for time-sensitive content and complex visual logic, it excels at professional infographics, architectural design, and UI assistance. By blending live web insights with creative precision, Seedream 5.0 empowers commercial branding and marketing with a seamless, logic-driven workflow that turns sophisticated data into stunning, high-fidelity visuals.

探索

模型分類

Seedream v5.0 Lite Edit Sequential

$0.035/PIC$0.032/PIC

Seedream v5.0 Lite Sequential

$0.035/PIC$0.032/PIC

Seedream v5.0 Lite Edit

$0.035/PIC$0.032/PIC

Seedream v5.0 Lite

$0.035/PIC$0.032/PIC

模型系列

Kling 3.0

Kuaishou’s flagship video generation suite, Kling 3.0, features two powerhouse models—Kling 3.0 (Upgraded from Kling 2.6) and Kling 3.0 Omni (Kling O3, Upgraded from Kling O1)—both offering high-fidelity native audio integration. While Kling 3.0 excels in intelligent cinematic storytelling, multilingual lip-syncing, and precision text rendering, Kling O3 sets a new standard for professional-grade subject consistency by supporting custom subjects and voice clones derived from video or image inputs. Together, these models provide a comprehensive solution tailored for cinematic narratives, global marketing campaigns, social media content, and digital skit production.

探索

模型分類

Kling V3.0 Turbo Image-to-Video

$0.112/SEC$0.095/SEC

Kling V3.0 Turbo Text-to-Video

$0.112/SEC$0.095/SEC

Kling Video O3 4K Image-to-Video

$0.42/SEC$0.357/SEC

Kling Video O3 4K Text-to-Video

$0.42/SEC$0.357/SEC

Kling v3.0 4K Image-to-Video

$0.42/SEC$0.357/SEC

Kling v3.0 Std Image-to-Video

$0.084/SEC$0.071/SEC

Kling v3.0 Pro Image-to-Video

$0.112/SEC$0.095/SEC

Kling v3.0 Pro Text-to-Video

$0.112/SEC$0.095/SEC

Kling v3.0 4K Text-to-Video

$0.42/SEC$0.357/SEC

Kling v3.0 Std Text-to-Video

$0.084/SEC$0.071/SEC

Kling Video O3 Pro Video-Edit

$0.168/SEC$0.143/SEC

Kling Video O3 Pro Reference-to-Video

$0.112/SEC$0.095/SEC

Kling Video O3 Pro Image-to-Video

$0.112/SEC$0.095/SEC

Kling Video O3 Pro Text-to-Video

$0.112/SEC$0.095/SEC

Kling Video O3 Std Video-Edit

$0.126/SEC$0.107/SEC

Kling Video O3 Std Reference-to-Video

$0.084/SEC$0.071/SEC

Kling Video O3 Std Image-to-Video

$0.084/SEC$0.071/SEC

Kling Video O3 Std Text-to-Video

$0.084/SEC$0.071/SEC

模型系列

Seedream 4.5

Seedream 4.5, developed by ByteDance’s Jimeng AI, is a versatile, high-fidelity model that unifies creative generation with precise image editing. Engineered for professional consistency and intricate text rendering, it excels at multi-subject fusion, brand identity, and high-resolution marketing assets. By bridging spatial logic with artistic control, Seedream 4.5 empowers designers with a seamless, instruction-driven workflow that transforms complex concepts into polished, commercial-grade visuals.

模型分類

Seedream v4.5 Sequential

$0.04/PIC$0.036/PIC

Seedream v4.5 Edit Sequential

$0.04/PIC$0.036/PIC

模型系列

Vidu

Vidu, a joint innovation by Shengshu AI and Tsinghua University, is a high-performance video model powered by the original U-ViT architecture that blends Diffusion and Transformer technologies. It delivers long-form, highly consistent, and dynamic video content tailored for professional filmmaking, animation design, and creative advertising. By streamlining high-end visual production, Vidu empowers creators to transform complex ideas into cinematic reality with unprecedented efficiency.

探索

模型分類

Vidu Q3-Mix Reference to Video

$0.125/SEC$0.106/SEC

Vidu Q3 Reference to Video

$0.05/SEC$0.042/SEC

Vidu Q3-Pro Start-end-to-video

$0.05/SEC$0.042/SEC

Vidu Q3-Turbo Image-to-video

$0.04/SEC$0.034/SEC

Vidu Q3-Turbo Start-end-to-video

$0.04/SEC$0.034/SEC

Vidu Q3-Turbo Text-to-video

$0.04/SEC$0.034/SEC

Vidu Q3-Pro Image-to-video

$0.05/SEC$0.042/SEC

Vidu Q3-Pro Text-to-video

$0.05/SEC$0.042/SEC

Vidu Reference-to-Video Q1

$0.4/SEC

Vidu Reference-to-Video 2.0

$0.2/SEC

Vidu Start-End-to-Video 2.0

$0.075/SEC

Image-to-video-2.0

$0.075/SEC

Vidu Q2-Turbo Image-to-video

$0.03/SEC$0.026/SEC

Vidu Q2-Pro Reference-to-video

$0.1/SEC$0.085/SEC

Vidu Q2 Reference-to-video

$0.075/SEC$0.064/SEC

Vidu Q2-Pro-Fast Start-end-to-video

$0.04/SEC$0.034/SEC

Vidu Q2-Pro Start-end-to-video

$0.04/SEC$0.034/SEC

Vidu Q2-Turbo Start-end-to-video

$0.03/SEC$0.026/SEC

Vidu Q2-Pro-Fast Image-to-video

$0.04/SEC$0.034/SEC

Vidu Q2-Pro Image-to-video

$0.04/SEC$0.034/SEC

Vidu Q2 Text-to-video

$0.05/SEC$0.042/SEC

Vidu Q1 Image-to-video

$0.4/SEC$0.34/SEC

Vidu Q1 Reference-to-video

$0.4/SEC$0.34/SEC

Vidu Q1 Start-end-to-video

$0.4/SEC$0.34/SEC

Vidu Q1 Text-to-video

$0.4/SEC$0.34/SEC

模型系列

Qwen Image

Qwen Image 2.0 is Alibaba Cloud's latest image generation model series from the Tongyi Qianwen family, comprising 4 models optimized for different use cases. This series delivers professional-grade image generation and editing capabilities with exceptional cost-performance ratio, supporting up to 2K resolution output and demonstrating outstanding performance in prompt adherence, detail rendering, and style consistency. Whether for text-to-image or image-to-image tasks, Qwen Image 2.0 provides developers, marketing teams, and content creators with efficient and reliable visual content production solutions. The series includes two tiers: Standard and Professional. The Standard edition is ideal for daily content production and cost-effective batch image generation, while the Professional edition delivers the highest quality visual output, designed for professional production workflows with stringent image quality requirements. Qwen-Image, a lightweight 7B foundation model by Alibaba, transforms long-form prompts up to 1,000 tokens into stunning native 2K (2048x2048) resolution images. It excels in Chinese text rendering, accurately handling complex layouts and classical scripts, making it the premier AI tool for high-end graphic design and cross-cultural content creation.

探索

模型分類

Qwen Image 2.0 Text-to-image

$0.035/PIC$0.028/PIC

Qwen Image 2.0 Edit

$0.035/PIC$0.028/PIC

Qwen Image 2.0 Pro Edit

$0.075/PIC$0.06/PIC

Qwen Image 2.0 Pro Text-to-image

$0.075/PIC$0.06/PIC

Qwen-Image Edit Plus 20251215

Qwen-Image Text-to-image Max

$0.075/PIC$0.052/PIC

Qwen-Image Text-to-image Plus

Qwen Image Text-to-image

$0.035/PIC$0.024/PIC

模型系列

Happy Horse 1.0

HappyHorse-1.0 is a unified multimodal AI video generation model that climbed to the top of the Artificial Analysis Video Arena blind-test leaderboard for both text-to-video and image-to-video generation. CNBC Alibaba Group confirmed ownership of HappyHorse, developed under its Alibaba Token Hub (ATH) business unit, where it leads benchmarks outperforming ByteDance's Seedance 2.0 and others. Caixin Global Led by Zhang Di — the former VP of Kuaishou who architected Kling AI — the 15-billion parameter model generates 1080p video with synchronized audio in a single pass using a unified transformer architecture that bypasses the multi-stage pipelines used by every major competitor.

探索

模型分類

HappyHorse-1.1 Text-to-video

$0.14/SEC

HappyHorse-1.1 Image-to-video

$0.14/SEC

HappyHorse-1.1 Reference-to-video

$0.14/SEC

HappyHorse-1.0 Text-to-video

$0.14/SEC

HappyHorse-1.0 Image-to-video

$0.14/SEC

HappyHorse-1.0 Reference-to-video

$0.14/SEC

HappyHorse-1.0 Video-edit

$0.14/SEC

模型系列

Nano Banana

Google’s Nano Banana (Gemini 3 Image) series, featuring both standard and Pro models, combines deep semantic understanding with seamless integration for precise detail control. While the standard version delivers high-quality 1K outputs, Nano Banana Pro elevates professional workflows with versatile 1K/2K/4K resolution options with higher quality, making it the ideal solution for any creative or commercial application.

探索

模型分類

Nano Banana Pro Text-to-image Ultra

$0.15/PIC

Nano Banana Pro Edit Ultra

$0.15/PIC

Nano Banana Pro Text-to-image

$0.14/PIC

Nano Banana Pro Edit

$0.14/PIC

Nano Banana Pro Text-to-image Developer

$0.14/PIC$0.07/PIC

Nano Banana Text-to-image Developer

$0.038/PIC$0.019/PIC

Nano Banana Pro Edit Developer

$0.14/PIC$0.07/PIC

Nano Banana Edit Developer

$0.038/PIC$0.019/PIC

Nano Banana Text-to-image

$0.038/PIC

Nano Banana Edit

$0.038/PIC

模型系列

Hailuo Video

MiniMax Hailuo 影片模型提供原生 1080p (Pro) 和 768p (Standard) 的文生影片與圖生影片功能，具備強大的指令遵循能力以及逼真、符合物理規律的運動表現。

探索

模型分類

Hailuo-2.3 t2v Standard

$0.28/SEC

Hailuo-2.3 t2v Pro

$0.49/SEC

Hailuo-2.3 i2v Standard

Hailuo 02 t2v Standard

$0.28/SEC

Hailuo 02 i2v Standard

模型系列

Wan 2.6

Wan 2.6 is a next-generation AI video generation model from Alibaba’s Tongyi Lab, designed for professional-quality, multimodal video creation. It combines advanced narrative understanding, multi-shot storytelling, and native audio–visual synchronization to produce smooth 1080p videos up to 15 s long from text and reference inputs. Wan 2.6 also supports character consistency and role-guided generation, enabling creators to turn scripts into cohesive scenes with seamless motion and lip syncing. Its efficiency and rich creative control make it ideal for short films, advertising, social media content, and automated video workflows.

探索

模型分類

Wan-2.6 Image-to-video Flash

$0.025/SEC$0.018/SEC

Wan-2.6 Image-to-image

$0.03/PIC$0.021/PIC

Wan-2.6 Image-to-video

$0.1/SEC$0.07/SEC

Wan-2.6 Video-to-video

$0.1/SEC$0.07/SEC

Wan-2.6 Text-to-video

$0.1/SEC$0.07/SEC

Wan 2.6 Spicy Image-to-Video

$0.1/SEC$0.07/SEC

Wan-2.6 Text-to-image

$0.03/PIC$0.021/PIC

模型系列

Flux.2 Image

Developed by Black Forest Labs, FLUX.2 is a powerhouse 32-billion parameter rectified flow Transformer model that redefines creative workflows by unifying AI image generation, editing, and composition. It transforms complex text prompts into high-fidelity visuals while offering integrated tools for professional-grade editing at resolutions up to 2K, providing a streamlined, all-in-one solution for digital artists and designers seeking unmatched precision and scalability in their visual content creation.

模型分類

Flux Kontext Dev Lora

FLUX.2 Flex Text-to-image

$0.05/PIC

FLUX.2 Pro Edit

$0.03/PIC

FLUX.2 Pro Text-to-image

$0.03/PIC

Flux Dev Lora

$0.015/PIC

模型系列

GPT Image

The GPT Image Family is OpenAI's latest suite of multimodal image generation and editing models, built on the powerful GPT architecture. This family includes three tiers — GPT Image-1, GPT Image-1.5, and GPT Image-1 Mini — each available in both Text-to-Image and Image-to-Image variants. Combining GPT's world-class language understanding with DALL·E-class visual synthesis, these models deliver exceptional prompt adherence, photorealistic rendering, and creative versatility across illustration, photography, design, and visualization tasks. The series offers flexible pricing and quality tiers to match any workflow — from rapid prototyping and high-volume content production to professional-grade final deliverables. Whether you need ultra-fast iterations at minimal cost or maximum quality for brand campaigns, the GPT Image Family has a solution tailored to your needs.

探索

模型分類

Openai GPT Image-1.5 Text-to-image

$0.009/PIC$0.008/PIC

Openai GPT Image-1.5 Edit

$0.009/PIC$0.008/PIC

Openai GPT Image-1 Text-to-image

$0.011/PIC$0.009/PIC

Openai GPT Image-1 Edit

$0.011/PIC$0.009/PIC

Openai GPT Image-1 Mini Text-to-image

$0.005/PIC$0.004/PIC

Openai GPT Image-1 Mini Edit

$0.005/PIC$0.004/PIC

模型系列

Seedance 1.5

ByteDance’s Seedance 1.5 Pro is a powerful AI video generation model that seamlessly integrates native audio with film-grade cinematography. Engineered for emotional storytelling and superior visual quality, it enables creators to produce immersive, narrative-driven content for professional filmmaking and advertising, setting a new standard for artistic precision and production efficiency.

探索

模型分類

Seedance v1.5 Pro Image-to-Video

$0.052/SEC$0.047/SEC

Seedance v1.5 Pro Text-to-Video

$0.052/SEC$0.047/SEC

Seedance v1.5 Pro Image-to-Video Fast

$0.02/SEC$0.018/SEC

Seedance v1.5 Pro Text-to-Video Fast

$0.02/SEC$0.018/SEC

模型系列

ERNIE Image

ERNIE-Image is an open-weight text-to-image model developed by the ERNIE-Image Team at Baidu, built on a single-stream Diffusion Transformer (DiT) with 8B parameters and paired with a lightweight Prompt Enhancer that rewrites short prompts into richer, more structured descriptions before passing them to the diffusion backbone. NYU Shanghai RITS Released on April 15, 2026 under the Apache 2.0 license, it transforms natural language descriptions into detailed imagery with particular strength in text rendering and structured layout generation. ERNIE-Image is designed not only for strong visual quality, but for controllability in practical generation scenarios where accurate content realization matters as much as aesthetics — making it well-suited for commercial posters, comics, multi-panel layouts, and other content creation tasks that require both visual quality and precise control.

探索

模型分類

Baidu ERNIE Image Turbo Text-to-image

模型系列

Seedream 4

Seedream v4, a cutting-edge image generation model by ByteDance, redefines creative workflows by combining lightning-fast inference speeds with breathtaking 4K high-definition output. Beyond its raw performance, the model leverages advanced knowledge and reasoning to interpret complex prompts with precision, enabling seamless prompt-based editing and a vast spectrum of versatile artistic styles that make it the ultimate solution for professional design, content creation, and digital marketing.

探索

模型分類

Seedream v4

$0.03/PIC$0.027/PIC

Seedream v4 Sequential

$0.03/PIC$0.027/PIC

Seedream v4 Edit

$0.03/PIC$0.027/PIC

Seedream v4 Edit Sequential

$0.03/PIC$0.027/PIC

模型系列

Imagen Image

Imagen is Google’s diffusion-based image generation family, designed for photorealism, creativity, and scalable content workflows. With options from fast inference to ultra-high fidelity, Imagen balances speed, detail, and enterprise reliability.

模型分類

模型系列

Seedance

Seedance is ByteDance’s family of video generation models, built for speed, realism, and scale. Available in Lite and Pro versions across 480p, 720p, and 1080p, Seedance transforms text and images into smooth, cinematic video on Atlas Cloud.

探索

模型分類

Seedance v1.5 Pro Image-to-Video Spicy

$0.049/SEC

Seedance v1 Pro Fast Text-to-video

$0.01/SEC$0.009/SEC

Seedance v1 Pro Fast Image-to-video

$0.01/SEC$0.009/SEC

Seedance v1 Pro t2v 1080p

$0.122/SEC$0.11/SEC

Seedance v1 Pro t2v 720p

$0.052/SEC$0.047/SEC

Seedance v1 Pro t2v 480p

$0.024/SEC$0.022/SEC

Seedance v1 Pro i2v 720p

$0.052/SEC$0.047/SEC

Seedance v1 Pro i2v 480p

$0.024/SEC$0.022/SEC

Seedance v1 Pro i2v 1080p

$0.122/SEC$0.11/SEC

模型系列

Van Video

Built on the Wan 2.5 and 2.6 frameworks, Van Model is a flagship AI video series that delivers superior high-resolution outputs with unmatched creative freedom. By blending cinematic 3D VAE visuals with Flow Matching dynamics, it leverages proprietary compute distillation to offer ultra-fast inference speeds at a fraction of the cost, making it the premier engine for scalable, high-frequency video production on a budget.

探索

模型分類

Van-2.6 Text-to-video

$0.068/SEC

Van-2.6 Image-to-video

$0.068/SEC

Van-2.5 Image-to-video

$0.054/SEC

Van-2.5 Text-to-video

$0.068/SEC

模型系列

Wan 2.5

Wan 2.5 is Alibaba’s state-of-the-art multimodal video generation model, capable of producing high-fidelity, audio-synchronized videos from text or images. It delivers realistic motion, natural lighting, and strong prompt alignment across 480p to 1080p outputs—ideal for creative and production-grade workflows.

探索

模型分類

Wan-2.5 Text-to-video Fast

$0.071/SEC

Wan-2.5 Text-to-video

$0.05/SEC$0.035/SEC

Wan-2.5 Image-to-video

$0.05/SEC$0.035/SEC

Wan-2.5 Image-to-video Fast

$0.071/SEC

Wan-2.5 Image Edit

$0.03/PIC$0.021/PIC

Wan-2.5 Text-to-image

$0.03/PIC$0.021/PIC

模型系列

Kling

Kling AI is a text-to-video model developed by Kuaishou that creates realistic, high-quality videos from text prompts. It focuses on smooth motion, stable frames, and natural-looking scenes. Kling works well for short videos, ads, and marketing content, helping creators save time and reduce production costs. With strong performance in video consistency and realism, Kling AI is becoming a popular choice in the AI video generation space.

探索

模型分類

Kling v2.6 Pro Avatar

$0.112/SEC$0.095/SEC

Kling v2.6 Std Avatar

$0.056/SEC$0.048/SEC

Kling v2.6 Pro Motion Control

$0.112/SEC$0.095/SEC

Kling v2.6 Std Motion Control

$0.07/SEC$0.06/SEC

Kling v2.6 Pro Text-to-Video

$0.07/SEC$0.06/SEC

Kling v2.6 Pro Image-to-Video

$0.07/SEC$0.06/SEC

Kling Video O1 Image-to-video

$0.112/SEC$0.095/SEC

Kling Video O1 Text-to-video

$0.112/SEC$0.095/SEC

Kling v2.5 Turbo Pro Text-to-video

$0.07/SEC$0.06/SEC

Kling v2.5 Turbo Pro Image-to-video

$0.07/SEC$0.06/SEC

Kling v2.1 i2v Pro Start-end-frame

$0.098/SEC$0.083/SEC

Kling v1.6 Multi i2v Pro

$0.098/SEC$0.083/SEC

Kling v1.6 Multi i2v Standard

$0.056/SEC$0.048/SEC

Kling Effects

$0.25/SEC$0.212/SEC

kling v2.0 i2v Master

$0.28/SEC$0.238/SEC

Kling v2.1 t2v Master

$0.28/SEC$0.238/SEC

Kling v2.0 t2v Master

$0.28/SEC$0.238/SEC

Kling v2.1 i2v Master

$0.28/SEC$0.238/SEC

Kling v2.1 i2v Pro

$0.098/SEC$0.083/SEC

Kling v1.6 t2v Standard

$0.056/SEC$0.048/SEC

Kling v1.6 i2v Pro

$0.098/SEC$0.083/SEC

Kling v2.1 i2v Standard

$0.056/SEC$0.048/SEC

Kling v1.6 i2v Standard

$0.056/SEC$0.048/SEC

模型系列

Wan 2.2

Wan 2.2 introduces a Mixture-of-Experts (MoE) architecture that enables greater capacity and finer motion control without higher inference cost, supporting both text-to-video and image-to-video generation with high visual fidelity, smooth motion, and cinematic realism optimized for real-world GPU deployment.

探索

模型分類

Wan-2.2-spicy Image-to-video Lora

$0.04/SEC

Wan-2.2-spicy Image-to-video

$0.03/SEC

Wan-2.2-spicy Video Extend

$0.032/SEC

Wan-2.2 Video Character Swap

$0.18/SEC$0.126/SEC

Wan-2.2 Image To Animation

$0.12/SEC$0.084/SEC

模型系列

Tools

Open, advanced large-scale image generative models that power high-fidelity creation and editing with modular APIs, reproducible training, built-in safety guardrails, and elastic, production-grade inference at scale.

探索

模型分類