什么是 Atlas Cloud API？它是如何工作的？

Atlas Cloud 是一个全模态 AI 推理平台，开发者只需一个统一的 AI API，即可调用全球顶级的视频生成 API、图像生成 API 和 LLM API。无需对接多家供应商，一次接入即可统一访问 300 余款精选模型，覆盖全部模态。基础设施、弹性扩容与模型更新均由 Atlas Cloud 托管，让你专注于产品构建本身。

Atlas Cloud API 兼容 OpenAI API 吗？

兼容。Atlas Cloud 提供与 OpenAI 兼容的 API 端点，可作为现有集成的真正无缝替代。如果你正在使用 OpenAI SDK，仅需替换 base URL 与 API Key，无需修改任何业务代码即可完成从 OpenAI 到 Atlas Cloud 的迁移。对于希望寻找覆盖更广、成本更低的 OpenAI 替代方案的开发者，这是最快的迁移路径。

如何快速接入 Atlas Cloud API？

几分钟即可上手。注册一个免费账户，在控制台生成免费的 AI API 密钥，按照文档中的开发者快速入门指南操作即可。大多数开发者可在 5 分钟内完成首次 API 调用，注册无需绑定信用卡。

Atlas Cloud 支持哪些 AI 模型？

Atlas Cloud 支持覆盖全模态的 300 余款模型。视频生成：HappyHorse API、Seedance API、Kling API、Wan API、Veo API 与 Runway API；图像生成：Flux API、GPT Image API 与 Nano Banana API；LLM：DeepSeek API、Qwen API、GLM API 与 MiniMax API。我们坚持 Day 0 同步上线最新模型，让你始终掌握业界前沿能力，无需在多个平台之间切换。

Atlas Cloud API 的计费模式是怎样的？

Atlas Cloud 采用按量付费模式，无月度最低消费，无按席位计费，用多少付多少。在同等模型下，我们的 API 价格始终低于 kie.ai 与 fal.ai，价格页公开了透明的按秒或按 token 计费标准。没有任何隐藏的基础设施成本，所见即所付。

Atlas Cloud 是否支持流式输出与批量处理？

支持。Atlas Cloud 原生支持流式 API 响应（适用于 LLM 实时输出）、面向高吞吐异步任务的批量推理，以及结构化输出。无论是构建低延迟对话产品，还是运行每秒处理数千请求的异步管道，同一平台都能胜任，无需切换配置或引入额外工具链。

Atlas Cloud API 适合企业级生产环境吗？

完全适合。Atlas Cloud 为企业级 AI API 需求而生，已获得 SOC 2（SOC I & II）认证与 HIPAA 合规认证，数据传输与静态存储均加密。对于数据合规要求更严格的组织，我们还提供专有云部署方案与独立基础设施。平台具备产品级稳定性、高并发承载与专属支持团队，可承载任意规模的核心业务负载。

可以把 Atlas Cloud 作为 Replicate 的替代方案吗？

可以，事实上许多开发者正是因此而选择我们。作为 Replicate 替代方案，Atlas Cloud 拥有更广的模型矩阵、更低的单次生成成本，并提供与 OpenAI 兼容的端点以简化集成。如果你也在评估 fal.ai 或 Together AI 的替代方案，Atlas Cloud 在同一平台、同一 API 密钥、同一计费账户下整合了视频、图像、LLM 与音频，让多模态能力一站集成。

Atlas Cloud | 全模态 AI 平台 - 对话、图像、视频、语音统一 API

模型系列

Seedance 2.0

Seedance 2.0 API 为您提供字节跳动多模态视频模型的生产级访问权限——支持四模态输入（文本、图像、视频、音频），以及业界领先的“通用参考”（Universal Reference）系统，可在不同镜头间锁定构图、运镜和角色动作。只需一次 API 调用即可集成导演级控制，统一费率 $0.09/秒，即刻获取密钥，无需排队等待——由企业级正常运行时间与合规性提供保障。Seedance 2.0 原生 4K 现已于 2026 年 6 月正式上线！

探索

模型分类

Seedance 2.0 Text-to-Video

$0.112/SEC

Seedance 2.0 Image-to-Video

$0.112/SEC

Seedance 2.0 Reference-to-Video

$0.112/SEC

Seedance 2.0 Fast Text-to-Video

$0.09/SEC

Seedance 2.0 Fast Image-to-Video

$0.09/SEC

Seedance 2.0 Fast Reference-to-Video

$0.09/SEC

模型系列

Grok-Imagine

Grok Imagine Image Quality is xAI's latest AI image generation model, delivering studio-grade visuals with up to 2K resolution and razor-sharp detail. It offers best-in-class text rendering across multiple languages, photorealistic outputs with natural lighting, rich textures, and believable physics, plus tighter prompt following and image editing with reference inputs for precise creative control. Ideal for hero images, ad creatives, product renders, and brand-grade visuals.

探索

模型分类

xAI TTS v1

$0.015/PIC

Grok Imagine Video v1.5 Image-to-Video

$0.08/SEC

Grok Imagine Image Quality Text-to-Image

$0.05/PIC

Grok Imagine Image Quality Edit

$0.05/PIC

Grok Imagine Video Text-to-Video

$0.05/SEC

Grok Imagine Video Image-to-Video

$0.05/SEC

Grok Imagine Video Reference-to-Video

$0.05/SEC

Grok Imagine Video Extend

$0.07/SEC

Grok Imagine Video Edit

$0.07/SEC

Grok Imagine Image Edit

$0.02/PIC

Grok Imagine Image Text-to-Image

$0.02/PIC

模型系列

Happy Horse

HappyHorse 在 Artificial Analysis Video Arena 排行榜的文本生成视频和图像生成视频领域均位居榜首。HappyHorse 1.0 API 和 HappyHorse 1.1 API 使开发者能够直接访问 Alibaba 的统一视频模型——无需多阶段处理流程，只需一次集成即可支持两种模态。直接从您的代码中生成带有同步音频的 1080p 视频。

探索

模型分类

HappyHorse-1.1 Text-to-video

$0.14/SEC$0.112/SEC

HappyHorse-1.1 Image-to-video

$0.14/SEC$0.112/SEC

HappyHorse-1.1 Reference-to-video

$0.14/SEC$0.112/SEC

HappyHorse-1.0 Text-to-video

$0.14/SEC$0.112/SEC

HappyHorse-1.0 Image-to-video

$0.14/SEC$0.112/SEC

HappyHorse-1.0 Reference-to-video

$0.14/SEC$0.112/SEC

HappyHorse-1.0 Video-edit

$0.14/SEC$0.112/SEC

模型系列

GPT Image 2

GPT Image 2 API 为开发者提供了访问 OpenAI 最新图像模型的途径，它是 GPT Image 1.5 的继任者。该模型可生成和编辑图像，能够在拉丁和 CJK 文字上实现准确的文本渲染，并在海报、样机和信息图表方面具备强大的排版能力。在 Atlas Cloud 上，您可以通过一个统一的 API 与 300 多个模型一起访问它，并享受免费额度、99.99% 的正常运行时间，且无需 OpenAI 组织验证。

探索

模型分类

Openai GPT Image 2 Text-to-Image

$0.009/PIC

Openai GPT Image 2 Edit

$0.01/PIC

GPT Image 2 Developer Edit

$0.01/PIC$0.005/PIC

GPT Image 2 Developer Text-to-Image

$0.009/PIC$0.004/PIC

模型系列

Seedance 2.0 Mini

Seedance 2.0 Mini 将 ByteDance 的多模态视频生成技术引入到对速度和成本要求极高的工作流中。它以更轻量的占用空间提供 Seedance 2.0 的核心能力——更快的生成速度、更低的单条视频成本，并且使用您现有的同款 API 集成。对于运行高吞吐量流水线或进行大规模原型设计的团队来说，Mini 是最实用的默认选择。

探索

模型分类

Seedance 2.0 Mini Reference-to-Video

$0.056/SEC

Seedance 2.0 Mini Image-to-Video

$0.056/SEC

Seedance 2.0 Mini Text-to-Video

$0.056/SEC

模型系列

MAI

MAI-Image-2.5 是 Microsoft 最新推出的逼真图像生成与编辑模型系列，专为商业设计、产品摄影和品牌级内容创作而打造。提供用于文本生成图像和图像编辑的 standard 和 Flash 变体，以极具竞争力的价格（每张图像起价 0.03 美元）提供同类最佳的 Arena ELO 得分。凭借精准的文本渲染、手术刀级的编辑能力以及自然的人像生成，MAI-Image-2.5 专为需要生产级质量视觉效果且无需承担后期处理开销的团队而设计。

探索

模型分类

MAI-Image-2.5-Flash Text-to-image

$0.03/PIC

MAI-Image-2.5 Edit

$0.058/PIC

MAI-Image-2.5 Text-to-image

$0.05/PIC

MAI-Image-2.5-Flash Edit

$0.038/PIC

模型系列

Wan 2.7

Wan 2.7 API 为开发者提供了 Alibaba 的全能视频套件，涵盖文本生成视频、图像生成视频、参考生成视频和视频编辑，以及图像生成功能。它可以生成长达 15 秒、带同步音频的原生 1080p 视频片段，支持首尾帧控制，以及最多 5 个角色参考。在 Atlas Cloud 上，您可以通过一个统一的 API 访问它以及 300 多种模型，价格从每秒 0.10 美元起，并保证 99.99% 的正常运行时间。

探索

模型分类

Wan-2.7 Text-to-video

$0.1/SEC

Wan-2.7 Image-to-video

$0.1/SEC

Wan-2.7 Reference-to-video

$0.1/SEC

Wan-2.7 Video-edit

$0.1/SEC

Wan-2.7 Text-to-image

$0.03/PIC

Wan-2.7 Image-to-image

$0.03/PIC

Wan-2.7 Pro Text-to-image

$0.075/PIC

Wan-2.7 Pro Image-to-image

$0.075/PIC

模型系列

Nano Banana 2

使用由 Google 的 Gemini 3.1 Flash Image 模型驱动的 Nano Banana 2 API 进行构建。它可生成分辨率高达 4096x2304 的原生 4K 视觉效果，具备准确的文本渲染能力，并在生成和编辑过程中支持多达 14 张参考图像的角色一致性。在 Atlas Cloud 上，您可以通过一个统一的 API 访问它以及其他 300 多个模型，价格低至每张图像 0.04 美元，提供 99.99% 的正常运行时间，并附赠免费额度供您开始使用。

探索

模型分类

Nano Banana 2 Reference to Image

$0.08/PIC

Nano Banana 2 Reference to Image Developer

$0.08/PIC$0.04/PIC

Nano Banana 2 Text-to-Image Developer

$0.08/PIC$0.04/PIC

Nano Banana 2 Text-to-Image

$0.08/PIC

Nano Banana 2 Edit Developer

$0.08/PIC$0.04/PIC

Nano Banana 2 Edit

$0.08/PIC

模型系列

Hunyuan 3D

Hunyuan3D is a state-of-the-art 3D generative foundation model from Tencent that turns text prompts and single images into high-quality, textured 3D meshes. Built on a two-stage pipeline—Hunyuan3D-DiT for shape generation via flow-matching diffusion and Hunyuan3D-Paint for multi-view texture synthesis—it produces clean geometry with full PBR materials ready for game engines, AR/VR, 3D printing, and DCC tools. Available in Pro (up to 1.5M faces, 4K PBR textures) and Rapid (2–3 minute lightweight generation) tiers, with both Text-to-3D and Image-to-3D entry points, Hunyuan3D is the premier AI 3D toolkit for game developers, e-commerce teams, and 3D content studios. Generations start at $0.02 each.

探索

模型分类

Hunyuan 3D Rapid Image-to-3D

$0.02/PIC

Hunyuan 3D Rapid Text-to-3D

$0.02/PIC

Hunyuan 3D Pro Image-to-3D

$0.02/PIC

Hunyuan 3D Pro Text-to-3D

$0.02/PIC

模型系列

PixVerse

PixVerse, developed by AISphere, is a video generation model series built around one idea: giving creators director-level control over every frame. V6 is the flagship generation model, covering text-to-video, image-to-video, reference-to-video, start-and-end frame control, and video extension in a single cohesive pipeline. C1 takes a different approach — it is a storyboard-native model designed for multi-shot narrative production, where scene continuity and visual consistency across clips matter as much as individual frame quality. Both series are available on Atlas Cloud, starting from $0.025 per second, with no infrastructure setup required.

探索

模型分类

Pixverse v6 Video-Extend

$0.025/SEC

Pixverse c1 Image-to-Video

$0.03/SEC

Pixverse c1 Start-End-to-Video

$0.03/SEC

Pixverse c1 Reference-to-Video

$0.03/SEC

Pixverse v6 Text-to-Video

$0.025/SEC

Pixverse v6 Image-to-Video

$0.025/SEC

Pixverse v6 Start-End-to-Video

$0.025/SEC

Pixverse v6 Reference-to-Video

$0.025/SEC

Pixverse c1 Text-to-Video

$0.03/SEC

模型系列

Veo 3.1

Google DeepMind’s Veo 3.1 represents a paradigm shift in AI video generation, empowering creators with director-level narrative control and cinematic-grade audio quality that seamlessly integrates with its enhanced visual realism. By bridging the gap between imaginative concepts and photorealistic execution, this advanced model offers a transformative solution for a wide range of application scenarios, from professional filmmaking and high-end advertising to immersive digital content creation.

探索

模型分类

Veo 3.1 Lite Text-to-video

$0.05/SEC

Veo 3.1 Lite Start-End Frame to Video

$0.05/SEC

Veo 3.1 Lite Image-to-video

$0.05/SEC

Veo3.1 Fast Image-to-video

$0.08/SEC

Veo3.1 Fast Text-to-video

$0.08/SEC

Veo3.1 Image-to-video

$0.2/SEC

Veo3.1 Reference-to-video

$0.2/SEC

Veo3.1 Text-to-video

$0.2/SEC

模型系列

Youchuan

The latest Youchuan V8.1 architecture, rebuilt from scratch on GPU-native PyTorch, delivers 4–5× faster generation, true 2048×2048 output without upscaling artifacts, and a signature visual style that remains unmatched by competitors. With the addition of Video V1, Midjourney extends its aesthetic into motion — animating still images into atmospheric 5-second cinematic clips. From brand campaigns to film pre-visualization to game concept art, Youchuan is the premier AI creative tool for professionals who demand both speed and artistry.

探索

模型分类

Youchuan V8.1 Remove Background

$0.086/PIC

Youchuan V8.1 Style Transfer

$0.129/PIC

Youchuan V8.1 Blend

$0.086/PIC

Youchuan V8.1 Image-to-Image

$0.086/PIC

Youchuan V8.1 Image-to-Video

$0.086/SEC

Youchuan V8.1 Text-to-Image

$0.086/PIC

模型系列

Seed 3D

Seed3D V2.0 is ByteDance's second-generation 3D generation foundation model, released April 23, 2026. It transforms single images, video, or text into production-ready 3D assets — complete with full PBR material maps (albedo, normal, metallic, roughness) and simulation-compatible formats. Powered by a coarse-to-fine two-stage Diffusion Transformer and unified PBR pipeline, it achieved a 92.8% win rate over Tripo 3.0 in blind evaluations by 60 professional 3D modelers — covering everything from game assets and e-commerce AR previews to robotics simulation via URDF output.

探索

模型分类

Seed3D 2.0 Image-to-3D

$0.353/PIC

模型系列

Seedream 5.0

Seedream 5.0, developed by ByteDance’s Jimeng AI, is a high-performance AI image generation model that integrates real-time search with intelligent reasoning. Purpose-built for time-sensitive content and complex visual logic, it excels at professional infographics, architectural design, and UI assistance. By blending live web insights with creative precision, Seedream 5.0 empowers commercial branding and marketing with a seamless, logic-driven workflow that turns sophisticated data into stunning, high-fidelity visuals.

探索

模型分类

Seedream v5.0 Lite Edit Sequential

$0.035/PIC$0.032/PIC

Seedream v5.0 Lite Sequential

$0.035/PIC$0.032/PIC

Seedream v5.0 Lite Edit

$0.035/PIC$0.032/PIC

Seedream v5.0 Lite

$0.035/PIC$0.032/PIC

模型系列

Kling 3.0

Kuaishou’s flagship video generation suite, Kling 3.0, features two powerhouse models—Kling 3.0 (Upgraded from Kling 2.6) and Kling 3.0 Omni (Kling O3, Upgraded from Kling O1)—both offering high-fidelity native audio integration. While Kling 3.0 excels in intelligent cinematic storytelling, multilingual lip-syncing, and precision text rendering, Kling O3 sets a new standard for professional-grade subject consistency by supporting custom subjects and voice clones derived from video or image inputs. Together, these models provide a comprehensive solution tailored for cinematic narratives, global marketing campaigns, social media content, and digital skit production.

探索

模型分类

Kling V3.0 Turbo Image-to-Video

$0.112/SEC$0.095/SEC

Kling V3.0 Turbo Text-to-Video

$0.112/SEC$0.095/SEC

Kling Video O3 4K Image-to-Video

$0.42/SEC$0.357/SEC

Kling Video O3 4K Text-to-Video

$0.42/SEC$0.357/SEC

Kling v3.0 4K Image-to-Video

$0.42/SEC$0.357/SEC

Kling v3.0 Std Image-to-Video

$0.084/SEC$0.071/SEC

Kling v3.0 Pro Image-to-Video

$0.112/SEC$0.095/SEC

Kling v3.0 Pro Text-to-Video

$0.112/SEC$0.095/SEC

Kling v3.0 4K Text-to-Video

$0.42/SEC$0.357/SEC

Kling v3.0 Std Text-to-Video

$0.084/SEC$0.071/SEC

Kling Video O3 Pro Video-Edit

$0.168/SEC$0.143/SEC

Kling Video O3 Pro Reference-to-Video

$0.112/SEC$0.095/SEC

Kling Video O3 Pro Image-to-Video

$0.112/SEC$0.095/SEC

Kling Video O3 Pro Text-to-Video

$0.112/SEC$0.095/SEC

Kling Video O3 Std Video-Edit

$0.126/SEC$0.107/SEC

Kling Video O3 Std Reference-to-Video

$0.084/SEC$0.071/SEC

Kling Video O3 Std Image-to-Video

$0.084/SEC$0.071/SEC

Kling Video O3 Std Text-to-Video

$0.084/SEC$0.071/SEC

模型系列

Seedream 4.5

Seedream 4.5, developed by ByteDance’s Jimeng AI, is a versatile, high-fidelity model that unifies creative generation with precise image editing. Engineered for professional consistency and intricate text rendering, it excels at multi-subject fusion, brand identity, and high-resolution marketing assets. By bridging spatial logic with artistic control, Seedream 4.5 empowers designers with a seamless, instruction-driven workflow that transforms complex concepts into polished, commercial-grade visuals.

模型分类

Seedream v4.5 Sequential

$0.04/PIC$0.036/PIC

Seedream v4.5 Edit Sequential

$0.04/PIC$0.036/PIC

模型系列

Vidu

Vidu, a joint innovation by Shengshu AI and Tsinghua University, is a high-performance video model powered by the original U-ViT architecture that blends Diffusion and Transformer technologies. It delivers long-form, highly consistent, and dynamic video content tailored for professional filmmaking, animation design, and creative advertising. By streamlining high-end visual production, Vidu empowers creators to transform complex ideas into cinematic reality with unprecedented efficiency.

探索

模型分类

Vidu Q3-Mix Reference to Video

$0.125/SEC$0.106/SEC

Vidu Q3 Reference to Video

$0.05/SEC$0.042/SEC

Vidu Q3-Pro Start-end-to-video

$0.05/SEC$0.042/SEC

Vidu Q3-Turbo Image-to-video

$0.04/SEC$0.034/SEC

Vidu Q3-Turbo Start-end-to-video

$0.04/SEC$0.034/SEC

Vidu Q3-Turbo Text-to-video

$0.04/SEC$0.034/SEC

Vidu Q3-Pro Image-to-video

$0.05/SEC$0.042/SEC

Vidu Q3-Pro Text-to-video

$0.05/SEC$0.042/SEC

Vidu Reference-to-Video Q1

$0.4/SEC

Vidu Reference-to-Video 2.0

$0.2/SEC

Vidu Start-End-to-Video 2.0

$0.075/SEC

Image-to-video-2.0

$0.075/SEC

Vidu Q2-Turbo Image-to-video

$0.03/SEC$0.026/SEC

Vidu Q2-Pro Reference-to-video

$0.1/SEC$0.085/SEC

Vidu Q2 Reference-to-video

$0.075/SEC$0.064/SEC

Vidu Q2-Pro-Fast Start-end-to-video

$0.04/SEC$0.034/SEC

Vidu Q2-Pro Start-end-to-video

$0.04/SEC$0.034/SEC

Vidu Q2-Turbo Start-end-to-video

$0.03/SEC$0.026/SEC

Vidu Q2-Pro-Fast Image-to-video

$0.04/SEC$0.034/SEC

Vidu Q2-Pro Image-to-video

$0.04/SEC$0.034/SEC

Vidu Q2 Text-to-video

$0.05/SEC$0.042/SEC

Vidu Q1 Image-to-video

$0.4/SEC$0.34/SEC

Vidu Q1 Reference-to-video

$0.4/SEC$0.34/SEC

Vidu Q1 Start-end-to-video

$0.4/SEC$0.34/SEC

Vidu Q1 Text-to-video

$0.4/SEC$0.34/SEC

模型系列

Qwen Image

Qwen Image 2.0 is Alibaba Cloud's latest image generation model series from the Tongyi Qianwen family, comprising 4 models optimized for different use cases. This series delivers professional-grade image generation and editing capabilities with exceptional cost-performance ratio, supporting up to 2K resolution output and demonstrating outstanding performance in prompt adherence, detail rendering, and style consistency. Whether for text-to-image or image-to-image tasks, Qwen Image 2.0 provides developers, marketing teams, and content creators with efficient and reliable visual content production solutions. The series includes two tiers: Standard and Professional. The Standard edition is ideal for daily content production and cost-effective batch image generation, while the Professional edition delivers the highest quality visual output, designed for professional production workflows with stringent image quality requirements. Qwen-Image, a lightweight 7B foundation model by Alibaba, transforms long-form prompts up to 1,000 tokens into stunning native 2K (2048x2048) resolution images. It excels in Chinese text rendering, accurately handling complex layouts and classical scripts, making it the premier AI tool for high-end graphic design and cross-cultural content creation.

探索

模型分类

Qwen Image 2.0 Text-to-image

$0.035/PIC$0.028/PIC

Qwen Image 2.0 Edit

$0.035/PIC$0.028/PIC

Qwen Image 2.0 Pro Edit

$0.075/PIC$0.06/PIC

Qwen Image 2.0 Pro Text-to-image

$0.075/PIC$0.06/PIC

Qwen-Image Edit Plus 20251215

Qwen-Image Text-to-image Max

$0.075/PIC$0.052/PIC

Qwen-Image Text-to-image Plus

Qwen Image Text-to-image

$0.035/PIC$0.024/PIC

模型系列

Nano Banana

Google’s Nano Banana (Gemini 3 Image) series, featuring both standard and Pro models, combines deep semantic understanding with seamless integration for precise detail control. While the standard version delivers high-quality 1K outputs, Nano Banana Pro elevates professional workflows with versatile 1K/2K/4K resolution options with higher quality, making it the ideal solution for any creative or commercial application.

探索

模型分类

Nano Banana Pro Text-to-image Ultra

$0.15/PIC

Nano Banana Pro Edit Ultra

$0.15/PIC

Nano Banana Pro Text-to-image

$0.14/PIC

Nano Banana Pro Edit

$0.14/PIC

Nano Banana Pro Text-to-image Developer

$0.14/PIC$0.07/PIC

Nano Banana Text-to-image Developer

$0.038/PIC$0.019/PIC

Nano Banana Pro Edit Developer

$0.14/PIC$0.07/PIC

Nano Banana Edit Developer

$0.038/PIC$0.019/PIC

Nano Banana Text-to-image

$0.038/PIC

Nano Banana Edit

$0.038/PIC

模型系列

Hailuo Video

MiniMax Hailuo 视频模型提供原生 1080p (Pro) 和 768p (Standard) 的文生视频与图生视频功能，具备强大的指令遵循能力以及逼真、符合物理规律的运动表现。

探索

模型分类

Hailuo-2.3 t2v Standard

$0.28/SEC

Hailuo-2.3 t2v Pro

$0.49/SEC

Hailuo-2.3 i2v Standard

Hailuo 02 t2v Standard

$0.28/SEC

Hailuo 02 i2v Standard

模型系列

Wan 2.6

Wan 2.6 is a next-generation AI video generation model from Alibaba’s Tongyi Lab, designed for professional-quality, multimodal video creation. It combines advanced narrative understanding, multi-shot storytelling, and native audio–visual synchronization to produce smooth 1080p videos up to 15 s long from text and reference inputs. Wan 2.6 also supports character consistency and role-guided generation, enabling creators to turn scripts into cohesive scenes with seamless motion and lip syncing. Its efficiency and rich creative control make it ideal for short films, advertising, social media content, and automated video workflows.

探索

模型分类

Wan-2.6 Image-to-video Flash

$0.025/SEC$0.018/SEC

Wan-2.6 Image-to-image

$0.03/PIC$0.021/PIC

Wan-2.6 Image-to-video

$0.1/SEC$0.07/SEC

Wan-2.6 Video-to-video

$0.1/SEC$0.07/SEC

Wan-2.6 Text-to-video

$0.1/SEC$0.07/SEC

Wan 2.6 Spicy Image-to-Video

$0.1/SEC$0.07/SEC

Wan-2.6 Text-to-image

$0.03/PIC$0.021/PIC

模型系列

Flux.2 Image

Developed by Black Forest Labs, FLUX.2 is a powerhouse 32-billion parameter rectified flow Transformer model that redefines creative workflows by unifying AI image generation, editing, and composition. It transforms complex text prompts into high-fidelity visuals while offering integrated tools for professional-grade editing at resolutions up to 2K, providing a streamlined, all-in-one solution for digital artists and designers seeking unmatched precision and scalability in their visual content creation.

模型分类

Flux Kontext Dev Lora

FLUX.2 Flex Text-to-image

$0.05/PIC

FLUX.2 Pro Edit

$0.03/PIC

FLUX.2 Pro Text-to-image

$0.03/PIC

Flux Dev Lora

$0.015/PIC

模型系列

GPT Image

The GPT Image Family is OpenAI's latest suite of multimodal image generation and editing models, built on the powerful GPT architecture. This family includes three tiers — GPT Image-1, GPT Image-1.5, and GPT Image-1 Mini — each available in both Text-to-Image and Image-to-Image variants. Combining GPT's world-class language understanding with DALL·E-class visual synthesis, these models deliver exceptional prompt adherence, photorealistic rendering, and creative versatility across illustration, photography, design, and visualization tasks. The series offers flexible pricing and quality tiers to match any workflow — from rapid prototyping and high-volume content production to professional-grade final deliverables. Whether you need ultra-fast iterations at minimal cost or maximum quality for brand campaigns, the GPT Image Family has a solution tailored to your needs.

探索

模型分类

Openai GPT Image-1.5 Text-to-image

$0.009/PIC$0.008/PIC

Openai GPT Image-1.5 Edit

$0.009/PIC$0.008/PIC

Openai GPT Image-1 Text-to-image

$0.011/PIC$0.009/PIC

Openai GPT Image-1 Edit

$0.011/PIC$0.009/PIC

Openai GPT Image-1 Mini Text-to-image

$0.005/PIC$0.004/PIC

Openai GPT Image-1 Mini Edit

$0.005/PIC$0.004/PIC

模型系列

Seedance 1.5

ByteDance’s Seedance 1.5 Pro is a powerful AI video generation model that seamlessly integrates native audio with film-grade cinematography. Engineered for emotional storytelling and superior visual quality, it enables creators to produce immersive, narrative-driven content for professional filmmaking and advertising, setting a new standard for artistic precision and production efficiency.

探索

模型分类

Seedance v1.5 Pro Image-to-Video

$0.052/SEC$0.047/SEC

Seedance v1.5 Pro Text-to-Video

$0.052/SEC$0.047/SEC

Seedance v1.5 Pro Image-to-Video Fast

$0.02/SEC$0.018/SEC

Seedance v1.5 Pro Text-to-Video Fast

$0.02/SEC$0.018/SEC

模型系列

ERNIE Image

ERNIE-Image is an open-weight text-to-image model developed by the ERNIE-Image Team at Baidu, built on a single-stream Diffusion Transformer (DiT) with 8B parameters and paired with a lightweight Prompt Enhancer that rewrites short prompts into richer, more structured descriptions before passing them to the diffusion backbone. NYU Shanghai RITS Released on April 15, 2026 under the Apache 2.0 license, it transforms natural language descriptions into detailed imagery with particular strength in text rendering and structured layout generation. ERNIE-Image is designed not only for strong visual quality, but for controllability in practical generation scenarios where accurate content realization matters as much as aesthetics — making it well-suited for commercial posters, comics, multi-panel layouts, and other content creation tasks that require both visual quality and precise control.

探索

模型分类

Baidu ERNIE Image Turbo Text-to-image

模型系列

ElevenLabs v3

探索

模型分类

ElevenLabs v3 Text-to-Speech

$0.1/PIC

模型系列

Seedream 4

Seedream v4, a cutting-edge image generation model by ByteDance, redefines creative workflows by combining lightning-fast inference speeds with breathtaking 4K high-definition output. Beyond its raw performance, the model leverages advanced knowledge and reasoning to interpret complex prompts with precision, enabling seamless prompt-based editing and a vast spectrum of versatile artistic styles that make it the ultimate solution for professional design, content creation, and digital marketing.

探索

模型分类

Seedream v4

$0.03/PIC$0.027/PIC

Seedream v4 Sequential

$0.03/PIC$0.027/PIC

Seedream v4 Edit

$0.03/PIC$0.027/PIC

Seedream v4 Edit Sequential

$0.03/PIC$0.027/PIC

模型系列

Imagen Image

Imagen is Google’s diffusion-based image generation family, designed for photorealism, creativity, and scalable content workflows. With options from fast inference to ultra-high fidelity, Imagen balances speed, detail, and enterprise reliability.

模型分类

模型系列

Seedance

Seedance API 让开发者能够在 Atlas Cloud 上访问 ByteDance 全系列的视频生成模型，从涵盖 480p 到 1080p 的 Lite 和 Pro 层级，到具备原生音频的多模态 Seedance 2.0 系列。通过一个兼容 OpenAI 的密钥生成电影级的文生视频、图生视频以及参考生视频，无需排队等待，并保证 99.99% 的正常运行时间。

探索

模型分类

Seedance v1.5 Pro Image-to-Video Spicy

$0.049/SEC

Seedance v1 Pro Fast Text-to-video

$0.01/SEC$0.009/SEC

Seedance v1 Pro Fast Image-to-video

$0.01/SEC$0.009/SEC

Seedance v1 Pro t2v 1080p

$0.122/SEC$0.11/SEC

Seedance v1 Pro t2v 720p

$0.052/SEC$0.047/SEC

Seedance v1 Pro t2v 480p

$0.024/SEC$0.022/SEC

Seedance v1 Pro i2v 720p

$0.052/SEC$0.047/SEC

Seedance v1 Pro i2v 480p

$0.024/SEC$0.022/SEC

Seedance v1 Pro i2v 1080p

$0.122/SEC$0.11/SEC

模型系列

Van Video

Built on the Wan 2.5 and 2.6 frameworks, Van Model is a flagship AI video series that delivers superior high-resolution outputs with unmatched creative freedom. By blending cinematic 3D VAE visuals with Flow Matching dynamics, it leverages proprietary compute distillation to offer ultra-fast inference speeds at a fraction of the cost, making it the premier engine for scalable, high-frequency video production on a budget.

探索

模型分类

Van-2.6 Text-to-video

$0.068/SEC

Van-2.6 Image-to-video

$0.068/SEC

Van-2.5 Image-to-video

$0.054/SEC

Van-2.5 Text-to-video

$0.068/SEC

模型系列

Wan 2.5

Wan 2.5 is Alibaba’s state-of-the-art multimodal video generation model, capable of producing high-fidelity, audio-synchronized videos from text or images. It delivers realistic motion, natural lighting, and strong prompt alignment across 480p to 1080p outputs—ideal for creative and production-grade workflows.

探索

模型分类

Wan-2.5 Text-to-video Fast

$0.071/SEC

Wan-2.5 Text-to-video

$0.05/SEC$0.035/SEC

Wan-2.5 Image-to-video

$0.05/SEC$0.035/SEC

Wan-2.5 Image-to-video Fast

$0.071/SEC

Wan-2.5 Image Edit

$0.03/PIC$0.021/PIC

Wan-2.5 Text-to-image

$0.03/PIC$0.021/PIC

模型系列

Kling

Kling AI is a text-to-video model developed by Kuaishou that creates realistic, high-quality videos from text prompts. It focuses on smooth motion, stable frames, and natural-looking scenes. Kling works well for short videos, ads, and marketing content, helping creators save time and reduce production costs. With strong performance in video consistency and realism, Kling AI is becoming a popular choice in the AI video generation space.

探索

模型分类

Kling v2.6 Pro Avatar

$0.112/SEC$0.095/SEC

Kling v2.6 Std Avatar

$0.056/SEC$0.048/SEC

Kling v2.6 Pro Motion Control

$0.112/SEC$0.095/SEC

Kling v2.6 Std Motion Control

$0.07/SEC$0.06/SEC

Kling v2.6 Pro Text-to-Video

$0.07/SEC$0.06/SEC

Kling v2.6 Pro Image-to-Video

$0.07/SEC$0.06/SEC

Kling Video O1 Image-to-video

$0.112/SEC$0.095/SEC

Kling Video O1 Text-to-video

$0.112/SEC$0.095/SEC

Kling v2.5 Turbo Pro Text-to-video

$0.07/SEC$0.06/SEC

Kling v2.5 Turbo Pro Image-to-video

$0.07/SEC$0.06/SEC

Kling v2.1 i2v Pro Start-end-frame

$0.098/SEC$0.083/SEC

Kling v1.6 Multi i2v Pro

$0.098/SEC$0.083/SEC

Kling v1.6 Multi i2v Standard

$0.056/SEC$0.048/SEC

Kling Effects

$0.25/SEC$0.212/SEC

kling v2.0 i2v Master

$0.28/SEC$0.238/SEC

Kling v2.1 t2v Master

$0.28/SEC$0.238/SEC

Kling v2.0 t2v Master

$0.28/SEC$0.238/SEC

Kling v2.1 i2v Master

$0.28/SEC$0.238/SEC

Kling v2.1 i2v Pro

$0.098/SEC$0.083/SEC

Kling v1.6 t2v Standard

$0.056/SEC$0.048/SEC

Kling v1.6 i2v Pro

$0.098/SEC$0.083/SEC

Kling v2.1 i2v Standard

$0.056/SEC$0.048/SEC

Kling v1.6 i2v Standard

$0.056/SEC$0.048/SEC

模型系列

Wan 2.2

Wan 2.2 introduces a Mixture-of-Experts (MoE) architecture that enables greater capacity and finer motion control without higher inference cost, supporting both text-to-video and image-to-video generation with high visual fidelity, smooth motion, and cinematic realism optimized for real-world GPU deployment.

探索

模型分类

Wan-2.2-spicy Image-to-video Lora

$0.04/SEC

Wan-2.2-spicy Image-to-video

$0.03/SEC

Wan-2.2-spicy Video Extend

$0.032/SEC

Wan-2.2 Video Character Swap

$0.18/SEC$0.126/SEC

Wan-2.2 Image To Animation

$0.12/SEC$0.084/SEC

模型系列

Tools

Open, advanced large-scale image generative models that power high-fidelity creation and editing with modular APIs, reproducible training, built-in safety guardrails, and elastic, production-grade inference at scale.

探索

模型分类