How Atlas Cloud Works
Understand how Atlas Cloud connects you to 300+ AI models through a unified API
Architecture Overview
Atlas Cloud is an AI API aggregation platform that sits between your application and multiple AI model providers. Instead of integrating with each provider separately, you use a single Atlas Cloud API key and consistent API endpoints to access 300+ models from dozens of providers.
Your Application
│
▼
Atlas Cloud API ────── Unified authentication, billing, and monitoring
│
├── DeepSeek (V3, Coder)
├── Alibaba (Qwen, Qwen-Image)
├── ByteDance (Seedream, Seedance, Kling)
├── Black Forest Labs (FLUX)
├── MoonshotAI (Kimi)
├── MiniMax (Hailuo)
├── Luma AI (Video)
├── Zhipu AI (GLM)
└── ... 20+ more providers

How Requests Work
Synchronous APIs (LLM / Chat)
LLM chat completions return responses synchronously, just like the OpenAI API:
- You send a POST request to /v1/chat/completions with your prompt
- Atlas Cloud routes the request to the selected model provider
- You receive the response directly (or via streaming chunks)
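As a minimal sketch, the request in the first step can be built like this. The host and model name below are placeholders for illustration, not Atlas Cloud's real values; use the base URL and model IDs from your Console.

```python
import json
import urllib.request

def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a chat completion request in the OpenAI-compatible format."""
    # Placeholder host; substitute the real Atlas Cloud base URL.
    url = "https://api.example.com/v1/chat/completions"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("your-api-key", "example-model", "Hello!")
print(req.get_method())  # POST
```

Sending this request (e.g. with `urllib.request.urlopen`) returns the completion directly, or chunk by chunk if streaming is enabled.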
Client → POST /v1/chat/completions → Atlas Cloud → Model Provider
Client ← Response (text/stream) ← Atlas Cloud ← Model Provider

Asynchronous APIs (Image / Video Generation)
Image and video generation tasks run asynchronously because they take longer to process:
- You send a POST request to submit a generation task
- Atlas Cloud returns a predictionId immediately
- You poll the /api/v1/model/getResult endpoint to check task status
- Once completed, you receive the output URL(s)
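The submit-then-poll loop can be sketched as follows. Here `fetch_status` stands in for the real HTTP call to /api/v1/model/getResult and is injected so the flow is visible without network access; treating "failed" as a terminal status is an assumption for this sketch.

```python
import time
from typing import Callable

def poll_result(prediction_id: str,
                fetch_status: Callable[[str], dict],
                interval: float = 2.0,
                max_attempts: int = 60) -> dict:
    """Poll an async generation task until it reaches a terminal state."""
    for _ in range(max_attempts):
        result = fetch_status(prediction_id)
        if result.get("status") in ("completed", "failed"):
            return result
        time.sleep(interval)  # wait before checking again
    raise TimeoutError(f"task {prediction_id} did not finish in time")

# Simulated provider: reports "processing" once, then "completed".
responses = iter([
    {"status": "processing"},
    {"status": "completed", "output": "https://example.com/image.png"},
])
result = poll_result("abc123", lambda pid: next(responses), interval=0)
print(result["status"])  # completed
```

In production, `fetch_status` would issue the authenticated GET request shown in the diagram below, and a backoff of a few seconds between polls is usually enough for image tasks.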
Client → POST /api/v1/model/generateImage → Atlas Cloud
Client ← { predictionId: "abc123" } ← Atlas Cloud
Client → GET /api/v1/model/getResult?predictionId=abc123 → Atlas Cloud
Client ← { status: "processing" } ← Atlas Cloud
Client → GET /api/v1/model/getResult?predictionId=abc123 → Atlas Cloud
Client ← { status: "completed", output: "https://..." } ← Atlas Cloud

For more details on polling, see Predictions.
API Endpoints Summary
| Endpoint | Method | Type | Description |
|---|---|---|---|
| /v1/chat/completions | POST | Synchronous | LLM chat (OpenAI-compatible) |
| /api/v1/model/generateImage | POST | Asynchronous | Image generation |
| /api/v1/model/generateVideo | POST | Asynchronous | Video generation |
| /api/v1/model/uploadMedia | POST | Synchronous | Upload files for generation tasks |
| /api/v1/model/getResult | GET | Synchronous | Get async task results |
Authentication
All API requests require an API key in the Authorization header:
Authorization: Bearer your-api-key

Get your API key from the Atlas Cloud Console. See the API Keys guide for details.
Key Benefits
One API, 300+ Models
No need to manage multiple provider accounts, API keys, or billing relationships. Atlas Cloud handles all provider integrations for you.
OpenAI SDK Compatible
The LLM API is fully compatible with the OpenAI SDK. Switch to Atlas Cloud by changing just two lines of code — the base URL and API key.
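The two-line switch can be illustrated with plain config dicts; the URLs and key formats here are placeholders, and the real Atlas Cloud base URL comes from the Console:

```python
# Only the base URL and API key differ between the two setups;
# the rest of the OpenAI SDK code stays unchanged.
openai_config = {
    "base_url": "https://api.openai.com/v1",
    "api_key": "sk-your-openai-key",
}
atlas_config = {
    "base_url": "https://api.example.com/v1",  # placeholder Atlas Cloud base URL
    "api_key": "your-atlas-api-key",
}

changed = sorted(k for k in openai_config if openai_config[k] != atlas_config[k])
print(changed)  # ['api_key', 'base_url']
```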
Optimized Infrastructure
Atlas Cloud's inference infrastructure is optimized for speed and reliability:
- Image generation in under 5 seconds
- Video generation in under 2 minutes
- 99.9% API uptime
Unified Billing
One account, one balance, one invoice — regardless of how many models or providers you use. Monitor usage and costs in real-time from the Console.
Next Steps
- Quick Start — Make your first API call
- Predictions — Understanding async task flow
- Upload Files — Upload media for generation workflows
- Model APIs — Explore available models