Wan API: Open Video Generation by Alibaba

The Wan API brings Alibaba's open Wan video models to Atlas Cloud through one unified key. Wan 2.2 pioneered a Mixture-of-Experts architecture for video diffusion, lifting capacity and motion control at the same inference cost. It handles text to video, image to video, and video to video, with first-to-last frame control, extend, and upscaling, all reachable alongside 300+ models.

Every Wan API Model and Mode

Match each job to the right Wan 2.2 variant: the cinematic A14B models, the lightweight TI2V-5B, and turbo builds for speed, across text to video, image to video, and video to video, all through one key on Atlas Cloud.

Model | Description
Wan 2.2 T2V-A14B (Text to Video) | The flagship text-to-video model, using the Mixture-of-Experts architecture for cinematic motion and fine aesthetic control. Best when you need the strongest fidelity from a pure text prompt.
Wan 2.2 I2V-A14B (Image to Video) | Animates a still image into moving footage while preserving subject identity and texture, with an MoE design that holds detail as frames evolve. A fit for product shots and concept art brought to life.
Wan 2.2 V2V (Video to Video) | Transforms existing footage: restyle a clip, shift its tone, or apply a new look while keeping the original motion intact. Turns one source video into several variations without a reshoot.
Wan 2.2 TI2V-5B (Text and Image to Video) | A lighter model that combines text and image inputs in one checkpoint, generating 720p at 24fps. Efficient enough for consumer GPUs, so it suits quick, cost-aware generation.
Wan 2.2 Turbo (Accelerated) | A speed-first path built on an rCM sampler and 4-step distillation, cutting latency while retaining strong fidelity. Built for iteration, batch generation, and prompt testing.
Wan 2.2 Video Extend | Lengthens an existing clip into a longer sequence while keeping motion and lighting continuous with the source. Turns a short draft into a fuller deliverable.
Wan 2.2 Upscale | Sharpens footage to 1080p or 2K while preserving timing and composition, with 4K planned for a later release. A finishing step for drafts and older clips.

Wan API Features

The Wan API brings Alibaba's open Wan 2.2 to Atlas Cloud with a Mixture-of-Experts architecture, cinematic motion control, first-to-last frame guidance, video extend, and upscaling, all behind one unified key.

Mixture-of-Experts Video Architecture

Wan 2.2 was the first video diffusion model to adopt a Mixture-of-Experts design, splitting the denoising process across specialized experts. This raises model capacity and detail without increasing inference cost, so you get richer scenes and more accurate motion at the same compute budget.
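As a toy illustration of why a Mixture-of-Experts denoiser can add capacity without adding per-step cost, the sketch below routes every denoising step to exactly one of two experts based on the noise level. The two-expert split, the `boundary` value, and the stand-in expert functions are assumptions made for illustration; this is not Wan 2.2's actual implementation.

```python
from typing import List, Tuple


def high_noise_expert(x: float, t: float) -> float:
    # Stand-in for an expert specialised in early, noisy steps (layout, motion).
    return x * 0.5


def low_noise_expert(x: float, t: float) -> float:
    # Stand-in for an expert specialised in late, low-noise steps (fine detail).
    return x * 0.9


def denoise(x: float, timesteps: List[float], boundary: float = 0.5) -> Tuple[float, List[str]]:
    """Run exactly one expert per step; return the result and the routing log."""
    used: List[str] = []
    for t in timesteps:
        if t >= boundary:  # assumed switch point between the two experts
            x = high_noise_expert(x, t)
            used.append("high")
        else:
            x = low_noise_expert(x, t)
            used.append("low")
    return x, used


timesteps = [1.0, 0.75, 0.5, 0.25]
result, used = denoise(8.0, timesteps)
```

Only one expert runs at each step, so the number of expert calls equals the number of steps no matter how many parameters the two experts hold in total.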

Cinematic Motion and Aesthetic Control with the Wan API

Trained on curated data labeled for lighting, composition, contrast, and color, the Wan API gives fine control over the look of a shot. Wan 2.2 handles complex motion, dynamic camera moves, and fluid transitions, so prompts like an aerial orbit or a handheld tracking shot render as intended.

Text, Image, and Video to Video

Wan 2.2 covers text to video, image to video, and video to video within one model family, so you can start from a prompt, animate a still, or transform existing footage. Switching modes is a parameter change, which keeps a mixed pipeline on a single integration.
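A minimal sketch of what "switching modes is a parameter change" could look like on the client side. The model identifiers and the `image_url` / `video_url` field names are placeholders invented for this example, not documented Atlas Cloud values; check the model list and API reference for the real ones.

```python
from typing import Any, Dict

# Hypothetical model identifiers, one per mode.
MODEL_BY_MODE = {
    "t2v": "wan-2.2/t2v-a14b",
    "i2v": "wan-2.2/i2v-a14b",
    "v2v": "wan-2.2/v2v",
}

# Hypothetical name of the extra input each mode needs (None for text only).
REQUIRED_INPUT = {"t2v": None, "i2v": "image_url", "v2v": "video_url"}


def build_request(mode: str, prompt: str, **inputs: str) -> Dict[str, Any]:
    """Build a request payload for the given mode, validating its input."""
    if mode not in MODEL_BY_MODE:
        raise ValueError(f"unknown mode: {mode}")
    required = REQUIRED_INPUT[mode]
    if required and required not in inputs:
        raise ValueError(f"mode {mode!r} needs {required}")
    payload: Dict[str, Any] = {"model": MODEL_BY_MODE[mode], "prompt": prompt}
    if required:
        payload[required] = inputs[required]
    return payload


text_job = build_request("t2v", "aerial orbit over a lighthouse at dusk")
image_job = build_request("i2v", "slow push-in", image_url="https://example.com/still.png")
```

Keeping the mode-to-model mapping in one table means the rest of the pipeline only ever passes a mode string.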

First-to-Last Frame Control with the Wan API

The Wan API supports first-and-last frame guidance, letting you set the opening and closing frames and have the model generate the motion between them. This gives directors precise control over how a shot begins and resolves, useful for product reveals and scripted beats.

Video Extend and Upscaling

Beyond fresh generation, Wan 2.2 extends existing clips into longer sequences while preserving motion, and upscales footage to 1080p or 2K while keeping timing and composition intact. Together these turn short drafts into polished, longer deliverables without a reshoot.

Turbo Acceleration with the Wan API

The Wan API offers a turbo path built on an rCM sampler and 4-step distillation, compressing the denoising steps for low-latency generation. It keeps strong visual fidelity while cutting wait time, which suits iteration, batch runs, and prompt testing.

One Prompt Across the Wan API and Beyond

Run the same prompt through the Wan API and other leading video models on Atlas Cloud, and compare how each handles cinematic motion, restyling, and open-source flexibility in a single scene.

Prompt

Cinematic multi-shot mood piece in 8 seconds, controlled lighting and color. Shot 1, slow dolly-in: a dim study at night, a single desk lamp pooling warm light over an open book as dust drifts in the beam. Shot 2, hard cut: a wide shot of a lone figure standing at a floor-to-ceiling window, city lights bokeh-soft behind rain on the glass. Shot 3, close-up: fingers trace the rim of a coffee cup, steam curling upward in the cool blue light. Shot 4, low angle: the figure turns toward a doorway where warm hallway light spills in, silhouette sharpening. Shot 5, dramatic wide: the room in soft dawn light, curtains glowing as the lamp clicks off. Rich color grading, deliberate lighting shifts, shallow depth of field, cinematic, crisp 1080p.

[Video comparison: Wan 2.2, Pixverse v6, Veo 3.1]

Prompt

Restyle and motion showcase in 8 seconds, hard cuts. Shot 1: ordinary daytime street footage of a person walking through a market, transformed into a warm, painterly illustrated style while the walking motion stays natural and continuous. Shot 2, hard cut: the same scene restyled into a cool, cinematic teal-and-orange film look, crowd movement preserved. Shot 3: a slow push-in on a fruit stall as the style shifts to soft watercolor, colors bleeding gently at the edges. Consistent underlying motion across every restyle, smooth transitions, clean detail, crisp 1080p.

[Video comparison: Wan 2.2, Pixverse v6, Veo 3.1]

What You Can Build with the Wan API

From social clips and product videos to previs, restyling, and self-hosted pipelines, the Wan API turns Alibaba's open Wan 2.2 into production features through one unified key on Atlas Cloud.

Cinematic Social and Short-Form Video

Turn a prompt or a single image into short, cinematic clips for TikTok, Reels, and Shorts, with the motion control and aesthetic grading Wan 2.2 is built for. Creator tools and social teams can produce polished vertical content without a shoot.

Product and Marketing Videos with the Wan API

Animate a product photo into a moving showcase with the Wan API, using image to video and first-to-last frame control to script how the reveal opens and closes. Marketing teams get repeatable ad clips from stills, sized for each channel.

Previs and Storyboards

Generate quick motion references from scripts and concept art before committing to a full shoot. Wan 2.2's camera and motion control let directors test staging, pacing, and shot transitions cheaply, then iterate before production.

Restyle and Repurpose Existing Footage with the Wan API

Use the Wan API's video to video and extend modes to restyle clips, shift tone, or lengthen a scene while keeping motion continuous. One source clip becomes several variations or a longer cut without reshooting.

Remaster and Upscale Video

Sharpen and lengthen older or low-resolution footage with Wan 2.2's upscaling to 1080p or 2K and its extend mode, preserving timing and composition. This gives archives and rough drafts a clean, deliverable finish.

Self-Hosted or Cloud Video Pipelines with the Wan API

Because Wan 2.2 is open under Apache 2.0, you can self-host the weights or reach the Wan API on Atlas Cloud to skip the GPU setup. Wire it into an automated pipeline that turns rows of data into finished clips at scale, all through one integration.
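The "rows of data into finished clips" idea can be sketched as a small step that turns CSV rows into generation jobs. The column names, the prompt template, and the model identifier below are all hypothetical; a real pipeline would hand each job to whatever request and polling code it uses.

```python
import csv
import io
from typing import Dict, Iterator

# Hypothetical prompt template; the placeholders match the CSV column names.
PROMPT_TEMPLATE = "Slow orbit around a {product} on the {surface}, {lighting} lighting"

SAMPLE_CSV = (
    "sku,product,surface,lighting\n"
    "A1,ceramic mug,oak table,warm morning\n"
    "B2,running shoe,concrete floor,cool studio\n"
)


def rows_to_jobs(csv_text: str, model: str = "wan-2.2/t2v-a14b") -> Iterator[Dict[str, str]]:
    """Yield one generation job per CSV row (the model id is a placeholder)."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield {
            "id": row["sku"],
            "model": model,
            "prompt": PROMPT_TEMPLATE.format(**row),
        }


jobs = list(rows_to_jobs(SAMPLE_CSV))
```

Carrying the row's `sku` through as the job `id` makes it easy to match finished clips back to the source data.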

How the Wan API Compares

See how the Wan API lines up against other leading video models on architecture, inputs, and licensing, so you can pick the model that fits, all reachable on Atlas Cloud.

Model | Provider | Architecture | Inputs | Open Weights | Best For
Wan 2.2 | Alibaba | Mixture-of-Experts diffusion | Text, image, video | Yes (Apache 2.0) | Cinematic open-source video with self-host option
Seedance 2.0 | ByteDance | Proprietary | Text, image, video, audio | No | Multimodal, reference-driven cinematic video
Kling 3.0 | Kuaishou | Proprietary | Text, image, video | No | AI Director storytelling, multilingual dialogue
Hailuo | MiniMax | Proprietary | Text, image | No | Lifelike physics motion and anime styles
Veo 3.1 | Google | Proprietary | Text, image | No | Cinematic, prompt-faithful short clips

How to Use Wan on Atlas Cloud

Get started in minutes — follow these simple steps to integrate and deploy models through Atlas Cloud's platform.

Create an Atlas Cloud Account

Sign up at atlascloud.ai and complete verification. New users receive free credits to explore the platform and test models.

Why Use Wan on Atlas Cloud

Combining the advanced Wan models with Atlas Cloud's GPU-accelerated platform provides unmatched performance, scalability, and developer experience.

Performance & flexibility

Low Latency: GPU-optimized inference for real-time reasoning.

Unified API: Run Wan, GPT, Gemini, and DeepSeek with one integration.

Transparent Pricing: Predictable per-token billing with serverless options.

Enterprise & Scale

Developer Experience: SDKs, analytics, fine-tuning tools, and templates.

Reliability: 99.99% uptime, RBAC, and compliance-ready logging.

Security & Compliance: SOC 2 Type II, HIPAA alignment, data sovereignty in US.

Wan API FAQ

The Wan API gives developers Alibaba's open Wan video models on Atlas Cloud through one unified key. This page covers Wan 2.2, a foundational video model that pioneered a Mixture-of-Experts architecture for video diffusion, covering text to video, image to video, and video to video with cinematic motion control. It sits alongside 300+ other models on the same account, so you reach it through OpenAI-compatible endpoints with no separate setup.

Wan 2.2 ships in a few variants: the A14B models built for cinematic-quality text to video and image to video, and the lighter TI2V-5B model that combines both modes and runs at 720p on consumer GPUs. Atlas Cloud also offers turbo variants that trade a little fidelity for faster, lower-latency generation. Pick A14B for maximum quality, TI2V-5B for efficiency, and turbo for rapid iteration.

The Wan API covers text to video, image to video, and video to video, so you can start from a prompt, animate a still, or transform existing footage. It also supports first-to-last frame control, video extend to lengthen a clip, and upscaling to sharpen resolution. Each mode is a parameter or endpoint change, which keeps a mixed pipeline on one integration.

Mixture-of-Experts, or MoE, splits the denoising process across specialized expert models that each handle part of the work. Wan 2.2 was the first video diffusion model to use this design, which raises overall model capacity and detail without increasing inference cost per step. In practice that means richer scenes and more accurate motion at the same compute budget.

Wan 2.2 generates natively at 480p and 720p, with higher resolution available through upscaling to 1080p or 2K. Clips typically run a few seconds per generation. The lighter TI2V-5B variant targets 720p at 24fps.

Wan 2.2 is open source and can be self-hosted. Alibaba released it under the Apache 2.0 license, with model weights and inference code published on HuggingFace and GitHub, so you can download, modify, and run it yourself. Running it locally needs a GPU with at least 16GB of VRAM, ideally more for the larger A14B models. Atlas Cloud hosts the Wan API so you can skip the hardware and scaling work entirely.

Commercial use is permitted. The Apache 2.0 license allows commercial use, modification, and redistribution, so video generated with Wan 2.2 can go into commercial projects. Review Atlas Cloud's terms of service for the specifics of your plan, and note the usual restrictions around generating content that depicts real, identifiable people without their consent.

The turbo variants apply an rCM sampler and 4-step distillation, which compress the denoising process into far fewer steps. This lowers latency and cost per clip while keeping strong visual fidelity, which makes turbo a good fit for iteration, batch runs, and prompt testing. Move to the standard A14B models when a final render needs maximum detail.

Generation is asynchronous: each request returns a prediction ID that you poll until the clip is ready, which fits queues and high-volume runs. Add exponential backoff and a retry on a 429 response, and use the turbo variants for drafts to keep throughput high. Contact support to raise concurrency limits as your workload grows.
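The polling advice above can be sketched as a loop with exponential backoff and jitter. The `fetch` callable, the `status` values (`processing`, `succeeded`, `failed`), and the response shape are assumptions made for this example; the real prediction endpoint may report state differently.

```python
import random
import time
from typing import Any, Callable, Dict, Tuple

# fetch(prediction_id) -> (HTTP status code, decoded JSON body); supplied by the caller.
Fetch = Callable[[str], Tuple[int, Dict[str, Any]]]


def poll_prediction(
    prediction_id: str,
    fetch: Fetch,
    max_attempts: int = 8,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], Any] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the prediction finishes, backing off between attempts."""
    delay = base_delay
    for _ in range(max_attempts):
        status_code, body = fetch(prediction_id)
        if status_code == 200 and body.get("status") == "succeeded":
            return body
        if status_code == 200 and body.get("status") == "failed":
            raise RuntimeError(f"prediction {prediction_id} failed")
        if status_code not in (200, 429):
            raise RuntimeError(f"unexpected HTTP status {status_code}")
        # Still processing, or rate limited (429): wait with jitter, then double the delay.
        sleep(delay + random.uniform(0, delay / 2))
        delay = min(delay * 2, max_delay)
    raise TimeoutError(f"prediction {prediction_id} not ready after {max_attempts} attempts")
```

Passing `fetch` and `sleep` in as arguments keeps the loop independent of any HTTP client and lets it be exercised in tests without network access or real waiting.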

Create an account on Atlas Cloud, generate an API key, and send a request to the Wan model with your prompt or input image through the OpenAI-compatible endpoint. Poll the prediction endpoint for the finished clip, then scale up as needed. Because the same key reaches 300+ models, you can test other video and image models without any extra setup.
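As a rough sketch of the request step, the snippet below builds an authenticated POST with Python's standard library and stops short of sending it. The base URL, the `/video/generations` path, the model identifier, and the payload fields are placeholders, not documented Atlas Cloud values; substitute the ones from the API reference.

```python
import json
import urllib.request

# Placeholder base URL; replace with the one from the Atlas Cloud documentation.
BASE_URL = "https://api.example.invalid/v1"


def make_generation_request(
    api_key: str, model: str, prompt: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a bearer-authenticated JSON POST request."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/video/generations",  # hypothetical path
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = make_generation_request("test-key", "wan-2.2/t2v-a14b", "a lighthouse at dusk")
```

Sending the request would be a call to `urllib.request.urlopen(req)`, after which the response's prediction ID is what gets polled.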
