Wan 2.6 Text-to-Video API by Alibaba

A speed-optimized text-to-video option that prioritizes lower latency while retaining strong visual fidelity. Ideal for iteration, batch generation, and prompt testing.

Your request will cost $0.07 per run. For $10 you can run this model approximately 142 times.

Code Example
import os
import time

import requests

# Read the API key from the environment. Python does not expand "$ATLASCLOUD_API_KEY" inside a string.
API_KEY = os.environ["ATLASCLOUD_API_KEY"]
auth_header = {"Authorization": f"Bearer {API_KEY}"}

# Step 1: Start video generation
generate_url = "https://api.atlascloud.ai/api/v1/model/generateVideo"
headers = {
    "Content-Type": "application/json",
    **auth_header,
}
data = {
    "model": "alibaba/wan-2.6/text-to-video",
    "prompt": "A beautiful sunset over the ocean with gentle waves",
    "width": 512,
    "height": 512,
    "duration": 3,
    "fps": 24,
}

generate_response = requests.post(generate_url, headers=headers, json=data)
generate_response.raise_for_status()  # Stop early on HTTP errors such as 401
generate_result = generate_response.json()
prediction_id = generate_result["data"]["id"]

# Step 2: Poll for result
poll_url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"

def check_status():
    while True:
        response = requests.get(poll_url, headers=auth_header)
        result = response.json()

        if result["data"]["status"] in ["completed", "succeeded"]:
            print("Generated video:", result["data"]["outputs"][0])
            return result["data"]["outputs"][0]
        elif result["data"]["status"] == "failed":
            # .get avoids a KeyError when the response carries no error field
            raise Exception(result["data"].get("error") or "Generation failed")
        else:
            # Still processing, wait 2 seconds
            time.sleep(2)

video_url = check_status()

Install

Install the required package for your language.

pip install requests

Authentication

All API requests require authentication via an API key. You can get your API key from the Atlas Cloud dashboard.

export ATLASCLOUD_API_KEY="your-api-key-here"

HTTP Headers

import os

API_KEY = os.environ.get("ATLASCLOUD_API_KEY")
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}

Keep your API key secure

Never expose your API key in client-side code or public repositories. Use environment variables or a backend proxy instead.
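
One way to follow this advice is to read the key from the environment and fail early when it is missing. The following is a minimal sketch, assuming the `ATLASCLOUD_API_KEY` variable shown above; `build_headers` is a hypothetical helper name, not part of any Atlas Cloud SDK.

```python
import os


def build_headers(env=None):
    """Build request headers from the ATLASCLOUD_API_KEY environment variable.

    Illustrative helper only; it is not an official Atlas Cloud API.
    """
    env = os.environ if env is None else env
    api_key = env.get("ATLASCLOUD_API_KEY", "").strip()
    if not api_key:
        # Fail fast instead of sending "Bearer None" or an empty token.
        raise RuntimeError("ATLASCLOUD_API_KEY is not set")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```

Passing a plain dict as `env` makes the helper easy to exercise in tests without touching the real environment.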

Submit a request

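import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateVideo"
headers = {
    "Content-Type": "application/json",
    # Python does not expand "$ATLASCLOUD_API_KEY" in a string, so read it from the environment
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}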
data = {
    "model": "your-model",
    "prompt": "A beautiful landscape"
}

response = requests.post(url, headers=headers, json=data)
print(response.json())

Submit a Request

Submit an asynchronous generation request. The API returns a prediction ID that you can use to check the status and retrieve the result.

POST /api/v1/model/generateVideo

Request Body

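import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateVideo"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; a literal "$ATLASCLOUD_API_KEY" would be sent as-is
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}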

data = {
    "model": "alibaba/wan-2.6/text-to-video",
    "prompt": "A beautiful sunset over the ocean with gentle waves"
}

response = requests.post(url, headers=headers, json=data)
result = response.json()

print(f"Prediction ID: {result['data']['id']}")
print(f"Status: {result['data']['status']}")

Response

{
  "code": 200,
  "data": {
    "id": "pred_abc123",
    "status": "processing",
    "model": "model-name",
    "created_at": "2025-01-01T00:00:00Z"
  }
}

Check Status

Poll the prediction endpoint to check the current status of your request.

GET /api/v1/model/prediction/{prediction_id}

Polling Example

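import os
import time

import requests

prediction_id = "pred_abc123"
url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"
# Read the key from the environment; Python does not expand "$ATLASCLOUD_API_KEY" in a string
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}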

while True:
    response = requests.get(url, headers=headers)
    result = response.json()
    status = result["data"]["status"]
    print(f"Status: {status}")

    if status in ["completed", "succeeded"]:
        output_url = result["data"]["outputs"][0]
        print(f"Output URL: {output_url}")
        break
    elif status == "failed":
        print(f"Error: {result['data'].get('error', 'Unknown')}")
        break

    time.sleep(3)

Status Values

processing: The request is still being processed.

completed: Generation is complete. Outputs are available.

succeeded: Generation succeeded. Outputs are available.

failed: Generation failed. Check the error field.
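
The status values above can be handled in one bounded poll loop, so a stuck request does not block forever. This is a sketch under stated assumptions: `wait_for_prediction` is a hypothetical helper, the timeout and interval defaults are arbitrary choices rather than documented limits, and any status not listed above is treated as still in progress.

```python
import time

SUCCESS_STATUSES = {"completed", "succeeded"}
FAILURE_STATUSES = {"failed"}


def wait_for_prediction(fetch, timeout=300.0, interval=3.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until the prediction reaches a terminal status or the timeout expires.

    `fetch` is any zero-argument callable that returns the decoded JSON body of
    GET /api/v1/model/prediction/{prediction_id}.
    """
    deadline = clock() + timeout
    while True:
        data = fetch()["data"]
        status = data["status"]
        if status in SUCCESS_STATUSES:
            return data["outputs"]
        if status in FAILURE_STATUSES:
            raise RuntimeError(data.get("error") or "Generation failed")
        if clock() >= deadline:
            raise TimeoutError(f"Prediction still '{status}' after {timeout} seconds")
        # Any other status (for example "processing") means: wait and retry.
        sleep(interval)
```

With `requests`, `fetch` could be `lambda: requests.get(url, headers=headers).json()`, reusing the `url` and `headers` from the polling example.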

Completed Response

{
  "data": {
    "id": "pred_abc123",
    "status": "completed",
    "outputs": [
      "https://storage.atlascloud.ai/outputs/result.mp4"
    ],
    "metrics": {
      "predict_time": 45.2
    },
    "created_at": "2025-01-01T00:00:00Z",
    "completed_at": "2025-01-01T00:00:10Z"
  }
}

Upload Files

Upload files to Atlas Cloud storage and get a URL you can use in your API requests. Use multipart/form-data to upload.

POST /api/v1/model/uploadMedia

Upload Example

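import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/uploadMedia"
# Read the key from the environment; Python does not expand "$ATLASCLOUD_API_KEY" in a string
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}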

with open("image.png", "rb") as f:
    files = {"file": ("image.png", f, "image/png")}
    response = requests.post(url, headers=headers, files=files)

result = response.json()
download_url = result["data"]["download_url"]
print(f"File URL: {download_url}")

Response

{
  "data": {
    "download_url": "https://storage.atlascloud.ai/uploads/abc123/image.png",
    "file_name": "image.png",
    "content_type": "image/png",
    "size": 1024000
  }
}
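
The upload example hard-codes the file name and content type. When uploading arbitrary files, the multipart field could be built from the path instead. A minimal sketch: the `"file"` field name comes from the example above, while guessing the content type from the extension and the helper name `build_upload_field` are assumptions of this sketch.

```python
import mimetypes
import os


def build_upload_field(path, file_obj):
    """Return the `files` mapping for a multipart/form-data upload of `path`."""
    content_type, _ = mimetypes.guess_type(path)
    if content_type is None:
        # Unknown extension: fall back to a generic binary type.
        content_type = "application/octet-stream"
    return {"file": (os.path.basename(path), file_obj, content_type)}
```

It would be used as `requests.post(url, headers=headers, files=build_upload_field(path, f))` inside a `with open(path, "rb") as f:` block.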

Input Schema

The following parameters are accepted in the request body.

Total: 0, Required: 0, Optional: 0

No parameters available.

Example Request Body

{
  "model": "alibaba/wan-2.6/text-to-video"
}

Output Schema

The API returns a prediction response with the generated output URLs.

id (string, required)

Unique identifier for the prediction.

status (string, required)

Current status of the prediction.

Allowed values: processing, completed, succeeded, failed

model (string, required)

The model used for generation.

outputs (array[string])

Array of output URLs. Available when status is "completed" or "succeeded".

error (string)

Error message if status is "failed".

metrics (object)

Performance metrics.

metrics.predict_time (number)

Time taken for video generation in seconds.

created_at (string, required)

ISO 8601 timestamp when the prediction was created.

Format: date-time

completed_at (string)

ISO 8601 timestamp when the prediction was completed.

Format: date-time

Example Response

{
  "id": "pred_abc123",
  "status": "completed",
  "model": "model-name",
  "outputs": [
    "https://storage.atlascloud.ai/outputs/result.mp4"
  ],
  "metrics": {
    "predict_time": 45.2
  },
  "created_at": "2025-01-01T00:00:00Z",
  "completed_at": "2025-01-01T00:00:10Z"
}
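
The fields above map naturally onto a small typed record. The sketch below parses a prediction response into a dataclass; the `Prediction` class is hypothetical, and it accepts both the bare object shown here and the `"data"`-wrapped shape used in the earlier response examples. Fields other than `id` and `status` are treated as optional because not every example on this page includes them.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


def _parse_ts(value):
    """Parse an ISO 8601 timestamp; a trailing 'Z' is rewritten for Python < 3.11."""
    return datetime.fromisoformat(value.replace("Z", "+00:00")) if value else None


@dataclass
class Prediction:
    id: str
    status: str
    model: Optional[str] = None
    outputs: List[str] = field(default_factory=list)
    error: Optional[str] = None
    predict_time: Optional[float] = None
    created_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None

    @classmethod
    def from_response(cls, payload):
        # Accept either {"data": {...}} or the bare prediction object.
        body = payload.get("data", payload)
        return cls(
            id=body["id"],
            status=body["status"],
            model=body.get("model"),
            outputs=list(body.get("outputs") or []),
            error=body.get("error"),
            predict_time=(body.get("metrics") or {}).get("predict_time"),
            created_at=_parse_ts(body.get("created_at")),
            completed_at=_parse_ts(body.get("completed_at")),
        )
```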

Atlas Cloud Skills

Atlas Cloud Skills integrates 300+ AI models directly into your AI coding assistant. One command to install, then use natural language to generate images, videos, and chat with LLMs.

Supported Clients

Claude Code

OpenAI Codex

Gemini CLI

Cursor

Windsurf

VS Code

Trae

GitHub Copilot

Cline

Roo Code

Amp

Goose

Replit

40+ supported clients

Install

npx skills add AtlasCloudAI/atlas-cloud-skills

Setup API Key

Get your API key from the Atlas Cloud dashboard and set it as an environment variable.

export ATLASCLOUD_API_KEY="your-api-key-here"

Capabilities

Once installed, you can use natural language in your AI assistant to access all Atlas Cloud models.

Image Generation: Generate images with models like Nano Banana 2, Z-Image, and more.

Video Creation: Create videos from text or images with Kling, Vidu, Veo, etc.

LLM Chat: Chat with Qwen, DeepSeek, and other large language models.

Media Upload: Upload local files for image editing and image-to-video workflows.

Learn more

github.com/AtlasCloudAI/atlas-cloud-skills

MCP Server

Atlas Cloud MCP Server connects your IDE with 300+ AI models via the Model Context Protocol. Works with any MCP-compatible client.

Supported Clients

Cursor

VS Code

Windsurf

Claude Code

OpenAI Codex

Gemini CLI

Cline

Roo Code

100+ supported clients

Install

npx -y atlascloud-mcp

Configuration

Add the following configuration to your IDE's MCP settings file.

{
  "mcpServers": {
    "atlascloud": {
      "command": "npx",
      "args": [
        "-y",
        "atlascloud-mcp"
      ],
      "env": {
        "ATLASCLOUD_API_KEY": "your-api-key-here"
      }
    }
  }
}
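
If the settings file already lists other MCP servers, the entry above has to be merged in rather than pasted over them. A minimal sketch of that merge on an in-memory dict; `add_atlascloud_server` is a hypothetical helper, and the location of the settings file depends on the IDE and is not covered here.

```python
import copy
import json


def add_atlascloud_server(settings, api_key):
    """Return a copy of an MCP settings dict with the atlascloud entry added.

    Existing servers are preserved; only the "atlascloud" key is set.
    """
    merged = copy.deepcopy(settings)
    servers = merged.setdefault("mcpServers", {})
    servers["atlascloud"] = {
        "command": "npx",
        "args": ["-y", "atlascloud-mcp"],
        "env": {"ATLASCLOUD_API_KEY": api_key},
    }
    return merged


if __name__ == "__main__":
    # "other" stands in for whatever servers are already configured.
    existing = {"mcpServers": {"other": {"command": "other-server"}}}
    print(json.dumps(add_atlascloud_server(existing, "your-api-key-here"), indent=2))
```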

Available Tools

atlas_generate_image: Generate images from text prompts.

atlas_generate_video: Create videos from text or images.

atlas_chat: Chat with large language models.

atlas_list_models: Browse 300+ available AI models.

atlas_quick_generate: One-step content creation with auto model selection.

atlas_upload_media: Upload local files for API workflows.

Learn more

github.com/AtlasCloudAI/mcp-server

🎬MULTI-SHOT VIDEO GENERATION

Wan 2.6: Professional Multi-Shot AI Video Creation

Alibaba's latest breakthrough in AI video generation. Create up to 15-second 1080p videos with multi-shot storytelling, reference-driven character consistency, and native audio-visual synchronization. The first model to truly understand storyboard logic for cinematic narratives.

Revolutionary Breakthroughs

What makes Wan 2.6 the game-changer in AI video generation

Multi-Shot Storytelling

First model to understand storyboard logic. Automatically generates sequential shots with coherent transitions, maintaining character appearance and environment consistency across scene changes—enabling complete story arcs in a single 15-second generation.

Reference-to-Video (R2V)

Upload a 2-30 second reference video to extract and preserve character appearance, movement patterns, and voice characteristics. Create consistent character performances across multiple videos with unprecedented accuracy.

Accurate Text Rendering

Industry-leading text rendering capabilities for product packaging, signage, and branded content. Generate clear, readable text within video frames—essential for marketing and commercial applications.

Core Capabilities

Extended 15-Second Duration

Generate up to 15 seconds per video with complete "Three Act" structure (Setup → Action → Resolution)

Professional 1080p Quality

Native 1080p output at 24fps with cinematic quality and enhanced visual stability

Native Audio Sync

Dialogue matches lip movements, background music aligns with pacing, sound effects trigger perfectly

Character Consistency

Maintain character appearance, costumes, and identity across shots and multiple videos

Cinematic Camera Control

Professional camera movements including pans, zooms, tracking shots, and dolly movements

Flexible Aspect Ratios

16:9 (YouTube), 9:16 (Reels), 1:1 (Square) - platform-optimized without post-production cropping

Wan 2.6 vs Wan 2.5: Major Improvements

See what's new in the latest release

Feature	Wan 2.6	Wan 2.5
Video Duration	Up to 15 seconds	10 seconds max
Multi-Shot Capability	Understands storyboard logic	Single shot or messy morphing
Reference Video Support	R2V mode with full preservation	Image reference only
Character Consistency	Excellent across shots	Character drift issues
Motion Stability	Reduced jitter and artifacts	Occasional frame drift
Prompt Understanding	Complex multi-character scenes	Basic scene generation

Three Specialized Generation Modes

Choose the right mode for your creative workflow

Text-to-Video (T2V)

Image-to-Video (I2V)

Enhanced

Transform still images into motion videos with improved motion coherence. Ideal for product showcases, photo animation, and visual storytelling.

Precise text rendering for products
Style consistency across frames
Natural motion from static images
Narrative-driven visual optimization

Reference-to-Video (R2V)

NEW

Upload a reference video (2-30s) to preserve character appearance, movement patterns, and voice. Strongest consistency guarantee for character-driven content.

Full character identity preservation
Voice characteristics extraction
Movement pattern replication
Multi-character co-acting scenes

Perfect For

Marketing & Advertising

Product demos with text rendering, brand campaigns with character consistency, and promotional videos

Content Creation

YouTube videos, social media reels, multi-shot storytelling, and video editing workflows

E-commerce

Product showcases with accurate text, tutorial videos, and customer testimonial recreation

Education & Training

Instructional content, course materials, and multi-scene educational narratives

Entertainment

Short films, character-driven stories, cinematic sequences, and creative experiments

Pre-visualization

Film concept development, storyboard creation, and scene planning for productions

Wan 2.6 T2V, I2V, and R2V API Integration

Complete API suite for Text-to-Video, Image-to-Video, and Reference-to-Video generation

Text-to-Video API (T2V API)

Our Wan 2.6 T2V API transforms text prompts into multi-shot cinematic videos with automatic scene segmentation. Generate professional 1080p videos up to 15 seconds with native audio sync.

Multi-shot storytelling from single prompt

15-second duration with Three Act structure

Enhanced prompt understanding for complex scenes

Flexible aspect ratios: 16:9, 9:16, 1:1
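
Because the T2V request carries a single `prompt` string, a multi-shot idea has to be written out as text. The sketch below composes a Setup, Action, Resolution prompt and places it in the request body fields shown earlier (`model`, `prompt`). The "Shot N" wording is purely illustrative: this page does not document a required shot syntax, and `build_three_act_prompt` is a hypothetical helper.

```python
def build_three_act_prompt(setup, action, resolution):
    """Join three shot descriptions into one prompt string.

    The "Shot N (...)" labels are an illustrative convention, not a
    documented prompt syntax.
    """
    shots = [("Setup", setup), ("Action", action), ("Resolution", resolution)]
    lines = [
        f"Shot {number} ({name}): {text.strip()}"
        for number, (name, text) in enumerate(shots, start=1)
    ]
    return "\n".join(lines)


payload = {
    "model": "alibaba/wan-2.6/text-to-video",
    "prompt": build_three_act_prompt(
        "A lighthouse keeper climbs the stairs at dusk.",
        "He lights the lamp as a storm rolls in.",
        "A ship turns safely toward the harbor.",
    ),
}
```

The resulting `payload` can be sent with the same `requests.post(url, headers=headers, json=payload)` call used in the Submit a Request example.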

Image-to-Video API (I2V API)

Our Wan 2.6 I2V API brings still images to life with precise motion control and text rendering. Perfect for product videos, photo animation, and branded content creation.

Accurate text rendering for products and signage

Style consistency across animation frames

Natural motion with improved coherence

Narrative-optimized visual output

Reference-to-Video API (R2V API)

Our Wan 2.6 R2V API preserves character identity from reference videos. Upload 2-30 second clips to extract appearance, voice, and movement patterns for consistent character generation.

Character appearance and identity preservation

Voice characteristics extraction and replication

Movement pattern analysis and reproduction

Multi-character scene support

💡

Complete API Suite

All three Wan 2.6 API modes (T2V, I2V, R2V) are exposed as RESTful endpoints with comprehensive documentation. Get started with SDKs for Python, Node.js, and more. Each endpoint includes native audio-visual synchronization and full commercial usage rights.

How to Get Started with Wan 2.6

Start creating professional videos in minutes with two simple paths

API Integration

For developers building applications

Sign Up & Login

Create your Atlas Cloud account or log in to access the console

Add Payment Method

Bind your credit card in the Billing section to fund your account

Generate API Key

Navigate to Console → API Keys and create your authentication key

Start Building

Use T2V, I2V, or R2V API endpoints to integrate Wan 2.6 into your application

Playground Experience

For quick testing and experimentation

Sign Up & Login

Create your Atlas Cloud account or log in to access the platform

Add Payment Method

Bind your credit card in the Billing section to get started

Use Playground

Go to the Wan 2.6 playground, choose T2V/I2V/R2V mode, and generate videos instantly

💡

Pro Tip: Test different generation modes in the Playground first to understand which works best for your use case, then integrate the corresponding API for production scale.

Frequently Asked Questions

What makes Wan 2.6's multi-shot capability unique?

Wan 2.6 is the first model to truly understand storyboard logic. Unlike Wan 2.5, which produced messy "morphing" effects, Wan 2.6 can automatically segment a single prompt into multiple distinct shots with coherent transitions, maintaining character consistency across scene changes.

How does Reference-to-Video (R2V) work?

Upload a 2-30 second reference video, and Wan 2.6 extracts the character's appearance, movement patterns, and voice characteristics. You can then generate new videos featuring the same character with consistent identity—ideal for creating character-driven content series.

What video formats and durations are supported?

Wan 2.6 generates 1080p videos at 24fps with durations from 5 to 15 seconds. Supported aspect ratios include 16:9 (YouTube), 9:16 (Instagram Reels/TikTok), and 1:1 (square format), optimized for each platform without requiring post-production cropping.

Can Wan 2.6 render text in videos?

Yes! Wan 2.6 features industry-leading text rendering for product packaging, signage, and branded content. The model can generate clear, readable text within video frames—a critical feature that Seedance and most competitors lack.

What's the difference between T2V, I2V, and R2V modes?

T2V (Text-to-Video) generates from text prompts with multi-shot capability. I2V (Image-to-Video) animates still images with precise text rendering. R2V (Reference-to-Video) uses video references to preserve character identity across generations. Choose based on your input type and consistency needs.

Do I have commercial rights to generated videos?

Yes! Every Wan 2.6 creation comes with full commercial usage rights. Videos are production-ready for marketing campaigns, client deliverables, branded content, and commercial applications without additional licensing requirements.

Why Use Wan 2.6 on Atlas Cloud?

Leverage enterprise-grade infrastructure for your professional video generation workflows

Purpose-Built Infrastructure

Deploy Wan 2.6's multi-shot generation and R2V capabilities on infrastructure specifically optimized for demanding AI video workloads. Maximum performance for 1080p 15-second generation.

Unified API for All Models

Access Wan 2.6 (T2V, I2V, R2V) alongside 300+ AI models (LLMs, image, video, audio) through one unified API. Single integration for all your generative AI needs with consistent auth.

Competitive Pricing

Save up to 70% compared to AWS with transparent, pay-as-you-go pricing. No hidden fees, no commitments—scale from prototype to production without breaking the bank.

SOC I & II Certified Security

Your reference videos and generated content protected with SOC I & II certifications and HIPAA compliance. Enterprise-grade security with encrypted transmission and storage.

99.9% Uptime SLA

Enterprise-grade reliability with guaranteed 99.9% uptime. Your Wan 2.6 multi-shot video generation is always available for production campaigns and critical content workflows.

Easy Integration

Complete integration in minutes with REST API and multi-language SDKs (Python, Node.js, Go). Switch between T2V, I2V, and R2V modes seamlessly with unified endpoint structure.

99.9% Uptime

70% Lower Cost vs AWS

300+ Gen AI Models

24/7 Pro Support

Technical Specifications

Architecture: Advanced Transformer with Multi-Modal Understanding

Resolution: 1080p (Full HD)

Frame Rate: 24 FPS

Duration: 5-15 seconds (mode dependent)

Aspect Ratios: 16:9, 9:16, 1:1

Generation Modes: T2V, I2V, R2V

Audio: Native synchronization with lip-sync

Commercial Rights: Full commercial usage included

Experience Professional Multi-Shot Video Generation

Join content creators, marketers, and filmmakers worldwide who are revolutionizing video production with Wan 2.6's groundbreaking multi-shot storytelling and character consistency capabilities.

Alibaba WAN 2.6 Text-to-Video Model

Alibaba WAN 2.6 is an advanced text-to-video model provided by Alibaba Cloud's DashScope platform. This model generates high-quality 480p/720p/1080p videos from text prompts.

What makes it stand out?

More affordable: Wan 2.6 is more streamlined and cost-effective - reducing creator expenses and offering more options.
One-pass A/V sync: Wan 2.6 creates a fully synchronized video (audio/voiceover + lip-sync) from a single, well-structured prompt - no separate recording or manual alignment required.
Multilingual friendly: Wan 2.6 reliably processes prompts in languages such as Chinese for A/V-synced videos.
Longer duration & more video size options: Wan 2.6 delivers up to 10 seconds and 6 aspect/size options, enabling more storytelling room and publishing flexibility.
Multi-shot storytelling: Generates cohesive multi-shot narratives, keeping key details consistent across shots and offering auto shot-split for simple prompts.
Video reference generation: Uses a reference video's appearance and voice to guide new videos; supports human or arbitrary subjects, single or dual performers.
15s long videos: Produces videos up to 15 seconds, expanding temporal capacity for richer storytelling.

Designed For

Marketing teams: Fast, polished demos/tutorials—low cost, consistent style.
Global enterprises: Multilingual, lip-synced videos with subtitles for efficient localization.
Storytellers & YouTubers: Immersive narratives while maintaining cadence and quality—driving growth.
Corporate training teams: HD videos over docs—clearer key points, better communication.

Pricing

The table below lists prices for easy comparison.

Output Resolution	Duration (5s)	Duration (10s)
480p	$0.2	$0.4
720p	$0.4	$0.8
1080p	$0.6	$1.2

Billing Rules

Minimum charge: 5 seconds
Per-second rate = (price per 5 seconds) ÷ 5
Billed duration = video length in seconds (rounded up), with a 5-second minimum
Total cost = billed duration × per-second rate (by output resolution)
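
The billing rules above can be sketched as a small estimator. The prices are copied from the pricing table; the function name `estimate_cost` is hypothetical, and `Decimal` is used only to keep the currency arithmetic exact.

```python
import math
from decimal import Decimal

# Price per 5 seconds in USD, copied from the pricing table above.
PRICE_PER_5S = {
    "480p": Decimal("0.2"),
    "720p": Decimal("0.4"),
    "1080p": Decimal("0.6"),
}
MIN_BILLED_SECONDS = 5


def estimate_cost(resolution, seconds):
    """Estimate the charge in USD for one video under the billing rules above."""
    per_second = PRICE_PER_5S[resolution] / 5
    # Round the length up to whole seconds, then apply the 5-second minimum.
    billed_seconds = max(MIN_BILLED_SECONDS, math.ceil(seconds))
    return billed_seconds * per_second
```

For example, a 7.2-second 720p video is billed as 8 seconds at $0.08 per second, which comes to $0.64, and a 3-second 480p video is billed at the 5-second minimum of $0.20.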

How to Use

Write your prompt.
Upload an audio file (optional) for voice/music.
Choose the video size (resolution/aspect).
Select the video duration (e.g., 5s / 10s).
Submit and wait for processing.
Preview and download the result.

Explore Similar Models

Wan-2.6 Text-to-video

A speed-optimized text-to-video option that prioritizes lower latency while retaining strong visual fidelity. Ideal for iteration, batch generation, and prompt testing.

Wan-2.6 Image-to-video

A speed-optimized image-to-video option that prioritizes lower latency while retaining strong visual fidelity. Ideal for iteration, batch generation, and prompt testing.

HappyHorse-1.0 Image-to-video

Animates a first-frame image into video with optional prompt guidance, 720P or 1080P output, and durations from 3 to 15 seconds.

HappyHorse-1.0 Text-to-video

Generates videos from text prompts with HappyHorse 1.0, supporting 720P or 1080P output, flexible aspect ratios, and durations from 3 to 15 seconds.

HappyHorse-1.0 Video-edit

Edits an input video with text instructions and optional reference images, supporting 720P or 1080P output.

HappyHorse-1.0 Reference-to-video

Generates videos from one to nine reference images and a text prompt, supporting 720P or 1080P output, flexible aspect ratios, and durations from 3 to 15 seconds.

Wan-2.7 Image-to-video

Animates images into videos with first-frame, first-and-last-frame, video continuation, and audio-driven modes.

Wan-2.7 Text-to-video

Generates videos from text prompts with multi-shot narrative, audio generation, and sound-image synchronization.

Wan-2.7 Video-edit

Edits videos using text instructions, reference images, and style transfer with multi-modal input support.

Wan-2.7 Reference-to-video

Generates character-driven videos from reference images and videos, with multi-subject and voice-cloning support.

Wan-2.2 Image-to-video

Open and Advanced Large-Scale Video Generative Models.

Wan-2.2 Image-to-video Lora

Open and Advanced Large-Scale Video Generative Models.

From $0.04/SEC

Wan-2.2-spicy Image-to-video

Open and Advanced Large-Scale Video Generative Models.

From $0.03/SEC

Wan-2.2-spicy Image-to-video Lora

Open and Advanced Large-Scale Video Generative Models.

Wan-2.6 Image-to-video Flash

Wan 2.6 Image-to-Video Flash offers faster and more cost-effective generation. Intelligent shot scheduling enables multi-camera storytelling and supports stable multi-speaker dialogue with more natural, realistic vocal timbres.

Wan-2.6 Video-to-video

A speed-optimized video-to-video option that prioritizes lower latency while retaining strong visual fidelity. Ideal for iteration, batch generation, and prompt testing.

From $0.07/SEC (30% off the regular $0.1/SEC)
