alibaba/wan-2.5/text-to-video

Text-to-Video

Wan 2.5 Text-to-Video API by Alibaba

A speed-optimized text-to-video option that prioritizes lower latency while retaining strong visual fidelity. Ideal for iteration, batch generation, and prompt testing.

Each run costs $0.035. $10 covers about 285 runs.
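The run estimate above is the budget divided by the per-run price, rounded down to whole runs. A minimal sketch of that arithmetic follows; the function name `runs_for_budget` is illustrative, and `Decimal` is used only to avoid binary floating-point rounding on currency values.

```python
from decimal import Decimal

COST_PER_RUN = Decimal("0.035")  # per-run price quoted on this page, in USD


def runs_for_budget(budget_usd: str) -> int:
    """Return how many whole runs a budget covers at COST_PER_RUN."""
    return int(Decimal(budget_usd) // COST_PER_RUN)


print(runs_for_budget("10"))  # 285
```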

Code example
import os
import time

import requests

# Read the API key from the environment; Python does not expand "$ATLASCLOUD_API_KEY" inside a string
API_KEY = os.environ["ATLASCLOUD_API_KEY"]
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}

# Step 1: Start video generation
generate_url = "https://api.atlascloud.ai/api/v1/model/generateVideo"
data = {
    "model": "alibaba/wan-2.5/text-to-video",  # Required. Model name
    # "audio": "<audio URL>",  # Optional. Audio URL to guide generation
    "duration": 5,  # Duration of the generated media in seconds. Options: 5 | 10
    "enable_prompt_expansion": False,  # If True, the prompt optimizer is enabled
    "negative_prompt": "blurry, low quality",  # Optional. Negative prompt (illustrative value)
    "prompt": "A beautiful sunset over the ocean with gentle waves",  # Required. The prompt for generating the output
    "seed": -1,  # Random seed; -1 means a random seed will be used
    "size": "1920*1080",  # Required. Size of the generated media in pixels (width*height)
    "generate_audio": True,  # Whether to automatically add audio to the generated video
}

generate_response = requests.post(generate_url, headers=headers, json=data)
generate_response.raise_for_status()  # Surface HTTP errors before reading the body
generate_result = generate_response.json()
prediction_id = generate_result["data"]["id"]

# Step 2: Poll for result
poll_url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"

def check_status():
    while True:
        response = requests.get(poll_url, headers=headers)
        result = response.json()
        status = result["data"]["status"]

        if status in ["completed", "succeeded"]:
            print("Generated video:", result["data"]["outputs"][0])
            return result["data"]["outputs"][0]
        elif status == "failed":
            # .get avoids a KeyError when the response carries no error field
            raise RuntimeError(result["data"].get("error") or "Generation failed")
        else:
            # Still processing, wait 2 seconds
            time.sleep(2)

video_url = check_status()

Installation

Install the required dependency.

pip install requests

Authentication

All API requests must be authenticated with an API key. You can get an API key from the Atlas Cloud console.

export ATLASCLOUD_API_KEY="your-api-key-here"

HTTP request headers

import os

API_KEY = os.environ.get("ATLASCLOUD_API_KEY")
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}

Protect your API key

Never expose your API key in client-side code or public repositories. Use environment variables or a backend proxy instead.
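One way to follow this advice is to build the headers in a single place and fail fast when the key is missing, so a request never goes out with an empty or literal placeholder token. The sketch below assumes the `ATLASCLOUD_API_KEY` variable used throughout this page; the helper name `auth_headers` is hypothetical and not part of any Atlas Cloud library.

```python
import os
from typing import Mapping


def auth_headers(env: Mapping[str, str] = os.environ) -> dict:
    """Build request headers from ATLASCLOUD_API_KEY, failing fast if it is unset."""
    api_key = env.get("ATLASCLOUD_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("ATLASCLOUD_API_KEY is not set")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```

Passing the environment in as a parameter keeps the helper easy to test with a plain dictionary instead of the real process environment.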

Submit a request

import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateVideo"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; Python does not expand "$ATLASCLOUD_API_KEY" inside a string
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}
data = {
    "model": "your-model",
    "prompt": "A beautiful landscape"
}

response = requests.post(url, headers=headers, json=data)
print(response.json())

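Submit a request

Submit an asynchronous generation request. The API returns a prediction ID that you can use to check the status and fetch the result.

POST /api/v1/model/generateVideo

Request body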

import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateVideo"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; Python does not expand "$ATLASCLOUD_API_KEY" inside a string
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}

data = {
    "model": "alibaba/wan-2.5/text-to-video",
    "prompt": "A beautiful sunset over the ocean with gentle waves",
    "size": "1920*1080"  # Listed as required in the input schema
}

response = requests.post(url, headers=headers, json=data)
result = response.json()

print(f"Prediction ID: {result['data']['id']}")
print(f"Status: {result['data']['status']}")

Response

{
  "code": 200,
  "data": {
    "id": "pred_abc123",
    "status": "processing",
    "model": "model-name",
    "created_at": "2025-01-01T00:00:00Z"
  }
}

Check status

Poll the prediction endpoint to check the current status of a request.

GET /api/v1/model/prediction/{prediction_id}

Polling example

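import os
import time

import requests

prediction_id = "pred_abc123"
url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"
# Python does not expand "$ATLASCLOUD_API_KEY" inside a string, so read the key from the environment
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}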

while True:
    response = requests.get(url, headers=headers)
    result = response.json()
    status = result["data"]["status"]
    print(f"Status: {status}")

    if status in ["completed", "succeeded"]:
        output_url = result["data"]["outputs"][0]
        print(f"Output URL: {output_url}")
        break
    elif status == "failed":
        print(f"Error: {result['data'].get('error', 'Unknown')}")
        break

    time.sleep(3)

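Status values

processing: The request is still being processed.

completed: Generation finished; the output is available.

succeeded: Generation succeeded; the output is available.

failed: Generation failed; check the error field.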
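The polling loops on this page run until a terminal status arrives, which can hang indefinitely if a job stalls. The sketch below adds a deadline on top of the status values listed above. It is one possible shape, not Atlas Cloud's client: `wait_for_outputs`, its parameters, and the default timeout are all assumptions, and `fetch` stands for any callable that returns the parsed JSON of the prediction endpoint.

```python
import time
from typing import Callable

SUCCESS_STATUSES = {"completed", "succeeded"}


def wait_for_outputs(
    fetch: Callable[[], dict],
    timeout_s: float = 600.0,
    interval_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Poll fetch() until a terminal status arrives or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        data = fetch()["data"]
        status = data["status"]
        if status in SUCCESS_STATUSES:
            return data["outputs"]
        if status == "failed":
            raise RuntimeError(data.get("error") or "Generation failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction still {status!r} after {timeout_s} s")
        sleep(interval_s)
```

With `requests`, the callable could be `lambda: requests.get(url, headers=headers).json()`, reusing the `url` and `headers` from the polling example.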

Completed response

{
  "data": {
    "id": "pred_abc123",
    "status": "completed",
    "outputs": [
      "https://storage.atlascloud.ai/outputs/result.mp4"
    ],
    "metrics": {
      "predict_time": 45.2
    },
    "created_at": "2025-01-01T00:00:00Z",
    "completed_at": "2025-01-01T00:00:10Z"
  }
}

Upload files

Upload a file to Atlas Cloud storage to get a URL you can use in API requests. Uploads use multipart/form-data.

POST /api/v1/model/uploadMedia

Upload example

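import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/uploadMedia"
# Python does not expand "$ATLASCLOUD_API_KEY" inside a string, so read the key from the environment
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}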

with open("image.png", "rb") as f:
    files = {"file": ("image.png", f, "image/png")}
    response = requests.post(url, headers=headers, files=files)

result = response.json()
download_url = result["data"]["download_url"]
print(f"File URL: {download_url}")

Response

{
  "data": {
    "download_url": "https://storage.atlascloud.ai/uploads/abc123/image.png",
    "file_name": "image.png",
    "content_type": "image/png",
    "size": 1024000
  }
}
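For this model, the only request field that takes a URL is `audio`, so an upload would typically feed that field. The sketch below shows how the `download_url` from an upload response might be placed into a generation request body. It assumes that an uploaded audio file's URL is accepted by `audio`, which this page does not state explicitly, and the helper name `body_with_uploaded_audio` is hypothetical.

```python
def body_with_uploaded_audio(upload_response: dict, prompt: str, size: str = "1920*1080") -> dict:
    """Build a generateVideo request body whose audio field points at an uploaded file."""
    uploaded = upload_response["data"]
    content_type = uploaded.get("content_type", "")
    # Guard against passing an image or other non-audio upload by mistake
    if not content_type.startswith("audio/"):
        raise ValueError(f"expected an audio upload, got {content_type!r}")
    return {
        "model": "alibaba/wan-2.5/text-to-video",
        "prompt": prompt,
        "size": size,
        "audio": uploaded["download_url"],
    }
```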

Input Schema

The request body accepts the following parameters.

Total: 9 · Required: 3 · Optional: 6

model (string, required)

Model name.

Default: "alibaba/wan-2.5/text-to-video"

audio (string)

Audio URL to guide generation (optional).

duration (integer)

The duration of the generated media in seconds.

Default: 5

Options: 5, 10

enable_prompt_expansion (boolean)

If set to true, the prompt optimizer will be enabled.

Default: false

negative_prompt (string)

Negative prompt for the generation.

prompt (string, required)

The prompt for generating the output.

seed (integer)

The random seed to use for the generation. -1 means a random seed will be used.

Default: -1

size (string, required)

The size of the generated media in pixels (width*height).

Default: "1920*1080"

Options: 832*480, 480*832, 624*624, 1280*720, 720*1280, 960*960, 1088*832, 832*1088, 1920*1080, 1080*1920, 1440*1440, 1632*1248, 1248*1632

generate_audio (boolean)

Whether to automatically add audio to the generated video.

Default: true
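The constraints in this schema (three required fields, two duration options, a fixed list of sizes) can be checked on the client before a request is sent, which avoids spending a round trip on a body the API would reject. The following is a minimal sketch of such a check, not the server's actual validation; `validate_body` and its message strings are illustrative.

```python
ALLOWED_SIZES = {
    "832*480", "480*832", "624*624", "1280*720", "720*1280", "960*960",
    "1088*832", "832*1088", "1920*1080", "1080*1920", "1440*1440",
    "1632*1248", "1248*1632",
}
ALLOWED_DURATIONS = {5, 10}
REQUIRED_FIELDS = ("model", "prompt", "size")


def validate_body(body: dict) -> list:
    """Return a list of problems found in a request body (empty if none)."""
    problems = [f"missing required field: {name}" for name in REQUIRED_FIELDS if not body.get(name)]
    size = body.get("size")
    if size and size not in ALLOWED_SIZES:
        problems.append(f"unsupported size: {size}")
    duration = body.get("duration", 5)  # 5 is the documented default
    if duration not in ALLOWED_DURATIONS:
        problems.append(f"unsupported duration: {duration}")
    for flag in ("enable_prompt_expansion", "generate_audio"):
        if flag in body and not isinstance(body[flag], bool):
            problems.append(f"{flag} must be a boolean")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue in one pass.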

Example request body

{
  "model": "alibaba/wan-2.5/text-to-video",
  "duration": 5,
  "enable_prompt_expansion": false,
  "prompt": "A beautiful landscape",
  "seed": -1,
  "size": "1920*1080",
  "generate_audio": true
}

Output Schema

The API returns a prediction response containing the URLs of the generated output.

created_at (string)

ISO timestamp of when the request was created (e.g., "2023-04-01T12:34:56.789Z").

id (string)

Unique identifier for the prediction, the ID of the prediction to get.

model (string)

Model ID used for the prediction.

outputs (array)

Array of URLs to the generated content (empty when status is not completed).

status (string)

Status of the task: created, processing, completed, or failed.

Example response

{
  "id": "pred_abc123",
  "status": "completed",
  "model": "model-name",
  "outputs": [
    "https://storage.atlascloud.ai/outputs/result.mp4"
  ],
  "metrics": {
    "predict_time": 45.2
  },
  "created_at": "2025-01-01T00:00:00Z",
  "completed_at": "2025-01-01T00:00:10Z"
}
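The examples on this page show the prediction both wrapped in a `data` envelope (the submit and polling responses) and unwrapped (the example directly above). Which shape a given endpoint returns is not settled here, so a defensive reader can accept both. The helper names below are hypothetical and the sketch is not part of any official client.

```python
def unwrap_prediction(payload: dict) -> dict:
    """Return the prediction object whether or not it sits inside a "data" envelope."""
    inner = payload.get("data")
    if isinstance(inner, dict) and "status" in inner:
        return inner
    return payload


def first_output_url(payload: dict):
    """Return the first output URL of a finished prediction, or None if there is none yet."""
    prediction = unwrap_prediction(payload)
    outputs = prediction.get("outputs") or []
    if prediction.get("status") in ("completed", "succeeded") and outputs:
        return outputs[0]
    return None
```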

Atlas Cloud Skills

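Atlas Cloud Skills brings 400+ AI models directly into your AI coding assistant. Install it with one command, then generate images and videos and chat with LLMs in natural language.

Supported clients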

Claude Code

OpenAI Codex

Gemini CLI

Cursor

Windsurf

VS Code

Trae

GitHub Copilot

Cline

Roo Code

Amp

Goose

Replit

40+ supported clients

Installation

npx skills add AtlasCloudAI/atlas-cloud-skills

Set your API key

Get an API key from the Atlas Cloud console and set it as an environment variable.

export ATLASCLOUD_API_KEY="your-api-key-here"

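Features

Once installed, you can use natural language in your AI assistant to access all Atlas Cloud models.

Image generation: Generate images with models such as Nano Banana 2 and Z-Image.

Video creation: Create videos from text or images with models such as Kling, Vidu, and Veo.

LLM chat: Chat with large language models such as Qwen and DeepSeek.

Media upload: Upload local files for image-editing and image-to-video workflows.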

MCP Server

Atlas Cloud MCP Server connects your IDE to 400+ AI models through the Model Context Protocol. It works with any MCP-compatible client.

Supported clients

Cursor

VS Code

Windsurf

Claude Code

OpenAI Codex

Gemini CLI

Cline

Roo Code

100+ supported clients

Installation

npx -y atlascloud-mcp

Configuration

Add the following configuration to your IDE's MCP settings file.

{
  "mcpServers": {
    "atlascloud": {
      "command": "npx",
      "args": [
        "-y",
        "atlascloud-mcp"
      ],
      "env": {
        "ATLASCLOUD_API_KEY": "your-api-key-here"
      }
    }
  }
}

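Available tools

atlas_generate_image: Generate images from text prompts.

atlas_generate_video: Create videos from text or images.

atlas_chat: Chat with large language models.

atlas_list_models: Browse 400+ available AI models.

atlas_quick_generate: One-step content creation that automatically selects the best model.

atlas_upload_media: Upload local files for use in API workflows.

Learn more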

github.com/AtlasCloudAI/mcp-server

API Schema

{
  "info": {
    "title": "AtlasCloud API",
    "version": "1.0.0",
    "description": "The AtlasCloud API."
  },
  "openapi": "3.0.0",
  "paths": {
    "/api/v1/model/generateVideo": {
      "post": {
        "requestBody": {
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/Input"
              }
            }
          },
          "required": true
        },
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "The request status."
          }
        }
      },
      "x-api-name": "model_run"
    },
    "/api/v1/model/result/{request_id}": {
      "get": {
        "parameters": [
          {
            "in": "path",
            "name": "request_id",
            "required": true,
            "schema": {
              "description": "Request ID",
              "type": "string"
            }
          }
        ],
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "Result of the request."
          }
        }
      },
      "x-api-name": "model_result"
    }
  },
  "components": {
    "schemas": {
      "Input": {
        "properties": {
          "model": {
            "type": "string",
            "description": "model name",
            "default": "alibaba/wan-2.5/text-to-video"
          },
          "audio": {
            "description": "Audio URL to guide generation (optional).",
            "type": "string"
          },
          "duration": {
            "default": 5,
            "description": "The duration of the generated media in seconds.",
            "enum": [
              5,
              10
            ],
            "type": "integer",
            "x-ui-component": "select"
          },
          "enable_prompt_expansion": {
            "default": false,
            "description": "If set to true, the prompt optimizer will be enabled.",
            "type": "boolean"
          },
          "negative_prompt": {
            "description": "Negative prompt for the generation.",
            "type": "string"
          },
          "prompt": {
            "description": "The prompt for generating the output.",
            "type": "string",
            "x-rows": 10,
            "x-ui-component": "textarea"
          },
          "seed": {
            "default": -1,
            "description": "The random seed to use for the generation. -1 means a random seed will be used.",
            "type": "integer"
          },
          "size": {
            "default": "1920*1080",
            "description": "The size of the generated media in pixels (width*height).",
            "enum": [
              "832*480",
              "480*832",
              "624*624",
              "1280*720",
              "720*1280",
              "960*960",
              "1088*832",
              "832*1088",
              "1920*1080",
              "1080*1920",
              "1440*1440",
              "1632*1248",
              "1248*1632"
            ],
            "type": "string"
          },
          "generate_audio": {
            "default": true,
            "description": "Whether to automatically add audio to the generated video.",
            "type": "boolean"
          }
        },
        "required": [
          "model",
          "prompt",
          "size"
        ],
        "type": "object",
        "x-order-properties": [
          "model",
          "prompt",
          "negative_prompt",
          "audio",
          "size",
          "duration",
          "enable_prompt_expansion",
          "generate_audio",
          "seed"
        ]
      },
      "PredictionResponse": {
        "properties": {
          "created_at": {
            "description": "ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”).",
            "format": "date-time",
            "type": "string"
          },
          "has_nsfw_contents": {
            "description": "Array of boolean values indicating NSFW detection for each output.",
            "items": {
              "type": "boolean"
            },
            "type": "array"
          },
          "id": {
            "description": "Unique identifier for the prediction, the ID of the prediction to get.",
            "type": "string"
          },
          "model": {
            "description": "Model ID used for the prediction.",
            "type": "string"
          },
          "outputs": {
            "description": "Array of URLs to the generated content (empty when status is not completed).",
            "items": {
              "type": "object"
            },
            "type": "array"
          },
          "status": {
            "description": "Status of the task: created, processing, completed, or failed.",
            "type": "string"
          },
          "urls": {
            "description": "Object containing related API endpoints.",
            "type": "object"
          }
        },
        "type": "object"
      }
    }
  },
  "servers": [
    {
      "url": "https://api.atlascloud.ai"
    }
  ]
}

LLM-friendly prompt template

# alibaba/wan-2.5/text-to-video

> A speed-optimized text-to-video option that prioritizes lower latency while retaining strong visual fidelity. Ideal for iteration, batch generation, and prompt testing.


## Overview

- **Submit endpoint (POST)**: `https://api.atlascloud.ai/api/v1/model/generateVideo` — start an async generation; returns a `prediction_id`
- **Poll endpoint (GET)**: `https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}` — poll this until the prediction finishes
- **Model ID**: `alibaba/wan-2.5/text-to-video`


## API Information

This model can be used via our HTTP API or more conveniently via our client libraries.
See the input and output schema below, as well as the usage examples.


### Input Schema

The API accepts the following input parameters:

- **`model`** (`string`, _required_):
  model name
  - Default: `"alibaba/wan-2.5/text-to-video"`

- **`prompt`** (`string`, _required_):
  The prompt for generating the output.

- **`negative_prompt`** (`string`, _optional_):
  Negative prompt for the generation.

- **`audio`** (`string`, _optional_):
  Audio URL to guide generation (optional).

- **`size`** (`string`, _required_):
  The size of the generated media in pixels (width*height).
  - Default: `"1920*1080"`
  - Options: "832*480", "480*832", "624*624", "1280*720", "720*1280", "960*960", "1088*832", "832*1088", "1920*1080", "1080*1920", "1440*1440", "1632*1248", "1248*1632"

- **`duration`** (`integer`, _optional_):
  The duration of the generated media in seconds.
  - Default: `5`
  - Options: 5, 10

- **`enable_prompt_expansion`** (`boolean`, _optional_):
  If set to true, the prompt optimizer will be enabled.
  - Default: `false`

- **`generate_audio`** (`boolean`, _optional_):
  Whether to automatically add audio to the generated video.
  - Default: `true`

- **`seed`** (`integer`, _optional_):
  The random seed to use for the generation. -1 means a random seed will be used.
  - Default: `-1`
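The `size` options above mix landscape, portrait, and square formats. As a rough illustration of choosing among them, the sketch below classifies each option by orientation and picks the one with the most pixels. The helper names are hypothetical, and the list is copied from the options above.

```python
SIZES = [
    "832*480", "480*832", "624*624", "1280*720", "720*1280", "960*960",
    "1088*832", "832*1088", "1920*1080", "1080*1920", "1440*1440",
    "1632*1248", "1248*1632",
]


def dimensions(size: str) -> tuple:
    """Split a "width*height" string into integer width and height."""
    width, height = (int(part) for part in size.split("*"))
    return width, height


def orientation(size: str) -> str:
    """Classify a size as "landscape", "portrait", or "square"."""
    width, height = dimensions(size)
    if width == height:
        return "square"
    return "landscape" if width > height else "portrait"


def largest_size(kind: str) -> str:
    """Return the supported size with the most pixels for the given orientation."""
    candidates = [s for s in SIZES if orientation(s) == kind]
    return max(candidates, key=lambda s: dimensions(s)[0] * dimensions(s)[1])


print(largest_size("portrait"))  # 1080*1920
```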



**Required Parameters Example**:

```json
{
  "model": "alibaba/wan-2.5/text-to-video",
  "prompt": "",
  "size": "1920*1080"
}
```


**Full Example**:

```json
{
  "model": "alibaba/wan-2.5/text-to-video",
  "prompt": "",
  "negative_prompt": "",
  "audio": "",
  "size": "1920*1080",
  "duration": 5,
  "enable_prompt_expansion": false,
  "generate_audio": true,
  "seed": -1
}
```


### Output Schema

The API returns the following output format:


- **`created_at`** (`string`, _optional_):
  ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”).

- **`has_nsfw_contents`** (`array[boolean]`, _optional_):
  Array of boolean values indicating NSFW detection for each output.

- **`id`** (`string`, _optional_):
  Unique identifier for the prediction, the ID of the prediction to get.

- **`model`** (`string`, _optional_):
  Model ID used for the prediction.

- **`outputs`** (`array[object]`, _optional_):
  Array of URLs to the generated content (empty when status is not completed).

- **`status`** (`string`, _optional_):
  Status of the task: created, processing, completed, or failed.

- **`urls`** (`object`, _optional_):
  Object containing related API endpoints.



**Example Response**:

```json
{
  "created_at": "",
  "has_nsfw_contents": [],
  "id": "",
  "model": "",
  "outputs": [],
  "status": "",
  "urls": {}
}
```


## Usage Examples

### cURL

```bash
# Step 1: Start generation (async)
curl -X POST "https://api.atlascloud.ai/api/v1/model/generateVideo" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "alibaba/wan-2.5/text-to-video",
  "prompt": "",
  "negative_prompt": "",
  "audio": "",
  "size": "1920*1080",
  "duration": 5,
  "enable_prompt_expansion": false,
  "generate_audio": true,
  "seed": -1
}'

# Response will contain: {"code": 200, "data": {"id": "prediction_id", "status": "processing"}}

# Step 2: Poll for result (replace {prediction_id} with the id returned above)
curl -X GET "https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY"

# Keep polling until status is "completed", "succeeded" or "failed"
# When completed, outputs will contain the generated content URL(s)
```

## Additional Resources

### Documentation

- [Model Playground](https://www.atlascloud.ai/models/alibaba/wan-2.5/text-to-video)

A middle-aged man sitting at a wooden desk in a cozy study room, surrounded by bookshelves and a warm lamp glow. He opens an old book and reads aloud with a calm, deep voice: 'History teaches us more than just facts… it shows us who we are.' The room has subtle background sounds: pages turning, the faint ticking of a clock, and distant rain against the window.

A young man in his early 30s sits in a modern studio, wearing a navy blazer and white shirt. Soft lighting illuminates his face. He speaks directly to the camera, his lips moving naturally as he says: “Welcome to today’s interview. We’re going to explore how AI is changing our daily lives.” His gestures are subtle, occasionally raising his hands for emphasis, creating a professional and engaging tone.

A cinematic opening sequence of a sci-fi movie: a spaceship travels across the galaxy, and the movie title “星河远征 · Galactic Odyssey” emerges in golden 3D letters, with flawless kerning and no distortion, floating stably in space as the camera rotates.

A handsome, muscular man with well-defined abs is catching his breath after an intense workout. Sweat drips down his torso. He is shirtless, wearing only black athletic shorts, and is leaning against gym equipment. The lighting comes from the upper side, highlighting the contours of his chest and arms. The scene is filled with a raw, masculine energy, hyper-realistic, high-contrast lighting.

A graceful ballerina with her hair in a messy bun, performing a powerful and emotional contemporary ballet routine. She is in a minimalist, dark art studio. Abstract patterns of light and shadow, projected from a hidden source, dance across her body and the surrounding walls, constantly shifting with her movements. The camera focuses on the tension in her muscles and the expressive gestures of her hands. A single, dramatic slow-motion shot captures her mid-air leap, with the light patterns swirling around her like a galaxy. Moody, artistic, high contrast.

A young couple sitting on a park bench during sunset. The woman leans her head on the man’s shoulder. He whispers softly: 'No matter where we go, I’ll always be here with you.' The sound includes the rustling of leaves, distant laughter of children playing, and the gentle hum of cicadas in the evening air.

A low-angle panning shot of a concrete wall under a highway overpass at night. Graffiti of a young man comes to life and starts rapping. The style is a dynamic blend of 2D street art animation on a realistic, dark, cinematic background. Cityscape is visible in the distance.

A 3D animated, anthropomorphic badger wearing a brown leather vest is angrily sweeping yellow autumn leaves from the doorway of his rustic wooden cabin. The style is reminiscent of a Pixar film, with detailed fur and expressive animation. Sunny day, lush green meadow with a forest in the background.

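Why choose Wan 2.5?

More affordable

Despite Google's recent price cuts, Veo 3 remains expensive overall. Wan 2.5 is lightweight and cost-effective, giving creators more options while substantially lowering production costs.

One-step generation, end-to-end sync

With Wan 2.5 there is no need to record voice separately or align lip movements by hand. Provide a clear, structured prompt and generate a complete video with audio, voice-over, and lip-sync in one pass, which is faster and simpler.

Multilingual-friendly

When the prompt is in Chinese, Wan 2.5 reliably generates videos with synchronized audio and video. By comparison, Veo 3 often reports "unknown language" for Chinese prompts.

Accurate character reproduction

Wan 2.5 excels at reproducing character traits, accurately rendering a character's appearance, expressions, and movement style. This makes generated characters more recognizable and distinctive and strengthens narrative and immersion.

Artistic style rendering

Supports Ghibli-style rendering with a hand-painted watercolor texture and animation effects, giving a warm, dreamlike look that adds artistic appeal and narrative depth.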

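Who benefits?

Marketing teams

Whether for product launches, promotions, or brand campaigns, Wan 2.5 helps you generate high-quality videos quickly, making creation simple and efficient.

Product demos and tutorials without coordination headaches
Social media marketing with multilingual subtitles and lip-sync
AI-generated content that lets the team focus on strategy and creativity

Bottom line: Creation has never been this simple, fast, and smart. Wan 2.5 is your secret weapon for marketing!

Global enterprises

An ideal content localization solution for multinational companies, making creation easier and more efficient.

Multilingual video support with prompt recognition
One-click generation of lip-synced subtitles and voice-over
Fast content localization for global markets

Bottom line: Cross-border content creation has never been this simple, fast, and smart.

Storytellers / YouTubers

Creators can use Wan 2.5 to produce videos more efficiently while keeping output quality high.

Immersive storytelling with precise character movement and expressions
Faster publishing with less editing and post-production time
Varied content, from short videos to animated story clips

Corporate training teams

Wan 2.5 makes corporate training more efficient and more engaging.

Professional videos replace dry text documents
Quickly create how-to demos and training tutorials
Consistent style and standardized output for easy global rollout

Freelance creatives / small studios

Wan 2.5 lets creativity flow freely without expensive equipment or actors; the AI generates everything efficiently.

Experiment with varied work, from short films to social media content
From inspiration to finished piece with "one-click generation"
High-quality content without expensive equipment or professional actors

Bottom line: Wan 2.5 makes creation easier, freer, and more exciting, with impressive results on every attempt!

Educational institutions / online course creators

Turn ideas into reality without high costs. Wan 2.5 makes producing quality content simple and economical.

Experiment with styles, from short films to promotional videos
Higher production efficiency from concept to finished product
Quality content without expensive equipment or professional talent

Bottom line: Wan 2.5 makes creation easy, efficient, and free, with great results on every attempt!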

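Core features

One-step audio-video generation

Generate complete videos with synchronized audio, voice-over, and lip-sync in a single pass

Dual-character sync

Generate two characters at once, with synchronized movement, expressions, and lip-sync and natural interaction

Professional quality

High-quality video output with realistic character expressions and precise lip-sync

Multilingual support

Strong support for Chinese prompts and reliable generation of multilingual content

Cost-effective

Substantially lower cost than competing products while maintaining professional quality

Character trait reproduction

Accurately reproduces character appearance, expressions, and movement style with high fidelity and individuality

Artistic style rendering

Supports a range of artistic styles, including Ghibli-style hand-painted watercolor textures

Immersive scenes

Well suited to dialogue scenes, interviews, or two-person shorts, with natural audio-visual consistency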

Digital Human Sync

Study Room Scholar

Middle-aged man reading with perfect lip-sync in a warm study environment

Lip-sync with audio · Environmental sounds · Character emotion

Prompt

Dual Character Scene

Park Sunset Romance

Couple interaction with synchronized dual character actions and expressions

Dual character sync · Natural interaction · Ambient soundscape

Prompt

A young couple sitting on a park bench during sunset. The woman leans her head on the man's shoulder. He whispers softly: 'No matter where we go, I'll always be here with you.' The sound includes the rustling of leaves, distant laughter of children playing, and the gentle hum of cicadas in the evening air.

Character Restoration

Ballet Performance Art

Precise character trait restoration with artistic movement and expression

Character trait restoration · Movement precision · Artistic lighting

Prompt

Artistic Style Rendering

Ghibli Forest Magic

Studio Ghibli-inspired animation with hand-painted watercolor texture

Ghibli art style · Hand-painted texture · Magical atmosphere

Prompt

Studio Ghibli-inspired anime style. A young girl with a straw hat lies peacefully in a sun-dappled magical forest, surrounded by friendly, glowing forest spirits (Kodama). A gentle breeze rustles the leaves of the giant, ancient trees. The air is filled with sparkling dust motes, illuminated by shafts of sunlight. The art style is soft, with a hand-painted watercolor texture. The scene feels serene, magical, and heartwarming.

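Use cases

🎬 Video production

📢 Marketing content

🎓 Educational videos

📱 Social media

🌐 Multilingual content

💼 Corporate training

🎭 Entertainment

💃 Performing arts

🎨 Animation and anime

📚 Storytelling

👥 Dual-character videos

🎙️ Interviews

📺 Broadcast media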

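Technical specifications

Model type: Synchronized audio-video generation

Core features: Audio-video sync, character reproduction, artistic rendering, multilingual

Language support: Chinese, English, and others

Output quality: Professional HD video with audio

Generation speed: Fast one-step generation

API integration: RESTful API with full documentation

Experience Wan 2.5: your video creation revolution

Join thousands of creators and businesses transforming their video content creation with synchronized audio-video generation.

🎬 One-step audio-video sync

🌍 Multilingual support

⚡ Cost-effective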

Wan 2.5: A next-generation AI video generation model developed by Alibaba Wanxiang.

Model Card Overview

Field	Description
Model Name	Wan 2.5
Developed By	Alibaba Group
Release Date	September 24, 2025
Model Type	Generative AI, Video Foundation Model
Related Links	Official Website: https://wan.video/, Hugging Face: https://huggingface.co/Wan-AI, Technical Paper (Wan Series): https://arxiv.org/abs/2503.20314

Introduction

Wan 2.5 is a state-of-the-art, open-source video foundation model developed by Alibaba's Wan AI team. It is designed to generate high-quality, cinematic videos complete with synchronized audio directly from text or image prompts. The model represents a significant advancement in the field of generative AI, aiming to lower the barrier for creative video production. Its core contribution lies in its ability to produce coherent, dynamic, and narratively consistent video clips with a high degree of realism and integrated audio-visual elements, such as lip-sync and sound effects, in a single, streamlined process.

Key Features & Innovations

Wan 2.5 introduces several key features that distinguish it from previous models and competitors:

Unified Audio-Visual Synthesis: Unlike many models that require separate steps for video and audio generation, Wan 2.5 creates video with natively synchronized audio, including voice, sound effects, and lip-sync, in one step.
High-Fidelity, High-Resolution Output: The model is capable of generating videos in multiple resolutions, including 480p, 720p, and full 1080p HD, with significant improvements in visual quality and frame-to-frame stability over its predecessors.
Extended Video Duration: Wan 2.5 can generate video clips up to 10 seconds in length, offering more creative flexibility for storytelling compared to other models in its class.
Advanced Cinematic Control: The model demonstrates a sophisticated understanding of cinematic language, allowing for precise control over camera movement, shot composition, and character consistency within scenes.
Open-Source Commitment: Following the precedent set by earlier versions, the Wan series of models, including Wan 2.5, are open-sourced to encourage research, development, and innovation within the broader AI community.

Model Architecture & Technical Details

Wan 2.5 is built upon the Diffusion Transformer (DiT) paradigm, which has become a mainstream approach for high-quality generative tasks. The technical report for the Wan model series outlines a suite of innovations that contribute to its performance.

The architecture includes a novel Variational Autoencoder (VAE) designed for high-efficiency video compression, enabling the model to handle high-resolution video data effectively. The Wan series is available in multiple sizes to balance performance and computational requirements, such as the 1.3B and 14B parameter models detailed for Wan 2.2. The model was trained on a massive, curated dataset comprising billions of images and videos, which enhances its ability to generalize across a wide range of motions, semantics, and aesthetic styles.

Intended Use & Applications

Wan 2.5 is designed for a wide array of applications in creative and commercial fields. Its intended uses include:

Content Creation: Generating short-form videos for social media, marketing campaigns, and digital advertising.
Storytelling and Filmmaking: Creating cinematic scenes, character animations, and narrative sequences for short films and conceptual art.
Prototyping: Rapidly visualizing scripts and storyboards for film, television, and game development.
Personalized Media: Enabling users to create unique, personalized video content from their own ideas and images.

Performance

Wan 2.5 has demonstrated significant performance improvements over previous versions and holds a competitive position against other leading video generation models. Independent reviews and benchmarks provide insight into its capabilities.

Benchmark Scores

A review conducted by Curious Refuge Labs™ evaluated the model's visual generation capabilities across several metrics.

Metric	Score (out of 10)
Prompt Adherence	7.0
Temporal Consistency	6.6
Visual Fidelity	6.5
Motion Quality	5.9
Style & Cinematic Realism	5.7
Overall Score	6.3

These scores indicate strong prompt understanding and a notable improvement in visual quality from Wan 2.2, although it still shows limitations in complex motion and realism compared to top-tier commercial models.
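As a quick consistency check on the table, the overall score matches the unweighted mean of the five metric scores rounded to one decimal place. The review's actual aggregation method is not stated here, so treating it as a simple mean is an assumption.

```python
scores = {
    "Prompt Adherence": 7.0,
    "Temporal Consistency": 6.6,
    "Visual Fidelity": 6.5,
    "Motion Quality": 5.9,
    "Style & Cinematic Realism": 5.7,
}


def mean_score(values) -> float:
    """Return the unweighted mean of the given scores, rounded to one decimal place."""
    values = list(values)
    return round(sum(values) / len(values), 1)


print(mean_score(scores.values()))  # 6.3
```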

Explore similar models

Van-2.5 Text-to-video

Convert prompts into cinematic video clips with synchronized sound. Van 2.5 generates 720p/1080p outputs with stable motion, native audio sync, and prompt-faithful visual storytelling.

Van-2.5 Image-to-video

Get animated visuals from your images faster without major quality sacrifice. Perfect for preview workflows, previews at scale, or mass production of animated assets.

HappyHorse-1.1 Reference-to-video

Generates videos from one to nine reference images and a text prompt, supporting 720P or 1080P output, flexible aspect ratios, and durations from 3 to 15 seconds.

HappyHorse-1.1 Image-to-video

Animates a first-frame image into video with optional prompt guidance, 720P or 1080P output, and durations from 3 to 15 seconds.

HappyHorse-1.1 Text-to-video

Generates videos from text prompts with HappyHorse 1.1, supporting 720P or 1080P output, flexible aspect ratios, and durations from 3 to 15 seconds.

HappyHorse-1.0 Image-to-video

Animates a first-frame image into video with optional prompt guidance, 720P or 1080P output, and durations from 3 to 15 seconds.

HappyHorse-1.0 Text-to-video

Generates videos from text prompts with HappyHorse 1.0, supporting 720P or 1080P output, flexible aspect ratios, and durations from 3 to 15 seconds.

HappyHorse-1.0 Video-edit

Edits an input video with text instructions and optional reference images, supporting 720P or 1080P output.

HappyHorse-1.0 Reference-to-video

Generates videos from one to nine reference images and a text prompt, supporting 720P or 1080P output, flexible aspect ratios, and durations from 3 to 15 seconds.

From $0.14/second