Kling V3.0 API: AI Director Video with Native Audio

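The Kling 3.0 API brings Kuaishou's flagship video suite to Atlas Cloud through a single OpenAI-compatible key. It includes two models: Kling 3.0 for AI Director storytelling, multilingual lip-sync, and precise on-screen text; and Kling 3.0 Omni (O3) for cloning subjects and voices from a short video or image. Both generate native audio in the same pass and support output up to 4K. Build cinematic storytelling, global marketing, multilingual ads, and serialized character content on reliable infrastructure.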

Explore Leading Models

Atlas Cloud gives you access to the latest industry-leading creative models.

Kling V3.0 Turbo Text-to-Video

Kling V3.0 Turbo Text-to-Video generates dynamic cinematic videos from text prompts using MVL technology. Supports first/last frame control and audio generation.

Kling V3.0 Turbo Image-to-Video

Kling V3.0 Turbo Image-to-Video transforms static images into dynamic cinematic videos using MVL technology. Supports first/last frame control and audio generation.

Kling Video O3 4K Text-to-Video

Kling Omni Video O3 (4K) is Kuaishou's advanced unified multi-modal video model with MVL (Multi-modal Visual Language) technology. Generates high-quality videos from text prompts with natural motion and audio generation support.

Kling Video O3 4K Image-to-Video

Kling Omni Video O3 (4K) Image-to-Video transforms static images into dynamic cinematic videos using MVL technology. Supports first/last frame control and audio generation.

Kling v3.0 4K Image-to-Video

Kling v3.0 4K Image-to-Video model by Kuaishou. High-quality video generation from images.

Kling v3.0 Std Image-to-Video

Kling v3.0 Standard Image-to-Video model by Kuaishou. High-quality video generation from images.

Kling v3.0 Pro Image-to-Video

Kling v3.0 Professional Image-to-Video model by Kuaishou. Premium quality video generation from images with advanced features.

Kling v3.0 Pro Text-to-Video

Kling v3.0 Professional Text-to-Video model by Kuaishou. Premium quality video generation from text prompts with advanced features.

Kling v3.0 4K Text-to-Video

Kling v3.0 4K Text-to-Video model by Kuaishou. High-quality video generation from text prompts.

Kling v3.0 Std Text-to-Video

Kling v3.0 Standard Text-to-Video model by Kuaishou. High-quality video generation from text prompts.

Kling Video O3 Pro Text-to-Video

Kling Omni Video O3 is Kuaishou's advanced unified multi-modal video model with MVL (Multi-modal Visual Language) technology. Professional quality with enhanced motion and detail.

Kling Video O3 Pro Image-to-Video

Kling Omni Video O3 Image-to-Video transforms static images into dynamic cinematic videos using MVL technology. Professional quality with first/last frame control and audio generation.

Kling Video O3 Pro Reference-to-Video

Kling Omni Video O3 Reference-to-Video generates creative videos using character, prop, or scene references. Professional quality with up to 7 reference images and optional video input.

Kling Video O3 Pro Video-Edit

Kling Omni Video O3 Video-Edit enables conversational video editing through natural language commands. Professional quality with object removal/replacement, background changes, and effects.

Kling Video O3 Std Video-Edit

Kling Omni Video O3 Video-Edit (Standard) enables natural-language video edits: remove or replace objects, change backgrounds, add effects, and more. Video duration limited to 10s.

Kling Video O3 Std Reference-to-Video

Kling Omni Video O3 (Standard) Reference-to-Video generates creative videos using character, prop, or scene references. Supports up to 7 reference images and optional video input.

Kling Video O3 Std Image-to-Video

Kling Omni Video O3 (Standard) Image-to-Video transforms static images into dynamic cinematic videos using MVL technology. Supports first/last frame control and audio generation.

Kling Video O3 Std Text-to-Video

Kling Omni Video O3 (Standard) is Kuaishou's advanced unified multi-modal video model with MVL (Multi-modal Visual Language) technology. Generates high-quality videos from text prompts with natural motion and audio generation support.

From $0.071/second (15% off the $0.084/second list price)
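
As a rough illustration of per-second billing, the sketch below estimates what a clip might cost at the two rates shown above. Actual rates vary by model and tier, and `estimate_cost` is a hypothetical helper written for this example, not part of any Atlas Cloud SDK.

```python
from decimal import Decimal

# Per-second rates taken from the figures above; real rates vary by model and tier.
LIST_RATE = Decimal("0.084")
DISCOUNTED_RATE = Decimal("0.071")


def estimate_cost(duration_seconds: int, rate_per_second: Decimal) -> Decimal:
    """Estimate the USD cost of a clip billed per second (hypothetical helper)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Decimal avoids binary floating-point rounding on currency values.
    return rate_per_second * duration_seconds


if __name__ == "__main__":
    for seconds in (5, 10):
        print(f"{seconds}s clip: ${estimate_cost(seconds, DISCOUNTED_RATE)}")
```

Using `Decimal` rather than `float` keeps small per-second amounts exact when they are summed over many clips.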

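Modality	Description
Kling 3.0 Std T2V API (Text To Video)	The Kling 3.0 Std T2V API lets developers turn text prompts into cinematic video clips. By specifying camera movement, scene, and action, it generates smooth content with synchronized audio and video, optimized for professional storyboarding, dynamic marketing, and social media storytelling.
Kling 3.0 Std I2V API (Image To Video)	The Kling 3.0 Std I2V API converts static images and text prompts into video clips. With support for reference-frame and end-frame control, it guides the motion trajectory and generates audio-synchronized content for visual continuity and standard marketing assets.
Kling 3.0 Pro T2V API (Text To Video)	The Kling 3.0 Pro T2V API generates high-fidelity video from text prompts with advanced physics and cinematic textures. It supports multi-shot storytelling and offers more detail and visual complexity than the Standard version.
Kling 3.0 Pro I2V API (Image To Video)	The Kling 3.0 Pro I2V API converts images into high-resolution video with enhanced detail preservation. It provides professional-grade camera control and precise audio-visual synchronization for high-end commercial production.
Kling Video O3 Std T2V API (Text To Video)	The Kling Video O3 Std T2V API generates video from text. It supports native audio generation.
Kling Video O3 Std I2V API (Image To Video)	The Kling Video O3 Std I2V API generates video from images and text with high fidelity to the reference. It is designed for tasks that need stable character or product presentation in standard-resolution workflows.
Kling Video O3 Std R2V (Video To Video)	The Kling Video O3 Std R2V API generates creative videos from character, prop, or scene references. It supports up to 7 reference images and an optional video input, and offers video restyling and attribute editing for standard-quality social media and experimental content.
Kling Video O3 Std Video Edit API (Video To Video)	The Kling Video O3 Std Video Edit API supports natural-language video editing: remove or replace objects, change backgrounds, add effects, and more.
Kling Video O3 Pro T2V API (Text To Video)	The Kling Video O3 Pro T2V API provides text-to-video generation. It delivers professional-grade character consistency and cinematic lighting in complex scenes for film-quality storytelling.
Kling Video O3 Pro I2V API (Image To Video)	The Kling Video O3 Pro I2V API uses a reference-first architecture to convert images into professional-quality video. It preserves visual detail with high fidelity and produces smooth motion, suited to high-end digital marketing and visual effects.
Kling Video O3 Pro R2V (Video To Video)	Kling Video O3 Pro R2V provides video transformation and restyling. It offers pixel-level control and motion stability for professional video editing and high-end visual modification.
Kling Video O3 Pro Video Edit (Video To Video)	Kling Video O3 Pro Video Edit applies high-quality video modifications from natural-language prompts. It offers advanced object removal, background replacement, and effects integration with professional-grade precision and detail preservation.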

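Kling 3.0 API Features and Showcase

The Kling 3.0 API brings Kuaishou's cinematic toolkit to Atlas Cloud: an AI Director for multi-shot storytelling, multilingual lip-sync and on-screen text, subject and voice cloning, native audio, reference control, and output up to 4K.

Intelligent Cinematic Storytelling (Kling 3.0)

Kling 3.0 introduces an "AI Director" capability that infers the narrative arc from the prompt and automatically arranges shot composition and camera angles, enabling advanced film techniques such as shot-reverse-shot dialogue sequences. It delivers a complete visual narrative in a single generation, putting complex cinematic expression within reach of every creator.

Native Audio in a Single Pass

Kling 3.0 generates speech, sound effects, and background audio in the same pass as the video, so the finished clip arrives with sound already matched to the action. No separate audio model or post-production step is needed, which keeps dialogue, effects, and ambient sound in sync with the picture.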

Native 4K Output

Kling 3.0 renders at resolutions up to native 4K, holding fine texture, lighting, and depth that survive on large screens and tight crops. The same prompt scales from quick standard-resolution drafts to a high-resolution master, so previews and final renders come from one model.

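Multilingual Lip-Sync and High-Fidelity Text (Kling 3.0)

Kling 3.0 maps text precisely to on-screen characters and supports mixed-language dialogue and dialects across Chinese, English, Japanese, Korean, and Spanish, with natural, fluid lip-sync. This directly serves e-commerce and global marketing needs for high-fidelity text display and localized content.

Professional-Grade Subject Consistency (Kling O3)

Kling O3 extracts a person's features from an uploaded or recorded 3 to 8 second video and faithfully reproduces their face, build, and mannerisms. It lets creators star in their own films and suits short dramas and serialized content that demand strict character consistency.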

Reference-to-Video and Multi-Element Control

Kling O3 takes up to 7 reference images plus an optional video to lock characters, props, and scenes across a generation. It reproduces each referenced element faithfully, so a specific face, object, and setting stay consistent shot to shot, the foundation for branded series and template-style content.
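
The reference limits described above can be enforced on the client before a request is sent. The following is a minimal sketch: the payload field names (`prompt`, `reference_images`, `reference_video`) are assumptions made for illustration and are not taken from the Atlas Cloud API reference; only the limit of 7 reference images comes from the text.

```python
from typing import Optional

MAX_REFERENCE_IMAGES = 7  # limit stated in the text above


def build_reference_request(
    prompt: str,
    reference_images: list[str],
    reference_video: Optional[str] = None,
) -> dict:
    """Assemble a reference-to-video payload. All field names are hypothetical."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if len(reference_images) > MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"at most {MAX_REFERENCE_IMAGES} reference images are supported, "
            f"got {len(reference_images)}"
        )
    payload = {"prompt": prompt, "reference_images": list(reference_images)}
    # The video reference is optional, so it is only included when provided.
    if reference_video is not None:
        payload["reference_video"] = reference_video
    return payload
```

Rejecting an oversized reference list locally avoids spending a request on input the service would presumably refuse.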

One Prompt, Many Models: Kling 3.0 API

Run the same prompt through the Kling 3.0 API and other leading video models on Atlas Cloud, and compare how each handles cinematic motion, character consistency, and audio in a single scene.

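Prompt

A cinematic multi-shot action sequence, 10 seconds long. Shot 1, low tracking: a lone rider gallops across a windswept desert ridge at golden hour, dust kicking up behind the hooves. Shot 2, hard cut to side tracking: the horse leaps a deep ravine, its mane and the rider's cloak whipping in the wind mid-air. Shot 3, whip pan to a high aerial shot: the rider weaves between towering rock pillars as a sandstorm rolls in behind. Shot 4, fast push-in: close-up on the rider's determined eyes beneath a tattered hood, grit sweeping past the lens. Shot 5, dramatic wide: horse and rider skid to a halt at a cliff edge overlooking a vast canyon, cloak billowing, sunlight bursting into a flare. Dynamic camera work, volumetric light, flying dust and sand, photorealistic.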
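
The prompt above follows a regular pattern: an opening line, numbered shots that each pair a camera move with an action, and a closing style line. One way to assemble such prompts programmatically is sketched below. `Shot` and `compose_prompt` are hypothetical helpers written for this example; the "Shot N, camera: action" layout simply mirrors the sample prompt and is not a documented requirement of the model.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    camera: str  # camera move or framing, e.g. "low tracking"
    action: str  # what happens in the shot


def compose_prompt(intro: str, shots: list[Shot], style: str) -> str:
    """Join an intro, numbered shots, and a style line into one prompt string."""
    parts = [intro.strip()]
    for index, shot in enumerate(shots, start=1):
        parts.append(f"Shot {index}, {shot.camera}: {shot.action.strip()}")
    parts.append(style.strip())
    return " ".join(parts)


if __name__ == "__main__":
    prompt = compose_prompt(
        "A cinematic multi-shot action sequence, 10 seconds long.",
        [
            Shot("low tracking", "a lone rider gallops across a desert ridge."),
            Shot("dramatic wide", "horse and rider halt at a cliff edge."),
        ],
        "Dynamic camera work, volumetric light, photorealistic.",
    )
    print(prompt)
```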

Kling V3.0

Seedance 2.0

Kling V2.6 Pro

What You Can Build with the Kling 3.0 API

From cinematic storytelling and multilingual marketing to character cloning and precise video editing, the Kling 3.0 API turns text, images, and reference clips into production-ready video with native audio.

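Dynamic Physics Simulation with the Kling 3.0 API

Kling 3.0 uses advanced physics modeling to generate realistic interactions between complex objects, including fluid dynamics, cloth motion, and structural collisions. By simulating real-world gravity and material properties, the API produces high-fidelity motion for professional visual effects, realistic product ads, and technical demos that require physical accuracy.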

Cinematic Storytelling with an AI Director

Kling 3.0 reads a prompt like a shot list and plans the sequence for you, setting shot composition, camera angles, and transitions, including shot-reverse-shot dialogue. It delivers a multi-shot visual narrative in a single generation instead of one isolated clip, a fast path to previs, trailers, and social hooks without booking a crew.

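Precise Video Editing and Transformation with the Kling 3.0 API

The Kling 3.0 API performs complex video-to-video modifications from natural-language instructions, supporting seamless background replacement, object removal, and style transfer. It preserves the original motion structure while changing specific visual attributes, which streamlines post-production for creative agencies and social media platforms that need efficient, high-resolution content iteration.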

Subject and Voice Cloning for Serialized Content

Kling O3 extracts a character's appearance and voice from a short 3 to 8 second video or an image, then reproduces that subject across new clips with matching lip-sync. It keeps a face, build, and voice consistent from episode to episode, which suits short dramas, digital hosts, and serialized social content where the same character has to return on demand.

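Consistent Character Storytelling with the Kling 3.0 API

Using reference-driven generation, Kling 3.0 maintains strict character and style consistency across multiple generated clips. This lets developers build coherent multi-shot sequences with stable facial features and ambient lighting. It is well suited to digital-human creation, serialized storytelling, and brand-consistent marketing campaigns that require visual uniformity.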

Multilingual Dialogue and On-Screen Text

Kling 3.0 renders crisp, readable on-screen text and speaks in multiple languages, with natural lip-sync across Chinese, English, Japanese, Korean, and Spanish, plus mixed-language delivery in one clip. You can assign dialogue to each character so scenes with several speakers stay clear, which fits e-commerce, localized campaigns, and global marketing that depend on accurate text and voice.

How the Kling 3.0 API Compares

See how the Kling 3.0 API lines up against other leading video models on inputs, duration, resolution, and native audio, so you can match each project to the model that fits.

Model	Input types	Output duration	Resolution	Audio generation
Kling 3.0	Text, image, video	5s, 10s	720P	Yes
Kling O1	Text, image	5s, 10s	720P	No
Kling 2.6	Text, image, video	5s, 10s	720P	Yes
Seedance 2.0	Text, image, video, audio	4-15s	2K, 1080P, 720P, 480P	Yes
Veo 3.1	Text, image	4s, 6s, 8s	1080P, 720P	Yes
Wan 2.6	Text, image, video, audio	5s, 10s, 15s	1080P, 720P	Yes
Hailuo 2.3	Text, image	5s	1080P	No
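
To match a project to a model, the table can be encoded as data and filtered by requirement. The sketch below transcribes the rows above. Reading the Seedance 2.0 duration range as whole seconds is an assumption, and `ModelSpec` and `pick_models` are hypothetical names used only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    inputs: frozenset
    durations: frozenset  # supported clip lengths in seconds
    resolutions: frozenset
    native_audio: bool


# Rows transcribed from the comparison table above.
MODELS = [
    ModelSpec("Kling 3.0", frozenset({"text", "image", "video"}),
              frozenset({5, 10}), frozenset({"720P"}), True),
    ModelSpec("Kling O1", frozenset({"text", "image"}),
              frozenset({5, 10}), frozenset({"720P"}), False),
    ModelSpec("Kling 2.6", frozenset({"text", "image", "video"}),
              frozenset({5, 10}), frozenset({"720P"}), True),
    # 4 to 15 s range; whole-second steps are an assumption.
    ModelSpec("Seedance 2.0", frozenset({"text", "image", "video", "audio"}),
              frozenset(range(4, 16)),
              frozenset({"2K", "1080P", "720P", "480P"}), True),
    ModelSpec("Veo 3.1", frozenset({"text", "image"}),
              frozenset({4, 6, 8}), frozenset({"1080P", "720P"}), True),
    ModelSpec("Wan 2.6", frozenset({"text", "image", "video", "audio"}),
              frozenset({5, 10, 15}), frozenset({"1080P", "720P"}), True),
    ModelSpec("Hailuo 2.3", frozenset({"text", "image"}),
              frozenset({5}), frozenset({"1080P"}), False),
]


def pick_models(input_type: str, duration: int, resolution: str,
                need_audio: bool) -> list[str]:
    """Return names of models whose table row satisfies every requirement."""
    return [
        m.name for m in MODELS
        if input_type in m.inputs
        and duration in m.durations
        and resolution in m.resolutions
        and (m.native_audio or not need_audio)
    ]


if __name__ == "__main__":
    print(pick_models("video", 10, "720P", need_audio=True))
```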

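How to Use Kling V3.0 on Atlas Cloud

Get started in minutes: follow the simple steps below to integrate and deploy the model through the Atlas Cloud platform.

Create an Atlas Cloud Account

Sign up at atlascloud.ai and complete verification. New users receive free credits to explore the platform and test models.

Why Use Kling V3.0 on Atlas Cloud

Pairing the advanced Kling V3.0 models with Atlas Cloud's GPU-accelerated platform delivers unmatched performance, scalability, and developer experience.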

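Performance and Flexibility

Low latency:
GPU-optimized inference for real-time responses.

Unified API:
Integrate once and use Kling V3.0, GPT, Gemini, and DeepSeek.

Transparent pricing:
Per-token billing, with Serverless mode supported.

Enterprise and Scale

Developer experience:
SDKs, analytics, fine-tuning tools, and templates all included.

Reliability:
99.99% availability, RBAC access control, and compliance logging.

Security and compliance:
SOC 2 Type II certification, HIPAA compliance, and US data sovereignty.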

Kling 3.0 API: Frequently Asked Questions

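By integrating video subject references, image subject references, and voice/tone references.

The Standard tier balances generation speed and quality, making it suitable for social media content and rapid prototyping. The Pro tier is built for professional film and video needs, with more realistic simulation of physical motion and finer material textures.

R2V focuses on "global restyling," such as converting live-action footage into a specific animated or realistic art style. Video Edit, by contrast, focuses on "instruction-based modification," supporting precise post-production operations such as adding, removing, or changing specific elements in a video.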

Kling 3.0 produces clips in the 5 to 10 second range, with resolution options up to 4K on the dedicated 4K models. Standard and Pro tiers cover everyday and high-fidelity work, while the 4K variants are there when you need maximum detail. Set the resolution and duration per request to balance quality, speed, and cost.

Standard balances speed and quality for social content and rapid prototyping. Pro targets professional film and video work, with more realistic physics and finer material detail. Turbo is the accelerated option for faster turnaround. All tiers share the same endpoints, so you can move a job between them without changing your integration.

Kling 3.0 renders crisp, readable text directly in the frame and generates natural lip-sync across several languages, including Chinese, English, Japanese, Korean, and Spanish, with mixed-language delivery in one clip. You can assign dialogue to specific characters so scenes with multiple speakers stay clear, which suits e-commerce, localization, and global marketing.

Kling O3 extracts a subject's appearance and voice from a short 3 to 8 second video or an image, then reproduces that character across new clips with matching lip-sync. Combined with reference images for props and scenes, this keeps a face, build, and voice stable from shot to shot, which is what serialized stories and digital hosts need.

Yes. The Kling O3 video editing endpoint applies natural-language instructions to footage, including object removal and replacement, background changes, and added effects. Reference-to-video also handles broader restyling, such as converting live footage into a different visual style, so you can revise content without regenerating it from scratch.

Generation is asynchronous: each request returns a task ID that you poll until the clip is ready, which fits queues and high-volume pipelines. Rate limits and concurrency vary by account tier, so add exponential backoff and a retry on a 429 response, and contact support to raise limits as you scale. The Enterprise plan offers higher ceilings and custom limits.
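
The polling pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Atlas Cloud client: the caller supplies `fetch_status`, which is assumed to return a dict with a `status` field and to raise `RateLimited` on an HTTP 429. The status values `"succeeded"` and `"failed"` are hypothetical placeholders for whatever terminal states the API actually reports.

```python
import random
import time
from typing import Callable


class RateLimited(Exception):
    """Raised by the caller-supplied fetch function on an HTTP 429 response."""


def poll_task(
    fetch_status: Callable[[str], dict],
    task_id: str,
    max_attempts: int = 8,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll a task until it reaches a terminal state, backing off exponentially."""
    delay = base_delay
    for _ in range(max_attempts):
        try:
            result = fetch_status(task_id)
        except RateLimited:
            result = None  # treat a 429 like "not ready yet" and back off
        # "succeeded" and "failed" are assumed terminal states.
        if result is not None and result.get("status") in ("succeeded", "failed"):
            return result
        # Jitter spreads out retries from many concurrent workers.
        sleep(delay + random.uniform(0, delay / 2))
        delay = min(delay * 2, max_delay)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} polls")
```

Passing `sleep` as a parameter keeps the function testable without real waiting, and capping the delay at `max_delay` bounds the gap between polls on long renders.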

Uploads that contain real human faces are subject to platform content rules and identity protections, and may be restricted. For consistent characters, use Kling O3's subject reference workflow with original or licensed material rather than a real person's photo, and review Atlas Cloud's acceptable use terms before building face-based workflows.

Explore More Series

Seedance 2.0

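The Seedance 2.0 API gives you production-grade access to ByteDance's multimodal video model, with four-modality input (text, image, video, audio) and an industry-leading "Universal Reference" system that locks composition, camera movement, and character motion across shots. Integrate director-level control with a single API call at a flat rate of $0.09/second, with instant key access and no queue, backed by enterprise-grade uptime and compliance. Seedance 2.0 native 4K is now live!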

Grok Imagine

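The Grok Imagine API gives developers xAI's all-in-one suite for image, video, and audio generation. It generates images at up to 2K resolution with multilingual text rendering, and videos up to 15 seconds long with native synchronized audio and reference-image-based editing. On Atlas Cloud, one key runs every Grok Imagine mode, so you can switch between image, video, and audio without separate setup, starting at $0.02 per image and $0.05 per second.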

Gemini Omni Flash

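The Gemini Omni API brings the multimodal video generation and editing model that Google DeepMind announced at Google I/O 2026 into your stack. Gemini Omni fuses Gemini's reasoning engine with generative media, accepting any combination of text, image, video, and audio input and producing consistent, knowledge-grounded output. Refine results through natural conversation: swap objects, rewrite scenes, and switch styles while keeping physics, characters, and visual continuity intact. Atlas Cloud offers the full Gemini Omni Flash series through a single unified API (text-to-video, image-to-video with up to 7 reference images, and reference-to-video) with transparent per-second billing from $0.112 and no subscription. Start building today.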

GPT Image 2

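The GPT Image 2 API gives developers access to OpenAI's latest image model, the successor to GPT Image 1.5. The model generates and edits images, renders text accurately in Latin and CJK scripts, and offers strong typography for posters, mockups, and infographics. On Atlas Cloud, you can access it alongside 300+ models through one unified API, with free credits, 99.99% uptime, and no OpenAI organization verification required.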

Google

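Google's most powerful creative models are now fully available on Atlas Cloud. Veo 3.1 delivers cinema-grade video generation, Nano Banana 2 supports high-fidelity image creation, and Gemini brings multimodal intelligence to every workflow. Access the full Google model suite with a single API key, with Day-0 availability and pay-as-you-go pricing.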

Seedance 2.0 Mini

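Seedance 2.0 Mini brings ByteDance's multimodal video generation to workflows where speed and cost matter most. It delivers the core capabilities of Seedance 2.0 in a lighter footprint: faster generation, lower cost per video, and the same API integration you already use. For teams running high-throughput pipelines or prototyping at scale, Mini is the most practical default.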

ByteDance

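From cinematic video generation to high-fidelity image creation, ByteDance's most powerful models are now live on Atlas Cloud. Run Seedance and Seedream at scale with the lowest inference pricing and zero infrastructure overhead.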

Alibaba

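Atlas Cloud brings Alibaba's full model lineup into one API: Qwen for language and image tasks, and Wan for video generation up to 1080p. All models are pay-as-you-go with no subscription. You can use your existing OpenAI-compatible client to access the Alibaba API through a single base URL.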

OpenAI

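Atlas Cloud gives you access to the full OpenAI API lineup, from GPT Image 2 for image generation to Sora 2 for video. Every model is pay-as-you-go with no monthly spending limit. Use the OpenAI-compatible API and connect by simply swapping the base URL.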

xAI

Build complete image and video pipelines with the xAI API on Atlas Cloud. Generate at 2K resolution, edit with reference images, and animate images into audio-synchronized video clips.

Kwaivgi

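The Kwaivgi API is priced 15% below standard pricing. Atlas Cloud provides Day-0 access to the latest Kling releases, with pay-as-you-go pricing and no seat limits. One account and one key cover every Kling model from Standard to Master.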

Seedream 5.0 Pro

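The Seedream 5.0 Pro API gives developers on Atlas Cloud access to ByteDance's controllable image editing model. It targets edits precisely using anchors and coordinates, separates images into editable layers, blends multiple references, and matches color and material accurately, with multilingual text support at 2K and 3K resolution. On Atlas Cloud, a single key is all you need for access!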

One API for every AI modality.
