kwaivgi/kling-v2.6-pro/avatar

image-to-video

Kling V2 AI Avatar Pro generates high-quality AI avatar videos with clean detail, stable motion, and strong identity consistency—ideal for profiles, intros, and social content.

Kling-v2-ai-avatar-pro — Talking Avatar from Image + Audio

kling-v2-ai-avatar-pro turns a single portrait into a lip-synced talking-head video driven by your own audio. Upload a clear face image, provide a narration or dialogue track, and the model generates a vertical HD avatar clip that speaks and moves naturally on camera.

🌟 Highlights

Audio-driven performance – Uses your uploaded audio as-is (no TTS), keeping timing, pauses and emotion.
Photo-real talking avatar – Animates the face, eyes and head while preserving the identity from the reference image.
One-shot setup – Just an image + audio; no need for video capture or motion recording.
Portrait-ready output – Produces social-ready vertical video that fits Reels, TikTok, Shorts and story formats.
Prompt-guided styling (optional) – Use prompt to hint at camera feel or mood (e.g. “soft studio lighting, subtle head movement, gentle smile”).

🔧 Parameters

audio* – Required. The voice track that drives lip-sync and timing (URL or upload).
image* – Required. A clear, front-facing portrait of the person to animate.
prompt – Optional text describing style, expression or camera feel. If omitted, the model uses a neutral talking-head style.

Tip: Use a well-lit, unobstructed face (no heavy motion blur, minimal occlusion) for best identity preservation.
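
As a rough illustration of the parameter list above, the sketch below assembles the three inputs into a request body. The field names `audio`, `image` and `prompt` come from the parameter list; the helper name `build_avatar_payload`, the flat dictionary shape and the example URLs are assumptions for illustration, not the documented API schema.

```python
from typing import Optional


def build_avatar_payload(audio: str, image: str, prompt: Optional[str] = None) -> dict:
    """Assemble the inputs from the parameter list into one dictionary.

    Hypothetical helper: the flat {"audio", "image", "prompt"} shape is an
    assumption, not a confirmed request schema.
    """
    # audio and image are marked as required in the parameter list.
    if not audio or not audio.strip():
        raise ValueError("audio is required")
    if not image or not image.strip():
        raise ValueError("image is required")

    payload = {"audio": audio.strip(), "image": image.strip()}

    # prompt is optional; leaving it out lets the model fall back to its
    # neutral talking-head style.
    if prompt and prompt.strip():
        payload["prompt"] = prompt.strip()
    return payload


# Placeholder URLs, for illustration only.
example = build_avatar_payload(
    audio="https://example.com/narration.wav",
    image="https://example.com/portrait.png",
    prompt="soft studio lighting, subtle head movement, gentle smile",
)
```

Omitting `prompt` entirely, rather than sending an empty string, keeps the request aligned with the "if omitted" behaviour described above.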

🚀 How to Use

1. Upload audio

Clean mono/stereo track with minimal background noise. Make sure the final edited length matches what you want in the video.

2. Upload image

Front or 3/4 view, eyes visible, face not cropped. The avatar’s identity and pose come from this image.

3. (Optional) Add a prompt

Guide expression or style, e.g.:

“confident presenter in a tech promo, subtle head nods”
“friendly customer service tone, warm expression”

4. Run the model

The video length is automatically derived from the audio duration. Download the generated talking-head clip and drop it into your editor or post it directly to social platforms.
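
Because the video length follows the audio duration, it can help to check the length of the edited track before uploading. The sketch below reads the duration of a WAV file using Python's standard `wave` module. It assumes an uncompressed PCM WAV file; the function name `wav_duration_seconds` is hypothetical, and other formats such as MP3 would need a different reader.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds.

    Assumption: the audio is an uncompressed WAV file; the standard-library
    wave module does not read MP3, AAC or other compressed formats.
    """
    with wave.open(path, "rb") as wav_file:
        frame_count = wav_file.getnframes()
        frame_rate = wav_file.getframerate()
    # Duration is the number of sample frames divided by frames per second.
    return frame_count / float(frame_rate)
```

Calling `wav_duration_seconds("narration.wav")` on your edited track gives the length the generated clip should have, since the model derives video length from the audio.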

💰 Pricing

Billing is based on audio duration, with a minimum of 5 seconds.

Audio length (s)	Billed seconds	Price (USD)
0–5	5	0.56
10	10	1.12
20	20	2.24
30	30	3.36
60	60	6.72

Any clip shorter than 5 seconds is still billed as 5 seconds.
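
The billing rule can be sketched as a small estimator. The per-second rate of 0.112 USD is inferred from the table (0.56 USD for 5 billed seconds) rather than stated directly, and rounding fractional seconds up to the next whole second is an assumption; the table only lists whole-second examples. Adjust `RATE_PER_SECOND` if the price shown for your account differs.

```python
import math
from decimal import Decimal

# Inferred from the pricing table: 0.56 USD / 5 billed seconds.
RATE_PER_SECOND = Decimal("0.112")
# Stated minimum: clips shorter than 5 seconds are billed as 5 seconds.
MIN_BILLED_SECONDS = 5


def billed_seconds(audio_seconds: float) -> int:
    """Apply the 5-second minimum to an audio duration.

    Assumption: fractional seconds are rounded up to the next whole second.
    """
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return max(MIN_BILLED_SECONDS, math.ceil(audio_seconds))


def estimate_price_usd(audio_seconds: float) -> Decimal:
    """Estimate the charge in USD for a clip of the given audio length."""
    return RATE_PER_SECOND * billed_seconds(audio_seconds)
```

Using `Decimal` avoids binary floating-point drift, so the estimates reproduce the table rows exactly.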

🧠 Tips for Best Results

Edit your audio first – Remove mistakes, long silences and background noise before upload.
Match tone to use case – Calm, even delivery for corporate avatars; more expressive reads for ads or UGC.
Keep framing consistent – Use images with similar head size and framing across a campaign for a unified look.
Test a few portraits – Small changes in the reference image (lighting, angle) can noticeably change the avatar's feel.

Detailed Specifications

Overview:

Model Provider: KWAIVGI

Model Type: image-to-video

Deployment: Inference API; Playground

Pricing: $0.095

Key Parameters:

Size Limit: Up to Width × Height (user-configurable)

LoRA Support: No

Seed Options: N/A
