bytedance/avatar-omni-human-v1.5

Audio to Video

Avatar Omni Human 1.5 API by ByteDance


Open and Advanced Large-Scale Video Generative Models.


Each request costs $0.06. With $10 you can run this model about 166 times.


Code example
import os
import time

import requests

# Read the key from the environment; Python does not expand "$VAR" inside strings
API_KEY = os.environ["ATLASCLOUD_API_KEY"]
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}

# Step 1: Start video generation
generate_url = "https://api.atlascloud.ai/api/v1/model/generateVideo"
data = {
    "model": "bytedance/avatar-omni-human-v1.5",
    # Placeholder URLs: image_url and audio_url are this model's required inputs
    "image_url": "https://example.com/portrait.png",
    "audio_url": "https://example.com/speech.mp3",
}

generate_response = requests.post(generate_url, headers=headers, json=data)
generate_response.raise_for_status()  # stop early on an HTTP error
generate_result = generate_response.json()
prediction_id = generate_result["data"]["id"]

# Step 2: Poll for result
poll_url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"

def check_status():
    while True:
        response = requests.get(poll_url, headers=headers)
        result = response.json()
        status = result["data"]["status"]

        if status in ["completed", "succeeded"]:
            print("Generated video:", result["data"]["outputs"][0])
            return result["data"]["outputs"][0]
        elif status == "failed":
            # .get() avoids a KeyError when the error field is absent
            raise Exception(result["data"].get("error") or "Generation failed")
        else:
            # Still processing, wait 2 seconds
            time.sleep(2)

video_url = check_status()

Installation

Install the package required for your language.

pip install requests

Authentication

Every API request must be authenticated with an API key. You can get an API key from the Atlas Cloud dashboard.

export ATLASCLOUD_API_KEY="your-api-key-here"

HTTP headers

import os

API_KEY = os.environ.get("ATLASCLOUD_API_KEY")
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}

Keep your API key secure

Do not expose your API key in client-side code or public repositories. Use environment variables or a backend proxy instead.
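The advice above can be sketched as a small helper that reads the key from the environment and fails before any request is sent when the key is missing. The function name `build_headers` is hypothetical; only the `ATLASCLOUD_API_KEY` variable name and the header layout come from this page.

```python
import os


def build_headers(env=None):
    """Build request headers from the ATLASCLOUD_API_KEY environment variable.

    `env` defaults to os.environ; passing a plain dict makes the helper easy
    to test. The helper name is illustrative, not part of any Atlas Cloud SDK.
    """
    env = os.environ if env is None else env
    api_key = env.get("ATLASCLOUD_API_KEY", "").strip()
    if not api_key:
        # Fail here rather than sending an unauthenticated request.
        raise RuntimeError("ATLASCLOUD_API_KEY is not set")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```

Calling `build_headers()` with no argument reads the real environment, so the key never has to appear in source code.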

Submit a request

import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateVideo"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; "$VAR" is not expanded inside Python strings
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}
data = {
    "model": "your-model",
    "prompt": "A beautiful landscape"
}

response = requests.post(url, headers=headers, json=data)
print(response.json())

Submit a request

Submits an asynchronous generation request. The API returns a prediction ID that you can use to check the status and retrieve the result.

POST /api/v1/model/generateVideo

Request body

import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateVideo"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; "$VAR" is not expanded inside Python strings
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}

data = {
    "model": "bytedance/avatar-omni-human-v1.5",
    # Placeholder URLs: image_url and audio_url are this model's required inputs
    "image_url": "https://example.com/portrait.png",
    "audio_url": "https://example.com/speech.mp3"
}

response = requests.post(url, headers=headers, json=data)
result = response.json()

print(f"Prediction ID: {result['data']['id']}")
print(f"Status: {result['data']['status']}")

Response

{
  "code": 200,
  "data": {
    "id": "pred_abc123",
    "status": "processing",
    "model": "model-name",
    "created_at": "2025-01-01T00:00:00Z"
  }
}

Check status

Poll the prediction endpoint to check the current status of a request.

GET /api/v1/model/prediction/{prediction_id}

Polling example

import os
import time

import requests

prediction_id = "pred_abc123"
url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"
# Read the key from the environment; "$VAR" is not expanded inside Python strings
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}

while True:
    response = requests.get(url, headers=headers)
    result = response.json()
    status = result["data"]["status"]
    print(f"Status: {status}")

    if status in ["completed", "succeeded"]:
        output_url = result["data"]["outputs"][0]
        print(f"Output URL: {output_url}")
        break
    elif status == "failed":
        print(f"Error: {result['data'].get('error', 'Unknown')}")
        break

    # Still processing, wait 3 seconds before the next check
    time.sleep(3)

Status values

processing: The request is still being processed.

completed: Generation has completed. The output is available.

succeeded: Generation succeeded. The output is available.

failed: Generation failed. Check the error field.
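The four status values can be collapsed into three outcomes that a client usually cares about. The sketch below is one way to do that; `classify_status` is a hypothetical helper, and treating any unlisted status as still pending is an assumption rather than documented behavior.

```python
# Status strings listed on this page.
SUCCESS_STATUSES = {"completed", "succeeded"}
FAILURE_STATUSES = {"failed"}


def classify_status(status):
    """Return "success", "failure", or "pending" for a prediction status."""
    if status in SUCCESS_STATUSES:
        return "success"
    if status in FAILURE_STATUSES:
        return "failure"
    # "processing" and any status not listed on this page are treated as
    # still pending (an assumption; only four values are documented).
    return "pending"
```

Checking both `completed` and `succeeded` in one place keeps polling loops from missing either success value.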

Completed response

{
  "data": {
    "id": "pred_abc123",
    "status": "completed",
    "outputs": [
      "https://storage.atlascloud.ai/outputs/result.mp4"
    ],
    "metrics": {
      "predict_time": 45.2
    },
    "created_at": "2025-01-01T00:00:00Z",
    "completed_at": "2025-01-01T00:00:10Z"
  }
}
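As a minimal sketch, the completed response can be reduced to the two values most callers need: the first output URL and the wall-clock time between `created_at` and `completed_at`. The helper names are hypothetical, and the code assumes both timestamps are present, as in the sample above.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp such as "2025-01-01T00:00:00Z"."""
    # datetime.fromisoformat() accepts a trailing "Z" only on Python 3.11+,
    # so it is rewritten as an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def summarize_completion(response):
    """Return (first output URL, elapsed seconds) from a completed response."""
    data = response["data"]
    elapsed = parse_timestamp(data["completed_at"]) - parse_timestamp(data["created_at"])
    return data["outputs"][0], elapsed.total_seconds()
```

Note that this elapsed time is derived from the two timestamps and is a separate value from `metrics.predict_time`.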

File upload

Upload a file to Atlas Cloud storage and receive a URL that you can use in API requests. Uploads use multipart/form-data.

POST /api/v1/model/uploadMedia

Upload example

import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/uploadMedia"
# Read the key from the environment; "$VAR" is not expanded inside Python strings
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}

# requests builds the multipart/form-data body from the files argument
with open("image.png", "rb") as f:
    files = {"file": ("image.png", f, "image/png")}
    response = requests.post(url, headers=headers, files=files)

result = response.json()
download_url = result["data"]["download_url"]
print(f"File URL: {download_url}")

Response

{
  "data": {
    "download_url": "https://storage.atlascloud.ai/uploads/abc123/image.png",
    "file_name": "image.png",
    "content_type": "image/png",
    "size": 1024000
  }
}
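The upload step can be wrapped in a reusable function that guesses the content type from the file name and returns the `download_url`. This is a sketch under assumptions: `upload_media` is a hypothetical helper, `post` stands in for `requests.post` so the function can be exercised without network access, and this page does not state whether the endpoint accepts audio files as well as images.

```python
import mimetypes
import os

UPLOAD_URL = "https://api.atlascloud.ai/api/v1/model/uploadMedia"


def upload_media(path, headers, post):
    """Upload a local file and return the download_url from the response.

    `post` is any callable with the same call shape as requests.post
    (url, headers=..., files=...) that returns an object with .json().
    """
    # Fall back to a generic type when the extension is unknown.
    content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    with open(path, "rb") as f:
        files = {"file": (os.path.basename(path), f, content_type)}
        response = post(UPLOAD_URL, headers=headers, files=files)
    return response.json()["data"]["download_url"]
```

With the `requests` package installed, a call such as `upload_media("portrait.png", headers, requests.post)` would return a URL that could then be passed as `image_url` in a generation request.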

Input Schema

The following parameters can be used in the request body.

Total: 0 · Required: 0 · Optional: 0

No parameters available.

Request body example

{
  "model": "bytedance/avatar-omni-human-v1.5"
}

Output Schema

The API returns a prediction response that contains the generated output URLs.

id (string, required)

Unique identifier for the prediction.

status (string, required)

Current status of the prediction.

Allowed values: processing, completed, succeeded, failed

model (string, required)

The model used for generation.

outputs (array[string])

Array of output URLs. Available when status is "completed".

error (string)

Error message if status is "failed".

metrics (object)

Performance metrics.

metrics.predict_time (number)

Time taken for video generation in seconds.

created_at (string, required)

ISO 8601 timestamp when the prediction was created.

Format: date-time

completed_at (string)

ISO 8601 timestamp when the prediction was completed.

Format: date-time

Response example

{
  "id": "pred_abc123",
  "status": "completed",
  "model": "model-name",
  "outputs": [
    "https://storage.atlascloud.ai/outputs/result.mp4"
  ],
  "metrics": {
    "predict_time": 45.2
  },
  "created_at": "2025-01-01T00:00:00Z",
  "completed_at": "2025-01-01T00:00:10Z"
}
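The output schema maps naturally onto a small typed record. The sketch below is illustrative only: the `Prediction` class is hypothetical, and because the examples on this page show the fields both at the top level and nested under `data`, the parser accepts either shape.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Prediction:
    """Typed view of a prediction response (hypothetical helper class)."""

    id: str
    status: str
    model: Optional[str] = None
    outputs: List[str] = field(default_factory=list)
    error: Optional[str] = None
    predict_time: Optional[float] = None

    @classmethod
    def from_response(cls, payload):
        # Accept both the flat shape and the {"data": {...}} envelope.
        data = payload.get("data", payload)
        metrics = data.get("metrics") or {}
        return cls(
            id=data["id"],
            status=data["status"],
            model=data.get("model"),
            outputs=list(data.get("outputs") or []),
            error=data.get("error"),
            predict_time=metrics.get("predict_time"),
        )
```

Optional fields default to `None` or an empty list, so a response for a prediction that is still processing parses without errors.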

Atlas Cloud Skills

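Atlas Cloud Skills integrates more than 300 AI models directly into your AI coding assistant. Install it with a single command, then generate images and videos or chat with LLMs using natural language.

Supported clients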

Claude Code

OpenAI Codex

Gemini CLI

Cursor

Windsurf

VS Code

Trae

GitHub Copilot

Cline

Roo Code

Amp

Goose

Replit

40+ supported clients

Installation

npx skills add AtlasCloudAI/atlas-cloud-skills

API key setup

Get an API key from the Atlas Cloud dashboard and set it as an environment variable.

export ATLASCLOUD_API_KEY="your-api-key-here"

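Features

After installation, you can access every Atlas Cloud model from your AI assistant using natural language.

Image generation: Generate images with models such as Nano Banana 2 and Z-Image.

Video creation: Create videos from text or images with Kling, Vidu, Veo, and others.

LLM chat: Chat with large language models such as Qwen and DeepSeek.

Media upload: Upload local files for image-editing and image-to-video workflows.

Learn more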

github.com/AtlasCloudAI/atlas-cloud-skills

MCP Server

Atlas Cloud MCP Server connects your IDE to more than 300 AI models through the Model Context Protocol. It works with any MCP-compatible client.

Supported clients

Cursor

VS Code

Windsurf

Claude Code

OpenAI Codex

Gemini CLI

Cline

Roo Code

100+ supported clients

Installation

npx -y atlascloud-mcp

Configuration

Add the following configuration to your IDE's MCP settings file.

{
  "mcpServers": {
    "atlascloud": {
      "command": "npx",
      "args": [
        "-y",
        "atlascloud-mcp"
      ],
      "env": {
        "ATLASCLOUD_API_KEY": "your-api-key-here"
      }
    }
  }
}
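When a settings file already lists other MCP servers, the entry above has to be merged in rather than pasted over the file. The following sketch shows one way to do that on an in-memory dictionary; `add_atlascloud_server` is a hypothetical helper, and reading or writing the actual settings file is left out because its location depends on the client.

```python
import copy
import json


def add_atlascloud_server(config, api_key):
    """Return a copy of an MCP config dict with the "atlascloud" entry added."""
    merged = copy.deepcopy(config)  # leave the caller's dict untouched
    servers = merged.setdefault("mcpServers", {})
    servers["atlascloud"] = {
        "command": "npx",
        "args": ["-y", "atlascloud-mcp"],
        "env": {"ATLASCLOUD_API_KEY": api_key},
    }
    return merged


# "other-server" is a made-up entry standing in for an existing configuration.
existing = {"mcpServers": {"other-server": {"command": "other"}}}
print(json.dumps(add_atlascloud_server(existing, "your-api-key-here"), indent=2))
```

Working on a deep copy means a failed write later on cannot leave the original configuration half-modified.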

Available tools

atlas_generate_image: Generates an image from a text prompt.

atlas_generate_video: Creates a video from text or an image.

atlas_chat: Chats with a large language model.

atlas_list_models: Browses the 300+ available AI models.

atlas_quick_generate: Selects the best model automatically and generates content in one step.

atlas_upload_media: Uploads local files for API workflows.

Learn more

github.com/AtlasCloudAI/mcp-server


Avatar Omni Human 1.5

Turn a single portrait photo into a lifelike, lip-synced talking video — driven entirely by an audio clip.

Avatar Omni Human 1.5 (OmniHuman) by ByteDance is a state-of-the-art digital-human video generation model. Give it one reference image of a person and an audio track, and it generates a natural, expressive video of that person speaking — with accurate lip sync, head motion, and facial expressions that match the audio.

Key Capabilities

Audio-driven lip sync — mouth movements precisely follow the speech in your audio.
Identity preservation — the generated person stays faithful to your reference image.
Natural motion — lifelike head pose, blinking, and micro-expressions, not a stiff talking head.
Multilingual — works with audio in many languages, including Chinese, English, Japanese, Korean, Spanish, and Indonesian.
High resolution — render output at up to 1080p.

How It Works

Provide a reference image (a clear photo of a person) and a driving audio clip (the speech to be spoken).
The model animates the person to "speak" the audio, producing a full talking-head video.
The output video's duration matches the length of your audio.

Generation is asynchronous: you submit a request and receive a prediction ID, then poll for the result. A typical clip takes a few minutes depending on audio length and resolution.
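Because a clip can take minutes, a polling loop benefits from an overall deadline and a growing delay between checks. The sketch below illustrates that pattern; `poll_until_done` is a hypothetical helper, the default timeout and delays are assumptions rather than documented limits, and `fetch_status` stands in for whatever function performs the GET request and returns the parsed JSON body.

```python
import time


def poll_until_done(fetch_status, timeout_s=600.0, first_delay_s=3.0,
                    max_delay_s=30.0, sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until the prediction finishes or time runs out.

    Returns the first output URL on success, raises RuntimeError on failure
    and TimeoutError when the deadline would be exceeded.
    """
    deadline = clock() + timeout_s
    delay = first_delay_s
    while True:
        data = fetch_status()["data"]
        status = data["status"]
        if status in ("completed", "succeeded"):
            return data["outputs"][0]
        if status == "failed":
            raise RuntimeError(data.get("error") or "Generation failed")
        if clock() + delay > deadline:
            raise TimeoutError(f"prediction still {status!r} after {timeout_s} s")
        sleep(delay)
        # Back off gradually so long clips are not polled every few seconds.
        delay = min(delay * 1.5, max_delay_s)
```

With the `requests` package, `fetch_status` could be `lambda: requests.get(url, headers=headers).json()`; the `sleep` and `clock` parameters exist only so the loop can be tested without real waiting.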

Inputs

Parameter	Required	Description
`image_url`	Yes	URL of the reference portrait. A clear, front-facing photo with a fully visible face works best.
`audio_url`	Yes	URL of the driving audio (speech). The generated video's length equals this audio's length.
`prompt`	No	Optional text hint for action / scene / expression. Supports Chinese, English, Japanese, Korean, Spanish, and Indonesian.
`output_resolution`	No	`720` (default) or `1080`.
`seed`	No	Fix the seed for reproducible output; `-1` for random.
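The inputs in the table above can be assembled and validated before a request is sent. This is a sketch under assumptions: `build_payload` is a hypothetical helper, the parameters are placed at the top level of the body next to `model` (as in the request examples on this page), and `output_resolution` is sent as an integer, which the table does not confirm.

```python
ALLOWED_RESOLUTIONS = (720, 1080)


def build_payload(image_url, audio_url, prompt=None, output_resolution=720, seed=-1):
    """Assemble a request body from the inputs listed in the table above."""
    if not image_url or not audio_url:
        raise ValueError("image_url and audio_url are both required")
    if output_resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError("output_resolution must be 720 or 1080")
    payload = {
        "model": "bytedance/avatar-omni-human-v1.5",
        "image_url": image_url,
        "audio_url": audio_url,
        "output_resolution": output_resolution,  # integer form is an assumption
        "seed": seed,  # -1 requests a random seed
    }
    if prompt:
        # The prompt is optional, so it is only sent when provided.
        payload["prompt"] = prompt
    return payload
```

Rejecting an unsupported resolution locally avoids submitting a request that the API would presumably refuse.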

Best Practices

Reference image: use a high-quality, well-lit, front-facing photo with one clearly visible face. Avoid heavy occlusion (hands or objects over the face), extreme angles, or faces that are too small in frame.
Audio: clean speech with minimal background noise yields the most accurate lip sync.
Single subject: for best results the reference image should contain a single, clear primary person.

Typical Use Cases

Virtual presenters and AI news anchors
Talking-avatar marketing and product explainers
Localized / dubbed spokesperson videos
Education and training content
Social-media avatars and digital influencers

Notes & Limitations

Output video length is bounded by the input audio length.
Quality depends heavily on input quality — blurry images or noisy audio reduce lip-sync accuracy.
Best suited to a single primary speaker; complex multi-person scenes are not the target use case.

Pricing

Billed at $0.12 per second of generated video (output duration = audio duration).
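Since the output duration equals the audio duration, the per-second rate above gives a simple cost estimate. The sketch below works through that arithmetic; `estimate_cost` is a hypothetical helper, and rounding to whole cents is an assumption, because this page does not say how fractional seconds are billed.

```python
from decimal import Decimal

PRICE_PER_SECOND = Decimal("0.12")  # USD per second, from the pricing line above


def estimate_cost(audio_seconds):
    """Estimated charge in USD for a clip whose audio lasts audio_seconds."""
    seconds = Decimal(str(audio_seconds))
    # Rounded to whole cents for display (billing granularity is not documented).
    return (PRICE_PER_SECOND * seconds).quantize(Decimal("0.01"))


print(estimate_cost(30))    # 30 s of audio at $0.12/s
print(estimate_cost(12.5))  # fractional durations are accepted as well
```

Using `Decimal` keeps the multiplication exact, so a 30-second clip comes out as 3.60 rather than a binary floating-point approximation.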

Powered by ByteDance OmniHuman 1.5, served through Atlas Cloud.
