bytedance/seed-audio-1.0

Text to speech

Seed Audio 1.0 API by ByteDance


Seed-audio-1.0

Doubao‑Audio‑Generate‑1.0 is Doubao Voice’s next‑generation audio‑generation engine. The industry‑first commercial tool creates film‑grade audio with just one prompt. It eliminates cumbersome audio‑engineering work. Creators generate publish‑ready radio dramas, podcasts and branded audio easily, shifting from a simple voice‑generator to an AI audio director. It serves audiobooks, serialized episodes and commercial audio for high‑quality narrative‑driven production.


Code example
import os
import time

import requests

# Read the API key from the environment; "$ATLASCLOUD_API_KEY" is not expanded inside a Python string
API_KEY = os.environ["ATLASCLOUD_API_KEY"]

# Step 1: Start audio generation
generate_url = "https://api.atlascloud.ai/api/v1/model/generateAudio"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}
data = {
    "model": "bytedance/seed-audio-1.0",  # Required. Model name. options: bytedance/seed-audio-1.0
    "text": "A phone vibrates first, then a calm male voice says: Welcome to Seed Audio.",  # Required. Prompt or text to synthesize into audio
    "references": [
        # Each item must contain exactly one of: speaker | audio_url | audio_data | image_url | image_data
        {"speaker": "zh_male_m191_uranus_bigtts"}
    ],  # Optional reference resources. Omit for text-only generation
    "format": "mp3",  # Output audio format. options: mp3 | wav | pcm | ogg_opus
    "sample_rate": 24000,  # Output sample rate. options: 8000 | 16000 | 24000 | 32000 | 44100 | 48000
    "pitch_rate": 0,  # Pitch adjustment. (min: -12, max: 12)
    "speech_rate": 0,  # Speech speed adjustment. (min: -50, max: 100)
    "loudness_rate": 0,  # Loudness adjustment. (min: -50, max: 100)
}

generate_response = requests.post(generate_url, headers=headers, json=data)
generate_response.raise_for_status()  # Stop early on HTTP errors such as 401
generate_result = generate_response.json()
prediction_id = generate_result["data"]["id"]

# Step 2: Poll for result
poll_url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"

def check_status():
    while True:
        response = requests.get(poll_url, headers={"Authorization": f"Bearer {API_KEY}"})
        result = response.json()

        if result["data"]["status"] in ["completed", "succeeded"]:
            print("Generated audio:", result["data"]["outputs"][0])
            return result["data"]["outputs"][0]
        elif result["data"]["status"] == "failed":
            # .get avoids a KeyError when the error field is absent
            raise Exception(result["data"].get("error") or "Generation failed")
        else:
            # Still processing, wait 2 seconds
            time.sleep(2)

audio_url = check_status()

Install

Install the required package.

pip install requests

Authentication

All API requests must be authenticated with an API key. You can get an API key from the Atlas Cloud dashboard.

export ATLASCLOUD_API_KEY="your-api-key-here"

HTTP Headers

import os

API_KEY = os.environ.get("ATLASCLOUD_API_KEY")
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}
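
A minimal sketch of loading the key at startup and failing early when it is missing. The helper names `load_api_key` and `auth_headers` and the error message are illustrative and not part of the Atlas Cloud API; only the `ATLASCLOUD_API_KEY` variable name and the two headers come from this page.

```python
import os


def load_api_key(env=None):
    """Return the Atlas Cloud API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get("ATLASCLOUD_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ATLASCLOUD_API_KEY is not set; export it before running.")
    return key


def auth_headers(api_key):
    """Build the same two headers used in the examples on this page."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```

Calling `auth_headers(load_api_key())` once at startup surfaces a missing key before any request is sent.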

Keep your API key secure

Do not expose your API key in client-side code or public repositories. Use environment variables or a server-side proxy instead.

Send a request

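import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateAudio"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; "$ATLASCLOUD_API_KEY" is not expanded inside a Python string
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}
data = {
    "model": "bytedance/seed-audio-1.0",
    "text": "Hello, welcome to AtlasCloud text-to-speech."
}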

response = requests.post(url, headers=headers, json=data)
print(response.json())

Submit a request

Submit an asynchronous generation request. The API returns a prediction ID that you can use to check the status and retrieve the result.

POST /api/v1/model/generateAudio

Request body

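import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateAudio"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; "$ATLASCLOUD_API_KEY" is not expanded inside a Python string
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}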

data = {
    "model": "bytedance/seed-audio-1.0",
    "text": "Hello, welcome to AtlasCloud text-to-speech."
}

response = requests.post(url, headers=headers, json=data)
result = response.json()

print(f"Prediction ID: {result['data']['id']}")
print(f"Status: {result['data']['status']}")

Response

{
  "code": 200,
  "data": {
    "id": "pred_abc123",
    "status": "processing",
    "model": "model-name",
    "created_at": "2025-01-01T00:00:00Z"
  }
}
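
A short sketch of reading the prediction ID out of a response shaped like the sample above. The function name `extract_prediction` is illustrative, and treating a missing `data.id` as an error is an assumption; the page does not document the error response format.

```python
def extract_prediction(payload):
    """Return (prediction_id, status) from a submit response shaped like the sample."""
    data = payload.get("data") or {}
    prediction_id = data.get("id")
    if not prediction_id:
        # Assumption: a body without data.id means the request was not accepted.
        raise ValueError(f"No prediction ID in response: {payload!r}")
    return prediction_id, data.get("status")


# The sample response from this section
sample = {
    "code": 200,
    "data": {
        "id": "pred_abc123",
        "status": "processing",
        "model": "model-name",
        "created_at": "2025-01-01T00:00:00Z",
    },
}
prediction_id, status = extract_prediction(sample)
```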

Check status

Check the current status of a request by polling the prediction endpoint.

GET /api/v1/model/prediction/{prediction_id}

Polling example

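import os
import time

import requests

prediction_id = "pred_abc123"
url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"
# Read the key from the environment; "$ATLASCLOUD_API_KEY" is not expanded inside a Python string
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}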

while True:
    response = requests.get(url, headers=headers)
    result = response.json()
    status = result["data"]["status"]
    print(f"Status: {status}")

    if status in ["completed", "succeeded"]:
        output_url = result["data"]["outputs"][0]
        print(f"Output URL: {output_url}")
        break
    elif status == "failed":
        print(f"Error: {result['data'].get('error', 'Unknown')}")
        break

    time.sleep(3)
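
The loop above never exits if the prediction stays in a non-terminal state. The following is one possible sketch of the same polling logic with a timeout. The function name `wait_for_prediction`, the default interval, and the default timeout are assumptions chosen for illustration. The HTTP call is passed in as `fetch` so the logic can be exercised without network access.

```python
import time

DONE = {"completed", "succeeded"}  # success statuses used by the polling example
FAILED = {"failed"}


def wait_for_prediction(fetch, interval=2.0, timeout=300.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until the prediction reaches a terminal status or time runs out.

    fetch must return the parsed JSON body of the prediction endpoint.
    """
    deadline = clock() + timeout
    while True:
        data = fetch().get("data") or {}
        status = data.get("status")
        if status in DONE:
            return data.get("outputs") or []
        if status in FAILED:
            raise RuntimeError(data.get("error") or "Generation failed")
        if clock() >= deadline:
            raise TimeoutError(f"Prediction still {status!r} after {timeout} seconds")
        sleep(interval)
```

With `requests`, `fetch` could be `lambda: requests.get(url, headers=headers, timeout=30).json()`, reusing `url` and `headers` from the example above.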

Status values

processing: The request is still being processed.

completed: Generation has completed. The result is ready.

succeeded: Generation succeeded. The result is ready.

failed: Generation failed. Check the error field.

Completed response

{
  "data": {
    "id": "pred_abc123",
    "status": "completed",
    "outputs": [
      "https://storage.atlascloud.ai/outputs/result.mp3"
    ],
    "metrics": {
      "predict_time": 8.3
    },
    "created_at": "2025-01-01T00:00:00Z",
    "completed_at": "2025-01-01T00:00:10Z"
  }
}
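
A small sketch of working with the completed response: it derives a local file name from the first output URL and computes the wall-clock time between `created_at` and `completed_at`. The function names are illustrative. The wall-clock figure is simply the difference of the two timestamps and need not equal `metrics.predict_time`.

```python
from datetime import datetime
from pathlib import PurePosixPath
from urllib.parse import urlparse


def parse_timestamp(value):
    """Parse a timestamp such as 2025-01-01T00:00:00Z into an aware datetime."""
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 onwards.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize(data):
    """Return (output file name, wall-clock seconds) for a completed prediction."""
    first_output = data["outputs"][0]
    file_name = PurePosixPath(urlparse(first_output).path).name
    elapsed = parse_timestamp(data["completed_at"]) - parse_timestamp(data["created_at"])
    return file_name, elapsed.total_seconds()


# The "data" object from the sample response above
completed = {
    "id": "pred_abc123",
    "status": "completed",
    "outputs": ["https://storage.atlascloud.ai/outputs/result.mp3"],
    "metrics": {"predict_time": 8.3},
    "created_at": "2025-01-01T00:00:00Z",
    "completed_at": "2025-01-01T00:00:10Z",
}
file_name, seconds = summarize(completed)
```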

Input Schema

The following parameters are accepted in the request body.

Total: 8 | Required: 2 | Optional: 6

model (string, required)

Model name.

Default: "bytedance/seed-audio-1.0"

Options: bytedance/seed-audio-1.0

text (string, required)

Prompt or text to synthesize into audio. Max 2048 characters. For reference audio generation, mention references using the upstream placeholder tokens such as `@audio1`, `@audio2`, and `@audio3`.

Default: "A phone vibrates first, then a calm male voice says: Welcome to Seed Audio."
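
A client-side check for the `text` field might look like the sketch below. It makes two assumptions the page does not confirm: that `@audioN` refers to the N-th audio reference counting from 1, and that the 2048-character limit matches Python's `len`, which counts Unicode code points. The function name `check_text` is illustrative.

```python
import re

MAX_TEXT_LENGTH = 2048  # "Max 2048 characters" from the description above
PLACEHOLDER = re.compile(r"@audio(\d+)")


def check_text(text, audio_reference_count=0):
    """Return a list of problems found in text; an empty list means it looks valid."""
    problems = []
    if not text:
        problems.append("text is required")
    if len(text) > MAX_TEXT_LENGTH:
        problems.append(f"text has {len(text)} characters; the limit is {MAX_TEXT_LENGTH}")
    for match in PLACEHOLDER.finditer(text):
        index = int(match.group(1))
        # Assumption: @audioN points at the N-th audio reference, counting from 1.
        if index < 1 or index > audio_reference_count:
            problems.append(f"{match.group(0)} has no matching audio reference")
    return problems
```

Running the check before submitting catches a prompt that mentions `@audio2` when only one audio reference is attached.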

references (array[object])

Optional reference resources. Omit for text-only generation. Provide up to 3 audio references or 1 image reference. Each reference item must contain exactly one of `speaker`, `audio_url`, `audio_data`, `image_url`, or `image_data`. Audio references can be cited in text using the upstream placeholder tokens such as `@audio1`, `@audio2`, and `@audio3`. Do not mix image references with audio references or speaker IDs in the same request.

Max items: 3

speaker (string)

Voice ID from Doubao TTS 2.0 or a cloned voice. Mutually exclusive with audio_url and audio_data.

Options: zh_female_vv_uranus_bigtts, zh_female_xiaohe_uranus_bigtts, zh_male_m191_uranus_bigtts, zh_male_taocheng_uranus_bigtts, zh_male_liufei_uranus_bigtts, zh_female_sophie_uranus_bigtts, zh_female_qingxinnvsheng_uranus_bigtts, zh_female_cancan_uranus_bigtts, zh_female_tianmeitaozi_uranus_bigtts, zh_male_ruyayichen_uranus_bigtts

audio_url (string)

Reference audio URL. Max 30s and 10MB per file. Supports wav/mp3/pcm/ogg_opus. Mutually exclusive with speaker and audio_data.

audio_data (string)

Reference audio as Base64. Max 30s and 10MB per file. Supports wav/mp3/pcm/ogg_opus. Mutually exclusive with speaker and audio_url.

image_url (string)

Reference image URL. Mutually exclusive with image_data and cannot be mixed with audio references.

image_data (string)

Reference image as Base64. Mutually exclusive with image_url and cannot be mixed with audio references.
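
The rules for `references` (exactly one source per item, up to 3 audio references or 1 image reference, no mixing) can be checked before a request is sent. The sketch below is one way to do that. The function name `check_references` is illustrative, and treating empty strings as "not set" is an assumption.

```python
AUDIO_KEYS = {"speaker", "audio_url", "audio_data"}
IMAGE_KEYS = {"image_url", "image_data"}


def check_references(references):
    """Return a list of rule violations for a references array; empty means it passes."""
    problems = []
    kinds = []
    for position, item in enumerate(references, start=1):
        # Assumption: empty strings count as "not set".
        present = [key for key in sorted(AUDIO_KEYS | IMAGE_KEYS) if item.get(key)]
        if len(present) != 1:
            problems.append(f"reference {position} must set exactly one source, found {present}")
            continue
        kinds.append("audio" if present[0] in AUDIO_KEYS else "image")
    audio_count, image_count = kinds.count("audio"), kinds.count("image")
    if audio_count and image_count:
        problems.append("image references cannot be mixed with audio or speaker references")
    if audio_count > 3:
        problems.append("at most 3 audio references are allowed")
    if image_count > 1:
        problems.append("at most 1 image reference is allowed")
    return problems
```

The check does not verify the 30-second or 10MB limits, which require inspecting the referenced files.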

format (string)

Output audio format.

Default: "mp3"

Options: mp3, wav, pcm, ogg_opus

sample_rate (integer)

Output sample rate in Hz.

Default: 24000

Options: 8000, 16000, 24000, 32000, 44100, 48000

pitch_rate (integer)

Pitch adjustment. Range [-12, 12], default 0.

Default: 0 | Min: -12 | Max: 12

speech_rate (integer)

Speech speed adjustment. Range [-50, 100], where 100 means 2.0x and -50 means 0.5x.

Default: 0 | Min: -50 | Max: 100

loudness_rate (integer)

Loudness adjustment. Range [-50, 100], where 100 means 2.0x loudness and -50 means 0.5x.

Default: 0 | Min: -50 | Max: 100

Example request body

{
  "model": "bytedance/seed-audio-1.0",
  "text": "A phone vibrates first, then a calm male voice says: Welcome to Seed Audio.",
  "format": "mp3",
  "sample_rate": 24000,
  "pitch_rate": 0,
  "speech_rate": 0,
  "loudness_rate": 0
}
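
The `speech_rate` and `loudness_rate` descriptions give two points on the scale: 100 means 2.0x and -50 means 0.5x. Both points fit the rule multiplier = 1 + rate / 100, which also places the default 0 at 1.0x. The sketch below assumes the mapping is linear between those points, which the schema does not state explicitly. The function names are illustrative.

```python
def rate_to_multiplier(rate):
    """Convert a speech_rate or loudness_rate value to its multiplier (0 maps to 1.0x)."""
    if not -50 <= rate <= 100:
        raise ValueError("rate must be between -50 and 100")
    # Assumption: linear mapping through the two documented points.
    return 1 + rate / 100


def multiplier_to_rate(multiplier):
    """Convert a multiplier between 0.5x and 2.0x to the nearest integer rate."""
    if not 0.5 <= multiplier <= 2.0:
        raise ValueError("multiplier must be between 0.5 and 2.0")
    return round((multiplier - 1) * 100)
```

Under this assumption, requesting speech at 1.25x corresponds to `"speech_rate": 25` in the request body.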

Output Schema

The API returns a prediction response with the URLs of the generated output.

created_at (string)

ISO timestamp of when the request was created.

id (string)

Unique identifier for the prediction.

model (string)

Model ID used for the prediction.

outputs (array)

Array of URLs to the generated audio.

status (string)

Status of the task: created, processing, completed, or failed.

Example response

{
  "id": "pred_abc123",
  "status": "completed",
  "model": "model-name",
  "outputs": [
    "https://storage.atlascloud.ai/outputs/result.mp3"
  ],
  "metrics": {
    "predict_time": 8.3
  },
  "created_at": "2025-01-01T00:00:00Z",
  "completed_at": "2025-01-01T00:00:10Z"
}
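
The sample above shows the prediction fields at the top level, while the earlier response samples nest them under `data`. A sketch of a small container that accepts either shape follows. The class name `Prediction` and its `is_finished` property are illustrative and not part of any Atlas Cloud client library.

```python
from dataclasses import dataclass, field


@dataclass
class Prediction:
    id: str
    status: str
    model: str = ""
    outputs: list = field(default_factory=list)
    created_at: str = ""

    @classmethod
    def from_payload(cls, payload):
        """Build a Prediction from a response body, with or without the "data" wrapper."""
        data = payload.get("data", payload)
        return cls(
            id=data["id"],
            status=data["status"],
            model=data.get("model", ""),
            outputs=list(data.get("outputs") or []),
            created_at=data.get("created_at", ""),
        )

    @property
    def is_finished(self):
        # The status table lists "succeeded"; the schema description lists "completed".
        return self.status in {"completed", "succeeded", "failed"}
```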

Atlas Cloud Skills

Atlas Cloud Skills connects 300+ AI models directly to your AI coding assistant. Install it with a single command, then use natural language to generate images and videos and to chat with LLMs.

Supported clients

Claude Code

OpenAI Codex

Gemini CLI

Cursor

Windsurf

VS Code

Trae

GitHub Copilot

Cline

Roo Code

Amp

Goose

Replit

40+ supported clients

Install

npx skills add AtlasCloudAI/atlas-cloud-skills

Set the API key

Get an API key from the Atlas Cloud dashboard and set it as an environment variable.

export ATLASCLOUD_API_KEY="your-api-key-here"

Capabilities

Once installed, you can use natural language in your AI assistant to access all Atlas Cloud models.

Image generation: Generate images with models such as Nano Banana 2, Z-Image, and others.

Video generation: Create videos from text or images with Kling, Vidu, Veo, and others.

LLM chat: Chat with Qwen, DeepSeek, and other large language models.

Media upload: Upload local files for image editing and image-to-video workflows.

Learn more

github.com/AtlasCloudAI/atlas-cloud-skills

MCP Server

Atlas Cloud MCP Server connects your IDE to 300+ AI models through the Model Context Protocol. It works with any MCP-compatible client.

Supported clients

Cursor

VS Code

Windsurf

Claude Code

OpenAI Codex

Gemini CLI

Cline

Roo Code

100+ supported clients

Install

npx -y atlascloud-mcp

Configuration

Add the following configuration to your IDE's MCP settings file.

{
  "mcpServers": {
    "atlascloud": {
      "command": "npx",
      "args": [
        "-y",
        "atlascloud-mcp"
      ],
      "env": {
        "ATLASCLOUD_API_KEY": "your-api-key-here"
      }
    }
  }
}

Available tools

atlas_generate_image: Generate images from a text prompt.

atlas_generate_video: Create videos from text or images.

atlas_chat: Chat with large language models.

atlas_list_models: Browse the 300+ available AI models.

atlas_quick_generate: Create content in one step with automatic model selection.

atlas_upload_media: Upload local files for API workflows.

Learn more

github.com/AtlasCloudAI/mcp-server

API Schema

{
  "info": {
    "title": "AtlasCloud API",
    "version": "1.0.0",
    "description": "The AtlasCloud API."
  },
  "openapi": "3.0.0",
  "paths": {
    "/api/v1/model/generateAudio": {
      "post": {
        "requestBody": {
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/Input"
              }
            }
          },
          "required": true
        },
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "The request status."
          }
        }
      },
      "x-api-name": "model_run"
    },
    "/api/v1/model/prediction/{request_id}": {
      "get": {
        "parameters": [
          {
            "in": "path",
            "name": "request_id",
            "required": true,
            "schema": {
              "description": "Request ID",
              "type": "string"
            }
          }
        ],
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "Result of the request."
          }
        }
      },
      "x-api-name": "model_result"
    }
  },
  "components": {
    "schemas": {
      "Input": {
        "properties": {
          "model": {
            "type": "string",
            "description": "Model name.",
            "default": "bytedance/seed-audio-1.0",
            "enum": [
              "bytedance/seed-audio-1.0"
            ]
          },
          "text": {
            "description": "Prompt or text to synthesize into audio. Max 2048 characters. For reference audio generation, mention references using the upstream placeholder tokens such as `@audio1`, `@audio2`, and `@audio3`.",
            "type": "string",
            "default": "A phone vibrates first, then a calm male voice says: Welcome to Seed Audio.",
            "x-ui-component": "textarea",
            "maxLength": 2048
          },
          "references": {
            "description": "Optional reference resources. Omit for text-only generation. Provide up to 3 audio references or 1 image reference. Each reference item must contain exactly one of `speaker`, `audio_url`, `audio_data`, `image_url`, or `image_data`. Audio references can be cited in text using the upstream placeholder tokens such as `@audio1`, `@audio2`, and `@audio3`. Do not mix image references with audio references or speaker IDs in the same request.",
            "type": "array",
            "maxItems": 3,
            "items": {
              "description": "One reference resource. Pick exactly ONE source type (preset voice / audio URL / audio Base64 / image URL / image Base64); the others are hidden once one is chosen. Audio: wav/mp3/pcm/ogg_opus, <=30s & 10MB. Image: jpeg/png/webp, one image only, <=10MB. Image references cannot be mixed with audio/speaker references.",
              "type": "object",
              "properties": {
                "speaker": {
                  "description": "Voice ID from Doubao TTS 2.0 or a cloned voice. Mutually exclusive with audio_url and audio_data.",
                  "type": "string",
                  "enum": [
                    "zh_female_vv_uranus_bigtts",
                    "zh_female_xiaohe_uranus_bigtts",
                    "zh_male_m191_uranus_bigtts",
                    "zh_male_taocheng_uranus_bigtts",
                    "zh_male_liufei_uranus_bigtts",
                    "zh_female_sophie_uranus_bigtts",
                    "zh_female_qingxinnvsheng_uranus_bigtts",
                    "zh_female_cancan_uranus_bigtts",
                    "zh_female_tianmeitaozi_uranus_bigtts",
                    "zh_male_ruyayichen_uranus_bigtts"
                  ],
                  "x-ui-component": "voice-select",
                  "x-enum-options": {
                    "zh_female_vv_uranus_bigtts": {
                      "name": "Vivi 2.0",
                      "language": "en",
                      "gender": "female",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_vv_uranus_bigtts.mp3"
                    },
                    "zh_female_xiaohe_uranus_bigtts": {
                      "name": "Xiaohe 2.0",
                      "language": "en",
                      "gender": "female",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_xiaohe_uranus_bigtts.mp3"
                    },
                    "zh_male_m191_uranus_bigtts": {
                      "name": "Yunzhou 2.0",
                      "language": "en",
                      "gender": "male",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_male_m191_uranus_bigtts.mp3"
                    },
                    "zh_male_taocheng_uranus_bigtts": {
                      "name": "Xiaotian 2.0",
                      "language": "en",
                      "gender": "male",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_male_taocheng_uranus_bigtts.mp3"
                    },
                    "zh_male_liufei_uranus_bigtts": {
                      "name": "Liufei 2.0",
                      "language": "en",
                      "gender": "male",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_male_liufei_uranus_bigtts.mp3"
                    },
                    "zh_female_sophie_uranus_bigtts": {
                      "name": "Sophie 2.0",
                      "language": "en",
                      "gender": "female",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_sophie_uranus_bigtts.mp3"
                    },
                    "zh_female_qingxinnvsheng_uranus_bigtts": {
                      "name": "Fresh Voice 2.0",
                      "language": "en",
                      "gender": "female",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_qingxinnvsheng_uranus_bigtts.mp3"
                    },
                    "zh_female_cancan_uranus_bigtts": {
                      "name": "Cancan 2.0",
                      "language": "en",
                      "gender": "female",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_cancan_uranus_bigtts.mp3"
                    },
                    "zh_female_tianmeitaozi_uranus_bigtts": {
                      "name": "Sweet Peach 2.0",
                      "language": "en",
                      "gender": "female",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_tianmeitaozi_uranus_bigtts.mp3"
                    },
                    "zh_male_ruyayichen_uranus_bigtts": {
                      "name": "Yichen 2.0",
                      "language": "en",
                      "gender": "male",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_male_ruyayichen_uranus_bigtts.mp3"
                    }
                  },
                  "title": "Preset voice",
                  "x-ui-exclusive-group": "ref_source",
                  "x-ui-source-type": "audio"
                },
                "audio_url": {
                  "description": "Reference audio URL. Max 30s and 10MB per file. Supports wav/mp3/pcm/ogg_opus. Mutually exclusive with speaker and audio_data.",
                  "type": "string",
                  "x-ui-component": "uploader",
                  "title": "Audio URL",
                  "x-ui-exclusive-group": "ref_source",
                  "x-ui-source-type": "audio"
                },
                "audio_data": {
                  "description": "Reference audio as Base64. Max 30s and 10MB per file. Supports wav/mp3/pcm/ogg_opus. Mutually exclusive with speaker and audio_url.",
                  "type": "string",
                  "x-ui-component": "textarea",
                  "title": "Audio (Base64)",
                  "x-ui-exclusive-group": "ref_source",
                  "x-hidden": true,
                  "x-ui-source-type": "audio"
                },
                "image_url": {
                  "description": "Reference image URL. Mutually exclusive with image_data and cannot be mixed with audio references.",
                  "type": "string",
                  "x-ui-component": "uploader",
                  "title": "Image URL",
                  "x-ui-exclusive-group": "ref_source",
                  "x-ui-source-type": "image"
                },
                "image_data": {
                  "description": "Reference image as Base64. Mutually exclusive with image_url and cannot be mixed with audio references.",
                  "type": "string",
                  "x-ui-component": "textarea",
                  "title": "Image (Base64)",
                  "x-ui-exclusive-group": "ref_source",
                  "x-hidden": true,
                  "x-ui-source-type": "image"
                }
              },
              "x-ui-exclusive": "ref_source"
            },
            "x-ui-cross-item-exclusive-by": "x-ui-source-type",
            "x-ui-source-type-max-items": {
              "audio": 3,
              "image": 1
            }
          },
          "format": {
            "description": "Output audio format.",
            "type": "string",
            "default": "mp3",
            "enum": [
              "mp3",
              "wav",
              "pcm",
              "ogg_opus"
            ],
            "x-ui-component": "select"
          },
          "sample_rate": {
            "description": "Output sample rate.",
            "type": "integer",
            "default": 24000,
            "enum": [
              8000,
              16000,
              24000,
              32000,
              44100,
              48000
            ],
            "x-ui-component": "select"
          },
          "pitch_rate": {
            "description": "Pitch adjustment. Range [-12, 12], default 0.",
            "type": "integer",
            "default": 0,
            "minimum": -12,
            "maximum": 12,
            "x-ui-component": "slider"
          },
          "speech_rate": {
            "description": "Speech speed adjustment. Range [-50, 100], where 100 means 2.0x and -50 means 0.5x.",
            "type": "integer",
            "default": 0,
            "minimum": -50,
            "maximum": 100,
            "x-ui-component": "slider"
          },
          "loudness_rate": {
            "description": "Loudness adjustment. Range [-50, 100], where 100 means 2.0x loudness and -50 means 0.5x.",
            "type": "integer",
            "default": 0,
            "minimum": -50,
            "maximum": 100,
            "x-ui-component": "slider"
          }
        },
        "required": [
          "model",
          "text"
        ],
        "type": "object",
        "x-order-properties": [
          "model",
          "text",
          "references",
          "format",
          "sample_rate",
          "pitch_rate",
          "speech_rate",
          "loudness_rate"
        ]
      },
      "Reference": {
        "description": "One reference resource for Seed Audio. Use exactly one of speaker, audio_url, audio_data, image_url, or image_data per item. Audio supports wav/mp3/pcm/ogg_opus, up to 30 seconds and 10 MB per item. Image supports jpeg/png/webp, one image only, up to 10 MB.",
        "type": "object",
        "properties": {
          "speaker": {
            "description": "Voice ID from Doubao TTS 2.0 or a cloned voice. Mutually exclusive with audio_url and audio_data.",
            "type": "string",
            "enum": [
              "zh_female_vv_uranus_bigtts",
              "zh_female_xiaohe_uranus_bigtts",
              "zh_male_m191_uranus_bigtts",
              "zh_male_taocheng_uranus_bigtts",
              "zh_male_liufei_uranus_bigtts",
              "zh_female_sophie_uranus_bigtts",
              "zh_female_qingxinnvsheng_uranus_bigtts",
              "zh_female_cancan_uranus_bigtts",
              "zh_female_tianmeitaozi_uranus_bigtts",
              "zh_male_ruyayichen_uranus_bigtts"
            ],
            "x-ui-component": "voice-select",
            "x-enum-options": {
              "zh_female_vv_uranus_bigtts": {
                "name": "Vivi 2.0",
                "language": "en",
                "gender": "female",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_vv_uranus_bigtts.mp3"
              },
              "zh_female_xiaohe_uranus_bigtts": {
                "name": "Xiaohe 2.0",
                "language": "en",
                "gender": "female",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_xiaohe_uranus_bigtts.mp3"
              },
              "zh_male_m191_uranus_bigtts": {
                "name": "Yunzhou 2.0",
                "language": "en",
                "gender": "male",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_male_m191_uranus_bigtts.mp3"
              },
              "zh_male_taocheng_uranus_bigtts": {
                "name": "Xiaotian 2.0",
                "language": "en",
                "gender": "male",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_male_taocheng_uranus_bigtts.mp3"
              },
              "zh_male_liufei_uranus_bigtts": {
                "name": "Liufei 2.0",
                "language": "en",
                "gender": "male",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_male_liufei_uranus_bigtts.mp3"
              },
              "zh_female_sophie_uranus_bigtts": {
                "name": "Sophie 2.0",
                "language": "en",
                "gender": "female",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_sophie_uranus_bigtts.mp3"
              },
              "zh_female_qingxinnvsheng_uranus_bigtts": {
                "name": "Fresh Voice 2.0",
                "language": "en",
                "gender": "female",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_qingxinnvsheng_uranus_bigtts.mp3"
              },
              "zh_female_cancan_uranus_bigtts": {
                "name": "Cancan 2.0",
                "language": "en",
                "gender": "female",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_cancan_uranus_bigtts.mp3"
              },
              "zh_female_tianmeitaozi_uranus_bigtts": {
                "name": "Sweet Peach 2.0",
                "language": "en",
                "gender": "female",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_tianmeitaozi_uranus_bigtts.mp3"
              },
              "zh_male_ruyayichen_uranus_bigtts": {
                "name": "Yichen 2.0",
                "language": "en",
                "gender": "male",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_male_ruyayichen_uranus_bigtts.mp3"
              }
            },
            "x-ui-source-type": "audio"
          },
          "audio_url": {
            "description": "Reference audio URL. Mutually exclusive with speaker and audio_data.",
            "type": "string",
            "x-ui-component": "uploader",
            "x-ui-source-type": "audio"
          },
          "audio_data": {
            "description": "Reference audio as Base64. Mutually exclusive with speaker and audio_url.",
            "type": "string",
            "x-ui-component": "textarea",
            "x-hidden": true,
            "x-ui-source-type": "audio"
          },
          "image_url": {
            "description": "Reference image URL. Mutually exclusive with image_data and cannot be mixed with audio references.",
            "type": "string",
            "x-ui-component": "uploader",
            "x-ui-source-type": "image"
          },
          "image_data": {
            "description": "Reference image as Base64. Mutually exclusive with image_url and cannot be mixed with audio references.",
            "type": "string",
            "x-ui-component": "textarea",
            "x-hidden": true,
            "x-ui-source-type": "image"
          }
        }
      },
      "PredictionResponse": {
        "properties": {
          "created_at": {
            "description": "ISO timestamp of when the request was created.",
            "format": "date-time",
            "type": "string"
          },
          "id": {
            "description": "Unique identifier for the prediction.",
            "type": "string"
          },
          "model": {
            "description": "Model ID used for the prediction.",
            "type": "string"
          },
          "outputs": {
            "description": "Array of URLs to the generated audio.",
            "items": {
              "type": "string",
              "x-format": "audio"
            },
            "type": "array"
          },
          "status": {
            "description": "Status of the task: created, processing, completed, or failed.",
            "type": "string"
          },
          "urls": {
            "description": "Object containing related API endpoints.",
            "type": "object"
          }
        },
        "type": "object"
      }
    }
  },
  "servers": [
    {
      "url": "https://api.atlascloud.ai"
    }
  ]
}

Prompt template for LLMs

# bytedance/seed-audio-1.0

> Doubao‑Audio‑Generate‑1.0 is Doubao Voice’s next‑generation audio‑generation engine. The industry‑first commercial tool creates film‑grade audio with just one prompt.
It eliminates cumbersome audio‑engineering work. Creators generate publish‑ready radio dramas, podcasts and branded audio easily, shifting from a simple voice‑generator to an AI audio director. It serves audiobooks, serialized episodes and commercial audio for high‑quality narrative‑driven production.


## Overview

- **Submit endpoint (POST)**: `https://api.atlascloud.ai/api/v1/model/generateAudio` — start an async generation; returns a `prediction_id`
- **Poll endpoint (GET)**: `https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}` — poll this until the prediction finishes
- **Model ID**: `bytedance/seed-audio-1.0`


## API Information

This model can be used via our HTTP API or more conveniently via our client libraries.
See the input and output schema below, as well as the usage examples.


### Input Schema

The API accepts the following input parameters:

- **`model`** (`string`, _required_):
  Model name.
  - Default: `"bytedance/seed-audio-1.0"`
  - Options: "bytedance/seed-audio-1.0"

- **`text`** (`string`, _required_):
  Prompt or text to synthesize into audio. Max 2048 characters. For reference audio generation, mention references using the upstream placeholder tokens such as `@audio1`, `@audio2`, and `@audio3`.
  - Default: `"A phone vibrates first, then a calm male voice says: Welcome to Seed Audio."`

- **`references`** (`array[object]`, _optional_):
  Optional reference resources. Omit for text-only generation. Provide up to 3 audio references or 1 image reference. Each reference item must contain exactly one of `speaker`, `audio_url`, `audio_data`, `image_url`, or `image_data`. Audio references can be cited in text using the upstream placeholder tokens such as `@audio1`, `@audio2`, and `@audio3`. Do not mix image references with audio references or speaker IDs in the same request.
  - Max items: 3
  - Item properties:
    - **`speaker`** (`string`, _optional_):
      Voice ID from Doubao TTS 2.0 or a cloned voice. Mutually exclusive with audio_url and audio_data.
      - Options: "zh_female_vv_uranus_bigtts", "zh_female_xiaohe_uranus_bigtts", "zh_male_m191_uranus_bigtts", "zh_male_taocheng_uranus_bigtts", "zh_male_liufei_uranus_bigtts", "zh_female_sophie_uranus_bigtts", "zh_female_qingxinnvsheng_uranus_bigtts", "zh_female_cancan_uranus_bigtts", "zh_female_tianmeitaozi_uranus_bigtts", "zh_male_ruyayichen_uranus_bigtts"

    - **`audio_url`** (`string`, _optional_):
      Reference audio URL. Max 30s and 10MB per file. Supports wav/mp3/pcm/ogg_opus. Mutually exclusive with speaker and audio_data.

    - **`audio_data`** (`string`, _optional_):
      Reference audio as Base64. Max 30s and 10MB per file. Supports wav/mp3/pcm/ogg_opus. Mutually exclusive with speaker and audio_url.

    - **`image_url`** (`string`, _optional_):
      Reference image URL. Mutually exclusive with image_data and cannot be mixed with audio references.

    - **`image_data`** (`string`, _optional_):
      Reference image as Base64. Mutually exclusive with image_url and cannot be mixed with audio references.


- **`format`** (`string`, _optional_):
  Output audio format.
  - Default: `"mp3"`
  - Options: "mp3", "wav", "pcm", "ogg_opus"

- **`sample_rate`** (`integer`, _optional_):
  Output sample rate.
  - Default: `24000`
  - Options: 8000, 16000, 24000, 32000, 44100, 48000

- **`pitch_rate`** (`integer`, _optional_):
  Pitch adjustment. Range [-12, 12], default 0.
  - Default: `0`
  - Min: -12
  - Max: 12

- **`speech_rate`** (`integer`, _optional_):
  Speech speed adjustment. Range [-50, 100], where 100 means 2.0x and -50 means 0.5x.
  - Default: `0`
  - Min: -50
  - Max: 100

- **`loudness_rate`** (`integer`, _optional_):
  Loudness adjustment. Range [-50, 100], where 100 means 2.0x loudness and -50 means 0.5x.
  - Default: `0`
  - Min: -50
  - Max: 100
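
The two anchor points given for `speech_rate` and `loudness_rate` (100 means 2.0x, -50 means 0.5x) fit a linear mapping, `multiplier = 1 + rate / 100`. The sketch below assumes that mapping holds across the whole range and that 0 means 1.0x; the schema does not state either explicitly. `rate_to_multiplier` and `multiplier_to_rate` are hypothetical helper names, not part of any client library.

```python
def rate_to_multiplier(rate: int) -> float:
    """Convert a speech_rate or loudness_rate value to a multiplier.

    Assumes a linear mapping through the documented points
    (-50 -> 0.5x, 100 -> 2.0x), which also gives 0 -> 1.0x.
    """
    if not -50 <= rate <= 100:
        raise ValueError("rate must be in [-50, 100]")
    return 1.0 + rate / 100.0


def multiplier_to_rate(multiplier: float) -> int:
    """Inverse of rate_to_multiplier, rounded to the nearest integer rate."""
    if not 0.5 <= multiplier <= 2.0:
        raise ValueError("multiplier must be in [0.5, 2.0]")
    return round((multiplier - 1.0) * 100)
```

Under this assumption, asking for speech at 1.5x speed would mean sending `"speech_rate": 50`.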



**Required Parameters Example**:

```json
{
  "model": "bytedance/seed-audio-1.0",
  "text": "A phone vibrates first, then a calm male voice says: Welcome to Seed Audio."
}
```


**Full Example**:

```json
{
  "model": "bytedance/seed-audio-1.0",
  "text": "A phone vibrates first, then a calm male voice says: Welcome to Seed Audio.",
  "references": [
    {
      "speaker": "zh_female_vv_uranus_bigtts"
    }
  ],
  "format": "mp3",
  "sample_rate": 24000,
  "pitch_rate": 0,
  "speech_rate": 0,
  "loudness_rate": 0
}
```
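
The limits listed in the input schema can be checked on the client before a request is sent. The following is a minimal sketch rather than an official client: `build_payload` and the constant names are hypothetical, and the limits are copied from the schema above.

```python
ALLOWED_FORMATS = {"mp3", "wav", "pcm", "ogg_opus"}
ALLOWED_SAMPLE_RATES = {8000, 16000, 24000, 32000, 44100, 48000}
MAX_TEXT_CHARS = 2048


def build_payload(text, references=None, audio_format="mp3", sample_rate=24000,
                  pitch_rate=0, speech_rate=0, loudness_rate=0):
    """Assemble a request body, raising ValueError on out-of-schema values."""
    if not text or len(text) > MAX_TEXT_CHARS:
        raise ValueError("text must contain 1 to 2048 characters")
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {audio_format}")
    if sample_rate not in ALLOWED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {sample_rate}")
    if not -12 <= pitch_rate <= 12:
        raise ValueError("pitch_rate must be in [-12, 12]")
    if not -50 <= speech_rate <= 100:
        raise ValueError("speech_rate must be in [-50, 100]")
    if not -50 <= loudness_rate <= 100:
        raise ValueError("loudness_rate must be in [-50, 100]")

    payload = {
        "model": "bytedance/seed-audio-1.0",
        "text": text,
        "format": audio_format,
        "sample_rate": sample_rate,
        "pitch_rate": pitch_rate,
        "speech_rate": speech_rate,
        "loudness_rate": loudness_rate,
    }
    # The schema says to omit "references" for text-only generation.
    if references:
        payload["references"] = references
    return payload
```

Calling `build_payload("Hello")` yields a text-only body with the documented defaults and no `references` key.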


### Output Schema

The API returns the following output format:


- **`created_at`** (`string`, _optional_):
  ISO timestamp of when the request was created.

- **`id`** (`string`, _optional_):
  Unique identifier for the prediction.

- **`model`** (`string`, _optional_):
  Model ID used for the prediction.

- **`outputs`** (`array[string]`, _optional_):
  Array of URLs to the generated audio.

- **`status`** (`string`, _optional_):
  Status of the task: created, processing, completed, or failed.

- **`urls`** (`object`, _optional_):
  Object containing related API endpoints.



**Example Response**:

```json
{
  "created_at": "",
  "id": "",
  "model": "",
  "outputs": [
    ""
  ],
  "status": "",
  "urls": {}
}
```
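
Because every field in the output schema is marked optional, client code should tolerate missing keys. One way to do that is a small typed wrapper, sketched below. The `Prediction` class is a hypothetical illustration and not part of any Atlas Cloud library.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Prediction:
    """Typed view of the output schema; any field may be absent."""
    id: Optional[str] = None
    model: Optional[str] = None
    status: Optional[str] = None
    created_at: Optional[str] = None
    outputs: list = field(default_factory=list)
    urls: dict = field(default_factory=dict)

    @classmethod
    def from_dict(cls, body: dict) -> "Prediction":
        """Build a Prediction from a decoded JSON object, filling gaps with defaults."""
        return cls(
            id=body.get("id"),
            model=body.get("model"),
            status=body.get("status"),
            created_at=body.get("created_at"),
            outputs=list(body.get("outputs") or []),
            urls=dict(body.get("urls") or {}),
        )

    @property
    def failed(self) -> bool:
        """True when the task reported the documented "failed" status."""
        return self.status == "failed"
```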


## Usage Examples

### cURL

```bash
# Step 1: Start generation (async)
curl -X POST "https://api.atlascloud.ai/api/v1/model/generateAudio" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "bytedance/seed-audio-1.0",
  "text": "A phone vibrates first, then a calm male voice says: Welcome to Seed Audio.",
  "references": [
    {
      "speaker": "zh_female_vv_uranus_bigtts"
    }
  ],
  "format": "mp3",
  "sample_rate": 24000,
  "pitch_rate": 0,
  "speech_rate": 0,
  "loudness_rate": 0
}'

# Response will contain: {"code": 200, "data": {"id": "prediction_id", "status": "processing"}}

# Step 2: Poll for result (replace {prediction_id} with the id returned above)
curl -X GET "https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY"

# Keep polling until status is "completed", "succeeded" or "failed"
# When completed, outputs will contain the generated content URL(s)
```
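
The polling step can be sketched in Python using only the standard library. This is an illustrative sketch under stated assumptions: the response is assumed to be wrapped in a `data` object as in the comment above, the terminal statuses are taken from that same comment, and `fetch_prediction` and `wait_for_prediction` are hypothetical names. The `fetch` and `sleep` arguments exist so the loop can be exercised without network access.

```python
import json
import time
import urllib.request

POLL_URL = "https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"
TERMINAL_STATUSES = {"completed", "succeeded", "failed"}


def fetch_prediction(prediction_id: str, api_key: str) -> dict:
    """GET the prediction once and return the decoded JSON body."""
    request = urllib.request.Request(
        POLL_URL.format(prediction_id=prediction_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_prediction(prediction_id, api_key, fetch=fetch_prediction,
                        interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll until a terminal status is seen or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        body = fetch(prediction_id, api_key)
        # Assumption: the prediction sits under "data"; fall back to the bare body.
        data = body.get("data") or body
        if data.get("status") in TERMINAL_STATUSES:
            return data
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction {prediction_id} did not finish in time")
        sleep(interval)
```

The interval and timeout defaults are arbitrary choices for the sketch; the source does not recommend specific values.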

## Additional Resources

### Documentation

- [Model Playground](https://www.atlascloud.ai/models/bytedance/seed-audio-1.0)

## Seed Audio 1.0

Seed Audio 1.0 is ByteDance's audio generation model for producing speech from text prompts, with optional reference audio, speaker, or image inputs. It is exposed on AtlasCloud through the standard asynchronous audio generation API.

### Highlights

- **Text-to-speech generation**: Convert text prompts into speech audio.
- **Reference audio control**: Provide up to three reference audios or a speaker ID to guide the voice, tone, or delivery. Refer to them in the prompt using the upstream placeholder tokens `@audio1`, `@audio2`, and `@audio3`.
- **Reference image control**: Provide one reference image to guide the generated audio style or character context.
- **Reference exclusivity**: Each reference item must contain exactly one of `speaker`, `audio_url`, `audio_data`, `image_url`, or `image_data`. The upstream API rejects mixed audio + image references in the same request.
- **Audio format control**: Generate `mp3`, `wav`, `pcm`, or `ogg_opus`.
- **Sample rate control**: Choose common output sample rates from 8000 to 48000.
- **Speech controls**: Adjust pitch, speech speed, and loudness with optional rate parameters.
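
The reference rules above can be checked before sending a request. The sketch below is one possible client-side check, not the upstream validation logic: `validate_references` is a hypothetical helper, and treating an empty string as "not set" is an assumption.

```python
REFERENCE_KEYS = ("speaker", "audio_url", "audio_data", "image_url", "image_data")
IMAGE_KEYS = {"image_url", "image_data"}


def validate_references(references: list) -> None:
    """Raise ValueError if the list breaks the documented reference rules."""
    if len(references) > 3:
        raise ValueError("at most 3 reference items are allowed")
    kinds = []
    for index, item in enumerate(references):
        # Assumption: keys with empty values count as absent.
        present = [key for key in REFERENCE_KEYS if item.get(key)]
        if len(present) != 1:
            raise ValueError(
                f"reference {index} must set exactly one of {REFERENCE_KEYS}"
            )
        kinds.append("image" if present[0] in IMAGE_KEYS else "audio")
    # One image at most, and never alongside audio or speaker references.
    if "image" in kinds and len(kinds) > 1:
        raise ValueError("an image reference must be the only reference in a request")
```

A list with one `audio_url` item and one `speaker` item passes, while a list that pairs an `image_url` item with either of them raises `ValueError`.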

### Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| `model` | Yes | Use `bytedance/seed-audio-1.0`. |
| `text` | Yes | Text prompt to synthesize into speech. |
| `references` | No | Optional array of reference inputs. Use `audio_url`, `audio_data`, or `speaker` for audio/voice references; use `image_url` or `image_data` for image references. Do not mix image references with audio or speaker references. |
| `format` | No | Output audio format. Default: `mp3`. |
| `sample_rate` | No | Output sample rate. Default: `24000`. |
| `pitch_rate` | No | Pitch adjustment. Default: `0`. |
| `speech_rate` | No | Speech speed adjustment. Default: `0`. |
| `loudness_rate` | No | Loudness adjustment. Default: `0`. |

### Example Request

Full-featured example with a reference audio and all tunable controls:

```json
{
  "model": "bytedance/seed-audio-1.0",
  "text": "Use the voice and delivery of @audio1 and say in natural, clear English with a light broadcast tone: Welcome to Seed Audio. This is the most complete reference example.",
  "references": [
    {
      "audio_url": "https://static.atlascloud.ai/model/example/bytedance-seed-audio-1.0.mp3"
    }
  ],
  "format": "mp3",
  "sample_rate": 44100,
  "pitch_rate": 2,
  "speech_rate": 15,
  "loudness_rate": 10
}
```

Text-only example:

```json
{
  "model": "bytedance/seed-audio-1.0",
  "text": "Hello, this is a Seed Audio text-to-speech test.",
  "format": "mp3",
  "sample_rate": 24000,
  "pitch_rate": 0,
  "speech_rate": 0,
  "loudness_rate": 0
}
```

With a reference audio:

```json
{
  "model": "bytedance/seed-audio-1.0",
  "text": "Use the voice and delivery of @audio1 and say: The city sounds especially quiet today.",
  "references": [
    {
      "audio_url": "https://static.atlascloud.ai/model/example/bytedance-seed-audio-1.0.mp3"
    }
  ],
  "format": "mp3",
  "sample_rate": 24000
}
```
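
When the reference audio is a local file rather than a URL, it can be sent through `audio_data` as Base64. The sketch below makes several assumptions the source does not confirm: that the API expects plain Base64 with no data-URI prefix, that the 10MB limit applies to the raw file, and that "10MB" means 10 MiB. `audio_file_to_reference` is a hypothetical helper name, and the 30-second duration limit is not checked here.

```python
import base64
from pathlib import Path

MAX_REFERENCE_BYTES = 10 * 1024 * 1024  # assumption: "10MB" read as 10 MiB


def audio_file_to_reference(path: str) -> dict:
    """Read a local audio file and return a reference item using audio_data."""
    raw = Path(path).read_bytes()
    if len(raw) > MAX_REFERENCE_BYTES:
        raise ValueError("reference audio exceeds the 10MB limit")
    # Assumption: plain Base64 text, without a "data:audio/...;base64," prefix.
    return {"audio_data": base64.b64encode(raw).decode("ascii")}
```

The returned dictionary sets only `audio_data`, so it can be placed directly in the `references` array.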

With a reference image:

```json
{
  "model": "bytedance/seed-audio-1.0",
  "text": "Using the mood and character context from the reference image, say in a bright, youthful tone: After the rain stopped, the street lit up again.",
  "references": [
    {
      "image_url": "https://static.atlascloud.ai/uploads/models/ebeecbb1-1904-464c-ad24-6a631fa83ab6.png"
    }
  ],
  "format": "mp3",
  "sample_rate": 24000
}
```

### Pricing

Seed Audio 1.0 is billed by input text length.

| Unit | Price |
| --- | --- |
| Per 1,000 characters | `$0.015` |
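
As a rough worked example of this rate, the sketch below estimates the charge for a single request. It assumes pro-rata billing per character and that characters are counted the way Python's `len` counts them; the source states neither, so actual charges may be rounded or counted differently. `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal

PRICE_PER_1000_CHARS = Decimal("0.015")  # USD, from the pricing table


def estimate_cost(text: str) -> Decimal:
    """Estimate the charge for one request, assuming pro-rata per-character billing."""
    return PRICE_PER_1000_CHARS * len(text) / 1000
```

Under these assumptions, a prompt at the 2048-character limit would cost about $0.03.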
