bytedance/seed-audio-1.0

Text-to-Speech

Seed Audio 1.0 API by ByteDance

Seed-audio-1.0

Doubao‑Audio‑Generate‑1.0 is Doubao Voice’s next‑generation audio‑generation engine. The industry‑first commercial tool creates film‑grade audio with just one prompt. It eliminates cumbersome audio‑engineering work. Creators generate publish‑ready radio dramas, podcasts and branded audio easily, shifting from a simple voice‑generator to an AI audio director. It serves audiobooks, serialized episodes and commercial audio for high‑quality narrative‑driven production.

Code example
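import os
import time

import requests

# Read the API key from the environment; Python does not expand "$ATLASCLOUD_API_KEY" inside a string.
API_KEY = os.environ["ATLASCLOUD_API_KEY"]

# Step 1: Start audio generation
generate_url = "https://api.atlascloud.ai/api/v1/model/generateAudio"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}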
data = {
    "model": "bytedance/seed-audio-1.0",  # Required. Model name. options: bytedance/seed-audio-1.0
    "text": "A phone vibrates first, then a calm male voice says: Welcome to Seed Audio.",  # Required. Prompt or text to synthesize into audio
    "references": [
        # Each item must contain exactly one of: speaker, audio_url, audio_data, image_url, image_data
        {"speaker": "zh_male_m191_uranus_bigtts"}
    ],  # Optional reference resources (up to 3 audio references or 1 image reference)
    "format": "mp3",  # Output audio format. options: mp3 | wav | pcm | ogg_opus
    "sample_rate": 24000,  # Output sample rate. options: 8000 | 16000 | 24000 | 32000 | 44100 | 48000
    "pitch_rate": 0,  # Pitch adjustment. (min: -12, max: 12)
    "speech_rate": 0,  # Speech speed adjustment. (min: -50, max: 100)
    "loudness_rate": 0,  # Loudness adjustment. (min: -50, max: 100)
}

generate_response = requests.post(generate_url, headers=headers, json=data)
generate_result = generate_response.json()
prediction_id = generate_result["data"]["id"]

# Step 2: Poll for result
poll_url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"

def check_status():
    while True:
        # Reuse the headers defined in step 1
        response = requests.get(poll_url, headers=headers)
        result = response.json()
        status = result["data"]["status"]

        if status in ["completed", "succeeded"]:
            print("Generated audio:", result["data"]["outputs"][0])
            return result["data"]["outputs"][0]
        elif status == "failed":
            # "error" may be missing, so fall back to a generic message
            raise Exception(result["data"].get("error") or "Generation failed")
        else:
            # Still processing, wait 2 seconds
            time.sleep(2)

audio_url = check_status()

Install

Install the required package for your programming language.

pip install requests

Authentication

All API requests require authentication with an API key. You can get your API key from the Atlas Cloud dashboard.

export ATLASCLOUD_API_KEY="your-api-key-here"

HTTP headers

import os

API_KEY = os.environ.get("ATLASCLOUD_API_KEY")
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}

Protect your API key

Never expose your API key in client-side code or public repositories. Use environment variables or a backend proxy instead.
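The advice above can be made harder to get wrong with a small helper. The following is a minimal sketch, not part of the Atlas Cloud API: `build_headers` is a hypothetical name, and it assumes the key lives in the `ATLASCLOUD_API_KEY` environment variable as shown earlier. It raises an error when the variable is missing instead of sending `Bearer None`.

```python
import os


def build_headers(env=None):
    """Return request headers, failing fast when the API key is missing.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    api_key = env.get("ATLASCLOUD_API_KEY", "").strip()
    if not api_key:
        # Without this check the header would silently become "Bearer None".
        raise RuntimeError("ATLASCLOUD_API_KEY is not set")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```

Calling `build_headers()` once at startup surfaces a missing key before any request is sent.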

Submit a request

import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateAudio"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; Python does not expand "$ATLASCLOUD_API_KEY"
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}
data = {
    "model": "bytedance/seed-audio-1.0",
    "text": "Hello, welcome to AtlasCloud text-to-speech."
}

response = requests.post(url, headers=headers, json=data)
print(response.json())

Submit a request

Submit an asynchronous generation request. The API returns a prediction ID that you can use to check the status and retrieve the result.

POST /api/v1/model/generateAudio

Request body

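import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateAudio"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; Python does not expand "$ATLASCLOUD_API_KEY"
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}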

data = {
    "model": "bytedance/seed-audio-1.0",
    "text": "Hello, welcome to AtlasCloud text-to-speech."
}

response = requests.post(url, headers=headers, json=data)
result = response.json()

print(f"Prediction ID: {result['data']['id']}")
print(f"Status: {result['data']['status']}")

Response

{
  "code": 200,
  "data": {
    "id": "pred_abc123",
    "status": "processing",
    "model": "model-name",
    "created_at": "2025-01-01T00:00:00Z"
  }
}

Check status

Query the prediction endpoint to check the current status of your request.

GET /api/v1/model/prediction/{prediction_id}

Polling example

import os
import time

import requests

prediction_id = "pred_abc123"
url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"
# Read the key from the environment; Python does not expand "$ATLASCLOUD_API_KEY"
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}

while True:
    response = requests.get(url, headers=headers)
    result = response.json()
    status = result["data"]["status"]
    print(f"Status: {status}")

    if status in ["completed", "succeeded"]:
        output_url = result["data"]["outputs"][0]
        print(f"Output URL: {output_url}")
        break
    elif status == "failed":
        print(f"Error: {result['data'].get('error', 'Unknown')}")
        break

    time.sleep(3)

Status values

processing: The request is still being processed.

completed: The generation has finished. Results are available.

succeeded: The generation succeeded. Results are available.

failed: The generation failed. Check the error field.
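The status values above split into terminal states (`completed`, `succeeded`, `failed`) and a non-terminal one (`processing`). The sketch below shows one way to poll with an upper time bound so a stuck prediction does not loop forever. `wait_for_prediction`, the 2-second interval and the 300-second timeout are illustrative assumptions rather than documented API behavior; `fetch` is any callable that returns the decoded JSON body of the status endpoint.

```python
import time

TERMINAL_OK = {"completed", "succeeded"}
TERMINAL_FAILED = {"failed"}


def wait_for_prediction(fetch, interval=2.0, timeout=300.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until the prediction reaches a terminal status or the timeout passes.

    `fetch` is a zero-argument callable returning the decoded JSON body of
    GET /api/v1/model/prediction/{prediction_id}.
    """
    deadline = clock() + timeout
    while True:
        data = fetch().get("data", {})
        status = data.get("status")
        if status in TERMINAL_OK:
            return data.get("outputs", [])
        if status in TERMINAL_FAILED:
            raise RuntimeError(data.get("error") or "Generation failed")
        if clock() >= deadline:
            raise TimeoutError(f"prediction still '{status}' after {timeout} seconds")
        # Any other status (for example "processing") means: wait and retry.
        sleep(interval)
```

With `requests`, `fetch` could be `lambda: requests.get(url, headers=headers).json()`. Injecting it keeps the loop testable without network access.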

Completed response

{
  "data": {
    "id": "pred_abc123",
    "status": "completed",
    "outputs": [
      "https://storage.atlascloud.ai/outputs/result.mp3"
    ],
    "metrics": {
      "predict_time": 8.3
    },
    "created_at": "2025-01-01T00:00:00Z",
    "completed_at": "2025-01-01T00:00:10Z"
  }
}
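As a small illustration of reading the completed response, the sketch below pulls out the first output URL and computes the wall-clock time between `created_at` and `completed_at`. The helper names are hypothetical. Note that the elapsed time can be larger than `metrics.predict_time`, presumably because it also covers time spent outside the prediction itself.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp such as '2025-01-01T00:00:00Z'."""
    # datetime.fromisoformat() only accepts a trailing 'Z' from Python 3.11 on,
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize(result):
    """Return the first output URL and the seconds between the two timestamps."""
    data = result["data"]
    elapsed = parse_timestamp(data["completed_at"]) - parse_timestamp(data["created_at"])
    return data["outputs"][0], elapsed.total_seconds()
```

For the response shown above, `summarize` yields the `result.mp3` URL and an elapsed time of 10 seconds.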

Input schema

The following parameters are accepted in the request body.

Total: 8, Required: 2, Optional: 6

model (string, required)

Model name.

Default: "bytedance/seed-audio-1.0"

Options: bytedance/seed-audio-1.0

text (string, required)

Prompt or text to synthesize into audio. Max 2048 characters. For reference audio generation, mention references using the upstream placeholder tokens such as `@audio1`, `@audio2`, and `@audio3`.

Default: "A phone vibrates first, then a calm male voice says: Welcome to Seed Audio."
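The `text` limits can be checked on the client before a request is sent. This is a minimal sketch under two assumptions that the description above does not spell out: that the 2048 limit counts Python string characters, and that `@audioN` refers to the N-th audio reference (1-based). `check_text` is a hypothetical helper name.

```python
import re

MAX_TEXT_LENGTH = 2048
PLACEHOLDER = re.compile(r"@audio([1-3])\b")


def check_text(text, audio_reference_count=0):
    """Return a list of problems found in `text` (empty when none)."""
    problems = []
    if not text:
        problems.append("text is required")
    if len(text) > MAX_TEXT_LENGTH:
        problems.append(f"text has {len(text)} characters; the limit is {MAX_TEXT_LENGTH}")
    for index in sorted({int(n) for n in PLACEHOLDER.findall(text)}):
        # Assumption: @audioN refers to the N-th audio reference (1-based).
        if index > audio_reference_count:
            problems.append(f"@audio{index} has no matching audio reference")
    return problems
```

For example, `check_text("@audio1 greets @audio2", audio_reference_count=1)` reports that `@audio2` has no matching reference.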

references (array[object])

Optional reference resources. Omit for text-only generation. Provide up to 3 audio references or 1 image reference. Each reference item must contain exactly one of `speaker`, `audio_url`, `audio_data`, `image_url`, or `image_data`. Audio references can be cited in text using the upstream placeholder tokens such as `@audio1`, `@audio2`, and `@audio3`. Do not mix image references with audio references or speaker IDs in the same request.

Max items: 3

speaker (string)

Voice ID from Doubao TTS 2.0 or a cloned voice. Mutually exclusive with audio_url and audio_data.

Options: zh_female_vv_uranus_bigtts, zh_female_xiaohe_uranus_bigtts, zh_male_m191_uranus_bigtts, zh_male_taocheng_uranus_bigtts, zh_male_liufei_uranus_bigtts, zh_female_sophie_uranus_bigtts, zh_female_qingxinnvsheng_uranus_bigtts, zh_female_cancan_uranus_bigtts, zh_female_tianmeitaozi_uranus_bigtts, zh_male_ruyayichen_uranus_bigtts

audio_url (string)

Reference audio URL. Max 30s and 10MB per file. Supports wav/mp3/pcm/ogg_opus. Mutually exclusive with speaker and audio_data.

audio_data (string)

Reference audio as Base64. Max 30s and 10MB per file. Supports wav/mp3/pcm/ogg_opus. Mutually exclusive with speaker and audio_url.

image_url (string)

Reference image URL. Mutually exclusive with image_data and cannot be mixed with audio references.

image_data (string)

Reference image as Base64. Mutually exclusive with image_url and cannot be mixed with audio references.
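The rules for `references` (exactly one source per item, at most 3 audio references or 1 image reference, no mixing of the two kinds) can be expressed as a short client-side check. The sketch below is an illustration of those stated rules, not the server's actual validation; `check_references` is a hypothetical name, and it does not check file size, duration, or whether a `speaker` ID exists.

```python
AUDIO_KEYS = {"speaker", "audio_url", "audio_data"}
IMAGE_KEYS = {"image_url", "image_data"}


def check_references(references):
    """Return a list of rule violations for a `references` array (empty when valid)."""
    problems = []
    kinds = []
    for position, item in enumerate(references, start=1):
        # Collect the source fields that are present and non-empty.
        present = [key for key in item if key in AUDIO_KEYS | IMAGE_KEYS and item[key]]
        if len(present) != 1:
            problems.append(
                f"reference {position} must set exactly one source, found {len(present)}"
            )
            continue
        kinds.append("audio" if present[0] in AUDIO_KEYS else "image")
    if "audio" in kinds and "image" in kinds:
        problems.append("image references cannot be mixed with audio or speaker references")
    if kinds.count("audio") > 3:
        problems.append("at most 3 audio references are allowed")
    if kinds.count("image") > 1:
        problems.append("at most 1 image reference is allowed")
    return problems
```

An item that sets all five fields at once, for instance, is rejected with "must set exactly one source, found 5".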

format (string)

Output audio format.

Default: "mp3"

Options: mp3, wav, pcm, ogg_opus

sample_rate (integer)

Output sample rate.

Default: 24000

Options: 8000, 16000, 24000, 32000, 44100, 48000

pitch_rate (integer)

Pitch adjustment. Range [-12, 12], default 0.

Default: 0, Min: -12, Max: 12

speech_rate (integer)

Speech speed adjustment. Range [-50, 100], where 100 means 2.0x and -50 means 0.5x.

Default: 0, Min: -50, Max: 100

loudness_rate (integer)

Loudness adjustment. Range [-50, 100], where 100 means 2.0x loudness and -50 means 0.5x.

Default: 0, Min: -50, Max: 100
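The two documented anchor points for `speech_rate` and `loudness_rate` (100 means 2.0x, -50 means 0.5x) are consistent with a linear mapping in which the default 0 means 1.0x. The source only states the two endpoints, so the linear form below is an assumption, and both function names are hypothetical.

```python
RATE_MIN, RATE_MAX = -50, 100


def rate_to_multiplier(rate):
    """Convert a speech_rate or loudness_rate value to a multiplier (assumed linear)."""
    if not RATE_MIN <= rate <= RATE_MAX:
        raise ValueError(f"rate must be in [{RATE_MIN}, {RATE_MAX}], got {rate}")
    return 1.0 + rate / 100.0


def multiplier_to_rate(multiplier):
    """Convert a multiplier such as 1.25 back to the nearest integer rate."""
    rate = round((multiplier - 1.0) * 100)
    if not RATE_MIN <= rate <= RATE_MAX:
        raise ValueError(f"multiplier {multiplier} is outside the supported range")
    return rate
```

Under this assumption, a request for 1.25x speed would use `speech_rate: 25`.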

Example request body

{
  "model": "bytedance/seed-audio-1.0",
  "text": "A phone vibrates first, then a calm male voice says: Welcome to Seed Audio.",
  "format": "mp3",
  "sample_rate": 24000,
  "pitch_rate": 0,
  "speech_rate": 0,
  "loudness_rate": 0
}
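A request body like the one above can be checked against the documented defaults, enumerations and ranges before it is sent. The sketch below mirrors the input schema in this section; `check_payload` is a hypothetical helper, and the server may apply further checks that are not reproduced here.

```python
FORMATS = {"mp3", "wav", "pcm", "ogg_opus"}
SAMPLE_RATES = {8000, 16000, 24000, 32000, 44100, 48000}
RANGES = {
    "pitch_rate": (-12, 12),
    "speech_rate": (-50, 100),
    "loudness_rate": (-50, 100),
}


def check_payload(payload):
    """Return a list of schema violations for a request body (empty when valid)."""
    problems = []
    for name in ("model", "text"):
        if not payload.get(name):
            problems.append(f"{name} is required")
    # Optional fields fall back to their documented defaults when absent.
    if payload.get("format", "mp3") not in FORMATS:
        problems.append(f"format must be one of {', '.join(sorted(FORMATS))}")
    if payload.get("sample_rate", 24000) not in SAMPLE_RATES:
        problems.append("sample_rate must be one of 8000, 16000, 24000, 32000, 44100, 48000")
    for name, (low, high) in RANGES.items():
        value = payload.get(name, 0)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(value, int) or isinstance(value, bool) or not low <= value <= high:
            problems.append(f"{name} must be an integer in [{low}, {high}]")
    return problems
```

The example request body above passes this check with no problems reported.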

Output schema

The API returns a prediction response containing the URLs of the generated output.

created_at (string)

ISO timestamp of when the request was created.

id (string)

Unique identifier for the prediction.

model (string)

Model ID used for the prediction.

outputs (array)

Array of URLs to the generated audio.

status (string)

Status of the task: created, processing, completed, or failed.
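The output fields listed above map naturally onto a small typed container. The sketch below is one possible shape, with `Prediction` and `from_response` as hypothetical names. It accepts both a body wrapped in `data`, as in the polling examples, and a flat body, since both shapes appear in this document.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Prediction:
    id: str
    status: str
    model: Optional[str] = None
    created_at: Optional[str] = None
    outputs: List[str] = field(default_factory=list)

    @classmethod
    def from_response(cls, body):
        """Build a Prediction from a decoded JSON body, wrapped in "data" or flat."""
        data = body.get("data", body)
        return cls(
            id=data["id"],
            status=data["status"],
            model=data.get("model"),
            created_at=data.get("created_at"),
            outputs=list(data.get("outputs") or []),
        )

    @property
    def done(self):
        """True once the prediction has reached a terminal status."""
        return self.status in {"completed", "succeeded", "failed"}
```

A caller can then write `Prediction.from_response(response.json())` and branch on `done` and `status` instead of indexing nested dictionaries.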

Example response

{
  "id": "pred_abc123",
  "status": "completed",
  "model": "model-name",
  "outputs": [
    "https://storage.atlascloud.ai/outputs/result.mp3"
  ],
  "metrics": {
    "predict_time": 8.3
  },
  "created_at": "2025-01-01T00:00:00Z",
  "completed_at": "2025-01-01T00:00:10Z"
}

Atlas Cloud Skills

Atlas Cloud Skills integrates more than 300 AI models directly into your AI coding assistant. Install it with a single command, then use natural language to generate images and videos and to chat with LLMs.

Supported clients

Claude Code

OpenAI Codex

Gemini CLI

Cursor

Windsurf

VS Code

Trae

GitHub Copilot

Cline

Roo Code

Amp

Goose

Replit

40+ supported clients

Install

npx skills add AtlasCloudAI/atlas-cloud-skills

Set up your API key

Get your API key from the Atlas Cloud dashboard and set it as an environment variable.

export ATLASCLOUD_API_KEY="your-api-key-here"

Capabilities

Once installed, you can use natural language in your AI assistant to access all Atlas Cloud models.

Image generation: Generate images with models such as Nano Banana 2, Z-Image, and more.

Video creation: Create videos from text or images with Kling, Vidu, Veo, and others.

LLM chat: Chat with Qwen, DeepSeek, and other large language models.

Media upload: Upload local files for image-editing and image-to-video workflows.

Learn more

github.com/AtlasCloudAI/atlas-cloud-skills

MCP server

The Atlas Cloud MCP server connects your IDE to more than 300 AI models via the Model Context Protocol. It works with any MCP-compatible client.

Supported clients

Cursor

VS Code

Windsurf

Claude Code

OpenAI Codex

Gemini CLI

Cline

Roo Code

100+ supported clients

Install

npx -y atlascloud-mcp

Configuration

Add the following configuration to your IDE's MCP settings file.

{
  "mcpServers": {
    "atlascloud": {
      "command": "npx",
      "args": [
        "-y",
        "atlascloud-mcp"
      ],
      "env": {
        "ATLASCLOUD_API_KEY": "your-api-key-here"
      }
    }
  }
}

Available tools

atlas_generate_image: Generate images from text prompts.

atlas_generate_video: Create videos from text or images.

atlas_chat: Chat with large language models.

atlas_list_models: Browse more than 300 available AI models.

atlas_quick_generate: One-step content creation with automatic model selection.

atlas_upload_media: Upload local files for API workflows.

Learn more

github.com/AtlasCloudAI/mcp-server

API Schema

{
  "info": {
    "title": "AtlasCloud API",
    "version": "1.0.0",
    "description": "The AtlasCloud API."
  },
  "openapi": "3.0.0",
  "paths": {
    "/api/v1/model/generateAudio": {
      "post": {
        "requestBody": {
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/Input"
              }
            }
          },
          "required": true
        },
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "The request status."
          }
        }
      },
      "x-api-name": "model_run"
    },
    "/api/v1/model/prediction/{request_id}": {
      "get": {
        "parameters": [
          {
            "in": "path",
            "name": "request_id",
            "required": true,
            "schema": {
              "description": "Request ID",
              "type": "string"
            }
          }
        ],
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "Result of the request."
          }
        }
      },
      "x-api-name": "model_result"
    }
  },
  "components": {
    "schemas": {
      "Input": {
        "properties": {
          "model": {
            "type": "string",
            "description": "Model name.",
            "default": "bytedance/seed-audio-1.0",
            "enum": [
              "bytedance/seed-audio-1.0"
            ]
          },
          "text": {
            "description": "Prompt or text to synthesize into audio. Max 2048 characters. For reference audio generation, mention references using the upstream placeholder tokens such as `@audio1`, `@audio2`, and `@audio3`.",
            "type": "string",
            "default": "A phone vibrates first, then a calm male voice says: Welcome to Seed Audio.",
            "x-ui-component": "textarea",
            "maxLength": 2048
          },
          "references": {
            "description": "Optional reference resources. Omit for text-only generation. Provide up to 3 audio references or 1 image reference. Each reference item must contain exactly one of `speaker`, `audio_url`, `audio_data`, `image_url`, or `image_data`. Audio references can be cited in text using the upstream placeholder tokens such as `@audio1`, `@audio2`, and `@audio3`. Do not mix image references with audio references or speaker IDs in the same request.",
            "type": "array",
            "maxItems": 3,
            "items": {
              "description": "One reference resource. Pick exactly ONE source type (preset voice / audio URL / audio Base64 / image URL / image Base64); the others are hidden once one is chosen. Audio: wav/mp3/pcm/ogg_opus, <=30s & 10MB. Image: jpeg/png/webp, one image only, <=10MB. Image references cannot be mixed with audio/speaker references.",
              "type": "object",
              "properties": {
                "speaker": {
                  "description": "Voice ID from Doubao TTS 2.0 or a cloned voice. Mutually exclusive with audio_url and audio_data.",
                  "type": "string",
                  "enum": [
                    "zh_female_vv_uranus_bigtts",
                    "zh_female_xiaohe_uranus_bigtts",
                    "zh_male_m191_uranus_bigtts",
                    "zh_male_taocheng_uranus_bigtts",
                    "zh_male_liufei_uranus_bigtts",
                    "zh_female_sophie_uranus_bigtts",
                    "zh_female_qingxinnvsheng_uranus_bigtts",
                    "zh_female_cancan_uranus_bigtts",
                    "zh_female_tianmeitaozi_uranus_bigtts",
                    "zh_male_ruyayichen_uranus_bigtts"
                  ],
                  "x-ui-component": "voice-select",
                  "x-enum-options": {
                    "zh_female_vv_uranus_bigtts": {
                      "name": "Vivi 2.0",
                      "language": "en",
                      "gender": "female",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_vv_uranus_bigtts.mp3"
                    },
                    "zh_female_xiaohe_uranus_bigtts": {
                      "name": "Xiaohe 2.0",
                      "language": "en",
                      "gender": "female",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_xiaohe_uranus_bigtts.mp3"
                    },
                    "zh_male_m191_uranus_bigtts": {
                      "name": "Yunzhou 2.0",
                      "language": "en",
                      "gender": "male",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_male_m191_uranus_bigtts.mp3"
                    },
                    "zh_male_taocheng_uranus_bigtts": {
                      "name": "Xiaotian 2.0",
                      "language": "en",
                      "gender": "male",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_male_taocheng_uranus_bigtts.mp3"
                    },
                    "zh_male_liufei_uranus_bigtts": {
                      "name": "Liufei 2.0",
                      "language": "en",
                      "gender": "male",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_male_liufei_uranus_bigtts.mp3"
                    },
                    "zh_female_sophie_uranus_bigtts": {
                      "name": "Sophie 2.0",
                      "language": "en",
                      "gender": "female",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_sophie_uranus_bigtts.mp3"
                    },
                    "zh_female_qingxinnvsheng_uranus_bigtts": {
                      "name": "Fresh Voice 2.0",
                      "language": "en",
                      "gender": "female",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_qingxinnvsheng_uranus_bigtts.mp3"
                    },
                    "zh_female_cancan_uranus_bigtts": {
                      "name": "Cancan 2.0",
                      "language": "en",
                      "gender": "female",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_cancan_uranus_bigtts.mp3"
                    },
                    "zh_female_tianmeitaozi_uranus_bigtts": {
                      "name": "Sweet Peach 2.0",
                      "language": "en",
                      "gender": "female",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_tianmeitaozi_uranus_bigtts.mp3"
                    },
                    "zh_male_ruyayichen_uranus_bigtts": {
                      "name": "Yichen 2.0",
                      "language": "en",
                      "gender": "male",
                      "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_male_ruyayichen_uranus_bigtts.mp3"
                    }
                  },
                  "title": "Preset voice",
                  "x-ui-exclusive-group": "ref_source",
                  "x-ui-source-type": "audio"
                },
                "audio_url": {
                  "description": "Reference audio URL. Max 30s and 10MB per file. Supports wav/mp3/pcm/ogg_opus. Mutually exclusive with speaker and audio_data.",
                  "type": "string",
                  "x-ui-component": "uploader",
                  "title": "Audio URL",
                  "x-ui-exclusive-group": "ref_source",
                  "x-ui-source-type": "audio"
                },
                "audio_data": {
                  "description": "Reference audio as Base64. Max 30s and 10MB per file. Supports wav/mp3/pcm/ogg_opus. Mutually exclusive with speaker and audio_url.",
                  "type": "string",
                  "x-ui-component": "textarea",
                  "title": "Audio (Base64)",
                  "x-ui-exclusive-group": "ref_source",
                  "x-hidden": true,
                  "x-ui-source-type": "audio"
                },
                "image_url": {
                  "description": "Reference image URL. Mutually exclusive with image_data and cannot be mixed with audio references.",
                  "type": "string",
                  "x-ui-component": "uploader",
                  "title": "Image URL",
                  "x-ui-exclusive-group": "ref_source",
                  "x-ui-source-type": "image"
                },
                "image_data": {
                  "description": "Reference image as Base64. Mutually exclusive with image_url and cannot be mixed with audio references.",
                  "type": "string",
                  "x-ui-component": "textarea",
                  "title": "Image (Base64)",
                  "x-ui-exclusive-group": "ref_source",
                  "x-hidden": true,
                  "x-ui-source-type": "image"
                }
              },
              "x-ui-exclusive": "ref_source"
            },
            "x-ui-cross-item-exclusive-by": "x-ui-source-type",
            "x-ui-source-type-max-items": {
              "audio": 3,
              "image": 1
            }
          },
          "format": {
            "description": "Output audio format.",
            "type": "string",
            "default": "mp3",
            "enum": [
              "mp3",
              "wav",
              "pcm",
              "ogg_opus"
            ],
            "x-ui-component": "select"
          },
          "sample_rate": {
            "description": "Output sample rate.",
            "type": "integer",
            "default": 24000,
            "enum": [
              8000,
              16000,
              24000,
              32000,
              44100,
              48000
            ],
            "x-ui-component": "select"
          },
          "pitch_rate": {
            "description": "Pitch adjustment. Range [-12, 12], default 0.",
            "type": "integer",
            "default": 0,
            "minimum": -12,
            "maximum": 12,
            "x-ui-component": "slider"
          },
          "speech_rate": {
            "description": "Speech speed adjustment. Range [-50, 100], where 100 means 2.0x and -50 means 0.5x.",
            "type": "integer",
            "default": 0,
            "minimum": -50,
            "maximum": 100,
            "x-ui-component": "slider"
          },
          "loudness_rate": {
            "description": "Loudness adjustment. Range [-50, 100], where 100 means 2.0x loudness and -50 means 0.5x.",
            "type": "integer",
            "default": 0,
            "minimum": -50,
            "maximum": 100,
            "x-ui-component": "slider"
          }
        },
        "required": [
          "model",
          "text"
        ],
        "type": "object",
        "x-order-properties": [
          "model",
          "text",
          "references",
          "format",
          "sample_rate",
          "pitch_rate",
          "speech_rate",
          "loudness_rate"
        ]
      },
      "Reference": {
        "description": "One reference resource for Seed Audio. Use exactly one of speaker, audio_url, audio_data, image_url, or image_data per item. Audio supports wav/mp3/pcm/ogg_opus, up to 30 seconds and 10 MB per item. Image supports jpeg/png/webp, one image only, up to 10 MB.",
        "type": "object",
        "properties": {
          "speaker": {
            "description": "Voice ID from Doubao TTS 2.0 or a cloned voice. Mutually exclusive with audio_url and audio_data.",
            "type": "string",
            "enum": [
              "zh_female_vv_uranus_bigtts",
              "zh_female_xiaohe_uranus_bigtts",
              "zh_male_m191_uranus_bigtts",
              "zh_male_taocheng_uranus_bigtts",
              "zh_male_liufei_uranus_bigtts",
              "zh_female_sophie_uranus_bigtts",
              "zh_female_qingxinnvsheng_uranus_bigtts",
              "zh_female_cancan_uranus_bigtts",
              "zh_female_tianmeitaozi_uranus_bigtts",
              "zh_male_ruyayichen_uranus_bigtts"
            ],
            "x-ui-component": "voice-select",
            "x-enum-options": {
              "zh_female_vv_uranus_bigtts": {
                "name": "Vivi 2.0",
                "language": "en",
                "gender": "female",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_vv_uranus_bigtts.mp3"
              },
              "zh_female_xiaohe_uranus_bigtts": {
                "name": "Xiaohe 2.0",
                "language": "en",
                "gender": "female",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_xiaohe_uranus_bigtts.mp3"
              },
              "zh_male_m191_uranus_bigtts": {
                "name": "Yunzhou 2.0",
                "language": "en",
                "gender": "male",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_male_m191_uranus_bigtts.mp3"
              },
              "zh_male_taocheng_uranus_bigtts": {
                "name": "Xiaotian 2.0",
                "language": "en",
                "gender": "male",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_male_taocheng_uranus_bigtts.mp3"
              },
              "zh_male_liufei_uranus_bigtts": {
                "name": "Liufei 2.0",
                "language": "en",
                "gender": "male",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_male_liufei_uranus_bigtts.mp3"
              },
              "zh_female_sophie_uranus_bigtts": {
                "name": "Sophie 2.0",
                "language": "en",
                "gender": "female",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_sophie_uranus_bigtts.mp3"
              },
              "zh_female_qingxinnvsheng_uranus_bigtts": {
                "name": "Fresh Voice 2.0",
                "language": "en",
                "gender": "female",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_qingxinnvsheng_uranus_bigtts.mp3"
              },
              "zh_female_cancan_uranus_bigtts": {
                "name": "Cancan 2.0",
                "language": "en",
                "gender": "female",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_cancan_uranus_bigtts.mp3"
              },
              "zh_female_tianmeitaozi_uranus_bigtts": {
                "name": "Sweet Peach 2.0",
                "language": "en",
                "gender": "female",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_female_tianmeitaozi_uranus_bigtts.mp3"
              },
              "zh_male_ruyayichen_uranus_bigtts": {
                "name": "Yichen 2.0",
                "language": "en",
                "gender": "male",
                "example": "https://static.atlascloud.ai/model/example/seed-audio/voices/en/zh_male_ruyayichen_uranus_bigtts.mp3"
              }
            },
            "x-ui-source-type": "audio"
          },
          "audio_url": {
            "description": "Reference audio URL. Mutually exclusive with speaker and audio_data.",
            "type": "string",
            "x-ui-component": "uploader",
            "x-ui-source-type": "audio"
          },
          "audio_data": {
            "description": "Reference audio as Base64. Mutually exclusive with speaker and audio_url.",
            "type": "string",
            "x-ui-component": "textarea",
            "x-hidden": true,
            "x-ui-source-type": "audio"
          },
          "image_url": {
            "description": "Reference image URL. Mutually exclusive with image_data and cannot be mixed with audio references.",
            "type": "string",
            "x-ui-component": "uploader",
            "x-ui-source-type": "image"
          },
          "image_data": {
            "description": "Reference image as Base64. Mutually exclusive with image_url and cannot be mixed with audio references.",
            "type": "string",
            "x-ui-component": "textarea",
            "x-hidden": true,
            "x-ui-source-type": "image"
          }
        }
      },
      "PredictionResponse": {
        "properties": {
          "created_at": {
            "description": "ISO timestamp of when the request was created.",
            "format": "date-time",
            "type": "string"
          },
          "id": {
            "description": "Unique identifier for the prediction.",
            "type": "string"
          },
          "model": {
            "description": "Model ID used for the prediction.",
            "type": "string"
          },
          "outputs": {
            "description": "Array of URLs to the generated audio.",
            "items": {
              "type": "string",
              "x-format": "audio"
            },
            "type": "array"
          },
          "status": {
            "description": "Status of the task: created, processing, completed, or failed.",
            "type": "string"
          },
          "urls": {
            "description": "Object containing related API endpoints.",
            "type": "object"
          }
        },
        "type": "object"
      }
    }
  },
  "servers": [
    {
      "url": "https://api.atlascloud.ai"
    }
  ]
}

LLM-friendly prompt template

# bytedance/seed-audio-1.0

> Doubao‑Audio‑Generate‑1.0 is Doubao Voice’s next‑generation audio‑generation engine. The industry‑first commercial tool creates film‑grade audio with just one prompt.
It eliminates cumbersome audio‑engineering work. Creators generate publish‑ready radio dramas, podcasts and branded audio easily, shifting from a simple voice‑generator to an AI audio director. It serves audiobooks, serialized episodes and commercial audio for high‑quality narrative‑driven production.


## Overview

- **Submit endpoint (POST)**: `https://api.atlascloud.ai/api/v1/model/generateAudio` — start an async generation; returns a `prediction_id`
- **Poll endpoint (GET)**: `https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}` — poll this until the prediction finishes
- **Model ID**: `bytedance/seed-audio-1.0`


## API Information

This model can be used via our HTTP API or more conveniently via our client libraries.
See the input and output schema below, as well as the usage examples.


### Input Schema

The API accepts the following input parameters:

- **`model`** (`string`, _required_):
  Model name.
  - Default: `"bytedance/seed-audio-1.0"`
  - Options: "bytedance/seed-audio-1.0"

- **`text`** (`string`, _required_):
  Prompt or text to synthesize into audio. Max 2048 characters. For reference audio generation, mention references using the upstream placeholder tokens such as `@audio1`, `@audio2`, and `@audio3`.
  - Default: `"A phone vibrates first, then a calm male voice says: Welcome to Seed Audio."`

- **`references`** (`array[object]`, _optional_):
  Optional reference resources. Omit for text-only generation. Provide up to 3 audio references or 1 image reference. Each reference item must contain exactly one of `speaker`, `audio_url`, `audio_data`, `image_url`, or `image_data`. Audio references can be cited in text using the upstream placeholder tokens such as `@audio1`, `@audio2`, and `@audio3`. Do not mix image references with audio references or speaker IDs in the same request.
  - Max items: 3
  - Item properties:
    - **`speaker`** (`string`, _optional_):
      Voice ID from Doubao TTS 2.0 or a cloned voice. Mutually exclusive with audio_url and audio_data.
      - Options: "zh_female_vv_uranus_bigtts", "zh_female_xiaohe_uranus_bigtts", "zh_male_m191_uranus_bigtts", "zh_male_taocheng_uranus_bigtts", "zh_male_liufei_uranus_bigtts", "zh_female_sophie_uranus_bigtts", "zh_female_qingxinnvsheng_uranus_bigtts", "zh_female_cancan_uranus_bigtts", "zh_female_tianmeitaozi_uranus_bigtts", "zh_male_ruyayichen_uranus_bigtts"

    - **`audio_url`** (`string`, _optional_):
      Reference audio URL. Max 30s and 10MB per file. Supports wav/mp3/pcm/ogg_opus. Mutually exclusive with speaker and audio_data.

    - **`audio_data`** (`string`, _optional_):
      Reference audio as Base64. Max 30s and 10MB per file. Supports wav/mp3/pcm/ogg_opus. Mutually exclusive with speaker and audio_url.

    - **`image_url`** (`string`, _optional_):
      Reference image URL. Mutually exclusive with image_data and cannot be mixed with audio references.

    - **`image_data`** (`string`, _optional_):
      Reference image as Base64. Mutually exclusive with image_url and cannot be mixed with audio references.


- **`format`** (`string`, _optional_):
  Output audio format.
  - Default: `"mp3"`
  - Options: "mp3", "wav", "pcm", "ogg_opus"

- **`sample_rate`** (`integer`, _optional_):
  Output sample rate.
  - Default: `24000`
  - Options: 8000, 16000, 24000, 32000, 44100, 48000

- **`pitch_rate`** (`integer`, _optional_):
  Pitch adjustment. Range [-12, 12], default 0.
  - Default: `0`
  - Min: -12
  - Max: 12

- **`speech_rate`** (`integer`, _optional_):
  Speech speed adjustment. Range [-50, 100], where 100 means 2.0x and -50 means 0.5x.
  - Default: `0`
  - Min: -50
  - Max: 100

- **`loudness_rate`** (`integer`, _optional_):
  Loudness adjustment. Range [-50, 100], where 100 means 2.0x loudness and -50 means 0.5x.
  - Default: `0`
  - Min: -50
  - Max: 100
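
The schema gives two anchor points for `speech_rate` and `loudness_rate`: 100 means 2.0x and -50 means 0.5x. The sketch below assumes the mapping between those points is linear (multiplier = 1 + rate / 100), which fits both anchors and the default of 0 but is not confirmed by the source. The helper names `rate_to_multiplier` and `multiplier_to_rate` are hypothetical.

```python
def rate_to_multiplier(rate: int, lo: int = -50, hi: int = 100) -> float:
    """Map a speech_rate or loudness_rate value to a multiplier.

    Assumption: the mapping is linear between the documented anchors.
    """
    if not lo <= rate <= hi:
        raise ValueError(f"rate must be in [{lo}, {hi}], got {rate}")
    return 1.0 + rate / 100.0


def multiplier_to_rate(multiplier: float, lo: int = -50, hi: int = 100) -> int:
    """Inverse helper: the integer rate closest to a desired multiplier."""
    rate = round((multiplier - 1.0) * 100)
    if not lo <= rate <= hi:
        raise ValueError(f"multiplier {multiplier} is outside the supported range")
    return rate
```

For example, under this assumption a request for roughly 1.15x speed would use `speech_rate = multiplier_to_rate(1.15)`.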



**Required Parameters Example**:

```json
{
  "model": "bytedance/seed-audio-1.0",
  "text": "A phone vibrates first, then a calm male voice says: Welcome to Seed Audio."
}
```


**Full Example**:

```json
{
  "model": "bytedance/seed-audio-1.0",
  "text": "A phone vibrates first, then a calm male voice says: Welcome to Seed Audio.",
  "references": [
    {
      "speaker": "zh_female_vv_uranus_bigtts"
    }
  ],
  "format": "mp3",
  "sample_rate": 24000,
  "pitch_rate": 0,
  "speech_rate": 0,
  "loudness_rate": 0
}
```
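
The reference rules in the input schema (exactly one field per item, at most 3 items, at most 1 image, no mixing of image and audio or speaker references) can be checked on the client before a request is sent. The following is a minimal sketch of such a check, not the API's own validation. `validate_references` and `build_payload` are hypothetical helper names, and treating empty strings as "not set" is an assumption.

```python
AUDIO_KEYS = ("speaker", "audio_url", "audio_data")
IMAGE_KEYS = ("image_url", "image_data")


def validate_references(references: list) -> None:
    """Raise ValueError if the references array breaks the documented rules."""
    if len(references) > 3:
        raise ValueError("at most 3 reference items are allowed")
    kinds = []
    for index, ref in enumerate(references):
        # Assumption of this sketch: an empty string counts as "not set".
        present = [key for key in AUDIO_KEYS + IMAGE_KEYS if ref.get(key)]
        if len(present) != 1:
            raise ValueError(
                f"references[{index}] must set exactly one field, got {present}"
            )
        kinds.append("image" if present[0] in IMAGE_KEYS else "audio")
    if "image" in kinds and "audio" in kinds:
        raise ValueError("image references cannot be mixed with audio or speaker references")
    if kinds.count("image") > 1:
        raise ValueError("at most 1 image reference is allowed")


def build_payload(text: str, references: list = None, **options) -> dict:
    """Assemble a request body, checking text length and references first."""
    if not text or len(text) > 2048:
        raise ValueError("text is required and limited to 2048 characters")
    payload = {"model": "bytedance/seed-audio-1.0", "text": text, **options}
    if references:
        validate_references(references)
        payload["references"] = references
    return payload
```

A call such as `build_payload("Hello", [{"speaker": "zh_female_vv_uranus_bigtts"}], format="wav")` returns a dictionary that can be serialized as the JSON request body.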


### Output Schema

The API returns the following output format:


- **`created_at`** (`string`, _optional_):
  ISO timestamp of when the request was created.

- **`id`** (`string`, _optional_):
  Unique identifier for the prediction.

- **`model`** (`string`, _optional_):
  Model ID used for the prediction.

- **`outputs`** (`array[string]`, _optional_):
  Array of URLs to the generated audio.

- **`status`** (`string`, _optional_):
  Status of the task: created, processing, completed, or failed.

- **`urls`** (`object`, _optional_):
  Object containing related API endpoints.



**Example Response**:

```json
{
  "created_at": "",
  "id": "",
  "model": "",
  "outputs": [
    ""
  ],
  "status": "",
  "urls": {}
}
```
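
Because every field in the output schema is optional, client code benefits from reading responses defensively. The sketch below parses a response body into a small dataclass. It assumes the body may either be the flat object shown above or be wrapped as `{"code": ..., "data": {...}}`, as the comment in the cURL example suggests. `Prediction` and `parse_prediction` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Prediction:
    id: str = ""
    model: str = ""
    status: str = ""
    created_at: str = ""
    outputs: list = field(default_factory=list)
    urls: dict = field(default_factory=dict)


def parse_prediction(body: dict) -> Prediction:
    """Build a Prediction from a response body, tolerating missing fields."""
    # Assumption: unwrap a {"code": ..., "data": {...}} envelope when present.
    inner = body.get("data")
    data = inner if isinstance(inner, dict) else body
    return Prediction(
        id=data.get("id") or "",
        model=data.get("model") or "",
        status=data.get("status") or "",
        created_at=data.get("created_at") or "",
        outputs=list(data.get("outputs") or []),
        urls=dict(data.get("urls") or {}),
    )
```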


## Usage Examples

### cURL

```bash
# Step 1: Start generation (async)
curl -X POST "https://api.atlascloud.ai/api/v1/model/generateAudio" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "bytedance/seed-audio-1.0",
  "text": "A phone vibrates first, then a calm male voice says: Welcome to Seed Audio.",
  "references": [
    {
      "speaker": "zh_female_vv_uranus_bigtts"
    }
  ],
  "format": "mp3",
  "sample_rate": 24000,
  "pitch_rate": 0,
  "speech_rate": 0,
  "loudness_rate": 0
}'

# Response will contain: {"code": 200, "data": {"id": "prediction_id", "status": "processing"}}

# Step 2: Poll for result (replace {prediction_id} with the id returned above)
curl -X GET "https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY"

# Keep polling until status is "completed", "succeeded" or "failed"
# When completed, outputs will contain the generated content URL(s)
```
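
The polling step above can be sketched in Python using only the standard library. This is an illustration under stated assumptions rather than an official client: the helper names `make_fetch` and `poll_until_done`, the 2-second interval, and the 300-second timeout are all hypothetical, and the `{"data": {...}}` envelope is inferred from the response comment in the cURL example.

```python
import json
import time
import urllib.request

# Terminal statuses as listed in the cURL comments above.
TERMINAL_STATUSES = {"completed", "succeeded", "failed"}


def make_fetch(prediction_id: str, api_key: str):
    """Return a zero-argument function that GETs the prediction once."""
    url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"

    def fetch() -> dict:
        request = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {api_key}"}
        )
        with urllib.request.urlopen(request, timeout=30) as response:
            body = json.load(response)
        # Assumption: the payload may be wrapped in a "data" envelope.
        inner = body.get("data")
        return inner if isinstance(inner, dict) else body

    return fetch


def poll_until_done(fetch, interval=2.0, timeout=300.0,
                    sleep=time.sleep, clock=time.monotonic) -> dict:
    """Call fetch() until the status is terminal or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        data = fetch()
        if data.get("status") in TERMINAL_STATUSES:
            return data
        if clock() >= deadline:
            raise TimeoutError("prediction did not finish before the timeout")
        sleep(interval)
```

Typical usage would be `result = poll_until_done(make_fetch(prediction_id, api_key))`, after which `result.get("outputs")` holds the generated audio URL(s) if the status is not `failed`. Passing `fetch`, `sleep`, and `clock` as parameters keeps the loop easy to test without network access.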

## Additional Resources

### Documentation

- [Model Playground](https://www.atlascloud.ai/models/bytedance/seed-audio-1.0)

## Seed Audio 1.0

Seed Audio 1.0 is ByteDance's audio generation model for producing speech from text prompts, with optional reference audio, speaker, or image inputs. It is exposed on AtlasCloud through the standard asynchronous audio generation API.

### Highlights

- **Text-to-speech generation**: Convert text prompts into speech audio.
- **Reference audio control**: Provide up to three reference audios or a speaker ID to guide the voice, tone, or delivery. Refer to them in the prompt using the upstream placeholder tokens `@audio1`, `@audio2`, and `@audio3`.
- **Reference image control**: Provide one reference image to guide the generated audio style or character context.
- **Reference exclusivity**: Each reference item must contain exactly one of `speaker`, `audio_url`, `audio_data`, `image_url`, or `image_data`. The upstream API rejects mixed audio + image references in the same request.
- **Audio format control**: Generate `mp3`, `wav`, `pcm`, or `ogg_opus`.
- **Sample rate control**: Choose common output sample rates from 8000 to 48000.
- **Speech controls**: Adjust pitch, speech speed, and loudness with optional rate parameters.
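
Since prompts cite reference audio through `@audio1`, `@audio2`, and `@audio3`, a quick client-side check can catch a prompt that mentions a token with no matching reference. The sketch below only extracts the token numbers and compares them to a count supplied by the caller. How the upstream API numbers references (for instance, whether speaker IDs count) is not stated in the source, so that decision is left to the caller. Both function names are hypothetical.

```python
import re

AUDIO_TOKEN = re.compile(r"@audio(\d+)")


def find_audio_tokens(text: str) -> list:
    """Return the distinct N values of every @audioN token, sorted."""
    return sorted({int(number) for number in AUDIO_TOKEN.findall(text)})


def unresolved_tokens(text: str, audio_reference_count: int) -> list:
    """Return token numbers that fall outside 1..audio_reference_count."""
    return [
        number
        for number in find_audio_tokens(text)
        if number < 1 or number > audio_reference_count
    ]
```

An empty list from `unresolved_tokens(text, len(audio_refs))` means every token in the prompt has a candidate reference to point at.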

### Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| `model` | Yes | Use `bytedance/seed-audio-1.0`. |
| `text` | Yes | Text prompt to synthesize into speech. |
| `references` | No | Optional array of reference inputs. Use `audio_url`, `audio_data`, or `speaker` for audio/voice references; use `image_url` or `image_data` for image references. Do not mix image references with audio or speaker references. |
| `format` | No | Output audio format. Default: `mp3`. |
| `sample_rate` | No | Output sample rate. Default: `24000`. |
| `pitch_rate` | No | Pitch adjustment. Default: `0`. |
| `speech_rate` | No | Speech speed adjustment. Default: `0`. |
| `loudness_rate` | No | Loudness adjustment. Default: `0`. |

### Example Request

Full-featured example with a reference audio and all tunable controls:

{
  "model": "bytedance/seed-audio-1.0",
  "text": "Use the voice and delivery of @audio1 and say in natural, clear English with a light broadcast tone: Welcome to Seed Audio. This is the most complete reference example.",
  "references": [
    {
      "audio_url": "https://static.atlascloud.ai/model/example/bytedance-seed-audio-1.0.mp3"
    }
  ],
  "format": "mp3",
  "sample_rate": 44100,
  "pitch_rate": 2,
  "speech_rate": 15,
  "loudness_rate": 10
}

Text-only example:

{
  "model": "bytedance/seed-audio-1.0",
  "text": "Hello, this is a Seed Audio text-to-speech test.",
  "format": "mp3",
  "sample_rate": 24000,
  "pitch_rate": 0,
  "speech_rate": 0,
  "loudness_rate": 0
}

With a reference audio:

{
  "model": "bytedance/seed-audio-1.0",
  "text": "Use the voice and delivery of @audio1 and say: The city sounds especially quiet today.",
  "references": [
    {
      "audio_url": "https://static.atlascloud.ai/model/example/bytedance-seed-audio-1.0.mp3"
    }
  ],
  "format": "mp3",
  "sample_rate": 24000
}

With a reference image:

{
  "model": "bytedance/seed-audio-1.0",
  "text": "Using the mood and character context from the reference image, say in a bright, youthful tone: After the rain stopped, the street lit up again.",
  "references": [
    {
      "image_url": "https://static.atlascloud.ai/uploads/models/ebeecbb1-1904-464c-ad24-6a631fa83ab6.png"
    }
  ],
  "format": "mp3",
  "sample_rate": 24000
}

### Pricing

Seed Audio 1.0 is billed by input text length.

| Unit | Price |
| --- | --- |
| Per 1,000 characters | `$0.015` |
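
As a rough illustration of the pricing above, the sketch below estimates the cost of a prompt from its length. It assumes that a "character" is one Python string character (a Unicode code point) and that billing is proportional with no rounding or minimum charge, none of which the source confirms. `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Listed price: $0.015 per 1,000 characters of input text.
PRICE_PER_1000_CHARS = Decimal("0.015")


def estimate_cost(text: str) -> Decimal:
    """Estimate the USD cost of synthesizing `text` (unrounded, proportional)."""
    return Decimal(len(text)) / Decimal(1000) * PRICE_PER_1000_CHARS
```

Under these assumptions, a prompt at the 2048-character limit would cost about $0.031.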
