xai/tts-v1

text-to-audio

xAI TTS v1 API by xAI

xAI TTS v1 is a high-fidelity text-to-speech model that converts text into natural, expressive speech with sub-second latency, supporting 20 languages and 80+ voices with fine-grained delivery control.

Code example

import os
import time

import requests

# Read the API key from the environment (see Authentication).
# "$ATLASCLOUD_API_KEY" is not expanded inside a Python string.
API_KEY = os.environ["ATLASCLOUD_API_KEY"]

# Step 1: Start audio generation
generate_url = "https://api.atlascloud.ai/api/v1/model/generateAudio"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}
data = {
    "model": "xai/tts-v1",  # Required. model name
    "text": "example_value",  # Required. The text to convert to speech
    "language": "auto",  # Required. BCP-47 language code or 'auto' for automatic language detection
    "voice_id": "eve",  # Voice identifier
    "codec": "mp3",  # Audio codec for the output. options: mp3 | wav | pcm | mulaw | alaw
    "sample_rate": 24000,  # Sample rate in Hz. options: 8000 | 16000 | 22050 | 24000 | 44100 | 48000
    "bit_rate": 128000,  # Bit rate in bps. options: 32000 | 64000 | 96000 | 128000 | 192000
    "speed": 1,  # Speech speed multiplier. (min: 0.7, max: 1.5)
    "text_normalization": False,  # Enable text normalization before synthesis
    "optimize_streaming_latency": 0,  # Latency optimization level for streaming synthesis. options: 0 | 1 | 2
}

generate_response = requests.post(generate_url, headers=headers, json=data)
generate_result = generate_response.json()
prediction_id = generate_result["data"]["id"]

# Step 2: Poll for result
poll_url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"

def check_status():
    while True:
        response = requests.get(poll_url, headers={"Authorization": f"Bearer {API_KEY}"})
        result = response.json()
        status = result["data"]["status"]

        if status in ("completed", "succeeded"):
            print("Generated audio:", result["data"]["outputs"][0])
            return result["data"]["outputs"][0]
        elif status == "failed":
            # .get avoids a KeyError when the error field is absent
            raise RuntimeError(result["data"].get("error") or "Generation failed")
        else:
            # Still processing, wait 2 seconds
            time.sleep(2)

audio_url = check_status()

Install

Install the required dependency.

pip install requests

Authentication

All API requests require authentication with an API key. You can get your API key from the Atlas Cloud dashboard.

export ATLASCLOUD_API_KEY="your-api-key-here"

HTTP headers

import os

API_KEY = os.environ.get("ATLASCLOUD_API_KEY")
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}

Protect your API key

Never expose your API key in client-side code or public repositories. Use environment variables or a backend proxy instead.
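
The advice above can be sketched as a small helper that reads the key from the environment and fails early when it is missing. The names `load_api_key` and `auth_headers` and the error message are illustrative assumptions, not part of the Atlas Cloud API.

```python
import os


def load_api_key(env=None, var_name="ATLASCLOUD_API_KEY"):
    """Return the API key from the environment, or raise if it is not set."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before running.")
    return key


def auth_headers(api_key):
    """Build the request headers used throughout this page from an API key."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```

Calling `auth_headers(load_api_key())` once at startup keeps the key out of the source file and surfaces a missing variable before any request is sent.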

Send a request

import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateAudio"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; "$ATLASCLOUD_API_KEY" is not expanded inside a Python string.
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}
data = {
    "model": "your-model",
    "text": "Hello, welcome to AtlasCloud text-to-speech."
}

response = requests.post(url, headers=headers, json=data)
print(response.json())

Send a request

Send an asynchronous generation request. The API returns a prediction ID that you can use to check the status and retrieve the result.

POST /api/v1/model/generateAudio

Request body

import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateAudio"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; "$ATLASCLOUD_API_KEY" is not expanded inside a Python string.
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}

data = {
    "model": "xai/tts-v1",
    "text": "Hello, welcome to AtlasCloud text-to-speech.",
    "language": "auto"  # listed as required in the input schema
}

response = requests.post(url, headers=headers, json=data)
result = response.json()

print(f"Prediction ID: {result['data']['id']}")
print(f"Status: {result['data']['status']}")

Response

{
  "code": 200,
  "data": {
    "id": "pred_abc123",
    "status": "processing",
    "model": "model-name",
    "created_at": "2025-01-01T00:00:00Z"
  }
}

Check the status

Query the prediction endpoint to check the current status of your request.

GET /api/v1/model/prediction/{prediction_id}

Polling example

import os
import time

import requests

prediction_id = "pred_abc123"
url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"
# Read the key from the environment; "$ATLASCLOUD_API_KEY" is not expanded inside a Python string.
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}

while True:
    response = requests.get(url, headers=headers)
    result = response.json()
    status = result["data"]["status"]
    print(f"Status: {status}")

    if status in ("completed", "succeeded"):
        output_url = result["data"]["outputs"][0]
        print(f"Output URL: {output_url}")
        break
    elif status == "failed":
        print(f"Error: {result['data'].get('error', 'Unknown')}")
        break

    time.sleep(3)

Status values

processing: The request is still being processed.

completed: Generation completed. Outputs are available.

succeeded: Generation succeeded. Outputs are available.

failed: Generation failed. Check the error field.
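
The polling loops on this page run until a terminal status arrives, so a request that never finishes would block forever. A minimal sketch of a bounded variant follows. The function name `poll_until_done`, the injected `fetch` callable, and the default interval and timeout are assumptions chosen for illustration; only the status strings and the `data`/`outputs`/`error` fields come from this page.

```python
import time

TERMINAL_OK = {"completed", "succeeded"}


def poll_until_done(fetch, interval=2.0, timeout=120.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until the prediction reaches a terminal status.

    fetch must return the decoded JSON body of the prediction endpoint.
    Returns the list of output URLs on success.
    """
    deadline = clock() + timeout
    while True:
        data = fetch().get("data", {})
        status = data.get("status")
        if status in TERMINAL_OK:
            return data.get("outputs", [])
        if status == "failed":
            raise RuntimeError(data.get("error") or "Generation failed")
        if clock() >= deadline:
            raise TimeoutError(f"Prediction still '{status}' after {timeout} s")
        sleep(interval)
```

With `requests`, `fetch` could be `lambda: requests.get(url, headers=headers).json()`. Passing the fetch, sleep, and clock functions in keeps the loop testable without network access.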

Completed response

{
  "data": {
    "id": "pred_abc123",
    "status": "completed",
    "outputs": [
      "https://storage.atlascloud.ai/outputs/result.mp3"
    ],
    "metrics": {
      "predict_time": 8.3
    },
    "created_at": "2025-01-01T00:00:00Z",
    "completed_at": "2025-01-01T00:00:10Z"
  }
}
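
Once the status is completed, the `outputs` array holds the URL of the generated audio. The sketch below saves such a URL to disk with the standard library. It assumes the output URL can be fetched without an Authorization header, which this page does not state; the helper names and the `output.mp3` fallback are illustrative.

```python
import os
import posixpath
import urllib.request
from urllib.parse import urlparse


def filename_from_url(url, default="output.mp3"):
    """Derive a local file name from the last path segment of the URL."""
    name = posixpath.basename(urlparse(url).path)
    return name or default


def download_output(url, dest_dir="."):
    """Download one output URL into dest_dir and return the local path."""
    path = os.path.join(dest_dir, filename_from_url(url))
    with urllib.request.urlopen(url) as resp, open(path, "wb") as fh:
        fh.write(resp.read())
    return path
```

For the response above, `download_output(result["data"]["outputs"][0])` would write a file named `result.mp3` in the current directory.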

Input schema

The following parameters are accepted in the request body.

Total: 10, Required: 3, Optional: 7

model (string, required)

Model name.

Default: "xai/tts-v1"

text (string, required)

The text to convert to speech. Maximum 15,000 characters. Supports inline instant tags (e.g. [pause]) and wrapping style tags (e.g. <whisper>text</whisper>).
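
A short sketch of how tagged text might be assembled and checked against the length limit before sending. The tag names used here (`[pause]`, `[laugh]`, `<whisper>`) appear in the API schema on this page; the helpers `wrap` and `check_length` are hypothetical, and counting tag characters toward the 15,000 limit is an assumption, since the description does not say how tags are counted.

```python
MAX_TEXT_CHARS = 15_000  # limit stated in the parameter description


def wrap(tag, text):
    """Wrap text in a style tag, for example <whisper>...</whisper>."""
    return f"<{tag}>{text}</{tag}>"


def check_length(text):
    """Raise if the text exceeds the documented character limit."""
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(
            f"text has {len(text)} characters; limit is {MAX_TEXT_CHARS}"
        )
    return text


text = check_length(
    "Welcome back. [pause] "
    + wrap("whisper", "This part is a secret.")
    + " [laugh]"
)
print(text)
```

The resulting string is passed unchanged as the `text` field of the request body.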

language (string, required)

BCP-47 language code or 'auto' for automatic language detection. Case-insensitive.

Default: "auto"

Options: auto, en, zh, ar-EG, ar-SA, ar-AE, bn, fr, de, hi, id, it, ja, ko, pt-BR, pt-PT, ru, es-MX, es-ES, tr, vi

voice_id (string)

Voice identifier. Multilingual voices (ara, eve, leo, rex, sal) support all languages; other voices are optimized for their native language and will cascade-update the language field.

Default: "eve"

Options: ara, eve, leo, rex, sal, jpi39icg, d18jlf6v, 33g9t0jl, 26w6ihxi, dr8gqysu, wy0m9l5w, om17cury, x7avnu1k, bcs7l2c3, hqxr4yub, h27ltdnz, 89q2pnko, 73xd5dum, 0p0rt7o1, hbxkrnwm, 69smp8rm, yis75yfp, ekhwx401, jupvcf34, 0hhfxxqq, 0ih5oi34, gwnexu6y, 97zmdc6s, fc7de6afcf6c, 7a9ee820b342, 0895a5b8ce5c, f8cf5c2c78d4, 96819d0bd28d, 79f3a8b96d43, 78a495fdbb39, e22152e06fd8, 490ea3be50b1, 1f046a033914, dfe7b9e7d217, c3a2c594479e, 83c6f4fea98e, 34fd4dce1ba3, d634b6da3d3b, 670a0c3ac005, 182a91893636, d0cb9ff07d95, b1a7441b97a1, bf9fe5b5f981, b5ae17439907, a0401c9101f8, 23be42535a45, abfbdf26f115, 6da5baee46d0, 3d030bc92a87, a13662ba951c, 58d27475085e, 247783ebdd51, 244e27b39200, 97fabd54445f, 37329fd8895a, 2badb5f46b1e, 1b12d5daee6b, 908c4626660f, 4ff93971bfdc, 70013edeb8e8, 35c8d7f60dc8, 23468361b4ef, 458705c07139, 41321eb41295, 40f31906b23d, 3a7889066fa2

codec (string)

Audio codec for the output.

Default: "mp3"

Options: mp3, wav, pcm, mulaw, alaw

sample_rate (integer)

Sample rate in Hz.

Default: 24000

Options: 8000, 16000, 22050, 24000, 44100, 48000

bit_rate (integer)

Bit rate in bps. Applies to MP3 codec only.

Default: 128000

Options: 32000, 64000, 96000, 128000, 192000

speed (number)

Speech speed multiplier. 1.0 is normal speed. Values below 1.0 slow down speech, values above 1.0 speed it up.

Default: 1, Min: 0.7, Max: 1.5

text_normalization (boolean)

Enable text normalization before synthesis. When enabled, the model normalizes written-form text (e.g. numbers, abbreviations, symbols) into spoken-form before generating audio.

Default: false

optimize_streaming_latency (integer)

Latency optimization level for streaming synthesis. 0 (default): No optimization — best audio quality. 1: Reduced first-chunk size for lower time-to-first-audio, with minor quality tradeoff at chunk boundaries. 2: Further reduced first-chunk size for lowest time-to-first-audio, with more noticeable quality tradeoff at chunk boundaries.

Default: 0

Options: 0, 1, 2

Example request body

{
  "model": "xai/tts-v1",
  "text": "example_text",
  "language": "auto",
  "voice_id": "eve",
  "codec": "mp3",
  "sample_rate": 24000,
  "bit_rate": 128000,
  "speed": 1,
  "text_normalization": false,
  "optimize_streaming_latency": 0
}
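
The option lists and ranges in the input schema can be checked on the client before a request is sent. The sketch below is one way to do that. The function name `build_payload` is hypothetical, and leaving `bit_rate` out for non-MP3 codecs is a design choice based on the note that it applies to MP3 only, not documented API behavior.

```python
CODECS = {"mp3", "wav", "pcm", "mulaw", "alaw"}
SAMPLE_RATES = {8000, 16000, 22050, 24000, 44100, 48000}
BIT_RATES = {32000, 64000, 96000, 128000, 192000}


def build_payload(text, language="auto", voice_id="eve", codec="mp3",
                  sample_rate=24000, bit_rate=128000, speed=1.0):
    """Validate values against the documented options and return a request body."""
    if not text or len(text) > 15_000:
        raise ValueError("text must be 1 to 15,000 characters")
    if codec not in CODECS:
        raise ValueError(f"unsupported codec: {codec}")
    if sample_rate not in SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {sample_rate}")
    if not 0.7 <= speed <= 1.5:
        raise ValueError("speed must be between 0.7 and 1.5")
    payload = {
        "model": "xai/tts-v1",
        "text": text,
        "language": language,
        "voice_id": voice_id,
        "codec": codec,
        "sample_rate": sample_rate,
        "speed": speed,
    }
    # bit_rate is documented as applying to MP3 only, so it is sent only then.
    if codec == "mp3":
        if bit_rate not in BIT_RATES:
            raise ValueError(f"unsupported bit_rate: {bit_rate}")
        payload["bit_rate"] = bit_rate
    return payload
```

For example, `build_payload("Hello", codec="wav", sample_rate=16000)` returns a body without `bit_rate`, while `build_payload("Hello", speed=2.0)` raises a `ValueError` before any network call.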

Output schema

The API returns a prediction response with the URLs of the generated outputs.

code (integer)

HTTP status code of the response.

message (string)

Human-readable message; non-empty on failure.

data (object)

Example response

{
  "id": "pred_abc123",
  "status": "completed",
  "model": "model-name",
  "outputs": [
    "https://storage.atlascloud.ai/outputs/result.mp3"
  ],
  "metrics": {
    "predict_time": 8.3
  },
  "created_at": "2025-01-01T00:00:00Z",
  "completed_at": "2025-01-01T00:00:10Z"
}
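
The output schema wraps the prediction in `code`, `message`, and `data` fields. A minimal sketch of unpacking that envelope follows. It assumes that a `code` of 200 means success and that a missing `code` (as in the completed response shown earlier) also means success; neither rule is stated on this page, and the name `unwrap` is illustrative.

```python
def unwrap(body):
    """Return body["data"], raising if the envelope reports a failure."""
    code = body.get("code", 200)  # assumption: a missing code means success
    if code != 200:
        message = body.get("message") or "no message"
        raise RuntimeError(f"API error {code}: {message}")
    return body.get("data", {})
```

Calling `unwrap(response.json())` before reading `status` or `outputs` turns an error envelope into an exception instead of a `KeyError` further down.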

Atlas Cloud Skills

Atlas Cloud Skills integrates 400+ AI models directly into your AI coding assistant. Install with one command, then use natural language to generate images and videos and to chat with LLMs.

Supported clients

Claude Code

OpenAI Codex

Gemini CLI

Cursor

Windsurf

VS Code

Trae

GitHub Copilot

Cline

Roo Code

Amp

Goose

Replit

40+ supported clients

Install

npx skills add AtlasCloudAI/atlas-cloud-skills

Configure the API key

Get your API key from the Atlas Cloud dashboard and set it as an environment variable.

export ATLASCLOUD_API_KEY="your-api-key-here"

Features

Once installed, you can use natural language in your AI assistant to access all Atlas Cloud models.

Image generation: Generate images with models such as Nano Banana 2, Z-Image, and others.

Video creation: Create videos from text or images with Kling, Vidu, Veo, and others.

LLM chat: Chat with Qwen, DeepSeek, and other large language models.

Media upload: Upload local files for image editing and image-to-video workflows.

Learn more

github.com/AtlasCloudAI/atlas-cloud-skills

MCP Server

The Atlas Cloud MCP server connects your IDE to 400+ AI models through the Model Context Protocol. It works with any MCP-compatible client.

Supported clients

Cursor

VS Code

Windsurf

Claude Code

OpenAI Codex

Gemini CLI

Cline

Roo Code

100+ supported clients

Install

npx -y atlascloud-mcp

Configuration

Add the following configuration to your IDE's MCP settings file.

{
  "mcpServers": {
    "atlascloud": {
      "command": "npx",
      "args": [
        "-y",
        "atlascloud-mcp"
      ],
      "env": {
        "ATLASCLOUD_API_KEY": "your-api-key-here"
      }
    }
  }
}

Available tools

atlas_generate_image: Generate images from text prompts.

atlas_generate_video: Create videos from text or images.

atlas_chat: Chat with large language models.

atlas_list_models: Browse the 400+ available AI models.

atlas_quick_generate: One-step content creation with automatic model selection.

atlas_upload_media: Upload local files for API workflows.

Learn more

github.com/AtlasCloudAI/mcp-server

API Schema

{
  "info": {
    "title": "AtlasCloud API",
    "version": "1.0.0",
    "description": "The AtlasCloud API."
  },
  "openapi": "3.0.0",
  "paths": {
    "/api/v1/model/generateAudio": {
      "post": {
        "requestBody": {
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/Input"
              }
            }
          },
          "required": true
        },
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "The request status."
          }
        }
      },
      "x-api-name": "model_run"
    },
    "/api/v1/model/prediction/{request_id}": {
      "get": {
        "parameters": [
          {
            "in": "path",
            "name": "request_id",
            "required": true,
            "schema": {
              "description": "Request ID",
              "type": "string"
            }
          }
        ],
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "Result of the request."
          }
        }
      },
      "x-api-name": "model_result"
    }
  },
  "components": {
    "schemas": {
      "Input": {
        "properties": {
          "model": {
            "type": "string",
            "description": "model name",
            "default": "xai/tts-v1"
          },
          "text": {
            "description": "The text to convert to speech. Maximum 15,000 characters. Supports inline instant tags (e.g. [pause]) and wrapping style tags (e.g. <whisper>text</whisper>).",
            "type": "string",
            "x-speech-tags": {
              "instant": [
                "[pause]",
                "[long-pause]",
                "[hum-tune]",
                "[laugh]",
                "[chuckle]",
                "[giggle]",
                "[cry]",
                "[tsk]",
                "[tongue-click]",
                "[lip-smack]",
                "[breath]",
                "[inhale]",
                "[exhale]",
                "[sigh]"
              ],
              "wrapping": [
                "<soft></soft>",
                "<whisper></whisper>",
                "<loud></loud>",
                "<build-intensity></build-intensity>",
                "<decrease-intensity></decrease-intensity>",
                "<higher-pitch></higher-pitch>",
                "<lower-pitch></lower-pitch>",
                "<slow></slow>",
                "<fast></fast>",
                "<sing-song></sing-song>",
                "<singing></singing>",
                "<laugh-speak></laugh-speak>",
                "<emphasis></emphasis>"
              ]
            }
          },
          "language": {
            "description": "BCP-47 language code or 'auto' for automatic language detection. Case-insensitive.",
            "type": "string",
            "default": "auto",
            "enum": [
              "auto",
              "en",
              "zh",
              "ar-EG",
              "ar-SA",
              "ar-AE",
              "bn",
              "fr",
              "de",
              "hi",
              "id",
              "it",
              "ja",
              "ko",
              "pt-BR",
              "pt-PT",
              "ru",
              "es-MX",
              "es-ES",
              "tr",
              "vi"
            ],
            "x-ui-component": "select",
            "x-enum-labels": {
              "auto": "Auto Detect",
              "en": "English",
              "zh": "Chinese (Mandarin)",
              "ar-EG": "Arabic (Egypt)",
              "ar-SA": "Arabic (Saudi Arabia)",
              "ar-AE": "Arabic (UAE)",
              "bn": "Bengali",
              "fr": "French",
              "de": "German",
              "hi": "Hindi",
              "id": "Indonesian",
              "it": "Italian",
              "ja": "Japanese",
              "ko": "Korean",
              "pt-BR": "Portuguese (Brazil)",
              "pt-PT": "Portuguese (Portugal)",
              "ru": "Russian",
              "es-MX": "Spanish (Mexico)",
              "es-ES": "Spanish (Spain)",
              "tr": "Turkish",
              "vi": "Vietnamese"
            }
          },
          "voice_id": {
            "description": "Voice identifier. Multilingual voices (ara, eve, leo, rex, sal) support all languages; other voices are optimized for their native language and will cascade-update the language field.",
            "type": "string",
            "default": "eve",
            "enum": [
              "ara",
              "eve",
              "leo",
              "rex",
              "sal",
              "jpi39icg",
              "d18jlf6v",
              "33g9t0jl",
              "26w6ihxi",
              "dr8gqysu",
              "wy0m9l5w",
              "om17cury",
              "x7avnu1k",
              "bcs7l2c3",
              "hqxr4yub",
              "h27ltdnz",
              "89q2pnko",
              "73xd5dum",
              "0p0rt7o1",
              "hbxkrnwm",
              "69smp8rm",
              "yis75yfp",
              "ekhwx401",
              "jupvcf34",
              "0hhfxxqq",
              "0ih5oi34",
              "gwnexu6y",
              "97zmdc6s",
              "fc7de6afcf6c",
              "7a9ee820b342",
              "0895a5b8ce5c",
              "f8cf5c2c78d4",
              "96819d0bd28d",
              "79f3a8b96d43",
              "78a495fdbb39",
              "e22152e06fd8",
              "490ea3be50b1",
              "1f046a033914",
              "dfe7b9e7d217",
              "c3a2c594479e",
              "83c6f4fea98e",
              "34fd4dce1ba3",
              "d634b6da3d3b",
              "670a0c3ac005",
              "182a91893636",
              "d0cb9ff07d95",
              "b1a7441b97a1",
              "bf9fe5b5f981",
              "b5ae17439907",
              "a0401c9101f8",
              "23be42535a45",
              "abfbdf26f115",
              "6da5baee46d0",
              "3d030bc92a87",
              "a13662ba951c",
              "58d27475085e",
              "247783ebdd51",
              "244e27b39200",
              "97fabd54445f",
              "37329fd8895a",
              "2badb5f46b1e",
              "1b12d5daee6b",
              "908c4626660f",
              "4ff93971bfdc",
              "70013edeb8e8",
              "35c8d7f60dc8",
              "23468361b4ef",
              "458705c07139",
              "41321eb41295",
              "40f31906b23d",
              "3a7889066fa2"
            ],
            "x-ui-component": "select",
            "x-enum-options": {
              "ara": {
                "name": "Ara",
                "language": "multilingual",
                "gender": "female",
                "example": "https://data.x.ai/audio-samples/voice_ara.mp3"
              },
              "eve": {
                "name": "Eve",
                "language": "multilingual",
                "gender": "female",
                "example": "https://data.x.ai/audio-samples/voice_eve.mp3"
              },
              "leo": {
                "name": "Leo",
                "language": "multilingual",
                "gender": "male",
                "example": "https://data.x.ai/audio-samples/voice_leo.mp3"
              },
              "rex": {
                "name": "Rex",
                "language": "multilingual",
                "gender": "male",
                "example": "https://data.x.ai/audio-samples/voice_rex.mp3"
              },
              "sal": {
                "name": "Sal",
                "language": "multilingual",
                "gender": "male",
                "example": "https://data.x.ai/audio-samples/voice_sal.mp3"
              },
              "jpi39icg": {
                "name": "Jian",
                "language": "zh-CN",
                "gender": "male",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/5_Jian_jpi39icg.mp3"
              },
              "d18jlf6v": {
                "name": "Hao",
                "language": "zh-CN",
                "gender": "female",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/6_Hao_d18jlf6v.mp3"
              },
              "33g9t0jl": {
                "name": "Xia",
                "language": "zh-CN",
                "gender": "female",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/7_Xia_33g9t0jl.mp3"
              },
              "26w6ihxi": {
                "name": "Pavel",
                "language": "ru",
                "gender": "male",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/8_Pavel_26w6ihxi.mp3"
              },
              "dr8gqysu": {
                "name": "Andrei",
                "language": "ru",
                "gender": "male",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/9_Andrei_dr8gqysu.mp3"
              },
              "wy0m9l5w": {
                "name": "Dmitri",
                "language": "ru",
                "gender": "male",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/10_Dmitri_wy0m9l5w.mp3"
              },
              "om17cury": {
                "name": "Irina",
                "language": "ru",
                "gender": "female",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/11_Irina_om17cury.mp3"
              },
              "x7avnu1k": {
                "name": "Enzo",
                "language": "it",
                "gender": "male",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/12_Enzo_x7avnu1k.mp3"
              },
              "bcs7l2c3": {
                "name": "Matteo",
                "language": "it",
                "gender": "male",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/13_Matteo_bcs7l2c3.mp3"
              },
              "hqxr4yub": {
                "name": "Luca",
                "language": "it",
                "gender": "female",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/14_Luca_hqxr4yub.mp3"
              },
              "h27ltdnz": {
                "name": "Alessandro",
                "language": "it",
                "gender": "male",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/15_Alessandro_h27ltdnz.mp3"
              },
              "89q2pnko": {
                "name": "Karan",
                "language": "hi",
                "gender": "male",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/16_Karan_89q2pnko.mp3"
              },
              "73xd5dum": {
                "name": "Ananya",
                "language": "hi",
                "gender": "female",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/17_Ananya_73xd5dum.mp3"
              },
              "0p0rt7o1": {
                "name": "Remi",
                "language": "fr",
                "gender": "male",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/18_Remi_0p0rt7o1.mp3"
              },
              "hbxkrnwm": {
                "name": "Hugo",
                "language": "fr",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/19_Hugo_hbxkrnwm.mp3"
              },
              "69smp8rm": {
                "name": "Camille",
                "language": "fr",
                "gender": "female",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/20_Camille_69smp8rm.mp3"
              },
              "yis75yfp": {
                "name": "Manuel",
                "language": "es",
                "gender": "male",
                "age": "old",
                "example": "https://static.atlascloud.ai/media/audios/21_Manuel_yis75yfp.mp3"
              },
              "ekhwx401": {
                "name": "Javier",
                "language": "es",
                "gender": "male",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/22_Javier_ekhwx401.mp3"
              },
              "jupvcf34": {
                "name": "Diego",
                "language": "es",
                "example": "https://static.atlascloud.ai/media/audios/23_Diego_jupvcf34.mp3"
              },
              "0hhfxxqq": {
                "name": "Andres",
                "language": "es",
                "gender": "male",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/24_Andres_0hhfxxqq.mp3"
              },
              "0ih5oi34": {
                "name": "Kasper",
                "language": "da",
                "gender": "male",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/25_Kasper_0ih5oi34.mp3"
              },
              "gwnexu6y": {
                "name": "Lars",
                "language": "da",
                "gender": "male",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/26_Lars_gwnexu6y.mp3"
              },
              "97zmdc6s": {
                "name": "Ida",
                "language": "da",
                "gender": "female",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/27_Ida_97zmdc6s.mp3"
              },
              "fc7de6afcf6c": {
                "name": "Duc",
                "language": "vi",
                "gender": "male",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/28_Duc_fc7de6afcf6c.mp3"
              },
              "7a9ee820b342": {
                "name": "Minh",
                "language": "vi",
                "gender": "male",
                "example": "https://static.atlascloud.ai/media/audios/45_Minh_7a9ee820b342.mp3"
              },
              "0895a5b8ce5c": {
                "name": "Mai",
                "language": "vi",
                "gender": "female",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/70_Mai_0895a5b8ce5c.mp3"
              },
              "f8cf5c2c78d4": {
                "name": "Grace",
                "language": "en",
                "gender": "female",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/29_Grace_f8cf5c2c78d4.mp3"
              },
              "96819d0bd28d": {
                "name": "Daniel",
                "language": "en",
                "gender": "male",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/42_Daniel_96819d0bd28d.mp3"
              },
              "79f3a8b96d43": {
                "name": "Claire",
                "language": "en",
                "gender": "female",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/46_Claire_79f3a8b96d43.mp3"
              },
              "78a495fdbb39": {
                "name": "James",
                "language": "en",
                "gender": "male",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/47_James_78a495fdbb39.mp3"
              },
              "e22152e06fd8": {
                "name": "Axel",
                "language": "sv-SE",
                "gender": "male",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/30_Axel_e22152e06fd8.mp3"
              },
              "490ea3be50b1": {
                "name": "Saga",
                "language": "sv-SE",
                "gender": "female",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/53_Saga_490ea3be50b1.mp3"
              },
              "1f046a033914": {
                "name": "Erik",
                "language": "sv-SE",
                "gender": "male",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/67_Erik_1f046a033914.mp3"
              },
              "dfe7b9e7d217": {
                "name": "Valtteri",
                "language": "fi",
                "gender": "male",
                "example": "https://static.atlascloud.ai/media/audios/31_Valtteri_dfe7b9e7d217.mp3"
              },
              "c3a2c594479e": {
                "name": "Helmi",
                "language": "fi",
                "gender": "female",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/34_Helmi_c3a2c594479e.mp3"
              },
              "83c6f4fea98e": {
                "name": "Eero",
                "language": "fi",
                "gender": "male",
                "example": "https://static.atlascloud.ai/media/audios/44_Eero_83c6f4fea98e.mp3"
              },
              "34fd4dce1ba3": {
                "name": "Elina",
                "language": "fi",
                "gender": "female",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/61_Elina_34fd4dce1ba3.mp3"
              },
              "d634b6da3d3b": {
                "name": "Aylin",
                "language": "tr",
                "gender": "female",
                "age": "old",
                "example": "https://static.atlascloud.ai/media/audios/32_Aylin_d634b6da3d3b.mp3"
              },
              "670a0c3ac005": {
                "name": "Emre",
                "language": "tr",
                "gender": "male",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/50_Emre_670a0c3ac005.mp3"
              },
              "182a91893636": {
                "name": "Kaan",
                "language": "tr",
                "gender": "male",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/69_Kaan_182a91893636.mp3"
              },
              "d0cb9ff07d95": {
                "name": "Sakura",
                "language": "ja",
                "gender": "female",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/33_Sakura_d0cb9ff07d95.mp3"
              },
              "b1a7441b97a1": {
                "name": "Ren",
                "language": "ja",
                "gender": "male",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/37_Ren_b1a7441b97a1.mp3"
              },
              "bf9fe5b5f981": {
                "name": "Jun-seo",
                "language": "ko",
                "gender": "male",
                "example": "https://static.atlascloud.ai/media/audios/35_Jun-seo_bf9fe5b5f981.mp3"
              },
              "b5ae17439907": {
                "name": "Min-jun",
                "language": "ko",
                "gender": "male",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/36_Min-jun_b5ae17439907.mp3"
              },
              "a0401c9101f8": {
                "name": "Seo-yeon",
                "language": "ko",
                "gender": "female",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/40_Seo-yeon_a0401c9101f8.mp3"
              },
              "23be42535a45": {
                "name": "Ji-yeon",
                "language": "ko",
                "gender": "female",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/65_Ji-yeon_23be42535a45.mp3"
              },
              "abfbdf26f115": {
                "name": "Mateus",
                "language": "pt",
                "gender": "male",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/38_Mateus_abfbdf26f115.mp3"
              },
              "6da5baee46d0": {
                "name": "Beatriz",
                "language": "pt",
                "gender": "female",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/49_Beatriz_6da5baee46d0.mp3"
              },
              "3d030bc92a87": {
                "name": "Rafael",
                "language": "pt",
                "gender": "male",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/57_Rafael_3d030bc92a87.mp3"
              },
              "a13662ba951c": {
                "name": "Thijs",
                "language": "nl",
                "gender": "male",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/39_Thijs_a13662ba951c.mp3"
              },
              "58d27475085e": {
                "name": "Femke",
                "language": "nl",
                "gender": "female",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/51_Femke_58d27475085e.mp3"
              },
              "247783ebdd51": {
                "name": "Noor",
                "language": "nl",
                "gender": "female",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/63_Noor_247783ebdd51.mp3"
              },
              "244e27b39200": {
                "name": "Ruben",
                "language": "nl",
                "gender": "male",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/64_Ruben_244e27b39200.mp3"
              },
              "97fabd54445f": {
                "name": "Katarzyna",
                "language": "pl",
                "gender": "female",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/41_Katarzyna_97fabd54445f.mp3"
              },
              "37329fd8895a": {
                "name": "Mateusz",
                "language": "pl",
                "gender": "male",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/59_Mateusz_37329fd8895a.mp3"
              },
              "2badb5f46b1e": {
                "name": "Jakub",
                "language": "pl",
                "gender": "male",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/62_Jakub_2badb5f46b1e.mp3"
              },
              "1b12d5daee6b": {
                "name": "Aleksandra",
                "language": "pl",
                "gender": "female",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/68_Aleksandra_1b12d5daee6b.mp3"
              },
              "908c4626660f": {
                "name": "Krit",
                "language": "th",
                "gender": "male",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/43_Krit_908c4626660f.mp3"
              },
              "4ff93971bfdc": {
                "name": "Aroon",
                "language": "th",
                "gender": "male",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/52_Aroon_4ff93971bfdc.mp3"
              },
              "70013edeb8e8": {
                "name": "Khalid",
                "language": "ar",
                "gender": "male",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/48_Khalid_70013edeb8e8.mp3"
              },
              "35c8d7f60dc8": {
                "name": "Layla",
                "language": "ar",
                "gender": "female",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/60_Layla_35c8d7f60dc8.mp3"
              },
              "23468361b4ef": {
                "name": "Tariq",
                "language": "ar",
                "gender": "male",
                "example": "https://static.atlascloud.ai/media/audios/66_Tariq_23468361b4ef.mp3"
              },
              "458705c07139": {
                "name": "Clara",
                "language": "de",
                "gender": "female",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/54_Clara_458705c07139.mp3"
              },
              "41321eb41295": {
                "name": "Moritz",
                "language": "de",
                "gender": "male",
                "example": "https://static.atlascloud.ai/media/audios/55_Moritz_41321eb41295.mp3"
              },
              "40f31906b23d": {
                "name": "Niklas",
                "language": "de",
                "gender": "male",
                "age": "middle-aged",
                "example": "https://static.atlascloud.ai/media/audios/56_Niklas_40f31906b23d.mp3"
              },
              "3a7889066fa2": {
                "name": "Lena",
                "language": "de",
                "gender": "female",
                "age": "young",
                "example": "https://static.atlascloud.ai/media/audios/58_Lena_3a7889066fa2.mp3"
              }
            }
          },
          "codec": {
            "description": "Audio codec for the output.",
            "type": "string",
            "default": "mp3",
            "enum": [
              "mp3",
              "wav",
              "pcm",
              "mulaw",
              "alaw"
            ],
            "x-ui-component": "select"
          },
          "sample_rate": {
            "description": "Sample rate in Hz. Supported values: 8000, 16000, 22050, 24000, 44100, 48000.",
            "type": "integer",
            "default": 24000,
            "enum": [
              8000,
              16000,
              22050,
              24000,
              44100,
              48000
            ],
            "x-ui-component": "select"
          },
          "bit_rate": {
            "description": "Bit rate in bps. Applies to MP3 codec only. Supported values: 32000, 64000, 96000, 128000, 192000.",
            "type": "integer",
            "default": 128000,
            "enum": [
              32000,
              64000,
              96000,
              128000,
              192000
            ],
            "x-ui-component": "select"
          },
          "speed": {
            "description": "Speech speed multiplier. 1.0 is normal speed. Values below 1.0 slow down speech, values above 1.0 speed it up. Range: 0.7 to 1.5.",
            "type": "number",
            "default": 1,
            "minimum": 0.7,
            "maximum": 1.5
          },
          "text_normalization": {
            "description": "Enable text normalization before synthesis. When enabled, the model normalizes written-form text (e.g. numbers, abbreviations, symbols) into spoken-form before generating audio.",
            "type": "boolean",
            "default": false
          },
          "optimize_streaming_latency": {
            "description": "Latency optimization level for streaming synthesis. 0 (default): No optimization — best audio quality. 1: Reduced first-chunk size for lower time-to-first-audio, with minor quality tradeoff at chunk boundaries. 2: Further reduced first-chunk size for lowest time-to-first-audio, with more noticeable quality tradeoff at chunk boundaries.",
            "type": "integer",
            "default": 0,
            "enum": [
              0,
              1,
              2
            ],
            "x-ui-component": "select",
            "x-enum-labels": {
              "0": "Best quality (no optimization)",
              "1": "Lower latency, minor quality tradeoff",
              "2": "Lowest latency, noticeable quality tradeoff"
            }
          }
        },
        "required": [
          "model",
          "text",
          "language"
        ],
        "type": "object",
        "x-order-properties": [
          "model",
          "text",
          "language",
          "voice_id",
          "codec",
          "sample_rate",
          "bit_rate",
          "speed",
          "text_normalization",
          "optimize_streaming_latency"
        ],
        "allOf": [
          {
            "if": {
              "properties": {
                "voice_id": {
                  "enum": [
                    "jpi39icg",
                    "d18jlf6v",
                    "33g9t0jl"
                  ]
                }
              },
              "required": [
                "voice_id"
              ]
            },
            "then": {
              "properties": {
                "language": {
                  "const": "zh"
                }
              }
            }
          },
          {
            "if": {
              "properties": {
                "voice_id": {
                  "enum": [
                    "26w6ihxi",
                    "dr8gqysu",
                    "wy0m9l5w",
                    "om17cury"
                  ]
                }
              },
              "required": [
                "voice_id"
              ]
            },
            "then": {
              "properties": {
                "language": {
                  "const": "ru"
                }
              }
            }
          },
          {
            "if": {
              "properties": {
                "voice_id": {
                  "enum": [
                    "x7avnu1k",
                    "bcs7l2c3",
                    "hqxr4yub",
                    "h27ltdnz"
                  ]
                }
              },
              "required": [
                "voice_id"
              ]
            },
            "then": {
              "properties": {
                "language": {
                  "const": "it"
                }
              }
            }
          },
          {
            "if": {
              "properties": {
                "voice_id": {
                  "enum": [
                    "89q2pnko",
                    "73xd5dum"
                  ]
                }
              },
              "required": [
                "voice_id"
              ]
            },
            "then": {
              "properties": {
                "language": {
                  "const": "hi"
                }
              }
            }
          },
          {
            "if": {
              "properties": {
                "voice_id": {
                  "enum": [
                    "0p0rt7o1",
                    "hbxkrnwm",
                    "69smp8rm"
                  ]
                }
              },
              "required": [
                "voice_id"
              ]
            },
            "then": {
              "properties": {
                "language": {
                  "const": "fr"
                }
              }
            }
          },
          {
            "if": {
              "properties": {
                "voice_id": {
                  "enum": [
                    "yis75yfp",
                    "ekhwx401",
                    "jupvcf34",
                    "0hhfxxqq"
                  ]
                }
              },
              "required": [
                "voice_id"
              ]
            },
            "then": {
              "properties": {
                "language": {
                  "const": "es-ES"
                }
              }
            }
          },
          {
            "if": {
              "properties": {
                "voice_id": {
                  "enum": [
                    "fc7de6afcf6c",
                    "7a9ee820b342",
                    "0895a5b8ce5c"
                  ]
                }
              },
              "required": [
                "voice_id"
              ]
            },
            "then": {
              "properties": {
                "language": {
                  "const": "vi"
                }
              }
            }
          },
          {
            "if": {
              "properties": {
                "voice_id": {
                  "enum": [
                    "f8cf5c2c78d4",
                    "96819d0bd28d",
                    "79f3a8b96d43",
                    "78a495fdbb39"
                  ]
                }
              },
              "required": [
                "voice_id"
              ]
            },
            "then": {
              "properties": {
                "language": {
                  "const": "en"
                }
              }
            }
          },
          {
            "if": {
              "properties": {
                "voice_id": {
                  "enum": [
                    "d634b6da3d3b",
                    "670a0c3ac005",
                    "182a91893636"
                  ]
                }
              },
              "required": [
                "voice_id"
              ]
            },
            "then": {
              "properties": {
                "language": {
                  "const": "tr"
                }
              }
            }
          },
          {
            "if": {
              "properties": {
                "voice_id": {
                  "enum": [
                    "d0cb9ff07d95",
                    "b1a7441b97a1"
                  ]
                }
              },
              "required": [
                "voice_id"
              ]
            },
            "then": {
              "properties": {
                "language": {
                  "const": "ja"
                }
              }
            }
          },
          {
            "if": {
              "properties": {
                "voice_id": {
                  "enum": [
                    "bf9fe5b5f981",
                    "b5ae17439907",
                    "a0401c9101f8",
                    "23be42535a45"
                  ]
                }
              },
              "required": [
                "voice_id"
              ]
            },
            "then": {
              "properties": {
                "language": {
                  "const": "ko"
                }
              }
            }
          },
          {
            "if": {
              "properties": {
                "voice_id": {
                  "enum": [
                    "abfbdf26f115",
                    "6da5baee46d0",
                    "3d030bc92a87"
                  ]
                }
              },
              "required": [
                "voice_id"
              ]
            },
            "then": {
              "properties": {
                "language": {
                  "const": "pt-PT"
                }
              }
            }
          },
          {
            "if": {
              "properties": {
                "voice_id": {
                  "enum": [
                    "70013edeb8e8",
                    "35c8d7f60dc8",
                    "23468361b4ef"
                  ]
                }
              },
              "required": [
                "voice_id"
              ]
            },
            "then": {
              "properties": {
                "language": {
                  "const": "ar-EG"
                }
              }
            }
          },
          {
            "if": {
              "properties": {
                "voice_id": {
                  "enum": [
                    "458705c07139",
                    "41321eb41295",
                    "40f31906b23d",
                    "3a7889066fa2"
                  ]
                }
              },
              "required": [
                "voice_id"
              ]
            },
            "then": {
              "properties": {
                "language": {
                  "const": "de"
                }
              }
            }
          }
        ]
      },
      "PredictionResponse": {
        "type": "object",
        "properties": {
          "code": {
            "description": "HTTP status code of the response.",
            "type": "integer"
          },
          "message": {
            "description": "Human-readable message; non-empty on failure.",
            "type": "string"
          },
          "data": {
            "type": "object",
            "properties": {
              "id": {
                "description": "Unique identifier for the prediction.",
                "type": "string"
              },
              "model": {
                "description": "Model ID used for the prediction.",
                "type": "string"
              },
              "outputs": {
                "description": "Array of URLs to the generated content. Null when status is not completed.",
                "type": "array",
                "items": {
                  "type": "string"
                },
                "nullable": true
              },
              "urls": {
                "description": "Object containing related API endpoints.",
                "type": "object",
                "properties": {
                  "get": {
                    "description": "URL to poll for the prediction result.",
                    "type": "string",
                    "format": "uri"
                  }
                }
              },
              "has_nsfw_contents": {
                "description": "Array of boolean values indicating NSFW detection for each output. Null if not applicable.",
                "type": "array",
                "items": {
                  "type": "boolean"
                },
                "nullable": true
              },
              "status": {
                "description": "Status of the task: created, processing, completed, timeout, or failed.",
                "type": "string"
              },
              "created_at": {
                "description": "ISO timestamp of when the request was created (e.g., \"2023-04-01T12:34:56.789Z\").",
                "format": "date-time",
                "type": "string"
              },
              "error": {
                "description": "Error message if the task failed, empty string otherwise.",
                "type": "string"
              },
              "error_code": {
                "description": "Error code if the task failed.",
                "type": "integer"
              },
              "executionTime": {
                "description": "Total execution time in milliseconds.",
                "type": "number"
              },
              "timings": {
                "description": "Detailed timing breakdown.",
                "type": "object",
                "properties": {
                  "inference": {
                    "description": "Inference time in milliseconds.",
                    "type": "number"
                  }
                }
              }
            }
          }
        }
      }
    },
    "securitySchemes": {
      "apiKeyAuth": {
        "in": "header",
        "name": "Authorization",
        "type": "apiKey"
      }
    }
  },
  "servers": [
    {
      "url": "https://api.atlascloud.ai"
    }
  ]
}

LLM-optimized prompt template

# xai/tts-v1

> xAI TTS v1 is a high-fidelity text-to-speech model that converts text into natural, expressive speech with sub-second latency, supporting 20 languages and 80+ voices with fine-grained delivery control.


## Overview

- **Submit endpoint (POST)**: `https://api.atlascloud.ai/api/v1/model/generateAudio` — start an async generation; returns a `prediction_id`
- **Poll endpoint (GET)**: `https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}` — poll this until the prediction finishes
- **Model ID**: `xai/tts-v1`


## API Information

This model can be used via our HTTP API or more conveniently via our client libraries.
See the input and output schema below, as well as the usage examples.


### Input Schema

The API accepts the following input parameters:

- **`model`** (`string`, _required_):
  Model identifier.
  - Default: `"xai/tts-v1"`

- **`text`** (`string`, _required_):
  The text to convert to speech. Maximum 15,000 characters. Supports inline instant tags (e.g. [pause]) and wrapping style tags (e.g. <whisper>text</whisper>).

- **`language`** (`string`, _required_):
  BCP-47 language code or 'auto' for automatic language detection. Case-insensitive.
  - Default: `"auto"`
  - Options: "auto", "en", "zh", "ar-EG", "ar-SA", "ar-AE", "bn", "fr", "de", "hi", "id", "it", "ja", "ko", "pt-BR", "pt-PT", "ru", "es-MX", "es-ES", "tr", "vi"

- **`voice_id`** (`string`, _optional_):
  Voice identifier. Multilingual voices (ara, eve, leo, rex, sal) support all languages; other voices are optimized for their native language, and selecting one automatically sets the language field to that language.
  - Default: `"eve"`
  - Options: "ara", "eve", "leo", "rex", "sal", "jpi39icg", "d18jlf6v", "33g9t0jl", "26w6ihxi", "dr8gqysu", "wy0m9l5w", "om17cury", "x7avnu1k", "bcs7l2c3", "hqxr4yub", "h27ltdnz", "89q2pnko", "73xd5dum", "0p0rt7o1", "hbxkrnwm", "69smp8rm", "yis75yfp", "ekhwx401", "jupvcf34", "0hhfxxqq", "0ih5oi34", "gwnexu6y", "97zmdc6s", "fc7de6afcf6c", "7a9ee820b342", "0895a5b8ce5c", "f8cf5c2c78d4", "96819d0bd28d", "79f3a8b96d43", "78a495fdbb39", "e22152e06fd8", "490ea3be50b1", "1f046a033914", "dfe7b9e7d217", "c3a2c594479e", "83c6f4fea98e", "34fd4dce1ba3", "d634b6da3d3b", "670a0c3ac005", "182a91893636", "d0cb9ff07d95", "b1a7441b97a1", "bf9fe5b5f981", "b5ae17439907", "a0401c9101f8", "23be42535a45", "abfbdf26f115", "6da5baee46d0", "3d030bc92a87", "a13662ba951c", "58d27475085e", "247783ebdd51", "244e27b39200", "97fabd54445f", "37329fd8895a", "2badb5f46b1e", "1b12d5daee6b", "908c4626660f", "4ff93971bfdc", "70013edeb8e8", "35c8d7f60dc8", "23468361b4ef", "458705c07139", "41321eb41295", "40f31906b23d", "3a7889066fa2"

- **`codec`** (`string`, _optional_):
  Audio codec for the output.
  - Default: `"mp3"`
  - Options: "mp3", "wav", "pcm", "mulaw", "alaw"

- **`sample_rate`** (`integer`, _optional_):
  Sample rate in Hz. Supported values: 8000, 16000, 22050, 24000, 44100, 48000.
  - Default: `24000`
  - Options: 8000, 16000, 22050, 24000, 44100, 48000

- **`bit_rate`** (`integer`, _optional_):
  Bit rate in bps. Applies to MP3 codec only. Supported values: 32000, 64000, 96000, 128000, 192000.
  - Default: `128000`
  - Options: 32000, 64000, 96000, 128000, 192000

- **`speed`** (`number`, _optional_):
  Speech speed multiplier. 1.0 is normal speed. Values below 1.0 slow down speech, values above 1.0 speed it up. Range: 0.7 to 1.5.
  - Default: `1`
  - Min: 0.7
  - Max: 1.5

- **`text_normalization`** (`boolean`, _optional_):
  Enable text normalization before synthesis. When enabled, the model normalizes written-form text (e.g. numbers, abbreviations, symbols) into spoken-form before generating audio.
  - Default: `false`

- **`optimize_streaming_latency`** (`integer`, _optional_):
  Latency optimization level for streaming synthesis. 0 (default): No optimization — best audio quality. 1: Reduced first-chunk size for lower time-to-first-audio, with minor quality tradeoff at chunk boundaries. 2: Further reduced first-chunk size for lowest time-to-first-audio, with more noticeable quality tradeoff at chunk boundaries.
  - Default: `0`
  - Options: 0, 1, 2
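
As a rough illustration of the constraints listed above, the following Python sketch builds a request body and rejects out-of-range values before anything is sent. The helper name `build_payload` is hypothetical and not part of any official client library; the limits are copied from the parameter list, and treating empty text as invalid is an assumption.

```python
# Hypothetical helper, not part of any official client library.
# The limits below are copied from the parameter list above.
CODECS = {"mp3", "wav", "pcm", "mulaw", "alaw"}
SAMPLE_RATES = {8000, 16000, 22050, 24000, 44100, 48000}
BIT_RATES = {32000, 64000, 96000, 128000, 192000}
MAX_TEXT_CHARS = 15000


def build_payload(text, language="auto", voice_id="eve", codec="mp3",
                  sample_rate=24000, bit_rate=128000, speed=1.0):
    """Return a request body, raising ValueError for out-of-range values."""
    # Rejecting empty text is an assumption; the docs only state the maximum.
    if not text or len(text) > MAX_TEXT_CHARS:
        raise ValueError("text must be non-empty and at most 15,000 characters")
    if codec not in CODECS:
        raise ValueError(f"unsupported codec: {codec!r}")
    if sample_rate not in SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {sample_rate!r}")
    if not 0.7 <= speed <= 1.5:
        raise ValueError("speed must be between 0.7 and 1.5")

    payload = {
        "model": "xai/tts-v1",
        "text": text,
        "language": language,
        "voice_id": voice_id,
        "codec": codec,
        "sample_rate": sample_rate,
        "speed": speed,
    }
    # bit_rate is documented as MP3-only, so this sketch sends it only for MP3.
    if codec == "mp3":
        if bit_rate not in BIT_RATES:
            raise ValueError(f"unsupported bit_rate: {bit_rate!r}")
        payload["bit_rate"] = bit_rate
    return payload
```

Checking on the client side like this is optional; the server presumably applies its own validation, and its error format is not described here.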



**Required Parameters Example**:

```json
{
  "model": "xai/tts-v1",
  "text": "So I walked in and [pause] there it was.",
  "language": "auto"
}
```


**Full Example**:

```json
{
  "model": "xai/tts-v1",
  "text": "So I walked in and [pause] there it was.",
  "language": "auto",
  "voice_id": "eve",
  "codec": "mp3",
  "sample_rate": 24000,
  "bit_rate": 128000,
  "speed": 1,
  "text_normalization": false,
  "optimize_streaming_latency": 0
}
```


### Output Schema

The API returns the following output format:


- **`code`** (`integer`, _optional_):
  HTTP status code of the response.

- **`message`** (`string`, _optional_):
  Human-readable message; non-empty on failure.

- **`data`** (`object`, _optional_):
  - Properties:
    - **`id`** (`string`, _optional_):
      Unique identifier for the prediction.

    - **`model`** (`string`, _optional_):
      Model ID used for the prediction.

    - **`outputs`** (`array[string]`, _optional_):
      Array of URLs to the generated content. Null when status is not completed.

    - **`urls`** (`object`, _optional_):
      Object containing related API endpoints.
      - Properties:
        - **`get`** (`string`, _optional_):
          URL to poll for the prediction result.


    - **`has_nsfw_contents`** (`array[boolean]`, _optional_):
      Array of boolean values indicating NSFW detection for each output. Null if not applicable.

    - **`status`** (`string`, _optional_):
      Status of the task: created, processing, completed, timeout, or failed.

    - **`created_at`** (`string`, _optional_):
      ISO timestamp of when the request was created (e.g., "2023-04-01T12:34:56.789Z").

    - **`error`** (`string`, _optional_):
      Error message if the task failed, empty string otherwise.

    - **`error_code`** (`integer`, _optional_):
      Error code if the task failed.

    - **`executionTime`** (`number`, _optional_):
      Total execution time in milliseconds.

    - **`timings`** (`object`, _optional_):
      Detailed timing breakdown.
      - Properties:
        - **`inference`** (`number`, _optional_):
          Inference time in milliseconds.





**Example Response**:

```json
{
  "code": 0,
  "message": "",
  "data": {
    "id": "",
    "model": "",
    "outputs": [
      ""
    ],
    "urls": {
      "get": ""
    },
    "has_nsfw_contents": [],
    "status": "",
    "created_at": "",
    "error": "",
    "error_code": 0,
    "executionTime": 0,
    "timings": {
      "inference": 0
    }
  }
}
```
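
To show how the response fields above fit together, here is a minimal Python sketch that pulls the output URLs out of a decoded response body. The function name `extract_outputs` is hypothetical; the status values and the rule that `outputs` is null until completion come from the schema above.

```python
def extract_outputs(response):
    """Return output URLs for a completed prediction, or None if still running.

    `response` is the decoded JSON body (a dict) shaped like the example above.
    """
    data = response.get("data") or {}
    status = data.get("status")
    if status in ("failed", "timeout"):
        # `error` is documented as an empty string unless the task failed.
        raise RuntimeError(data.get("error") or f"prediction ended with status {status!r}")
    if status != "completed":
        return None  # created / processing: outputs is documented as null
    return data.get("outputs") or []
```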


## Usage Examples

### cURL

```bash
# Step 1: Start generation (async)
curl -X POST "https://api.atlascloud.ai/api/v1/model/generateAudio" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "xai/tts-v1",
  "text": "So I walked in and [pause] there it was.",
  "language": "auto",
  "voice_id": "eve",
  "codec": "mp3",
  "sample_rate": 24000,
  "bit_rate": 128000,
  "speed": 1,
  "text_normalization": false,
  "optimize_streaming_latency": 0
}'

# Response will contain: {"code": 200, "data": {"id": "prediction_id", "status": "processing"}}

# Step 2: Poll for result (replace {prediction_id} with the id returned above)
curl -X GET "https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY"

# Keep polling until status reaches a terminal value: "completed" (or "succeeded"), "failed", or "timeout"
# When completed, outputs will contain the generated content URL(s)
```
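
The polling step can also be sketched in Python. The version below is only an illustration: `wait_for_prediction` and its `fetch_status` argument are hypothetical names, and the interval and attempt budget are arbitrary choices rather than documented recommendations. The fetch step is passed in as a callable so the loop can be exercised without network access.

```python
import time


def wait_for_prediction(fetch_status, interval=1.0, max_attempts=60, sleep=time.sleep):
    """Call fetch_status() until the prediction reaches a terminal status.

    fetch_status: zero-argument callable returning the decoded JSON body
    of the poll endpoint (a dict with a "data" object).
    """
    terminal = {"completed", "succeeded", "failed", "timeout"}
    for _ in range(max_attempts):
        body = fetch_status()
        data = body.get("data") or {}
        if data.get("status") in terminal:
            return data
        sleep(interval)  # wait before asking again
    raise TimeoutError("prediction did not finish within the polling budget")
```

In practice `fetch_status` could be a small lambda around an HTTP GET to the poll URL with the same `Authorization` header used in the cURL example.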

## Additional Resources

### Documentation

- [Model Playground](https://www.atlascloud.ai/models/xai/tts-v1)

So I walked in and [pause] there it was. [laugh] I honestly could not believe it! <whisper>It was a secret the whole time.</whisper> Pretty cool, right?


xAI TTS v1 — Text to Speech

Developer: xAI
Model ID: xai/tts-v1
Release Date: April 2026

Overview

xAI TTS v1 is a high-fidelity text-to-speech model developed by xAI, the company behind Grok. It converts text into natural, expressive speech with sub-second latency, supporting 20 languages and a rich voice library of 80+ voices. The model features fine-grained delivery control through inline speech tags, multiple audio output formats for both streaming and telephony use cases, and a voice cloning capability that can produce a production-ready custom voice from roughly a minute of reference audio.

xAI TTS v1 is designed for real-world production workloads — customer service bots, content narration, accessibility tools, and real-time voice agents — offering competitive per-character pricing with enterprise-grade compliance.

Key Capabilities

Expressive delivery control — 14 instant speech tags and 13 wrapping style tags for pauses, laughter, whispers, pitch shifts, speed changes, and more.
80+ voices across 20 languages — Five universal multilingual voices plus language-optimized voices for Chinese, Russian, Italian, French, Spanish, Hindi, Japanese, Korean, Portuguese, German, Dutch, Polish, Turkish, Arabic, Vietnamese, Thai, Danish, Swedish, Finnish, and English.
Flexible audio output — MP3, WAV, PCM, μ-law, and A-law codecs with sample rates from 8 kHz (telephony) to 48 kHz (studio).
Voice cloning — Record up to 120 seconds of reference audio; a custom voice_id is ready in under 2 minutes at no additional cost.
Streaming and batch modes — Both a standard REST endpoint and a WebSocket streaming endpoint for real-time audio delivery.
Automatic language detection — Set language: "auto" to let the model identify the input language.
Privacy-first — Audio is never stored or used for model training.

Use Cases

Customer service / IVR — Low-latency telephony output (μ-law/A-law at 8 kHz) for automated voice response systems.
Content narration — Podcast episodes, audiobooks, and video voiceovers with expressive, human-like delivery.
Accessibility — Real-time screen readers and reading assistants for visually impaired users.
Real-time voice agents — Paired with xAI's Speech-to-Text and Voice Agent APIs for end-to-end conversational AI.
Multilingual applications — Localized content delivery across 20 languages from a single API.
Custom brand voices — Clone a spokesperson's voice to maintain consistent audio identity at scale.

Voices

Multilingual Voices

These five voices support all 20 available languages.

Voice ID	Name	Gender	Character
`eve`	Eve	Female	Energetic, upbeat — default voice
`ara`	Ara	Female	Warm and conversational
`leo`	Leo	Male	Authoritative, instructional
`rex`	Rex	Male	Professional, business tone
`sal`	Sal	Male	Versatile, neutral

Language-Optimized Voices (selected)

Language-specific voices are optimized for their native language and automatically lock the language field when selected.

Language	Voices
English	Grace (F), Claire (F), James (M), Daniel (M)
Chinese (Mandarin)	Jian (M), Hao (F), Xia (F)
Russian	Pavel (M), Andrei (M), Dmitri (M), Irina (F)
French	Remi (M), Hugo (M), Camille (F)
Spanish	Manuel (M), Javier (M), Diego (M), Andres (M)
Italian	Enzo (M), Matteo (M), Alessandro (M), Luca (F)
German	Moritz (M), Niklas (M), Clara (F), Lena (F)
Japanese	Ren (M), Sakura (F)
Korean	Jun-seo (M), Min-jun (M), Seo-yeon (F), Ji-yeon (F)
Portuguese	Mateus (M), Rafael (M), Beatriz (F)
Hindi	Karan (M), Ananya (F)
Arabic	Khalid (M), Tariq (M), Layla (F)
Turkish	Emre (M), Kaan (M), Aylin (F)
Dutch	Thijs (M), Ruben (M), Femke (F), Noor (F)
Polish	Mateusz (M), Jakub (M), Katarzyna (F), Aleksandra (F)
Vietnamese	Duc (M), Minh (M), Mai (F)
Swedish	Axel (M), Erik (M), Saga (F)
Finnish	Valtteri (M), Eero (M), Helmi (F), Elina (F)
Thai	Krit (M), Aroon (M)
Danish	Lars (M), Kasper (M), Ida (F)

The complete voice library, including custom cloned voices, is accessible via the voice_id parameter.
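
The language-locking behavior described above can be mirrored on the client with a small lookup. The sketch below is an illustration only: `resolve_language` is a hypothetical helper, and the mapping is a short excerpt of voice IDs taken from the request schema, not the full voice library.

```python
MULTILINGUAL_VOICES = {"ara", "eve", "leo", "rex", "sal"}

# Excerpt only: a few voice IDs and the language each is tied to in the schema.
LOCKED_LANGUAGE = {
    "d0cb9ff07d95": "ja",     # Sakura
    "b1a7441b97a1": "ja",     # Ren
    "458705c07139": "de",     # Clara
    "3a7889066fa2": "de",     # Lena
    "abfbdf26f115": "pt-PT",  # Mateus
    "70013edeb8e8": "ar-EG",  # Khalid
}


def resolve_language(voice_id, requested_language):
    """Return the language to send for a given voice."""
    if voice_id in MULTILINGUAL_VOICES:
        return requested_language  # multilingual voices accept any language
    # Language-optimized voices override the requested language; voices
    # missing from this excerpt fall back to what the caller asked for.
    return LOCKED_LANGUAGE.get(voice_id, requested_language)
```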

Supported Languages

Code	Language
`auto`	Auto Detect
`en`	English
`zh`	Chinese (Mandarin)
`ar-EG`	Arabic (Egypt)
`ar-SA`	Arabic (Saudi Arabia)
`ar-AE`	Arabic (UAE)
`bn`	Bengali
`fr`	French
`de`	German
`hi`	Hindi
`id`	Indonesian
`it`	Italian
`ja`	Japanese
`ko`	Korean
`pt-BR`	Portuguese (Brazil)
`pt-PT`	Portuguese (Portugal)
`ru`	Russian
`es-MX`	Spanish (Mexico)
`es-ES`	Spanish (Spain)
`tr`	Turkish
`vi`	Vietnamese
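
Because the `language` parameter is described as case-insensitive, a client may still want to normalize user input to the spelling used in the table above. The following is a minimal sketch under that assumption; `canonical_language` is a hypothetical helper, and the API may well accept other casings unchanged.

```python
SUPPORTED_LANGUAGES = [
    "auto", "en", "zh", "ar-EG", "ar-SA", "ar-AE", "bn", "fr", "de", "hi",
    "id", "it", "ja", "ko", "pt-BR", "pt-PT", "ru", "es-MX", "es-ES", "tr", "vi",
]

# Lowercased code -> spelling used in the table above.
_CANONICAL = {code.lower(): code for code in SUPPORTED_LANGUAGES}


def canonical_language(code):
    """Map a user-supplied code to its listed spelling, ignoring case."""
    key = code.strip().lower()
    if key not in _CANONICAL:
        raise ValueError(f"unsupported language code: {code!r}")
    return _CANONICAL[key]
```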

Input Parameters

Parameter	Type	Default	Description
`model`	string	`xai/tts-v1`	Model identifier. Required.
`text`	string	—	Text to synthesize. Maximum 15,000 characters. Supports inline speech tags. Required.
`language`	string	`auto`	BCP-47 language code or `"auto"` for automatic detection. Required.
`voice_id`	string	`eve`	Voice identifier. Multilingual voices support all languages; language-optimized voices lock to their native language.
`codec`	string	`mp3`	Output audio codec: `mp3`, `wav`, `pcm`, `mulaw`, `alaw`.
`sample_rate`	integer	`24000`	Sample rate in Hz. Options: `8000`, `16000`, `22050`, `24000`, `44100`, `48000`.
`bit_rate`	integer	`128000`	Bit rate in bps. MP3 only. Options: `32000`, `64000`, `96000`, `128000`, `192000`.
`speed`	number	`1.0`	Playback speed multiplier. Range: `0.7` (slower) to `1.5` (faster).
`text_normalization`	boolean	`false`	When `true`, converts written-form text (numbers, abbreviations, symbols) to spoken form before synthesis.
`optimize_streaming_latency`	integer	`0`	Streaming latency trade-off: `0` = best quality, `1` = lower first-chunk latency, `2` = lowest first-chunk latency.
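
The constraints in the table above can be checked client-side before a request is sent. The following is a sketch only: `build_payload` is a hypothetical helper, not part of any SDK. It makes two assumptions: the 15,000-character limit is measured with `len()` on the raw text (tags included), and `bit_rate` can be left out for non-MP3 codecs, since the table marks it as MP3 only.

```python
ALLOWED_CODECS = {"mp3", "wav", "pcm", "mulaw", "alaw"}
ALLOWED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 44100, 48000}
ALLOWED_BIT_RATES = {32000, 64000, 96000, 128000, 192000}
MAX_TEXT_CHARS = 15_000


def build_payload(text, language="auto", voice_id="eve", codec="mp3",
                  sample_rate=24000, bit_rate=128000, speed=1.0,
                  text_normalization=False, optimize_streaming_latency=0):
    """Check arguments against the parameter table and return a request body."""
    if not text:
        raise ValueError("text is required")
    # Assumption: the limit applies to the raw string length, tags included.
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    if codec not in ALLOWED_CODECS:
        raise ValueError(f"unsupported codec: {codec!r}")
    if sample_rate not in ALLOWED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {sample_rate}")
    if not 0.7 <= speed <= 1.5:
        raise ValueError("speed must be between 0.7 and 1.5")
    if optimize_streaming_latency not in (0, 1, 2):
        raise ValueError("optimize_streaming_latency must be 0, 1, or 2")

    payload = {
        "model": "xai/tts-v1",
        "text": text,
        "language": language,
        "voice_id": voice_id,
        "codec": codec,
        "sample_rate": sample_rate,
        "speed": speed,
        "text_normalization": text_normalization,
        "optimize_streaming_latency": optimize_streaming_latency,
    }
    # Assumption: bit_rate is only meaningful for MP3, so it is omitted otherwise.
    if codec == "mp3":
        if bit_rate not in ALLOWED_BIT_RATES:
            raise ValueError(f"unsupported bit_rate: {bit_rate}")
        payload["bit_rate"] = bit_rate
    return payload
```

The returned dictionary can be passed as the JSON body of the generation request.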

Audio Format Recommendations

Use Case	Codec	Sample Rate	Bit Rate
Standard playback (default)	`mp3`	24,000 Hz	128,000 bps
Studio / high-fidelity	`mp3` or `wav`	44,100 or 48,000 Hz	192,000 bps (`mp3` only)
Telephony / IVR	`mulaw` or `alaw`	8,000 Hz	—
Streaming (web)	`mp3`	22,050 Hz	64,000 bps
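
One way to apply these recommendations is a small preset table merged into the request body. This is a sketch: the preset names and the `apply_preset` helper are hypothetical, and the studio presets pick 48,000 Hz where the table allows either 44,100 or 48,000 Hz. Presets for codecs other than MP3 leave out `bit_rate`, on the assumption that the field can be omitted when it does not apply.

```python
# Presets transcribed from the recommendations table; the names are hypothetical.
FORMAT_PRESETS = {
    "standard": {"codec": "mp3", "sample_rate": 24000, "bit_rate": 128000},
    "studio_mp3": {"codec": "mp3", "sample_rate": 48000, "bit_rate": 192000},
    "studio_wav": {"codec": "wav", "sample_rate": 48000},
    "telephony_mulaw": {"codec": "mulaw", "sample_rate": 8000},
    "telephony_alaw": {"codec": "alaw", "sample_rate": 8000},
    "web_streaming": {"codec": "mp3", "sample_rate": 22050, "bit_rate": 64000},
}

_FORMAT_KEYS = ("codec", "sample_rate", "bit_rate")


def apply_preset(payload, name):
    """Return a copy of `payload` with its audio-format fields replaced by a preset."""
    # Drop any existing format fields so a stale bit_rate cannot leak through.
    merged = {k: v for k, v in payload.items() if k not in _FORMAT_KEYS}
    merged.update(FORMAT_PRESETS[name])
    return merged
```

For example, `apply_preset(body, "telephony_mulaw")` yields a body with `codec` set to `mulaw`, `sample_rate` set to `8000`, and no `bit_rate` field, leaving the original dictionary unchanged.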

Inline Speech Tags

Speech tags let you control delivery at a granular level within the text field.

Instant Tags

Insert at any point in the text to trigger an immediate vocal event.

Tag	Effect
`[pause]`	Brief pause
`[long-pause]`	Extended pause
`[hum-tune]`	Soft humming sound
`[laugh]`	Full laugh
`[chuckle]`	Short chuckle
`[giggle]`	Light giggle
`[cry]`	Crying sound
`[tsk]`	Disapproving tsk
`[tongue-click]`	Tongue click
`[lip-smack]`	Lip-smack sound
`[breath]`	Audible breath
`[inhale]`	Inhalation
`[exhale]`	Exhalation
`[sigh]`	Sigh

Wrapping Style Tags

Wrap text to apply a delivery style to the enclosed span.

Tag	Effect
`<soft>…</soft>`	Softer, quieter delivery
`<whisper>…</whisper>`	Whispered speech
`<loud>…</loud>`	Louder, more projected delivery
`<build-intensity>…</build-intensity>`	Gradually increasing intensity
`<decrease-intensity>…</decrease-intensity>`	Gradually decreasing intensity
`<higher-pitch>…</higher-pitch>`	Raised pitch
`<lower-pitch>…</lower-pitch>`	Lowered pitch
`<slow>…</slow>`	Slower delivery
`<fast>…</fast>`	Faster delivery
`<sing-song>…</sing-song>`	Melodic, sing-song pattern
`<singing>…</singing>`	Full singing mode
`<laugh-speak>…</laugh-speak>`	Spoken with laughter mixed in
`<emphasis>…</emphasis>`	Stressed emphasis on words

Example:

"Welcome back! [pause] <whisper>Just between us</whisper>, the sale ends tonight. [laugh] Don't miss it!"

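Because tags are plain text inside the `text` field, a typo such as an unclosed `<whisper>` is easy to miss. The sketch below lints a string against the two tag tables. The `check_tags` helper is hypothetical, and it assumes wrapping tags must be properly nested; the page does not state whether nesting or overlapping spans are supported.

```python
import re

INSTANT_TAGS = {
    "pause", "long-pause", "hum-tune", "laugh", "chuckle", "giggle", "cry",
    "tsk", "tongue-click", "lip-smack", "breath", "inhale", "exhale", "sigh",
}
WRAPPING_TAGS = {
    "soft", "whisper", "loud", "build-intensity", "decrease-intensity",
    "higher-pitch", "lower-pitch", "slow", "fast", "sing-song", "singing",
    "laugh-speak", "emphasis",
}

# Matches [name] instant tags and <name> or </name> wrapping tags.
_TAG_RE = re.compile(r"\[([a-z-]+)\]|<(/?)([a-z-]+)>")


def check_tags(text):
    """Return a list of problems found in the inline speech tags of `text`."""
    problems = []
    open_tags = []  # stack of wrapping tags that are currently open
    for match in _TAG_RE.finditer(text):
        instant, closing, wrap = match.groups()
        if instant is not None:
            if instant not in INSTANT_TAGS:
                problems.append(f"unknown instant tag [{instant}]")
        elif wrap not in WRAPPING_TAGS:
            problems.append(f"unknown wrapping tag <{closing}{wrap}>")
        elif not closing:
            open_tags.append(wrap)
        elif open_tags and open_tags[-1] == wrap:
            open_tags.pop()
        else:
            problems.append(f"unexpected closing tag </{wrap}>")
    problems.extend(f"unclosed tag <{name}>" for name in open_tags)
    return problems
```

An empty list means no problems were found; the example sentence above passes this check. Bracketed words that are not tags, such as `[note]`, are reported as unknown instant tags, so treat the result as a warning rather than a hard error.
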
Rate Limits

Limit	Value
Requests per minute (RPM)	3,000
Requests per second (RPS)	50
Concurrent sessions per team	100
Maximum characters per request	15,000
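
Text longer than the per-request character limit has to be split across several requests. The sketch below shows one possible approach: a hypothetical `split_text` helper that prefers sentence boundaries and hard-splits only when a single sentence exceeds the limit. It assumes the limit is measured with `len()` on the raw string, and it does not keep wrapping tags balanced across chunks.

```python
import re

MAX_CHARS = 15_000


def split_text(text, max_chars=MAX_CHARS):
    """Split `text` into chunks of at most `max_chars`, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk then becomes its own request, so a long document also counts several times against the per-minute and per-second request limits.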

Pricing

Pricing is based on the number of characters in the input text field.

SKU

SKU	Description	Unit Price
`sku_per_1k_chars`	Per 1,000 characters synthesized	$0.015

Formula

cost = countChars(text) / 1000 × sku_per_1k_chars

Where countChars(text) counts the total number of characters in the input text (including spaces and punctuation, excluding inline speech tags consumed by the model).

The effective rate is $15.00 per 1,000,000 characters.

Examples

Input Length	Characters	Cost
Short sentence	100 chars	$0.0015
One paragraph	500 chars	$0.0075
Article excerpt	2,000 chars	$0.0300
Maximum single request	15,000 chars	$0.2250
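
The formula and the examples above can be reproduced with a short estimator. This is a sketch under stated assumptions: `count_chars` and `estimate_cost` are hypothetical helpers, and tags are excluded by simply deleting `[name]` and `<name>` or `</name>` patterns, which may not match exactly how billing treats tags or the whitespace around them. `Decimal` is used to avoid floating-point rounding in the amounts.

```python
import re
from decimal import Decimal

PRICE_PER_1K_CHARS = Decimal("0.015")  # sku_per_1k_chars from the SKU table

# Assumption: anything shaped like an inline speech tag is not billed.
_TAG_RE = re.compile(r"\[[a-z-]+\]|</?[a-z-]+>")


def count_chars(text):
    """Count characters in `text` after removing inline speech tags."""
    return len(_TAG_RE.sub("", text))


def estimate_cost(text):
    """Estimate the cost in USD: countChars(text) / 1000 * sku_per_1k_chars."""
    return Decimal(count_chars(text)) / 1000 * PRICE_PER_1K_CHARS
```

For a 2,000-character input without tags, `estimate_cost` returns a value equal to `Decimal("0.03")`, matching the article-excerpt row.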

Custom cloned voices are billed at the same per-character rate as built-in voices; there is no additional charge for using a `voice_id` from the Voice Library.

Enterprise Features

SOC 2 Type II certified
HIPAA eligibility with Business Associate Agreement (BAA)
GDPR compliance with data residency options
SAML SSO and role-based access control (RBAC)
Multi-region infrastructure with custom SLAs available
No data retention: audio is never stored or used for model training
