Create Async TTS Task

Creates an asynchronous long-text TTS task. The text to synthesize can be supplied directly or through a text file ID.

POST /v1/audio/speech/tasks

Authorization

Authorization string header Required
HTTP: Bearer Auth
  • Security Scheme Type: http
  • HTTP Authorization Scheme: Bearer API_key. Used to verify account information; the key can be viewed under Project Management > API Key.

Request Header

Content-Type enum<string> Default: application/json Required

The media type of the request body. Set it to application/json so that the request data is parsed correctly.

Available options: application/json
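
The two required headers can be assembled as in the following minimal sketch. The helper name build_headers is hypothetical and not part of the API; only the header names and values come from the descriptions above.

```python
def build_headers(api_key: str) -> dict:
    """Return the two headers this endpoint requires."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        # Bearer scheme, as described in the Authorization section.
        "Authorization": f"Bearer {api_key}",
        # The only media type listed for the request body.
        "Content-Type": "application/json",
    }


headers = build_headers("YOUR_API_KEY")  # placeholder key
```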

Request Body application/json

model string Required

Model code. Allowed values: u2-tts, u2-tts-clone

text string

The text to be synthesized. The maximum length is 50,000 characters when model is u2-tts and 20,000 characters when model is u2-tts-clone. Either text or text_file_id is required.

text_file_id long

The ID of the text file (txt) to be synthesized. The maximum length is 50,000 characters when model is u2-tts and 20,000 characters when model is u2-tts-clone. Either text_file_id or text is required.
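
The either/or rule and the per-model limits can be checked on the client before the request is sent. The sketch below is illustrative only: the helper text_source is hypothetical, it assumes that exactly one of the two fields should be sent, and it assumes that Python's len() approximates how the service counts characters. Neither assumption is confirmed by this page.

```python
# Per-model character limits, taken from the field descriptions above.
MAX_CHARS = {"u2-tts": 50_000, "u2-tts-clone": 20_000}


def text_source(model, text=None, text_file_id=None):
    """Return the part of the request body that carries the input text."""
    if model not in MAX_CHARS:
        raise ValueError(f"unknown model: {model!r}")
    # Assumption: exactly one of text / text_file_id should be provided.
    if (text is None) == (text_file_id is None):
        raise ValueError("provide exactly one of text or text_file_id")
    if text is not None:
        # Assumption: len() is close to the service's character count.
        if len(text) > MAX_CHARS[model]:
            raise ValueError(
                f"text exceeds {MAX_CHARS[model]} characters for {model}"
            )
        return {"text": text}
    return {"text_file_id": int(text_file_id)}
```

For example, text_source("u2-tts", text="Hello") yields a dictionary with only the text key, while passing both arguments raises ValueError.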

voice_setting object Required

Basic voice settings

voice_setting.voice_id string

System or cloned voice ID. It can be obtained through the query available voices API.

voice_setting.speed integer

Speed range [0.5, 2], default 1.0

voice_setting.volume integer

voice_setting.pitch integer

Pitch range [-12, 12], default 0

voice_setting.bright integer

voice_setting.emotion string

Emotion of the synthesized speech. Allowed values: happy, angry, old, robot, slow, depressed, whisper, fast, loudly. Currently supported only by the chenyu speaker.
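
A client-side check of the documented ranges might look like the sketch below. The helper build_voice_setting is hypothetical. It covers only the fields whose ranges are stated above; volume and bright are left out because this page gives no range for them. The speed value is passed through as a plain number, since the documented range includes 0.5.

```python
# Emotion values listed in the voice_setting.emotion description.
EMOTIONS = {
    "happy", "angry", "old", "robot", "slow",
    "depressed", "whisper", "fast", "loudly",
}


def build_voice_setting(voice_id, speed=1.0, pitch=0, emotion=None):
    """Validate the documented ranges and return a voice_setting object."""
    if not 0.5 <= speed <= 2:
        raise ValueError("speed must be within [0.5, 2]")
    if not -12 <= pitch <= 12:
        raise ValueError("pitch must be within [-12, 12]")
    setting = {"voice_id": voice_id, "speed": speed, "pitch": pitch}
    if emotion is not None:
        if emotion not in EMOTIONS:
            raise ValueError(f"unsupported emotion: {emotion!r}")
        setting["emotion"] = emotion
    return setting
```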

audio_setting object

Audio output settings

audio_setting.audio_sample_rate integer

Sample rate, enum [8000, 16000, 24000, 32000], default 32000

audio_setting.format string

Output format, enum [mp3, pcm], default mp3

audio_setting.channel integer

Number of channels, enum [1]
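
The enumerations above translate directly into a small validator. This is a sketch only; build_audio_setting is a hypothetical helper, and the default of 1 for channel is an assumption based on 1 being the only listed value.

```python
# Allowed values, copied from the audio_setting field descriptions.
SAMPLE_RATES = (8000, 16000, 24000, 32000)
FORMATS = ("mp3", "pcm")
CHANNELS = (1,)


def build_audio_setting(audio_sample_rate=32000, fmt="mp3", channel=1):
    """Validate the enumerated values and return an audio_setting object."""
    if audio_sample_rate not in SAMPLE_RATES:
        raise ValueError(f"audio_sample_rate must be one of {SAMPLE_RATES}")
    if fmt not in FORMATS:
        raise ValueError(f"format must be one of {FORMATS}")
    if channel not in CHANNELS:
        raise ValueError(f"channel must be one of {CHANNELS}")
    return {
        "audio_sample_rate": audio_sample_rate,
        "format": fmt,
        "channel": channel,
    }
```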

pronunciation_dict object

Custom pronunciation rules

pronunciation_dict.tone string[]

Pronunciation/phonetic replacement rules, example: ["Liangshan/Liang<py>liang2</py>shan"]
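
Judging only from the single example above, a rule appears to be the original text and its replacement joined by a slash. The sketch below encodes that reading. The helper tone_rule is hypothetical, and the source/replacement structure is an inference from the example, not a documented grammar.

```python
def tone_rule(source: str, replacement: str) -> str:
    """Join a source string and its replacement as 'source/replacement'."""
    # Assumption: the first slash separates the two parts, so the source
    # text itself must not contain one.
    if "/" in source:
        raise ValueError("source text must not contain '/'")
    return f"{source}/{replacement}"


pronunciation_dict = {
    "tone": [tone_rule("Liangshan", "Liang<py>liang2</py>shan")],
}
```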

language_boost string

Language boost mode, default auto
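
Putting the request fields together, a complete call could be prepared as sketched below using only the Python standard library. The host in BASE_URL is a placeholder, because this page does not state the service's base URL. YOUR_API_KEY and YOUR_VOICE_ID are placeholders as well, and build_task_request is a hypothetical helper. The request is constructed but not sent.

```python
import json
import urllib.request

# Hypothetical host: the source page does not state the base URL.
BASE_URL = "https://api.example.com"


def build_task_request(api_key, body, base_url=BASE_URL):
    """Build (but do not send) the POST request that creates a TTS task."""
    return urllib.request.Request(
        url=base_url + "/v1/audio/speech/tasks",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


body = {
    "model": "u2-tts",
    "text": "Hello, this is a long-text synthesis request.",
    "voice_setting": {"voice_id": "YOUR_VOICE_ID", "speed": 1.0, "pitch": 0},
    "audio_setting": {"audio_sample_rate": 32000, "format": "mp3", "channel": 1},
    "language_boost": "auto",
}
request = build_task_request("YOUR_API_KEY", body)
# To send it: urllib.request.urlopen(request)
```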

Response Body Structure

task_id string

Unique ID of the current async synthesis task

file_id long

Audio file ID returned when the task is successfully created

usage_characters integer

Number of characters consumed

base_resp object

base_resp.status_code integer

Request status code; 0 means the request succeeded

base_resp.status_msg string

Status description; the value success means the request succeeded
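
A response with this structure could be handled as in the sketch below. The helper parse_task_response is hypothetical, and the sample payload uses made-up values purely to show the shape. Treating any non-zero status_code as a failure follows from "0 = normal" but is otherwise an assumption.

```python
import json


def parse_task_response(raw: str) -> dict:
    """Check base_resp and return the task fields from a JSON response."""
    payload = json.loads(raw)
    base = payload.get("base_resp") or {}
    code = base.get("status_code")
    # Assumption: anything other than 0 is treated as a failure.
    if code != 0:
        raise RuntimeError(
            f"task creation failed: {code} {base.get('status_msg')}"
        )
    return {
        "task_id": payload["task_id"],
        "file_id": payload.get("file_id"),
        "usage_characters": payload.get("usage_characters"),
    }


# Illustrative payload only; the values are invented.
sample = json.dumps({
    "task_id": "example-task-id",
    "file_id": 1234567890,
    "usage_characters": 45,
    "base_resp": {"status_code": 0, "status_msg": "success"},
})
task = parse_task_response(sample)
```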