Authorizations
HTTP: Bearer Auth
- Security Scheme Type: http
- HTTP Authorization Scheme: Bearer
Pass the API key as a Bearer token. The API_key can be found in Account Management > API Keys.
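As a minimal sketch of how the Bearer scheme translates into request headers, the helper below builds a header dictionary from an API key. The function name `build_headers` is hypothetical and not part of the API; the `Content-Type` value is the one this page requires for JSON request bodies.

```python
def build_headers(api_key: str) -> dict[str, str]:
    """Return HTTP headers for a JSON request authenticated with a Bearer token."""
    key = api_key.strip() if api_key else ""
    if not key:
        raise ValueError("api_key must be a non-empty string")
    return {
        # Standard HTTP Bearer scheme: "Authorization: Bearer <token>"
        "Authorization": f"Bearer {key}",
        # The request body is sent as JSON
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed to whichever HTTP client you use. Read the key from an environment variable or a secret store rather than hard-coding it.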
Headers
Content-Type: The media type of the request body. Must be set to application/json so that the data is sent in JSON format.
Body
Voice clone request parameters
The file_id of the audio to be cloned, obtained through the File Upload API.
Uploaded files must comply with the following rules:
- Accepted audio formats: mp3, m4a, wav
- Audio duration: at least 10 seconds, no longer than 5 minutes
- File size: no larger than 20 MB
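The upload rules above can be checked locally before calling the File Upload API. The sketch below is illustrative only: `check_clone_audio` is a hypothetical helper, the caller is assumed to already know the clip duration, the format check looks only at the file extension, and 1 MB is assumed to mean 1024 * 1024 bytes.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".mp3", ".m4a", ".wav"}
MIN_DURATION_S = 10            # at least 10 seconds
MAX_DURATION_S = 5 * 60        # no longer than 5 minutes
MAX_SIZE_BYTES = 20 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def check_clone_audio(filename: str, duration_s: float, size_bytes: int) -> list[str]:
    """Return a list of rule violations; an empty list means the file passes."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("format must be mp3, m4a, or wav")
    if not MIN_DURATION_S <= duration_s <= MAX_DURATION_S:
        problems.append("duration must be between 10 seconds and 5 minutes")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file size must not exceed 20 MB")
    return problems
```

Returning every violation at once, rather than raising on the first, lets a caller report all problems with a file in a single pass.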
The voice_id of the cloned voice. Example: "MiniMax001". When defining a custom voice_id, note the following rules:
- Length range: [8, 256]
- Must start with an English letter
- Can contain letters, digits, -, and _
- Cannot end with - or _
- Must not duplicate an existing voice_id, otherwise an error will occur
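The format rules for a custom voice_id can be expressed as a single regular expression. This is a client-side sketch under the assumption that "letters" means ASCII letters; `is_valid_voice_id` is a hypothetical helper name. It cannot check uniqueness, which only the service can verify.

```python
import re

# First character: an English letter.
# Middle: 6 to 254 letters, digits, "-" or "_".
# Last character: a letter or digit (so the ID cannot end with "-" or "_").
# Total length is therefore 8 to 256 characters.
VOICE_ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]{6,254}[A-Za-z0-9]")


def is_valid_voice_id(voice_id: str) -> bool:
    """Check the length, first-character, charset, and last-character rules."""
    return VOICE_ID_PATTERN.fullmatch(voice_id) is not None
```

For example, `is_valid_voice_id("MiniMax001")` passes, while an ID such as `"MiniMax001_"` is rejected because of its trailing underscore.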
Voice cloning parameters. Providing this field helps improve the similarity and stability of the synthesized voice. If used, you must also upload a short sample audio clip (shorter than 8 seconds; supported formats: mp3, m4a, wav) along with its corresponding transcript.
Optional preview text, up to 2000 characters. The cloned voice will be used to read the text, and an audio preview link will be returned. Note: Preview requests are charged based on character count, consistent with T2A pricing.
Specifies which voice synthesis model to use for generating the preview audio. Required when the text field is provided.
Supported values: speech-2.5-hd-preview, speech-2.5-turbo-preview, speech-02-hd, speech-02-turbo, speech-01-hd, speech-01-turbo
Controls whether recognition for specific minority languages and dialects is enhanced. Default is null. If the language type is unknown, set to "auto" and the model will automatically detect it.
Supported values: ['Chinese', 'Chinese,Yue', 'English', 'Arabic', 'Russian', 'Spanish', 'French', 'Portuguese', 'German', 'Turkish', 'Dutch', 'Ukrainian', 'Vietnamese', 'Indonesian', 'Japanese', 'Italian', 'Korean', 'Thai', 'Polish', 'Romanian', 'Greek', 'Czech', 'Finnish', 'Hindi', 'Bulgarian', 'Danish', 'Hebrew', 'Malay', 'Persian', 'Slovak', 'Swedish', 'Croatian', 'Filipino', 'Hungarian', 'Norwegian', 'Slovenian', 'Catalan', 'Nynorsk', 'Tamil', 'Afrikaans', 'auto']
Indicates whether to enable noise reduction.
Indicates whether to enable volume normalization.
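To illustrate how the body parameters fit together, the sketch below assembles a request payload and enforces two rules stated above: preview text is limited to 2000 characters, and a model is required whenever preview text is supplied. `build_clone_payload` is a hypothetical helper. The keys `file_id`, `voice_id`, and `text` are named on this page; the key `model` is an assumption, and the language, noise reduction, and volume normalization options are left out because their field names are not shown here.

```python
import json

PREVIEW_MODELS = {
    "speech-2.5-hd-preview", "speech-2.5-turbo-preview",
    "speech-02-hd", "speech-02-turbo",
    "speech-01-hd", "speech-01-turbo",
}
MAX_PREVIEW_CHARS = 2000


def build_clone_payload(file_id, voice_id, text=None, model=None) -> dict:
    """Assemble the JSON body, validating the preview text and model rules."""
    if model is not None and model not in PREVIEW_MODELS:
        raise ValueError(f"unsupported model: {model}")
    payload = {"file_id": file_id, "voice_id": voice_id}
    if text is not None:
        if len(text) > MAX_PREVIEW_CHARS:
            raise ValueError("preview text must be at most 2000 characters")
        if model is None:
            raise ValueError("a model is required when preview text is provided")
        payload["text"] = text
    if model is not None:
        payload["model"] = model  # assumed key name, not confirmed by this page
    return payload


# Serialize for sending as an application/json request body.
body = json.dumps(build_clone_payload(123, "MiniMax001", "Hello there.", "speech-02-hd"))
```

Because preview requests are billed by character count, validating the text length before sending avoids paying for a request that would be rejected.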






