Text to Speech (T2A)

This API provides synchronous text-to-speech (T2A) generation, supporting up to 10,000 characters per request. The interface is stateless: each call only processes the provided input without involving business logic, and the model does not store any user data.

Key Features

Access to 300+ system voices and custom cloned voices.
Adjustable volume, pitch, speed, and output formats.
Support for proportional audio mixing.
Configurable fixed time intervals.
Multiple audio formats and specifications supported: mp3, pcm, flac, wav (wav is supported only in non-streaming mode).
Support for streaming output.
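
The format constraint in the list above (wav only in non-streaming mode) can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name `check_audio_format` and its parameters are hypothetical, and only the format names and the wav restriction come from this page.

```python
# Formats listed on this page; wav is documented as non-streaming only.
SUPPORTED_FORMATS = {"mp3", "pcm", "flac", "wav"}
NON_STREAMING_ONLY = {"wav"}


def check_audio_format(audio_format: str, streaming: bool) -> str:
    """Return the normalized format name, or raise ValueError.

    Hypothetical client-side helper; the service may apply
    additional rules that are not described on this page.
    """
    fmt = audio_format.strip().lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio format: {audio_format!r}")
    if streaming and fmt in NON_STREAMING_ONLY:
        raise ValueError(f"{fmt} is only available in non-streaming mode")
    return fmt
```

Calling `check_audio_format("wav", streaming=True)` raises `ValueError`, while the same format with `streaming=False` is accepted.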

Typical Use Cases: short text generation, voice chat, online social interactions.
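
Because each request accepts at most 10,000 characters, longer input has to be split on the client side. The sketch below shows one possible way to do that, breaking after sentence-ending punctuation where it can. The helper name `split_for_t2a` is hypothetical, and it assumes the limit is counted in Python string characters (code points); the service may count differently, so a lower `limit` can be passed for safety.

```python
import re

MAX_CHARS = 10_000  # per-request limit stated on this page


def split_for_t2a(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Chunks end after sentence punctuation when possible; a single
    sentence longer than `limit` is cut at the limit.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Sentence-like pieces: text up to a terminator (plus trailing
    # whitespace) or up to the end of the input.
    pieces = re.findall(r".+?(?:[.!?。！？]+\s*|\Z)", text, flags=re.S)
    chunks: list[str] = []
    current = ""
    for piece in pieces:
        # Hard-split any piece that cannot fit in one request.
        while len(piece) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(piece[:limit])
            piece = piece[limit:]
        # Start a new chunk when adding this piece would exceed the limit.
        if len(current) + len(piece) > limit:
            chunks.append(current)
            current = ""
        current += piece
    if current:
        chunks.append(current)
    return chunks
```

Joining the returned chunks reproduces the original text, so each chunk can be sent as a separate request and the resulting audio concatenated in order.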

Supported Models

Model	Description
speech-2.8-hd	Latest HD model, with refined tonal nuance and maximized timbre similarity.
speech-2.8-turbo	Latest Turbo model, with refined tonal nuance and maximized timbre similarity.
speech-2.6-hd	HD model with outstanding prosody and excellent cloning similarity.
speech-2.6-turbo	Turbo model with support for 40 languages.
speech-02-hd	Superior rhythm and stability, with outstanding cloning similarity and sound quality.
speech-02-turbo	Superior rhythm and stability, with enhanced multilingual capabilities and excellent performance.

Available Interfaces

Synchronous speech synthesis provides two interfaces. Choose based on your needs:

Supported Languages

MiniMax speech synthesis models offer robust multilingual capability, supporting 40 widely used languages worldwide. Our goal is to break down language barriers and build a truly global AI model.

Supported Languages
1. Chinese	15. Turkish	28. Malay
2. Cantonese	16. Dutch	29. Persian
3. English	17. Ukrainian	30. Slovak
4. Spanish	18. Thai	31. Swedish
5. French	19. Polish	32. Croatian
6. Russian	20. Romanian	33. Filipino
7. German	21. Greek	34. Hungarian
8. Portuguese	22. Czech	35. Norwegian
9. Arabic	23. Finnish	36. Slovenian
10. Italian	24. Hindi	37. Catalan
11. Japanese	25. Bulgarian	38. Nynorsk
12. Korean	26. Danish	39. Tamil
13. Indonesian	27. Hebrew	40. Afrikaans
14. Vietnamese

Official MCP

MiniMax provides official Model Context Protocol (MCP) server implementations with speech synthesis support:

For detailed usage instructions, see the MiniMax MCP User Guide.

