Use cases: rapid replication of a target timbre (IP voice recreation, voice cloning), where you need to clone a specific voice quickly. The API accepts mono or stereo audio and reproduces speech that matches the timbre of the provided reference file.

Notes
  • Using this API to clone a voice does not immediately incur a cloning fee. The fee is charged the first time you synthesize speech with the cloned voice in a T2A synthesis API (trial preview within this API is excluded).
  • Voices produced via this rapid cloning API are temporary. To keep a cloned voice permanently, call any T2A speech synthesis API with that voice within 168 hours (7 days) of cloning; previews within this API do not count. If the voice is not used within that window, it is deleted.
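
The retention rule in the notes above can be sketched as a small date calculation. This is only an illustration of the 168-hour window, not part of the API: the function names `expiry_deadline` and `is_retained` are hypothetical, and the treatment of the exact boundary instant (a use at precisely 168 hours counts as in time) is an assumption the notes do not confirm.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Retention window stated in the notes: 168 hours (7 days).
RETENTION_WINDOW = timedelta(hours=168)


def expiry_deadline(cloned_at: datetime) -> datetime:
    """Return the latest moment a first T2A synthesis call can still keep the voice."""
    return cloned_at + RETENTION_WINDOW


def is_retained(cloned_at: datetime,
                first_t2a_use: Optional[datetime],
                now: datetime) -> bool:
    """Return True if the cloned voice should still exist at `now`.

    Assumption: a T2A call exactly at the deadline still counts as in time.
    Previews made through the cloning API itself must not be passed as
    `first_t2a_use`, because the notes say they do not count.
    """
    deadline = expiry_deadline(cloned_at)
    if first_t2a_use is not None and first_t2a_use <= deadline:
        return True  # used in time, so the voice is kept permanently
    return now <= deadline  # unused so far: alive only inside the window


cloned = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(expiry_deadline(cloned).isoformat())
```

A caller could use a check like this to schedule a first synthesis request before the deadline, which is also the point at which the cloning fee is charged.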

Supported Models

Model | Description
speech-2.6-hd | Latest HD model with real-time response, intelligent parsing, fluent LoRA voice
speech-2.6-turbo | Latest Turbo model. Ultimate Value, 40 Languages
speech-02-hd | Superior rhythm and stability, with outstanding performance in replication similarity and sound quality.
speech-02-turbo | Superior rhythm and stability, with enhanced multilingual capabilities and excellent performance.
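
A client may want to check a model name against this table before sending a request. The following is a minimal sketch under that assumption: `SUPPORTED_MODELS` and `validate_model` are hypothetical helper names, and how the model name is passed to the API is not shown in this section.

```python
# Model names copied from the Supported Models table above.
SUPPORTED_MODELS = frozenset({
    "speech-2.6-hd",
    "speech-2.6-turbo",
    "speech-02-hd",
    "speech-02-turbo",
})


def validate_model(name: str) -> str:
    """Return `name` unchanged if it is listed in the table, else raise ValueError."""
    if name not in SUPPORTED_MODELS:
        supported = ", ".join(sorted(SUPPORTED_MODELS))
        raise ValueError(f"Unsupported model {name!r}; expected one of: {supported}")
    return name


print(validate_model("speech-2.6-hd"))
```

Failing early on the client side avoids a round trip for a misspelled model name; the list would need updating whenever the table changes.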

Official MCP

MiniMax provides official Model Context Protocol (MCP) server implementations in both Python and JavaScript, with support for voice cloning. For details, see the MiniMax MCP User Guide.