Workflow
The quick cloning feature follows these steps:
- Upload the source audio: Use the File Upload API to upload the audio you want to clone and obtain a file_id.
  - Requirements for uploaded files:
    - Supported formats: mp3, m4a, wav
    - Duration: minimum 10 seconds, maximum 5 minutes
    - File size: up to 20 MB
- Upload example audio (optional): To enhance cloning quality, you can upload an example audio file via the File Upload API and obtain a file_id. Include this file_id in clone_prompt under prompt_audio.
  - Requirements for example files:
    - Supported formats: mp3, m4a, wav
    - Duration: less than 8 seconds
    - File size: up to 20 MB
- Call the cloning API: Use the obtained file_id and a custom voice_id as input parameters to call the Voice Clone API to clone the voice.
- Use the cloned voice: With the generated voice_id, you can call the speech synthesis API as needed.
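The file requirements listed in the steps above can be checked locally before uploading, which avoids a wasted upload. The following is a minimal sketch, not part of the API: the function name `check_audio`, the `role` parameter, and the reading of "20 MB" as 20 * 1024 * 1024 bytes are all assumptions made for illustration. The caller supplies the duration, since measuring it requires an audio library.

```python
from typing import List

# Limits taken from the requirements listed above.
SUPPORTED_FORMATS = {"mp3", "m4a", "wav"}
MAX_SIZE_BYTES = 20 * 1024 * 1024  # assumption: "20 MB" read as 20 MiB
SOURCE_MIN_SECONDS = 10.0
SOURCE_MAX_SECONDS = 5 * 60.0
EXAMPLE_MAX_SECONDS = 8.0  # example audio must be strictly shorter than this


def check_audio(filename: str, duration_s: float, size_bytes: int,
                role: str = "source") -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable.

    `role` is a hypothetical switch: "source" for the audio to clone,
    "example" for the optional example (prompt) audio.
    """
    if role not in ("source", "example"):
        raise ValueError(f"unknown role: {role!r}")

    problems: List[str] = []

    # Format check based on the file extension only.
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {ext or '(none)'}")

    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 20 MB")

    if role == "source":
        if not SOURCE_MIN_SECONDS <= duration_s <= SOURCE_MAX_SECONDS:
            problems.append("source audio must be 10 seconds to 5 minutes long")
    else:
        if not duration_s < EXAMPLE_MAX_SECONDS:
            problems.append("example audio must be shorter than 8 seconds")

    return problems
```

For instance, `check_audio("voice.wav", 42.0, 1_000_000)` returns an empty list, while an 8-second example file fails the duration check because the limit is strict.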
Process Examples
1. Upload Source Audio
2. Upload Example Audio
3. Clone the Voice
Full Example
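The sketch below strings the workflow steps together: upload the source audio, optionally upload the example audio, then call the cloning API. It deliberately contains no HTTP code, because the endpoint URLs and authentication are not given in this section. Instead, `upload_file` and `call_clone_api` are hypothetical callables that you would implement against the File Upload API and Voice Clone API. Only the field names `file_id`, `voice_id`, `clone_prompt`, and `prompt_audio` come from the text; the exact nesting of the request body is an assumption.

```python
from typing import Any, Callable, Dict, Optional


def build_clone_request(source_file_id: Any, voice_id: str,
                        example_file_id: Optional[Any] = None) -> Dict[str, Any]:
    """Assemble the clone request body (assumed shape, see lead-in)."""
    body: Dict[str, Any] = {"file_id": source_file_id, "voice_id": voice_id}
    if example_file_id is not None:
        # The example audio's file_id goes in clone_prompt under prompt_audio.
        body["clone_prompt"] = {"prompt_audio": example_file_id}
    return body


def quick_clone(upload_file: Callable[[str], Any],
                call_clone_api: Callable[[Dict[str, Any]], Any],
                source_path: str,
                voice_id: str,
                example_path: Optional[str] = None) -> str:
    """Run the workflow and return the voice_id to use for speech synthesis.

    upload_file(path) -> file_id and call_clone_api(body) are hypothetical
    hooks supplied by the caller; they are not functions of any real SDK.
    """
    # Step 1: upload the audio to be cloned.
    source_file_id = upload_file(source_path)

    # Step 2 (optional): upload the example audio.
    example_file_id = upload_file(example_path) if example_path else None

    # Step 3: call the cloning API with the file_id(s) and the custom voice_id.
    call_clone_api(build_clone_request(source_file_id, voice_id, example_file_id))

    # Step 4: the same voice_id is then passed to the speech synthesis API.
    return voice_id
```

Passing the two API calls in as functions keeps the sequence easy to test with stand-ins before any real request is sent.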
Recommended Reading
- Voice Cloning: Use this API for rapid voice cloning.
- Synchronous Text-to-Speech Guide (WebSocket): Synchronous TTS allows real-time text-to-speech synthesis, handling up to 10,000 characters per request.
- Pricing: Detailed information on model pricing and API packages.
- Rate Limits: Rate limits are restrictions that our API imposes on the number of times a user or client can access our services within a specified period of time.