POST /v1/music_cover_preprocess
Music Cover Preprocess
curl --request POST \
  --url https://api.minimax.io/v1/music_cover_preprocess \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "music-cover",
  "audio_url": "https://example.com/song.mp3"
}
'
{
  "cover_feature_id": "a1b2c3d4e5f67890abcdef1234567890",
  "formatted_lyrics": "[Verse 1]\nFirst line of the song\nSecond line continues\n\n[Chorus]\nThis is the chorus\nSinging out loud",
  "structure_result": "{\"num_segments\":4,\"segments\":[{\"start\":0,\"end\":15.5,\"label\":\"intro\"},{\"start\":15.5,\"end\":45.2,\"label\":\"verse\"},{\"start\":45.2,\"end\":75.0,\"label\":\"chorus\"},{\"start\":75.0,\"end\":90.0,\"label\":\"outro\"}]}",
  "audio_duration": 90,
  "trace_id": "061e5f144eb7f10b1fdde81126e24f91",
  "base_resp": {
    "status_code": 0,
    "status_msg": "success"
  }
}
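
For readers who prefer Python to curl, the sample request above can be sketched with the standard library only. This is a minimal sketch, not an official client: the function name `build_preprocess_request` is hypothetical, and the token is a placeholder you would replace with your own key. The function only constructs the request; sending it is left to the caller.

```python
import json
import urllib.request

# Endpoint taken from the curl sample above.
PREPROCESS_URL = "https://api.minimax.io/v1/music_cover_preprocess"


def build_preprocess_request(token: str, audio_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl sample.

    Hypothetical helper for illustration only.
    """
    body = {"model": "music-cover", "audio_url": audio_url}
    return urllib.request.Request(
        PREPROCESS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen` and decode the response body with `json.loads`.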

Authorizations

Authorization
string
header
required

HTTP Bearer authentication. Send the header as Authorization: Bearer <token>.

Headers

Content-Type
enum<string>
default:application/json
required

The media type of the request body. Must be set to application/json to ensure the data is sent in JSON format.

Available options:
application/json

Body

application/json
model
enum<string>
required

Model name. Must be music-cover.

Available options:
music-cover
audio_url
string

URL of the reference audio. Exactly one of audio_url or audio_base64 must be provided.

Reference audio constraints:

  • Duration: 6 seconds to 6 minutes
  • Size: max 50 MB
  • Format: common audio formats (mp3, wav, flac, etc.)
audio_base64
string

Base64-encoded reference audio. Exactly one of audio_url or audio_base64 must be provided.

Reference audio constraints:

  • Duration: 6 seconds to 6 minutes
  • Size: max 50 MB
  • Format: common audio formats (mp3, wav, flac, etc.)
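
The "exactly one of audio_url or audio_base64" rule and the size limit can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: `build_preprocess_payload` is a hypothetical helper, "50 MB" is read as 50 × 1024 × 1024 bytes, and the Base64 value is sent as a plain string (this page does not say whether a data-URI prefix is expected). Duration and format are not checked here.

```python
import base64
from typing import Optional

# Assumption: "50 MB" is interpreted as 50 MiB; the page does not specify.
MAX_AUDIO_BYTES = 50 * 1024 * 1024


def build_preprocess_payload(
    audio_url: Optional[str] = None,
    audio_bytes: Optional[bytes] = None,
) -> dict:
    """Return a request body with exactly one audio source set."""
    # Reject both "neither given" and "both given".
    if (audio_url is None) == (audio_bytes is None):
        raise ValueError("provide exactly one of audio_url or audio_bytes")

    payload = {"model": "music-cover"}
    if audio_url is not None:
        payload["audio_url"] = audio_url
    else:
        if len(audio_bytes) > MAX_AUDIO_BYTES:
            raise ValueError("reference audio exceeds the 50 MB limit")
        # Assumption: plain Base64 text, no data-URI prefix.
        payload["audio_base64"] = base64.b64encode(audio_bytes).decode("ascii")
    return payload
```

Calling `build_preprocess_payload(audio_url="https://example.com/song.mp3")` yields the same body as the curl sample.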

Response

200 - application/json
cover_feature_id
string

Unique identifier for the preprocessed audio features. Valid for 24 hours. For two-step cover generation, pass it as the cover_feature_id parameter of the Music Generation API.

Identical audio content returns the same cover_feature_id (MD5-based deduplication).
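
Because the identifier is only valid for 24 hours, a client that stores it may want to track its age. The sketch below assumes the 24-hour window starts when the preprocess response is received; this page does not state when the clock starts, so treat the check as conservative guidance rather than a guarantee. `is_feature_id_fresh` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Validity window from the field description above.
FEATURE_ID_TTL = timedelta(hours=24)


def is_feature_id_fresh(obtained_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True while a stored cover_feature_id is younger than 24 hours.

    Assumption: the window is measured from the moment the response arrived.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - obtained_at < FEATURE_ID_TTL
```

If the check fails, calling the preprocess endpoint again with the same audio should return a usable identifier.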

formatted_lyrics
string

Structured lyrics extracted from the reference audio via ASR, formatted with section tags such as [Verse], [Chorus], and [Bridge]. You can modify these lyrics before passing them to the Music Generation API.
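
Editing the lyrics is easier once they are split by section tag. The sketch below assumes the layout seen in the sample response: each tag sits alone on a line in square brackets and sections are separated by a blank line. The helper names `split_lyrics` and `join_lyrics` are hypothetical.

```python
import re

# A section tag is a line consisting only of "[...]", e.g. "[Verse 1]".
SECTION_TAG = re.compile(r"^\[(.+?)\]\s*$")


def split_lyrics(formatted_lyrics: str) -> list:
    """Split tagged lyrics into a list of (tag, [lines]) pairs.

    Lines that appear before the first tag are dropped.
    """
    sections = []
    for raw in formatted_lyrics.splitlines():
        line = raw.strip()
        match = SECTION_TAG.match(line)
        if match:
            sections.append((match.group(1), []))
        elif line and sections:
            sections[-1][1].append(line)
    return sections


def join_lyrics(sections: list) -> str:
    """Rebuild the tagged string, one blank line between sections."""
    return "\n\n".join(
        "\n".join([f"[{tag}]", *lines]) for tag, lines in sections
    )
```

A typical edit is to call `split_lyrics`, change the lines of one section, and pass `join_lyrics(...)` on to the generation request.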

structure_result
string

JSON string containing the song structure analysis result, including segment types (intro, verse, chorus, bridge, outro, inst, silence) and their start/end timestamps in seconds.
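
Since structure_result is a JSON string rather than a nested object, it needs a second decoding step after the response itself is parsed. The sketch below assumes the key names visible in the sample response (`segments`, `start`, `end`, `label`); the helper names are hypothetical.

```python
import json


def parse_structure(structure_result: str) -> list:
    """Decode the structure_result string and return its segment list."""
    return json.loads(structure_result)["segments"]


def spans_for_label(structure_result: str, label: str) -> list:
    """Return (start, end) pairs, in seconds, for segments with the given label."""
    return [
        (seg["start"], seg["end"])
        for seg in parse_structure(structure_result)
        if seg["label"] == label
    ]
```

For example, `spans_for_label(resp["structure_result"], "chorus")` lists where the chorus occurs in the reference audio.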

audio_duration
number<double>

Duration of the reference audio in seconds.

trace_id
string

Unique trace ID for request tracking.

base_resp
object

Status code (status_code) and message (status_msg) for the request.
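
A client will usually want to check base_resp before using the other fields. The sketch below assumes that status_code 0 means success, as in the sample response, and treats any other value as a failure; this page does not list the error codes. `PreprocessError` and `ensure_success` are hypothetical names.

```python
class PreprocessError(Exception):
    """Hypothetical error type carrying the base_resp fields."""

    def __init__(self, status_code, status_msg):
        super().__init__(f"preprocess failed ({status_code}): {status_msg}")
        self.status_code = status_code
        self.status_msg = status_msg


def ensure_success(response_body: dict) -> dict:
    """Return the body unchanged if base_resp reports success, else raise.

    Assumption: status_code == 0 is the only success value.
    """
    base = response_body.get("base_resp") or {}
    code = base.get("status_code")
    if code != 0:
        raise PreprocessError(code, base.get("status_msg", ""))
    return response_body
```

Logging trace_id alongside the raised error makes it easier to follow up on a failed request.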