
Models Overview
Text
| Models | Description | Features |
|---|---|---|
| MiniMax-M2 | • Context Length: 200k tokens • Maximum Output: 128k tokens (including CoT) | • Agentic capabilities • Function calling • Advanced reasoning • Real-time streaming |
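The two limits in the table interact when sizing a request. The sketch below is a minimal illustration, not part of any MiniMax SDK: it assumes "k" means 1,000 tokens and that the prompt and the generated output (including CoT) share the 200k context window, neither of which the table states explicitly. The function name `max_output_budget` is hypothetical.

```python
# Limits read from the table above. Treating "k" as 1,000 is an assumption.
CONTEXT_LENGTH = 200_000  # total context window, in tokens
MAX_OUTPUT = 128_000      # maximum output, including chain-of-thought tokens


def max_output_budget(prompt_tokens: int) -> int:
    """Largest output budget that still fits.

    Assumes prompt and output share one context window (not confirmed
    by the table; check the API documentation before relying on it).
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_LENGTH - prompt_tokens
    # The budget is capped by the output limit and by the remaining window.
    return max(0, min(MAX_OUTPUT, remaining))


print(max_output_budget(50_000))   # 128000: the output cap is the binding limit
print(max_output_budget(150_000))  # 50000: the context window is the binding limit
```

Under these assumptions, any prompt longer than 72k tokens reduces the usable output budget below the 128k maximum.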
Audio
| Models | Description | Features |
|---|---|---|
| speech-2.6-hd | • Ultimate Similarity • Ultra-High Quality | • 40 languages supported • 7 emotions supported • specified languages and dialects supported |
| speech-2.6-turbo | • Ultimate Value • Low latency | • 40 languages supported • 7 emotions supported • specified languages and dialects supported |
| speech-02-hd | • Stronger replication similarity • High quality voice generation | • 24 languages supported • 7 emotions supported • specified languages and dialects supported |
| speech-02-turbo | • Superior rhythm and stability • Low latency | • 24 languages supported • 7 emotions supported • specified languages and dialects supported |
Video
| Models | Description | Resolution & Duration | FPS |
|---|---|---|---|
| MiniMax Hailuo 2.3 | • Text to Video & Image to Video • SOTA instruction following • Extreme physics mastery | • 1080p 6s • 768p 6s, 10s | 24 fps |
| MiniMax Hailuo 2.3Fast | • Image to Video • Extreme physics mastery • Value and Efficiency | • 1080p 6s • 768p 6s, 10s | 24 fps |
| MiniMax Hailuo 02 | • Text to Video & Image to Video • SOTA instruction following • Extreme physics mastery | • 1080p 6s • 768p 6s, 10s • 512p 6s, 10s | 24 fps |
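The resolution and duration combinations in the table can be encoded as a lookup for validating a request before it is submitted. This is a minimal sketch under stated assumptions: the dictionary keys are the display names from the table, not necessarily the identifiers an API expects, and `is_supported` and `frame_count` are hypothetical helpers. The frame count assumes a clip contains exactly duration × 24 frames.

```python
# Supported durations (seconds) per resolution, transcribed from the table above.
# Keys are the table's display names; API model identifiers may differ.
SUPPORTED = {
    "MiniMax Hailuo 2.3": {"1080p": (6,), "768p": (6, 10)},
    "MiniMax Hailuo 2.3Fast": {"1080p": (6,), "768p": (6, 10)},
    "MiniMax Hailuo 02": {"1080p": (6,), "768p": (6, 10), "512p": (6, 10)},
}
FPS = 24  # all three models are listed at 24 fps


def is_supported(model: str, resolution: str, seconds: int) -> bool:
    """Return True if the table lists this resolution and duration for the model."""
    return seconds in SUPPORTED.get(model, {}).get(resolution, ())


def frame_count(seconds: int) -> int:
    """Approximate number of frames, assuming exactly FPS frames per second."""
    return seconds * FPS


print(is_supported("MiniMax Hailuo 2.3", "1080p", 10))  # False: 1080p is listed for 6s only
print(is_supported("MiniMax Hailuo 02", "512p", 10))    # True
print(frame_count(6))                                   # 144
```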
Music
| Models | Description | Features |
|---|---|---|
| Music-2.0 | • Text to Music • Enhanced musicality • Natural vocals and smooth melodies | • Human-like performance • Rich emotional expression • Enhanced tone control • Realistic, expressive vocals |