
Models Overview

Text

Models        Description                                      Features
MiniMax-M2    • Context Length: 200k tokens                    • Agentic capabilities
              • Maximum Output: 128k tokens (including CoT)    • Function calling
                                                               • Advanced reasoning
                                                               • Real-time streaming
MiniMax-M1    • 80K CoT Length                                 • Streaming
              • 1M max input tokens                            • Function calling
              • 80K max output tokens                          • Reasoning
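
The output limits listed for the text models can be captured in a small lookup for pre-flight checks. This is a minimal sketch, not part of any MiniMax SDK: `TEXT_MODEL_LIMITS` and `fits_output_limit` are hypothetical names, and reading "k" and "M" as decimal thousands and millions of tokens is an assumption the source does not confirm.

```python
# Limits transcribed from the text model table.
# Assumption: "k"/"K" = 1,000 tokens and "M" = 1,000,000 tokens.
TEXT_MODEL_LIMITS = {
    "MiniMax-M2": {"context_tokens": 200_000, "max_output_tokens": 128_000},
    "MiniMax-M1": {"max_input_tokens": 1_000_000, "max_output_tokens": 80_000},
}


def fits_output_limit(model: str, requested_output_tokens: int) -> bool:
    """Return True if the requested output size is within the model's listed maximum.

    For MiniMax-M2 the listed maximum includes chain-of-thought (CoT) tokens.
    """
    limits = TEXT_MODEL_LIMITS[model]  # raises KeyError for unlisted models
    return 0 < requested_output_tokens <= limits["max_output_tokens"]
```

For example, `fits_output_limit("MiniMax-M2", 100_000)` passes, while the same request against `"MiniMax-M1"` does not, because its listed maximum output is 80K tokens.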

Audio

Models                      Description                          Features
speech-2.5-hd-preview       • Ultimate Similarity                • 40 languages supported
                            • Ultra-High Quality                 • 7 emotions supported
                                                                 • Specified languages and dialects supported
speech-2.5-turbo-preview    • Ultimate Value                     • 40 languages supported
                            • Low latency                        • 7 emotions supported
                                                                 • Specified languages and dialects supported
speech-02-hd                • Stronger replication similarity    • 24 languages supported
                            • High quality voice generation      • 7 emotions supported
                                                                 • Specified languages and dialects supported
speech-02-turbo             • Superior rhythm and stability      • 24 languages supported
                            • Low latency                        • 7 emotions supported
                                                                 • Specified languages and dialects supported

Video

Models               Description                            Resolution & Duration    FPS
MiniMax Hailuo 02    • Text to Video & Image to Video       • 1080p 6s               24 fps
                     • SOTA instruction following           • 768p 6s, 10s
                     • Extreme physics mastery              • 512p 6s, 10s
T2V-01-Director      • Text to Video                        • 720p 6s                25 fps
                     • Enhanced precision shot control
I2V-01-Director      • Image to Video                       • 720p 6s                25 fps
                     • Enhanced precision shot control
S2V-01               • Image to Video                       • 720p 6s                25 fps
                     • Maintaining Character Consistency
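
The resolution and duration combinations listed for MiniMax Hailuo 02 can be checked before a request is built. The sketch below is illustrative only: `HAILUO_02_OPTIONS`, `HAILUO_02_FPS`, and `frame_count` are hypothetical names, and treating a clip as exactly duration × FPS frames is an assumption, not something the table states.

```python
# Allowed durations (seconds) per resolution for MiniMax Hailuo 02,
# transcribed from the video model table.
HAILUO_02_OPTIONS = {
    "1080p": (6,),
    "768p": (6, 10),
    "512p": (6, 10),
}
HAILUO_02_FPS = 24  # frames per second listed for Hailuo 02


def frame_count(resolution: str, duration_s: int) -> int:
    """Validate a resolution/duration pair and return the nominal frame count.

    Assumption: the clip holds exactly duration_s * HAILUO_02_FPS frames.
    """
    if duration_s not in HAILUO_02_OPTIONS.get(resolution, ()):
        raise ValueError(f"{resolution} at {duration_s}s is not a listed combination")
    return duration_s * HAILUO_02_FPS
```

Under these assumptions, `frame_count("768p", 10)` gives 240 frames, while `frame_count("1080p", 10)` raises `ValueError` because 1080p is listed only at 6 seconds.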