Model Overview
MiniMax offers multiple text models to meet different scenario requirements. MiniMax-M2.5 matches or sets new SOTA benchmarks in programming, tool calling and search, office productivity, and other scenarios, while MiniMax-M2 is built for efficient coding and Agent workflows.

Supported Models
| Model Name | Context Window | Description |
|---|---|---|
| MiniMax-M2.5 | 204,800 | Peak performance and ultimate value for mastering complex tasks (output speed approximately 60 tps) |
| MiniMax-M2.5-highspeed | 204,800 | Same capabilities as M2.5, faster and more agile (output speed approximately 100 tps) |
| MiniMax-M2.1 | 204,800 | Powerful multi-language programming capabilities with a comprehensively enhanced coding experience (output speed approximately 60 tps) |
| MiniMax-M2.1-highspeed | 204,800 | Faster and more agile variant of M2.1 (output speed approximately 100 tps) |
| MiniMax-M2 | 204,800 | Agentic capabilities and advanced reasoning |
For details on how tps (Tokens Per Second) is calculated, please refer to FAQ > About APIs.
MiniMax M2.5 Key Highlights
Programming: Think and Build Like an Architect
M2.5 has been trained on more than 10 programming languages (including Go, C, C++, TS, Rust, Kotlin, Python, Java, JS, PHP, Lua, Dart, and Ruby) across hundreds of thousands of real-world environments. The model has evolved native Spec behavior: before writing code, it proactively decomposes functionality, structure, and UI design from an architect's perspective, enabling comprehensive upfront planning.
Office Productivity: Deliverable Quality in Word, PPT, and Excel Financial Modeling
M2.5 deeply integrates real-world needs and tacit knowledge from experts in finance, law, and social sciences, building professional evaluation systems and cost monitoring frameworks. In advanced office scenarios such as Word, PPT, and financial modeling, it delivers high-quality, quantifiable, and scalable professional output.
Efficiency: Faster Decomposition, Faster Execution, Faster Delivery
With 100 TPS inference speed, reinforcement learning-optimized complex task decomposition, and improved token efficiency, M2.5 significantly reduces end-to-end time for complex tasks: on SWE-Bench Verified, average completion time decreased from 31.3 minutes to 22.8 minutes, a roughly 37% speedup that places it in the efficiency range of mainstream top-tier models.
Cost: Making Complex Agents Truly Sustainable for Long-Term Operation
M2.5 delivers high-performance output at highly competitive pricing: continuous operation at 100 TPS costs only about $1 per hour, with even lower costs for the 50 TPS version. This makes long-term, multi-agent, year-round scaled deployment a reality, ushering agents into an era of "economic sustainability."
For more model details, please refer to MiniMax M2.5
Calling Example
API Reference
Anthropic API Compatible (Recommended)
Call MiniMax models via Anthropic SDK, supporting streaming output and Interleaved Thinking
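As a minimal sketch of this path, assuming the `anthropic` Python SDK: the `base_url` below is an assumption and should be confirmed against the API Reference, as should the exact model identifier.

```python
# Minimal sketch: calling MiniMax-M2.5 through the Anthropic SDK.
# The base_url is an assumption; verify the real endpoint in the API Reference.
import os


def build_request(prompt: str) -> dict:
    """Assemble the messages.create kwargs for a single-turn request."""
    return {
        "model": "MiniMax-M2.5",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }


if os.environ.get("MINIMAX_API_KEY"):
    from anthropic import Anthropic  # pip install anthropic

    client = Anthropic(
        api_key=os.environ["MINIMAX_API_KEY"],
        base_url="https://api.minimax.io/anthropic",  # assumed endpoint
    )
    resp = client.messages.create(**build_request("Write a haiku about the sea."))
    print(resp.content[0].text)
```

Because the request kwargs are built separately from the client call, the same shape works for streaming by adding `stream=True` per the Anthropic SDK conventions.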
OpenAI API Compatible
Call MiniMax models via OpenAI SDK
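A comparable sketch for the OpenAI-compatible path; again, the `base_url` is an assumption to be confirmed in the API Reference.

```python
# Minimal sketch: calling MiniMax-M2.5 through the OpenAI SDK.
# The base_url is an assumption; verify the real endpoint in the API Reference.
import os


def build_request(prompt: str) -> dict:
    """Assemble the chat.completions.create kwargs for a single-turn request."""
    return {
        "model": "MiniMax-M2.5",
        "messages": [{"role": "user", "content": prompt}],
    }


if os.environ.get("MINIMAX_API_KEY"):
    from openai import OpenAI  # pip install openai

    client = OpenAI(
        api_key=os.environ["MINIMAX_API_KEY"],
        base_url="https://api.minimax.io/v1",  # assumed endpoint
    )
    resp = client.chat.completions.create(**build_request("Write a haiku about the sea."))
    print(resp.choices[0].message.content)
```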
Text Generation
Call text generation API directly via HTTP requests
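For the raw-HTTP path, the following sketch uses only the Python standard library; the endpoint URL and payload schema here are assumptions, so check the Text Generation API Reference for the authoritative shape.

```python
# Minimal sketch: a raw HTTP chat request with only the standard library.
# ENDPOINT and the payload schema are assumptions; see the API Reference.
import json
import os

ENDPOINT = "https://api.minimax.io/v1/chat/completions"  # assumed URL


def build_payload(prompt: str) -> dict:
    """Assemble the JSON body for a single-turn request."""
    return {
        "model": "MiniMax-M2.5",
        "messages": [{"role": "user", "content": prompt}],
    }


if os.environ.get("MINIMAX_API_KEY"):
    import urllib.request

    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_payload("Hello!")).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['MINIMAX_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
        print(body["choices"][0]["message"]["content"])
```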
Using M2.5 in AI Coding Tools
Use M2.5 in Claude Code, Cursor, Cline and other tools
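As one illustrative configuration for Anthropic-compatible tools such as Claude Code, the environment variables below follow the standard Anthropic settings; the URL and model value are assumptions, so confirm them in the tool-specific guide.

```shell
# Point an Anthropic-compatible coding tool at MiniMax (values are assumptions).
export ANTHROPIC_BASE_URL="https://api.minimax.io/anthropic"
export ANTHROPIC_AUTH_TOKEN="<your MiniMax API key>"
export ANTHROPIC_MODEL="MiniMax-M2.5"
```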
Contact Us
If you encounter any issues while using MiniMax models:
- Contact our technical support team through official channels such as email [email protected]
- Submit an issue on our GitHub repository