
Model Overview

MiniMax offers multiple text models to meet different scenario requirements. MiniMax-M2.5 achieves new SOTA results on benchmarks for programming, tool calling and search, office productivity, and other scenarios, while MiniMax-M2 is built for efficient coding and agent workflows.

Supported Models

| Model Name | Context Window | Description |
| --- | --- | --- |
| MiniMax-M2.5 | 204,800 | Peak performance, ultimate value, built to master the complex (output speed approximately 60 tps) |
| MiniMax-M2.5-highspeed | 204,800 | Same performance as M2.5, faster and more agile (output speed approximately 100 tps) |
| MiniMax-M2.1 | 204,800 | Powerful multi-language programming capabilities with a comprehensively enhanced programming experience (output speed approximately 60 tps) |
| MiniMax-M2.1-highspeed | 204,800 | Faster and more agile (output speed approximately 100 tps) |
| MiniMax-M2 | 204,800 | Agentic capabilities, advanced reasoning |
For details on how tps (Tokens Per Second) is calculated, please refer to FAQ > About APIs.
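As a rough illustration only (the authoritative definition is in FAQ > About APIs, which may differ), output speed can be estimated as output tokens divided by generation time:

```python
def output_tps(output_tokens: int, elapsed_seconds: float) -> float:
    """Estimate output speed in tokens per second (tps)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return output_tokens / elapsed_seconds

# e.g. 1,200 output tokens generated over 20 seconds
print(output_tps(1200, 20.0))  # 60.0
```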

MiniMax M2.5 Key Highlights

M2.5 has been trained on more than 10 programming languages (including Go, C, C++, TypeScript, Rust, Kotlin, Python, Java, JavaScript, PHP, Lua, Dart, and Ruby) across hundreds of thousands of real-world environments. The model has developed native spec-driven behavior: before writing code, it proactively decomposes functionality, structure, and UI design from an architect's perspective, enabling comprehensive upfront planning.
M2.5 deeply integrates real-world needs and tacit knowledge from experts in finance, law, and social sciences, building professional evaluation systems and cost monitoring frameworks. In advanced office scenarios such as Word, PPT, and financial modeling, it delivers high-quality, quantifiable, and scalable professional output.
With 100 tps inference speed, reinforcement-learning-optimized complex-task decomposition, and improved token efficiency, M2.5 significantly reduces end-to-end time for complex tasks: on SWE-Bench Verified, average completion time dropped from 31.3 minutes to 22.8 minutes, a 37% speedup that places it in the efficiency range of mainstream top-tier models.
M2.5 delivers high-performance output at highly competitive pricing—continuous operation at 100TPS costs only about $1 per hour, with even lower costs for the 50TPS version. This makes long-term, multi-agent, year-round scaled deployment a reality, ushering agents into an era of “economic sustainability.”
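The cost claim above can be sanity-checked with back-of-the-envelope arithmetic. The per-token price below is not an official rate; it is merely what the document's own "100 tps for about $1/hour" figure would imply:

```python
# Output tokens generated per hour at a sustained 100 tps
tps = 100
tokens_per_hour = tps * 3600          # 360,000 output tokens per hour
cost_per_hour = 1.00                  # ~$1/hour, per the claim above

# Implied output price in dollars per million tokens (derived, hypothetical)
implied_price_per_m = cost_per_hour / tokens_per_hour * 1_000_000
print(tokens_per_hour, round(implied_price_per_m, 2))  # 360000 2.78
```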
For more model details, please refer to MiniMax M2.5

Calling Example

1. Install Anthropic SDK (Recommended)

pip install anthropic
2. Set Environment Variables

export ANTHROPIC_BASE_URL=https://api.minimax.io/anthropic
export ANTHROPIC_API_KEY=${YOUR_API_KEY}
3. Call MiniMax-M2.5

Python
import anthropic

# Reads ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY from the environment
client = anthropic.Anthropic()

message = client.messages.create(
    model="MiniMax-M2.5",
    max_tokens=1000,
    system="You are a helpful assistant.",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Hi, how are you?"
                }
            ]
        }
    ]
)

for block in message.content:
    if block.type == "thinking":
        print(f"Thinking:\n{block.thinking}\n")
    elif block.type == "text":
        print(f"Text:\n{block.text}\n")
4. Example Output

{
  "thinking": "The user is just greeting me casually. I should respond in a friendly, professional manner.",
  "text": "Hi there! I'm doing well, thanks for asking. I'm ready to help you with whatever you need today—whether it's coding, answering questions, brainstorming ideas, or just chatting. What can I do for you?"
}
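If you need the structured form shown above rather than printed output, the response blocks can be folded into a dict. This is a self-contained sketch: the `Block` dataclass is a minimal stand-in for the SDK's content-block objects, not the SDK's own type:

```python
from dataclasses import dataclass

@dataclass
class Block:
    """Minimal stand-in for an SDK content block (illustration only)."""
    type: str
    thinking: str = ""
    text: str = ""

def collect(blocks) -> dict:
    """Fold thinking/text blocks into the shape shown in the example output."""
    out = {}
    for block in blocks:
        if block.type == "thinking":
            out["thinking"] = block.thinking
        elif block.type == "text":
            out["text"] = block.text
    return out

result = collect([
    Block(type="thinking", thinking="Casual greeting."),
    Block(type="text", text="Hi there!"),
])
print(result)  # {'thinking': 'Casual greeting.', 'text': 'Hi there!'}
```

The same `collect` function works on the real `message.content` list, since SDK blocks expose the same `type`, `thinking`, and `text` attributes.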

API Reference


Contact Us

If you encounter any issues while using MiniMax models:
  • Contact our technical support team through official channels such as email [email protected]
  • Submit an issue on our GitHub repository