
Which models does Coding Plan support? How do I switch models?

Coding Plan supports the following text models: MiniMax-M2.5, MiniMax-M2.1, MiniMax-M2. The High-Speed subscription also supports the MiniMax-M2.5-highspeed model. To switch models, modify the model parameter in your API calls:
import anthropic

client = anthropic.Anthropic()  # the SDK reads its API key from environment variables by default

message = client.messages.create(
    model="MiniMax-M2.5",  # Switch to other models like MiniMax-M2.1
    max_tokens=1000,
    system="You are a helpful assistant.",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Hi, how are you?"
                }
            ]
        }
    ]
)
All models share the same subscription quota and billing method.

What is the High-Speed subscription? How does it differ from the Standard plan?

The High-Speed subscription is a new plan offered by Coding Plan that provides dedicated support for the MiniMax-M2.5-highspeed model. Differences between MiniMax-M2.5-highspeed and MiniMax-M2.5:
  • Same performance: MiniMax-M2.5-highspeed delivers the same model capability and output quality as MiniMax-M2.5
  • Significantly faster: MiniMax-M2.5-highspeed offers considerably higher inference output speed than MiniMax-M2.5
If you have high requirements for coding tool response speed, the High-Speed subscription is recommended.

Can I upgrade my subscription plan?

Yes. Coding Plan supports upgrading your plan at any time during your subscription period, including upgrading from a Standard plan to a High-Speed plan, or upgrading to a higher tier within the same plan type. You only need to pay the price difference, and the new plan takes effect immediately.
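The "pay only the price difference" rule can be sketched as follows. The plan names and prices in this snippet are hypothetical, purely for illustration; actual Coding Plan pricing is shown on the subscription page:

```python
# Illustrative only: plan names and prices below are hypothetical,
# not actual Coding Plan pricing. They exist to show the
# "pay only the price difference" upgrade rule.
PLAN_PRICES = {"standard": 20, "high_speed": 50}  # hypothetical prices

def upgrade_cost(current_plan: str, new_plan: str) -> int:
    """An upgrade charges only the price difference between the two plans."""
    difference = PLAN_PRICES[new_plan] - PLAN_PRICES[current_plan]
    if difference <= 0:
        raise ValueError("An upgrade must move to a higher-priced plan.")
    return difference

print(upgrade_cost("standard", "high_speed"))  # 30: the difference, not the full 50
```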

How to check Coding Plan usage?

You can check your Coding Plan usage in two ways:
Method 1: Subscription management page. Visit the Billing > Coding Plan page to view your usage.
Method 2: API endpoint. Call the usage endpoint:
curl --location 'https://www.minimax.io/v1/api/openplatform/coding_plan/remains' \
--header 'Authorization: Bearer <API Key>' \
--header 'Content-Type: application/json'
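For Python users, a standard-library equivalent of the curl call above might look like this. The URL and headers mirror the curl example exactly; the response schema is not documented here, so the result is simply printed as raw JSON:

```python
# Python equivalent of the curl command above, using only the standard library.
# The endpoint URL and Authorization header mirror the curl example; the
# response body is printed as-is since its schema is not shown here.
import json
import urllib.request

def build_usage_request(api_key: str) -> urllib.request.Request:
    """Build a GET request for the Coding Plan usage endpoint."""
    return urllib.request.Request(
        "https://www.minimax.io/v1/api/openplatform/coding_plan/remains",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# To actually query your usage, send the request with your real API key:
# req = build_usage_request("<API Key>")
# with urllib.request.urlopen(req) as resp:
#     print(json.dumps(json.load(resp), indent=2))
```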

How is the “reset every 5 hours” calculated?

It is a dynamic rate limit. The system calculates your total prompt usage within the current 5-hour window; any usage from more than 5 hours ago is automatically released from the count.
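The rolling-window behavior can be simulated with a small sketch (this is an illustration of the counting logic described above, not the actual server implementation). Each prompt's timestamp is recorded, and anything older than 5 hours is dropped before usage is reported:

```python
# Sketch of a rolling 5-hour prompt counter: prompts older than the window
# are released automatically. Timestamps (in seconds) are passed in
# explicitly so the behavior is easy to simulate. Not the real
# server-side implementation.
from collections import deque

WINDOW_SECONDS = 5 * 60 * 60  # 5 hours

class RollingPromptCounter:
    def __init__(self):
        self._timestamps = deque()

    def record_prompt(self, now: float) -> None:
        self._timestamps.append(now)

    def current_usage(self, now: float) -> int:
        # Release any usage from more than 5 hours ago.
        while self._timestamps and now - self._timestamps[0] >= WINDOW_SECONDS:
            self._timestamps.popleft()
        return len(self._timestamps)

counter = RollingPromptCounter()
counter.record_prompt(0)       # one prompt at t = 0
counter.record_prompt(3600)    # one prompt an hour later
print(counter.current_usage(3600))   # 2: both prompts are inside the window
print(counter.current_usage(18001))  # 1: the t = 0 prompt has aged out
```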

Why is “1 prompt ≈ 15 model calls”?

In an AI coding tool, a single action you take (like requesting code completion or an explanation) may be broken down by the tool into multiple, consecutive interactions with the AI model behind the scenes (e.g., fetching context, generating suggestions, refining suggestions, etc.). To simplify billing, we bundle these backend calls into a single “prompt” count. This means that 1 “prompt” within your plan actually covers multiple complex model invocations.
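The bundling described above can be illustrated with a toy sketch: one user action fans out into several backend model calls, but only a single prompt is deducted from the plan. The step names here are hypothetical examples, not the tool's actual pipeline:

```python
# Toy illustration of "1 prompt covers multiple model calls": each backend
# step is a separate model invocation, but the whole user action is billed
# as one prompt. The step names are hypothetical.
def handle_user_action(backend_steps):
    model_calls = len(backend_steps)  # one model invocation per backend step
    prompts_charged = 1               # the whole action counts as one prompt
    return model_calls, prompts_charged

calls, prompts = handle_user_action(
    ["fetch context", "generate suggestion", "refine suggestion"]
)
print(calls, prompts)  # 3 backend model calls, billed as 1 prompt
```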

Can the Coding Plan API Key and the standard Open Platform API Key be used interchangeably?

No, they cannot.
  • Coding Plan API Key: Used exclusively for the Coding Plan subscription. Usage is measured by the number of “prompts” and is subject to the rolling 5-hour limit.
  • Other Open Platform API Keys: Used for all other text-based AI services (including when you switch to pay-as-you-go within a coding tool). Billing is based on actual token consumption and depletes your account balance.

How is TPS (Tokens Per Second) calculated for text models?

TPS measures the number of tokens generated per second, and is used to evaluate the inference output speed of a model. The formula is:
TPS = Number of output tokens / (Time of last token − Time of first token)
In other words, timing starts when the model outputs the first token and ends when the last token is generated. The total number of tokens produced is then divided by that elapsed time (in seconds).
TPS may fluctuate during actual usage. The TPS values indicated on each model page are reference values.
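The formula can be written as a small helper; the token count and timestamps in the example are made up for illustration:

```python
# The TPS formula: output tokens divided by the time elapsed between the
# first and last output token. Example values are illustrative.
def tokens_per_second(n_output_tokens, t_first_token, t_last_token):
    """Timing starts at the first output token and ends at the last one."""
    return n_output_tokens / (t_last_token - t_first_token)

# 100 output tokens, first token at t = 1.0 s, last token at t = 3.0 s
print(tokens_per_second(100, 1.0, 3.0))  # 100 / 2.0 = 50.0 TPS
```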