Which models does Token Plan support? How do I switch models?

Token Plan supports MiniMax models across all modalities — text, speech, video, image, and music — under a single quota. The High-Speed subscription also supports the MiniMax-M2.7-highspeed and MiniMax-M2.5-highspeed models. To switch models, modify the model parameter in your API calls:
import anthropic

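# Assumes ANTHROPIC_API_KEY and ANTHROPIC_BASE_URL are set in the environment,
# so that the client uses your Token Plan key and reaches MiniMax's endpoint.
client = anthropic.Anthropic()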

message = client.messages.create(
    model="MiniMax-M2.7",  # Switch to other models like MiniMax-M2.5
    max_tokens=1000,
    system="You are a helpful assistant.",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Hi, how are you?"
                }
            ]
        }
    ]
)
All models share the same subscription quota and billing method.

What is the High-Speed subscription? How does it differ from the Standard plan?

The High-Speed subscription is a new plan offered by Token Plan that provides dedicated support for the MiniMax-M2.7-highspeed model. Differences between MiniMax-M2.7-highspeed and MiniMax-M2.7:
  • Same performance: MiniMax-M2.7-highspeed delivers the same model capability and output quality as MiniMax-M2.7
  • Significantly faster: MiniMax-M2.7-highspeed offers considerably higher inference output speed than MiniMax-M2.7
If response speed in your coding tools is a priority, the High-Speed subscription is recommended.

Can I upgrade my subscription plan?

Yes. Token Plan supports upgrading your plan at any time during your subscription period, including upgrading from a Standard plan to a High-Speed plan, or upgrading to a higher tier within the same plan type. You only need to pay the price difference, and the new plan takes effect immediately.

How do I check Token Plan usage?

You can check your Token Plan usage in two ways:
Method 1: Visit the Subscription Management Page
Visit the Billing > Token Plan page to view your usage.
Method 2: Use the API Endpoint
curl --location 'https://www.minimax.io/v1/api/openplatform/coding_plan/remains' \
--header 'Authorization: Bearer <API Key>' \
--header 'Content-Type: application/json'
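
For scripts, the same request can be made from Python. The following is a minimal sketch using only the standard library; the function names are illustrative, and because the response schema is not described here, the JSON body is returned as-is rather than parsed into specific fields.

```python
import json
import urllib.request

# Endpoint taken from the curl command above.
REMAINS_URL = "https://www.minimax.io/v1/api/openplatform/coding_plan/remains"


def build_remains_request(api_key: str) -> urllib.request.Request:
    """Build the same GET request as the curl command (hypothetical helper)."""
    return urllib.request.Request(
        REMAINS_URL,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="GET",
    )


def fetch_remaining_usage(api_key: str, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON body unchanged."""
    request = build_remains_request(api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_remaining_usage("<API Key>")` with your Token Plan API Key should return the same payload as the curl command.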

How is the “reset every 5 hours” calculated?

It is a dynamic rate limit. The system calculates your total request usage within the current 5-hour window. Any usage from more than 5 hours ago is automatically released from the count.
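
To make the rolling window concrete, here is a small client-side sketch of the idea. It is an illustration of a sliding-window counter, not MiniMax's actual implementation; the class name, the `limit` value, and the use of plain timestamps in seconds are all assumptions.

```python
from collections import deque

WINDOW_SECONDS = 5 * 60 * 60  # the 5-hour window described above


class RollingUsage:
    """Hypothetical sliding-window counter; timestamps must be non-decreasing."""

    def __init__(self, limit: int, window: float = WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self._timestamps = deque()  # one entry per counted request

    def _release_expired(self, now: float) -> None:
        # Usage from more than `window` seconds ago no longer counts.
        while self._timestamps and now - self._timestamps[0] > self.window:
            self._timestamps.popleft()

    def used(self, now: float) -> int:
        """Number of requests still counted at time `now`."""
        self._release_expired(now)
        return len(self._timestamps)

    def try_request(self, now: float) -> bool:
        """Record a request if the window has room; return whether it was allowed."""
        if self.used(now) >= self.limit:
            return False
        self._timestamps.append(now)
        return True
```

Under this model there is no fixed reset time: each request frees up capacity again 5 hours after it was made.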

Can the Token Plan API Key and the standard Open Platform API Key be used interchangeably?

No, they cannot.
  • Token Plan API Key: Valid only for the Token Plan subscription. Usage is measured by the number of requests and is subject to the 5-hour rolling limit. It provides access to models across all modalities.
  • Other Open Platform API Keys: Used for pay-as-you-go access to all MiniMax models. Billing is based on actual token consumption and is deducted from your account balance.

How is TPS (Tokens Per Second) calculated for text models?

TPS measures the number of tokens generated per second, and is used to evaluate the inference output speed of a model. The formula is:
$$\text{TPS} = \frac{\text{Number of output tokens}}{\text{Time of last token} - \text{Time of first token}}$$
In other words, timing starts when the model outputs the first token and ends when the last token is generated. The total number of tokens produced is then divided by that elapsed time (in seconds).
TPS may fluctuate during actual usage. The TPS values indicated on each model page are reference values.
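
As a worked illustration of the formula, the sketch below computes TPS from an output token count and two timestamps in seconds. The function and parameter names are illustrative and not part of any MiniMax SDK.

```python
def tokens_per_second(
    output_tokens: int, first_token_time: float, last_token_time: float
) -> float:
    """Output tokens divided by the time between the first and last token."""
    elapsed = last_token_time - first_token_time
    if elapsed <= 0:
        raise ValueError("last_token_time must be later than first_token_time")
    return output_tokens / elapsed


# 500 output tokens, first token at t=10.0 s, last token at t=15.0 s
print(tokens_per_second(500, 10.0, 15.0))  # 100.0
```

Note that the time spent waiting for the first token is excluded, so TPS reflects generation speed only, not end-to-end latency.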

What are the limits of the Token Plan? Is it suitable for production?

The Token Plan is designed for individual, interactive developer use, with higher-tier plans offering increased quotas. For production workloads, pay-as-you-go is recommended. Key limits include:
  • Rate limits (RPM / TPM): Requests may be throttled when exceeded; typically reset within ~1 minute and may tighten during peak traffic
  • Quota limits: Usage caps apply per 5-hour window and per week, with automatic resets
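
Because RPM / TPM throttling typically clears within about a minute, a client can retry throttled requests with increasing delays. The following is a minimal sketch under stated assumptions: `ThrottledError` is a hypothetical stand-in for whatever error your client library raises when throttled, and the attempt count and delays are illustrative values, not documented ones.

```python
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for the error your client raises when throttled."""


def call_with_backoff(send, max_attempts=4, base_delay=15.0, sleep=time.sleep):
    """Call send(); on throttling, wait base_delay * 2**attempt and retry.

    With the defaults the waits are 15 s, 30 s, and 60 s, which spans the
    roughly one-minute reset mentioned above.
    """
    for attempt in range(max_attempts):
        try:
            return send()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            sleep(base_delay * 2 ** attempt)
```

Retrying helps only with the short-lived rate limits. Once the 5-hour or weekly quota is exhausted, requests will keep failing until usage is released from the window.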