Which models does Token Plan support? How do I switch models?
Token Plan supports MiniMax models across all modalities — text, speech, video, image, and music — under a single quota. The High-Speed subscription also supports the MiniMax-M2.7-highspeed and MiniMax-M2.5-highspeed models.
To switch models, modify the model parameter in your API calls:
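A minimal sketch of what switching looks like. The request shape below is illustrative (the exact endpoint and field names depend on the API you call); the point is that only the `model` field changes between plans:

```python
def build_chat_request(model: str, user_message: str) -> dict:
    """Build an illustrative chat request body; switching models
    means changing only the `model` field."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }

# Standard model vs. the High-Speed variant (subscription permitting)
standard = build_chat_request("MiniMax-M2.7", "Hello")
highspeed = build_chat_request("MiniMax-M2.7-highspeed", "Hello")

print(standard["model"])   # MiniMax-M2.7
print(highspeed["model"])  # MiniMax-M2.7-highspeed
```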
What is the High-Speed subscription? How does it differ from the Standard plan?
The High-Speed subscription is a new plan offered by Token Plan that provides dedicated support for the MiniMax-M2.7-highspeed model.
Differences between MiniMax-M2.7-highspeed and MiniMax-M2.7:
- Same performance: MiniMax-M2.7-highspeed delivers the same model capability and output quality as MiniMax-M2.7
- Significantly faster: MiniMax-M2.7-highspeed offers considerably higher inference output speed than MiniMax-M2.7
Can I upgrade my subscription plan?
Yes. Token Plan supports upgrading your plan at any time during your subscription period, including upgrading from a Standard plan to a High-Speed plan, or upgrading to a higher tier within the same plan type. You only need to pay the price difference, and the new plan takes effect immediately.
How to check Token Plan usage?
You can check your Token Plan usage in two ways:
- Method 1: Subscription Management page. Visit the Billing > Token Plan page to view your usage.
- Method 2: Use the API endpoint.
How is the “reset every 5 hours” calculated?
It is a dynamic rate limit. The system calculates your total request usage within the current 5-hour window. Any usage from more than 5 hours ago is automatically released from the count.
Can the Token Plan API Key and the standard Open Platform API Key be used interchangeably?
No, they cannot.
- Token Plan API Key: Is exclusively for the Token Plan subscription. Usage is measured by the number of requests and is subject to the 5-hour rolling limit. It provides access to models across all modalities.
- Other Open Platform API Keys: Are used for pay-as-you-go access to all MiniMax models. Billing is based on actual token consumption and depletes your account balance.
How is TPS (Tokens Per Second) calculated for text models?
TPS measures the number of tokens generated per second and is used to evaluate the inference output speed of a model. The formula is:

TPS = total output tokens / (time from first token to last token, in seconds)

In other words, timing starts when the model outputs the first token and ends when the last token is generated. The total number of tokens produced is then divided by that elapsed time (in seconds). TPS may fluctuate during actual usage; the TPS values indicated on each model page are reference values.
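The calculation above can be sketched directly from per-token timestamps. The helper below is illustrative, not an official utility; it assumes you record the emission time of each token in a streaming response:

```python
def tokens_per_second(token_timestamps: list[float]) -> float:
    """Compute TPS from per-token emission timestamps (in seconds).

    Timing starts at the first token and ends at the last token;
    the total token count is divided by that elapsed span.
    """
    if len(token_timestamps) < 2:
        raise ValueError("need at least two tokens to measure a span")
    elapsed = token_timestamps[-1] - token_timestamps[0]
    return len(token_timestamps) / elapsed

# Example: 5 tokens emitted over a 2-second span -> 2.5 TPS
print(tokens_per_second([0.0, 0.5, 1.0, 1.5, 2.0]))  # 2.5
```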
What are the limits of the Token Plan? Is it suitable for production?
The Token Plan is designed for individual, interactive developer use, with higher-tier plans offering increased quotas. Pay-as-you-go is recommended for production use. Key limits include:
- Rate limits (RPM / TPM): requests may be throttled when exceeded; limits typically reset within ~1 minute and may tighten during peak traffic
- Quota limits: Usage caps apply per 5-hour window and per week, with automatic resets
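Since throttled requests typically recover within about a minute, a common client-side pattern is to retry with exponential backoff. This sketch is a generic assumption about how you might handle a rate-limit response, not an official SDK feature; here a throttle is simulated by a `RuntimeError("throttled")`:

```python
import time

def call_with_backoff(request_fn, max_retries: int = 5, base_delay: float = 1.0):
    """Retry a throttled call with exponential backoff.

    `request_fn` is expected to raise RuntimeError("throttled") when
    the API returns a rate-limit response (e.g. HTTP 429); any other
    error propagates unchanged.
    """
    for attempt in range(max_retries):
        try:
            return request_fn()
        except RuntimeError as exc:
            if "throttled" not in str(exc) or attempt == max_retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # wait 1s, 2s, 4s, ...

# Simulated endpoint: throttled twice, then succeeds
attempts = {"n": 0}
def fake_request():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("throttled")
    return "ok"

print(call_with_backoff(fake_request, base_delay=0.01))  # ok
```

For quota limits (the 5-hour window and weekly caps), backoff does not help; the request budget must simply refill, which is another reason pay-as-you-go is the better fit for production traffic.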