What changed in the Token Plan upgrade?
Token Plan now uses usage-based deduction. The main changes are:
- Usage-based deduction: Usage is deducted according to actual resource consumption. Simple tasks consume less, while complex tasks deduct based on real usage.
- Unified quota pool: Model usage covered by Token Plan shares the same included Token Plan quota instead of separate quotas by capability.
- Clearer usage display: The console shows current usage through a usage bar.
- Updated model capabilities: New model capabilities such as M3 can be used through the relevant API or tool integration pages.
Will the upgrade reduce my usable quota?
No. Daily chat, writing, translation, coding, and search workflows should remain stable. With usage-based deduction, simple tasks usually consume less, while long-context, multi-turn reasoning, multimodal tasks, and complex agent workflows deduct according to actual resource usage. The console usage bar is the source of truth for your current available usage.
Is usage-based deduction a price increase?
No. Under the older fixed-count model, a simple question and a complex reasoning task could consume the same amount of quota. With usage-based deduction, usage follows actual resource consumption: simple tasks no longer deduct the same amount as complex ones. You do not need to manually calculate model-specific allowances. In normal use, track the usage bar in the console.
What is a Subscription Key?
The Subscription Key is the key used for both Token Plan subscriptions and purchased Credits. Each user has a dedicated Subscription Key for each Team they belong to. The key can exist before any Token Plan seat or Credits are available. In that state, it has no usable paid resources. Once a Token Plan seat is assigned, or Credits access is available, the same key can use those resources. Subscription Keys are separate from standard pay-as-you-go API Keys.
Can I use Credits without a Token Plan subscription?
Yes. Credits can be purchased and used without an assigned Token Plan subscription seat. Purchased Credits still use the Subscription Key and have the same resource coverage as Token Plan. If you have no Token Plan seat but have Credits access, usage within that coverage is charged to purchased Credits. If you have both a Token Plan subscription quota and purchased Credits, the subscription quota is used first, then purchased Credits automatically cover eligible overflow within Token Plan resource coverage. In short, Token Plan is the primary subscription quota, and purchased Credits are the supplemental balance. Both work through the same Subscription Key. Use a pay-as-you-go API Key for resources outside Token Plan coverage. For details, see Token Plan pricing.
What is the Default Team?
Every user gets a Default Team when creating an account. It is the user’s personal Team:
- It has one member only.
- Other users cannot be invited to it.
- The user is the sole Owner.
- The Owner can buy individual Token Plan subscriptions and Credits for personal use.
Which resources does Token Plan support? How do I switch resources?
Token Plan supports all models on the API Platform. You do not need to track separate allowances by model; usage is shown through a unified usage bar. For API endpoints that have pay-as-you-go pricing, usage deducts from the included Token Plan quota according to the corresponding endpoint pricing. The usage bar is the user-facing indicator of remaining subscription usage. Different plans include different quotas and approximate usage capacity. See the pricing page for details. To switch models, use the model ID documented by the relevant API or tool integration page.
Are text, image, audio, and other quotas separated?
No. Model usage covered by Token Plan shares one included Token Plan quota. You can flexibly use the quota across supported models and capabilities. The console is the source of truth for available resources and actual usage.
Which Token Plan tiers are available?
The currently available public subscription tiers are Plus, Max, and Ultra. Tiers differ by price, quota windows, and typical agent usage. See Token Plan pricing for details.
Can I upgrade my subscription plan?
Yes. You can upgrade your plan at any point during your subscription period. You pay only the price difference, and the new plan takes effect immediately.
How do I check Token Plan usage?
You can check your Token Plan usage in two ways:
Method 1: Visit the Subscription Management Page
Visit the Billing > Token Plan page to view your usage.
Method 2: Use the API Endpoint
Typical consumption by task type:
- Low consumption: Daily chat, translation, and simple writing.
- Medium consumption: Code generation and multi-turn conversations.
- Higher consumption: Long-context reasoning, multimodal tasks, and complex agent workflows.
How is usage reset?
Token Plan usage is shown through the console usage bar and controlled by quota windows:
- Included Token Plan quota: Controlled by a 5-hour rolling window and a weekly window.
- Subscription cycle: Unused included Token Plan quota does not carry over to the next billing cycle.
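The two quota windows can be pictured with a small simulation. This is only a sketch under stated assumptions: the limits and units are made up, the weekly window is modeled as a rolling 7-day window (the text above does not say whether it is rolling or fixed), and `QuotaWindows` is a hypothetical name, not part of any MiniMax SDK. The console usage bar remains the source of truth.

```python
from collections import deque
from dataclasses import dataclass, field

FIVE_HOURS = 5 * 3600        # seconds
ONE_WEEK = 7 * 24 * 3600     # seconds

@dataclass
class QuotaWindows:
    """Hypothetical model of a 5-hour rolling window plus a weekly window."""
    five_hour_limit: float   # assumed units; real quotas are shown in the console
    weekly_limit: float
    events: deque = field(default_factory=deque)  # (timestamp, amount) pairs

    def used_since(self, now: float, span: float) -> float:
        # Sum all usage recorded less than `span` seconds before `now`.
        return sum(amount for t, amount in self.events if now - t < span)

    def try_consume(self, now: float, amount: float) -> bool:
        # Forget usage that has aged out of the longest window.
        while self.events and now - self.events[0][0] >= ONE_WEEK:
            self.events.popleft()
        # A request must fit inside both windows to be accepted.
        if self.used_since(now, FIVE_HOURS) + amount > self.five_hour_limit:
            return False
        if self.used_since(now, ONE_WEEK) + amount > self.weekly_limit:
            return False
        self.events.append((now, amount))
        return True
```

In this model, usage that is more than five hours old stops counting against the short window but still counts against the weekly one, which is why a heavy day can exhaust the weekly allowance even after the 5-hour window has recovered.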
What are migration compensation Credits?
Migration compensation Credits are a transition grant for some users moved from legacy Token Plan behavior to the current usage-based quota model. They are intended to make the transition smoother while users adapt to the new quota windows. Compensation Credits can cover eligible Token Plan resource usage and have their own validity period. The exact amount, validity, and eligibility are shown in the console. For annual subscriptions, eligible compensation Credits may be issued along with the remaining subscription cycles; each grant has its own validity period.
What happens to my legacy plan after the upgrade?
Paid-cycle benefits remain available. Active legacy plans are either kept or migrated to the corresponding current tier according to the plan type. The Subscription Management page is the source of truth for your current plan status. Notes:
- Some retired tiers are available only to existing users and cannot be subscribed to again after cancellation.
- Retired tiers continue under the migration rules during the current subscription period, then move to the corresponding current tier as shown in the console.
- Annual-plan users keep paid-month benefits; any compensation Credits follow the validity period shown in the console.
What happens when I reach the usage limit?
When you reach the 5-hour or weekly quota, you have the following options:
- Use purchased Credits: If purchased Credits are available, usage within Token Plan resource coverage can be automatically covered by purchased Credits.
- Upgrade your subscription: Visit the Token Plan page to upgrade to a higher-tier plan with more quota. You can upgrade at any time, and upgrades take effect immediately.
- Switch to pay-as-you-go: To continue without Token Plan quota limits, replace your Subscription Key with your standard MiniMax Open Platform API Key from the account management system. This switches the tool to pay-as-you-go billing based on actual token usage, which consumes your Open Platform account balance.
- Wait for the quota window to reset: Included Token Plan quota is controlled by 5-hour rolling and weekly windows. Unused included quota does not carry over to the next billing cycle.
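The order in which resources are drawn (included quota first, purchased Credits for overflow) can be illustrated as follows. This is a minimal sketch, not the platform's billing logic: `Balances` and `deduct` are hypothetical names, the units are arbitrary, and splitting a single request across quota and Credits is an assumption made for the illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Balances:
    """Hypothetical balances behind one Subscription Key."""
    included_quota: float      # remaining included Token Plan quota
    purchased_credits: float   # remaining purchased Credits

def deduct(b: Balances, cost: float) -> Optional[Tuple[float, float]]:
    """Charge `cost` to the included quota first, then to purchased Credits.

    Returns (from_quota, from_credits), or None when neither balance can
    cover the request. In that case the caller would wait for the window to
    reset, upgrade, or switch to a pay-as-you-go API Key.
    """
    from_quota = min(b.included_quota, cost)
    remainder = cost - from_quota
    if remainder > b.purchased_credits:
        return None  # not covered; balances are left untouched
    b.included_quota -= from_quota
    b.purchased_credits -= remainder
    return from_quota, remainder
```

For example, with 5 units of included quota and 3 units of Credits, a request costing 4 is charged entirely to the quota, and a following request costing 2 takes the last unit of quota plus 1 unit of Credits.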
Can the Subscription Key and the standard Open Platform API Key be used interchangeably?
No, they cannot.
- Subscription Key: Used for Token Plan subscriptions and purchased Credits. Pay-as-you-go-priced API endpoints deduct from the included Token Plan quota according to the corresponding endpoint pricing. Purchased Credits have the same resource coverage as Token Plan and can cover eligible overflow beyond subscription quota.
- Other Open Platform API Keys: Used for pay-as-you-go access to standard Open Platform API endpoints. Billing is based on actual token consumption and draws down your account balance.
How does API-vlm work with Token Plan?
API-vlm supports multimodal understanding for image inputs. Its output is text. When called with Token Plan, API-vlm deducts from the included Token Plan quota according to its pay-as-you-go price. If the included quota is exhausted and purchased Credits are available, additional usage can be automatically covered by purchased Credits.
Can I use my subscription in multiple tools at the same time?
Yes. You can use the same subscription in all supported tools, but the quota is shared. Usage from all tools consumes the same included Token Plan quota.
How do I cancel auto-renewal?
You can cancel auto-renewal on the Subscription Management page. Before canceling, note:
- Included Token Plan quota already issued for the current period remains usable within its validity period.
- Any migration compensation Credits you have received remain usable within their validity period.
- Some retired tiers are available only to existing users and cannot be subscribed to again after cancellation.
How is TPS (Tokens Per Second) calculated for LLMs?
TPS measures the number of tokens generated per second and is used to evaluate the inference output speed of a model. The formula is:
TPS = total output tokens / (time of last token - time of first token)
In other words, timing starts when the model outputs the first token and ends when the last token is generated. The total number of tokens produced is then divided by that elapsed time (in seconds).
TPS may fluctuate during actual usage. The TPS values indicated on each model page are reference values.
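The TPS definition above (total output tokens divided by the time between the first and last token) can be computed from per-token arrival timestamps. This is a small sketch: `tokens_per_second` is a hypothetical helper, and it assumes you record a timestamp for each output token yourself, for example while reading a streaming response.

```python
from typing import Sequence

def tokens_per_second(token_times: Sequence[float]) -> float:
    """Compute TPS from the arrival time (in seconds) of each output token.

    Timing starts at the first token and ends at the last one, so time spent
    waiting for the first token is not included.
    """
    if len(token_times) < 2:
        raise ValueError("need at least two tokens to measure elapsed time")
    elapsed = token_times[-1] - token_times[0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    # Total tokens produced divided by the first-to-last elapsed time.
    return len(token_times) / elapsed

# Five tokens arriving over a 2-second span.
print(tokens_per_second([0.0, 0.5, 1.0, 1.5, 2.0]))  # 2.5
```

Because the clock starts at the first output token, this figure reflects generation speed only and says nothing about how long the request waited before output began.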
What are the limits of the Token Plan? Is it suitable for production?
The Token Plan is designed for individual, interactive developer use, with higher-tier plans offering increased quotas. For production use, pay-as-you-go is recommended. Key limits include:
- Rate limits (RPM / TPM): Requests may be throttled when limits are exceeded. Limits typically reset within about 1 minute and may tighten during peak traffic.
- Included Token Plan quota: 5-hour rolling and weekly quota windows.
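Since throttled requests typically recover within about a minute, a client can retry with exponential backoff instead of failing immediately. The sketch below is generic and makes no claim about the actual error format: `RateLimited` is a hypothetical stand-in for however your client surfaces a throttling response, and `request` is any zero-argument callable you supply.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

class RateLimited(Exception):
    """Hypothetical stand-in for the error your client raises when throttled."""

def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,          # rate limits typically reset within ~1 minute
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying with exponential backoff while it is throttled."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts; let the caller decide what to do
            # Delays grow as 1s, 2s, 4s, ... and are capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter keeps parallel clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

Backoff only helps with short-lived RPM / TPM throttling. It does not help once the 5-hour or weekly quota is exhausted, because those windows take much longer to recover.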
What are the platform traffic rules?
To ensure service stability and availability for all users, the MiniMax platform may apply dynamic rate limiting during peak hours. The details are as follows:
We have observed that some requests come from ultra-high-concurrency automated batch tasks or multi-user sharing patterns. To prevent a small amount of abnormal traffic from occupying shared computing resources, and to keep the experience stable for the majority of users, rate control is applied at the account level so that computing resources are distributed fairly.
Platform Rate Limiting Rules
Consistent with industry practice, MiniMax applies dynamic rate limiting during peak hours:
- Peak Traffic Hours: Dynamically adjusted based on cluster load, typically occurring on weekdays from 15:00-17:30:
- Plus: Supports approximately 3-4 agents
- Max: Supports approximately 4-5 agents
- Ultra: Supports approximately 6-7 agents
- Included Token Plan quota: Controlled by 5-hour rolling and weekly windows. Unused included quota does not carry over to the next billing cycle.