M-series Usage Tips - MiniMax API Docs

These patterns help you write stronger prompts for MiniMax Token Plan models. Each subsection pairs a weak prompt with a stronger one and explains why the stronger version is easier for the model to follow.

Jump to the topic you need: General principles for everyday quality, Tool use for agentic workflows, or Long context for large source packages.

General principles

Be clear and direct

The model responds best when the task, constraints, and desired output are explicit. Tell it what to build, what to prioritize, and what a good answer should look like. Golden rule: Show your prompt to a colleague who has no context on the task. If they would be confused, the model will be too.

Less effective:

Create a visualization website

More effective:

Create an enterprise-grade data visualization website.

Requirements:
- Include charts, filters, drill-down views, and export actions.
- Prioritize fast scanning for business analysts.
- Use a polished dashboard layout instead of a marketing landing page.
- Return the implementation plan before writing code.

Add context to improve performance

When you explain why a constraint matters, the model can choose better tradeoffs. Context is especially valuable for formatting, safety, accessibility, and workflow constraints.

Less effective:

Do not use document symbols

More effective:

Your response will be read aloud by a text-to-speech model. Use plain text only, avoid document symbols, and keep sentences short enough to sound natural when spoken.

Add intent whenever the model could satisfy the literal instruction in a way that misses the real use case.

Use examples effectively

A few well-crafted examples (few-shot or multishot prompting) usually beat abstract style instructions. When adding examples, make them:

- Relevant: mirror your actual use case.
- Diverse: cover edge cases and at least one ambiguous input.
- Concrete: show the exact output style, not just the topic.

Less effective:

Write an engaging product description for a smart thermos.

More effective:

Write a product description for a smart thermos.

Good example:
This desk lamp uses full-spectrum LED technology that simulates natural morning light to gently wake you up. It features 6 brightness levels for reading, working, and resting.

Avoid this kind of vague description:
This desk lamp is great, the light is comfortable, and the design is nice.

Now write the smart thermos description in the same concrete, benefit-led style.

For classification, structured extraction, or edge-case handling, give 3 to 5 diverse examples instead of one.

Less effective:

Classify each ticket as bug, feature request, or question.

More effective:

Classify each ticket as bug, feature request, or question.

Examples:
- "App crashes when I open settings" → bug
- "Add dark mode to the export panel" → feature request
- "How do I export to PDF?" → question
- "Crashes after requesting a new export option" → bug (the crash is the report; the feature mention is context)
- "Is this expected behavior?" with no other detail → question (do not classify as bug without confirmation)

Make each example cover a distinct case. Repeating near-identical examples wastes tokens without teaching the model anything new.
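If the examples live in code rather than in a hand-written prompt, a small builder keeps them consistent and makes coverage gaps visible. The following is a minimal sketch, not part of any MiniMax SDK: the function names, the `Ticket:` / `Label:` layout, and the sample ticket are all assumptions chosen for illustration.

```python
# Labels and examples taken from the ticket-classification prompt above.
LABELS = ["bug", "feature request", "question"]
EXAMPLES = [
    ("App crashes when I open settings", "bug"),
    ("Add dark mode to the export panel", "feature request"),
    ("How do I export to PDF?", "question"),
]


def missing_labels(examples, labels):
    """Return labels that have no example, so gaps show up before sending."""
    covered = {label for _, label in examples}
    return [label for label in labels if label not in covered]


def build_few_shot_prompt(instruction, examples, ticket):
    """Assemble the instruction, the labeled examples, and the new input."""
    lines = [instruction, "", "Examples:"]
    for text, label in examples:
        lines.append(f'- "{text}" → {label}')
    # Hypothetical layout for the item to classify.
    lines += ["", f'Ticket: "{ticket}"', "Label:"]
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Classify each ticket as bug, feature request, or question.",
    EXAMPLES,
    "Export button does nothing when clicked",
)
print(prompt)
```

Calling `missing_labels(EXAMPLES, LABELS)` before building the prompt is a cheap way to confirm that every class has at least one example.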

Use prompt templates

For repeated tasks, turn the prompt into a reusable template with named variables. This makes it easier to test across many inputs, compare versions, and keep behavior stable.

Less effective:

Reply to this customer complaint politely.

More effective:

You are a support specialist for [product name].

Customer message:
[customer message]

Known facts you can use:
[known facts]

Write a reply that:
- Acknowledges the customer's issue in the first sentence.
- Explains the next step using only the known facts above.
- Does not promise compensation, refunds, or timelines unless they appear in the known facts.
- Ends with one clear action for the customer.

Variables make the task, inputs, and guardrails visible, which helps when you debug regressions in production prompts.
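One way to implement such a template is Python's standard `string.Template`, which fails loudly when a variable is missing. This is a sketch under assumptions: the `$variable` names mirror the bracketed placeholders above, and the product name and facts are invented sample data.

```python
from string import Template

# Template text adapted from the support prompt above; $names are the variables.
REPLY_TEMPLATE = Template(
    "You are a support specialist for $product_name.\n\n"
    "Customer message:\n$customer_message\n\n"
    "Known facts you can use:\n$known_facts\n\n"
    "Write a reply that:\n"
    "- Acknowledges the customer's issue in the first sentence.\n"
    "- Explains the next step using only the known facts above.\n"
    "- Ends with one clear action for the customer."
)


def render_reply_prompt(product_name, customer_message, known_facts):
    """Fill the template; substitute() raises KeyError if a variable is missing."""
    facts = "\n".join(f"- {fact}" for fact in known_facts)
    return REPLY_TEMPLATE.substitute(
        product_name=product_name,
        customer_message=customer_message,
        known_facts=facts,
    )


# Hypothetical sample inputs.
prompt = render_reply_prompt(
    "Acme Notes",
    "My export has been stuck at 90% for an hour.",
    ["Exports over 2 GB can take up to 3 hours.", "Status is shown under Settings."],
)
print(prompt)
```

Because the template is a single constant, two prompt versions can be compared by diffing two strings and rendering both against the same inputs.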

Match the output language

When the input mixes languages or you need a specific output language, say so explicitly — for example, “Reply in Chinese even if the source is in English.” Without this, the model tends to follow the dominant input language.

Output and formatting

Structure prompts with clear sections

When your prompt mixes instructions, source material, examples, constraints, and output requirements, label each section so the model can tell them apart. Bold headers or labels with a trailing colon work better than running text.

Less effective:

Review this launch plan and tell me what to improve. Keep it practical. The audience is the sales team. We care about enterprise customers and partner motions. Also make a table.

[long launch plan]

More effective:

Task: review the launch plan and identify the highest-impact improvements.

Context: the audience is the sales team. Prioritize enterprise customers and partner motions.

Source:
[long launch plan]

Output format:
Return a table with columns: Area, Issue, Recommendation, Priority.
Keep each recommendation actionable and under 40 words.

Use short, descriptive labels — Task, Context, Source, Constraints, Output format. A bold header or trailing colon is enough to mark each section. Keep the structure flat; deep nesting hurts readability.
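When prompts are assembled programmatically, the same flat labeling can be produced by a small helper. The sketch below is illustrative only; the function name and the rule that multi-line bodies start on their own line are assumptions, not a documented convention.

```python
def build_sectioned_prompt(sections):
    """Join (label, body) pairs into flat, labeled sections."""
    parts = []
    for label, body in sections:
        body = body.strip()
        # Short bodies stay on the label line; multi-line bodies start below it.
        separator = "\n" if "\n" in body else " "
        parts.append(f"{label}:{separator}{body}")
    # One blank line between sections, no nesting.
    return "\n\n".join(parts)


prompt = build_sectioned_prompt([
    ("Task", "review the launch plan and identify the highest-impact improvements."),
    ("Context", "the audience is the sales team."),
    ("Source", "[long launch plan]\n[appendix]"),
    ("Output format", "Return a table with columns: Area, Issue, Recommendation, Priority."),
])
print(prompt)
```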

Set role, format, and length

Role instructions work best when they define expertise, scope, and decision criteria. Output instructions work best when they specify sections, fields, and length limits that are easy to verify.

Less effective:

You are a senior engineer. Review this code and be concise.

More effective:

You are a senior backend reviewer focused on correctness, reliability, and maintainability.

Diff to review:
[diff]

Return exactly these sections:
1. Summary — 3 bullets maximum
2. Blocking issues — table with File, Risk, Recommendation
3. Non-blocking suggestions — 5 bullets maximum

Do not rewrite the entire file. Only suggest changes directly related to this diff.

Prefer concrete output contracts — section names, table columns, bullet limits, scope boundaries. Avoid vague asks like “be detailed” or “make it short” when the output must fit a downstream workflow.
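A concrete contract has the side benefit of being checkable in code before the output reaches a downstream step. The following sketch assumes the model returns numbered headings such as `1. Summary` followed by `- ` bullets; that layout, and all function names, are assumptions made for illustration.

```python
import re

# Matches numbered headings such as "1. Summary".
HEADING = re.compile(r"^\d+\.\s+(.+?)\s*$")


def split_sections(text):
    """Map each numbered heading to the lines that follow it."""
    sections, current = {}, None
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


def check_contract(text, required, bullet_limits):
    """Return a list of violations; an empty list means the output passes."""
    sections = split_sections(text)
    problems = [f"missing section: {name}" for name in required if name not in sections]
    for name, limit in bullet_limits.items():
        bullets = [l for l in sections.get(name, []) if l.lstrip().startswith("- ")]
        if len(bullets) > limit:
            problems.append(f"{name}: {len(bullets)} bullets, limit is {limit}")
    return problems


REQUIRED = ["Summary", "Blocking issues", "Non-blocking suggestions"]
LIMITS = {"Summary": 3, "Non-blocking suggestions": 5}

# Hypothetical model output used to exercise the check.
sample = """1. Summary
- Adds retry logic
- Touches the payment client
2. Blocking issues
| File | Risk | Recommendation |
3. Non-blocking suggestions
- Rename the helper
"""
print(check_contract(sample, REQUIRED, LIMITS))
```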

Long context

Token Plan models support long context windows for both input and output. Long context works best when source material is clearly delimited, indexed, and followed by a specific task.

Place the task after the source

For long inputs, write your question or task after the source documents, not before. The model is more likely to keep the task in focus when the task sits immediately before the point where it starts responding.

Of all long-context techniques, placing the task at the end of the prompt has the largest single impact on answer quality.

Index and delimit source material

Less effective:

Read all of this and summarize the important parts.

[very long notes, specs, meeting transcripts, and code snippets]

More effective:

Sources (oldest first):

**launch-plan** — 2026-04-12
[launch plan]

**pricing-notes** — 2026-04-18
[pricing notes]

**meeting-transcript** — 2026-04-21
[meeting transcript]

Task: produce an executive brief for the launch owner. If sources conflict, prefer the newest dated source and call out the conflict.

Output format:
- Decision summary — 5 bullets maximum
- Risks — table with source references
- Open questions — owner, blocker, next action

For very large inputs, ask the model to quote or summarize the relevant parts of each document before answering. Grounding in quotes cuts noise and makes the final answer easier to verify.
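The two long-context habits in this section, dated and labeled sources followed by the task at the end, can be combined in one helper. This is a sketch under assumptions: the tuple layout, the function name, and the parenthesized date label are illustrative choices.

```python
from datetime import date


def build_long_context_prompt(sources, task):
    """sources: list of (name, date, text). Sorted oldest first; task goes last."""
    ordered = sorted(sources, key=lambda source: source[1])
    parts = ["Sources (oldest first):"]
    for name, when, text in ordered:
        # Each source gets a bold label and an ISO date so conflicts can be resolved.
        parts.append(f"**{name}** ({when.isoformat()})\n{text.strip()}")
    # The task is appended after all source material.
    parts.append(f"Task: {task}")
    return "\n\n".join(parts)


task = (
    "produce an executive brief for the launch owner. If sources conflict, "
    "prefer the newest dated source and call out the conflict."
)
# Sources are passed out of order on purpose; the helper sorts them.
prompt = build_long_context_prompt(
    [
        ("meeting-transcript", date(2026, 4, 21), "[meeting transcript]"),
        ("launch-plan", date(2026, 4, 12), "[launch plan]"),
        ("pricing-notes", date(2026, 4, 18), "[pricing notes]"),
    ],
    task,
)
print(prompt)
```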

Tool use

Token Plan models support tool calling. Strong tool-use prompts define when tools should be used, when they should not be, and how tool results should be combined into the final answer.

Tool definitions

Define each tool with a clear name, purpose, inputs, return shape, and failure behavior. The model should understand the tool contract before it decides to call the tool.

Less effective:

Use search when needed.

More effective:

Tool: search_docs

Purpose: search internal documentation for factual product or policy details.

Use when:
- The user asks about current product behavior, limits, pricing, or release notes.
- You need a source before making a claim that may change over time.

Do not use when:
- The user asks only for rewriting, formatting, or brainstorming.
- The answer is already fully supported by the provided context.

Arguments:
- query: concise keyword query
- product_area: optional product or feature area

Return: a list of results with title, URL, date, and snippet.

Failure: if two searches fail, stop retrying and explain what could not be verified.
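The same contract can be written as structured data. The dictionary below uses a name / description / parameters layout with a JSON-Schema-style object, which is a common shape for tool definitions; whether the MiniMax API expects exactly these field names is an assumption this page does not confirm, so check the API reference before sending it. The `validate_arguments` helper is hypothetical.

```python
import json

# Assumed layout: name, description, and JSON-Schema-style parameters.
SEARCH_DOCS_TOOL = {
    "name": "search_docs",
    "description": (
        "Search internal documentation for factual product or policy details. "
        "Use for current product behavior, limits, pricing, or release notes. "
        "Do not use for rewriting, formatting, or brainstorming. "
        "Returns a list of results with title, URL, date, and snippet. "
        "If two searches fail, stop retrying and explain what could not be verified."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Concise keyword query"},
            "product_area": {
                "type": "string",
                "description": "Optional product or feature area",
            },
        },
        "required": ["query"],
    },
}


def validate_arguments(tool, arguments):
    """Minimal check: required keys present, no unknown keys, string values."""
    schema = tool["parameters"]
    known = schema["properties"]
    problems = [f"missing required argument: {k}" for k in schema["required"] if k not in arguments]
    problems += [f"unknown argument: {k}" for k in arguments if k not in known]
    problems += [
        f"argument must be a string: {k}"
        for k, v in arguments.items()
        if k in known and not isinstance(v, str)
    ]
    return problems


print(json.dumps(SEARCH_DOCS_TOOL, indent=2))
print(validate_arguments(SEARCH_DOCS_TOOL, {"query": "rate limits"}))
```

Putting the "use when", "do not use when", and failure rules into the description keeps them next to the tool, where the model reads them at call time.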

Parallel tool calls

Tell the model to parallelize independent tool calls. Keep calls sequential only when one result determines the next query or action.

Less effective:

Check the docs, the issue tracker, and the changelog, then tell me whether this bug is already fixed.

More effective:

Check these independent sources in parallel:
- documentation search — current expected behavior
- issue tracker search — matching bug reports
- changelog search — recent fixes

After all results return, answer:
- Is the bug already fixed?
- Which source supports that conclusion?
- What should the user do next?

Use parallel calls for independent read-only lookups. Use sequential calls for workflows like “find the customer, then update that customer’s record.”
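On the application side, independent lookups requested by the model can be executed concurrently with `asyncio.gather`. The three lookup functions below are hypothetical stubs that sleep briefly in place of real network calls; only the concurrency pattern is the point of the sketch.

```python
import asyncio


async def search_docs(query):
    """Stub for a documentation search (hypothetical)."""
    await asyncio.sleep(0.01)
    return {"source": "docs", "query": query}


async def search_issues(query):
    """Stub for an issue tracker search (hypothetical)."""
    await asyncio.sleep(0.01)
    return {"source": "issues", "query": query}


async def search_changelog(query):
    """Stub for a changelog search (hypothetical)."""
    await asyncio.sleep(0.01)
    return {"source": "changelog", "query": query}


async def check_bug(query):
    # The three lookups do not depend on each other, so run them together.
    # gather() returns results in the order the awaitables were passed.
    return await asyncio.gather(
        search_docs(query),
        search_issues(query),
        search_changelog(query),
    )


results = asyncio.run(check_bug("export crash"))
print([result["source"] for result in results])
```

A dependent workflow, such as finding a customer and then updating that customer's record, would instead await the first call and pass its result into the second.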

Avoid overeagerness

In agentic workflows, set clear stopping rules. The model should use tools when they materially improve the answer, not just to appear busy.

Less effective:

Use any available tools to solve the task.

More effective:

Use tools only when they materially improve the answer.

Rules:
- Answer directly when the question is conceptual or based only on provided context.
- Search before making claims about current prices, releases, incidents, or policies.
- Ask for confirmation before destructive actions, purchases, messages, or external writes.
- If a tool fails twice, stop retrying and explain the blocker.
- Keep tool arguments minimal and specific.
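Stopping rules in the prompt are more reliable when the calling code enforces them as well. The sketch below caps retries at two attempts, matching the rule above; `ToolBlocked`, `call_with_retry_limit`, and the failing `flaky_search` tool are hypothetical names used only for illustration.

```python
class ToolBlocked(Exception):
    """Raised when a tool keeps failing and the agent should stop retrying."""


def call_with_retry_limit(tool, arguments, max_attempts=2):
    """Call the tool, giving up after max_attempts failures."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return tool(**arguments)
        except Exception as exc:  # broad on purpose: any failure counts as an attempt
            last_error = exc
    raise ToolBlocked(f"{tool.__name__} failed {max_attempts} times: {last_error}")


calls = {"n": 0}


def flaky_search(query):
    """Hypothetical tool that always times out."""
    calls["n"] += 1
    raise TimeoutError("search backend timed out")


try:
    call_with_retry_limit(flaky_search, {"query": "pricing"})
    message = ""
except ToolBlocked as blocked:
    # This message can be passed back to the model as the blocker to explain.
    message = str(blocked)
print(message)
```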

Thinking and reasoning

Control reasoning depth

Ask for deeper analysis when the task involves planning, debugging, tradeoffs, or long-horizon execution. Ask for a direct answer when the task is extraction, rewriting, or formatting.

Less effective:

Think step by step for every request, then answer.

More effective:

Use deeper reasoning for this migration plan.

First analyze:
- compatibility risks
- data migration order
- rollback strategy
- tests that must pass before release

Then return only the final plan:
1. Recommended approach
2. Key risks and mitigations
3. Release checklist
4. Open questions

Keep the reasoning concise in the final answer. Do not include hidden chain-of-thought or unrelated exploration.

For simple requests, be explicit that no deep analysis is needed:

Extract the company names from the text below. Return a JSON array only. No explanation.
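A "JSON array only" instruction pays off when the caller parses the reply strictly and rejects anything else. The following is a minimal sketch; the function name and the sample replies are assumptions.

```python
import json


def parse_company_names(raw):
    """Parse the model reply; raise ValueError unless it is a JSON array of strings."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, list) or not all(isinstance(item, str) for item in data):
        raise ValueError("expected a JSON array of strings")
    return data


# Hypothetical reply that follows the instruction.
print(parse_company_names('["Acme", "Globex"]'))

# Hypothetical reply that adds an explanation and therefore fails to parse.
try:
    parse_company_names('Here are the companies: ["Acme"]')
except ValueError as error:
    print("rejected:", error)
```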

Reduce hallucinations

For tasks where the model might invent facts — citations, API references, version-specific behavior, customer data — give it explicit permission to refuse and provide reference material it can quote.

Less effective:

Answer the user's billing question.

More effective:

Answer the user's billing question using only the policies below.

Policies:
[billing policy]

If the answer cannot be supported by these policies, reply:
"I cannot confirm this from current policy — please ask billing support."

Quote the exact policy line you relied on at the end of your answer.

Three patterns reduce hallucinations:

- Permission to refuse: explicitly tell the model what to say when it does not know.
- Reference grounding: require the model to quote or cite the source it used.
- Boundaries before generation: state the allowed sources, time range, or product version before the task, not after.
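Permission to refuse and reference grounding can both be verified after generation. The sketch below accepts an answer only if it starts with the agreed refusal wording or ends with a line that appears verbatim in the policy text. The function name, the prefix match, and the sample policy are assumptions; exact-line matching is deliberately strict and would need loosening for paraphrased quotes.

```python
# Start of the refusal sentence used in the billing prompt above.
REFUSAL_PREFIX = "I cannot confirm this from current policy"


def is_grounded(answer, policy_text, refusal_prefix=REFUSAL_PREFIX):
    """Accept a refusal, or an answer whose last line is quoted from the policy."""
    answer = answer.strip()
    if not answer:
        return False
    if answer.startswith(refusal_prefix):
        return True
    # The prompt asks for the quoted policy line at the end of the answer.
    last_line = answer.splitlines()[-1].strip().strip('"')
    policy_lines = {line.strip() for line in policy_text.splitlines() if line.strip()}
    return last_line in policy_lines


# Hypothetical policy text and answers.
policy = """Refunds are available within 30 days of purchase.
Annual plans renew automatically unless cancelled."""

grounded = 'Yes, you can get a refund.\n"Refunds are available within 30 days of purchase."'
invented = "Yes, refunds are available within 90 days."

print(is_grounded(grounded, policy), is_grounded(invented, policy))
```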

Agentic and long-task workflows

For long-running tasks, give the model a small number of active goals at a time. This helps it maintain state, track decisions, and avoid juggling too many partially related tasks in parallel.

Single-window state tracking

The model can maintain strong task state inside one long context window. Keep the working plan, current status, and open questions visible in the prompt or project notes.

In tools that support context compression (such as Claude Code), keep system prompts concise, because the model may end a task early as it approaches the context limit.

Multi-window workflow

When a task naturally breaks into phases, split it across windows.

Phased processing

Use the first window to set up the framework, tests, and scripts. Use the next window to iterate through the remaining tasks.

Structured testing

Ask the model to create tests.py or tests.json to track test cases during long iterations.
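A `tests.json` tracker can be as simple as a list of cases with a status field that each window reads and updates. The schema below (`id` and `status` keys) and the helper names are assumptions for illustration; the page does not prescribe a format.

```python
import json
import tempfile
from pathlib import Path


def save_cases(path, cases):
    """Write the full case list back to disk."""
    Path(path).write_text(json.dumps(cases, indent=2), encoding="utf-8")


def load_cases(path):
    """Read the case list from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def record_result(path, case_id, passed):
    """Update one case's status so a later window can pick up where this one stopped."""
    cases = load_cases(path)
    for case in cases:
        if case["id"] == case_id:
            case["status"] = "passed" if passed else "failed"
            break
    else:
        raise KeyError(case_id)
    save_cases(path, cases)


def remaining(path):
    """Return ids of cases that have not passed yet."""
    return [case["id"] for case in load_cases(path) if case["status"] != "passed"]


# Demonstration in a temporary directory with hypothetical case ids.
with tempfile.TemporaryDirectory() as tmp:
    tests_path = Path(tmp) / "tests.json"
    save_cases(tests_path, [
        {"id": "login", "status": "pending"},
        {"id": "export", "status": "pending"},
    ])
    record_result(tests_path, "login", passed=True)
    still_open = remaining(tests_path)
print(still_open)
```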

Initialization scripts

Create init.sh to start servers and run tests, avoiding repeated setup in new windows.

Restart vs compression

Use compression for one continuous task. Start a fresh window for a new task or a major change in direction.

Maximize context usage

Ask the model to finish the current part thoroughly before moving on.

Recommended system prompt for long tasks:

This is a very lengthy task. Make full use of the available context window. Complete each part thoroughly before continuing, and avoid exhausting tokens before the task is complete.

Evaluate and iterate

Treat important prompts like product configuration. Keep a small evaluation set, compare prompt versions, and record what changed.

Less effective:

Try a few prompt tweaks until the answer looks better.

More effective:

Prompt iteration workflow:
1. Define success criteria: correctness, format compliance, tool-use accuracy, tone.
2. Prepare 10–30 representative test cases, including edge cases.
3. Run the current prompt and the candidate prompt on the same cases.
4. Compare outputs side by side and record regressions.
5. Update the prompt only when the candidate improves the target metric without breaking required behavior.
6. Save a short changelog: what changed, why, and which cases improved.

This workflow helps you improve prompts without accidentally optimizing for a single impressive example.
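The comparison step of that workflow can be sketched as a small harness that runs two prompt versions over the same cases and lists regressions and improvements. Everything here is an assumption made for illustration: `fake_model` is a stand-in for a real API call, the exact-match check is the simplest possible metric, and the two cases are far fewer than the 10–30 suggested above.

```python
def evaluate(run_prompt, prompt, cases, check):
    """Return {case_id: passed} for one prompt version."""
    return {
        case["id"]: check(run_prompt(prompt, case["input"]), case["expected"])
        for case in cases
    }


def compare(baseline, candidate):
    """Summarize what the candidate prompt fixed and what it broke."""
    regressions = sorted(c for c in baseline if baseline[c] and not candidate[c])
    improvements = sorted(c for c in baseline if not baseline[c] and candidate[c])
    return {
        "regressions": regressions,
        "improvements": improvements,
        "baseline_pass_rate": sum(baseline.values()) / len(baseline),
        "candidate_pass_rate": sum(candidate.values()) / len(candidate),
    }


def fake_model(prompt, text):
    """Hypothetical stand-in for a model call: uppercases only when asked to."""
    return text.upper() if "uppercase" in prompt else text


def exact_match(output, expected):
    return output == expected


CASES = [
    {"id": "c1", "input": "abc", "expected": "ABC"},
    {"id": "c2", "input": "XYZ", "expected": "XYZ"},
]

baseline = evaluate(fake_model, "Repeat the input.", CASES, exact_match)
candidate = evaluate(fake_model, "Repeat the input in uppercase.", CASES, exact_match)
report = compare(baseline, candidate)
print(report)
```

A candidate would be adopted only when `regressions` is empty and the pass rate improves, and the report itself can be saved as the changelog entry.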
