Jun 1, 2026 · 5 min read

MiniMax M3 vs M2.7: What Changed and Should You Upgrade?

MiniMax M3 launched on June 1, 2026, replacing M2.7 as the flagship model. This is a generational change rather than an incremental update: a new architecture (MSA), a 5× larger context window, native multimodal support, computer use, and open weights (expected within 10 days of launch). Here is everything that changed and whether you should switch immediately.

The short answer: yes, upgrade now. M3 improves on M2.7 in every area compared below. List pricing doubles, but the 7-day launch discount brings M3 level with M2.7’s price.

What changed at a glance

Feature	M2.7	M3	Change
Architecture	Standard attention	MiniMax Sparse Attention (MSA)	New
Context window	200K tokens	1,000,000 tokens	5×
Decoding speed (1M)	Baseline	15.6× faster	Massive
Prefill speed (1M)	Baseline	9.7× faster	Massive
Modalities	Text only	Text + images + video	New
Computer use	No	Yes (desktop operation)	New
SWE-bench Pro	~45%*	59.0%	~+14 points
BrowseComp	—	83.5%	New benchmark
SVG-Bench	—	63.7%	New benchmark
Input pricing	$0.30/M	$0.60/M	2×
Output pricing	$1.20/M	$2.40/M	2×
Cache reads	N/A	$0.12/M	New
Open weight	No (API only)	Yes (expected within 10 days)	New
OpenRouter	✅	✅	Same
Coding interface	No	code.minimax.io	New

*M2.7’s SWE-bench Pro score is estimated from available benchmarks.

The MSA architecture

The biggest technical change is MiniMax Sparse Attention (MSA). This is not a minor optimization — it fundamentally changes how the model handles long contexts.

What MSA does:

Enables fast inference at 1M tokens without compressing key-values
Delivers 15.6× faster decoding at million-token contexts
Delivers 9.7× faster prefill at million-token contexts
Maintains full precision (unlike DeepSeek’s MLA which compresses KV cache)

Why it matters: With M2.7, using the full 200K context was slow and expensive. With M3, you can use 1M tokens and it is actually faster than M2.7 was at 200K. This makes long-context workloads (entire codebases, long documents, multi-hour agent sessions) practical for the first time.

Context window: 200K → 1M

The 5× context expansion is the most immediately useful change for developers:

Use case	M2.7 (200K)	M3 (1M)
Analyze a medium codebase	✅ Fits	✅ Fits easily
Analyze a large monorepo	❌ Truncated	✅ Fits
Process a 100-page document	✅ Fits	✅ Fits
Process a 500-page document	❌ Truncated	✅ Fits
Multi-hour agent session	⚠️ Context fills up	✅ Runs for hours
Video understanding	❌ Not supported	✅ Native

For detailed usage patterns, see our MiniMax M3 1M Context Guide.
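
A quick pre-flight check can indicate whether an input is likely to fit either window. The sketch below assumes roughly 4 characters per token, a common rule of thumb for English text and not a figure from MiniMax's tokenizer. The function names and the output reserve are illustrative, not part of any MiniMax API.

```python
# Context window sizes (in tokens) from the comparison table in this article.
CONTEXT_WINDOWS = {"minimax-m2.7": 200_000, "minimax-m3": 1_000_000}

# Assumption: ~4 characters per token, a rough heuristic for English text.
# Real counts depend on the tokenizer and the content (code, non-English
# text, and so on), so treat the result as an estimate only.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate the token count of `text`, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits(text: str, model: str, reserve_for_output: int = 8_000) -> bool:
    """Return True if the estimated prompt plus an output reserve fits."""
    return estimate_tokens(text) + reserve_for_output <= CONTEXT_WINDOWS[model]


if __name__ == "__main__":
    document = "x" * 2_000_000  # about 500K estimated tokens
    for model in CONTEXT_WINDOWS:
        print(model, fits(document, model))
```

For exact numbers, count tokens with the model's real tokenizer once it is available; the heuristic is only meant to catch inputs that are clearly too large.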

Native multimodal

M2.7 was text-only. M3 handles images and video as first-class inputs:

Images: Parse UI screenshots, charts, diagrams, documents
Video: Process video frames for temporal reasoning
Computer use: Operate a desktop computer (click, type, navigate)
Visual code generation: Write code from mockups or screenshots

This puts M3 in the same category as Claude Opus 4.8 (which also has computer use) and Step 3.7 Flash (which also handles video).

Open weight

M2.7 was API-only. M3 will be fully open-weight with a technical report, expected within 10 days of launch. This means:

Self-host on your own infrastructure
Fine-tune for specific domains
Run completely offline for data privacy
Inspect model behavior and architecture
Community quantizations (GGUF, AWQ, etc.)

See our How to Run MiniMax M3 Locally guide for hardware requirements and setup instructions (ready for when weights drop).

Coding improvements

The coding capability jump is substantial:

SWE-bench Pro: ~45% → 59.0% (+14 points)
Terminal-Bench 2.1: 66.0% (new benchmark, competitive with GPT-5.5)
SVG-Bench: 63.7% (leads all models including Opus 4.7)
Long-horizon autonomy: Reproduced an ICLR 2025 paper (12 hours, 18 commits, 23 figures)

M3 now narrowly beats GPT-5.5 on SWE-bench Pro (59.0% vs 58.6%, a 0.4-point margin). For agentic coding workflows, see our MiniMax M3 for Agentic Coding guide.

Pricing changes

At list price, M3 costs 2× as much per token as M2.7:

	M2.7	M3	M3 (launch discount)
Input	$0.30/M	$0.60/M	$0.30/M
Output	$1.20/M	$2.40/M	$1.20/M
Cache reads	N/A	$0.12/M	$0.06/M

However, M3 is significantly more capable. The cost per task may actually be lower because:

Better coding means fewer retries
MSA speed means faster completion
1M context means no context-overflow workarounds
Cache reads at $0.12/M make repeated contexts cheap

During the 7-day launch discount, M3 costs the same as M2.7 did.
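
To see how the per-token prices translate into per-request cost, here is a minimal sketch using the figures from the table above. It assumes cached input tokens are billed at the cache-read rate instead of the normal input rate; the tier names and the `request_cost` helper are made up for illustration.

```python
# Prices in USD per million tokens, copied from the pricing table above.
PRICES = {
    "m2.7": {"input": 0.30, "output": 1.20},
    "m3": {"input": 0.60, "output": 2.40, "cache_read": 0.12},
    "m3-launch": {"input": 0.30, "output": 1.20, "cache_read": 0.06},
}


def request_cost(tier, input_tokens, output_tokens, cached_tokens=0):
    """Return the cost in USD of one request.

    `cached_tokens` is the part of `input_tokens` billed as cache reads
    (assumption: cached input replaces normal input billing). Tiers with
    no cache price bill those tokens at the normal input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    price = PRICES[tier]
    cache_rate = price.get("cache_read", price["input"])
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * price["input"]
        + cached_tokens * cache_rate
        + output_tokens * price["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    for tier in PRICES:
        print(tier, f"{request_cost(tier, 100_000, 5_000):.4f}")
    print("m3, 80% cached", f"{request_cost('m3', 100_000, 5_000, 80_000):.4f}")
```

For a request with 100,000 input tokens and 5,000 output tokens, this comes to about $0.036 on M2.7 and $0.072 on M3 at list price. If 80,000 of those input tokens are cache reads, the M3 figure drops to about $0.034, which is why repeated contexts narrow the gap.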

Migration guide

Switching from M2.7 to M3 is a model string change:

# Before
model = "minimax-m2.7"

# After
model = "minimax-m3"

On OpenRouter:

# Before
model = "minimax/minimax-m2.7"

# After
model = "minimax/minimax-m3"

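No other code changes are needed: the API is compatible, so existing prompts, system messages, and tool definitions carry over unchanged. Outputs can still differ between model versions, so rerun any regression tests you rely on.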
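
Because the switch is a single string, one option is to read the model version from configuration so that a rollback to M2.7 does not require a code change. The sketch below is one way to do that; the `MINIMAX_MODEL_VERSION` environment variable and the `resolve_model` helper are hypothetical names, not part of any MiniMax or OpenRouter SDK.

```python
import os

# Model strings from the migration guide above.
DIRECT_MODELS = {"m2.7": "minimax-m2.7", "m3": "minimax-m3"}
OPENROUTER_MODELS = {"m2.7": "minimax/minimax-m2.7", "m3": "minimax/minimax-m3"}


def resolve_model(use_openrouter: bool = False, env=os.environ) -> str:
    """Pick the model string from the (hypothetical) MINIMAX_MODEL_VERSION
    environment variable, defaulting to M3."""
    version = env.get("MINIMAX_MODEL_VERSION", "m3")
    table = OPENROUTER_MODELS if use_openrouter else DIRECT_MODELS
    if version not in table:
        raise ValueError(f"unknown model version: {version!r}")
    return table[version]


if __name__ == "__main__":
    print(resolve_model())
    print(resolve_model(use_openrouter=True))
```

Setting the variable to `m2.7` in a deployment reverts that deployment to the old model while the rest of the request code stays the same.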

Should you upgrade?

Yes, immediately if you:

Need longer context (200K was limiting)
Want multimodal capabilities
Do agentic coding work
Plan to self-host when weights drop
Want better coding quality

Wait if you:

Are extremely cost-sensitive and the 2× price increase matters (use the launch discount to test)
Have production systems extensively tested against M2.7’s specific behavior
Need immediate self-hosting (weights are expected within 10 days of launch)

For most users, the upgrade is straightforward and immediately beneficial.

FAQ

Is M3 worth the 2× price increase over M2.7?

Yes, for most workloads. The capability jump (roughly 14 points on SWE-bench Pro against M2.7’s estimated score, 5× context, multimodal input, computer use) outweighs the 2× cost increase. During the launch discount, M3 costs the same as M2.7.

Will M2.7 be deprecated?

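MiniMax has not announced a deprecation date, and both models are available simultaneously. M3 is the more capable model on every measure covered here, so the main reasons to stay on M2.7 are its lower list price and production systems already validated against its behavior. For new projects, start with M3.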

How does M3 compare to DeepSeek V4-Pro?

DeepSeek is cheaper ($0.435/$0.87 vs $0.60/$2.40 per million input/output tokens) and scores higher on SWE-bench Verified. M3 has native multimodal input, computer use, and faster long-context inference via MSA. See our detailed comparison.

When can I self-host M3?

The weights and technical report are expected within 10 days of the June 1 launch (around June 10-11). See our local deployment guide.

Does M3 work with my existing MiniMax API key?

Yes. Same API key, same endpoint. Just change the model string to minimax-m3.

What is the dedicated coding interface?

MiniMax launched code.minimax.io alongside M3 — a purpose-built coding environment similar to Claude Code or Codex. It provides an optimized interface for coding tasks with M3.