MiniMax M3 launched on June 1, 2026, replacing M2.7 as the flagship model. This is not an incremental update but a generational leap: a new architecture (MSA), a 5× larger context window, native multimodal support, computer use, and open weights (expected within 10 days of launch). Here is everything that changed and whether you should switch immediately.
The short answer: yes, upgrade now. M3 improves on M2.7 in every area covered below. List prices are 2× higher, but the 7-day launch discount matches M2.7 pricing.
What changed at a glance
| Feature | M2.7 | M3 | Change |
|---|---|---|---|
| Architecture | Standard attention | MiniMax Sparse Attention (MSA) | New |
| Context window | 200K tokens | 1,000,000 tokens | 5× |
| Decoding speed (1M) | Baseline | 15.6× faster | Massive |
| Prefill speed (1M) | Baseline | 9.7× faster | Massive |
| Modalities | Text only | Text + images + video | New |
| Computer use | No | Yes (desktop operation) | New |
| SWE-bench Pro | ~45%* | 59.0% | +14 points |
| BrowseComp | N/A | 83.5% | New benchmark |
| SVG-Bench | N/A | 63.7% | New benchmark |
| Input pricing | $0.30/M | $0.60/M | 2× |
| Output pricing | $1.20/M | $2.40/M | 2× |
| Cache reads | N/A | $0.12/M | New |
| Open weight | No (API only) | Yes (expected within 10 days of launch) | New |
| OpenRouter | ✅ | ✅ | Same |
| Coding interface | No | code.minimax.io | New |
*M2.7's SWE-bench Pro score is estimated from available benchmarks.
The MSA architecture
The biggest technical change is MiniMax Sparse Attention (MSA). This is not a minor optimization; it fundamentally changes how the model handles long contexts.
What MSA does:
- Enables fast inference at 1M tokens without compressing key-values
- Delivers 15.6× faster decoding at million-token contexts
- Delivers 9.7× faster prefill at million-token contexts
- Maintains full precision (unlike DeepSeek's MLA, which compresses the KV cache)
Why it matters: With M2.7, using the full 200K context was slow and expensive. With M3, you can use 1M tokens and it is actually faster than M2.7 was at 200K. This makes long-context workloads (entire codebases, long documents, multi-hour agent sessions) practical for the first time.
Context window: 200K → 1M
The 5× context expansion is the most immediately useful change for developers:
| Use case | M2.7 (200K) | M3 (1M) |
|---|---|---|
| Analyze a medium codebase | ✅ Fits | ✅ Fits easily |
| Analyze a large monorepo | ❌ Truncated | ✅ Fits |
| Process a 100-page document | ✅ Fits | ✅ Fits |
| Process a 500-page document | ❌ Truncated | ✅ Fits |
| Multi-hour agent session | ⚠️ Context fills up | ✅ Runs for hours |
| Video understanding | ❌ Not supported | ✅ Native |
For detailed usage patterns, see our MiniMax M3 1M Context Guide.
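To get a feel for which rows of the table apply to your own inputs, a rough pre-flight check can help. The sketch below is only an approximation: the 4-characters-per-token ratio is a common rule of thumb and not MiniMax's tokenizer, and the `reserve_for_output` default is a hypothetical safety margin.

```python
# Rough check of whether a body of text fits in a model's context window.
# CHARS_PER_TOKEN is a rule-of-thumb assumption, not MiniMax's actual tokenizer.
CHARS_PER_TOKEN = 4

# Context limits in tokens, taken from the comparison table.
CONTEXT_LIMITS = {
    "M2.7": 200_000,
    "M3": 1_000_000,
}


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits(text: str, model: str, reserve_for_output: int = 8_000) -> bool:
    """Return True if the estimated prompt plus an output reserve fits the window."""
    return estimate_tokens(text) + reserve_for_output <= CONTEXT_LIMITS[model]


if __name__ == "__main__":
    # Synthetic 2,000,000-character input, roughly 500K estimated tokens.
    sample = "x" * 2_000_000
    for model in CONTEXT_LIMITS:
        print(model, "fits" if fits(sample, model) else "would be truncated")
```

For real workloads, count tokens with the provider's tokenizer once it is available; a character-based estimate can be off by a wide margin for code or non-English text.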
Native multimodal
M2.7 was text-only. M3 handles images and video as first-class inputs:
- Images: Parse UI screenshots, charts, diagrams, documents
- Video: Process video frames for temporal reasoning
- Computer use: Operate a desktop computer (click, type, navigate)
- Visual code generation: Write code from mockups or screenshots
This puts M3 in the same category as Claude Opus 4.8 (which also has computer use) and Step 3.7 Flash (which also handles video).
Open weight
M2.7 was API-only. M3 will be fully open-weight with a technical report, expected within 10 days of launch. This means:
- Self-host on your own infrastructure
- Fine-tune for specific domains
- Run completely offline for data privacy
- Inspect model behavior and architecture
- Community quantizations (GGUF, AWQ, etc.)
See our How to Run MiniMax M3 Locally guide for hardware requirements and setup instructions (ready for when weights drop).
Coding improvements
The coding capability jump is substantial:
- SWE-bench Pro: ~45% → 59.0% (+14 points)
- Terminal-Bench 2.1: 66.0% (new benchmark, competitive with GPT-5.5)
- SVG-Bench: 63.7% (leads all models including Opus 4.7)
- Long-horizon autonomy: Reproduced an ICLR 2025 paper (12 hours, 18 commits, 23 figures)
M3 now beats GPT-5.5 on SWE-bench Pro (59.0% vs 58.6%). For agentic coding workflows, see our MiniMax M3 for Agentic Coding guide.
Pricing changes
M3 is 2× more expensive per token than M2.7:
| | M2.7 | M3 | M3 (launch discount) |
|---|---|---|---|
| Input | $0.30/M | $0.60/M | $0.30/M |
| Output | $1.20/M | $2.40/M | $1.20/M |
| Cache reads | N/A | $0.12/M | $0.06/M |
However, M3 is significantly more capable. The cost per task may actually be lower because:
- Better coding means fewer retries
- MSA speed means faster completion
- 1M context means no context-overflow workarounds
- Cache reads at $0.12/M make repeated contexts cheap
During the 7-day launch discount, M3 costs the same as M2.7 did.
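The per-request arithmetic behind these claims can be sketched in a few lines. The prices come from the table above; the billing model is an assumption made for illustration (cached input tokens are billed at the cache-read rate instead of the input rate, and M2.7 has no cache discount), so check it against your actual invoice.

```python
# USD per million tokens, copied from the pricing table above.
PRICES = {
    "M2.7":          {"input": 0.30, "output": 1.20, "cache_read": None},
    "M3":            {"input": 0.60, "output": 2.40, "cache_read": 0.12},
    "M3 (discount)": {"input": 0.30, "output": 1.20, "cache_read": 0.06},
}


def request_cost(plan, input_tokens, output_tokens, cached_tokens=0):
    """Cost in USD of one request.

    Assumption: cached_tokens is the portion of input_tokens served from
    cache and billed at the cache-read rate instead of the input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    price = PRICES[plan]
    cache_rate = price["cache_read"]
    if cache_rate is None:
        # No cache pricing listed: bill cached tokens as ordinary input.
        cache_rate = price["input"]
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * price["input"]
        + cached_tokens * cache_rate
        + output_tokens * price["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    for plan in PRICES:
        print(plan, round(request_cost(plan, 100_000, 5_000), 4))
    print("M3, 80% cached", round(request_cost("M3", 100_000, 5_000, 80_000), 4))
```

For a 100,000-token prompt with 5,000 output tokens, this gives about $0.036 on M2.7 and $0.072 on M3 at list price. If 80,000 of those input tokens are cache reads, the M3 figure drops to about $0.034, slightly below the M2.7 cost.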
Migration guide
Switching from M2.7 to M3 is a model string change:
# Before
model = "minimax-m2.7"
# After
model = "minimax-m3"
On OpenRouter:
# Before
model = "minimax/minimax-m2.7"
# After
model = "minimax/minimax-m3"
No other code changes needed. The API is compatible. Your existing prompts, system messages, and tool definitions work unchanged.
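If you want the option to roll back during testing, one approach is to keep the model string in a single place and choose it from configuration. This is a minimal sketch: the model strings are the ones shown above, while `resolve_model` and the `MINIMAX_GENERATION` environment variable are hypothetical names invented for this example.

```python
import os

# Model strings from the migration examples above, keyed by (provider, generation).
MODEL_IDS = {
    ("minimax", "m2.7"): "minimax-m2.7",
    ("minimax", "m3"): "minimax-m3",
    ("openrouter", "m2.7"): "minimax/minimax-m2.7",
    ("openrouter", "m3"): "minimax/minimax-m3",
}


def resolve_model(provider="minimax", env=None):
    """Pick the model string for a provider.

    MINIMAX_GENERATION is a hypothetical environment variable used here as a
    rollback switch; it defaults to "m3".
    """
    env = os.environ if env is None else env
    generation = env.get("MINIMAX_GENERATION", "m3")
    try:
        return MODEL_IDS[(provider, generation)]
    except KeyError:
        raise ValueError(
            f"unknown provider/generation: {provider}/{generation}"
        ) from None


if __name__ == "__main__":
    # Default is M3; setting the variable to "m2.7" rolls back without a code change.
    print(resolve_model("minimax", env={}))
    print(resolve_model("openrouter", env={"MINIMAX_GENERATION": "m2.7"}))
```

Passing the result as the `model` value in your existing client code keeps the switch to one line, which is useful if production behavior was validated against M2.7.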
Should you upgrade?
Yes, immediately if you:
- Need longer context (200K was limiting)
- Want multimodal capabilities
- Do agentic coding work
- Plan to self-host when weights drop
- Want better coding quality
Wait if you:
- Are extremely cost-sensitive and the 2× price increase matters (use the launch discount to test)
- Have production systems extensively tested against M2.7's specific behavior
- Need immediate self-hosting (weights are 10 days away)
For most users, the upgrade is straightforward and immediately beneficial.
FAQ
Is M3 worth the 2× price increase over M2.7?
Yes. The capability jump (14+ points on SWE-bench Pro, 5× context, multimodal, computer use) far exceeds the 2× cost increase. During the launch discount, it costs the same as M2.7.
Will M2.7 be deprecated?
MiniMax has not announced a deprecation date. Both models are available simultaneously. However, M3 is strictly better; there is no reason to use M2.7 for new projects.
How does M3 compare to DeepSeek V4-Pro?
DeepSeek is cheaper ($0.435/$0.87 vs $0.60/$2.40 per million input/output tokens) and scores higher on SWE-bench Verified. M3 has native multimodal, computer use, and faster long-context via MSA. See our detailed comparison.
When can I self-host M3?
The weights and technical report are expected within 10 days of the June 1 launch (around June 10-11). See our local deployment guide.
Does M3 work with my existing MiniMax API key?
Yes. Same API key, same endpoint. Just change the model string to minimax-m3.
What is the dedicated coding interface?
MiniMax launched code.minimax.io alongside M3, a purpose-built coding environment similar to Claude Code or Codex. It provides an optimized interface for coding tasks with M3.