Anthropic dropped Claude Opus 4.7 yesterday (April 16, 2026), just ten weeks after Opus 4.6. The headline numbers are impressive — a 10.9-point jump on SWE-bench Pro, near-perfect vision scores, and a bunch of new agentic features. But there’s a catch: a new tokenizer that inflates token counts by up to 35%. Here’s the full breakdown.
At a glance
| | Opus 4.7 | Opus 4.6 |
|---|---|---|
| Released | April 16, 2026 | Feb 5, 2026 |
| SWE-bench Pro | 64.3% | 53.4% |
| CursorBench | 70% | 58% |
| Vision (XBOW) | 98.5% | 54.5% |
| SWE-bench Multilingual | 80.5% | 77.8% |
| Context window | 1M tokens | 1M tokens (beta) |
| Pricing (input/output) | $5 / $25 per 1M | $5 / $25 per 1M |
| Tokenizer | New (up to 35% more tokens) | Standard |
| Effort levels | low, medium, high, xhigh, max | low, medium, high, max |
| Vision resolution | 2,576px long edge (~3.75 MP) | ~1,568px long edge (~1.15 MP) |
| Prefilling | ❌ Not supported (400 error) | ✅ Supported |
The per-token pricing is identical, but the new tokenizer means the same text produces more tokens — so your effective cost goes up. More on that below.
Where Opus 4.7 wins
Coding
The jump from 53.4% to 64.3% on SWE-bench Pro is substantial — that’s a 20% relative improvement. CursorBench tells a similar story, going from 58% to 70%. In practice, Opus 4.7 handles multi-file refactors, complex debugging, and cross-language tasks noticeably better. The SWE-bench Multilingual bump from 77.8% to 80.5% confirms it’s not just English-centric improvements.
Vision
This is the biggest leap. XBOW scores went from 54.5% to 98.5% — nearly doubling. The max resolution jumped to 2,576px on the long edge (~3.75 megapixels), roughly 3x the area of Opus 4.6. If you’re doing screenshot analysis, diagram parsing, or UI review, this is a different model.
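If your pipeline currently downscales screenshots, the new cap changes your target size. A minimal long-edge fitting sketch, assuming only the 2,576px cap described above (the helper name is illustrative, not part of any SDK):

```python
def fit_long_edge(width: int, height: int, cap: int = 2576) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most `cap`,
    preserving aspect ratio. Images already within the cap are untouched."""
    long_edge = max(width, height)
    if long_edge <= cap:
        return width, height
    scale = cap / long_edge
    return round(width * scale), round(height * scale)
```

A 4K screenshot (3840×2160) now fits to 2576×1449 instead of being squeezed down to the old ~1,568px limit, which is where most of the extra detail comes from.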
New features
- `xhigh` effort level — slots between `high` and `max`, giving you a better cost/quality tradeoff for tasks that need more reasoning but don’t justify `max` compute.
- File system memory — the model can persist context across sessions by reading/writing to the file system. This is a big deal for agentic workflows.
- Self-verification — Opus 4.7 checks its own outputs before returning them. This reduces hallucinations and incorrect code, especially on longer tasks.
- Adaptive thinking is now the default — the `budget_tokens` parameter is deprecated. The model decides how much to think on its own.
- Task budgets (beta) — set spending limits on agentic tasks.
- Auto mode for Max users — Claude.ai Max subscribers get automatic model routing.
- /ultrareview in Claude Code — a new command for deep code review passes.
The elephant in the room: the tokenizer tax
The pricing table says $5/$25 per million tokens — same as Opus 4.6. But Opus 4.7 ships with a new tokenizer that produces up to 35% more tokens for the same input text. This means:
- A prompt that cost $1.00 on Opus 4.6 could cost up to $1.35 on Opus 4.7
- Your context window fills up faster
- Rate limits hit sooner
Anthropic hasn’t said much about why the tokenizer changed. The likely explanation is that the new tokenizer supports the improved multilingual and vision capabilities, but the cost impact is real. If you’re running high-volume API workloads, benchmark your actual token counts before switching. The 35% figure is a worst case — typical English text seems to land around 15-25% more tokens in early testing.
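To estimate the impact on your own bill, a back-of-the-envelope sketch; the inflation factors below are the ranges quoted above, not measured values, and the prices are the $5/$25 per-million rates from the table:

```python
def effective_cost(input_tokens: int, output_tokens: int,
                   inflation: float = 1.0,
                   price_in: float = 5.00, price_out: float = 25.00) -> float:
    """USD cost for one request at $/1M-token rates, with both input and
    output token counts scaled by a tokenizer inflation factor."""
    tokens_in = input_tokens * inflation
    tokens_out = output_tokens * inflation
    return (tokens_in * price_in + tokens_out * price_out) / 1_000_000

# A workload worth $1.00 of input tokens on Opus 4.6:
baseline = effective_cost(200_000, 0)        # $1.00 on the old tokenizer
worst = effective_cost(200_000, 0, 1.35)     # $1.35 at the 35% worst case
typical = effective_cost(200_000, 0, 1.20)   # $1.20 at ~20% typical inflation
```

The honest version of this exercise replaces the inflation factor with real counts from your own prompts, since inflation varies by language and content type.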
There’s no way to use the old tokenizer with the new model.
The Opus 4.6 degradation context
It’s impossible to talk about Opus 4.7 without addressing what happened to Opus 4.6 in the weeks before this release.
Starting in mid-March, users began reporting that Opus 4.6 felt noticeably worse — shorter responses, shallower reasoning, more refusals. This wasn’t just vibes. A HuggingFace analysis across 6,852 sessions documented a 67% drop in reasoning depth. BridgeBench accuracy fell from 83.3% (ranked #2) to 68.3% (ranked #10). An AMD senior director posted forensic evidence on GitHub showing measurable degradation in structured outputs.
The community coined the term “AI shrinkflation” — same price, less capability. Anthropic denied modifying the model weights.
Whether the degradation was intentional, a side effect of infrastructure changes, or something else entirely remains unclear. What is clear: many users feel that Opus 4.7 restores the quality they were getting from Opus 4.6 at launch. Some have joked that Opus 4.7 feels like “early Opus 4.6” — which, depending on your perspective, is either reassuring or concerning.
Migration checklist
If you’re moving API code from Opus 4.6 to 4.7, here’s what to change:
- Update the model string — use the new Opus 4.7 model identifier in your API calls.
- Remove `budget_tokens` — this parameter is deprecated. Adaptive thinking is now the default. Sending `budget_tokens` still works but is ignored.
- Remove any prefilling — Opus 4.7 returns a 400 error if you include assistant prefill content. Strip any `assistant` message prefills from your requests.
- Test your token counts — the new tokenizer will inflate your usage. Run your typical prompts through the tokenizer endpoint and compare.
- Consider `xhigh` effort — if you were using `max` effort and finding it slow/expensive, try `xhigh` as a middle ground.
- Update vision pipelines — if you were downscaling images for Opus 4.6, you can now send higher-resolution images (up to 2,576px long edge) for better results.
- Test agent workflows — file system memory and self-verification change how the model behaves in multi-turn agentic loops. Test thoroughly.
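The request-shape parts of the checklist can be applied mechanically. A sketch that rewrites a Messages-API-style payload dict — the `thinking`/`budget_tokens` field names follow Anthropic's existing API shape, but the target model id is a placeholder, since this post doesn't give the exact 4.7 identifier:

```python
def migrate_request(payload: dict, new_model: str) -> dict:
    """Return a copy of a Messages-API-style request dict adjusted for the
    4.6 -> 4.7 changes: new model id, no thinking budget, no assistant prefill."""
    out = dict(payload)
    out["model"] = new_model

    # budget_tokens is deprecated: drop it and let adaptive thinking decide.
    thinking = dict(out.get("thinking", {}))
    thinking.pop("budget_tokens", None)
    if thinking:
        out["thinking"] = thinking
    else:
        out.pop("thinking", None)

    # Prefilling now returns a 400: strip a trailing assistant message.
    messages = list(out.get("messages", []))
    if messages and messages[-1].get("role") == "assistant":
        messages.pop()
    out["messages"] = messages
    return out
```

This only covers the mechanical changes; the token-count and agent-workflow items above still need real testing against your workloads.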
Should you upgrade?
Claude.ai / Max users: Yes, just use it. You get better coding, dramatically better vision, and new features like /ultrareview and auto mode. The tokenizer change doesn’t affect you on subscription plans.
API users (low volume): Probably yes. The quality improvements are significant, especially for coding and vision tasks. The tokenizer cost increase is manageable at low volumes. Just remove prefilling and budget_tokens first.
API users (high volume): Test first. Run your actual workloads and measure the token count difference. For some use cases the 15-35% token inflation could meaningfully impact costs. If you’re heavily reliant on prefilling, you’ll need to refactor that out entirely. The quality gains are real, but so is the cost increase — do the math for your specific usage.
If you were affected by Opus 4.6 degradation: Upgrade immediately. Early reports suggest Opus 4.7 restores (and exceeds) the quality level of Opus 4.6 at launch.
Related: Claude Opus 4 vs GPT-5 · How to Use Claude Code · AI Coding Tools Pricing 2026 · AI Model Comparison