One of the biggest practical changes in Claude Sonnet 5 is selectable reasoning effort. Where Sonnet 4.6 ran at a fixed depth, Sonnet 5 lets you choose how hard the model thinks: low, medium, high, max, and x-high. Used well, effort levels are your single most powerful cost and quality lever. Used carelessly, they quietly erase the model's price advantage. Here is how to tune them.
What effort levels do
Higher effort means the model spends more reasoning tokens before answering. That generally improves accuracy on hard tasks, at the cost of more tokens and more latency. Lower effort is faster and cheaper, and for routine work it is often just as good.
| Effort level | Best for |
|---|---|
| low | Boilerplate, simple edits, formatting, quick lookups |
| medium | Standard feature work, routine bug fixes, most coding |
| high | Multi-file changes, trickier debugging, careful reasoning |
| max | Hard problems where accuracy clearly matters |
| x-high | The toughest reasoning tasks, used sparingly |
The cost trap you must understand
Here is the detail that surprises people. At its x-high setting, Sonnet 5 performs roughly on par with Opus 4.8 at a medium-to-high setting on benchmarks like OSWorld and BrowseComp. But running Sonnet 5 at x-high can cost more than running Opus 4.8 at that comparable accuracy point.
In other words, maxing out Sonnet 5 to match the flagship is not always the cheaper path. The savings from choosing Sonnet 5 come from running it at lower effort for the bulk of work, not from pushing it to its ceiling on everything.
This compounds with the new tokenizer, which can raise effective token counts by up to 1.35 times. See pricing explained for the full math.
A practical default strategy
- Set medium as your baseline. It handles most coding and knowledge work well.
- Drop to low for routine tasks. Boilerplate, formatting, and simple edits rarely need deep reasoning.
- Bump to high only when a task needs it. Reserve it for multi-file changes and harder debugging.
- Use max and x-high sparingly. Before reaching for x-high, ask whether Opus 4.8 at lower effort would be cheaper for the same accuracy.
- Escalate models, not just effort. For genuinely hard problems, switching to Opus 4.8 is often smarter than maxing out Sonnet 5.
Effort in Claude Code and the API
In the API, set the effort level per request so you can match it to task difficulty programmatically. In Claude Code, use lower effort for routine edits and raise it for complex work. The same discipline applies in Aider.
Measuring the tradeoff
Track tokens and outcomes per effort level on your real tasks. You will usually find that medium handles a large majority of work at a fraction of the cost of high or max. That data lets you set sensible defaults and avoid paying for reasoning you do not need. For tooling, see monitor and control AI API spending.
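One way to do this tracking is to log each run and aggregate by effort level. The following is a minimal sketch, not a definitive implementation: the `RunRecord` fields, the `summarize_by_effort` helper, and the sample numbers are all hypothetical, and "succeeded" stands for whatever acceptance check you already use (tests passing, a reviewer approving, and so on).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RunRecord:
    effort: str        # effort level used for the run, e.g. "medium"
    total_tokens: int  # all tokens billed for the run
    succeeded: bool    # did the result pass your own acceptance check?


def summarize_by_effort(records):
    """Group runs by effort level and report volume, token use, and success."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.effort].append(record)

    summary = {}
    for effort, runs in grouped.items():
        summary[effort] = {
            "runs": len(runs),
            "avg_tokens": sum(r.total_tokens for r in runs) / len(runs),
            "success_rate": sum(r.succeeded for r in runs) / len(runs),
        }
    return summary


# Made-up sample data, for illustration only.
sample_runs = [
    RunRecord("medium", 4000, True),
    RunRecord("medium", 6000, True),
    RunRecord("high", 15000, True),
    RunRecord("high", 25000, False),
]
summary = summarize_by_effort(sample_runs)
for effort, stats in summary.items():
    print(effort, stats)
```

If the success rate at medium is close to the rate at high while average tokens are far lower, that is the evidence for keeping medium as the default.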
A worked example of the cost trap
Imagine a moderately hard debugging task. You could run it on Sonnet 5 at x-high effort, which uses a large number of reasoning tokens and gets you to roughly the accuracy of Opus 4.8 at a medium-to-high setting. But because x-high burns so many tokens, the total cost of that Sonnet 5 run can exceed what Opus 4.8 would have cost at that comparable accuracy. In other words, you paid mid-tier token rates at such high volume that the bill exceeded the flagship's, for flagship-level accuracy you could have bought more directly. The fix is simple: when a task is hard enough to need x-high, that is usually the signal to switch models, not to crank effort.
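The arithmetic behind this trap can be sketched in a few lines. Everything numeric here is a placeholder chosen for illustration: the per-million-token prices are not real list prices, the token counts are invented, and the sketch assumes reasoning tokens are billed at the output rate. Substitute your own measured figures.

```python
def run_cost(input_tokens, output_tokens, input_price, output_price):
    """Dollar cost of one run; prices are dollars per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical prices (dollars per million tokens), NOT real pricing.
MID_TIER = {"input": 3.0, "output": 15.0}
FLAGSHIP = {"input": 5.0, "output": 25.0}

# Hypothetical token usage for the same debugging task.
# Output counts include reasoning tokens (an assumption of this sketch).
mid_tier_xhigh_cost = run_cost(20_000, 60_000, MID_TIER["input"], MID_TIER["output"])
flagship_medium_cost = run_cost(20_000, 25_000, FLAGSHIP["input"], FLAGSHIP["output"])

print(f"mid-tier at x-high: ${mid_tier_xhigh_cost:.3f}")
print(f"flagship at medium: ${flagship_medium_cost:.3f}")
print("mid-tier run is more expensive:", mid_tier_xhigh_cost > flagship_medium_cost)
```

With these made-up inputs the cheaper-per-token model ends up as the more expensive run, because the extra reasoning volume outweighs the lower rate. Whether that holds for your workload depends entirely on the token counts you actually observe.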
Mapping effort to common tasks
- Formatting, renaming, simple edits: low. These need almost no reasoning.
- Standard feature work, routine bug fixes, writing tests: medium. This is your default.
- Multi-file changes, non-obvious bugs, careful analysis: high.
- Genuinely hard reasoning where accuracy is critical: max, used sparingly.
- The toughest problems: consider Opus 4.8 instead of x-high on Sonnet 5.
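The mapping above could be encoded as a small lookup so that effort is chosen by task type rather than by habit. This is a rough sketch under stated assumptions: the task-type keys, the `choose_model_and_effort` helper, and the model labels are hypothetical names for illustration (they are not API model identifiers), and the effort level used after escalating to the flagship is a guess you should tune.

```python
DEFAULT_EFFORT = "medium"

# Task type -> effort level, following the list above.
EFFORT_BY_TASK = {
    "formatting": "low",
    "renaming": "low",
    "simple_edit": "low",
    "feature_work": "medium",
    "routine_bug_fix": "medium",
    "writing_tests": "medium",
    "multi_file_change": "high",
    "non_obvious_bug": "high",
    "careful_analysis": "high",
    "critical_reasoning": "max",
}

# Task types where switching models is preferred over x-high effort.
ESCALATE_TO_FLAGSHIP = {"toughest_problem"}


def choose_model_and_effort(task_type):
    """Return a (model_label, effort) pair for a task type."""
    if task_type in ESCALATE_TO_FLAGSHIP:
        # Placeholder label and effort; "medium" here is an assumption.
        return ("opus-4.8", "medium")
    # Unknown task types fall back to the medium default.
    return ("sonnet-5", EFFORT_BY_TASK.get(task_type, DEFAULT_EFFORT))


print(choose_model_and_effort("renaming"))
print(choose_model_and_effort("non_obvious_bug"))
print(choose_model_and_effort("toughest_problem"))
```

Keeping the table in one place makes it easy to revise once you have real per-effort measurements.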
How effort interacts with the tokenizer
Effort levels and the new tokenizer compound. The tokenizer can raise your input token count by up to 1.35 times, and higher effort raises reasoning tokens on top of that. A task run at x-high on long context can therefore cost several times what the same task costs at medium on trimmed context. This is why disciplined effort selection plus context trimming, covered in the token efficiency guide, is the core of keeping Sonnet 5 cheap.
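To see how the two effects stack, here is a back-of-the-envelope sketch. The 1.35 multiplier is the upper bound quoted above; the token counts are invented, and the helper `estimated_tokens` is hypothetical. It compares token volume only, which is a simplification, since input and reasoning tokens are typically priced differently.

```python
TOKENIZER_MULTIPLIER_MAX = 1.35  # upper bound quoted in the text


def estimated_tokens(old_input_tokens, reasoning_tokens,
                     tokenizer_multiplier=TOKENIZER_MULTIPLIER_MAX):
    """Estimate total tokens: inflated input plus reasoning on top."""
    return old_input_tokens * tokenizer_multiplier + reasoning_tokens


# Invented figures: trimmed context at medium vs. long context at x-high.
medium_trimmed = estimated_tokens(old_input_tokens=10_000, reasoning_tokens=5_000)
xhigh_long = estimated_tokens(old_input_tokens=40_000, reasoning_tokens=40_000)

ratio = xhigh_long / medium_trimmed
print(f"medium, trimmed context: {medium_trimmed:,.0f} tokens")
print(f"x-high, long context:    {xhigh_long:,.0f} tokens")
print(f"ratio: {ratio:.1f}x")
```

With these made-up numbers the x-high run uses roughly five times the tokens of the medium run, which is the "several times" gap described above.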
Frequently asked questions
What effort levels does Sonnet 5 support? low, medium, high, max, and x-high (extra high).
Which effort level should I use by default? Medium for most work, dropping to low for routine tasks and raising to high only when needed.
Does higher effort cost more? Yes. Higher effort uses more reasoning tokens, so it directly increases cost.
Is x-high worth it? Rarely as a default. At x-high, Sonnet 5 can cost more than Opus 4.8 at a comparable accuracy point, so consider escalating to Opus instead.
Does higher effort change the model's knowledge? No. Effort controls how much the model reasons before answering, not what it knows. Higher effort helps on problems that benefit from deeper step-by-step thinking, not on simple recall.
Can I change effort per request? Yes, through the API you set effort per call, which lets you match it to each task's difficulty programmatically. In Claude Code and Cursor you control it at the session or task level.
The bottom line
Effort levels are where Sonnet 5's value is won or lost. Default to medium, drop to low for routine work, and resist the urge to run everything at max. When a task truly needs more, escalating to Opus 4.8 is often cheaper than maxing out Sonnet. For the full model picture, read the complete guide.