A living comparison of the major AI models. Last updated: May 29, 2026.
Quick Comparison
| Model | Provider | Context | Input $/1M | Output $/1M | Best For |
|---|---|---|---|---|---|
| Claude Opus 4.8 π | Anthropic | 1M | $5 | $25 | Best coding model, agentic tasks, vision |
| Claude Opus 4.6 | Anthropic | 1M (beta) | $5 | $25 | Complex coding, agentic teams |
| Claude Sonnet 4.6 | Anthropic | 1M | $3 | $15 | Best value for coding |
| Claude Haiku 3.5 | Anthropic | 200K | $0.80 | $4 | Fast tasks, high volume |
| GPT-5.4 | OpenAI | 1M | $2.50 | $15 | Computer use, reasoning |
| GPT-4o | OpenAI | 128K | $2.50 | $10 | Multimodal, fast |
| GPT-4o Mini | OpenAI | 128K | $0.15 | $0.60 | Budget tasks, high volume |
| Gemini 3.1 Pro | 1M | $2 | $12 | Reasoning, multimodal, research | |
| Gemini 3.1 Flash-Lite | 1M | $0.25 | $1.50 | Cheapest option, enterprise scale | |
| Llama 3.1 405B | Meta (open) | 128K | Free (self-host) | Free (self-host) | Privacy, no API costs |
| Mistral Large | Mistral | 128K | $2 | $6 | European alternative, multilingual |
| MiMo-V2-Pro | Xiaomi | 1M | $1 | $3 | Agent tasks, budget frontier |
| MiMo V2.5 Pro π | Xiaomi | 1M | $1 | $3 | 57.2% SWE-bench Pro, 40-60% fewer tokens than Opus |
| DeepSeek V4-Pro π | DeepSeek (open) | 1M | $1.74 | $3.48 | 80.6% SWE-bench, 1.6T/49B MoE, MIT. Guide |
| DeepSeek V4-Flash π | DeepSeek (open) | 1M | $0.14 | $0.28 | 79.0% SWE-bench, 284B/13B MoE, MIT. Cheapest frontier model |
| Kimi K2.6 π | Moonshot AI (open) | 256K | $0.60 | $3.00 | Agentic coding, 300 sub-agent swarm |
| Qwen 3.6-35B-A3B π | Alibaba (open) | 262K (1M ext.) | Free (self-host) | Free (self-host) | Local coding agent, 3B active MoE |
Pricing from official provider pages as of March 2026. Always verify before committing to a model.
Whatβs new in April 2026
- DeepSeek V4-Pro & V4-Flash (April 24) β V4-Pro: 1.6T/49B MoE, 80.6% SWE-bench, $1.74/$3.48. V4-Flash: 284B/13B MoE, 79.0% SWE-bench, $0.14/$0.28. Both 1M context, MIT licensed. V4-Pro guide
- MiMo V2.5 Pro (April 23) β Xiaomiβs upgraded flagship. 57.2% SWE-bench Pro, 40-60% fewer tokens than Opus 4.6 at the same $1/$3 pricing. Complete guide
- Kimi K2.6 (April 20) β Open-source 1T/32B MoE agentic model. 80.2% SWE-Bench Verified, 300 sub-agent swarm, matches Opus 4.6 on coding at 25x lower cost. Modified MIT license. Complete guide Β· K2.6 vs K2.5 Β· K2.6 vs Opus 4.6 Β· K2.6 vs GPT-5.4
- Claude Opus 4.8 (May 28) β 69.2% SWE-bench Pro, dynamic workflows (hundreds of parallel subagents), 4Γ fewer unflagged errors, effort control, fast mode 3Γ cheaper. Same pricing as 4.7. Complete guide Β· Opus 4.8 vs 4.7 Β· Opus 4.8 vs GPT-5.5
- Claude Opus 4.7 (April 16) β 64.3% SWE-bench Pro, 98.5% vision accuracy, new xhigh effort level, /ultrareview in Claude Code. Same pricing as 4.6 but new tokenizer uses up to 35% more tokens. Complete guide Β· Opus 4.7 vs 4.6 Β· Opus 4.7 vs GPT-5.4
- Qwen 3.6-35B-A3B (April 15) β Open-weight 35B MoE with only 3B active parameters. 73.4% SWE-bench Verified, runs on a laptop (~21 GB quantized), Apache 2.0. Complete guide
Whatβs new in March 2026
- Xiaomi MiMo-V2-Pro (March 18) β Trillion-parameter MoE agent model. 1M context at $1/$3 per million tokens. Full breakdown.
- Xiaomi MiMo-V2-Flash (December 2025) β Open-source 309B MoE, 15B active. #1 on SWE-Bench for open-source at $0.10/$0.30. Full breakdown.
- Xiaomi MiMo-V2-Omni (March 18) β Multimodal model processing text, images, video, and 10+ hours of audio. Full breakdown.
- GPT-5.4 (March 5) β OpenAIβs most capable model. 1M context, native computer use, 75% on OSWorld (above human baseline of 72.4%).
- Gemini 3.1 Pro (Feb 19) β 77.1% on ARC-AGI-2 (double its predecessor). Best reasoning-to-cost ratio.
- Gemini 3.1 Flash-Lite (March 3) β $0.25/1M input. 2.5x faster than Gemini 2.5 Flash.
- Claude Sonnet 4.6 (Feb 17) β Near-Opus performance at Sonnet pricing. 79.6% SWE-bench.
- Claude Opus 4.6 (Feb 5) β 1M context, 128K output, collaborative agent teams.
Which model should you use?
For coding: Claude Opus 4.8 β 69.2% on SWE-bench Pro, dynamic workflows for codebase-scale tasks. The undisputed king of coding benchmarks. If you want near-Opus quality at lower cost, DeepSeek V4-Pro at $0.435/$0.87 scores 80.6% on SWE-bench Verified.
For reasoning & research: Gemini 3.1 Pro β 77.1% on ARC-AGI-2 and 94.3% on GPQA Diamond. Best for complex analytical tasks.
For computer use / agents: GPT-5.4 β 75% on OSWorld, native software control. First model to beat human baseline on desktop tasks.
For huge documents: Any of the 1M-context models (Claude 4.6, GPT-5.4, Gemini 3.1). All three now support 1M tokens.
On a budget: MiMo-V2-Flash at $0.10/$0.30 per million tokens β open source, 73.4% SWE-Bench. Gemini 3.1 Flash-Lite at $0.25/$1.50 is another option.
For privacy: Llama 3.1 405B. Run locally with Ollama β your data stays on your machine.
Subscription plans compared
| Plan | Price | What you get |
|---|---|---|
| ChatGPT Plus | $20/mo | GPT-4o, limited GPT-5.4, image gen, browsing |
| ChatGPT Pro | $200/mo | Unlimited GPT-5.4, o1 pro mode |
| Claude Pro | $20/mo | Sonnet 4.6 + limited Opus 4.6, priority access |
| Claude Max | $100-200/mo | Higher Opus limits, extended thinking |
| Gemini Advanced | $20/mo | Gemini 3.1 Pro, 1M context, Google integration |
Deep dive comparisons
Want more detail? Check out our head-to-head comparisons:
- Claude Opus 4.8 Complete Guide β Everything about the new flagship
- Claude Opus 4.8 vs 4.7 β What changed, is it worth upgrading?
- Claude Opus 4.8 vs GPT-5.5 β Head to head
- Claude Opus 4.7 Complete Guide β Previous flagship
- Claude Opus 4.7 vs 4.6 β What changed, is it worth upgrading?
- Claude Opus 4.7 vs GPT-5.4 β Head to head
- Claude Opus 4.6 vs 4.5 β What changed
- Claude Sonnet 4.6 vs 4.5 β What changed
- Sonnet 4.6 vs Opus 4.6 β Is Opus worth the premium?
- Claude Opus 4 vs GPT-5 β Head to head
- MiMo-V2-Pro vs Claude vs GPT β Where Xiaomiβs model stands
How to access via API
All providers offer pay-as-you-go API access with free tiers:
- OpenAI: platform.openai.com
- Anthropic: console.anthropic.com
- Google: ai.google.dev
This page is updated with every major model release. Bookmark it or subscribe to our newsletter to get notified.
Related: Free AI Token Counter
Related: Best Free AI Models in 2026: Llama, Mistral, DeepSeek and More