A living comparison of the major AI models. Last updated: April 24, 2026.
Quick Comparison
| Model | Provider | Context | Input $/1M | Output $/1M | Best For |
|---|---|---|---|---|---|
| Claude Opus 4.7 | Anthropic | 1M | $5 | $25 | Best coding model, agentic tasks, vision |
| Claude Opus 4.6 | Anthropic | 1M (beta) | $5 | $25 | Complex coding, agentic teams |
| Claude Sonnet 4.6 | Anthropic | 1M | $3 | $15 | Best value for coding |
| Claude Haiku 3.5 | Anthropic | 200K | $0.80 | $4 | Fast tasks, high volume |
| GPT-5.4 | OpenAI | 1M | $2.50 | $15 | Computer use, reasoning |
| GPT-4o | OpenAI | 128K | $2.50 | $10 | Multimodal, fast |
| GPT-4o Mini | OpenAI | 128K | $0.15 | $0.60 | Budget tasks, high volume |
| Gemini 3.1 Pro | Google | 1M | $2 | $12 | Reasoning, multimodal, research |
| Gemini 3.1 Flash-Lite | Google | 1M | $0.25 | $1.50 | Cheapest option, enterprise scale |
| Llama 3.1 405B | Meta (open) | 128K | Free (self-host) | Free (self-host) | Privacy, no API costs |
| Mistral Large | Mistral | 128K | $2 | $6 | European alternative, multilingual |
| MiMo-V2-Pro | Xiaomi | 1M | $1 | $3 | Agent tasks, budget frontier |
| MiMo V2.5 Pro | Xiaomi | 1M | $1 | $3 | 57.2% SWE-bench Pro, 40-60% fewer tokens than Opus |
| DeepSeek V4-Pro | DeepSeek (open) | 1M | $1.74 | $3.48 | 80.6% SWE-bench, 1.6T/49B MoE, MIT. Guide |
| DeepSeek V4-Flash | DeepSeek (open) | 1M | $0.14 | $0.28 | 79.0% SWE-bench, 284B/13B MoE, MIT. Cheapest frontier model |
| Kimi K2.6 | Moonshot AI (open) | 256K | $0.60 | $3.00 | Agentic coding, 300 sub-agent swarm |
| Qwen 3.6-35B-A3B | Alibaba (open) | 262K (1M ext.) | Free (self-host) | Free (self-host) | Local coding agent, 3B active MoE |
Pricing from official provider pages as of April 2026. Always verify current rates before committing to a model.
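Per-request cost follows directly from the table's per-million rates. A minimal sketch, using a few rates from the table above (model keys here are illustrative labels, not official API model IDs):

```python
# Estimate the dollar cost of one request from per-1M-token rates.
# Rates are taken from the comparison table above; verify before relying on them.
RATES = {
    "claude-sonnet-4.6": (3.00, 15.00),   # (input $/1M, output $/1M)
    "gpt-4o-mini": (0.15, 0.60),
    "deepseek-v4-flash": (0.14, 0.28),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single request at the listed per-million rates."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: a 10K-token prompt with a 2K-token reply on Sonnet 4.6
print(round(request_cost("claude-sonnet-4.6", 10_000, 2_000), 4))  # prints 0.06
```

Note how output tokens dominate for chat-heavy workloads: at Sonnet 4.6 rates, the 2K-token reply above costs as much as the entire 10K-token prompt.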
What's new in April 2026
- DeepSeek V4-Pro & V4-Flash (April 24) – V4-Pro: 1.6T/49B MoE, 80.6% SWE-bench, $1.74/$3.48. V4-Flash: 284B/13B MoE, 79.0% SWE-bench, $0.14/$0.28. Both 1M context, MIT licensed. V4-Pro guide
- MiMo V2.5 Pro (April 23) – Xiaomi's upgraded flagship. 57.2% SWE-bench Pro, 40-60% fewer tokens than Opus 4.6 at the same $1/$3 pricing. Complete guide
- Kimi K2.6 (April 20) – Open-source 1T/32B MoE agentic model. 80.2% SWE-bench Verified, 300 sub-agent swarm, matches Opus 4.6 on coding at 25x lower cost. Modified MIT license. Complete guide · K2.6 vs K2.5 · K2.6 vs Opus 4.6 · K2.6 vs GPT-5.4
- Claude Opus 4.7 (April 16) – 64.3% SWE-bench Pro, 98.5% vision accuracy, new xhigh effort level, /ultrareview in Claude Code. Same pricing as 4.6, but the new tokenizer uses up to 35% more tokens. Complete guide · Opus 4.7 vs 4.6 · Opus 4.7 vs GPT-5.4
- Qwen 3.6-35B-A3B (April 15) – Open-weight 35B MoE with only 3B active parameters. 73.4% SWE-bench Verified, runs on a laptop (~21 GB quantized), Apache 2.0. Complete guide
What's new in March 2026
- Xiaomi MiMo-V2-Pro (March 18) – Trillion-parameter MoE agent model. 1M context at $1/$3 per million tokens. Full breakdown.
- Xiaomi MiMo-V2-Flash (December 2025) – Open-source 309B MoE, 15B active. #1 on SWE-bench among open-source models at $0.10/$0.30. Full breakdown.
- Xiaomi MiMo-V2-Omni (March 18) – Multimodal model processing text, images, video, and 10+ hours of audio. Full breakdown.
- GPT-5.4 (March 5) – OpenAI's most capable model. 1M context, native computer use, 75% on OSWorld (above the human baseline of 72.4%).
- Gemini 3.1 Pro (Feb 19) – 77.1% on ARC-AGI-2 (double its predecessor's score). Best reasoning-to-cost ratio.
- Gemini 3.1 Flash-Lite (March 3) – $0.25/1M input. 2.5x faster than Gemini 2.5 Flash.
- Claude Sonnet 4.6 (Feb 17) – Near-Opus performance at Sonnet pricing. 79.6% SWE-bench.
- Claude Opus 4.6 (Feb 5) – 1M context, 128K output, collaborative agent teams.
Which model should you use?
For coding: Claude Opus 4.7 – 64.3% on SWE-bench Pro, 70% on CursorBench. The new king of coding benchmarks. If you want near-Opus quality at lower cost, Sonnet 4.6 at $3/$15 is still excellent.
For reasoning & research: Gemini 3.1 Pro – 77.1% on ARC-AGI-2 and 94.3% on GPQA Diamond. Best for complex analytical tasks.
For computer use / agents: GPT-5.4 – 75% on OSWorld, native software control. First model to beat the human baseline on desktop tasks.
For huge documents: Any of the 1M-context models (Claude Opus 4.7, GPT-5.4, Gemini 3.1 Pro). All three support 1M tokens.
On a budget: MiMo-V2-Flash at $0.10/$0.30 per million tokens – open source, 73.4% SWE-bench. DeepSeek V4-Flash ($0.14/$0.28) and Gemini 3.1 Flash-Lite ($0.25/$1.50) are other options.
For privacy: Llama 3.1 405B. Run it locally with Ollama – your data stays on your machine.
Subscription plans compared
| Plan | Price | What you get |
|---|---|---|
| ChatGPT Plus | $20/mo | GPT-4o, limited GPT-5.4, image gen, browsing |
| ChatGPT Pro | $200/mo | Unlimited GPT-5.4, o1 pro mode |
| Claude Pro | $20/mo | Sonnet 4.6 + limited Opus 4.6, priority access |
| Claude Max | $100-200/mo | Higher Opus limits, extended thinking |
| Gemini Advanced | $20/mo | Gemini 3.1 Pro, 1M context, Google integration |
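Whether a $20/mo plan beats pay-as-you-go depends on your monthly token volume. A rough break-even sketch using Sonnet 4.6 API rates from the table above; the 80/20 input-to-output token split is our assumption, not a provider figure:

```python
# Rough break-even: at what monthly token volume does pay-as-you-go API
# usage cost the same as a flat subscription?
SUB_PRICE = 20.00                 # $/month (e.g. Claude Pro, ChatGPT Plus)
IN_RATE, OUT_RATE = 3.00, 15.00   # Sonnet 4.6, $/1M tokens

def breakeven_tokens(sub_price: float, in_rate: float, out_rate: float,
                     in_frac: float = 0.8) -> int:
    """Total monthly tokens at which API spend equals the subscription price.

    in_frac is the assumed share of tokens that are input (prompt) tokens.
    """
    blended = in_frac * in_rate + (1 - in_frac) * out_rate  # $/1M tokens
    return int(sub_price / blended * 1_000_000)

print(breakeven_tokens(SUB_PRICE, IN_RATE, OUT_RATE))
```

Under these assumptions the break-even is roughly 3.7M tokens per month; below that, pay-as-you-go is cheaper, though subscriptions also bundle app features (browsing, image generation) the API does not.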
Deep dive comparisons
Want more detail? Check out our head-to-head comparisons:
- Claude Opus 4.7 Complete Guide – Everything about the new flagship
- Claude Opus 4.7 vs 4.6 – What changed, and is it worth upgrading?
- Claude Opus 4.7 vs GPT-5.4 – Head to head
- Claude Opus 4.6 vs 4.5 – What changed
- Claude Sonnet 4.6 vs 4.5 – What changed
- Sonnet 4.6 vs Opus 4.6 – Is Opus worth the premium?
- Claude Opus 4 vs GPT-5 – Head to head
- MiMo-V2-Pro vs Claude vs GPT – Where Xiaomi's model stands
How to access via API
All providers offer pay-as-you-go API access (some with free tiers):
- OpenAI: platform.openai.com
- Anthropic: console.anthropic.com
- Google: ai.google.dev
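All three APIs take a similar chat-style JSON body. As a sketch, here is the general shape of a request body for Anthropic's Messages API (POST to api.anthropic.com/v1/messages, authenticated with an x-api-key header); the model ID below is a placeholder based on the table, not a verified identifier:

```python
import json

# Shape of a Messages API request body. Sending it requires your own API key;
# here we only construct and inspect the payload.
body = {
    "model": "claude-sonnet-4.6",  # placeholder ID; check the provider's model list
    "max_tokens": 1024,
    "messages": [
        {"role": "user", "content": "Summarize this document in three bullets."}
    ],
}
print(json.dumps(body, indent=2))
```

OpenAI's and Google's endpoints use the same model/messages pattern with different field names and auth headers; check each provider's API reference for the exact schema.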
This page is updated with every major model release. Bookmark it or subscribe to our newsletter to get notified.
Related: Free AI Token Counter
Related: Best Free AI Models in 2026: Llama, Mistral, DeepSeek and More