A living comparison of the major AI models. Last updated: March 2026.
Quick Comparison
| Model | Provider | Context | Input $/1M | Output $/1M | Best For |
|---|---|---|---|---|---|
| Claude Opus 4.6 | Anthropic | 1M (beta) | $5 | $25 | Complex coding, agentic teams |
| Claude Sonnet 4.6 | Anthropic | 1M | $3 | $15 | Best value for coding |
| Claude Haiku 3.5 | Anthropic | 200K | $0.80 | $4 | Fast tasks, high volume |
| GPT-5.4 | OpenAI | 1M | $2.50 | $15 | Computer use, reasoning |
| GPT-4o | OpenAI | 128K | $2.50 | $10 | Multimodal, fast |
| GPT-4o Mini | OpenAI | 128K | $0.15 | $0.60 | Budget tasks, high volume |
| Gemini 3.1 Pro | Google | 1M | $2 | $12 | Reasoning, multimodal, research |
| Gemini 3.1 Flash-Lite | Google | 1M | $0.25 | $1.50 | Cheapest option, enterprise scale |
| Llama 3.1 405B | Meta (open) | 128K | Free (self-host) | Free (self-host) | Privacy, no API costs |
| Mistral Large | Mistral | 128K | $2 | $6 | European alternative, multilingual |
Pricing from official provider pages as of March 2026. Always verify before committing to a model.
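To see what these per-token prices mean in practice, here is a minimal Python sketch that estimates the cost of a single request from the table above. The dictionary keys are shorthand labels for this example, not official API model ids.

```python
# Rough per-request cost estimator using the table's March 2026 prices.
# Prices are (input $/1M tokens, output $/1M tokens); verify before relying on them.
PRICES = {
    "claude-sonnet-4.6": (3.00, 15.00),
    "gpt-4o-mini": (0.15, 0.60),
    "gemini-3.1-flash-lite": (0.25, 1.50),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: a 10K-token prompt with a 1K-token reply on Sonnet 4.6
print(round(request_cost("claude-sonnet-4.6", 10_000, 1_000), 4))  # 0.045
```

At these rates, the same 10K-in/1K-out request on Flash-Lite costs about a twelfth of a cent, which is why the budget recommendation below is so lopsided.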
What's new in March 2026
- GPT-5.4 (March 5) - OpenAI's most capable model. 1M context, native computer use, 75% on OSWorld (above the human baseline of 72.4%).
- Gemini 3.1 Pro (Feb 19) - 77.1% on ARC-AGI-2, double its predecessor's score. Best reasoning-to-cost ratio.
- Gemini 3.1 Flash-Lite (March 3) - $0.25/1M input. 2.5x faster than Gemini 2.5 Flash.
- Claude Sonnet 4.6 (Feb 17) - near-Opus performance at Sonnet pricing. 79.6% on SWE-bench.
- Claude Opus 4.6 (Feb 5) - 1M context, 128K output, collaborative agent teams.
Which model should you use?
For coding: Claude Sonnet 4.6 - 79.6% on SWE-bench at just $3/$15, only 1.2 points behind Opus at a fraction of the cost.
For reasoning & research: Gemini 3.1 Pro - 77.1% on ARC-AGI-2 and 94.3% on GPQA Diamond. Best for complex analytical tasks.
For computer use / agents: GPT-5.4 - 75% on OSWorld with native software control. The first model to beat the human baseline on desktop tasks.
For huge documents: any of the 1M-context models - Claude Opus/Sonnet 4.6, GPT-5.4, and Gemini 3.1 all support 1M tokens (Opus in beta).
On a budget: Gemini 3.1 Flash-Lite at $0.25/$1.50 per million tokens. Nothing else comes close on price.
For privacy: Llama 3.1 405B. Run it locally with Ollama and your data stays on your machine.
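As a sketch of the local-privacy workflow, the snippet below talks to an Ollama server on its default port (11434) using only the standard library. It assumes you have already pulled a Llama model (e.g. `ollama pull llama3.1`); the 405B tag in the usage comment is illustrative and needs serious hardware.

```python
import json
import urllib.request

# Default endpoint for a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    # stream=False asks Ollama to return the whole completion in one JSON body
    return {"model": model, "prompt": prompt, "stream": False}

def ask_local(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the completion text."""
    payload = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# ask_local("llama3.1:405b", "Summarize this contract...")  # data never leaves your machine
```

Nothing here touches a remote API: the request loops back to localhost, which is the whole point of the privacy recommendation.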
Subscription plans compared
| Plan | Price | What you get |
|---|---|---|
| ChatGPT Plus | $20/mo | GPT-4o, limited GPT-5.4, image gen, browsing |
| ChatGPT Pro | $200/mo | Unlimited GPT-5.4, o1 pro mode |
| Claude Pro | $20/mo | Sonnet 4.6 + limited Opus 4.6, priority access |
| Claude Max | $100-200/mo | Higher Opus limits, extended thinking |
| Gemini Advanced | $20/mo | Gemini 3.1 Pro, 1M context, Google integration |
Deep dive comparisons
Want more detail? Check out our head-to-head comparisons:
- Claude Opus 4.6 vs 4.5 - What changed
- Claude Sonnet 4.6 vs 4.5 - What changed
- Sonnet 4.6 vs Opus 4.6 - Is Opus worth the premium?
- Claude Opus 4 vs GPT-5 - Head to head
How to access via API
All providers offer pay-as-you-go API access, and most include free trial credits to get started:
- OpenAI: platform.openai.com
- Anthropic: console.anthropic.com
- Google: ai.google.dev
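For a concrete starting point, here is a stdlib-only sketch of the JSON body for Anthropic's Messages API (POST https://api.anthropic.com/v1/messages, authenticated with an `x-api-key` header plus an `anthropic-version` header). The model id "claude-sonnet-4.6" is assumed from the table above; check the provider's current model list for the exact string.

```python
import json

def messages_body(model: str, prompt: str, max_tokens: int = 512) -> str:
    """Build the JSON request body for a single-turn Messages API call."""
    return json.dumps({
        "model": model,            # assumed id; verify against the model list
        "max_tokens": max_tokens,  # required by the Messages API
        "messages": [{"role": "user", "content": prompt}],
    })

print(messages_body("claude-sonnet-4.6", "Hello"))
```

The OpenAI and Google endpoints use different shapes, but the pattern is the same: an API key from your provider console, a model id, and a JSON body of messages.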
This page is updated with every major model release. Bookmark it or subscribe to our newsletter to get notified.