AI API Pricing Compared: Every Provider in One Table (2026)
AI API pricing has dropped 85% since GPT-4 launched in 2023. Frontier model input now costs under $3/1M tokens. But prices vary wildly between providers, and the cheapest model isnโt always the cheapest for your use case.
Hereโs every major providerโs pricing in one place.
Frontier models
๐ June 10, 2026: Claude Fable 5 launched โ Anthropicโs most powerful model ever at $10/$50 per million tokens. Itโs a new premium tier above Opus, scoring 95% on SWE-bench Verified. See our full pricing analysis.
| Model | Input/1M | Output/1M | Context | Provider |
|---|---|---|---|---|
| Claude Fable 5 ๐ | $10.00 | $50.00 | 1M | Anthropic |
| GPT-5.4 | $2.50 | $15.00 | 1M | OpenAI |
| GPT-5.4 Pro | $30.00 | $180.00 | 1M | OpenAI |
| Claude Opus 4.8 | $5.00 | $25.00 | 1M | Anthropic |
| Claude Opus 4.6 | $15.00 | $75.00 | 1M | Anthropic |
| Claude Sonnet 4 | $3.00 | $15.00 | 200K | Anthropic |
| Gemini 3.5 Flash | $1.50 | $9.00 | 1M | |
| Gemini 3.1 Pro | $2.00 | $12.00 | 1M+ | |
| Gemini 2.5 Pro | $1.25 | $5.00 | 1M |
Mid-tier models (best value)
| Model | Input/1M | Output/1M | Context | Provider |
|---|---|---|---|---|
| GPT-5.4 Mini | $0.75 | $4.50 | 400K | OpenAI |
| Claude Haiku 4 | $0.25 | $1.25 | 200K | Anthropic |
| Gemini 2.5 Flash | $0.15 | $0.60 | 1M | |
| Mistral Large 2 | $2.00 | $6.00 | 128K | Mistral |
| Mistral Small | $0.10 | $0.30 | 32K | Mistral |
Budget models (cheapest)
| Model | Input/1M | Output/1M | Context | Provider |
|---|---|---|---|---|
| DeepSeek V4-Flash | $0.14 | $0.28 | 1M | DeepSeek |
| DeepSeek V4-Pro | $1.74 | $3.48 | 1M | DeepSeek |
| DeepSeek Chat | $0.14 | $0.28 | 128K | DeepSeek |
| DeepSeek Reasoner | $0.55 | $2.19 | 128K | DeepSeek |
| GPT-5 Mini | $0.25 | $2.00 | 128K | OpenAI |
| Qwen 3.6 Plus | Free tier | Free tier | 1M | Alibaba |
| Qwen 3.6 Flash | $0.25 | $1.00 | 1M | Alibaba |
| Qwen 3.6 Max Preview | $1.50 | $6.00 | 1M | Alibaba |
| GLM-5.1 | Free tier | Free tier | 128K | Z.ai |
Open-source via hosting providers
New (April 24, 2026): DeepSeek V4-Flash at $0.14/$0.28 per 1M tokens is now the cheapest frontier-class model available โ scoring 79.0% on SWE-bench with 1M context. V4-Pro at $1.74/$3.48 scores 80.6%. Both MIT licensed. See V4 Flash: cheapest frontier model.
| Model | Provider | Input/1M | Output/1M |
|---|---|---|---|
| Llama 4 Maverick | Together AI | ~$0.49 | ~$0.49 |
| Llama 4 Scout | Together AI | ~$0.10 | ~$0.10 |
| Qwen 3.5 27B | Fireworks | ~$0.20 | ~$0.20 |
| Any model | Ollama (local) | $0 | $0 |
| Any model | OpenRouter | Varies | Varies |
Tired of API costs? Self-hosting can be cheaper long-term. See how to self-host an LLM for โฌ4.99/month or deploy on Vultr with $250 free credits.
Cost per common task
What does a typical developer task actually cost?
| Task | Tokens (approx) | GPT-5.4 | GPT-5.4 Mini | DeepSeek | Local |
|---|---|---|---|---|---|
| Code review (1 file) | 3K in / 1K out | $0.02 | $0.007 | $0.001 | $0 |
| Bug fix | 5K in / 2K out | $0.04 | $0.01 | $0.002 | $0 |
| Full feature | 10K in / 5K out | $0.10 | $0.03 | $0.005 | $0 |
| Codebase analysis | 50K in / 5K out | $0.20 | $0.06 | $0.01 | $0 |
| 8-hour coding session | 500K in / 100K out | $2.75 | $0.83 | $0.10 | $0 |
Monthly cost estimates
| Usage level | GPT-5.4 | GPT-5.4 Mini | DeepSeek | Local |
|---|---|---|---|---|
| Light (10 tasks/day) | $6-12 | $2-4 | $0.30-0.60 | $0 |
| Medium (50 tasks/day) | $30-60 | $10-20 | $1.50-3 | $0 |
| Heavy (200 tasks/day) | $120-240 | $40-80 | $6-12 | $0 |
The smart approach: model routing
Donโt use one model for everything. Route by task complexity:
async def route(message, complexity):
if complexity == "simple":
return await call("deepseek-chat", message) # $0.001
elif complexity == "medium":
return await call("gpt-5.4-mini", message) # $0.01
else:
return await call("claude-sonnet-4", message) # $0.04
This gives you frontier quality when you need it and near-zero cost when you donโt. See our model routing guide for implementation details.
Free tiers worth using
| Provider | Whatโs free | Limitation |
|---|---|---|
| Gemini CLI | Full CLI access | Rate limited |
| Qwen 3.6 Plus | API access | Rate limited |
| GLM-5.1 | Z.ai free tier | Quota limited |
| OpenRouter | Several free models | Model-dependent |
| Ollama | Unlimited local | Hardware-dependent |
See our free AI coding tier review for real-world testing of each.
Price trends
Input token costs have dropped ~85% since 2023. The trend continues as competition intensifies between OpenAI, Anthropic, Google, and Chinese providers. Expect another 30-50% drop by end of 2026.
The biggest driver: open-source models (GLM-5.1, Qwen 3.6, Llama 4) matching proprietary quality at zero cost, forcing paid providers to lower prices.
Related: AI Coding Tools Pricing ยท FinOps for AI ยท AI Agent Cost Management ยท Monitor AI API Spending ยท OpenRouter Complete Guide ยท Cheapest AI Coding Setup ยท Tested Every Free AI Tier