OpenRouter gives you access to hundreds of AI models through a single API key. The problem: too many choices. This guide cuts through the noise and ranks the best models on OpenRouter by use case β so you know which model ID to put in your code.
Best for coding (ranked)
| # | Model | OpenRouter ID | Input/M | Output/M | Coding quality |
|---|---|---|---|---|---|
| 1 | Claude Opus 4.8 | anthropic/claude-opus-4-8 | $5.25 | $26.25 | 69.2% SWE-Pro |
| 2 | DeepSeek V4-Pro | deepseek/deepseek-v4-pro | $0.46 | $0.92 | 80.6% SWE-Verified |
| 3 | MiMo V2.5 Pro | xiaomi/mimo-v2.5-pro | $0.46 | $0.92 | 79.2% SWE-Verified |
| 4 | MiniMax M3 | minimax/minimax-m3 | $0.63 | $2.52 | 59% SWE-Pro |
| 5 | Qwen 3.7 Max | qwen/qwen3.7-max | $2.63 | $7.88 | ~58% SWE-Pro |
*OpenRouter prices include ~5.5% markup over provider direct pricing.
Best for speed + budget
| # | Model | OpenRouter ID | Input/M | Speed | Best for |
|---|---|---|---|---|---|
| 1 | Step 3.7 Flash | stepfun/step-3.7-flash | $0.21 | 400 t/s | Fastest overall |
| 2 | Gemini 3.5 Flash | google/gemini-3.5-flash | $0.16 | ~200 t/s | Cheapest capable |
| 3 | DeepSeek V4 Flash | deepseek/deepseek-v4-flash | $0.07 | ~150 t/s | Absolute cheapest |
| 4 | Qwen 3.6 35B-A3B | qwen/qwen3.6-35b-a3b | Low | ~100 t/s | Free tier available |
Best for multimodal
| # | Model | Vision | Video | Computer use | Price |
|---|---|---|---|---|---|
| 1 | MiniMax M3 | β | β | β | $0.63/$2.52 |
| 2 | Step 3.7 Flash | β | β | β | $0.21/$0.84 |
| 3 | Gemini 3.5 Flash | β | β | β | $0.16/$0.63 |
| 4 | Claude Opus 4.8 | β | β | β | $5.25/$26.25 |
Best free models on OpenRouter
| Model | OpenRouter ID | Quality | Limit |
|---|---|---|---|
| Qwen 3.6 35B-A3B | qwen/qwen3.6-35b-a3b:free | Good for coding | Rate limited |
| Qwen 3.5 | qwen/qwen3.5:free | Decent | Rate limited |
| MiniMax M2.5 | minimax/minimax-m2.5:free | Basic | Rate limited |
Free models are rate-limited but genuinely usable for testing and light work. See our best free AI APIs guide.
Setup (takes 2 minutes)
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="your-openrouter-key"
)
# Use any model by changing the model string
response = client.chat.completions.create(
model="deepseek/deepseek-v4-pro", # Change this to any model above
messages=[{"role": "user", "content": "Write a Python function to merge sorted lists"}]
)
With Aider:
aider --model openrouter/deepseek/deepseek-v4-pro
Model routing strategy
The power of OpenRouter is switching models per task:
def choose_model(task):
if task.complexity == "high" and task.stakes == "production":
return "anthropic/claude-opus-4-8"
elif task.has_images or task.has_video:
return "minimax/minimax-m3"
elif task.speed_critical:
return "stepfun/step-3.7-flash"
else:
return "deepseek/deepseek-v4-pro" # Best value default
See our migration guide for implementing model routing in production.
FAQ
Why use OpenRouter instead of direct APIs?
One API key, one billing account, access to every model. Plus automatic fallback routing if a provider is down. The 5.5% markup is worth the convenience for most developers. See OpenRouter vs Direct API.
Whatβs the single best model on OpenRouter?
For coding: DeepSeek V4-Pro (deepseek/deepseek-v4-pro). Best quality-to-cost ratio by far. See why Chinese models are 30Γ cheaper.
Are OpenRouter prices higher than direct?
Yes, ~5.5% markup. For DeepSeek V4-Pro: $0.46/$0.92 on OpenRouter vs $0.435/$0.87 direct. The convenience of unified billing usually justifies it.
Can I use OpenRouter with Claude Code?
No. Claude Code only connects to Anthropic directly. Use OpenRouter with Aider, Continue, or custom code.
How do I handle rate limits?
OpenRouter aggregates across providers and automatically routes to available capacity. See OpenRouter rate limit fix.
Which for a startup just getting started?
Start with deepseek/deepseek-v4-pro for coding. Add stepfun/step-3.7-flash for speed-critical tasks. Add anthropic/claude-opus-4-8 only when you hit quality limits. This three-tier setup covers 99% of needs.
Performance tips for OpenRouter
Caching and cost optimization
OpenRouter passes through provider caching. Models with cache hit pricing (DeepSeek at $0.003625/M, MiMo at $0.0036/M) maintain their cache discounts through OpenRouter. Keep system prompts stable to maximize cache hits.
Fallback routing
OpenRouter automatically routes to alternate providers if your primary is down. For production, this is valuable β your app stays up even if DeepSeek has a brief outage. Configure fallback preferences in your OpenRouter dashboard.
Rate limits
OpenRouter has its own rate limits on top of provider limits. For high-volume use, contact them for increased limits. See our rate limit fix guide.
When to use direct APIs instead
OpenRouter is not always the best choice:
- Claude Code β Only works with Anthropic direct API, not OpenRouter
- High volume (>$500/mo) β The 5.5% adds up. Contact providers directly for volume discounts
- Latency-critical β OpenRouter adds ~20-50ms routing overhead. For autocomplete, direct API is faster
- Enterprise compliance β Some enterprises require direct provider agreements
For most developers, OpenRouterβs convenience outweighs the markup. See our detailed comparison.
Monthly cost examples
| Use pattern | Model | OpenRouter monthly |
|---|---|---|
| Light coding (1hr/day) | DeepSeek V4-Pro | ~$3-5 |
| Daily coding (4hr/day) | DeepSeek V4-Pro | ~$10-15 |
| Heavy agent use (8hr/day) | MiMo V2.5 Pro | ~$15-25 |
| Multi-model routing | Mixed | ~$20-40 |
| Production API (24/7) | DeepSeek V4-Pro | ~$200-400 |
All dramatically cheaper than using OpenAI/Anthropic exclusively ($200-2000+/month for similar workloads).