How to Use the Devstral 2 API: Setup Guide With Code Examples
Devstral 2 is Mistral's best coding model, scoring 72.2% on SWE-bench, which matches Claude Opus. It's available via the Mistral API, OpenRouter, and other providers. Here's everything you need to use it in your projects and coding tools.
API endpoints and authentication
Mistral API (direct)
- Base URL: https://api.mistral.ai/v1
- Model ID: devstral-2-latest
- Auth: Bearer token via Authorization header
- Get your key: console.mistral.ai
OpenRouter
- Base URL: https://openrouter.ai/api/v1
- Model ID: mistralai/devstral-2
- Auth: Bearer token
- Get your key: openrouter.ai/keys
Both endpoints are OpenAI-compatible, so any library that works with the OpenAI API works with Devstral 2.
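Because both providers speak the same protocol, switching between them comes down to three values: base URL, model ID, and API key. The sketch below collects the settings listed above into one lookup. The `PROVIDERS` table, the `provider_settings` helper, and the `OPENROUTER_API_KEY` variable name are illustrative choices for this example, not part of either SDK.

```python
import os

# Settings copied from the endpoint lists above. The env var names are
# conventions assumed for this sketch, not something either SDK requires.
PROVIDERS = {
    "mistral": {
        "base_url": "https://api.mistral.ai/v1",
        "model": "devstral-2-latest",
        "key_env": "MISTRAL_API_KEY",
    },
    "openrouter": {
        "base_url": "https://openrouter.ai/api/v1",
        "model": "mistralai/devstral-2",
        "key_env": "OPENROUTER_API_KEY",
    },
}


def provider_settings(name: str) -> dict:
    """Return base URL, model ID, and API key (from the environment) for a provider."""
    try:
        cfg = PROVIDERS[name]
    except KeyError:
        raise ValueError(f"unknown provider: {name!r}") from None
    return {
        "base_url": cfg["base_url"],
        "model": cfg["model"],
        "api_key": os.environ.get(cfg["key_env"], ""),
    }
```

The returned values can be handed to any OpenAI-compatible client, for example `OpenAI(base_url=s["base_url"], api_key=s["api_key"])`, with `s["model"]` passed as the `model` argument.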
Code examples
Python: Mistral SDK
from mistralai import Mistral

client = Mistral(api_key="your-mistral-key")

response = client.chat.complete(
    model="devstral-2-latest",
    messages=[
        {"role": "system", "content": "You are an expert software engineer."},
        {"role": "user", "content": "Fix the race condition in this handler:\n\n```go\nfunc (s *Server) Handle(w http.ResponseWriter, r *http.Request) {\n s.count++\n fmt.Fprintf(w, \"Request %d\", s.count)\n}\n```"}
    ],
    temperature=0.2,
    max_tokens=2048
)
print(response.choices[0].message.content)
Python: OpenAI-compatible (works with any provider)
from openai import OpenAI

# Via Mistral directly
client = OpenAI(
    base_url="https://api.mistral.ai/v1",
    api_key="your-mistral-key"
)
model = "devstral-2-latest"

# Or via OpenRouter: uncomment to use this client and model ID instead
# client = OpenAI(
#     base_url="https://openrouter.ai/api/v1",
#     api_key="your-openrouter-key"
# )
# model = "mistralai/devstral-2"

response = client.chat.completions.create(
    model=model,  # must match the provider the client points at
    messages=[
        {"role": "user", "content": "Refactor this class to use dependency injection"}
    ],
    temperature=0.2
)
print(response.choices[0].message.content)
curl
curl https://api.mistral.ai/v1/chat/completions \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "devstral-2-latest",
    "messages": [
      {"role": "user", "content": "Write a Python function to merge two sorted arrays"}
    ],
    "temperature": 0.2,
    "max_tokens": 1024
  }'
JavaScript/TypeScript
// Uses the 1.x SDK; older 0.x releases exported MistralClient and called client.chat() instead
import { Mistral } from "@mistralai/mistralai";

const client = new Mistral({ apiKey: "your-mistral-key" });

const response = await client.chat.complete({
  model: "devstral-2-latest",
  messages: [
    { role: "user", content: "Add error handling to this async function" }
  ],
  temperature: 0.2,
});
console.log(response.choices[0].message.content);
Fill-in-the-Middle (FIM) support
Devstral 2 supports FIM for code completion: predicting what goes between a prefix and suffix. This is how IDE integrations provide inline completions.
# Reuses the Mistral SDK client created in the first Python example
response = client.fim.complete(
    model="devstral-2-latest",
    prompt="def calculate_total(items):\n    ",
    suffix="\n    return total",
    temperature=0.1,
    max_tokens=256
)
print(response.choices[0].message.content)
# Example output (actual completions vary): total = sum(item.price * item.quantity for item in items)
curl for FIM
curl https://api.mistral.ai/v1/fim/completions \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "devstral-2-latest",
    "prompt": "function fetchUser(id: string) {\n ",
    "suffix": "\n return response.json();\n}",
    "temperature": 0.1
  }'
FIM is particularly useful for IDE plugins and coding tools that need to complete code at the cursor position.
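In an editor, the prompt and suffix come from splitting the open file at the cursor. A minimal sketch of that step, assuming the cursor is given as a character offset; `split_at_cursor` is a hypothetical helper written for this example, not an SDK function.

```python
def split_at_cursor(source: str, cursor: int) -> tuple[str, str]:
    """Split source text into (prompt, suffix) around a character offset."""
    if not 0 <= cursor <= len(source):
        raise ValueError("cursor is outside the source text")
    return source[:cursor], source[cursor:]


code = "def calculate_total(items):\n    \n    return total"
# Place the cursor at the end of the empty, indented body line.
cursor = code.index("\n    return")
prompt, suffix = split_at_cursor(code, cursor)
```

The two halves map directly onto the `prompt` and `suffix` fields of a FIM request, and the completion is inserted between them.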
Integration with coding tools
Aider
export MISTRAL_API_KEY=your-key
aider --model mistral/devstral-2-latest
Or in .aider.conf.yml:
model: mistral/devstral-2-latest
OpenCode
export MISTRAL_API_KEY=your-key
opencode --model mistral/devstral-2-latest
Continue.dev
In .continue/config.json:
{
  "models": [{
    "provider": "mistral",
    "model": "devstral-2-latest",
    "apiKey": "your-key"
  }]
}
See our Aider guide and OpenCode guide for full tool setup.
Pricing
| Model | Input | Output | Context |
|---|---|---|---|
| Devstral 2 (123B) | $2.00/1M tokens | $6.00/1M tokens | 128K |
| Codestral (22B) | $0.30/1M tokens | $0.90/1M tokens | 32K |
| Claude Sonnet 4 | $3.00/1M tokens | $15.00/1M tokens | 200K |
| Claude Opus | $15.00/1M tokens | $75.00/1M tokens | 200K |
Devstral 2 matches Claude Opus on SWE-bench (72.2%) at a fraction of the cost. For coding tasks specifically, it's one of the best value propositions available.
Cost estimate for typical usage: A heavy coding session (50 requests, ~2K input + 1K output tokens each) uses about 100K input and 50K output tokens, which costs roughly $0.50 with Devstral 2 vs $5.25 with Claude Opus.
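The arithmetic behind that estimate is easy to reproduce for your own workload. With the table's rates, 100K input and 50K output tokens come to $0.20 + $0.30 for Devstral 2 and $1.50 + $3.75 for Claude Opus. The sketch below does the same calculation; the keys in `PRICES` are labels for this example, not API model IDs, and `session_cost` is a hypothetical helper.

```python
# Prices in USD per 1M tokens (input, output), taken from the pricing table above.
# The dictionary keys are labels for this sketch, not API model IDs.
PRICES = {
    "devstral-2": (2.00, 6.00),
    "codestral": (0.30, 0.90),
    "claude-sonnet-4": (3.00, 15.00),
    "claude-opus": (15.00, 75.00),
}


def session_cost(model: str, requests: int, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a session with fixed per-request token counts."""
    input_price, output_price = PRICES[model]
    total_in = requests * input_tokens
    total_out = requests * output_tokens
    return (total_in * input_price + total_out * output_price) / 1_000_000


devstral = session_cost("devstral-2", 50, 2_000, 1_000)
opus = session_cost("claude-opus", 50, 2_000, 1_000)
print(f"Devstral 2: ${devstral:.2f}, Claude Opus: ${opus:.2f}")
```

Real sessions vary in request size, so treat the result as an order-of-magnitude guide rather than a bill.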
Rate limits
Mistral API rate limits (as of early 2026):
| Tier | Requests/min | Tokens/min | Tokens/day |
|---|---|---|---|
| Free | 2 | 4,000 | 100,000 |
| Build | 60 | 500,000 | 10M |
| Scale | 300 | 2,000,000 | Unlimited |
For coding tool integration (Aider, OpenCode), the Build tier is sufficient for individual developers. Teams should consider Scale tier for uninterrupted workflows.
OpenRouter has its own rate limits that vary by plan and model demand.
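If your own scripts call the API in a loop, a small client-side throttle helps stay under the requests-per-minute limit for your tier. The class below is a minimal sketch, not part of any SDK: `RequestThrottle` is a hypothetical name, it tracks only request counts (not tokens per minute or per day), and the server may still reject a request with a rate-limit error.

```python
import time
from collections import deque


class RequestThrottle:
    """Block until a request fits under a requests-per-minute limit."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.max_per_minute = max_per_minute
        self.clock = clock  # injectable so the class can be tested without waiting
        self.sleep = sleep
        self.sent = deque()  # timestamps of requests in the last 60 seconds

    def wait(self):
        """Sleep if needed, then record one request."""
        now = self.clock()
        # Forget requests that have left the 60-second window.
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) >= self.max_per_minute:
            # Wait until the oldest request in the window expires.
            self.sleep(60 - (now - self.sent[0]))
            now = self.clock()
            self.sent.popleft()
        self.sent.append(now)
```

For the Build tier in the table above you would create `RequestThrottle(60)` once and call `throttle.wait()` before each API request.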
Streaming
For real-time output in coding tools, use streaming:
stream = client.chat.stream(
    model="devstral-2-latest",
    messages=[{"role": "user", "content": "Explain this code"}],
)
for chunk in stream:
    content = chunk.data.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
Streaming is essential for interactive coding tools where you want to see the response as it generates rather than waiting for the full completion.
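Tools usually need the complete text as well, for example to apply an edit once the stream ends. A minimal sketch of collecting the deltas while still echoing them, assuming chunks have the same attribute path as in the streaming loop above (`chunk.data.choices[0].delta.content`). `collect_stream` and `fake_chunk` are hypothetical helpers, and `fake_chunk` exists only so the example runs without an API call.

```python
from types import SimpleNamespace


def collect_stream(chunks, on_text=None):
    """Join streamed text deltas into one string, optionally echoing each piece."""
    parts = []
    for chunk in chunks:
        content = chunk.data.choices[0].delta.content
        if content:  # skip chunks that carry no text
            parts.append(content)
            if on_text is not None:
                on_text(content)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object with the same attribute path as a stream event."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(data=SimpleNamespace(choices=[SimpleNamespace(delta=delta)]))


echoed = []
full = collect_stream(
    [fake_chunk("Hello"), fake_chunk(None), fake_chunk(", world")],
    on_text=echoed.append,
)
```

With a real stream you would pass the object returned by `client.chat.stream(...)` and an `on_text` callback that prints each piece.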
Tips for best results
- Use low temperature (0.1-0.3) for code generation: higher temperatures introduce unnecessary variation
- Provide context: include relevant type definitions, interfaces, and surrounding code
- Be specific: "Add error handling for network failures and invalid JSON" beats "make it better"
- Use system prompts: set the role and constraints upfront for consistent behavior
- Leverage FIM for completions: it's specifically trained for this and produces more natural insertions
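Several of these tips can be combined when assembling the request. The sketch below builds a message list with a system prompt, labeled context snippets, and a specific task; `build_messages` and the `fetch_user` function named in the task are hypothetical, and the layout of the user message is one reasonable choice rather than a required format.

```python
def build_messages(task, context_snippets=(), system="You are an expert software engineer."):
    """Assemble a chat message list: system prompt, then context, then the task."""
    user_parts = []
    for label, code in context_snippets:
        # Label each snippet so the model can tell the pieces of context apart.
        user_parts.append(f"{label}:\n{code}")
    user_parts.append(task)
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": "\n\n".join(user_parts)},
    ]


messages = build_messages(
    "Add error handling for network failures and invalid JSON to fetch_user.",
    context_snippets=[("Type definition", "class User:\n    id: str\n    name: str")],
)
```

The result can be passed as `messages=` to the chat calls shown earlier, together with a low `temperature`.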
FAQ
Is the Devstral 2 API free?
Mistral offers a free tier with tight rate limits (2 requests/min, 100K tokens/day), which is enough for testing but not for real development work. The Build tier ($0 monthly + pay-per-token) is what most developers use. At $2/$6 per million tokens, a typical coding session costs $0.30-0.50. Via OpenRouter, pricing is similar. For completely free usage, run Devstral Small locally with Ollama instead.
Does Devstral support fill-in-the-middle?
Yes. Devstral 2 has native FIM support via the /v1/fim/completions endpoint. You provide a prompt (code before cursor) and suffix (code after cursor), and the model predicts what goes in between. This is how IDE integrations provide inline code completions. FIM works best with low temperature (0.1) and shorter max_tokens (128-256) for snappy completions.
How does Devstral API compare to Codestral?
Codestral is Mistral's smaller (22B) coding model: faster and cheaper ($0.30/$0.90 per 1M tokens) but less capable. Devstral 2 (123B) scores 72.2% on SWE-bench vs Codestral's ~45%. Use Codestral for fast completions and simple tasks where speed matters more than quality. Use Devstral 2 for complex refactoring, bug fixing, and tasks requiring deep understanding. Many developers use Codestral for FIM/autocomplete and Devstral 2 for chat-based coding assistance.
Related: Devstral 2 Complete Guide · What is Codestral 2026 · Mistral API Guide · Best AI Models for Coding Locally