Codestral is a 22-billion-parameter AI model from Mistral AI, built specifically for writing code. Unlike general-purpose models, Codestral is trained from the ground up on 80+ programming languages and optimized for Fill-in-the-Middle (FIM), meaning it uses the code both before and after your cursor.
What makes it special
- Autocomplete focus: purpose-built for tab completions in your IDE
- FIM support: uses the code after the cursor as context, not just what comes before
- 22B parameters: small enough to run on a single 24GB GPU (RTX 4090) when quantized
- 256K context: room for many files of surrounding code at once
- 80+ languages: not just Python and JavaScript
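As a rough check on the single-GPU point, the sketch below estimates how much memory the weights alone occupy at different precisions. It is a back-of-the-envelope calculation, not a measurement: the bit widths are illustrative assumptions, and the KV cache and runtime overhead (which grow with context length) are ignored.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


PARAMS = 22e9  # 22 billion parameters, as stated in the article

# Bit widths are assumptions chosen for illustration.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(PARAMS, bits):.0f} GB")
```

Under these assumptions, full 16-bit weights exceed the 24GB on an RTX 4090, while a 4-bit quantized build leaves headroom for context.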
How Fill-in-the-Middle works
Traditional code completion models only see what comes before your cursor. FIM changes this by providing the model with both the prefix (code above) and suffix (code below), then asking it to fill the gap. This produces dramatically better completions because the model understands what the code needs to connect to.
For example, if you're writing a function body, Codestral sees both the function signature above and the return statement below. It generates code that logically bridges both, rather than guessing what might come next in isolation.
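The prefix/suffix idea can be sketched in a few lines. This is an illustration of the editor-side mechanics only: `fake_fill` is a hypothetical stand-in for a real FIM model, and the function names are made up for this example.

```python
from typing import Callable, Tuple


def split_at_cursor(source: str, cursor: int) -> Tuple[str, str]:
    """Split an editor buffer into the text before and after the cursor."""
    return source[:cursor], source[cursor:]


def complete_in_place(source: str, cursor: int,
                      fill: Callable[[str, str], str]) -> str:
    """Ask `fill` for the missing middle and splice it in at the cursor."""
    prefix, suffix = split_at_cursor(source, cursor)
    middle = fill(prefix, suffix)
    return prefix + middle + suffix


def fake_fill(prefix: str, suffix: str) -> str:
    """Hypothetical stand-in for a FIM model: it looks at the suffix to decide."""
    if "return total" in suffix:
        return "total = a + b\n    "
    return "pass\n    "


buffer = "def add(a, b):\n    return total\n"
cursor = buffer.index("return")  # cursor sits just before the return statement
print(complete_in_place(buffer, cursor, fake_fill))
```

The point of the stub is that the completion depends on the suffix: a prefix-only model would have no way to know that a variable named `total` is expected.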
Codestral 25.01 benchmarks
The January 2025 update brought major improvements:
| Benchmark | Codestral 25.01 | DeepSeek Coder V2 Lite |
|---|---|---|
| HumanEval | 86.6% | 83.5% |
| FIM Python | 92.5% | n/a |
| FIM Java | 97.1% | n/a |
| FIM Average | 95.3% | n/a |
| LiveCodeBench | 37.9% | 28.1% |
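The FIM scores stand out: a 95.3% average means that, on these benchmark tasks, the model's fill-in completion is correct in the large majority of cases. Real-world autocomplete quality will vary with the codebase and the surrounding context.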
How to use it
A common setup is to run Codestral locally via Ollama as your autocomplete engine:
ollama pull codestral:22b
Then connect it to Continue.dev in VS Code for free Copilot-like autocomplete.
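If you want to try a fill-in-the-middle request against the local model without an editor plugin, a minimal sketch follows. It assumes Ollama is listening on its default address and that your Ollama version and the model's prompt template accept the `suffix` field on `/api/generate`; the option values are illustrative, and the helper names are invented for this example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def build_request(prefix: str, suffix: str, model: str = "codestral:22b") -> dict:
    """Build the JSON body for a non-streaming fill-in-the-middle request."""
    return {
        "model": model,
        "prompt": prefix,   # code before the cursor
        "suffix": suffix,   # code after the cursor
        "stream": False,    # return one JSON object instead of a stream
        "options": {"temperature": 0, "num_predict": 64},  # illustrative values
    }


def complete(prefix: str, suffix: str) -> str:
    """Send the request to the local Ollama server and return the generated text."""
    body = json.dumps(build_request(prefix, suffix)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())["response"]


# Example call (requires a running Ollama server with the model pulled):
# print(complete("def add(a, b):\n    ", "\n    return total\n"))
```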
Via Mistral API
curl https://api.mistral.ai/v1/chat/completions \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "codestral-latest", "messages": [{"role": "user", "content": "Write a TypeScript function to debounce"}]}'
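The request above uses the chat endpoint. For autocomplete-style use, Mistral also exposes a dedicated fill-in-the-middle endpoint that takes a `prompt` (prefix) and a `suffix`. The sketch below builds such a request with the standard library; the helper names are hypothetical, the `max_tokens` value is arbitrary, and the response shape read by `extract_text` is an assumption that should be checked against the current API reference.

```python
import json
import os
import urllib.request

FIM_URL = "https://api.mistral.ai/v1/fim/completions"


def build_fim_request(prefix: str, suffix: str, api_key: str) -> urllib.request.Request:
    """Build a POST request asking the model to fill the gap between prefix and suffix."""
    body = {
        "model": "codestral-latest",
        "prompt": prefix,    # code before the cursor
        "suffix": suffix,    # code after the cursor
        "max_tokens": 64,    # illustrative limit
        "temperature": 0,
    }
    return urllib.request.Request(
        FIM_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def extract_text(response: dict) -> str:
    """Pull the generated middle out of the response (assumed chat-style shape)."""
    return response["choices"][0]["message"]["content"]


# Example call (requires a valid key in the MISTRAL_API_KEY environment variable):
# req = build_fim_request("def add(a, b):\n    ", "\n    return total\n",
#                         os.environ["MISTRAL_API_KEY"])
# with urllib.request.urlopen(req, timeout=60) as resp:
#     print(extract_text(json.loads(resp.read())))
```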
Pricing
- Input: $0.20 per million tokens
- Output: $0.60 per million tokens
At this price, a full day of typical autocomplete use can cost less than a dollar, depending on how many requests you send and how much context each one carries. Compare that to GitHub Copilot at $10-19/month.
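To see where the "less than a dollar" figure can come from, here is a small worked estimate. The prices are the ones listed above; the request count and token sizes are assumptions picked for illustration, so substitute your own numbers.

```python
INPUT_PRICE_PER_M = 0.20   # USD per million input tokens (from the pricing list)
OUTPUT_PRICE_PER_M = 0.60  # USD per million output tokens (from the pricing list)


def daily_cost(requests_per_day: int,
               input_tokens_per_request: int,
               output_tokens_per_request: int) -> float:
    """Estimate one day's API spend in USD for a given usage pattern."""
    input_tokens = requests_per_day * input_tokens_per_request
    output_tokens = requests_per_day * output_tokens_per_request
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Assumed usage: 1,000 completions a day, 2,000 tokens of context in, 50 tokens out.
cost = daily_cost(1_000, 2_000, 50)
print(f"${cost:.2f} per day")  # $0.43 per day
```

Input tokens dominate the bill here, so the amount of surrounding code sent with each request matters more than the length of the completions.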
Codestral vs Devstral
Both are from Mistral but serve different purposes:
- Codestral: for autocomplete (tab completions, FIM, inline suggestions)
- Devstral 2: for agent tasks (refactoring, bug fixing, building features)
Use Codestral as your autocomplete engine and Devstral as your coding agent.
FAQ
Can I run Codestral completely offline?
Yes. Download the model once via Ollama (ollama pull codestral:22b); after that initial download it runs entirely on your machine with no internet connection required. Plan on approximately 16GB of RAM or a GPU with 12GB+ VRAM for a quantized build, and more if you want long contexts or full GPU offload.
Is Codestral better than GitHub Copilot for autocomplete?
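Codestral 25.01 scores 95.3% on the FIM benchmarks above, which is among the strongest published results for fill-in-the-middle completion. The benchmark table does not include Copilot, so a direct quality comparison is not available here. The clearer difference is cost: you can run Codestral locally for free rather than paying $10-19/month for a Copilot subscription. Keep in mind that the build pulled through Ollama is the original 22B release, so the 25.01 scores may not carry over to a local setup.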
What's the difference between Codestral and Codestral 25.01?
Codestral 25.01 is the latest version of the Codestral model, released in January 2025. It's roughly 2x faster than the original Codestral and scores significantly higher on coding benchmarks. When people say "Codestral" today, they typically mean the 25.01 version.
Learn more
- Codestral Complete Guide: full setup, benchmarks, and comparisons
- Best AI Autocomplete Models 2026: ranked alternatives
- Codestral vs DeepSeek Coder: head-to-head