What is Devstral 2? Mistral's Open-Source Coding Agent Model Explained
π’ Update: Mistral Medium 3.5 has replaced Devstral 2 as the default model in Vibe CLI. See the Medium 3.5 complete guide and Vibe 2.0 remote agents guide.
Devstral 2 is Mistral AIβs dedicated coding agent model β a 123B parameter model that scores 72.2% on SWE-bench Verified, matching Claude Opus 4.6. Itβs open-weight under a modified MIT license and designed for autonomous coding tasks.
Key facts
- 123B dense parameters β runs on a single server node
- 256K context window β the largest among coding models
- 72.2% SWE-bench β matches Claude Opus, beats GPT-5.4
- Modified MIT license β open for commercial use
- Also available as Devstral Small (24B) β runs on consumer hardware
What Devstral 2 does
Devstral 2 is built for agentic coding β tasks where the AI needs to autonomously plan, execute, and iterate. This includes:
- Bug fixing β reads error logs, traces the issue, applies a fix
- Feature implementation β takes a spec and builds it across multiple files
- Refactoring β restructures code while maintaining behavior
- Test generation β writes comprehensive test suites for existing code
- Code review β analyzes PRs and suggests improvements
Unlike Codestral (which is optimized for fast autocomplete), Devstral 2 is designed for complex multi-step tasks that require deep reasoning about code architecture.
Architecture
Devstral 2 is a dense transformer β all 123B parameters activate for every token. This is different from MoE models like Qwen 3.5 or DeepSeek V3 that only activate a subset. The dense architecture provides more consistent quality across different task types but requires more compute per token.
The 256K context window means it can process approximately 500-800 files of typical source code in a single pass, making it suitable for understanding entire microservice architectures or large monorepos.
Devstral 2 vs Codestral
Both are from Mistral but serve different purposes:
- Devstral 2 β for agent tasks (refactoring, bug fixing, building features)
- Codestral β for autocomplete (tab completions in your IDE)
The ideal setup is using both: Codestral for real-time inline suggestions as you type, and Devstral 2 for larger tasks you delegate to an AI agent.
How to use Devstral 2
- Vibe CLI β Mistralβs native terminal coding tool
- Aider β open-source CLI with Mistral support
- OpenCode β via Mistral API
- Mistral API β direct integration at $2/$6 per million tokens
Devstral Small (24B)
For developers who want to run locally, Devstral Small 2 is a 24B parameter version that fits on consumer hardware (16GB+ RAM). It sacrifices some quality compared to the full 123B model but still outperforms most open-source alternatives at its size.
ollama pull devstral-small:24b
FAQ
Can I run Devstral 2 locally?
The full 123B model requires significant hardware β approximately 80GB+ of VRAM across one or more GPUs. For local use, Devstral Small (24B) is the practical choice, running comfortably on machines with 16GB+ RAM via Ollama.
How does Devstral 2 compare to Claude Code?
Both score nearly identically on SWE-bench (72.2% vs 72.1%), so coding quality is comparable. The key difference is that Devstral 2 is a model you can access via API or self-host, while Claude Code is a complete CLI tool with built-in workflows. You can use Devstral 2 through Vibe CLI or Aider for a similar experience.
Is Devstral 2 truly open source?
Devstral 2 uses a modified MIT license that allows commercial use, modification, and redistribution. The weights are available for download. However, itβs βopen-weightβ rather than fully open-source β the training data and training code are not released, only the model weights.
Learn more
- Devstral 2 Complete Guide β full architecture and setup
- Devstral Small 2 Guide β the 24B version for local use
- Devstral 2 vs GLM-5.1 vs Codestral β comparison