Mistral has three models you can run locally for free: Codestral (autocomplete), Devstral Small 2 (coding agent), and Nemo (general chat). Here’s how to set them up.
## Which model for your hardware
| Hardware | RAM/VRAM | Best Mistral model | Command / note |
|---|---|---|---|
| RTX 4090 (24GB) | 24GB | Codestral 22B + Devstral Small 24B | Both fit |
| RTX 4070 (12GB) | 12GB | Codestral 22B (Q4) | ollama pull codestral:22b |
| Mac M4 32GB | 32GB | Devstral Small 24B + Codestral 22B | Both fit |
| Mac M4 16GB | 16GB | Codestral 22B (Q4) | ollama pull codestral:22b |
| 8GB VRAM/RAM | 8GB | Nemo 12B (Q4) | ollama pull mistral-nemo |
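The table above collapses to a simple memory threshold. A minimal sketch of that rule (`pick_model` is a hypothetical helper, not an Ollama command):

```shell
# Pick a model based on available RAM/VRAM in GB, following the table above.
pick_model() {
  local gb=$1
  if [ "$gb" -ge 24 ]; then
    echo "codestral:22b + devstral-small:24b"   # room for both
  elif [ "$gb" -ge 12 ]; then
    echo "codestral:22b"                        # Q4 quant fits
  else
    echo "mistral-nemo"                         # 12B Q4 for 8GB machines
  fi
}

pick_model 12   # -> codestral:22b
```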
## Setup with Ollama
```shell
# Install Ollama
brew install ollama                                    # macOS
# or: curl -fsSL https://ollama.com/install.sh | sh    # Linux

# Pull your model
ollama pull codestral:22b           # Best autocomplete (~12GB download)
ollama pull devstral-small:24b      # Best local coding agent (~14GB download)
ollama pull mistral-nemo            # General chat (~7GB download)
```
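Once a model is pulled and `ollama serve` is running, you can smoke-test it over Ollama's HTTP API at `http://localhost:11434/api/generate`. A sketch of the request (`build_generate_body` is a hypothetical helper for assembling the JSON body, not part of Ollama):

```shell
# Assemble a non-streaming generate request for the Ollama API.
build_generate_body() {
  # $1 = model name, $2 = prompt
  printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

body=$(build_generate_body "mistral-nemo" "Say hello in one word.")
echo "$body"

# Send it once the server is up:
# curl -s http://localhost:11434/api/generate -d "$body"
```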
## Use with coding tools
### Continue.dev (VS Code autocomplete)
Add to `~/.continue/config.json`:

```json
{
  "tabAutocompleteModel": {
    "provider": "ollama",
    "model": "codestral:22b"
  },
  "models": [{
    "provider": "ollama",
    "model": "devstral-small:24b",
    "title": "Devstral Small"
  }]
}
```
See our Continue.dev guide.
### Aider (terminal)
```shell
aider --model ollama/devstral-small:24b
```
See our Aider guide.
### OpenCode (terminal)
{"providers": {"ollama": {"baseUrl": "http://localhost:11434"}}, "defaultModel": "ollama/devstral-small:24b"}
See our OpenCode guide.
## Performance
| Model | Mac M4 32GB | RTX 4090 |
|---|---|---|
| Codestral 22B | ~25 tok/s | ~40 tok/s |
| Devstral Small 24B | ~22 tok/s | ~35 tok/s |
| Nemo 12B | ~35 tok/s | ~55 tok/s |
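To measure throughput on your own machine, `ollama run <model> --verbose` reports an eval rate after each response, or you can compute it from the `eval_count` and `eval_duration` (nanoseconds) fields of an `/api/generate` response. A small sketch (`toks_per_sec` is a hypothetical helper):

```shell
# Tokens per second from Ollama's eval_count / eval_duration (ns) fields.
toks_per_sec() {
  # $1 = eval_count (tokens), $2 = eval_duration (nanoseconds)
  awk -v c="$1" -v d="$2" 'BEGIN { printf "%.1f", c / (d / 1e9) }'
}

toks_per_sec 500 20000000000   # 500 tokens in 20s -> 25.0
```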
All three are fast enough for interactive coding, and Codestral’s autocomplete feels effectively instant at these speeds.
## The ideal local Mistral setup
Run both Codestral (autocomplete) and Devstral Small (agent) on a 32GB machine. Ollama swaps models automatically — you don’t need to manage them manually.
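If you have the memory headroom and want to avoid swap latency entirely, recent Ollama versions let the server keep more than one model resident via environment variables (defaults vary by version; treat the values below as a sketch):

```shell
# Allow two models in memory at once and keep idle models loaded longer.
export OLLAMA_MAX_LOADED_MODELS=2
export OLLAMA_KEEP_ALIVE=30m

# Then restart the server so the settings take effect:
# ollama serve
```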
For tasks that exceed local model quality, fall back to Devstral 2 via API ($2/1M tokens) or Mistral Large 2 for reasoning.
Related: Ollama Complete Guide · Best AI Models for Mac · Codestral Complete Guide · Best AI Models Under 16GB VRAM