
How to Run Mistral Models Locally — Ollama Setup Guide (2026)


Mistral has three models you can run locally for free: Codestral (autocomplete), Devstral Small 2 (coding agent), and Nemo (general chat). Here’s how to set them up.

Which model for your hardware

| Hardware | RAM/VRAM | Best Mistral model | Command |
| --- | --- | --- | --- |
| RTX 4090 (24GB) | 24GB | Codestral 22B + Devstral Small 24B | Both fit |
| RTX 4070 (12GB) | 12GB | Codestral 22B (Q4) | ollama pull codestral:22b |
| Mac M4 32GB | 32GB | Devstral Small 24B + Codestral 22B | Both fit |
| Mac M4 16GB | 16GB | Codestral 22B (Q4) | ollama pull codestral:22b |
| 8GB VRAM/RAM | 8GB | Nemo 12B (Q4) | ollama pull mistral-nemo |
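The table above boils down to a simple rule of thumb. Here it is as a sketch (the thresholds just mirror the table; treat the exact cutoffs as an assumption, not official sizing guidance):

```shell
# Pick the largest model tier that fits your RAM/VRAM,
# following the hardware table above.
mem_gb=16   # set to your machine's RAM/VRAM or unified memory in GB

if [ "$mem_gb" -ge 24 ]; then
  suggestion="codestral:22b + devstral-small:24b"
elif [ "$mem_gb" -ge 12 ]; then
  suggestion="codestral:22b"
else
  suggestion="mistral-nemo"
fi
echo "Suggested pull: $suggestion"
```

With 16GB this prints `Suggested pull: codestral:22b`, matching the Mac M4 16GB row.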

Setup with Ollama

# Install Ollama
brew install ollama  # macOS
# or: curl -fsSL https://ollama.com/install.sh | sh  # Linux

# Pull your model
ollama pull codestral:22b        # Best autocomplete (12GB)
ollama pull devstral-small:24b   # Best local coding agent (14GB)
ollama pull mistral-nemo         # General chat (7GB)
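After pulling, you can sanity-check the install: Ollama serves a local REST API on port 11434 by default, and `/api/tags` lists installed models. A minimal smoke test (the `|| true` keeps it non-fatal if the server isn't running yet):

```shell
# List installed models via Ollama's local REST API (default port 11434).
endpoint="http://localhost:11434/api/tags"
curl -s "$endpoint" || true
echo "checked $endpoint"
```

If the response is empty, start the server with `ollama serve` and retry.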

Use with coding tools

Continue.dev (VS Code autocomplete)

Add this to your Continue config (typically ~/.continue/config.json):

{
  "tabAutocompleteModel": {
    "provider": "ollama",
    "model": "codestral:22b"
  },
  "models": [{
    "provider": "ollama",
    "model": "devstral-small:24b",
    "title": "Devstral Small"
  }]
}

See our Continue.dev guide.

Aider (terminal)

aider --model ollama/devstral-small:24b
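Aider finds your local server through the `OLLAMA_API_BASE` environment variable (per aider's Ollama docs; adjust the host and port if yours differ):

```shell
# Point aider at the local Ollama server before launching it.
export OLLAMA_API_BASE=http://127.0.0.1:11434
# then: aider --model ollama/devstral-small:24b
```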

See our Aider guide.

OpenCode (terminal)

{
  "providers": {
    "ollama": { "baseUrl": "http://localhost:11434" }
  },
  "defaultModel": "ollama/devstral-small:24b"
}

See our OpenCode guide.

Performance

| Model | Mac M4 32GB | RTX 4090 |
| --- | --- | --- |
| Codestral 22B | ~25 tok/s | ~40 tok/s |
| Devstral Small 24B | ~22 tok/s | ~35 tok/s |
| Nemo 12B | ~35 tok/s | ~55 tok/s |

All fast enough for interactive coding. Codestral’s autocomplete feels instant at these speeds.
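You can measure tok/s on your own hardware from Ollama's response stats: with `"stream": false`, `/api/generate` returns `eval_count` (tokens generated) and `eval_duration` (nanoseconds). The numbers below are illustrative placeholders, not a benchmark:

```shell
# tok/s = eval_count / (eval_duration in seconds)
eval_count=500
eval_duration=20000000000   # 20 s, in nanoseconds
tps=$(awk "BEGIN { printf \"%.1f\", $eval_count / ($eval_duration / 1e9) }")
echo "$tps tok/s"   # 25.0 tok/s
```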

The ideal local Mistral setup

Run both Codestral (autocomplete) and Devstral Small (agent) on a 32GB machine. Ollama swaps models automatically — you don’t need to manage them manually.

For tasks that exceed local model quality, fall back to Devstral 2 via API ($2/1M tokens) or Mistral Large 2 for reasoning.
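To gauge what that fallback costs, here's the arithmetic at the $2/1M-token rate quoted above (the daily token volume is an assumption; substitute your own):

```shell
# Monthly API cost at $2 per 1M tokens.
tokens_per_day=200000   # assumed usage; adjust to yours
monthly=$(awk "BEGIN { printf \"%.2f\", $tokens_per_day * 30 * 2 / 1000000 }")
echo "\$${monthly}/month"   # $12.00/month
```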

Related: Ollama Complete Guide · Best AI Models for Mac · Codestral Complete Guide · Best AI Models Under 16GB VRAM