MiMo V2 Pro is Xiaomi's flagship coding model. You can run it locally with Ollama for free, private AI coding. Here's the setup.
## Install and run

```shell
# Install Ollama
brew install ollama                                  # macOS
# or: curl -fsSL https://ollama.com/install.sh | sh  # Linux

# Pull MiMo V2 Pro
ollama pull mimo-v2-pro

# Test it
ollama run mimo-v2-pro "Write a Python REST API with FastAPI and SQLAlchemy"
```
## Hardware requirements

| Hardware | Performance | Usable? |
|---|---|---|
| MacBook Air M2 16GB | ~15 tok/s | ✅ Good |
| MacBook Pro M3 36GB | ~25 tok/s | ✅ Great |
| Mac Mini M4 Pro 48GB | ~30 tok/s | ✅ Excellent |
| RTX 4090 24GB | ~40 tok/s | ✅ Excellent |
| 8GB RAM (any) | Too slow | ❌ Need 16GB+ |
MiMo V2 Pro needs at least 16GB RAM. If you only have 8GB, use Yi-Coder 9B or Qwen3 8B instead.
See our VRAM guide for exact memory calculations.
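The rule of thumb behind these numbers can be sketched in a few lines of Python. The 1.2× overhead factor for KV cache and runtime is a rough assumption, not an exact figure:

```python
def estimate_ram_gb(params_billions: float, bits_per_weight: int = 8,
                    overhead: float = 1.2) -> float:
    """Back-of-envelope RAM estimate for running a local LLM.

    params_billions: model size in billions of parameters
    bits_per_weight: quantization level (4 = Q4, 8 = Q8, 16 = FP16)
    overhead: headroom for KV cache and runtime (rough assumption)
    """
    weights_gb = params_billions * bits_per_weight / 8  # 1 GB per B params at 8-bit
    return weights_gb * overhead

# A ~14B-parameter model at 8-bit quantization:
print(round(estimate_ram_gb(14, bits_per_weight=8), 1))  # 16.8
```

That lands close to the ~14 GB download plus headroom, which is why 16 GB is the practical floor.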
## Connect to coding tools

### Aider

```shell
aider --model ollama/mimo-v2-pro
```
This is the same setup we use for the Xiaomi agent in the AI Startup Race. See our MiMo + Aider guide for advanced configuration.
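If you use aider regularly, the model choice can live in a `.aider.conf.yml` in your project or home directory instead of a flag. A minimal sketch:

```yaml
# .aider.conf.yml
model: ollama/mimo-v2-pro
```

Aider also looks for the `OLLAMA_API_BASE` environment variable (e.g. `http://127.0.0.1:11434`) when talking to a local Ollama server.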
### Continue.dev (VS Code)

```json
{
  "models": [{
    "title": "MiMo V2 Pro Local",
    "provider": "ollama",
    "model": "mimo-v2-pro"
  }]
}
```
### OpenCode

```shell
opencode --provider ollama --model mimo-v2-pro
```
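All three tools talk to the same local HTTP API that Ollama serves on port 11434, so you can also script the model directly. A minimal standard-library sketch; the actual `urlopen` call is left commented out because it needs the Ollama server running:

```python
import json
import urllib.request

# Request body for Ollama's /api/generate endpoint
payload = {
    "model": "mimo-v2-pro",
    "prompt": "Write a Python function that reverses a string.",
    "stream": False,  # return one JSON object instead of a token stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# With Ollama running, this sends the prompt and prints the completion:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```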
## MiMo V2 Pro vs other local coding models
| Model | Size | RAM needed | Coding quality | Speed |
|---|---|---|---|---|
| MiMo V2 Pro | ~14 GB | 16 GB | Good | Fast |
| Devstral Small 24B | ~16 GB | 16 GB | Best | Medium |
| Qwen 3.5 27B | ~17 GB | 20 GB | Very good | Medium |
| DeepSeek R1 14B | ~9 GB | 12 GB | Good (reasoning) | Slow |
| Yi-Coder 9B | ~5 GB | 8 GB | Good | Fast |
MiMo V2 Pro sits in the middle: better than the small models (Yi-Coder, Qwen3 8B) but not quite as good as Devstral Small 24B for pure coding quality. Its advantage is speed, since it generates code faster than the 24B+ models.
## Local vs API

| | Local (Ollama) | API (OpenRouter) |
|---|---|---|
| Cost | Free | ~$25/mo |
| Privacy | ✅ Full | ❌ Data sent to API |
| Speed | Depends on hardware | Fast (cloud GPU) |
| Context | Limited by RAM | 128K |
| Offline | ✅ Works offline | ❌ Needs internet |
Run locally for privacy and zero cost. Use the API when you need faster responses or are on weaker hardware.
## The MiMo V2 family locally

| Model | Use case | Ollama command |
|---|---|---|
| MiMo V2 Pro | Best quality coding | `ollama pull mimo-v2-pro` |
| MiMo V2 Omni | Balanced quality/speed | `ollama pull mimo-v2-omni` |
| MiMo V2 Flash | Fastest, lighter tasks | `ollama pull mimo-v2-flash` |
Use Pro for complex coding, Flash for quick questions and autocomplete. See our MiMo V2 family guide for detailed comparisons.
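If you keep more than one family member pulled, a tiny dispatcher in your own tooling can route requests by task type. A hypothetical sketch; the task labels and mapping are illustrative, not an official API:

```python
# Illustrative routing table: task type -> local Ollama model tag
MODEL_FOR_TASK = {
    "autocomplete": "mimo-v2-flash",   # fastest, lighter tasks
    "quick-question": "mimo-v2-flash",
    "refactor": "mimo-v2-omni",        # balanced quality/speed
    "complex-coding": "mimo-v2-pro",   # best quality
}

def pick_model(task: str) -> str:
    """Return the model tag for a task, defaulting to Pro for unknown work."""
    return MODEL_FOR_TASK.get(task, "mimo-v2-pro")

print(pick_model("autocomplete"))    # mimo-v2-flash
print(pick_model("complex-coding"))  # mimo-v2-pro
```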
## Troubleshooting

- "model not found" → check the exact name with `ollama list`
- Too slow → verify the GPU is being used: `ollama ps`
- Out of memory → try MiMo V2 Flash or a quantized version
- Context too short → increase with `--num-ctx 32768`
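Setting the context per run gets tedious; Ollama lets you bake a larger window into a derived model with a Modelfile (the `mimo-v2-pro-32k` name below is just an example):

```
FROM mimo-v2-pro
PARAMETER num_ctx 32768
```

```shell
ollama create mimo-v2-pro-32k -f Modelfile
ollama run mimo-v2-pro-32k
```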
See our Ollama troubleshooting guide for all common errors.
Related: MiMo V2 Family Guide · MiMo V2 Pro + Aider Setup · Best Ollama Models for Coding · Ollama Complete Guide · Ollama vs LM Studio vs vLLM