🤖 AI Tools
· 3 min read

How to Run MiMo V2 Pro Locally with Ollama


MiMo V2 Pro is Xiaomi's flagship coding model. You can run it locally with Ollama for free, private AI coding. Here's the setup.

Install and run

# Install Ollama
brew install ollama  # Mac
# or: curl -fsSL https://ollama.com/install.sh | sh  # Linux

# Pull MiMo V2 Pro
ollama pull mimo-v2-pro

# Test it
ollama run mimo-v2-pro "Write a Python REST API with FastAPI and SQLAlchemy"
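Besides the CLI, Ollama serves a local HTTP API on port 11434; a minimal Python sketch against its /api/generate endpoint (model name as pulled above, no third-party packages needed):

```python
import json
import urllib.request

def build_generate_request(model: str, prompt: str) -> dict:
    # Payload shape for Ollama's /api/generate endpoint;
    # stream=False returns a single JSON object instead of a token stream.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str, model: str = "mimo-v2-pro",
             host: str = "http://localhost:11434") -> str:
    payload = build_generate_request(model, prompt)
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# generate("Write a FastAPI hello-world")  # requires Ollama running locally
```

This is handy for scripting batch tasks against the same model your editor uses.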

Hardware requirements

| Hardware | Performance | Usable? |
|---|---|---|
| MacBook Air M2 16GB | ~15 tok/s | ✅ Good |
| MacBook Pro M3 36GB | ~25 tok/s | ✅ Great |
| Mac Mini M4 Pro 48GB | ~30 tok/s | ✅ Excellent |
| RTX 4090 24GB | ~40 tok/s | ✅ Excellent |
| 8GB RAM (any) | Too slow | ❌ Need 16GB+ |

MiMo V2 Pro needs at least 16GB RAM. If you only have 8GB, use Yi-Coder 9B or Qwen3 8B instead.

See our VRAM guide for exact memory calculations.
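As a rough rule of thumb (an approximation, not an exact figure), a quantized model's memory footprint is about parameter count × bytes per weight, plus overhead for the KV cache and runtime:

```python
def estimate_ram_gb(params_billion: float, bits_per_weight: int = 4,
                    overhead: float = 1.2) -> float:
    """Back-of-the-envelope RAM estimate for a quantized model.

    params_billion: model size in billions of parameters.
    bits_per_weight: 4 for Q4 quantization, 8 for Q8, 16 for FP16.
    overhead: ~20% extra for KV cache and runtime buffers (assumed).
    """
    weights_gb = params_billion * bits_per_weight / 8  # 1B params at 1 byte ≈ 1 GB
    return round(weights_gb * overhead, 1)

# estimate_ram_gb(14, 4) → roughly 8.4 GB for a hypothetical 14B model at Q4
```

Actual usage varies with context length and quantization format, so treat this as a lower bound when sizing hardware.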

Connect to coding tools

Aider

aider --model ollama/mimo-v2-pro

This is the same setup we use for the Xiaomi agent in the AI Startup Race. See our MiMo + Aider guide for advanced configuration.

Continue.dev (VS Code)

{
  "models": [{
    "title": "MiMo V2 Pro Local",
    "provider": "ollama",
    "model": "mimo-v2-pro"
  }]
}

OpenCode

opencode --provider ollama --model mimo-v2-pro

MiMo V2 Pro vs other local coding models

| Model | Size | RAM needed | Coding quality | Speed |
|---|---|---|---|---|
| MiMo V2 Pro | ~14 GB | 16 GB | Good | Fast |
| Devstral Small 24B | ~16 GB | 16 GB | Best | Medium |
| Qwen 3.5 27B | ~17 GB | 20 GB | Very good | Medium |
| DeepSeek R1 14B | ~9 GB | 12 GB | Good (reasoning) | Slow |
| Yi-Coder 9B | ~5 GB | 8 GB | Good | Fast |

MiMo V2 Pro sits in the middle: better than the small models (Yi-Coder, Qwen3 8B) but not quite as good as Devstral Small 24B for pure coding quality. Its advantage is speed; it generates code faster than the 24B+ models.

Local vs API

| | Local (Ollama) | API (OpenRouter) |
|---|---|---|
| Cost | Free | ~$25/mo |
| Privacy | ✅ Full | ❌ Data sent to API |
| Speed | Depends on hardware | Fast (cloud GPU) |
| Context | Limited by RAM | 128K |
| Offline | ✅ Works offline | ❌ Needs internet |

Run locally for privacy and zero cost. Use the API when you need faster responses or are on weaker hardware.
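To put the cost trade-off in numbers, a quick breakeven sketch (the hardware price is a placeholder assumption; the ~$25/mo matches the table above):

```python
def breakeven_months(hardware_cost: float, api_cost_per_month: float = 25.0) -> float:
    """Months of API spend it takes to equal a one-time hardware purchase."""
    return hardware_cost / api_cost_per_month

# A hypothetical $1,500 machine vs ~$25/mo of API access:
print(breakeven_months(1500))  # 60.0 months
```

If you already own capable hardware, the marginal cost of local inference is essentially electricity, so the breakeven argument only matters when buying a machine specifically for this.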

The MiMo V2 family locally

| Model | Use case | Ollama command |
|---|---|---|
| MiMo V2 Pro | Best quality coding | `ollama pull mimo-v2-pro` |
| MiMo V2 Omni | Balanced quality/speed | `ollama pull mimo-v2-omni` |
| MiMo V2 Flash | Fastest, lighter tasks | `ollama pull mimo-v2-flash` |

Use Pro for complex coding, Flash for quick questions and autocomplete. See our MiMo V2 family guide for detailed comparisons.

Troubleshooting

  • "model not found": check the exact model name with ollama list
  • Too slow: verify the GPU is being used with ollama ps
  • Out of memory: try MiMo V2 Flash or a quantized version
  • Context too short: raise num_ctx (in an interactive ollama run session, /set parameter num_ctx 32768; or bake it into a Modelfile)
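For a persistent fix to the context length, one route is a custom Modelfile that bakes in a larger num_ctx (base model name taken from the pull command above):

```
FROM mimo-v2-pro
PARAMETER num_ctx 32768
```

Then build and use it: ollama create mimo-v2-pro-32k -f Modelfile, and point your tools at mimo-v2-pro-32k. Note the larger context window increases KV-cache memory use.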

See our Ollama troubleshooting guide for all common errors.

Related: MiMo V2 Family Guide · MiMo V2 Pro + Aider Setup · Best Ollama Models for Coding · Ollama Complete Guide · Ollama vs LM Studio vs vLLM