How to Run MAI-Thinking-1 Locally: What We Know About Microsoft's 35B Model (2026)


MAI-Thinking-1 is Microsoft's 35B reasoning model, announced at Build 2026. Developers are already asking: can I run it locally? The short answer: not yet. But here is what we know, what to expect, and what to use in the meantime.

Current status: enterprise-only

As of June 2026, MAI-Thinking-1 is:

  • ❌ Not publicly available
  • ❌ No weights released
  • ❌ No public API
  • ❌ Not on OpenRouter or any third-party provider
  • ✅ Available to Microsoft enterprise customers via Azure

Microsoft has given no indication it will release weights publicly. The MAI models are positioned as proprietary differentiators for Azure, not open-source contributions.

If weights were released: hardware requirements

Based on the 35B parameter size, here is what you would need:

Quantization | Memory needed | Hardware | Speed (est.)
FP16 | ~70GB | Mac Studio 128GB, 1× A100 (80GB) | 15-25 t/s
Q8 | ~35GB | RTX 5090 32GB (with partial CPU offload), Mac 64GB | 25-40 t/s
Q4_K_M | ~20GB | RTX 4090 24GB, Mac 32GB | 35-55 t/s
Q3_K | ~15GB | RTX 4070 Ti Super 16GB | 40-60 t/s

A 35B model at Q4 quantization is very manageable on consumer hardware. It would fit on an RTX 4090 (24GB) with room for context, or easily on RTX Spark (128GB). This is the same size class as Mistral Medium 3.5 and Granite 4.1 34B.
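The memory figures above follow from simple arithmetic: parameter count times bits per weight, divided by eight. Here is a minimal sketch of that estimate. The bits-per-weight values are nominal assumptions chosen for illustration (real K-quant files mix block formats and come out somewhat different), and the 2GB overhead for context and runtime buffers is likewise a rough guess, not a measured number.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # params * bits / 8 gives bytes; the 1e9 in "billion" and in "GB" cancel out.
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         device_gb: float, overhead_gb: float = 2.0) -> bool:
    """True if weights plus an assumed fixed overhead fit in device memory."""
    return weight_memory_gb(params_billion, bits_per_weight) + overhead_gb <= device_gb


# Nominal bits per weight (assumptions for illustration only).
NOMINAL_BITS = {"FP16": 16.0, "Q8": 8.0, "Q4_K_M": 4.5, "Q3_K": 3.5}

if __name__ == "__main__":
    for name, bits in NOMINAL_BITS.items():
        print(f"{name}: ~{weight_memory_gb(35, bits):.1f} GB")
```

Under these assumptions, fits(35, 4.5, 24) is true, which is why a 24GB card is comfortable at Q4. fits(35, 8.0, 32) is false, so a 32GB card would have to offload part of a Q8 model to system RAM.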

The Aion models: local Windows AI (available soon)

While MAI-Thinking-1 itself is enterprise/cloud-only, Microsoft DID announce local models:

  • Aion 1.0 Instruct: On-device reasoning for Windows
  • Aion 1.0 Plan: On-device planning and tool use for Windows

These will ship with RTX Spark hardware this fall and run natively on Windows. They are smaller than MAI-Thinking-1 but designed specifically for on-device agent workflows.

What to use in the meantime (35B-class local models)

If you want a ~35B reasoning model running locally today, these are available now:

Best alternatives at similar size:

Model | Size | How to run | Quality
Qwen 3.7 27B | 27B | ollama pull qwen3.7:27b | Excellent coding
Qwen 3.6 35B-A3B | 35B (3B active) | ollama pull qwen3.6:35b-a3b | Fast (80+ t/s)
Mistral Medium 3.5 | ~40B | ollama pull mistral-medium-3.5 | Strong reasoning
Granite 4.1 34B | 34B | ollama pull granite4.1:34b | Tool calling
Devstral 2 | ~50B | ollama pull devstral2 | Code specialist

All of these are open-weight, available today, and run on the same hardware that MAI-Thinking-1 would require. Qwen 3.7 27B is the closest match in terms of balancing reasoning + coding at a manageable size.
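Once a model is pulled, you can call it from your own scripts through Ollama's local HTTP API. The sketch below assumes Ollama is running on its default port (11434) and that the model tag from the table above has been pulled. The helper names build_request and generate are our own, not part of any library.

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a locally pulled model."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Example, assuming the server is up and the model has been pulled:
# print(generate("qwen3.7:27b", "Explain tail recursion in two sentences."))
```

Setting "stream" to False returns one JSON object instead of a stream of chunks, which keeps the example short. For interactive use you would likely want streaming.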

For enterprise-grade reasoning (API):

If you need MAI-Thinking-1's target quality (Sonnet 4.6 class) via API today, see our best AI API providers guide for the full landscape.

Will Microsoft ever open-source MAI models?

Unlikely for the flagship models. Microsoft's history:

  • Phi models (1-4): Open-source (small models, research-focused)
  • MAI models: Proprietary (enterprise differentiators)
  • Pattern: Small/research models = open. Large/commercial models = closed.

Microsoft may release smaller variants (like they did with Phi) but MAI-Thinking-1 itself will likely remain Azure-exclusive. The Aion local models may be more accessible since they ship with Windows hardware.

The RTX Spark Dev Box angle

Microsoft's Surface RTX Spark Dev Box ships preloaded with:

  • Windows 11 Pro
  • WSL2 with CUDA GPU passthrough
  • VS Code + GitHub Copilot
  • Python, Git, Node.js

This hardware (128GB unified memory) can run any open-weight 35B model locally. Even if MAI-Thinking-1 stays closed, the Dev Box runs Qwen 3.7 27B, Mistral Medium 3.5, and dozens of other open models that match or exceed MAI-Thinking-1's claimed quality. See Best LLMs for RTX Spark.
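On any WSL2 setup with CUDA passthrough, it is worth confirming that the GPU is actually visible before pulling a 20GB model. Here is a small sketch that shells out to nvidia-smi and parses its CSV output. We are assuming nvidia-smi is on the PATH inside WSL2. How a unified-memory device like the Dev Box reports memory.total is unknown to us, so the parser returns None for any non-numeric value rather than failing. The function names are our own.

```python
import subprocess
from typing import List, Optional, Tuple

# Real nvidia-smi query flags; memory.total is reported in MiB with nounits.
QUERY = ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"]


def parse_gpu_lines(output: str) -> List[Tuple[str, Optional[int]]]:
    """Parse 'name, memory' CSV lines into (name, MiB or None) tuples."""
    gpus = []
    for line in output.strip().splitlines():
        if not line.strip():
            continue
        # Split on the last comma so GPU names containing commas survive.
        name, _, mem = line.rpartition(",")
        mem = mem.strip()
        gpus.append((name.strip(), int(mem) if mem.isdigit() else None))
    return gpus


def list_gpus() -> List[Tuple[str, Optional[int]]]:
    """Run nvidia-smi and return the visible GPUs (raises if it is missing)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_lines(result.stdout)
```

If list_gpus() raises FileNotFoundError or returns an empty list, the passthrough is not working and local inference will fall back to CPU.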

FAQ

When will MAI-Thinking-1 be publicly available?

No date announced. Enterprise Azure access is live. Public API likely Q3 2026 at earliest. Open weights: unlikely ever.

Is the Aion 1.0 model the same as MAI-Thinking-1?

No. Aion models are smaller, designed for on-device Windows tasks. MAI-Thinking-1 is the cloud/enterprise flagship. Think of Aion as "MAI Lite for laptops."

Can I use MAI-Thinking-1 with Aider or Claude Code?

Not yet. No public API exists. If and when Microsoft releases an API, it will likely be Azure-only (not an OpenAI-compatible endpoint). Tools like Aider would need specific Azure integration.

What's better right now: MAI-Thinking-1 (if I had access) or DeepSeek V4-Pro?

DeepSeek V4-Pro almost certainly beats MAI-Thinking-1 on coding (80.6% on SWE-bench, against MAI-Thinking-1's claimed Sonnet 4.6-class quality). MAI-Thinking-1's advantage is enterprise compliance: commercially licensed data, Azure integration, no Chinese provider concerns. If you don't have those constraints, DeepSeek is better and available today.

Should I wait for MAI models or use alternatives?

Use alternatives now. Qwen 3.7 27B locally or DeepSeek V4-Pro via API both exceed MAI-Thinking-1's claimed Sonnet 4.6-class quality at similar or lower cost, and they're available today.

What about the Surface RTX Spark Dev Box?

The Surface RTX Spark Dev Box ships with Windows, CUDA, and the full dev stack preloaded. Even without MAI-Thinking-1 weights, it can run every open-weight 35B model at full speed. It is the ideal hardware for local AI development on Windows, whether you're running Microsoft's models or open alternatives. If MAI-Thinking-1 ever becomes available locally, the Dev Box would run it effortlessly at Q4 quantization (~20GB of its 128GB used).

Is there a timeline for MAI models becoming open?

No. Microsoft has not announced any plans to open-source MAI-Thinking-1 or MAI-Code-1-Flash. Their smaller Phi models (research-focused) are open, but commercial MAI models appear to be permanent Azure exclusives. The Aion on-device models ship with hardware but whether they are extractable/redistributable is unclear.