Microsoft announced the Aion 1.0 model family at Build 2026: small AI models designed to run directly on Windows devices without cloud API calls. The family has two variants: Aion 1.0 Instruct (reasoning and instruction following) and Aion 1.0 Plan (planning, tool use, and agent workflows).
These ship with NVIDIA RTX Spark hardware this fall, pre-installed and optimized for on-device inference. They represent Microsoft’s bet that important AI workloads should run locally — for privacy, speed, and offline capability.
The two Aion models
Aion 1.0 Instruct
| Purpose | On-device reasoning and instruction following |
|---|---|
| Use cases | Answer questions, follow multi-step instructions, generate content locally |
| Runs on | Windows devices with RTX Spark or similar hardware |
| Cloud needed | ❌ (fully on-device) |
| Size | Small (optimized for device deployment) |
Aion Instruct is the “thinking” model for local tasks, for when you need AI reasoning without sending data to the cloud. In role, think of it as a local Claude Haiku or GPT-4o-mini equivalent built into Windows, though no benchmarks have been published to show how it compares in quality.
Aion 1.0 Plan
| Purpose | Planning, tool use, and agent workflows |
|---|---|
| Use cases | Multi-step task planning, calling Windows APIs, orchestrating actions |
| Runs on | Windows devices with RTX Spark or similar hardware |
| Cloud needed | ❌ (fully on-device) |
| Designed for | Local AI agents that interact with Windows apps |
Aion Plan is the “acting” model — it plans sequences of actions and calls tools (Windows APIs, file system, applications). This is what powers the “agent-native Windows” vision: AI agents that run on your machine and interact with your apps locally.
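The plan-then-act pattern described above can be sketched in a few lines of Python. This is only an illustration, not Aion's API, which this article does not document: `Step`, `ToolRegistry`, `run_plan`, and both tools are hypothetical names, and the plan is hard-coded where a planning model would generate it.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """One planned action: which tool to call and with what arguments."""
    tool: str
    args: Dict[str, Any] = field(default_factory=dict)


class ToolRegistry:
    """Maps tool names to plain Python callables (stand-ins for real app or OS calls)."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def call(self, name: str, **kwargs: Any) -> Any:
        # Refuse anything the developer did not explicitly register.
        if name not in self._tools:
            raise KeyError(f"tool not registered: {name}")
        return self._tools[name](**kwargs)


def run_plan(plan: List[Step], tools: ToolRegistry) -> List[Any]:
    """Execute steps in order and collect each tool's result."""
    return [tools.call(step.tool, **step.args) for step in plan]


# Hypothetical tools; a real agent would wrap file-system or application calls here.
tools = ToolRegistry()
tools.register("list_files", lambda folder: [f"{folder}/notes.txt"])
tools.register("count_words", lambda text: len(text.split()))

# Assumption: a planning model would emit this list; it is hard-coded for the sketch.
plan = [
    Step("list_files", {"folder": "Documents"}),
    Step("count_words", {"text": "draft the weekly status email"}),
]
results = run_plan(plan, tools)
print(results)
```

Keeping the registry as an explicit allow-list means a plan can only invoke tools the developer registered, which is one simple way to bound what a local agent is able to do.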
How they fit into the Windows AI stack
Microsoft is building a layered local AI architecture:
| Layer | Technology | Purpose |
|---|---|---|
| Hardware | RTX Spark (128GB unified memory, Blackwell) | Run models locally |
| Models | Aion 1.0 Instruct + Plan | On-device reasoning + action |
| Runtime | NVIDIA OpenShell | Security and privacy policy |
| Containers | Microsoft Execution Containers (MXC) | Sandboxed agent isolation |
| Routing | Privacy-based query routing | Local vs cloud decisions |
| Agents | Hermes Agent, OpenClaw | User-facing agent apps |
The key insight: Aion models handle tasks that should never leave the device (sensitive data, private documents, internal workflows), while cloud models (MAI-Thinking-1, GPT-5.5) handle tasks that benefit from more compute.
How Aion compares to existing local AI
| | Aion 1.0 | Ollama + Open Models | RTX Spark + Qwen 27B |
|---|---|---|---|
| Setup | Pre-installed on Windows | Manual install | Manual install |
| Models | 2 (Instruct + Plan) | 100+ (any GGUF) | 100+ (any GGUF) |
| Quality | Unknown (no benchmarks yet) | Varies (7B-120B) | Strong (27B+) |
| Windows integration | ✅ (native APIs, MXC) | ❌ (generic API) | ❌ (generic API) |
| Agent capabilities | ✅ (built-in tool calling) | Requires custom code | Requires custom code |
| Open weight | ❌ (Microsoft proprietary) | ✅ | ✅ |
| Size | Small (fast on-device) | Any size | Any size |
Aion’s advantage: native Windows integration and pre-installed simplicity. Open models’ advantage: flexibility, community, proven benchmarks, and model choice.
What developers can do with Aion
Based on the Build 2026 demos:
Privacy-sensitive reasoning
- Process confidential documents without cloud upload
- Analyze internal code that cannot leave the device
- Run a personal assistant over local files without the files leaving the device
Local agent workflows
- Agent that reads your calendar, drafts emails, manages files
- Code assistant that understands your local project structure
- Automated testing that runs entirely on-device
Hybrid local-cloud routing
- NVIDIA OpenShell routes sensitive queries to local Aion
- Non-sensitive queries go to cloud models for better quality
- Developer controls the routing policy
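To make the routing idea concrete, here is a minimal sketch of a developer-controlled policy that sends sensitive queries to a local model and everything else to the cloud. It does not use OpenShell's actual policy format, which this article does not show: `RoutingPolicy`, its regex rules, and the `"local"`/`"cloud"` labels are all assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class RoutingPolicy:
    """Developer-supplied rules: any pattern match keeps the query on-device."""
    sensitive_patterns: List[re.Pattern]

    def route(self, query: str) -> str:
        for pattern in self.sensitive_patterns:
            if pattern.search(query):
                return "local"   # sensitive: never leaves the device
        return "cloud"           # general: allowed to use a larger cloud model


# Hypothetical rules; a real policy would be tuned to the organization's data.
policy = RoutingPolicy(
    sensitive_patterns=[
        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),             # SSN-like number
        re.compile(r"confidential|internal only", re.I),  # document markings
        re.compile(r"[A-Za-z]:\\"),                       # Windows file path
    ]
)

for query in ["Summarize this confidential memo", "What is the capital of France?"]:
    print(policy.route(query), "<-", query)
```

Pattern matching is the simplest possible classifier and will miss sensitive text that carries no obvious marker, so a stricter policy could default to `"local"` and route to the cloud only on an explicit allow-list.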
When available
- Ships this fall with RTX Spark hardware (Surface RTX Spark Dev Box, partner laptops)
- Windows integration via new Windows AI APIs
- No standalone download announced (hardware-bundled)
What to use today (before Aion ships)
For local AI on Windows right now:
- Ollama — Install, pull a model, run. Closest to “pre-installed” experience.
- LM Studio — GUI for browsing and running local models.
- Jan AI — Chat interface for non-technical users.
- WSL2 + vLLM — Production-grade serving on Windows via Linux subsystem.
Best models for Windows local AI today: Qwen 3.6 27B, Qwen 3.6 35B-A3B, Mistral Medium 3.5. See our best free local AI tools guide.
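For the Ollama route, a running Ollama server exposes a local HTTP API on port 11434 by default. The sketch below calls its `/api/generate` endpoint using only the Python standard library. The helper names `build_request` and `generate` are our own, and the model tag you pass must be one you have already pulled (check with `ollama list`); no specific tag is assumed here.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a locally running Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text (requires Ollama to be running)."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]
```

With the server running, `generate("<your-model-tag>", "Explain unified memory in one sentence.")` returns the model's reply as a string. Nothing leaves the machine, since the request goes to `localhost`.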
FAQ
Can I install Aion on my current Windows PC?
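Not announced. Aion appears to be bundled with RTX Spark hardware, and Microsoft has not said whether it will be available as a standalone download for existing PCs. The launch hardware (RTX Spark, with 128GB unified memory) suggests it targets RTX Spark-class specs, although the models themselves are described as small and no minimum requirements have been published.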
Is Aion better than Ollama + Qwen 27B?
Different strengths. Aion: native Windows integration, pre-installed, agent-ready. Ollama + Qwen: proven benchmarks, larger model, open-weight, available now. Aion’s quality is unproven (no public benchmarks).
Will Aion work for coding?
Likely limited. Its “small” size suggests it won’t match 27B+ models on coding tasks. Use it for quick local queries and agent orchestration. For serious coding, pair with larger models or cloud APIs.
Does this replace the need for Ollama on Windows?
For Windows-integrated agent tasks: possibly. For developer coding workflows: no. Ollama gives you model choice, community support, and proven quality that Aion (proprietary, unproven) cannot yet match.
How does this relate to NVIDIA OpenShell?
OpenShell provides the security/privacy layer. Aion provides the intelligence. Together they enable: “this query has private data → route to local Aion” vs “this query is general → route to cloud model.” OpenShell enforces the routing according to a policy the developer controls.