Once you download an AI model to your machine, it runs without internet. No API calls, no cloud servers, no data leaving your device. Here's how to set up a fully offline AI system.
## How offline AI works
AI models are just files: large files (2-200GB), but files nonetheless. Once downloaded, all computation happens locally on your CPU or GPU. The model doesn't phone home, doesn't need authentication, and doesn't require any network connection.
This means you can:
- Use AI on a plane with no WiFi
- Run AI in air-gapped secure environments
- Work in areas with no internet
- Guarantee zero data leakage
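You can check the "nothing leaves your device" claim yourself. Ollama serves its API on your own machine (port 11434 by default), so once a model is on disk, a request like the one below never touches an external host. A minimal sketch, assuming the default port and a model you have already pulled:

```bash
# Query the local Ollama API; the only host involved is localhost
curl http://localhost:11434/api/generate -d '{
  "model": "qwen3.5:9b",
  "prompt": "Say hello in five words.",
  "stream": false
}'
```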
## Setup: download everything while online
You need internet once, to download the tools and models. After that, everything runs offline.
```bash
# Step 1: Install Ollama (requires internet)
curl -fsSL https://ollama.com/install.sh | sh

# Step 2: Download models (requires internet)
ollama pull qwen3.5:9b          # General purpose (5.5GB)
ollama pull qwen2.5-coder:32b   # Coding (18GB)
ollama pull codestral           # Autocomplete (12GB)

# Step 3: Disconnect from the internet. Everything below works offline.

# Step 4: Run any downloaded model
ollama run qwen3.5:9b
```
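Before pulling the plug, confirm the models actually landed on disk. `ollama list` shows everything stored locally; if a model appears there, `ollama run` needs no connection. The network interface name below is an assumption for illustration; substitute your own:

```bash
# See which models are available offline
ollama list

# Optional: disable networking first to prove the point
# (Linux example; assumes your WiFi interface is called wlan0)
# sudo ip link set wlan0 down

# One-shot prompt against a local model
ollama run qwen3.5:9b "Explain in one sentence why this works offline."
```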
## Pre-download for air-gapped systems
If your target machine has no internet at all, download on another machine and transfer:
```bash
# On the machine WITH internet:
ollama pull qwen3.5:9b

# Find the model files
ls ~/.ollama/models/

# Copy the entire .ollama directory to a USB drive
cp -r ~/.ollama /media/usb/ollama-backup

# On the air-gapped machine:
cp -r /media/usb/ollama-backup ~/.ollama
ollama run qwen3.5:9b   # Works without internet
```
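USB transfers of tens of gigabytes can corrupt silently, so it's worth verifying the copy. A sketch using standard checksum tools, with the same paths as the example above:

```bash
# On the source machine: record a checksum for every model file
cd ~/.ollama && find . -type f -exec sha256sum {} + > /media/usb/ollama.sha256

# On the air-gapped machine, after copying: confirm every file matches
cd ~/.ollama && sha256sum -c /media/usb/ollama.sha256
```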
## Best models for offline use
Pick models based on your storage and hardware:
| Model | Download size | RAM needed | Best for |
|---|---|---|---|
| Qwen3.5-0.8B | ~0.5GB | 2GB | Minimal storage, basic tasks |
| Qwen3.5-9B | ~5.5GB | 8GB | Best quality for the size |
| Qwen 2.5 Coder 32B | ~18GB | 24GB | Offline coding assistant |
| Codestral | ~12GB | 16GB | Offline autocomplete |
| DeepSeek R1 7B | ~4GB | 6GB | Offline reasoning |
Download multiple models while you have internet. They sit on disk and don't use memory or CPU until you run them.
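If you want several models from the table in one sitting, a small shell loop saves babysitting each download. The tags match the examples in this guide (the DeepSeek tag is an assumption; check the Ollama library for the exact name):

```bash
# Pull several models back to back while the connection is up
for model in qwen3.5:9b qwen2.5-coder:32b codestral deepseek-r1:7b; do
  ollama pull "$model"
done
```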
## Use cases
**Travel.** Long flights, remote locations, unreliable hotel WiFi. A laptop with Ollama and a few models is a portable AI assistant that works anywhere.

**Secure environments.** Government, military, healthcare, and financial institutions often have air-gapped networks. Self-hosted AI lets these organizations use AI without connecting to external servers.

**Privacy.** Even if you have internet, running offline guarantees your data never leaves your machine. No accidental data leaks, no third-party data retention policies.

**Developing countries.** Unreliable or expensive internet makes cloud AI impractical. Offline models work regardless of connectivity.
## Offline AI in your IDE
Set up Continue + Ollama before going offline:
- Install the Continue extension in VS Code (requires internet)
- Configure it to use Ollama on localhost (a sample config is sketched after this list)
- Download your models
- Go offline; everything continues to work
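For reference, here is one shape that configuration can take. This is a sketch assuming Continue's older JSON config format (newer releases use a config.yaml instead, so adapt it to your version), reusing the model tags from earlier in this guide:

```bash
# Write a minimal Continue config pointing chat and autocomplete at local Ollama
# (assumes the older config.json format; adjust for your Continue version)
mkdir -p ~/.continue
cat > ~/.continue/config.json <<'EOF'
{
  "models": [
    { "title": "Qwen Coder (local)", "provider": "ollama", "model": "qwen2.5-coder:32b" }
  ],
  "tabAutocompleteModel": {
    "title": "Codestral (local)", "provider": "ollama", "model": "codestral"
  }
}
EOF
```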
Your coding assistant works on a plane, in a bunker, or in the middle of nowhere.
## Limitations
- **No model updates.** You're stuck with whatever version you downloaded.
- **No web search.** Models can't look up current information.
- **Storage.** Good models are 5-20GB each. Plan your disk space (a quick check is sketched below).
- **Initial download.** The first download needs a solid connection; an 18GB 32B model takes a long time on slow links.
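To see what your model collection actually costs in disk space, check Ollama's data directory (default location shown; yours may differ), and prune what you no longer use:

```bash
# Total disk used by downloaded models
du -sh ~/.ollama/models

# Delete a model you no longer need
ollama rm codestral
```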
## Related
- Best Self-Hosted AI Models in 2026
- How to Run AI Without a GPU
- How to Replace GitHub Copilot for Free
- Self-Hosted AI vs API: When to Pay and When to Run Locally
- Best Cloud GPU Providers in 2026