
Run AI Offline β€” Complete Guide to Air-Gapped AI (2026)


Once you download an AI model to your machine, it runs without internet. No API calls, no cloud servers, no data leaving your device. Here’s how to set up a fully offline AI system.

How offline AI works

AI models are just files β€” large files (2-200GB), but files nonetheless. Once downloaded, all computation happens locally on your CPU or GPU. The model doesn’t phone home, doesn’t need authentication, and doesn’t require any network connection.

This means you can:

  • Use AI on a plane with no WiFi
  • Run AI in air-gapped secure environments
  • Work in areas with no internet
  • Guarantee zero data leakage
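Because models are just files, you can see exactly what lives on disk. A small sketch, assuming Ollama's default store at ~/.ollama/models (relocatable via the OLLAMA_MODELS environment variable):

```shell
# Show how much disk the local model store uses. The path is Ollama's
# default on Linux/macOS; set OLLAMA_MODELS to relocate it.
du -sh ~/.ollama/models 2>/dev/null || echo "no local model store yet"
```

If nothing has been downloaded, the fallback message prints instead of a size.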

Setup: download everything while online

You need internet once β€” to download the tools and models. After that, everything runs offline.

# Step 1: Install Ollama (requires internet)
curl -fsSL https://ollama.com/install.sh | sh

# Step 2: Download models (requires internet)
ollama pull qwen3.5:9b           # General purpose (5.5GB)
ollama pull qwen2.5-coder:32b    # Coding (18GB)
ollama pull codestral            # Autocomplete (12GB)

# Step 3: Disconnect from internet. Everything below works offline.

# Step 4: Run any downloaded model
ollama run qwen3.5:9b
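Ollama also exposes a local HTTP API that runs entirely on-device. A quick way to confirm the local service is answering, sketched with Ollama's default port 11434 as an assumption (adjust if you set OLLAMA_HOST):

```shell
# Sketch: check whether the local Ollama API is reachable. Port 11434 is
# Ollama's default. Nothing here touches the internet.
check_ollama() {
  if curl -sf http://localhost:11434/api/tags >/dev/null 2>&1; then
    echo "ollama: up"
  else
    echo "ollama: not reachable on localhost:11434"
  fi
}
check_ollama
```

If it reports up, prompts sent through ollama run (or the /api/generate endpoint) are served entirely by the local process.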

Pre-download for air-gapped systems

If your target machine has no internet at all, download on another machine and transfer. Note that the air-gapped machine also needs Ollama itself installed; grab a standalone installer (for Linux, the release tarball on Ollama's GitHub releases page) while you're online and carry it over the same way:

# On machine WITH internet:
ollama pull qwen3.5:9b

# Find the model files
ls ~/.ollama/models/

# Copy the entire .ollama directory to a USB drive
cp -r ~/.ollama /media/usb/ollama-backup

# On the air-gapped machine (copy the contents, not the folder itself,
# so the files don't end up nested inside an existing ~/.ollama):
mkdir -p ~/.ollama
cp -r /media/usb/ollama-backup/. ~/.ollama/
ollama run qwen3.5:9b  # Works without internet
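After a USB transfer it's worth checking that nothing was corrupted in transit. Ollama's blob store is content-addressed: each file under models/blobs is named after its own SHA-256 digest, so the copy can be rechecked. A sketch, with the layout assumptions noted in the comments:

```shell
# Sketch: recompute each blob's SHA-256 and compare it to its filename.
# Assumes Ollama's default content-addressed layout (files named
# sha256-<hex> under ~/.ollama/models/blobs) and sha256sum from coreutils.
verify_blobs() {
  dir="$1"
  status=0
  for f in "$dir"/sha256-*; do
    [ -e "$f" ] || { echo "no blobs found in $dir"; return 1; }
    base="${f##*/}"
    expected="${base#sha256-}"
    actual=$(sha256sum "$f" | cut -d' ' -f1)
    if [ "$actual" = "$expected" ]; then
      echo "OK  $base"
    else
      echo "BAD $base"
      status=1
    fi
  done
  return $status
}
# Usage: verify_blobs ~/.ollama/models/blobs
```

A nonzero exit means at least one blob's digest no longer matches its name and should be re-copied.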

Best models for offline use

Pick models based on your storage and hardware:

Model               Download size  RAM needed  Best for
Qwen3.5-0.8B        ~0.5GB         2GB         Minimal storage, basic tasks
Qwen3.5-9B          ~5.5GB         8GB         Best quality for the size
Qwen 2.5 Coder 32B  ~18GB          24GB        Offline coding assistant
Codestral           ~12GB          16GB        Offline autocomplete
DeepSeek R1 7B      ~4GB           6GB         Offline reasoning

Download multiple models while you have internet. They sit on disk and don’t use resources until you run them.
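The RAM column above follows a simple rule of thumb: roughly the model's file size plus about 30% headroom for the runtime and context cache. That is a heuristic, not an official formula, but it's handy when sizing a machine:

```shell
# Heuristic (an assumption, not an official Ollama formula): RAM needed
# is about 1.3x the model's download size, rounded up.
approx_ram_gb() {
  size_gb="$1"
  echo $(( (size_gb * 13 + 9) / 10 ))   # ceil(size * 1.3) in integer math
}
approx_ram_gb 18   # Qwen 2.5 Coder 32B -> prints 24
approx_ram_gb 12   # Codestral          -> prints 16
```

Context length matters too: long prompts grow the KV cache, so leave extra headroom if you plan to feed the model large files.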

Use cases

Travel. Long flights, remote locations, unreliable hotel WiFi. A laptop with Ollama and a few models is a portable AI assistant that works anywhere.

Secure environments. Government, military, healthcare, and financial institutions often have air-gapped networks. Self-hosted AI lets these organizations use AI without connecting to external servers.

Privacy. Even if you have internet, running offline guarantees your data never leaves your machine. No accidental data leaks, no third-party data retention policies.

Developing countries. Unreliable or expensive internet makes cloud AI impractical. Offline models work regardless of connectivity.

Offline AI in your IDE

Set up Continue + Ollama before going offline:

  1. Install Continue extension in VS Code (requires internet)
  2. Configure it to use Ollama (localhost)
  3. Download your models
  4. Go offline β€” everything continues to work
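For step 2, a minimal Continue configuration pointing both chat and autocomplete at local Ollama might look like the sketch below, written as a shell heredoc. The field names follow Continue's legacy config.json schema and vary across versions (newer releases use config.yaml), so treat them as assumptions and check your version's docs:

```shell
# Sketch: write a minimal Continue config that talks to local Ollama.
# Schema is Continue's legacy config.json (an assumption; newer versions
# use config.yaml). Model names match the ones pulled earlier.
mkdir -p ~/.continue
cat > ~/.continue/config.json <<'EOF'
{
  "models": [
    { "title": "Qwen Coder (local)", "provider": "ollama", "model": "qwen2.5-coder:32b" }
  ],
  "tabAutocompleteModel": {
    "title": "Codestral (local)",
    "provider": "ollama",
    "model": "codestral"
  }
}
EOF
```

No API key is needed: the "ollama" provider talks to localhost, so the setup keeps working with the network cable unplugged.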

Your coding assistant works on a plane, in a bunker, or in the middle of nowhere.

Limitations

  • No model updates. You’re stuck with whatever version you downloaded.
  • No web search. Models can’t look up current information.
  • Storage. Good models are 5-20GB each. Plan your storage.
  • Initial download. Pulling models in the first place needs bandwidth: a 32B model at ~18GB can take hours on a slow connection.
