Running AI models locally means running someone else’s code on your machine. Open-source doesn’t mean safe — model weights can contain arbitrary code, and inference servers can have vulnerabilities. If you’re running models from Hugging Face, Ollama, or llama.cpp, you should sandbox them.
Here’s how, from easiest to most secure.
Level 1: Docker (Good Enough for Most People)
Docker isolates the model in a container with its own filesystem, network, and process space. The model can’t see your files unless you explicitly mount them.
```shell
# Run Ollama in Docker — no access to your home directory
docker run -d \
  --name ollama \
  --gpus all \
  -p 11434:11434 \
  -v ollama_data:/root/.ollama \
  ollama/ollama
```
Key flags:
- `-v ollama_data:/root/.ollama`: uses a named volume, NOT a bind mount to your home directory
- Don't use `-v /home/user:/data`: that gives the container access to your files
Restrict network access
By default, Docker containers can reach the internet. If you want the model fully isolated:
```shell
# Create an isolated network
docker network create --internal ai-sandbox

# Run with no internet access
docker run -d \
  --name ollama \
  --network ai-sandbox \
  --gpus all \
  -v ollama_data:/root/.ollama \
  ollama/ollama
```
The `--internal` flag blocks all traffic to and from outside the network: containers on it can still talk to each other, but the model can't phone home. Note that published ports (`-p`) generally don't work on an internal network, so clients need to join the `ai-sandbox` network to reach Ollama.
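A quick way to confirm the isolation works (a sanity check, not part of the setup; it assumes the `ai-sandbox` network created above and the `busybox` image):

```shell
# A throwaway container on the internal network tries to reach the web.
# On an --internal network the fetch should fail and print "blocked".
docker run --rm --network ai-sandbox busybox \
  sh -c 'wget -q -T 5 -O /dev/null https://ollama.com && echo reachable || echo blocked'
```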
Read-only filesystem
```shell
docker run -d \
  --name ollama \
  --read-only \
  --tmpfs /tmp \
  --gpus all \
  -v ollama_data:/root/.ollama \
  ollama/ollama
```
The container can’t write anywhere except /tmp and the data volume.
Level 2: Docker with Resource Limits
Prevent a model from eating all your RAM or CPU:
```shell
docker run -d \
  --name ollama \
  --gpus all \
  --memory=16g \
  --cpus=4 \
  --pids-limit=100 \
  -v ollama_data:/root/.ollama \
  ollama/ollama
```
- `--memory=16g`: hard RAM limit; the container gets killed if it exceeds this
- `--cpus=4`: can only use 4 CPU cores
- `--pids-limit=100`: prevents fork bombs
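You can confirm which limits Docker actually applied with `docker inspect` (memory is reported in bytes, CPUs in nano-CPUs):

```shell
docker inspect ollama --format \
  'memory={{.HostConfig.Memory}} nanocpus={{.HostConfig.NanoCpus}} pids={{.HostConfig.PidsLimit}}'
```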
Level 3: Run as Non-Root
By default, processes inside Docker run as root (inside the container). Add a non-root user:
```dockerfile
# Dockerfile
FROM ollama/ollama
RUN useradd -m aiuser
USER aiuser
```

```shell
docker build -t ollama-sandboxed .
docker run -d --name ollama --gpus all ollama-sandboxed
```
Now even if the model exploits a vulnerability, it’s running as an unprivileged user.
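It's worth verifying, since it's easy to forget the `USER` line:

```shell
# Should print "aiuser", not "root"
docker exec ollama whoami
```

One caveat: the stock image keeps models under /root/.ollama, which aiuser can't write to. If model pulls fail, you may need to point the `OLLAMA_MODELS` environment variable at a directory aiuser owns (e.g. `-e OLLAMA_MODELS=/home/aiuser/models`).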
Level 4: VM Isolation (Maximum Security)
For maximum isolation, run models inside a virtual machine. The VM has its own kernel — a container escape won’t help an attacker.
Using Multipass (easiest):
```shell
# Create a VM with 16GB RAM and 4 CPUs
multipass launch --name ai-sandbox --memory 16G --cpus 4 --disk 50G

# Shell into it
multipass shell ai-sandbox

# Install Ollama inside the VM
curl -fsSL https://ollama.com/install.sh | sh
```
Using Lima (macOS):
```shell
brew install lima
limactl start --name=ai-sandbox --cpus=4 --memory=16
limactl shell ai-sandbox
```
The model runs in a completely separate OS. Your host filesystem is untouched.
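You don't have to keep a shell open to use the model; `multipass exec` runs one-off commands inside the VM (the model name here, llama3.2, is just an example):

```shell
# Pull and query a model from the host, without leaving the sandbox
multipass exec ai-sandbox -- ollama pull llama3.2
multipass exec ai-sandbox -- ollama run llama3.2 "Say hello"
```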
Level 5: Network Firewall Rules
If you’re exposing a local model API (e.g., Ollama on port 11434), lock it down:
```shell
# Linux: only allow localhost
sudo iptables -A INPUT -p tcp --dport 11434 -s 127.0.0.1 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 11434 -j DROP

# macOS: use pf
echo "block in on en0 proto tcp to any port 11434" | sudo pfctl -ef -
```
This prevents other devices on your network from accessing your model. Note that neither rule survives a reboot; persist them with your distro's iptables-save mechanism, or in /etc/pf.conf on macOS.
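To verify, hit the API from the host and from another machine on the LAN (`/api/version` is Ollama's version endpoint; replace `<host-ip>` with your machine's LAN address):

```shell
# From the host itself: should return a version string
curl -s http://127.0.0.1:11434/api/version

# From another device on the LAN: should time out and print "blocked"
curl -s --max-time 5 http://<host-ip>:11434/api/version || echo blocked
```

Ollama binds to 127.0.0.1 by default, so these rules mostly matter if you've set `OLLAMA_HOST=0.0.0.0` to serve other devices.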
Quick Reference
| Threat | Solution |
|---|---|
| Model reads your files | Docker with named volumes (no bind mounts) |
| Model phones home | Internal Docker network (`docker network create --internal`) |
| Model eats all RAM | --memory and --cpus limits |
| Container escape | VM isolation (Multipass/Lima) |
| Network exposure | Firewall rules, bind to localhost only |
| Malicious model weights | Run as non-root, read-only filesystem |
What Most People Should Do
For hobby use: Docker with a named volume and no bind mounts. That’s it. You’re already safer than 95% of people running `ollama serve` directly on their host.
For anything sensitive: Docker + internal network + resource limits. Takes 5 minutes to set up and covers all realistic threats.
For the paranoid (or production): VM isolation. Full stop.
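For reference, the Docker levels combine into a single command. This is a sketch, assuming the ai-sandbox internal network and the ollama-sandboxed image built earlier; the extra `--cap-drop` and `no-new-privileges` hardening is belt-and-suspenders and may need loosening if the server fails to start:

```shell
docker run -d \
  --name ollama \
  --gpus all \
  --network ai-sandbox \
  --read-only \
  --tmpfs /tmp \
  --memory=16g \
  --cpus=4 \
  --pids-limit=100 \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  -v ollama_data:/root/.ollama \
  ollama-sandboxed
```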
Related resources
- Ollama vs llama.cpp vs vLLM — Which Should You Use?
- Run AI Offline — Complete Guide to Air-Gapped AI
- Self-Hosted AI vs API — When to Pay and When to Run Locally
- Best Self-Hosted AI Models in 2026
- Docker cheat sheet
- How Docker Containers Actually Work Under the Hood
Running AI for your business? See How to Set Up AI for Free — A Guide for Every Profession for profession-specific setups (sales, marketing, real estate, HR, and more).