You download a model from HuggingFace, load it into Ollama, and start coding with it. But who made that model? Whatβs in the weights? Could it be compromised?
AI model supply chain security is the next frontier of software security β and almost nobody is thinking about it yet.
The risks
1. Malicious code in model files
PyTorch model files (.pt, .bin) can contain arbitrary Python code that executes when loaded. A model that looks like a fine-tuned Llama could run malicious code on your machine.
Mitigation: Use SafeTensors format (.safetensors) which canβt contain executable code. Most reputable models on HuggingFace now use SafeTensors. Ollama uses GGUF which is also safe.
2. Backdoored model weights
A model can be trained to behave normally on most inputs but produce specific outputs when triggered by a secret phrase. This is called a βsleeper agentβ or βbackdoor.β
Example: A coding model that writes correct code 99.9% of the time but introduces a subtle vulnerability when it sees a specific comment pattern.
Mitigation: Hard to detect. Use models from known, reputable organizations (Google, Mistral, Z.ai, Meta). Be cautious with anonymous fine-tunes.
3. Data poisoning
The training data contained malicious examples that bias the modelβs behavior. The model might recommend insecure coding patterns, leak training data, or produce biased outputs.
Mitigation: Use models with documented training processes. Check if the organization publishes model cards and training details.
4. Dependency confusion
You install ollama pull qwen3.5:27b but a typosquatted model qwen35:27b exists with malicious weights.
Mitigation: Double-check model names. Use official organization pages on HuggingFace. Verify download URLs.
How to protect yourself
For local models
- Only download from verified organizations on HuggingFace (look for the β badge)
- Prefer SafeTensors or GGUF format over PyTorch
.binfiles - Verify checksums β compare SHA256 with the published value
- Run in a sandbox β Docker container or VM, not directly on your dev machine
- Pin versions β donβt auto-update models in production
For API providers
API providers (OpenRouter, Anthropic, OpenAI) handle model security for you. The risk shifts to:
- Data privacy (see our GDPR guide)
- Provider compromise (rare but possible)
- Model version changes without notice
For MCP servers
Third-party MCP servers from npm/PyPI are software dependencies β treat them like any other:
- Audit before installing
- Pin versions
- Review updates
- Run in sandboxed environments
The trusted model sources
| Source | Trust level | Why |
|---|---|---|
| Google (Gemma) | β High | Major company, documented process |
| Mistral | β High | EU company, published research |
| Z.ai (GLM) | β High | Public company, MIT license |
| Meta (Llama) | β High | Major company, open research |
| DeepSeek | β High | Published papers, MIT license |
| Unsloth (quantizations) | β οΈ Medium | Reputable but third-party |
| Random HuggingFace users | β Low | No verification, no accountability |
Bottom line
The AI model supply chain is where software supply chain security was 10 years ago β before SolarWinds, before Log4j. The risks are real but manageable with basic hygiene: use trusted sources, verify formats, pin versions, and sandbox execution.
Checklist for downloading models safely
Before downloading any model:
- Source is a verified organization on HuggingFace (β badge)
- Format is SafeTensors or GGUF (not
.binor.pt) - Model card exists with training details and intended use
- License is clear (MIT, Apache 2.0, or specific model license)
- Community feedback β check comments and discussions for red flags
- Checksum verified β compare SHA256 with published value
- Running in sandbox β Docker container or VM, not bare metal
For organizations: model governance
If your team uses open-source models in production, establish a model governance process:
- Approved model list β maintain a list of vetted models and versions
- Review process β new models require security review before production use
- Version pinning β never auto-update models in production
- Audit trail β document which models are used where and why
- Incident response β plan for what happens if a model is found to be compromised
This mirrors how mature organizations handle software dependencies (approved libraries, vulnerability scanning, version pinning). The same discipline applies to AI models.
The GGUF advantage
GGUF (GPT-Generated Unified Format) is the safest format for local models:
- No executable code β pure tensor data, canβt run arbitrary code
- Standardized β used by Ollama, llama.cpp, and most local inference tools
- Metadata included β model architecture and parameters embedded in the file
- Quantization built-in β different precision levels in one format
When downloading from HuggingFace, always prefer GGUF over PyTorch formats for local use.
Related: AI Security Checklist for Startups Β· Prompt Injection Explained Β· Best AI Coding Agents for Privacy Β· Ollama Complete Guide Β· Red Team Your AI Application