Local AI Code Review with Ollama: Never Send Code to the Cloud (2026)
Every line of code you paste into a cloud-based review tool leaves your machine. For teams working under NDA, handling regulated data, or building proprietary systems, that's a non-starter. The good news: you can run a capable AI code reviewer entirely on your own hardware using Ollama.
In this tutorial you'll build a Python script that pipes git diff output into a local LLM and gets structured review comments back, with no API keys, no cloud accounts, and no data leaving your network.
Why local code review matters
Cloud-hosted AI tools are convenient, but they come with trade-offs that matter in professional settings:
- Intellectual property exposure. Code sent to third-party APIs may be logged, cached, or used for training. Even with opt-out policies, the data still transits external infrastructure.
- Regulatory compliance. Industries governed by GDPR, HIPAA, SOC 2, or ITAR often prohibit sending source code to external services. A self-hosted pipeline sidesteps that entirely.
- Air-gapped environments. Defense, finance, and critical infrastructure teams frequently work on networks with no outbound internet. Local models are the only option.
- Cost at scale. API calls add up. A local model running on a developer workstation costs nothing per request after the initial setup.
For a deeper look at the privacy implications of AI-assisted coding, we've covered that separately.
Running reviews locally also gives you full control over which model you use, how prompts are constructed, and what happens with the output. There's no vendor lock-in: if a better model drops next month, you swap one string and keep going.
What you'll build
A single Python script, review.py, that:
- Captures the staged `git diff` (or accepts a diff via stdin).
- Sends it to a local Ollama model with a code-review system prompt.
- Prints structured feedback: bugs, suggestions, and a summary.
The script works standalone, as a pre-commit git hook, or as a step in a local CI pipeline.
Prerequisites
- Python 3.10+
- Ollama installed and running (full setup guide)
- A code-capable model pulled locally, e.g. `ollama pull codellama:13b`
- The `requests` library: `pip install requests`
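Before wiring the script into your workflow, it helps to confirm Ollama is reachable and the model is actually pulled. A minimal sanity check, assuming Ollama's default port (11434) and its /api/tags endpoint, which lists locally available models:

#!/usr/bin/env python3
"""Sanity check: is Ollama running, and is the review model available?"""
import sys
import requests

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # default Ollama port
MODEL = "codellama:13b"

try:
    resp = requests.get(OLLAMA_TAGS_URL, timeout=5)
    resp.raise_for_status()
except requests.RequestException as exc:
    sys.exit(f"Ollama is not reachable at {OLLAMA_TAGS_URL}: {exc}")

# /api/tags returns a "models" list; each entry has a "name" like "codellama:13b"
names = [m["name"] for m in resp.json().get("models", [])]
if MODEL not in names:
    sys.exit(f"{MODEL} is not pulled locally. Run: ollama pull {MODEL}")

print(f"Ollama is up and {MODEL} is available.")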
The review script
Create a file called review.py in your project root:
#!/usr/bin/env python3
"""Local AI code review using Ollama."""
import subprocess
import sys
import requests
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "codellama:13b"
SYSTEM_PROMPT = """You are a senior software engineer performing a code review.
Given a git diff, provide:
1. **Bugs**: logic errors, off-by-one mistakes, null/undefined risks.
2. **Suggestions**: readability, performance, idiomatic improvements.
3. **Summary**: one-paragraph overall assessment.
Be concise. Reference specific lines from the diff when possible.
If the diff looks fine, say so briefly."""
def get_diff() -> str:
"""Return staged diff, or fall back to stdin."""
if not sys.stdin.isatty():
return sys.stdin.read()
result = subprocess.run(
["git", "diff", "--cached"], capture_output=True, text=True
)
if not result.stdout.strip():
result = subprocess.run(
["git", "diff"], capture_output=True, text=True
)
return result.stdout
def review(diff: str) -> str:
"""Send diff to Ollama and return the review."""
resp = requests.post(
OLLAMA_URL,
json={
"model": MODEL,
"prompt": f"Review this diff:\n\n```diff\n{diff}\n```",
"system": SYSTEM_PROMPT,
"stream": False,
},
timeout=300,
)
resp.raise_for_status()
return resp.json()["response"]
def main():
diff = get_diff()
if not diff.strip():
print("No diff found. Stage changes or pipe a diff to stdin.")
sys.exit(0)
print(f"Reviewing {len(diff.splitlines())} lines of diff with {MODEL}...\n")
print(review(diff))
if __name__ == "__main__":
main()
Usage
# Review staged changes
python review.py
# Review a specific commit
git diff HEAD~1 | python review.py
# Review a PR branch against main
git diff main...feature-branch | python review.py
You can swap the model by changing the MODEL variable, or read it from an environment variable so it can be changed per run; a minimal sketch of the latter follows.
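One way to do that, sketched below; the variable name OLLAMA_REVIEW_MODEL is just an example, not something Ollama itself defines:

import os

# Read the review model from an environment variable, falling back to the default.
MODEL = os.environ.get("OLLAMA_REVIEW_MODEL", "codellama:13b")

With that change, OLLAMA_REVIEW_MODEL=qwen2.5-coder:32b python review.py runs a one-off review with a different model without editing the script.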
Using as a git hook
To run the review automatically before every commit, set it up as a pre-commit hook:
# Create the hook
cat > .git/hooks/pre-commit << 'EOF'
#!/usr/bin/env bash
echo "Running local AI code review..."
output=$(python review.py 2>&1)
echo "$output"
# Optional: block the commit if the model flags bugs
# (crude keyword check; tune the pattern to your model's output to avoid false positives)
if echo "$output" | grep -qi "bug"; then
    echo ""
    echo "AI reviewer flagged potential bugs. Review above and commit with --no-verify to override."
    exit 1
fi
EOF
chmod +x .git/hooks/pre-commit
Now every git commit triggers a local review. Use git commit --no-verify to skip it when needed.
For team-wide adoption, consider moving the hook into a shared .githooks/ directory and configuring core.hooksPath:
git config core.hooksPath .githooks
Best models for code review
Not every model is equally good at spotting bugs in diffs. Here's what works well locally as of early 2026:
| Model | Size | Strengths | Min RAM |
|---|---|---|---|
| `codellama:13b` | 13B | Solid all-rounder for code tasks | 16 GB |
| `deepseek-coder-v2:16b` | 16B | Strong at multi-language review | 16 GB |
| `codellama:34b` | 34B | Better reasoning on complex diffs | 32 GB |
| `qwen2.5-coder:32b` | 32B | Excellent instruction following | 32 GB |
| `llama3:70b` | 70B | Best quality, needs serious hardware | 64 GB |
For most developers, the 13B–16B range hits the sweet spot between quality and speed. If you have 32 GB of RAM, the 32B–34B models are noticeably better at catching subtle logic errors.
Want to try the larger models without buying hardware? Cloud GPU providers let you spin up the right GPU in minutes.
See our full comparison of local coding models and models ranked specifically for code review for detailed benchmarks.
Limitations vs cloud tools
Local AI code review is practical and private, but it's worth being honest about the trade-offs:
- Context window. Local models typically handle 4K–16K tokens, so very large diffs may need to be split (see the sketch after this list). Cloud models like GPT-4 and Claude offer 100K+ context.
- Quality ceiling. The best local models are good, but frontier cloud models still lead on nuanced architectural feedback and cross-file reasoning.
- No project-wide context. The script reviews a diff in isolation. It doesn't know your codebase conventions, type definitions in other files, or your test suite. Cloud-integrated tools like GitHub Copilot pull in repo context automatically.
- Speed. On CPU-only machines, a 13B model may take 30–60 seconds per review. A decent GPU (RTX 3090 or better) brings that down to a few seconds.
- No inline annotations. You get text output, not inline PR comments. Integrating with GitHub/GitLab APIs is possible but requires additional work.
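On the context-window point above: one simple workaround, shown here as a sketch rather than a recommendation, is to split the diff into per-file chunks (each file's section of git diff output starts with a "diff --git" header) and run review() from review.py on each chunk:

def split_diff_by_file(diff: str) -> list[str]:
    """Split a unified diff into per-file chunks on "diff --git" headers."""
    chunks: list[str] = []
    current: list[str] = []
    for line in diff.splitlines(keepends=True):
        if line.startswith("diff --git") and current:
            chunks.append("".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("".join(current))
    return chunks

# Review each file's changes separately to stay within the context window.
for chunk in split_diff_by_file(diff):
    print(review(chunk))

Per-file review loses cross-file context, so treat it as a workaround for the window limit rather than an upgrade.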
For many teams, the right answer is a hybrid approach: local review for day-to-day development, cloud tools for final PR review on non-sensitive projects. You can also improve the local experience over time: adding retrieval-augmented generation (RAG) to pull in relevant files, or fine-tuning a model on your team's past review comments to match your coding standards.
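A lightweight first step in that direction, well short of real RAG but enough to illustrate the idea, is to pass the current contents of the files touched by the staged diff as extra context. The helper below is a sketch; the character cap and the way the context is spliced into the prompt are assumptions to tune for your model:

import subprocess
from pathlib import Path

MAX_CONTEXT_CHARS = 8000  # rough cap to keep the prompt inside the model's window

def gather_context() -> str:
    """Return the current contents of files touched by the staged diff."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only"], capture_output=True, text=True
    )
    parts: list[str] = []
    total = 0
    for name in result.stdout.splitlines():
        path = Path(name)
        if not path.is_file():
            continue  # skip deleted or renamed-away files
        text = path.read_text(errors="replace")
        if total + len(text) > MAX_CONTEXT_CHARS:
            break
        parts.append(f"### {name}\n{text}")
        total += len(text)
    return "\n\n".join(parts)

# In review(), prepend this to the prompt, for example:
#   prompt = "Project context:\n" + gather_context() + "\n\nReview this diff:\n\n" + diff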
Related links
- Ollama Complete Guide (2026): installation, configuration, and model management
- Best AI Models for Code Review (2026): benchmarks and rankings
- AI Code & Data Privacy: what happens to code you send to AI services
- Self-Hosted AI & GDPR Compliance: running AI tools under European data regulations
- Best AI Models for Coding Locally (2026): broader comparison beyond code review