πŸ“ Tutorials
Β· 5 min read

Local AI Code Review with Ollama β€” Never Send Code to the Cloud (2026)


Every line of code you paste into a cloud-based review tool leaves your machine. For teams working under NDA, handling regulated data, or building proprietary systems, that’s a non-starter. The good news: you can run a capable AI code reviewer entirely on your own hardware using Ollama.

In this tutorial you’ll build a Python script that pipes git diff output into a local LLM and gets structured review comments back β€” no API keys, no cloud accounts, no data leaving your network.

Why local code review matters

Cloud-hosted AI tools are convenient, but they come with trade-offs that matter in professional settings:

  • Intellectual property exposure. Code sent to third-party APIs may be logged, cached, or used for training. Even with opt-out policies, the data still transits external infrastructure.
  • Regulatory compliance. Industries governed by GDPR, HIPAA, SOC 2, or ITAR often prohibit sending source code to external services. A self-hosted pipeline sidesteps that entirely.
  • Air-gapped environments. Defense, finance, and critical infrastructure teams frequently work on networks with no outbound internet. Local models are the only option.
  • Cost at scale. API calls add up. A local model running on a developer workstation costs nothing per request after the initial setup.

For a deeper look at the privacy implications of AI-assisted coding, we’ve covered that separately.

Running reviews locally also gives you full control over which model you use, how prompts are constructed, and what happens with the output. There’s no vendor lock-in β€” if a better model drops next month, you swap one string and keep going.

What you’ll build

A single Python script β€” review.py β€” that:

  1. Captures the staged git diff (or accepts a diff via stdin).
  2. Sends it to a local Ollama model with a code-review system prompt.
  3. Prints structured feedback: bugs, suggestions, and a summary.

The script works standalone, as a pre-commit git hook, or as a step in a local CI pipeline.

Prerequisites

  • Python 3.10+
  • Ollama installed and running (full setup guide)
  • A code-capable model pulled locally β€” e.g. ollama pull codellama:13b
  • The requests library: pip install requests

The review script

Create a file called review.py in your project root:

#!/usr/bin/env python3
"""Local AI code review using Ollama."""

import subprocess
import sys

import requests

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "codellama:13b"

SYSTEM_PROMPT = """You are a senior software engineer performing a code review.
Given a git diff, provide:
1. **Bugs** β€” logic errors, off-by-one mistakes, null/undefined risks.
2. **Suggestions** β€” readability, performance, idiomatic improvements.
3. **Summary** β€” one-paragraph overall assessment.

Be concise. Reference specific lines from the diff when possible.
If the diff looks fine, say so briefly."""


def get_diff() -> str:
    """Return staged diff, or fall back to stdin."""
    if not sys.stdin.isatty():
        return sys.stdin.read()
    result = subprocess.run(
        ["git", "diff", "--cached"], capture_output=True, text=True
    )
    if not result.stdout.strip():
        result = subprocess.run(
            ["git", "diff"], capture_output=True, text=True
        )
    return result.stdout


def review(diff: str) -> str:
    """Send diff to Ollama and return the review."""
    resp = requests.post(
        OLLAMA_URL,
        json={
            "model": MODEL,
            "prompt": f"Review this diff:\n\n```diff\n{diff}\n```",
            "system": SYSTEM_PROMPT,
            "stream": False,
        },
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]


def main():
    diff = get_diff()
    if not diff.strip():
        print("No diff found. Stage changes or pipe a diff to stdin.")
        sys.exit(0)

    print(f"Reviewing {len(diff.splitlines())} lines of diff with {MODEL}...\n")
    print(review(diff))


if __name__ == "__main__":
    main()

Usage

# Review staged changes
python review.py

# Review a specific commit
git diff HEAD~1 | python review.py

# Review a PR branch against main
git diff main...feature-branch | python review.py

You can swap the model by changing the MODEL variable or passing it as an environment variable β€” a small extension left as an exercise.
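One way to sketch that env-var override (the `OLLAMA_MODEL` variable name is my own choice, not something Ollama itself reads):

```python
import os

def pick_model(default: str = "codellama:13b") -> str:
    """Resolve the review model, letting an env var override the default."""
    return os.environ.get("OLLAMA_MODEL", default)

# Replaces the hard-coded MODEL constant in review.py.
MODEL = pick_model()
```

Then `OLLAMA_MODEL=qwen2.5-coder:32b python review.py` switches models without editing the file.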

Using as a git hook

To run the review automatically before every commit, set it up as a pre-commit hook:

# Create the hook
cat > .git/hooks/pre-commit << 'EOF'
#!/usr/bin/env bash
echo "Running local AI code review..."
output=$(python review.py 2>&1)
echo "$output"

# Optional: block commit if the model flags bugs.
# Note: this grep is a crude heuristic. The review format itself contains
# a "Bugs" heading, so tune the pattern to your model's actual output.
if echo "$output" | grep -qi "bug"; then
    echo ""
    echo "⚠ AI reviewer flagged potential bugs. Review above and commit with --no-verify to override."
    exit 1
fi
EOF

chmod +x .git/hooks/pre-commit

Now every git commit triggers a local review. Use git commit --no-verify to skip it when needed.

For team-wide adoption, consider moving the hook into a shared .githooks/ directory and configuring core.hooksPath:

git config core.hooksPath .githooks

Best models for code review

Not every model is equally good at spotting bugs in diffs. Here’s what works well locally as of early 2026:

| Model | Size | Strengths | Min RAM |
|---|---|---|---|
| codellama:13b | 13B | Solid all-rounder for code tasks | 16 GB |
| deepseek-coder-v2:16b | 16B | Strong at multi-language review | 16 GB |
| codellama:34b | 34B | Better reasoning on complex diffs | 32 GB |
| qwen2.5-coder:32b | 32B | Excellent instruction following | 32 GB |
| llama3:70b | 70B | Best quality, needs serious hardware | 64 GB |

For most developers, the 13B–16B range hits the sweet spot between quality and speed. If you have 32 GB of RAM, the 32B–34B models are noticeably better at catching subtle logic errors.

Want to try the larger models without buying hardware? Cloud GPU providers let you spin up the right GPU in minutes.

See our full comparison of local coding models, and our ranking of models for code review specifically, for detailed benchmarks.

Limitations vs cloud tools

Local AI code review is practical and private, but it’s worth being honest about the trade-offs:

  • Context window. Local models typically handle 4K–16K tokens. Very large diffs may need to be split. Cloud models like GPT-4 and Claude offer 100K+ context.
  • Quality ceiling. The best local models are good, but frontier cloud models still lead on nuanced architectural feedback and cross-file reasoning.
  • No project-wide context. The script reviews a diff in isolation. It doesn’t know your codebase conventions, type definitions in other files, or your test suite. Cloud-integrated tools like GitHub Copilot pull in repo context automatically.
  • Speed. On CPU-only machines, a 13B model may take 30–60 seconds per review. A decent GPU (RTX 3090 or better) brings that down to a few seconds.
  • No inline annotations. You get text output, not inline PR comments. Integrating with GitHub/GitLab APIs is possible but requires additional work.

For many teams, the right answer is a hybrid approach: local review for day-to-day development, cloud tools for final PR review on non-sensitive projects. You can also improve the local experience over time β€” adding retrieval-augmented generation (RAG) to pull in relevant files, or fine-tuning a model on your team’s past review comments to match your coding standards.
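As a first step toward project-wide context, you could extract the file paths the diff touches and prepend their current contents to the prompt. A rough sketch, assuming paths in `+++ b/...` headers are relative to the repo root (both helper names are mine):

```python
import re
from pathlib import Path

def changed_files(diff: str) -> list[str]:
    """Return post-change file paths from a unified git diff."""
    # "+++ b/path" lines name the files after the change.
    return [m.group(1) for m in re.finditer(r"^\+\+\+ b/(.+)$", diff, re.M)]

def build_context(diff: str, max_chars: int = 8000) -> str:
    """Concatenate (truncated) contents of changed files as extra context."""
    parts = []
    for path in changed_files(diff):
        p = Path(path)
        if p.is_file():
            parts.append(f"--- {path} ---\n{p.read_text()[:2000]}")
    return "\n".join(parts)[:max_chars]
```

Prepending `build_context(diff)` to the prompt in `review()` gives the model the full current version of each touched file, not just the changed hunks. Keep an eye on the context-window limits discussed above.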