Jun 7, 2026 · 6 min read

Generate Unit Tests with Ollama — Never Write Tests Manually Again (2026)

Writing unit tests is one of those tasks every developer knows they should do but often skip under deadline pressure. What if you could point a script at any source file and get a complete test suite back in seconds — running entirely on your machine?

That’s what we’re building today. A Python CLI tool that reads a source file, sends it to Ollama with a carefully crafted prompt, and writes a ready-to-run test file. It supports both Python (pytest) and JavaScript (jest), detects the language automatically, and needs zero cloud credentials.

If you’re new to the concept, check out our primer on what AI test generation actually is before diving in. For a broader look at the landscape, see our roundup of the best AI testing tools in 2026.

Prerequisites

Python 3.10+
Ollama installed and running (ollama serve)
A code-capable model pulled — qwen2.5-coder:7b or codellama:13b work well (see best Ollama models for coding)
The requests library (pip install requests)

The Prompt Template

The quality of generated tests lives or dies with the prompt. After testing dozens of variations, this template consistently produces runnable tests with good coverage:

PROMPT_TEMPLATES = {
    "python": """You are an expert Python testing engineer.
Given the following Python source code, generate a comprehensive pytest test suite.

Rules:
- Import the module correctly based on the filename provided.
- Test every public function and class method.
- Include edge cases: empty inputs, None values, boundary conditions.
- Use descriptive test names following test_<function>_<scenario> convention.
- Add brief docstrings to each test.
- Only output valid Python code. No explanations, no markdown fences.

Filename: {filename}

Source code:
{code}""",

    "javascript": """You are an expert JavaScript testing engineer.
Given the following JavaScript source code, generate a comprehensive Jest test suite.

Rules:
- Use require() to import the module based on the filename provided.
- Test every exported function.
- Include edge cases: empty inputs, undefined, null, boundary conditions.
- Use descriptive test names in it() blocks.
- Only output valid JavaScript code. No explanations, no markdown fences.

Filename: {filename}

Source code:
{code}""",
}

Key details that make this work: telling the model to skip markdown fences prevents the common problem of getting ```python wrappers in the output. Specifying the filename lets the model generate correct import paths. Demanding edge cases pushes coverage beyond the happy path.

The Complete Script

Save this as generate_tests.py:

#!/usr/bin/env python3
"""Generate unit tests from source files using Ollama."""

import argparse
import sys
from pathlib import Path

import requests

OLLAMA_URL = "http://localhost:11434/api/generate"
DEFAULT_MODEL = "qwen2.5-coder:7b"

PROMPT_TEMPLATES = {
    "python": """You are an expert Python testing engineer.
Given the following Python source code, generate a comprehensive pytest test suite.

Rules:
- Import the module correctly based on the filename provided.
- Test every public function and class method.
- Include edge cases: empty inputs, None values, boundary conditions.
- Use descriptive test names following test_<function>_<scenario> convention.
- Add brief docstrings to each test.
- Only output valid Python code. No explanations, no markdown fences.

Filename: {filename}
Source code:
{code}""",
    "javascript": """You are an expert JavaScript testing engineer.
Given the following JavaScript source code, generate a comprehensive Jest test suite.

Rules:
- Use require() to import the module based on the filename provided.
- Test every exported function.
- Include edge cases: empty inputs, undefined, null, boundary conditions.
- Use descriptive test names in it() blocks.
- Only output valid JavaScript code. No explanations, no markdown fences.

Filename: {filename}
Source code:
{code}""",
}

LANG_MAP = {".py": "python", ".js": "javascript", ".mjs": "javascript", ".ts": "javascript"}
TEST_EXT = {"python": ".py", "javascript": ".test.js"}


def detect_language(filepath: Path) -> str:
    lang = LANG_MAP.get(filepath.suffix)
    if not lang:
        sys.exit(f"Unsupported file type: {filepath.suffix}")
    return lang


def build_prompt(code: str, filename: str, language: str) -> str:
    return PROMPT_TEMPLATES[language].format(code=code, filename=filename)


def call_ollama(prompt: str, model: str) -> str:
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]


def clean_output(text: str) -> str:
    """Strip markdown fences if the model ignores our instruction."""
    lines = text.strip().splitlines()
    if lines and lines[0].startswith("```"):
        lines = lines[1:]
    if lines and lines[-1].strip() == "```":
        lines = lines[:-1]
    return "\n".join(lines)


def main():
    parser = argparse.ArgumentParser(description="Generate unit tests with Ollama")
    parser.add_argument("source", help="Path to the source file")
    parser.add_argument("-m", "--model", default=DEFAULT_MODEL, help="Ollama model name")
    parser.add_argument("-o", "--output", help="Output test file path")
    args = parser.parse_args()

    source = Path(args.source)
    if not source.exists():
        sys.exit(f"File not found: {source}")

    language = detect_language(source)
    code = source.read_text()

    print(f"Generating {language} tests for {source.name} using {args.model}...")
    prompt = build_prompt(code, source.name, language)
    raw = call_ollama(prompt, args.model)
    test_code = clean_output(raw)

    if args.output:
        out_path = Path(args.output)
    else:
        prefix = "test_" if language == "python" else ""
        ext = TEST_EXT[language]
        out_path = source.parent / f"{prefix}{source.stem}{ext}"

    out_path.write_text(test_code)
    print(f"Tests written to {out_path}")


if __name__ == "__main__":
    main()

That’s roughly 80 lines of code. No frameworks, no abstractions — just file I/O, an HTTP call, and string handling.

Usage

Generate tests for a Python file:

python generate_tests.py utils.py
# → writes test_utils.py

python generate_tests.py utils.py -o tests/test_utils.py
# → writes to specific path

Generate tests for JavaScript:

python generate_tests.py helpers.js
# → writes helpers.test.js

python generate_tests.py src/math.js -m codellama:13b
# → uses a different model

Example Run

Say you have this calculator.py:

def add(a, b):
    return a + b

def divide(a, b):
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b

Running python generate_tests.py calculator.py produces something like:

from calculator import add, divide
import pytest

def test_add_positive_numbers():
    """Test addition with positive integers."""
    assert add(2, 3) == 5

def test_add_negative_numbers():
    """Test addition with negative values."""
    assert add(-1, -1) == -2

def test_add_zero():
    """Test addition with zero."""
    assert add(0, 5) == 5

def test_divide_normal():
    """Test standard division."""
    assert divide(10, 2) == 5.0

def test_divide_by_zero():
    """Test that dividing by zero raises ValueError."""
    with pytest.raises(ValueError):
        divide(1, 0)

def test_divide_result_is_float():
    """Test that division returns a float."""
    assert isinstance(divide(7, 2), float)

Runnable out of the box with pytest. The model picks up the zero-division guard and tests it automatically.

Tips for Better Results

Pick the right model. Larger models produce better tests. qwen2.5-coder:7b hits a good speed/quality balance. For complex codebases, step up to a 13B or 32B parameter model. Our coding model comparison has benchmarks.

Keep source files focused. A 50-line utility file gets better tests than a 500-line monolith. If your files are large, split them or pass specific functions.

Review and run the output. AI-generated tests are a starting point. Run them, check assertions make sense, and add cases the model missed. This still saves 70-80% of the manual effort.

Chain with code review. Pair this with a local AI code review setup to catch issues in both your source code and the generated tests.

Adjust the temperature. For more deterministic output, add "temperature": 0.2 to the Ollama request JSON. For more creative edge-case exploration, bump it to 0.7.

Extending the Script

A few ideas if you want to take this further:

Watch mode — use watchdog to regenerate tests whenever a source file changes
Batch processing — accept a directory and generate tests for every file in it
Coverage feedback loop — run the generated tests, feed the coverage report back to the model, and ask it to fill gaps
TypeScript support — add a TypeScript prompt template and map .ts files to it

Wrapping Up

You now have a working local test generator that runs entirely on your hardware. No API keys, no usage limits, no code leaving your machine. The script is intentionally minimal — extend it to fit your workflow.

The combination of Ollama’s local inference and a well-tuned prompt gets you surprisingly far. For most utility code, the generated tests are ready to commit with minor tweaks.

Start with your simplest utility files, verify the output, and gradually trust it with more complex code. Your test coverage will thank you.

Generate Unit Tests with Ollama — Never Write Tests Manually Again (2026)

Prerequisites

The Prompt Template

The Complete Script

Usage

Example Run

Tips for Better Results

Extending the Script

Wrapping Up

📬 AI Dev Weekly

You might also like

Build an AI-Powered Cron Job Monitor That Explains Failures

Build a Local AI Image Describer — Vision Models + Ollama

Build an AI Expense Tracker That Reads Your Bank CSV Files

Build a CLI That Generates README Files From Your Code