Generate Unit Tests with Ollama β Never Write Tests Manually Again (2026)
Writing unit tests is one of those tasks every developer knows they should do but often skip under deadline pressure. What if you could point a script at any source file and get a complete test suite back in seconds β running entirely on your machine?
Thatβs what weβre building today. A Python CLI tool that reads a source file, sends it to Ollama with a carefully crafted prompt, and writes a ready-to-run test file. It supports both Python (pytest) and JavaScript (jest), detects the language automatically, and needs zero cloud credentials.
If youβre new to the concept, check out our primer on what AI test generation actually is before diving in. For a broader look at the landscape, see our roundup of the best AI testing tools in 2026.
Prerequisites
- Python 3.10+
- Ollama installed and running (
ollama serve) - A code-capable model pulled β
qwen2.5-coder:7borcodellama:13bwork well (see best Ollama models for coding) - The
requestslibrary (pip install requests)
The Prompt Template
The quality of generated tests lives or dies with the prompt. After testing dozens of variations, this template consistently produces runnable tests with good coverage:
PROMPT_TEMPLATES = {
"python": """You are an expert Python testing engineer.
Given the following Python source code, generate a comprehensive pytest test suite.
Rules:
- Import the module correctly based on the filename provided.
- Test every public function and class method.
- Include edge cases: empty inputs, None values, boundary conditions.
- Use descriptive test names following test_<function>_<scenario> convention.
- Add brief docstrings to each test.
- Only output valid Python code. No explanations, no markdown fences.
Filename: {filename}
Source code:
{code}""",
"javascript": """You are an expert JavaScript testing engineer.
Given the following JavaScript source code, generate a comprehensive Jest test suite.
Rules:
- Use require() to import the module based on the filename provided.
- Test every exported function.
- Include edge cases: empty inputs, undefined, null, boundary conditions.
- Use descriptive test names in it() blocks.
- Only output valid JavaScript code. No explanations, no markdown fences.
Filename: {filename}
Source code:
{code}""",
}
Key details that make this work: telling the model to skip markdown fences prevents the common problem of getting ```python wrappers in the output. Specifying the filename lets the model generate correct import paths. Demanding edge cases pushes coverage beyond the happy path.
The Complete Script
Save this as generate_tests.py:
#!/usr/bin/env python3
"""Generate unit tests from source files using Ollama."""
import argparse
import sys
from pathlib import Path
import requests
OLLAMA_URL = "http://localhost:11434/api/generate"
DEFAULT_MODEL = "qwen2.5-coder:7b"
PROMPT_TEMPLATES = {
"python": """You are an expert Python testing engineer.
Given the following Python source code, generate a comprehensive pytest test suite.
Rules:
- Import the module correctly based on the filename provided.
- Test every public function and class method.
- Include edge cases: empty inputs, None values, boundary conditions.
- Use descriptive test names following test_<function>_<scenario> convention.
- Add brief docstrings to each test.
- Only output valid Python code. No explanations, no markdown fences.
Filename: {filename}
Source code:
{code}""",
"javascript": """You are an expert JavaScript testing engineer.
Given the following JavaScript source code, generate a comprehensive Jest test suite.
Rules:
- Use require() to import the module based on the filename provided.
- Test every exported function.
- Include edge cases: empty inputs, undefined, null, boundary conditions.
- Use descriptive test names in it() blocks.
- Only output valid JavaScript code. No explanations, no markdown fences.
Filename: {filename}
Source code:
{code}""",
}
LANG_MAP = {".py": "python", ".js": "javascript", ".mjs": "javascript", ".ts": "javascript"}
TEST_EXT = {"python": ".py", "javascript": ".test.js"}
def detect_language(filepath: Path) -> str:
lang = LANG_MAP.get(filepath.suffix)
if not lang:
sys.exit(f"Unsupported file type: {filepath.suffix}")
return lang
def build_prompt(code: str, filename: str, language: str) -> str:
return PROMPT_TEMPLATES[language].format(code=code, filename=filename)
def call_ollama(prompt: str, model: str) -> str:
resp = requests.post(
OLLAMA_URL,
json={"model": model, "prompt": prompt, "stream": False},
timeout=300,
)
resp.raise_for_status()
return resp.json()["response"]
def clean_output(text: str) -> str:
"""Strip markdown fences if the model ignores our instruction."""
lines = text.strip().splitlines()
if lines and lines[0].startswith("```"):
lines = lines[1:]
if lines and lines[-1].strip() == "```":
lines = lines[:-1]
return "\n".join(lines)
def main():
parser = argparse.ArgumentParser(description="Generate unit tests with Ollama")
parser.add_argument("source", help="Path to the source file")
parser.add_argument("-m", "--model", default=DEFAULT_MODEL, help="Ollama model name")
parser.add_argument("-o", "--output", help="Output test file path")
args = parser.parse_args()
source = Path(args.source)
if not source.exists():
sys.exit(f"File not found: {source}")
language = detect_language(source)
code = source.read_text()
print(f"Generating {language} tests for {source.name} using {args.model}...")
prompt = build_prompt(code, source.name, language)
raw = call_ollama(prompt, args.model)
test_code = clean_output(raw)
if args.output:
out_path = Path(args.output)
else:
prefix = "test_" if language == "python" else ""
ext = TEST_EXT[language]
out_path = source.parent / f"{prefix}{source.stem}{ext}"
out_path.write_text(test_code)
print(f"Tests written to {out_path}")
if __name__ == "__main__":
main()
Thatβs roughly 80 lines of code. No frameworks, no abstractions β just file I/O, an HTTP call, and string handling.
Usage
Generate tests for a Python file:
python generate_tests.py utils.py
# β writes test_utils.py
python generate_tests.py utils.py -o tests/test_utils.py
# β writes to specific path
Generate tests for JavaScript:
python generate_tests.py helpers.js
# β writes helpers.test.js
python generate_tests.py src/math.js -m codellama:13b
# β uses a different model
Example Run
Say you have this calculator.py:
def add(a, b):
return a + b
def divide(a, b):
if b == 0:
raise ValueError("Cannot divide by zero")
return a / b
Running python generate_tests.py calculator.py produces something like:
from calculator import add, divide
import pytest
def test_add_positive_numbers():
"""Test addition with positive integers."""
assert add(2, 3) == 5
def test_add_negative_numbers():
"""Test addition with negative values."""
assert add(-1, -1) == -2
def test_add_zero():
"""Test addition with zero."""
assert add(0, 5) == 5
def test_divide_normal():
"""Test standard division."""
assert divide(10, 2) == 5.0
def test_divide_by_zero():
"""Test that dividing by zero raises ValueError."""
with pytest.raises(ValueError):
divide(1, 0)
def test_divide_result_is_float():
"""Test that division returns a float."""
assert isinstance(divide(7, 2), float)
Runnable out of the box with pytest. The model picks up the zero-division guard and tests it automatically.
Tips for Better Results
Pick the right model. Larger models produce better tests. qwen2.5-coder:7b hits a good speed/quality balance. For complex codebases, step up to a 13B or 32B parameter model. Our coding model comparison has benchmarks.
Keep source files focused. A 50-line utility file gets better tests than a 500-line monolith. If your files are large, split them or pass specific functions.
Review and run the output. AI-generated tests are a starting point. Run them, check assertions make sense, and add cases the model missed. This still saves 70-80% of the manual effort.
Chain with code review. Pair this with a local AI code review setup to catch issues in both your source code and the generated tests.
Adjust the temperature. For more deterministic output, add "temperature": 0.2 to the Ollama request JSON. For more creative edge-case exploration, bump it to 0.7.
Extending the Script
A few ideas if you want to take this further:
- Watch mode β use
watchdogto regenerate tests whenever a source file changes - Batch processing β accept a directory and generate tests for every file in it
- Coverage feedback loop β run the generated tests, feed the coverage report back to the model, and ask it to fill gaps
- TypeScript support β add a TypeScript prompt template and map
.tsfiles to it
Wrapping Up
You now have a working local test generator that runs entirely on your hardware. No API keys, no usage limits, no code leaving your machine. The script is intentionally minimal β extend it to fit your workflow.
The combination of Ollamaβs local inference and a well-tuned prompt gets you surprisingly far. For most utility code, the generated tests are ready to commit with minor tweaks.
Start with your simplest utility files, verify the output, and gradually trust it with more complex code. Your test coverage will thank you.