Jun 21, 2026 · 6 min read

Build an AI API Documentation Generator From Your Codebase

API documentation is the thing everyone agrees is important and nobody wants to write. Routes get added, parameters change, response shapes evolve — and the docs stay frozen in time, three sprints behind reality.

What if you could point a script at your codebase and get an OpenAPI spec back? Not a perfect one — but a solid first draft that captures every route, its method, path parameters, and likely request/response shapes.

That’s what we’re building: a Python CLI that scans Express or FastAPI codebases, extracts route signatures, sends them to a local LLM via Ollama, and outputs an OpenAPI 3.0 spec as YAML. Your code never leaves your machine.

What We’re Building

A single Python script — docgen.py — that:

Recursively scans a directory for .py and .js/.ts files
Extracts REST API route definitions using regex
Sends the extracted routes to a local Ollama model
Outputs a complete OpenAPI 3.0 YAML spec

Prerequisites

Python 3.10+
Ollama installed and running
A code-capable model — codellama:13b or qwen2.5-coder:14b work well (see best models for coding locally)

ollama pull codellama:13b

Install dependencies:

pip install requests pyyaml

Step 1 — Extract Routes From Source Files

The first job is finding route definitions. Express uses patterns like app.get('/users', ...) or router.post('/items/:id', ...). FastAPI uses decorators like @app.get("/users"). We’ll handle both with regex.

Create docgen.py:

#!/usr/bin/env python3
"""Generate OpenAPI docs from Express/FastAPI routes using Ollama."""

import re
import sys
import os
import requests
import yaml

# Patterns for route extraction
EXPRESS_PATTERN = re.compile(
    r'(?:app|router)\.(get|post|put|patch|delete)\s*\(\s*[\'"]([^\'"]+)[\'"]',
    re.IGNORECASE,
)
FASTAPI_PATTERN = re.compile(
    r'@\w+\.(get|post|put|patch|delete)\s*\(\s*[\'"]([^\'"]+)[\'"]',
    re.IGNORECASE,
)

EXTENSIONS = {".py", ".js", ".ts"}
SKIP_DIRS = {"node_modules", ".git", "__pycache__", "venv", ".venv", "dist"}


def find_routes(directory: str) -> list[dict]:
    """Walk a directory tree and extract API route signatures."""
    routes = []
    for root, dirs, files in os.walk(directory):
        # Prune skipped directories in place so os.walk never descends into them
        dirs[:] = [d for d in dirs if d not in SKIP_DIRS]
        for fname in files:
            ext = os.path.splitext(fname)[1]
            if ext not in EXTENSIONS:
                continue
            filepath = os.path.join(root, fname)
            with open(filepath, "r", encoding="utf-8", errors="ignore") as f:
                content = f.read()

            # Pick one pattern per file type. Running both on every file would
            # report each FastAPI route twice, because EXPRESS_PATTERN also
            # matches the "app.get(" inside "@app.get(".
            pattern = FASTAPI_PATTERN if ext == ".py" else EXPRESS_PATTERN
            for match in pattern.finditer(content):
                method, path = match.group(1).upper(), match.group(2)
                # Grab surrounding characters for context
                start = max(0, match.start() - 200)
                end = min(len(content), match.end() + 300)
                context = content[start:end].strip()
                routes.append({
                    "method": method,
                    "path": path,
                    "file": os.path.relpath(filepath, directory),
                    "context": context,
                })
    return routes

The context field captures the code around each route: up to 200 characters before the match and 300 after, which typically covers the start of the handler, parameter validation, and response calls. This gives the LLM some signal to infer request/response schemas.
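
To see what the two patterns actually pick up, the sketch below runs them against two small snippets. The patterns are copied from the script; the sample sources and the matches helper are made up for illustration.

```python
import re

EXPRESS_PATTERN = re.compile(
    r'(?:app|router)\.(get|post|put|patch|delete)\s*\(\s*[\'"]([^\'"]+)[\'"]',
    re.IGNORECASE,
)
FASTAPI_PATTERN = re.compile(
    r'@\w+\.(get|post|put|patch|delete)\s*\(\s*[\'"]([^\'"]+)[\'"]',
    re.IGNORECASE,
)

# Hypothetical sample sources, not taken from a real project
EXPRESS_SAMPLE = """
const router = express.Router();
router.get('/users', listUsers);
router.post('/users/:id/avatar', upload.single('file'), setAvatar);
"""

FASTAPI_SAMPLE = '''
@app.get("/items/{item_id}")
async def read_item(item_id: int):
    return {"item_id": item_id}
'''


def matches(pattern: re.Pattern, source: str) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs for every match of pattern in source."""
    return [(m.group(1).upper(), m.group(2)) for m in pattern.finditer(source)]


express_routes = matches(EXPRESS_PATTERN, EXPRESS_SAMPLE)
fastapi_routes = matches(FASTAPI_PATTERN, FASTAPI_SAMPLE)
print(express_routes)
print(fastapi_routes)
```

One thing worth knowing: EXPRESS_PATTERN also matches the FastAPI sample, because the text "@app.get(" contains "app.get(". Applying both patterns to the same Python file therefore reports each decorator twice, so it is worth choosing the pattern by file extension.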

Step 2 — Build the Prompt

The prompt is where the magic happens. We need to be specific about the output format — LLMs are much more reliable when you tell them exactly what structure to produce.

def build_prompt(routes: list[dict]) -> str:
    """Build a prompt that asks the LLM to generate an OpenAPI spec."""
    route_descriptions = []
    for r in routes:
        route_descriptions.append(
            f"### {r['method']} {r['path']}\n"
            f"File: {r['file']}\n"
            f"```\n{r['context']}\n```"
        )

    routes_block = "\n\n".join(route_descriptions)

    return f"""Analyze these API route definitions and generate a complete OpenAPI 3.0 specification in YAML format.

For each route:
- Infer path parameters from the URL (e.g., :id or {{id}})
- Infer request body schema from the code context when visible
- Infer response schema from the code context when visible
- Add a brief description based on the route's purpose
- Use sensible HTTP status codes

Output ONLY valid YAML — no markdown fences, no explanation.

Routes found:

{routes_block}"""
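
The prompt leaves the :id to {id} conversion to the model. If you would rather not depend on that, the conversion can be done deterministically before the prompt is built. The helpers below are a sketch: to_openapi_path and path_param_names are hypothetical names, not part of docgen.py, and they assume simple Express parameters that start a path segment (no regex constraints or optional markers).

```python
import re

# Express-style parameter at the start of a segment, e.g. "/:id".
# The lookbehind keeps FastAPI converters such as "{file_path:path}" untouched.
_EXPRESS_PARAM = re.compile(r"(?<=/):([A-Za-z_][A-Za-z0-9_]*)")
# Brace-style parameter, e.g. "{id}" or "{file_path:path}"
_BRACE_PARAM = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)")


def to_openapi_path(path: str) -> str:
    """Rewrite Express ':name' segments as OpenAPI '{name}' segments."""
    return _EXPRESS_PARAM.sub(r"{\1}", path)


def path_param_names(path: str) -> list[str]:
    """List the parameter names in a path, in order of appearance."""
    return _BRACE_PARAM.findall(to_openapi_path(path))


print(to_openapi_path("/users/:userId/posts/:postId"))
print(path_param_names("/users/:userId/posts/:postId"))
```

With the names extracted up front, the prompt could list them per route, which leaves the model less to guess.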

Step 3 — Call Ollama and Generate the Spec

Now we send the prompt to Ollama’s API and capture the YAML output:

OLLAMA_URL = "http://localhost:11434/api/generate"


def generate_spec(routes: list[dict], model: str = "codellama:13b") -> str:
    """Send routes to Ollama and get back an OpenAPI YAML spec."""
    prompt = build_prompt(routes)

    response = requests.post(
        OLLAMA_URL,
        json={
            "model": model,
            "prompt": prompt,
            "stream": False,
            "options": {"temperature": 0.2, "num_predict": 4096},
        },
        timeout=120,
    )
    response.raise_for_status()
    return response.json()["response"]


def clean_yaml_output(raw: str) -> str:
    """Strip markdown fences if the model wraps the output."""
    raw = raw.strip()
    if raw.startswith("```"):
        raw = re.sub(r"^```\w*\n?", "", raw)
        raw = re.sub(r"\n?```$", "", raw)
    return raw.strip()

We set temperature to 0.2 — low enough for consistent, structured output but not so low that the model gets repetitive. The num_predict cap of 4096 tokens is usually enough for 10-20 routes.
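
Because num_predict caps the output length, a project with many routes can come back as a truncated spec. One way around that is to split the routes into batches, generate a spec per batch, and merge the paths sections afterwards. The sketch below shows only the splitting and merging; chunk_routes and merge_specs are hypothetical helpers, the batch size of 10 is an assumption rather than a measured value, and shared components sections are not merged.

```python
def chunk_routes(routes: list[dict], size: int = 10) -> list[list[dict]]:
    """Split the route list into consecutive batches of at most `size` routes."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [routes[i:i + size] for i in range(0, len(routes), size)]


def merge_specs(specs: list, title: str = "API Documentation") -> dict:
    """Combine the 'paths' of several parsed specs under one fixed header."""
    merged = {
        "openapi": "3.0.0",
        "info": {"title": title, "version": "1.0.0"},
        "paths": {},
    }
    for spec in specs:
        if not isinstance(spec, dict):
            continue  # a batch that did not parse into a mapping is skipped
        paths = spec.get("paths")
        if not isinstance(paths, dict):
            continue
        for path, operations in paths.items():
            if isinstance(operations, dict):
                # Routes sharing a path (GET and POST /users) may arrive in different batches
                merged["paths"].setdefault(path, {}).update(operations)
    return merged


batches = chunk_routes([{"path": f"/r{i}"} for i in range(25)], size=10)
print([len(b) for b in batches])
```

In the script, each batch would go through generate_spec, clean_yaml_output, and yaml.safe_load before the parsed results are handed to merge_specs.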

Step 4 — Wire It All Together

def main():
    if len(sys.argv) < 2:
        print("Usage: python docgen.py <project-directory> [--model MODEL] [--output FILE]")
        sys.exit(1)

    directory = sys.argv[1]
    model = "codellama:13b"
    output_file = None

    args = sys.argv[2:]
    for i, arg in enumerate(args):
        if arg == "--model" and i + 1 < len(args):
            model = args[i + 1]
        elif arg == "--output" and i + 1 < len(args):
            output_file = args[i + 1]

    if not os.path.isdir(directory):
        print(f"Error: {directory} is not a directory")
        sys.exit(1)

    print(f"Scanning {directory} for API routes...")
    routes = find_routes(directory)

    if not routes:
        print("No routes found. Supported frameworks: Express, FastAPI.")
        sys.exit(0)

    print(f"Found {len(routes)} routes:")
    for r in routes:
        print(f"  {r['method']:6s} {r['path']:30s} ({r['file']})")

    print(f"\nGenerating OpenAPI spec with {model}...")
    raw_output = generate_spec(routes, model)
    spec_yaml = clean_yaml_output(raw_output)

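    # Check that the output parses as YAML and is a mapping
    try:
        parsed = yaml.safe_load(spec_yaml)
        if isinstance(parsed, dict):
            # Re-dump for consistent formatting
            spec_yaml = yaml.dump(parsed, default_flow_style=False, sort_keys=False)
        else:
            # Plain prose can parse as a YAML string, so a successful load alone proves little
            print("Warning: LLM output is not a YAML mapping. Saving raw output.")
    except yaml.YAMLError as e:
        print(f"Warning: LLM output wasn't valid YAML. Saving raw output. Error: {e}")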

    if output_file:
        with open(output_file, "w") as f:
            f.write(spec_yaml)
        print(f"Spec written to {output_file}")
    else:
        print("\n---")
        print(spec_yaml)


if __name__ == "__main__":
    main()

Running It

Point it at any Express or FastAPI project:

# Scan and print to stdout
python docgen.py ./my-express-app

# Save to a file
python docgen.py ./my-fastapi-project --output openapi.yaml

# Use a different model
python docgen.py ./my-app --model qwen2.5-coder:14b --output docs.yaml

Example output for a small Express app:

openapi: "3.0.0"
info:
  title: API Documentation
  version: "1.0.0"
paths:
  /users:
    get:
      summary: List all users
      responses:
        "200":
          description: Array of user objects
          content:
            application/json:
              schema:
                type: array
                items:
                  type: object
                  properties:
                    id:
                      type: integer
                    name:
                      type: string
                    email:
                      type: string
    post:
      summary: Create a new user
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                name:
                  type: string
                email:
                  type: string
      responses:
        "201":
          description: User created successfully
  /users/{id}:
    get:
      summary: Get user by ID
      parameters:
        - name: id
          in: path
          required: true
          schema:
            type: integer
      responses:
        "200":
          description: User object
        "404":
          description: User not found

Tips for Better Results

Give the model more context. The script captures 200 characters before and 300 after each route match. If your route handlers are in separate files (common in Express), that slice might miss the actual logic. You can increase those numbers or refactor the script to follow imports.
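
A fixed character count can also start or end in the middle of a line. The sketch below makes the two sizes configurable and extends the slice to the nearest line boundaries. extract_context is a hypothetical helper, not part of the script above; it would take match.start() and match.end() from find_routes.

```python
def extract_context(
    content: str, start: int, end: int, before: int = 200, after: int = 300
) -> str:
    """Return the text around content[start:end], widened to whole lines."""
    lo = max(0, start - before)
    hi = min(len(content), end + after)
    # Move lo back to the start of its line (rfind gives -1 when there is no newline)
    lo = content.rfind("\n", 0, lo) + 1
    # Move hi forward to the end of its line
    newline = content.find("\n", hi)
    hi = len(content) if newline == -1 else newline
    return content[lo:hi].strip()


sample = "line one\nline two\napp.get('/x', h)\nline four\nline five"
start = sample.index("app.get")
end = start + len("app.get('/x'")
print(extract_context(sample, start, end, before=3, after=3))
```

Larger values give the model more to work with, at the cost of a longer prompt per route.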

Use a larger model for complex APIs. codellama:13b handles straightforward CRUD routes well. For APIs with nested schemas, authentication middleware, or complex validation (Zod, Pydantic), step up to a 30B+ model if your hardware supports it.

Iterate on the output. Treat the generated spec as a first draft. Load it into Swagger Editor to validate and refine. The LLM gets the structure right 80-90% of the time — you’re filling in the gaps, not starting from scratch.
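
Before opening an editor, a quick structural check can catch the most common gaps. The sketch below tests only the fields that OpenAPI 3.0 marks as required at the top level and per operation; check_spec is a hypothetical helper and is not a substitute for a real validator.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def check_spec(spec: object) -> list[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    if not isinstance(spec, dict):
        return ["top level is not a mapping"]
    problems = []
    if not str(spec.get("openapi", "")).startswith("3."):
        problems.append("missing or unexpected 'openapi' version")
    info = spec.get("info")
    if not isinstance(info, dict) or "title" not in info or "version" not in info:
        problems.append("'info' needs 'title' and 'version'")
    paths = spec.get("paths")
    if not isinstance(paths, dict):
        problems.append("'paths' is missing or not a mapping")
        return problems
    for path, item in paths.items():
        if not str(path).startswith("/"):
            problems.append(f"path {path!r} does not start with '/'")
        if not isinstance(item, dict):
            problems.append(f"path {path!r} has no operations")
            continue
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue  # e.g. a path-level 'parameters' or 'summary' key
            if not (isinstance(operation, dict) and operation.get("responses")):
                problems.append(f"{method.upper()} {path} has no responses")
    return problems


draft = {"openapi": "3.0.0", "info": {"title": "API"}, "paths": {"users": {"get": {}}}}
for problem in check_spec(draft):
    print(problem)
```

In main(), this could run on the parsed YAML right after yaml.safe_load, printing the problems as warnings.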

Run it in CI. Add the script to your pipeline to detect undocumented routes. Compare the generated route list against your existing spec to catch drift.
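
The drift check does not need the LLM at all: it only compares the routes found by the regex scan with the paths in an existing spec. The sketch below assumes routes in the shape returned by find_routes and a spec already loaded with yaml.safe_load. route_key and undocumented_routes are hypothetical helpers, and parameter names are ignored so that /users/:id and /users/{userId} count as the same route.

```python
import re

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def route_key(method: str, path: str) -> tuple[str, str]:
    """Normalize a route so Express and OpenAPI spellings compare equal."""
    path = re.sub(r"(?<=/):\w+", "{}", path)   # Express ":id" -> "{}"
    path = re.sub(r"\{[^}]*\}", "{}", path)    # OpenAPI "{userId}" -> "{}"
    return method.upper(), path.rstrip("/") or "/"


def undocumented_routes(routes: list[dict], spec: dict) -> list[tuple[str, str]]:
    """Return (METHOD, path) for every scanned route missing from the spec."""
    documented = set()
    for path, item in (spec.get("paths") or {}).items():
        if not isinstance(item, dict):
            continue
        for method in item:
            if str(method).lower() in HTTP_METHODS:
                documented.add(route_key(str(method), path))
    missing = []
    for r in routes:
        if route_key(r["method"], r["path"]) not in documented:
            missing.append((r["method"].upper(), r["path"]))
    return missing


routes = [
    {"method": "GET", "path": "/users"},
    {"method": "GET", "path": "/users/:id"},
    {"method": "DELETE", "path": "/users/:id"},
]
spec = {"paths": {"/users": {"get": {}}, "/users/{userId}": {"get": {}, "parameters": []}}}
print(undocumented_routes(routes, spec))
```

A CI job could exit with a non-zero status when the list is not empty. Keep in mind that the regex scan sees only the path string passed to each route call, so prefixes added when a router is mounted elsewhere will not appear in the scanned paths.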

Why Local?

Running this through Ollama instead of a cloud API means your source code never leaves your machine. That matters when you’re working with proprietary codebases, client projects under NDA, or anything in a regulated industry. For a deeper look at local AI for code tasks, see our local AI code review guide.

The trade-off is speed — a 13B model on a MacBook Pro takes 15-30 seconds per generation. For a one-off documentation task, that’s fine. If you’re running this on every commit, consider a smaller model or batching routes.

What You Learned

How to extract API route signatures from Express and FastAPI codebases using regex
How to structure prompts for consistent, structured YAML output from an LLM
How to call Ollama’s API from Python and handle the response
How to validate and clean LLM-generated YAML with PyYAML

The full script is roughly 160 lines. You can extend it with support for Hono, Koa, Django REST Framework, or any other framework by adding another regex pattern. The LLM doesn’t care where the routes came from; it just needs the method, path, and surrounding code.
