An AI agent without state management is like a goldfish — every interaction starts from zero. The user explains their project, the agent helps, and next time the user comes back, the agent has forgotten everything. In production, this is unacceptable.
State management for AI agents is harder than for traditional apps because you’re managing three types of state simultaneously: conversation history (what was said), tool state (what was done), and world state (what changed in the environment).
The three types of agent state
1. Conversation state
The chat history between user and agent. This is what most people think of when they say “memory.”
```python
# Simplest form: a list of messages
conversation = [
    {"role": "user", "content": "I'm building a SaaS with Stripe billing"},
    {"role": "assistant", "content": "I'll help with that. What framework are you using?"},
    {"role": "user", "content": "Next.js with the App Router"},
]
```
The challenge: conversation history grows with every turn. A 50-turn conversation can be 20,000+ tokens. At some point, you need to summarize or truncate.
2. Tool state
What the agent has done: files read, APIs called, code executed. This matters for multi-step tasks where the agent needs to remember what it already tried.
tool_state = {
"files_read": ["src/auth/middleware.ts", "src/lib/stripe.ts"],
"files_modified": ["src/auth/middleware.ts"],
"commands_run": ["npm test -- --grep auth"],
"test_results": {"passed": 12, "failed": 2},
"current_task": "fixing the 2 failing auth tests",
}
3. World state
The external environment: database contents, file system state, deployment status. This is the hardest to manage because it changes independently of the agent.
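Because world state drifts underneath the agent, cached observations have to be verified before they are acted on. One defensive technique is drift detection: snapshot what the agent last saw, then re-check before the next action. A minimal sketch for file-system state (`snapshot_files` and `detect_drift` are illustrative names, not from any framework):

```python
import hashlib
from pathlib import Path

def snapshot_files(paths: list[str]) -> dict[str, str]:
    """Record a content hash per file so later changes can be detected."""
    return {
        p: hashlib.sha256(Path(p).read_bytes()).hexdigest()
        for p in paths
        if Path(p).exists()
    }

def detect_drift(snapshot: dict[str, str]) -> list[str]:
    """Return files whose contents changed (or disappeared) since the snapshot."""
    current = snapshot_files(list(snapshot))
    return [p for p, digest in snapshot.items() if current.get(p) != digest]
```

If `detect_drift` returns anything, the agent should re-read those files before editing them rather than trusting its cached view.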
Implementation patterns
Pattern 1: Session-based (simplest)
Store conversation history per session. Each session is independent.
```python
from agents import Runner
from agents.extensions.memory import SQLAlchemySession

# One session per user conversation
session = SQLAlchemySession.from_url(
    f"user_{user_id}_session_{session_id}",
    url="postgresql+asyncpg://user:pass@localhost/agents",
    create_tables=True,  # Create the session tables if they don't exist
)

result = await Runner.run(agent, message, session=session)
# The session automatically persists conversation history
```
The OpenAI Agents SDK supports multiple session backends:
| Backend | Best for | Persistence | Speed |
|---|---|---|---|
| SQLAlchemy (PostgreSQL) | Production | ✅ Durable | Good |
| SQLAlchemy (SQLite) | Development | ✅ File-based | Fast |
| Redis | High-throughput | Configurable TTL | Fastest |
| Encrypted Session | Sensitive data | ✅ Encrypted at rest | Good |
| In-memory | Testing only | ❌ Lost on restart | Fastest |
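To demystify what these backends have in common, here's a minimal sqlite-backed sketch of the core idea: append message items keyed by session ID, read them back in order. `MiniSession` is illustrative only, not the SDK's implementation:

```python
import json
import sqlite3

class MiniSession:
    """Minimal sketch of a persistent session store (not the SDK's code)."""

    def __init__(self, session_id: str, db_path: str = ":memory:"):
        self.session_id = session_id
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS items (session_id TEXT, item TEXT)"
        )

    def add_items(self, items: list[dict]) -> None:
        # Serialize each message as JSON under this session's ID
        self.conn.executemany(
            "INSERT INTO items VALUES (?, ?)",
            [(self.session_id, json.dumps(i)) for i in items],
        )
        self.conn.commit()

    def get_items(self) -> list[dict]:
        rows = self.conn.execute(
            "SELECT item FROM items WHERE session_id = ?", (self.session_id,)
        )
        return [json.loads(r[0]) for r in rows]
```

Every backend in the table is a variation on this pattern; they differ in where the rows live and how long they survive.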
Pattern 2: Sliding window
Keep only the last N messages in context. Older messages are stored but not sent to the model:
```python
MAX_CONTEXT_MESSAGES = 20

async def get_context(session_id: str, new_message: str):
    all_messages = await db.get_messages(session_id)
    # Always include the system prompt plus the last N messages
    context = [all_messages[0]]  # System prompt
    # Slice past index 0 so short conversations don't duplicate the system prompt
    context.extend(all_messages[1:][-MAX_CONTEXT_MESSAGES:])
    context.append({"role": "user", "content": new_message})
    return context
```
Simple but lossy. The agent forgets details from early in the conversation.
Pattern 3: Summary + recent
Summarize older messages, keep recent ones in full:
```python
async def get_context_with_summary(session_id: str):
    all_messages = await db.get_messages(session_id)
    if len(all_messages) > 30:
        # Summarize older messages
        old_messages = all_messages[1:-10]  # Skip system prompt and last 10
        summary = await summarize(old_messages)
        return [
            all_messages[0],  # System prompt
            {"role": "system", "content": f"Previous conversation summary: {summary}"},
            *all_messages[-10:],  # Last 10 messages in full
        ]
    return all_messages
```
This is what Claude Code does with its /compact command — summarize the session, replace history with the summary, keep going.
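Triggering compaction on message count, as in the `len(all_messages) > 30` check above, is crude; what actually costs money is tokens. A sketch of a token-budget trigger, assuming the common rough heuristic of ~4 characters per token for English text (an approximation, not an exact tokenizer):

```python
def estimate_tokens(messages: list[dict]) -> int:
    # Rough heuristic: ~4 characters per token for English text
    return sum(len(m["content"]) for m in messages) // 4

def needs_compaction(messages: list[dict], budget: int = 8000) -> bool:
    """Trigger summarization once the estimated context size exceeds the budget."""
    return estimate_tokens(messages) > budget
```

For precise counts you'd use the model provider's tokenizer, but a cheap estimate like this is usually enough to decide when to compact.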
Pattern 4: Checkpoints and recovery
For long-running agents (like those in our AI Startup Race), save checkpoints so the agent can recover from crashes:
```python
import json
from datetime import datetime, timezone

async def save_checkpoint(agent_id: str, state: dict):
    checkpoint = {
        "agent_id": agent_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "conversation": state["conversation"],
        "tool_state": state["tool_state"],
        "current_task": state["current_task"],
        "budget_remaining": state["budget_remaining"],
    }
    await db.execute(
        "INSERT INTO checkpoints (agent_id, data) VALUES ($1, $2)",
        agent_id, json.dumps(checkpoint),
    )

async def restore_from_checkpoint(agent_id: str):
    row = await db.fetchone(
        "SELECT data FROM checkpoints WHERE agent_id = $1 ORDER BY id DESC LIMIT 1",
        agent_id,
    )
    if row:
        return json.loads(row["data"])
    return None
```
Save checkpoints after every significant action (file write, API call, deployment). If the agent crashes, restore from the last checkpoint instead of starting over.
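The recovery loop itself can be as simple as "skip the steps the checkpoint says are done." A minimal sketch with persistence abstracted into `save(agent_id, checkpoint)` and `restore(agent_id)` callbacks (hypothetical names; in production these would wrap database helpers like the ones above):

```python
def run_with_recovery(agent_id: str, steps, save, restore):
    """Resume a list of steps from the last completed index in the checkpoint."""
    checkpoint = restore(agent_id) or {"completed": 0}
    for i, step in enumerate(steps):
        if i < checkpoint["completed"]:
            continue  # This step finished before the crash; don't redo it
        step()
        checkpoint["completed"] = i + 1
        save(agent_id, checkpoint)  # Persist after every significant action
```

Re-running after a crash is then safe: completed steps are skipped, and work resumes exactly where it stopped.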
Database schema
A practical schema for production agent state:
```sql
CREATE TABLE sessions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id TEXT NOT NULL,
    agent_name TEXT NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW(),
    metadata JSONB DEFAULT '{}'
);

CREATE TABLE messages (
    id BIGSERIAL PRIMARY KEY,
    session_id UUID REFERENCES sessions(id),
    role TEXT NOT NULL, -- 'user', 'assistant', 'system', 'tool'
    content TEXT NOT NULL,
    tokens_used INTEGER,
    cost_usd NUMERIC(10, 6),
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE tool_calls (
    id BIGSERIAL PRIMARY KEY,
    session_id UUID REFERENCES sessions(id),
    message_id BIGINT REFERENCES messages(id),
    tool_name TEXT NOT NULL,
    input JSONB,
    output JSONB,
    duration_ms INTEGER,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_messages_session ON messages(session_id, created_at);
CREATE INDEX idx_tool_calls_session ON tool_calls(session_id);
```
This gives you full audit trails: every message, every tool call, every cost. Essential for debugging agents and cost management.
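With that schema, per-session cost reporting is a single aggregate query. A sketch of the same `GROUP BY` against an SQLite stand-in for the `messages` table (column types simplified; the PostgreSQL query is identical in shape):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the messages table (SQLite types, no foreign keys)
conn.execute(
    "CREATE TABLE messages (session_id TEXT, role TEXT, tokens_used INTEGER, cost_usd REAL)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?, ?)",
    [
        ("s1", "user", 120, 0.0003),
        ("s1", "assistant", 800, 0.0120),
        ("s2", "assistant", 400, 0.0060),
    ],
)
# Per-session token and dollar totals -- the audit query this schema enables
rows = conn.execute(
    "SELECT session_id, SUM(tokens_used), SUM(cost_usd) "
    "FROM messages GROUP BY session_id ORDER BY session_id"
).fetchall()
```

The same pattern extends to per-user or per-agent rollups by joining through `sessions`.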
Cross-session memory
Sometimes agents need to remember things across sessions — user preferences, project context, learned patterns:
```python
async def get_user_context(user_id: str) -> str:
    # Fetch persistent user facts
    facts = await db.fetch(
        "SELECT fact FROM user_memory WHERE user_id = $1 ORDER BY importance DESC LIMIT 10",
        user_id,
    )
    if facts:
        return "What I know about this user:\n" + "\n".join(f"- {f['fact']}" for f in facts)
    return ""

# Inject into the system prompt
system_prompt = f"""You are a coding assistant.

{await get_user_context(user_id)}

Help the user with their current request."""
```
For deeper patterns, see our agent memory patterns guide.
State management anti-patterns
Don’t store full tool outputs in conversation history. A file read that returns 5,000 tokens shouldn’t be in the conversation forever. Store a summary: “Read src/auth.ts (450 lines, JWT middleware).”
Don’t rely on in-memory state in production. Your server will restart. Your process will crash. Always persist to disk or database.
Don’t send the entire conversation history every time. Use sliding windows or summaries. A 100-message conversation costs 50,000+ tokens per request just for context.
Don’t forget to clean up old sessions. Set TTLs on session data. A user who hasn’t interacted in 30 days doesn’t need their full conversation history in hot storage.
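The TTL sweep can be a single scheduled `DELETE`. A sketch against an SQLite stand-in for the `sessions` table, using Unix timestamps in place of `TIMESTAMPTZ` (in PostgreSQL you'd compare against `NOW() - INTERVAL '30 days'` instead):

```python
import sqlite3
import time

TTL_SECONDS = 30 * 24 * 3600  # Evict sessions idle for more than 30 days

conn = sqlite3.connect(":memory:")
# SQLite stand-in for the sessions table, with Unix timestamps for updated_at
conn.execute("CREATE TABLE sessions (id TEXT, updated_at REAL)")
now = time.time()
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [("fresh", now), ("stale", now - 40 * 24 * 3600)],
)
# The sweep itself: run this from a daily cron or scheduled job
conn.execute("DELETE FROM sessions WHERE updated_at < ?", (now - TTL_SECONDS,))
remaining = [row[0] for row in conn.execute("SELECT id FROM sessions")]
```

Archiving the deleted rows to cold storage first (instead of dropping them outright) keeps the audit trail while freeing hot storage.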
Related: Agent Memory Patterns · How to Debug AI Agents · AI Agent Cost Management · Deploy AI Agents to Production · OpenAI Agents SDK Guide · LLM Observability