
Agent Memory Patterns: How to Give AI Agents Long-Term Context


AI agents have amnesia. Every session starts from zero. The agent that brilliantly debugged your auth system yesterday has no memory of it today. Here are 4 patterns for fixing that.

Pattern 1: Conversation history (simplest)

Store the full conversation and replay it at the start of each session.

import json

def load_history(session_id):
    try:
        with open(f"history/{session_id}.json") as f:
            return json.load(f)
    except FileNotFoundError:
        return []

def save_history(session_id, messages):
    with open(f"history/{session_id}.json", "w") as f:
        json.dump(messages, f)

# Resume a conversation
messages = load_history("project-auth-refactor")
messages.append({"role": "user", "content": "Continue where we left off"})
response = call_llm(messages)
messages.append({"role": "assistant", "content": response})
save_history("project-auth-refactor", messages)

Pros: Simple, preserves full context. Cons: The context window fills up fast. With a 200K-token window and code-heavy exchanges averaging around 4K tokens each, you get roughly 50 back-and-forth exchanges before hitting the limit.

When to use: Short-lived tasks (1-3 sessions), debugging sessions, code reviews.
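
One way to delay hitting the window limit is to replay only as much history as fits a token budget. The sketch below is illustrative: `estimate_tokens` and `trim_to_budget` are hypothetical helpers, and the 4-characters-per-token figure is a rough assumption, so swap in your model's tokenizer for real counts.

```python
def estimate_tokens(message):
    # Rough heuristic (an assumption): about 4 characters per token for English text.
    return len(message["content"]) // 4 + 1


def trim_to_budget(messages, max_tokens):
    """Keep the most recent messages whose estimated size fits in max_tokens."""
    kept = []
    used = 0
    # Walk backwards so the newest messages are kept first.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    {"role": "user", "content": "a" * 400},       # ~101 tokens
    {"role": "assistant", "content": "b" * 400},  # ~101 tokens
    {"role": "user", "content": "c" * 40},        # ~11 tokens
]
trimmed = trim_to_budget(history, max_tokens=120)
print([m["content"][0] for m in trimmed])  # ['b', 'c']
```

Trimming silently drops the oldest messages, which is exactly the information loss the next pattern tries to soften.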

Pattern 2: Summarized memory

Periodically summarize old conversations and keep only the summary + recent messages.

def compress_history(messages, keep_recent=10):
    if len(messages) <= keep_recent * 2:
        return messages
    
    old_messages = messages[:-keep_recent]
    recent_messages = messages[-keep_recent:]
    
    summary = call_llm([
        {"role": "system", "content": "Summarize this conversation. Include key decisions, code changes, and unresolved issues."},
        {"role": "user", "content": json.dumps(old_messages)}
    ])
    
    return [
        {"role": "system", "content": f"Previous conversation summary:\n{summary}"},
        *recent_messages
    ]

Pros: Fits in any context window, preserves key information. Cons: Lossy; details get dropped during summarization.

When to use: Multi-day projects, ongoing agent tasks, the AI Startup Race agents.
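
"Periodically" can be made concrete by compressing only when the history outgrows a token budget, rather than on every turn. The following is a sketch under stated assumptions: `compress_if_needed` and the `summarize` callable are hypothetical names (the callable could be a thin wrapper around `call_llm`), and the token estimate is a crude 4-characters-per-token guess.

```python
import json


def estimated_tokens(messages):
    # Rough heuristic (an assumption): ~4 characters per token.
    return sum(len(m["content"]) for m in messages) // 4


def compress_if_needed(messages, summarize, max_tokens=8000, keep_recent=10):
    """Summarize older messages only once the history outgrows max_tokens.

    `summarize` is a hypothetical callable: it takes a string and returns
    a summary string.
    """
    if estimated_tokens(messages) <= max_tokens or len(messages) <= keep_recent:
        return messages

    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = summarize(json.dumps(old))
    return [
        {"role": "system", "content": f"Previous conversation summary:\n{summary}"},
        *recent,
    ]


# Stand-in summarizer so the example runs without an API key.
def fake_summarize(text):
    return f"({len(text)} characters of earlier conversation condensed)"


history = [{"role": "user", "content": "x" * 100} for _ in range(20)]
compressed = compress_if_needed(history, fake_summarize, max_tokens=200, keep_recent=4)
print(len(compressed))  # 5: one summary message plus the 4 most recent
```

Passing the summarizer in as an argument also makes the compression step easy to test without calling a model.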

Pattern 3: Vector store

Store every interaction as an embedding in a vector database. When the agent needs context, retrieve the most relevant past interactions.

import uuid

from openai import OpenAI
import chromadb

client = OpenAI()
db = chromadb.PersistentClient(path="./agent_memory")
collection = db.get_or_create_collection("memories")

def remember(text, metadata=None):
    embedding = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    ).data[0].embedding

    collection.add(
        documents=[text],
        embeddings=[embedding],
        # Omit metadata when there is none; some chromadb versions reject empty dicts
        metadatas=[metadata] if metadata else None,
        # Random IDs avoid collisions when two memories are stored in the same second
        ids=[f"mem_{uuid.uuid4().hex}"]
    )

def recall(query, n_results=5):
    embedding = client.embeddings.create(
        model="text-embedding-3-small",
        input=query
    ).data[0].embedding
    
    results = collection.query(query_embeddings=[embedding], n_results=n_results)
    return results["documents"][0]

Pros: Scales to thousands of memories, retrieves only relevant context. Cons: Requires embeddings infrastructure, retrieval quality varies.

When to use: Long-running agents, knowledge-heavy tasks, agents that need to reference past decisions.

See our RAG guide and vector database comparison for implementation details.
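
To see what the vector store is doing under the hood, here is a toy in-memory version that runs without any API or database. It is only an illustration: `toy_embed` is a bag-of-words stand-in for a real embedding model, and `ToyMemory` is a hypothetical class, not part of chromadb or the OpenAI client.

```python
import math
from collections import Counter


def toy_embed(text):
    # Stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyMemory:
    def __init__(self):
        self.items = []  # list of (text, vector) pairs

    def remember(self, text):
        self.items.append((text, toy_embed(text)))

    def recall(self, query, n_results=2):
        query_vec = toy_embed(query)
        # Rank every stored memory by similarity to the query, best first.
        ranked = sorted(
            self.items,
            key=lambda item: cosine(query_vec, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:n_results]]


memory = ToyMemory()
memory.remember("Switched login from JWT to session tokens")
memory.remember("Added rate limiting to the login endpoint")
memory.remember("Upgraded the CSS build pipeline")
print(memory.recall("rate limiting on login", n_results=1))
# ['Added rate limiting to the login endpoint']
```

A real embedding model also matches paraphrases that share no words with the query, which this word-count version cannot do.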

Pattern 4: Structured state (most reliable)

Store agent state as structured data (JSON, database rows) rather than natural language. The agent reads and writes specific fields.

# agent_state.json
{
    "project": "auth-refactor",
    "status": "in_progress",
    "completed_tasks": [
        "Migrated from JWT to session tokens",
        "Added rate limiting to login endpoint"
    ],
    "pending_tasks": [
        "Add 2FA support",
        "Write integration tests"
    ],
    "decisions": {
        "auth_method": "session tokens (chosen over JWT for revocation support)",
        "rate_limit": "10 requests/minute per IP"
    },
    "known_issues": [
        "Redis session store needs connection pooling"
    ]
}

Inject this state into the system prompt:

state = load_state("auth-refactor")
system_prompt = f"""You are working on the auth-refactor project.

Current state:
{json.dumps(state, indent=2)}

Continue from where you left off. Update the state file when you complete tasks or make decisions."""

Pros: Precise, no information loss, easy to inspect and debug. Cons: Requires defining the state schema upfront, and the agent must be explicitly instructed (or given a tool) to update it.

When to use: Production agents, multi-agent systems, any agent that runs for more than a few sessions.
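
The prompt snippet above calls `load_state`, which is not defined in this article. A minimal sketch of what it and its counterpart might look like follows, assuming one JSON file per project. `STATE_DIR`, `state_path`, `save_state`, and `complete_task` are hypothetical names, and the default schema simply mirrors the fields in the example state file.

```python
import json
import os
import tempfile

STATE_DIR = "state"  # hypothetical location for state files


def state_path(project_id):
    return os.path.join(STATE_DIR, f"{project_id}.json")


def load_state(project_id):
    try:
        with open(state_path(project_id)) as f:
            return json.load(f)
    except FileNotFoundError:
        # Default schema mirrors the fields in the example state file.
        return {
            "project": project_id,
            "status": "in_progress",
            "completed_tasks": [],
            "pending_tasks": [],
            "decisions": {},
            "known_issues": [],
        }


def save_state(project_id, state):
    os.makedirs(STATE_DIR, exist_ok=True)
    # Write to a temp file, then rename, so a crash never leaves half-written JSON.
    fd, tmp = tempfile.mkstemp(dir=STATE_DIR, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f, indent=2)
    os.replace(tmp, state_path(project_id))


def complete_task(project_id, task):
    """Move a task from pending to completed and persist the change."""
    state = load_state(project_id)
    if task in state["pending_tasks"]:
        state["pending_tasks"].remove(task)
    if task not in state["completed_tasks"]:
        state["completed_tasks"].append(task)
    save_state(project_id, state)
    return state
```

Exposing a narrow function like `complete_task` to the agent as a tool, instead of letting it rewrite the whole file, is one way to keep updates valid against the schema.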

Which pattern to use

| Scenario | Pattern | Why |
| --- | --- | --- |
| Quick debugging session | Conversation history | Simple, full context |
| Multi-day coding project | Summarized memory | Fits context window |
| Knowledge assistant | Vector store | Semantic retrieval |
| Production agent | Structured state | Reliable, debuggable |
| Complex agent system | Structured state + vector store | Best of both |

Combining patterns

The most robust agents use multiple patterns:

def build_agent_context(project_id, current_task):
    # 1. Structured state (always included)
    state = load_state(project_id)
    
    # 2. Relevant memories (semantic search)
    memories = recall(current_task, n_results=3)
    
    # 3. Recent conversation (last 5 exchanges)
    recent = load_recent_history(project_id, limit=5)
    
    system_prompt = f"""Project state:
{json.dumps(state, indent=2)}

Relevant past context:
{chr(10).join(memories)}

Continue working on: {current_task}"""
    
    return [{"role": "system", "content": system_prompt}] + recent

This gives the agent: structured knowledge of the project, semantic recall of relevant past work, and recent conversation context.
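
`build_agent_context` relies on `load_recent_history`, which is not defined above. One possible version is sketched here, assuming history is stored the way Pattern 1 stores it (one JSON list of messages per ID under `history/`). The `history_dir` parameter is an addition for illustration.

```python
import json


def load_recent_history(project_id, limit=5, history_dir="history"):
    """Return the last `limit` exchanges (user + assistant pairs) for a project."""
    if limit <= 0:
        return []
    try:
        with open(f"{history_dir}/{project_id}.json") as f:
            messages = json.load(f)
    except FileNotFoundError:
        return []

    # Drop stored system messages; build_agent_context supplies its own.
    dialogue = [m for m in messages if m["role"] != "system"]
    recent = dialogue[-limit * 2:]
    # Start on a user turn so the slice does not open with an orphaned reply.
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return recent
```

Counting in exchanges rather than raw messages keeps each user question paired with its answer.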

Related: How to Build Multi-Agent Systems · Agent Orchestration Patterns · What is RAG? · What is a Vector Database? · Tool Calling Patterns