An AI agent is an LLM that can do things, not just say things. Instead of answering a question and waiting for your next message, an agent plans a sequence of actions, executes them using tools, observes the results, and decides what to do next, autonomously.
## Chatbot vs agent

| | Chatbot | Agent |
|---|---|---|
| Input | Your message | A goal |
| Output | A response | A completed task |
| Tools | None | Files, APIs, databases, terminal |
| Steps | 1 (respond) | Many (plan → act → observe → repeat) |
| Autonomy | Waits for you | Acts independently |
| Example | "What's a reverse proxy?" | "Set up Nginx as a reverse proxy for my app" |
A chatbot tells you how to do something. An agent does it for you.
## How agents work
Every AI agent follows the same loop:
1. PERCEIVE – read the current state (files, errors, tool output)
2. REASON – decide what to do next
3. ACT – execute an action (edit a file, run a command, call an API)
4. OBSERVE – check the result
5. REPEAT – until the goal is achieved or the step limit is reached
In code:

```python
async def agent_loop(goal, tools, max_steps=20):
    messages = [
        {"role": "system", "content": "You are a coding agent. Use tools to accomplish the goal."},
        {"role": "user", "content": goal},
    ]
    for step in range(max_steps):
        # REASON: the LLM decides what to do next
        response = await call_llm(messages, tools=tools)
        # Keep the assistant turn (including its tool calls) in context
        messages.append(response.message)
        if response.tool_calls:
            # ACT: execute each requested tool
            for call in response.tool_calls:
                result = execute_tool(call)
                # OBSERVE: feed the result back to the model
                messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
        else:
            # Done: the agent returned a final answer
            return response.content
    return "Max steps reached"
```
That's it. Every AI agent, from Claude Code to Aider to Kimi CLI, is a variation of this loop.
## The four components

### 1. The LLM (the brain)

The language model that reasons and decides. Bigger models make better decisions but cost more:

| Model | Agent quality | Cost (input / output per M tokens) |
|---|---|---|
| Claude Opus | ✅ Best | $15 / $75 |
| Claude Sonnet | ✅ Great | $3 / $15 |
| Qwen 3.6 Plus | ✅ Good | Free (preview) |
| DeepSeek R1 | ✅ Good reasoning | $0.55 / $2.19 |
| Small local models | ⚠️ Limited | Free |
### 2. Tools (the hands)
Tools let the agent interact with the world. Common tools:
- File system – read, write, search files
- Terminal – run commands, check output
- Web search – find information online
- APIs – call external services
- MCP servers – standardized tool access
See our tool calling guide and MCP guide for implementation.
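As a concrete (and deliberately simplified) sketch, a tool is just a JSON-schema description the LLM sees plus a dispatcher that maps the model's tool calls onto real actions. The schema shape and the `execute_tool` dispatcher below are illustrative assumptions, not any specific provider's API:

```python
import subprocess
from pathlib import Path

# Illustrative tool definitions in the JSON-schema style most LLM APIs expect.
TOOLS = [
    {
        "name": "read_file",
        "description": "Read a text file and return its contents.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
    {
        "name": "run_command",
        "description": "Run a shell command and return its output.",
        "parameters": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
]

def execute_tool(name: str, arguments: dict) -> str:
    """Dispatch a tool call from the agent loop to a real action."""
    if name == "read_file":
        return Path(arguments["path"]).read_text()
    if name == "run_command":
        proc = subprocess.run(arguments["command"], shell=True,
                              capture_output=True, text=True)
        return proc.stdout + proc.stderr
    return f"Unknown tool: {name}"
```

Returning an error string for an unknown tool (rather than raising) matters: the model sees the message in its context and can correct itself on the next step.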
### 3. Memory (the notebook)

Agents need to remember what they've done. Without memory, they repeat actions or forget context. Four patterns:

- Conversation history – replay past messages
- Summarized memory – compress old context
- Vector store – semantic search over past interactions
- Structured state – a JSON document tracking progress
See our agent memory patterns guide for implementation.
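The structured-state pattern is the easiest to sketch: the agent carries a small JSON document describing its progress and injects it into the prompt each turn, instead of replaying the full history. The `AgentState` class and its fields below are a hypothetical illustration:

```python
import json
from dataclasses import asdict, dataclass, field

@dataclass
class AgentState:
    """Minimal structured memory: what the agent is doing and how far it is."""
    goal: str
    completed_steps: list = field(default_factory=list)
    pending_steps: list = field(default_factory=list)
    notes: dict = field(default_factory=dict)

    def complete(self, step: str) -> None:
        # Move a step from pending to completed.
        if step in self.pending_steps:
            self.pending_steps.remove(step)
        self.completed_steps.append(step)

    def to_prompt(self) -> str:
        # Injected into the system prompt each turn, so the LLM sees where
        # it is without re-reading the whole conversation.
        return "Current state:\n" + json.dumps(asdict(self), indent=2)

state = AgentState(goal="Fix the failing login test",
                   pending_steps=["read error log", "patch auth code", "run tests"])
state.complete("read error log")
```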
### 4. Planning (the strategy)
Good agents plan before acting. Bad agents just start doing things.
Bad agent: "Fix the bug" → immediately starts editing random files

Good agent: "Fix the bug" → reads error log → identifies cause → finds relevant file → makes targeted fix → runs tests
Planning quality is mostly determined by the LLM. Frontier models (Claude, GPT-5) plan better than small models.
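One lightweight way to encourage planning regardless of model choice is to ask for a numbered plan before any tool is touched, then track progress against it. A small sketch of extracting such a plan from a model reply (the reply format is an assumption about how the model answers):

```python
import re

def parse_plan(text: str) -> list[str]:
    """Pull numbered steps ('1. ...' or '2) ...') out of a planning reply."""
    return re.findall(r"^\s*\d+[.)]\s+(.*\S)", text, flags=re.MULTILINE)

# Hypothetical first reply from the model when prompted to plan before acting.
reply = """Here's my plan:
1. Read the error log
2. Identify the failing module
3. Make a targeted fix
4. Run the tests"""

plan = parse_plan(reply)
```

The extracted steps can then be fed into a structured state so every later turn reminds the model which step it is on.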
## Real-world AI agents
| Agent | What it does | How it works |
|---|---|---|
| Claude Code | Writes and edits code | LLM + file tools + terminal |
| Aider | Pair programming in terminal | LLM + git + file editing |
| Cursor | AI-powered IDE | LLM + codebase indexing + editing |
| Devin | Autonomous software engineer | LLM + browser + terminal + planning |
In our AI Startup Race, 7 agents run autonomously to build startups. Each uses the same loop: plan → code → deploy → check → iterate.
## When to use agents

✅ Use agents for:

- Multi-step coding tasks (refactoring, debugging, feature building)
- Research that requires searching and synthesizing
- Tasks that need iteration (try, fail, adjust, retry)

❌ Don't use agents for:

- Single-step tasks (summarize, classify, translate) – just use an API call
- Deterministic workflows – use a fixed pipeline instead
- High-stakes decisions – keep humans in the loop
See our when NOT to use agents guide for the full decision framework.
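To make the "fixed pipeline" point concrete: when the steps are known in advance, plain function composition is cheaper and more predictable than an agent loop. The step functions below are trivial placeholders, not real summarization or classification:

```python
def summarize(text: str) -> str:
    # Placeholder step: in practice, a single LLM API call.
    return text[:100]

def classify(text: str) -> str:
    # Placeholder step: in practice, a single LLM API call.
    return "docs" if "guide" in text else "other"

def pipeline(text: str) -> dict:
    # Deterministic workflow: same input, same steps, same output.
    # No planning loop, no tool dispatch, no step budget needed.
    return {"summary": summarize(text), "label": classify(text)}
```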
## Getting started
The fastest way to experience AI agents:
- Use one: Install Claude Code ($20/mo) or Aider (free + API)
- Build one: Follow our multi-agent guide (50 lines of Python)
- Watch one: Follow the AI Startup Race where 7 agents build startups autonomously
## FAQ

### What's the difference between an AI agent and an AI assistant?

An AI assistant responds to your messages one at a time and waits for your next input. An AI agent takes a goal, plans multiple steps, executes actions using tools, and keeps working autonomously until the task is complete, or until it gets stuck and asks for help.

### Do I need expensive frontier models to build an agent?

Not necessarily: smaller models can handle simple agent loops with well-defined tools and clear goals. However, planning quality degrades significantly with smaller models. For complex multi-step tasks, frontier models like Claude Opus or GPT-5 make far fewer reasoning errors and recover better from unexpected situations.

### Are AI agents safe to run autonomously?

It depends on the guardrails you set. Most production agents include confirmation steps for destructive actions, spending limits, and maximum step counts. Start with human-in-the-loop approval for critical actions and gradually increase autonomy as you build trust in the system's behavior.
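A minimal sketch of those guardrails, wrapping a generic `execute` callable for tool dispatch; the destructive tool names and the `confirm` hook are illustrative assumptions:

```python
class StepBudgetExceeded(Exception):
    """Raised when the agent exhausts its step budget."""

# Tools the agent must not run without explicit approval (illustrative set).
DESTRUCTIVE = {"run_command", "delete_file", "write_file"}

def guarded_execute(execute, name, arguments, *, steps_used,
                    max_steps=20, confirm=input):
    """Wrap a tool dispatcher with a step budget and a confirmation gate."""
    if steps_used >= max_steps:
        raise StepBudgetExceeded(f"Agent hit the {max_steps}-step limit")
    if name in DESTRUCTIVE:
        answer = confirm(f"Agent wants to call {name}({arguments}). Allow? [y/N] ")
        if answer.strip().lower() != "y":
            # Deny, but return a message the model can see and react to.
            return "Denied by user"
    return execute(name, arguments)
```

Passing `confirm` as a parameter keeps the gate testable and lets you swap the terminal prompt for a Slack approval or an allowlist as trust grows.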
Related: How to Build Multi-Agent Systems · Agent Orchestration Patterns · Best AI Agent Frameworks · When NOT to Use AI Agents · Agent vs Workflow · Tool Calling Patterns