Apr 14, 2026

Last updated on Apr 20, 2026

What is Tool Calling? How AI Models Use External Tools

Tool calling (also called function calling) is how AI models trigger actions in the real world. Instead of only generating text, the model outputs a structured request to call a specific function with specific parameters, and your application carries it out.

How it works

User: "What's the weather in Tokyo?"
    ↓
Model decides to call: get_weather(city="Tokyo")
    ↓
Your code executes the function
    ↓
Result: {temp: 22, condition: "sunny"}
    ↓
Model: "It's 22°C and sunny in Tokyo."

The model doesn’t execute the function itself — it tells YOUR code what to call. You execute it and return the result.

Step-by-step breakdown

1. You define tools: provide the model with a list of available functions, their parameters, and descriptions
2. User sends a message: the user asks something that might require a tool
3. Model decides: based on the user’s message and the tool definitions, the model either responds directly or outputs a tool call
4. Tool call returned: the API response contains a structured tool call (a tool_calls array in OpenAI’s format, a tool_use content block in Anthropic’s) instead of, or alongside, regular text
5. You execute: your code parses the tool call, runs the function, and gets a result
6. You send the result back: add the tool result to the conversation (a tool message in OpenAI’s format, a tool_result block inside a user message in Anthropic’s)
7. Model responds: the model uses the result to formulate a natural language answer

This loop can repeat multiple times — the model might call several tools in sequence to answer a complex question. This is the foundation of how AI agents work.
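One pass through the steps above can be sketched without any network access by swapping the model for a stub. In this minimal sketch, `fake_model`, `get_weather`, `run_turn`, and the message shapes are illustrative assumptions loosely modelled on the OpenAI-style format, not a real SDK:

```python
import json

def get_weather(city):
    """Hypothetical tool: a real one would query a weather service."""
    return {"city": city, "temp": 22, "condition": "sunny"}

# Step 1: the tools your code is willing to run, keyed by name.
TOOLS = {"get_weather": get_weather}

def fake_model(messages):
    """Stand-in for a model API call (assumption, for illustration only).

    It requests a tool on the first turn and answers in plain text once
    a tool result is the latest message in the conversation.
    """
    last = messages[-1]
    if last["role"] == "tool":
        data = json.loads(last["content"])
        text = f"It's {data['temp']}°C and {data['condition']} in {data['city']}."
        return {"role": "assistant", "content": text}
    return {
        "role": "assistant",
        "content": None,
        "tool_calls": [{"id": "call_1", "name": "get_weather",
                        "arguments": json.dumps({"city": "Tokyo"})}],
    }

def run_turn(user_text):
    messages = [{"role": "user", "content": user_text}]        # step 2
    reply = fake_model(messages)                                # steps 3-4
    if reply.get("tool_calls"):
        messages.append(reply)
        for call in reply["tool_calls"]:
            args = json.loads(call["arguments"])                # step 5
            result = TOOLS[call["name"]](**args)
            messages.append({"role": "tool",                    # step 6
                             "tool_call_id": call["id"],
                             "content": json.dumps(result)})
        reply = fake_model(messages)                            # step 7
    return reply["content"]

answer = run_turn("What's the weather in Tokyo?")
print(answer)
```

The function never runs inside the model: `run_turn` looks the name up in `TOOLS` and calls it, which is also where you would enforce permissions or reject unknown tool names.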

Why it matters

Tool calling is the foundation of:

MCP: the standard protocol for AI tool integration
AI agents: agents that read files, run tests, deploy code
RAG: when retrieval is exposed as a tool, the model decides when to fetch documents before answering
AI coding tools: Claude Code, Aider, Cursor

Without tool calling, AI models are limited to generating text from their training data and whatever is in the prompt. With tool calling, they can access real-time data, modify systems, and orchestrate complex workflows. This is what separates a chatbot from an AI agent.

JSON schema for tools

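Tools are defined using JSON Schema. OpenAI and Anthropic both expect a name, a description, and a JSON Schema object for the parameters, but the wrapper differs: OpenAI’s Chat Completions API nests these under a function key with the schema in parameters, while Anthropic puts the schema in input_schema. Here’s the general structure: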

{
  "name": "search_database",
  "description": "Search the product database by name or category",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Search query string"
      },
      "category": {
        "type": "string",
        "enum": ["electronics", "clothing", "food"],
        "description": "Optional category filter"
      },
      "limit": {
        "type": "integer",
        "description": "Max results to return",
        "default": 10
      }
    },
    "required": ["query"]
  }
}

Good tool definitions have:

Clear, specific description fields (the model uses these to decide when to call the tool)
Proper type annotations for every parameter
enum values where the options are limited
required array listing mandatory parameters
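Because the schema itself is shared and only the wrapper differs, one definition can be adapted to each provider. The sketch below is a minimal illustration; the helper names `to_openai_tool` and `to_anthropic_tool` are made up for this example, and the output shapes follow the two request examples in this article:

```python
def to_openai_tool(name, description, schema):
    """Wrap a JSON Schema in the shape used by the Chat Completions `tools` list."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": schema,
        },
    }

def to_anthropic_tool(name, description, schema):
    """Wrap the same JSON Schema in the shape used by the Messages API `tools` list."""
    return {"name": name, "description": description, "input_schema": schema}

search_schema = {
    "type": "object",
    "properties": {
        "query": {"type": "string", "description": "Search query string"},
        "category": {
            "type": "string",
            "enum": ["electronics", "clothing", "food"],
            "description": "Optional category filter",
        },
    },
    "required": ["query"],
}

description = "Search the product database by name or category"
openai_tool = to_openai_tool("search_database", description, search_schema)
anthropic_tool = to_anthropic_tool("search_database", description, search_schema)
```

Keeping a single source of truth for the schema avoids the two definitions drifting apart when a parameter changes.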

Example (OpenAI format)

import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

tools = [{
    "type": "function",
    "function": {
        "name": "check_server",
        "description": "Check if a server is responding to HTTP requests",
        "parameters": {
            "type": "object",
            "properties": {
                "url": {"type": "string", "description": "The server URL to check"}
            },
            "required": ["url"]
        }
    }
}]

# Keep the conversation in a list so the tool result can be appended later
messages = [{"role": "user", "content": "Check if my server is up"}]

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=messages,
    tools=tools,
)

# Handle the tool call
message = response.choices[0].message
if message.tool_calls:
    tool_call = message.tool_calls[0]
    # check_server is your own function; arguments arrive as a JSON string
    args = json.loads(tool_call.function.arguments)
    result = check_server(url=args["url"])

    # Send result back to the model: first the assistant turn, then the tool result
    messages.append(message)
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(result)
    })
    final_response = client.chat.completions.create(
        model="gpt-5.4",
        messages=messages,
        tools=tools,
    )

Example (Anthropic format)

import json
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

tools = [{
    "name": "check_server",
    "description": "Check if a server is responding to HTTP requests",
    "input_schema": {
        "type": "object",
        "properties": {
            "url": {"type": "string", "description": "The server URL to check"}
        },
        "required": ["url"]
    }
}]

messages = [{"role": "user", "content": "Check if my server is up"}]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=messages,
    tools=tools,
)

# Handle tool use
if response.stop_reason == "tool_use":
    tool_block = next(b for b in response.content if b.type == "tool_use")
    # check_server is your own function; tool_block.input is already a dict
    result = check_server(url=tool_block.input["url"])

    # Send result back: the assistant turn, then a user turn carrying the tool_result
    follow_up = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages + [
            {"role": "assistant", "content": response.content},
            {"role": "user", "content": [{
                "type": "tool_result",
                "tool_use_id": tool_block.id,
                "content": json.dumps(result)
            }]}
        ],
        tools=tools,  # same tools as before
    )

Common patterns

Parallel tool calls

Models can request multiple tool calls at once when they’re independent:

{
  "tool_calls": [
    {"function": {"name": "get_weather", "arguments": "{\"city\": \"Tokyo\"}"}},
    {"function": {"name": "get_weather", "arguments": "{\"city\": \"London\"}"}}
  ]
}

Execute them concurrently for faster responses, and return one tool result per call, each matched to its call by ID.
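A minimal sketch of concurrent execution, using a thread pool from the standard library. The `get_weather` stub, the `TOOLS` registry, and the `id` values are assumptions added for illustration (real API responses include an ID on every call):

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor

def get_weather(city):
    """Hypothetical I/O-bound tool; the sleep stands in for network latency."""
    time.sleep(0.1)
    return {"city": city, "temp": 22}

TOOLS = {"get_weather": get_weather}

# Two independent calls, in the same shape as the JSON above plus an id each.
tool_calls = [
    {"id": "call_1", "function": {"name": "get_weather",
                                  "arguments": "{\"city\": \"Tokyo\"}"}},
    {"id": "call_2", "function": {"name": "get_weather",
                                  "arguments": "{\"city\": \"London\"}"}},
]

def run_call(call):
    """Execute one tool call and wrap the result as a tool message."""
    fn = TOOLS[call["function"]["name"]]
    args = json.loads(call["function"]["arguments"])
    return {"role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args))}

# pool.map runs the calls in worker threads and yields results in input order.
with ThreadPoolExecutor(max_workers=len(tool_calls)) as pool:
    tool_messages = list(pool.map(run_call, tool_calls))
```

Threads suit tools that wait on I/O such as HTTP requests or database queries. For CPU-heavy tools a process pool may be a better fit.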

Sequential tool calls (agent loops)

For complex tasks, the model calls tools one after another, using each result to decide the next step. This is the core pattern behind AI agents:

1. Read file → 2. Find bug → 3. Write fix → 4. Run tests → 5. Report result
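The loop behind that sequence can be sketched as follows. Everything here is a stand-in: the three tools are stubs, and `scripted_model` replays a fixed plan, whereas a real model would choose each step from the previous results. The `max_steps` cap is an assumption added so a confused model cannot loop forever:

```python
import json

# Hypothetical tools; real ones would touch the filesystem and a test runner.
def read_file(path):
    return "def add(a, b): return a - b"

def write_fix(path, content):
    return f"wrote {len(content)} bytes to {path}"

def run_tests():
    return {"passed": True}

TOOLS = {"read_file": read_file, "write_fix": write_fix, "run_tests": run_tests}

# Fixed plan replayed by the stand-in model, one tool call per turn.
SCRIPT = [
    ("read_file", {"path": "calc.py"}),
    ("write_fix", {"path": "calc.py", "content": "def add(a, b): return a + b"}),
    ("run_tests", {}),
]

def scripted_model(messages):
    """Return the next scripted tool call, or a final answer when the plan is done."""
    step = sum(1 for m in messages if m["role"] == "tool")
    if step < len(SCRIPT):
        name, args = SCRIPT[step]
        return {"role": "assistant",
                "tool_call": {"id": f"call_{step}", "name": name, "arguments": args}}
    return {"role": "assistant", "content": "Fixed the bug; tests pass."}

def run_agent(task, max_steps=10):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = scripted_model(messages)
        messages.append(reply)
        call = reply.get("tool_call")
        if call is None:
            # No tool requested: the model has produced its final answer.
            return reply["content"], messages
        result = TOOLS[call["name"]](**call["arguments"])
        messages.append({"role": "tool",
                         "tool_call_id": call["id"],
                         "content": json.dumps(result)})
    raise RuntimeError("agent exceeded max_steps without finishing")

final, history = run_agent("Fix the failing test in calc.py")
```

Each iteration appends both the model's request and the tool result, so the model always sees the full history when choosing the next step.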

Error handling

Always return errors as tool results rather than crashing:

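def run_tool(name, arguments):
    # execute_tool is your own dispatcher that looks up and runs the named tool
    try:
        result = execute_tool(name, arguments)
        return {"status": "success", "data": result}
    except Exception as e:
        # Report the failure to the model instead of raising
        return {"status": "error", "message": str(e)}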

The model can then retry, try a different approach, or inform the user. Never let a tool failure crash your agent loop.
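Tool failures are not the only thing that can go wrong: the model may also name a tool that does not exist or emit arguments that are not valid JSON. A slightly fuller sketch that covers those cases, where `safe_execute`, the `TOOLS` registry, and the `divide` tool are hypothetical names for illustration:

```python
import json

def divide(a, b):
    """Hypothetical tool that can fail at runtime."""
    return a / b

TOOLS = {"divide": divide}

def safe_execute(name, raw_arguments):
    """Run a tool and always return a JSON-serializable status dict."""
    if name not in TOOLS:
        return {"status": "error", "message": f"unknown tool: {name}"}
    try:
        arguments = json.loads(raw_arguments)
        return {"status": "success", "data": TOOLS[name](**arguments)}
    except json.JSONDecodeError as e:
        # The model produced malformed arguments; tell it so it can retry.
        return {"status": "error", "message": f"invalid JSON arguments: {e}"}
    except Exception as e:
        # The tool itself failed (bad input, missing parameter, etc.).
        return {"status": "error", "message": f"{type(e).__name__}: {e}"}

ok = safe_execute("divide", '{"a": 6, "b": 3}')
by_zero = safe_execute("divide", '{"a": 1, "b": 0}')
missing = safe_execute("multiply", "{}")
malformed = safe_execute("divide", "{not json")
```

Including the exception type in the message gives the model something concrete to act on, such as supplying a missing parameter on the next attempt.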

Learn more

Tool Calling Patterns — sequential, parallel, conditional, recursive
MCP Complete Guide — the standardized protocol for tool integration
How to Build an AI Agent — agents that use tools autonomously
What is Prompt Engineering? — crafting instructions for AI
How AI Agents Work — the architecture behind autonomous AI systems

FAQ

Does the AI model actually execute the tool calls?

No — the model only outputs a structured request describing which function to call and with what parameters. Your application code is responsible for executing the function, handling errors, and returning the result back to the model for further reasoning.

What’s the difference between tool calling and MCP?

Tool calling is the mechanism — how a model requests a function execution. MCP is a protocol that standardizes how tools are discovered, described, and invoked across different AI clients. MCP builds on top of tool calling to create a universal integration layer. See our MCP Complete Developer Guide for the full picture.

Can local/open-source models do tool calling?

Yes — many open-source models like Qwen, Mistral, and DeepSeek support tool calling. The quality varies by model size and training, but most models above 7B parameters can handle basic tool calling reliably when given well-structured tool definitions.

How many tools can I define?

Most APIs support dozens of tools, but performance degrades with too many. Keep it under 20 tools for best results. If you need more, use a routing layer or MCP to dynamically load relevant tools.
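One possible shape for such a routing layer is a pre-filter that picks a handful of relevant tools before each model call. The sketch below uses naive keyword overlap purely for illustration; `select_tools` and its scoring rule are assumptions, and a production router would more likely use embeddings or a classifier:

```python
def select_tools(query, tools, limit=3):
    """Return up to `limit` tools whose name or description shares words with the query."""
    words = set(query.lower().split())

    def score(tool):
        text = (tool["name"].replace("_", " ") + " " + tool["description"]).lower()
        return len(words & set(text.split()))

    ranked = sorted(tools, key=score, reverse=True)
    # Drop tools with no overlap at all so irrelevant ones are never sent.
    return [t for t in ranked if score(t) > 0][:limit]

all_tools = [
    {"name": "get_weather",
     "description": "Get the current weather for a city"},
    {"name": "search_database",
     "description": "Search the product database by name or category"},
    {"name": "check_server",
     "description": "Check if a server is responding to HTTP requests"},
]

selected = select_tools("is my server responding", all_tools)
```

Only the selected definitions would then be passed in the `tools` parameter of the request, which keeps the list short regardless of how many tools exist in total.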
