AI Agent Tool Use Architecture: Function Calling, ReAct Loops & Structured Outputs
Large language models become dramatically more useful when they can take actions — reading files, querying databases, calling APIs. Tool use (also called function calling) is the mechanism that turns a chatbot into an agent.
The Core Idea#
Without tools, an LLM can only generate text. With tools, it can:
User: "What's the weather in Tokyo?"
Without tools → "I don't have real-time data..."
With tools → calls get_weather("Tokyo") → "It's 22°C and sunny in Tokyo."
The model doesn't execute the tool itself. It requests a tool call, the runtime executes it, and the result is fed back into the conversation.
Tool Definitions#
Every tool use system starts with tool definitions — structured descriptions the model uses to decide when and how to call a tool.
```json
{
  "name": "search_database",
  "description": "Search the product database by query string. Returns top 10 matches.",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The search query"
      },
      "category": {
        "type": "string",
        "enum": ["electronics", "clothing", "books"],
        "description": "Optional category filter"
      }
    },
    "required": ["query"]
  }
}
```
Key principles for good tool definitions:
- Clear descriptions — the model reads these to decide when to use the tool
- Precise parameter types — enums, required fields, format hints
- Scoped responsibility — one tool does one thing well
- Example values — sample inputs in descriptions help the model generate correct arguments
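These principles can also be enforced at runtime, before a requested call ever executes. A minimal sketch (the `validate_input` helper is illustrative, not part of any SDK) that checks required fields and enum constraints from a tool definition like the one above:

```python
def validate_input(schema: dict, args: dict) -> list[str]:
    """Return a list of validation errors for a tool call's arguments."""
    errors = []
    for field in schema.get("required", []):
        if field not in args:
            errors.append(f"missing required field: {field}")
    for field, value in args.items():
        spec = schema.get("properties", {}).get(field)
        if spec is None:
            errors.append(f"unexpected field: {field}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{field} must be one of {spec['enum']}")
    return errors
```

An empty list means the call can proceed; a non-empty list can be returned to the model as an error observation so it can correct its own arguments.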
The ReAct Loop#
ReAct (Reasoning + Acting) is the dominant pattern for agentic tool use. The model alternates between thinking and acting:
Step 1: THINK → "I need to find the user's order status. I'll search by email."
Step 2: ACT → call search_orders(email="user@example.com")
Step 3: OBSERVE → [Order #1234, shipped, tracking: XYZ789]
Step 4: THINK → "Found the order. Now I need tracking details."
Step 5: ACT → call get_tracking("XYZ789")
Step 6: OBSERVE → [In transit, arriving March 30]
Step 7: RESPOND → "Your order #1234 is in transit, arriving March 30."
On each iteration the model sees the full history of thoughts, actions, and observations, which lets it plan multi-step workflows dynamically.
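The control flow above can be sketched as a loop. This is a minimal skeleton, not any SDK's API: `call_model` stands in for an LLM call and is assumed to return a dict with an optional `tool` request.

```python
def react_loop(call_model, tools: dict, user_message: str, max_steps: int = 10):
    """Alternate model turns (THINK/ACT) and tool executions (OBSERVE)."""
    history = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        turn = call_model(history)                 # THINK, possibly ACT
        history.append(turn)
        if turn.get("tool") is None:               # no tool requested: final answer
            return turn["content"]
        result = tools[turn["tool"]](**turn["args"])          # ACT
        history.append({"role": "tool", "content": result})   # OBSERVE
    return "step limit reached"
```

The `max_steps` cap matters in practice: without it, a confused model can loop on the same tool forever.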
Tool Selection Strategies#
When an agent has dozens of tools available, selection becomes critical:
Relevance Filtering#
```python
# Pre-filter tools based on the query before sending to the model
def is_relevant(tool, query):
    # Crude keyword overlap; swap in whatever relevance check fits your tools
    return any(word in tool["description"].lower() for word in query.lower().split())

relevant_tools = [t for t in all_tools if is_relevant(t, user_query)]
# Send only relevant tools to reduce confusion and token cost
```
Hierarchical Tool Organization#
Top-level tools:
- database_tools → search, insert, update, delete
- communication → send_email, send_slack, create_ticket
- file_operations → read_file, write_file, list_directory
The model first selects a category, then gets specific tools within it.
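The two-step selection can be sketched with a plain category map (the categories and tool names below mirror the example; the structure is illustrative):

```python
TOOL_CATEGORIES = {
    "database_tools": ["search", "insert", "update", "delete"],
    "communication": ["send_email", "send_slack", "create_ticket"],
    "file_operations": ["read_file", "write_file", "list_directory"],
}

def tools_for_category(category: str) -> list[str]:
    """Second step: expose only the tools inside the chosen category."""
    return TOOL_CATEGORIES.get(category, [])
```

The first model turn sees only the three category names; the second turn sees only the handful of tools inside the chosen category, keeping each prompt small.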
Tool Selection via Embeddings#
For large tool inventories (100+), embed tool descriptions and retrieve the top-k most similar tools based on the user query. This scales better than sending all definitions.
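A sketch of that retrieval step, assuming tool-description embeddings were precomputed by some embedding model (plain cosine similarity over lists here, no vector database):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k_tools(query_vec, tool_vecs: dict, k: int = 5) -> list[str]:
    """Return names of the k tools whose description embeddings best match the query."""
    ranked = sorted(tool_vecs, key=lambda name: cosine(query_vec, tool_vecs[name]),
                    reverse=True)
    return ranked[:k]
```

Only the retrieved definitions go into the prompt; the other 95+ tools cost nothing.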
Claude Tool Use API#
Claude's tool use follows a structured message flow:
```python
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[{
        "name": "get_weather",
        "description": "Get current weather for a location",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            },
            "required": ["location"]
        }
    }],
    messages=[{"role": "user", "content": "Weather in Paris?"}]
)
# Response contains a tool_use content block.
# Execute the tool, then send the result back.
```
Claude returns a tool_use block with the tool name and input. You execute it and send back a tool_result message. The model then generates its final response.
GPT Function Calling API#
OpenAI's approach uses a similar pattern with slightly different structure:
```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    }]
)
```
The model returns tool_calls in the response message. You execute them and append tool role messages with results.
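One wrinkle on the OpenAI side: function arguments arrive as a JSON string, not a parsed object. A sketch of the result handling (dict-shaped `tool_calls` for illustration; the real SDK returns objects with the same fields):

```python
import json

def build_tool_messages(tool_calls, tools: dict) -> list[dict]:
    """Execute each requested call and wrap outputs as 'tool' role messages."""
    messages = []
    for call in tool_calls:
        args = json.loads(call["function"]["arguments"])  # arguments are a JSON string
        output = tools[call["function"]["name"]](**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],  # pairs the result with its request
            "content": str(output),
        })
    return messages
```

Because the arguments are model-generated text, `json.loads` can fail; production code should catch that and feed the parse error back as the tool result.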
Parallel Tool Calls#
Both Claude and GPT support parallel tool calls — requesting multiple tool executions in a single response:
User: "Compare weather in Tokyo and London"
Model response (single turn):
→ tool_call_1: get_weather("Tokyo")
→ tool_call_2: get_weather("London")
This is faster than sequential calls because both execute concurrently. The runtime collects all results and sends them back together.
When parallel calls help:
- Independent data fetches (weather in two cities)
- Gathering context from multiple sources simultaneously
- Batch operations where order doesn't matter
When to force sequential:
- Second call depends on first call's result
- Write operations that must happen in order
- Transactions requiring consistency
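On the runtime side, independent calls can genuinely run concurrently. A sketch using a thread pool (the call dicts and `tools` registry are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def execute_parallel(calls: list[dict], tools: dict) -> list:
    """Run independent tool calls concurrently; return results in call order."""
    with ThreadPoolExecutor(max_workers=len(calls)) as pool:
        futures = [pool.submit(tools[c["name"]], **c["args"]) for c in calls]
        return [f.result() for f in futures]
```

Threads suit the typical case (I/O-bound API and database calls); results come back in the same order as the requests, so they can be paired with their call IDs straightforwardly.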
Structured Outputs#
Tool use naturally produces structured data, but you can also force structured outputs for the model's final response:
```json
{
  "type": "json_schema",
  "json_schema": {
    "name": "analysis_result",
    "schema": {
      "type": "object",
      "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
        "confidence": {"type": "number"},
        "key_topics": {"type": "array", "items": {"type": "string"}}
      },
      "required": ["sentiment", "confidence", "key_topics"]
    }
  }
}
```
Structured outputs guarantee the response matches your schema — no regex parsing needed.
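Even with a schema guarantee, the reply still arrives as a JSON string to parse. A small sketch (field names taken from the schema above; the helper name is illustrative):

```python
import json

def parse_analysis(raw: str) -> dict:
    """Parse a structured-output reply and sanity-check it against the schema."""
    data = json.loads(raw)
    assert data["sentiment"] in ("positive", "negative", "neutral")
    assert isinstance(data["confidence"], (int, float))
    assert isinstance(data["key_topics"], list)
    return data
```

The assertions are belt-and-braces: they should never fire when the provider enforces the schema, but they catch drift if the schema and the downstream code fall out of sync.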
Error Recovery#
Tools fail. Networks time out. APIs return errors. Robust agents handle this gracefully:
Retry with Backoff#
```python
import time

def execute_tool_with_retry(tool_call, max_retries=3):
    for attempt in range(max_retries):
        try:
            return execute_tool(tool_call)
        except ToolError as e:
            if attempt == max_retries - 1:
                return {"error": str(e)}  # out of retries: surface the error to the model
            time.sleep(2 ** attempt)      # exponential backoff: 1s, 2s, 4s
```
Fallback Tools#
Primary: search_vector_db(query)
↓ fails
Fallback: search_keyword_db(query)
↓ fails
Final: return "I couldn't find relevant results"
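The same chain as code, a short sketch (the backend functions are hypothetical stand-ins for the vector and keyword searches above):

```python
def search_with_fallback(query: str, backends: list) -> str:
    """Try each search backend in order; fall through to an apology if all fail."""
    for backend in backends:
        try:
            return backend(query)
        except Exception:
            continue  # this backend failed; try the next one
    return "I couldn't find relevant results"
```

Ordering the list from most to least capable means the fallback only costs anything on the failure path.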
Error Context for the Model#
When a tool fails, return the error to the model. Good agents adapt:
Tool result: {"error": "Rate limited. Retry after 30 seconds."}
Model thinks: "I'll try a different approach — let me use the cached data tool instead."
Security Considerations#
Tool use introduces real-world side effects. Guard against the risks and build in the safeguards:
- Prompt injection — malicious inputs can trick the model into calling dangerous tools; treat tool results as untrusted
- Over-permissioning — grant each tool the minimum permissions it needs
- Confirmation gates — require human approval for destructive operations (delete, send email)
- Input validation — validate tool inputs before execution, not just after
- Rate limiting — cap tool calls per conversation to prevent runaway loops
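Two of these safeguards — a confirmation gate and a per-conversation call cap — fit in a few lines. A sketch (the tool names and the `approve` callback are illustrative):

```python
DESTRUCTIVE_TOOLS = {"delete_record", "send_email"}

def guard_tool_call(name: str, call_count: int, approve, max_calls: int = 20) -> bool:
    """Return True if the call may proceed, False if blocked by a safeguard."""
    if call_count >= max_calls:       # rate limit: stop runaway loops
        return False
    if name in DESTRUCTIVE_TOOLS:     # confirmation gate for real-world side effects
        return approve(name)          # e.g. prompt a human operator
    return True
```

A blocked call should be reported back to the model as a tool result ("this action requires approval"), so it can explain the situation to the user instead of silently stalling.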
Architecture Pattern: Tool Use Runtime#
```
┌─────────────┐     ┌──────────┐     ┌───────────┐
│    User     │────▶│  Agent   │────▶│  LLM API  │
│    Input    │     │  Runtime │◀────│  (Claude) │
└─────────────┘     │          │     └───────────┘
                    │  ┌───────┴──────┐
                    │  │ Tool Registry│
                    │  │ - search_db  │
                    │  │ - send_email │
                    │  │ - read_file  │
                    │  └───────┬──────┘
                    │          │
                    │  ┌───────▼──────┐
                    └──│ Tool Results │
                       └──────────────┘
```
The runtime orchestrates the loop: send messages to the LLM, parse tool calls, execute tools, feed results back, repeat until the model generates a final response.
Key Takeaways#
- Tool definitions are the interface — invest in clear descriptions and precise schemas
- ReAct loops let models reason about multi-step problems dynamically
- Parallel tool calls reduce latency for independent operations
- Structured outputs eliminate parsing headaches
- Error recovery is essential — tools fail, good agents adapt
- Security is non-negotiable when tools have real-world side effects