AI Agents Explained — Architecture, Tools & Building Your First Agent
AI agents are autonomous programs that use LLMs to perceive environments, reason about goals, and execute actions through tools. This guide covers agent architecture, design patterns built on machine learning principles, and hands-on implementation.
What You'll Learn
You will learn what AI agents are, how they differ from standard LLM calls, the three main agent architectures, how to add tools and memory, and how to build agents using LangChain and the OpenAI Assistants API.
Why It Matters
Agents extend LLMs from text generators to autonomous decision-makers. They can browse the web, query databases, run code, and execute multi-step workflows — making them essential for production AI systems that act without constant human supervision.
Real-World Use
Durga Antivirus Pro uses a multi-agent system where a Coordinator Agent receives threat alerts, a File Analysis Agent inspects suspicious binaries, a Network Agent checks for C2 communication patterns, and a Report Agent generates remediation steps — all running autonomously.
flowchart LR
A[Perception] --> B[Reasoning]
B --> C[Action]
C --> D[Observation]
D --> B
D --> E[Memory]
E --> B
What Are AI Agents?
An AI agent is a software system that perceives its environment, reasons about a goal, and takes actions to achieve that goal. Unlike a standard LLM call that produces one-shot text, an agent operates in a loop: it thinks, acts, observes the result, and thinks again.
Characteristics of AI agents:
- Autonomy — operate without human intervention for each step
- Goal-oriented — designed to achieve a specific objective
- Tool use — can call external functions, APIs, or databases
- Memory — retains context across steps and sessions
- Adaptability — changes behavior based on observations
A simple analogy: a standard LLM is like a chef who only writes recipes. An AI agent is like a chef who checks the pantry, adjusts ingredients, tastes the dish, and keeps cooking until it is perfect.
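The think, act, observe loop described above can be sketched without any LLM at all. In this minimal illustration the `ThermostatEnv` environment, the `LoopAgent` class, and its rule-based `reason` method are all hypothetical stand-ins; in a real agent, `reason` would be a model call.

```python
from dataclasses import dataclass, field


@dataclass
class ThermostatEnv:
    """Toy environment: a room whose temperature the agent can nudge."""
    temperature: int = 17

    def apply(self, action: str) -> int:
        if action == "heat":
            self.temperature += 1
        elif action == "cool":
            self.temperature -= 1
        return self.temperature  # the new observation


@dataclass
class LoopAgent:
    goal: int
    memory: list = field(default_factory=list)  # (action, observation) pairs

    def reason(self, observation: int) -> str:
        """Stand-in for the LLM: choose the next action from the observation."""
        if observation < self.goal:
            return "heat"
        if observation > self.goal:
            return "cool"
        return "stop"

    def run(self, env: ThermostatEnv, max_steps: int = 10) -> int:
        observation = env.temperature              # perceive
        for _ in range(max_steps):                 # bounded loop
            action = self.reason(observation)      # reason
            if action == "stop":
                break
            observation = env.apply(action)        # act, then observe
            self.memory.append((action, observation))
        return observation


env = ThermostatEnv(temperature=17)
agent = LoopAgent(goal=20)
final = agent.run(env)
print(final, agent.memory)
# 20 [('heat', 18), ('heat', 19), ('heat', 20)]
```

The `max_steps` bound matters even in a toy: it is the same safeguard as the iteration limits used in the framework examples later in this guide.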
Agent Architectures
Three architectures dominate modern agent design:
ReAct (Reasoning + Acting)
ReAct interleaves reasoning traces with actions. The agent thinks out loud, picks a tool, observes the result, and continues until it has enough information to answer.
Thought: I need to find the current weather in Tokyo.
Action: search("Tokyo weather")
Observation: Tokyo is 22C and sunny.
Thought: I have the weather. Now I can answer.
Final Answer: The weather in Tokyo is 22C and sunny.
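A trace like the one above has to be parsed by the surrounding program so that each `Action:` line becomes a real function call. The following is a rough sketch of that control loop, assuming the exact `Action: name("argument")` format shown in the trace. The `fake_llm` function replays scripted replies in place of a model, and the `search` tool returns a canned string; both are hypothetical.

```python
import re

ACTION_RE = re.compile(r'^Action:\s*(\w+)\("(.*)"\)\s*$', re.MULTILINE)
FINAL_RE = re.compile(r"^Final Answer:\s*(.*)$", re.MULTILINE)


def search(query: str) -> str:
    """Hypothetical tool with a canned result."""
    return "Tokyo is 22C and sunny."


TOOLS = {"search": search}

# Scripted replies standing in for successive LLM completions.
SCRIPTED_REPLIES = iter([
    'Thought: I need to find the current weather in Tokyo.\n'
    'Action: search("Tokyo weather")',
    "Thought: I have the weather. Now I can answer.\n"
    "Final Answer: The weather in Tokyo is 22C and sunny.",
])


def fake_llm(transcript: str) -> str:
    return next(SCRIPTED_REPLIES)


def react(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = fake_llm(transcript)
        transcript += reply + "\n"
        final = FINAL_RE.search(reply)
        if final:                                  # the model is done
            return final.group(1)
        action = ACTION_RE.search(reply)
        if action is None:                         # feed the problem back
            transcript += "Observation: could not parse an action.\n"
            continue
        name, arg = action.groups()
        tool = TOOLS.get(name)
        observation = tool(arg) if tool else f"unknown tool {name!r}"
        transcript += f"Observation: {observation}\n"
    return "Stopped: step limit reached."


answer = react("What is the weather in Tokyo?")
print(answer)
# The weather in Tokyo is 22C and sunny.
```

Parse failures and unknown tool names are written back into the transcript as observations rather than raised, which gives the model a chance to correct itself on the next turn.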
Plan-and-Execute
The agent first creates a step-by-step plan, then executes each step sequentially. This works well for complex tasks where the agent should commit to a strategy before acting. The example covers the planning phase only.
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0)

plan_prompt = PromptTemplate.from_template(
    """Create a numbered plan to accomplish this task:
Task: {task}
List 3 to 5 sequential steps. Each step must be a single action."""
)

# Pipe the prompt into the model (LLMChain and .run() are deprecated).
planner = plan_prompt | llm
plan = planner.invoke({"task": "Research AI agents and write a summary"}).content
print(plan)
Example output (the exact steps vary from run to run):
1. Search for "AI agents architecture 2026"
2. Read top 3 articles and extract key concepts
3. Compare ReAct, Plan-and-Execute, and Reflection patterns
4. Write a 200-word summary of findings
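Once a plan exists, the execute phase walks through it one step at a time and hands earlier results to later steps. A minimal sketch of that phase follows, assuming the plan arrives as numbered lines like the output above. The `execute_step` function is a hypothetical placeholder for whatever LLM or tool call would carry out a step.

```python
import re


def parse_plan(plan_text: str) -> list[str]:
    """Extract the step text from lines such as '1. Do something'."""
    steps = []
    for line in plan_text.splitlines():
        match = re.match(r"\s*\d+[.)]\s+(.*\S)", line)
        if match:
            steps.append(match.group(1))
    return steps


def execute_step(step: str, previous_results: list[str]) -> str:
    """Hypothetical executor; a real one would call an LLM or a tool."""
    return f"done: {step} (saw {len(previous_results)} earlier results)"


def run_plan(plan_text: str) -> list[str]:
    results: list[str] = []
    for step in parse_plan(plan_text):
        # Each step can read everything produced before it.
        results.append(execute_step(step, results))
    return results


plan_text = """1. Search for "AI agents architecture 2026"
2. Read top 3 articles and extract key concepts
3. Compare ReAct, Plan-and-Execute, and Reflection patterns
4. Write a 200-word summary of findings"""

results = run_plan(plan_text)
for line in results:
    print(line)
```

Unlike a ReAct loop, the sequence of steps is fixed before execution starts; a common refinement is to re-plan when a step fails, which this sketch leaves out.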
Reflection
The agent critiques its own output before returning it. After producing a result, a second LLM call reviews the work for errors, completeness, and quality.
from openai import OpenAI

client = OpenAI()


def reflective_agent(task):
    # Step 1: produce a first draft.
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": task}],
    )
    draft = response.choices[0].message.content

    # Step 2: ask the model to critique the draft.
    critique = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": f"Review this answer for errors and completeness:\n\n{draft}"}
        ],
    )
    feedback = critique.choices[0].message.content

    # Step 3: revise the draft using the feedback.
    revision = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": f"Original task: {task}\n\nDraft: {draft}\n\nFeedback: {feedback}\n\nRevise and improve."}
        ],
    )
    return revision.choices[0].message.content


result = reflective_agent("Explain the difference between ReAct and Plan-and-Execute")
print(result)
Expected output: A refined, self-corrected explanation that addresses any gaps or mistakes in the first draft.
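The example above always performs exactly one critique and one revision. A variation is to keep revising until the critic has nothing left to say, with a cap on the number of rounds. The sketch below shows that control flow only; `generate`, `critique`, and `revise` are deterministic, hypothetical stand-ins for the three LLM calls, and the convention that the critic returns `None` when satisfied is an assumption of this sketch.

```python
from typing import Callable, Optional


def reflect_loop(
    task: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str], Optional[str]],
    revise: Callable[[str, str, str], str],
    max_rounds: int = 3,
) -> tuple[str, int]:
    """Draft, then critique and revise until the critic has no feedback."""
    draft = generate(task)
    for round_number in range(max_rounds):
        feedback = critique(task, draft)
        if feedback is None:               # critic is satisfied
            return draft, round_number
        draft = revise(task, draft, feedback)
    return draft, max_rounds               # cap reached, return best effort


# Deterministic stand-ins for the three LLM calls.
def generate(task: str) -> str:
    return "ReAct mixes thinking and acting."


def critique(task: str, draft: str) -> Optional[str]:
    if "Plan-and-Execute" not in draft:
        return "Mention Plan-and-Execute."
    return None


def revise(task: str, draft: str, feedback: str) -> str:
    return draft + " Plan-and-Execute plans first, then acts."


final, rounds = reflect_loop(
    "Compare ReAct and Plan-and-Execute", generate, critique, revise
)
print(rounds, final)
# 1 ReAct mixes thinking and acting. Plan-and-Execute plans first, then acts.
```

The round cap keeps cost bounded: every extra round is at least two more model calls.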
Tool Use and Function Calling
Tools are the mechanism agents use to interact with the outside world. Without tools, an agent is just a chat model. With tools, an agent can run code, query databases, send emails, or control hardware.
OpenAI function calling lets you define tools as JSON schemas:
from openai import OpenAI

client = OpenAI()

# Each tool is described by a JSON schema the model can read.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the weather in London?"}],
    tools=tools,
)

# The model does not run the function; it returns the name and arguments to call.
msg = response.choices[0].message
if msg.tool_calls:
    for call in msg.tool_calls:
        print(f"Function: {call.function.name}")
        print(f"Arguments: {call.function.arguments}")  # a JSON string
Expected output:
Function: get_weather
Arguments: {"city": "London"}
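The model only names the function and its arguments; running it is the caller's job. One way to close the loop is sketched below: look the function up in a registry, decode the JSON arguments, and wrap the result in a `tool` message. The `get_weather` body and the `REGISTRY` dict are hypothetical, and `fake_call` is a stand-in shaped like an entry of `msg.tool_calls` so the snippet runs without an API key.

```python
import json
from types import SimpleNamespace


def get_weather(city: str) -> str:
    """Hypothetical implementation with a canned reply."""
    return f"Weather in {city}: 15C, cloudy"


REGISTRY = {"get_weather": get_weather}


def run_tool_call(call) -> dict:
    """Execute one tool call and build the matching 'tool' message."""
    func = REGISTRY.get(call.function.name)
    if func is None:
        # The model asked for a function that was never registered.
        content = f"Error: unknown tool {call.function.name!r}"
    else:
        try:
            kwargs = json.loads(call.function.arguments)  # JSON string to dict
            content = str(func(**kwargs))
        except (json.JSONDecodeError, TypeError) as exc:
            content = f"Error: bad arguments ({exc})"
    return {"role": "tool", "tool_call_id": call.id, "content": content}


# Stand-in shaped like the objects in msg.tool_calls.
fake_call = SimpleNamespace(
    id="call_123",
    function=SimpleNamespace(name="get_weather", arguments='{"city": "London"}'),
)
tool_message = run_tool_call(fake_call)
print(tool_message)
# {'role': 'tool', 'tool_call_id': 'call_123', 'content': 'Weather in London: 15C, cloudy'}
```

In a real exchange, the assistant message containing the tool calls and each resulting `tool` message are appended to `messages`, and the chat completion is requested again so the model can phrase the final answer.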
Memory and Context Management
Agents need memory for two reasons: to maintain conversation context and to learn from past actions. Three memory types are common:
- Short-term memory — the current conversation window (LLM context)
- Long-term memory — stored facts retrieved via vector search
- Working memory — scratchpad for the current task
from langchain.memory import ConversationSummaryBufferMemory
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Keeps recent messages verbatim and summarizes older ones past the token limit.
memory = ConversationSummaryBufferMemory(
    llm=llm,
    max_token_limit=2000,
    memory_key="chat_history",
    return_messages=True,
)

memory.save_context(
    {"input": "My name is Alice and I like Python"},
    {"output": "Nice to meet you, Alice!"},
)
memory.save_context(
    {"input": "What is my name?"},
    {"output": "Your name is Alice."},
)

summary = memory.load_memory_variables({})
print(summary["chat_history"])
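Expected output: A list containing the four messages from both exchanges, so Alice's name is still available. Because this short conversation is well under max_token_limit, nothing is summarized yet; once the buffer exceeds 2,000 tokens, the oldest messages are folded into a running summary while recent ones are kept verbatim.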
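The three memory types listed above can also be shown side by side in plain Python. This is a toy sketch, not how any framework implements memory: the `AgentMemory` class is hypothetical, and `recall` ranks facts by shared words as a crude substitute for the vector search a real long-term store would use.

```python
from collections import deque


class AgentMemory:
    def __init__(self, window: int = 4):
        self.short_term = deque(maxlen=window)  # recent turns only
        self.long_term: list[str] = []          # durable facts
        self.working: dict[str, str] = {}       # scratchpad for the current task

    def add_turn(self, role: str, text: str) -> None:
        # Old turns fall off the left end once the window is full.
        self.short_term.append((role, text))

    def remember(self, fact: str) -> None:
        self.long_term.append(fact)

    def recall(self, query: str, k: int = 1) -> list[str]:
        """Rank facts by shared words; a real system would compare embeddings."""
        query_words = set(query.lower().split())
        scored = [
            (len(query_words & set(fact.lower().split())), fact)
            for fact in self.long_term
        ]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [fact for score, fact in ranked[:k] if score > 0]


memory = AgentMemory(window=2)
memory.remember("alice likes python")
memory.remember("the office is in berlin")
for i in range(3):
    memory.add_turn("user", f"message {i}")
memory.working["current_step"] = "draft reply"

print(list(memory.short_term))
# [('user', 'message 1'), ('user', 'message 2')]
print(memory.recall("what language does alice like"))
# ['alice likes python']
```

The fixed-size window drops "message 0" silently, which is the failure mode summarizing memories are designed to soften.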
Building an Agent with LangChain
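LangChain provides agent constructors and the AgentExecutor class to build a complete agent loop. The agent reasons, selects tools, observes results, and repeats until it reaches a final answer. This example uses create_tool_calling_agent rather than create_react_agent, because the text-based ReAct agent passes a single string to each tool and cannot call a two-argument tool such as multiply.
# Written against LangChain 0.2/0.3, where AgentExecutor lives in langchain.agents.
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.tools import tool
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI


@tool
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b


@tool
def get_word_length(word: str) -> int:
    """Return the character count of a word."""
    return len(word)


tools = [multiply, get_word_length]
llm = ChatOpenAI(model="gpt-4o", temperature=0)

# The agent needs a prompt with an agent_scratchpad slot for intermediate steps.
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Use the tools when they apply."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    max_iterations=5,
    handle_parsing_errors=True,
)

result = executor.invoke({
    "input": "Multiply 7 by 8, then find the length of the word 'agent'"
})
print(result["output"])
Expected output: The agent calls multiply with 7 and 8 to get 56 and get_word_length with "agent" to get 5, then returns a combined answer showing both results.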
Building an Agent with the OpenAI Assistants API
The OpenAI Assistants API handles threading, tool execution, and memory automatically. You define an assistant with a set of tools, create a thread, and let the assistant run until completion.
from openai import OpenAI

client = OpenAI()

assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a helpful math tutor. Use Code Interpreter when needed.",
    model="gpt-4o",
    tools=[{"type": "code_interpreter"}],
)

thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Solve the equation 3x^2 + 5x - 2 = 0",
)

# create_and_poll blocks until the run reaches a terminal status.
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id,
)

if run.status == "completed":
    # Messages come back newest first by default; ask for chronological order.
    messages = client.beta.threads.messages.list(thread_id=thread.id, order="asc")
    for msg in messages.data:
        print(f"{msg.role}: {msg.content[0].text.value}")
else:
    print(f"Run ended with status: {run.status}")
Expected output: The assistant solves the quadratic equation using Code Interpreter and returns the step-by-step solution with both roots (x = 1/3 and x = -2).
Multi-Agent Systems
Multi-agent systems decompose complex problems into roles. Each agent has a specific job and set of tools. Communication between agents enables them to collaborate on larger tasks than any single agent could handle alone.
# Written against LangChain 0.2/0.3, where AgentExecutor lives in langchain.agents.
# create_react_agent needs a prompt with {tools}, {tool_names}, and
# {agent_scratchpad} and cannot call multi-argument tools such as write_file,
# so each role is built as a tool-calling agent instead.
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.tools import tool
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI


@tool
def search_web(query: str) -> str:
    """Search the web for information."""
    return f"Mock result for: {query}"


@tool
def write_file(content: str, filename: str) -> str:
    """Write content to a file."""
    with open(filename, "w") as f:
        f.write(content)
    return f"Written to {filename}"


llm = ChatOpenAI(model="gpt-4o", temperature=0)


def build_agent(role_instructions: str, tools: list) -> AgentExecutor:
    """Create an executor with its own system prompt and tool set."""
    prompt = ChatPromptTemplate.from_messages([
        ("system", role_instructions),
        ("human", "{input}"),
        ("placeholder", "{agent_scratchpad}"),
    ])
    agent = create_tool_calling_agent(llm, tools, prompt)
    return AgentExecutor(agent=agent, tools=tools, max_iterations=3)


researcher = build_agent("You are a researcher. Gather information.", [search_web])
writer = build_agent("You are a writer. Save content to files.", [write_file])

# The researcher's output becomes the writer's input.
research_result = researcher.invoke({"input": "Find information about AI agents"})
print("Research complete")

write_result = writer.invoke({
    "input": f"Save this research to report.txt: {research_result['output']}"
})
print(write_result["output"])
Expected output: The researcher agent gathers information using the search tool, then the writer agent saves the findings to a file. Each agent works independently with its own tools and instructions.
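The handoff between agents is easier to debug when every message is recorded. Below is a framework-free sketch of a coordinator that passes a task through a fixed pipeline and logs each handoff. The `Message` class, the `PIPELINE` order, and the two one-line agent functions are hypothetical; in practice each function would wrap an executor like the ones above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Message:
    sender: str
    recipient: str
    content: str


def researcher(content: str) -> str:
    return f"notes on {content}"


def writer(content: str) -> str:
    return f"REPORT: {content}"


AGENTS: dict[str, Callable[[str], str]] = {
    "researcher": researcher,
    "writer": writer,
}
PIPELINE = ["researcher", "writer"]  # order chosen by the coordinator


def coordinate(task: str) -> tuple[str, list[Message]]:
    """Pass the task through each agent in turn, logging every handoff."""
    log: list[Message] = []
    payload, sender = task, "coordinator"
    for name in PIPELINE:
        log.append(Message(sender, name, payload))
        payload = AGENTS[name](payload)
        sender = name
    log.append(Message(sender, "coordinator", payload))
    return payload, log


report, log = coordinate("AI agents")
print(report)
# REPORT: notes on AI agents
print([(m.sender, m.recipient) for m in log])
# [('coordinator', 'researcher'), ('researcher', 'writer'), ('writer', 'coordinator')]
```

A fixed pipeline is the simplest topology; a coordinator that chooses the next agent from the previous output would replace the `PIPELINE` list with a routing function.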
Safety and Guardrails
Agents can fail in dangerous ways without proper safeguards. Always implement these guardrails:
- Max iterations — prevent infinite loops (set to 10-15)
- Timeout — kill slow tool calls after a threshold
- Input validation — sanitize arguments before passing to tools
- Human approval — require confirmation for destructive actions
- Observation limits — truncate tool outputs to prevent context overflow
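import os

# Written against LangChain 0.2/0.3, where AgentExecutor lives in langchain.agents.
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.tools import tool
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI


@tool
def delete_file(path: str) -> str:
    """Delete a file. Requires approval."""
    # Human approval: nothing is deleted unless the operator types "yes".
    confirm = input(f"Confirm delete {path}? (yes/no): ")
    if confirm.strip().lower() == "yes":
        os.remove(path)
        return f"Deleted {path}"
    return "Deletion cancelled"


@tool
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b


tools = [delete_file, multiply]
llm = ChatOpenAI(model="gpt-4o", temperature=0)
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a careful assistant."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])
agent = create_tool_calling_agent(llm, tools, prompt)

# early_stopping_method="generate" is dropped: agents built with the
# create_* helpers support only the default "force".
safe_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    max_iterations=10,  # stop runaway loops
    handle_parsing_errors=True,
)

result = safe_executor.invoke({"input": "Delete temp.txt and multiply 3 by 4"})
print(result["output"])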
Expected output: The agent asks for confirmation before deleting the file, then performs the multiplication normally. The guardrails prevent accidental data loss.
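The example above covers the iteration cap and human approval. The remaining guardrails from the list (timeout, input validation, observation limits) can be sketched as small wrappers around any tool function. The limits and the `SANDBOX` directory are assumed values chosen for illustration. Note that the thread-based timeout only stops waiting for the tool; the worker thread keeps running in the background, so actually killing a stuck tool would need a separate process.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from pathlib import Path

MAX_OBSERVATION_CHARS = 200            # assumed limit
SANDBOX = Path("/tmp/agent_sandbox")   # assumed working directory


def truncate(observation: str, limit: int = MAX_OBSERVATION_CHARS) -> str:
    """Observation limit: cap what is fed back into the context window."""
    if len(observation) <= limit:
        return observation
    return observation[:limit] + f"... [truncated {len(observation) - limit} chars]"


def validate_path(raw: str) -> Path:
    """Input validation: reject paths that resolve outside the sandbox."""
    candidate = (SANDBOX / raw).resolve()
    if not candidate.is_relative_to(SANDBOX.resolve()):
        raise ValueError(f"path escapes sandbox: {raw}")
    return candidate


def call_with_timeout(func, *args, timeout: float = 1.0) -> str:
    """Timeout: stop waiting for a slow tool and report it as an observation."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, *args)
    try:
        return truncate(str(future.result(timeout=timeout)))
    except FutureTimeout:
        return f"Error: tool timed out after {timeout}s"
    finally:
        pool.shutdown(wait=False)  # do not block on the stuck worker


def slow_tool() -> str:
    time.sleep(0.3)
    return "finished"


print(call_with_timeout(slow_tool, timeout=0.05))
# Error: tool timed out after 0.05s
print(len(call_with_timeout(lambda: "x" * 500)))
# 225
```

Returning the timeout as an error string, rather than raising, lets the agent see the failure as an observation and choose a different action.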
Common Errors
| Error | Cause | Fix |
|---|---|---|
| Agent stuck in infinite loop | Tool returns unexpected output | Set max_iterations and add fallback logic |
| Tool hallucination | Agent invokes non-existent functions | Register tools explicitly with @tool decorator |
| Context window overflow | Accumulated observations exceed token limit | Implement memory summarization |
| JSON parsing failure | LLM produces malformed function call | Set handle_parsing_errors=True |
| Agent ignores instructions | Weak system prompt | Strengthen role definition and constraints |
Practice Questions
What distinguishes an AI agent from a standard LLM call? An agent has tools, memory, and a reasoning loop that allows it to take actions and observe results, rather than just generating text once.
How does the ReAct architecture differ from Plan-and-Execute? ReAct interleaves reasoning and action dynamically, while Plan-and-Execute creates a full plan first, then executes the steps sequentially.
Why is memory important for multi-step agent tasks? Memory preserves context across steps so the agent can reference earlier observations, avoid repeating work, and maintain conversation state.
What happens when an agent calls a tool that returns an error? The agent receives the error as an observation and can retry, use a different tool, or report the failure depending on how it is prompted.
How do guardrails prevent dangerous agent behavior? Guardrails like max_iterations, timeout, input validation, and human approval checks stop agents from executing infinite loops or destructive actions.
Challenge
Build a three-agent system where a Planner creates a strategy, a Researcher gathers data from mock sources, and a Reporter formats the findings into a markdown document. Each agent must use at least one tool, and the agents must execute sequentially, with the output of one feeding the input of the next.
Real-World Task
Build a customer support agent that can look up order status, check inventory, and escalate to a human when needed. Use LangChain with three tools: a mock order lookup function, a mock inventory check function, and an email escalation function. Test it with a scenario where a customer asks about a delayed order.
Frequently Asked Questions
{{< faq question="What is the best programming language for building AI agents?">}} Python is the most popular language for AI agents because of its ecosystem of LLM frameworks, such as LangChain and CrewAI, and its extensive library support. Most agent frameworks are Python-first. {{< /faq >}}
{{< faq question="Can I build an AI agent without using LangChain?">}} Yes. You can build agents using direct API calls to OpenAI, Anthropic, or other providers with function calling. The OpenAI API Guide and Anthropic Claude API show how to implement tool use without any framework. {{< /faq >}}
{{< faq question="How do AI agents handle errors from tools?">}} When a tool returns an error, the agent receives it as an observation and decides what to do next: retry with different arguments, call a different tool, or report the failure to the user. Setting handle_parsing_errors=True in LangChain helps the agent recover from malformed outputs. {{< /faq >}}
{{< faq question="What is the difference between an AI agent and a chatbot?">}} A chatbot generates responses based on conversation history. An AI agent takes actions — it calls tools, runs code, queries databases, and executes multi-step plans. Every agent can chat, but not every chatbot is an agent. {{< /faq >}}
Next Steps
[LangChain Guide](/machine-learning/LangChain-guide/) — Master LangChain expression language, chains, and advanced agent patterns.
RAG Systems — Learn retrieval-augmented generation to give your agents access to external knowledge bases.
Build a Chatbot — Apply agent concepts to build a full-featured conversational AI chatbot.
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.