The Agentic Loop, Explained: How AI Agents Actually Work
Written by Max Zeshut
Founder at Agentmelt · Last updated Jun 9, 2026
Strip away the frameworks, the demos, and the hype, and almost every AI agent comes down to one mechanism: a loop. A model looks at the situation, decides on one action, takes it, looks at what happened, and goes again. That cycle — the agentic loop — is the single idea that separates an agent from a chatbot. If you understand the loop, you understand what agents can reliably do and where they fall over.
This post is the short version. For the full breakdown with worked traces and cited sources, see the pillar guide on agentic loops.
What the agentic loop actually is
A single-shot model answers once: prompt in, answer out, no feedback. An agent doesn't. It runs a repeating five-step cycle:
- Gather context. Assemble the goal, the history, retrieved documents, and prior results into the model's context window.
- Reason and decide. The model thinks about the current state and picks one next action — usually emitting a structured tool call rather than acting directly.
- Act. The surrounding code — the harness — validates that tool call, checks permissions, and executes it: run code, query an API, edit a file, send a message.
- Observe. The result of that action is fed back into context as ground truth — a tool output, an error, a changed file, a test result.
- Repeat or stop. With the new observation in hand, the loop runs again, until the goal is met or a stop condition fires.
As Anthropic put it in Building Effective Agents, agents are "just LLMs using tools based on environmental feedback in a loop." That phrase — environmental feedback in a loop — is the whole game. The agent doesn't try to solve the problem in one shot; it takes a small step, sees what happens, and adapts.
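The five-step cycle above can be sketched in a few lines of Python. This is a minimal illustration, not any framework's actual API: `run_agent`, `ToolCall`, `Final`, and the scripted `scripted_decide` function (a stand-in for a real model call) are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    name: str
    args: dict


@dataclass
class Final:
    answer: str


def run_agent(goal, decide, tools, max_steps=10):
    """Run gather -> decide -> act -> observe until the model stops or the cap fires."""
    history = []  # (tool call, observation) pairs from earlier iterations
    for _ in range(max_steps):
        context = {"goal": goal, "history": list(history)}  # 1. gather context
        decision = decide(context)                           # 2. reason and decide
        if isinstance(decision, Final):                      # 5. stop: goal met
            return decision.answer
        tool = tools.get(decision.name)
        if tool is None:                                     # 3. harness validates the call
            observation = f"error: unknown tool {decision.name!r}"
        else:
            try:
                observation = tool(**decision.args)          # 3. act
            except Exception as exc:
                observation = f"error: {exc}"                # errors are observations too
        history.append((decision, observation))              # 4. observe
    return None                                              # 5. stop: step cap reached


def scripted_decide(context):
    """Hypothetical stand-in for a model: call a tool once, then answer."""
    if not context["history"]:
        return ToolCall("add", {"a": 2, "b": 3})
    _, last_observation = context["history"][-1]
    return Final(f"The sum is {last_observation}")


result = run_agent("add 2 and 3", scripted_decide, {"add": lambda a, b: a + b})
print(result)  # The sum is 5
```

Note that a failed tool call is not an exception for the loop; it is written back into the history so the next decision can react to it.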
Why the loop matters
The loop is what makes an agent robust to its own mistakes. A single-shot model that misreads a requirement produces a wrong answer and stops. An agent that misreads a requirement takes an action, sees it fail, and corrects, because the observation step grounds it in reality instead of its own assumptions. That only works when the failure is visible in an observation: an error message, a red test, a rejected API call.
It's also what separates an agent from a workflow. In a workflow, a human writes the steps in code and the model fills in pieces along a fixed path. In an agent loop, the model itself decides the next step based on what it observes. Workflows are more predictable; agent loops are more flexible. Most production systems use both — workflows for the well-understood parts, agent loops where the path can't be hardcoded in advance.
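The difference is easiest to see side by side. In the sketch below, which assumes a toy support-ticket scenario with hypothetical helper names (`classify`, `draft_reply`, `choose_next`, and the `actions` table), the workflow always runs the same two steps, while the agent loop asks a chooser what to do next after every observation.

```python
def run_workflow(ticket, classify, draft_reply):
    """Workflow: the code fixes the path; the model only fills in each step."""
    category = classify(ticket)            # step 1, always
    return draft_reply(ticket, category)   # step 2, always


def run_agent_loop(ticket, choose_next, actions, max_steps=5):
    """Agent loop: the next step is chosen from what has been observed so far."""
    observations = []
    for _ in range(max_steps):
        name = choose_next(ticket, observations)
        if name == "done":
            break
        observations.append((name, actions[name](ticket)))
    return observations


# Hypothetical stand-ins for model calls and tools.
def classify(ticket):
    return "billing" if "invoice" in ticket else "general"


def draft_reply(ticket, category):
    return f"[{category}] Thanks, we are looking into it."


def choose_next(ticket, observations):
    done = [name for name, _ in observations]
    if "lookup_account" not in done:
        return "lookup_account"
    if observations[-1][1] == "overdue" and "send_reminder" not in done:
        return "send_reminder"   # chosen only because of what the lookup returned
    return "done"


actions = {
    "lookup_account": lambda ticket: "overdue",
    "send_reminder": lambda ticket: "sent",
}

reply = run_workflow("invoice missing", classify, draft_reply)
steps = run_agent_loop("invoice missing", choose_next, actions)
print(reply)  # [billing] Thanks, we are looking into it.
print(steps)  # [('lookup_account', 'overdue'), ('send_reminder', 'sent')]
```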
The patterns built on the loop
Most "agent architectures" you'll read about are variations on the base loop:
- ReAct (Reasoning + Acting). The default. The model alternates Thought → Action → Observation, feeding each result into the next thought. (ReAct, Yao et al., 2022)
- Reflection. The agent drafts an answer, critiques its own work against the goal, and revises — looping on its own feedback before returning anything. (Reflexion, Shinn et al., 2023)
- Plan-and-Execute. The agent decomposes the goal into an explicit plan first, then executes each step, re-planning when reality diverges.
- Orchestrator–Workers. A central agent breaks the task down and delegates sub-tasks to specialized workers, then synthesizes their results.
- Evaluator–Optimizer. One model generates, a second scores the output against a rubric, and the loop optimizes across rounds.
- Tree of Thoughts. The agent explores several candidate reasoning branches, evaluates them, and pursues the most promising, backtracking when a branch stalls: search instead of a single greedy chain. (Tree of Thoughts, Yao et al., 2023)
The first one — ReAct — covers the overwhelming majority of production agents. Reach for the heavier patterns only when a greedy, one-step-at-a-time loop starts thrashing.
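As one illustration of a heavier pattern, here is a rough sketch of an Evaluator–Optimizer loop. The `generate` and `score` callables stand in for two model calls. The toy versions below (`toy_generate`, `toy_score`) and the 20-character rubric are assumptions made up for the example.

```python
def evaluator_optimizer(generate, score, threshold, max_rounds=3):
    """Generate, score against a rubric, and feed the critique into the next round."""
    best_draft, best_score, feedback = None, float("-inf"), None
    for _ in range(max_rounds):
        draft = generate(feedback)        # generator sees the last critique
        value, feedback = score(draft)    # evaluator returns a score and a critique
        if value > best_score:
            best_draft, best_score = draft, value
        if value >= threshold:            # good enough: stop early
            break
    return best_draft, best_score


def toy_generate(feedback):
    # Hypothetical generator: shortens its draft when told it is too long.
    if feedback == "too long":
        return "Loop, act, observe."
    return "An agent is a model that loops, acts, and observes the result."


def toy_score(draft):
    # Hypothetical rubric: full marks for drafts of at most 20 characters.
    if len(draft) <= 20:
        return 1.0, None
    return 0.2, "too long"


draft, value = evaluator_optimizer(toy_generate, toy_score, threshold=0.9)
print(draft, value)  # Loop, act, observe. 1.0
```

The round cap matters here for the same reason it matters in the base loop: without it, a generator that never satisfies the rubric would run indefinitely.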
Where loops go wrong: loop control
Here's the part the demos skip. Every iteration costs tokens, time, and money, and a loop with no brakes is a runaway bill — or an agent stuck repeating the same failing move forever. The genuinely hard problems in agent engineering are about controlling the loop:
- Stop conditions — max-iteration caps (often 10–25 steps), token budgets, and wall-clock timeouts.
- Repeated-action detection — catch the agent looping on the same move and break out or escalate.
- Context engineering — curate what goes into the window each turn. Too much and the model degrades (a failure mode called context rot); too little and it loses the thread.
- Human-in-the-loop gates — require approval for irreversible or high-cost actions: sending money, deleting data, publishing.
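The first two controls in that list fit in one small guard object. The sketch below is one possible shape, not a standard API: `LoopGuard`, its default limits, and the stop-reason strings are all assumptions chosen for illustration.

```python
import time
from collections import Counter


class LoopGuard:
    """Tracks budgets across iterations and reports when the loop should stop."""

    def __init__(self, max_steps=20, max_tokens=50_000, max_seconds=120.0, max_repeats=3):
        self.max_steps = max_steps
        self.max_tokens = max_tokens
        self.max_seconds = max_seconds
        self.max_repeats = max_repeats
        self.started = time.monotonic()
        self.steps = 0
        self.tokens = 0
        self.seen = Counter()  # how often each exact (action, args) pair was tried

    def check(self, action_name, action_args, tokens_used):
        """Record one iteration; return a stop reason, or None to keep going."""
        self.steps += 1
        self.tokens += tokens_used
        key = (action_name, repr(sorted(action_args.items())))
        self.seen[key] += 1
        if self.seen[key] >= self.max_repeats:
            return "repeated_action"   # same move again: break out or escalate
        if self.steps >= self.max_steps:
            return "max_steps"
        if self.tokens >= self.max_tokens:
            return "token_budget"
        if time.monotonic() - self.started >= self.max_seconds:
            return "timeout"
        return None


guard = LoopGuard(max_steps=10, max_repeats=3)
reasons = [guard.check("run_tests", {"path": "tests/"}, tokens_used=500) for _ in range(3)]
print(reasons)  # [None, None, 'repeated_action']
```

Exact-match detection is the simplest version. It will miss an agent that keeps retrying the same idea with slightly different arguments, so real harnesses may need a looser notion of "same move".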
This is why the industry's attention has shifted from prompt engineering to harness engineering — building the code around the model that validates tool calls, enforces budgets, manages memory, and decides when the loop stops. As models have gotten stronger, the harness is where most agent reliability is now won or lost. Simon Willison's Designing agentic loops is a sharp practical read on this, and the Claude Agent SDK agent-loop docs show the stop-condition knobs (max turns, budget caps, compaction) in code.
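A small piece of such a harness might look like the following sketch, which validates a tool call, routes irreversible actions through an approval callback, and turns exceptions into observations. The tool names, the `IRREVERSIBLE` set, and the `approve` callback are hypothetical, and this does not reflect how any particular SDK implements these checks.

```python
IRREVERSIBLE = {"send_payment", "delete_records", "publish_post"}  # hypothetical tool names


def execute_tool_call(name, args, tools, approve):
    """Validate, gate, and run one tool call; always return an observation dict."""
    if name not in tools:
        return {"ok": False, "error": f"unknown tool: {name}"}
    if name in IRREVERSIBLE and not approve(name, args):   # human-in-the-loop gate
        return {"ok": False, "error": "rejected by human reviewer"}
    try:
        return {"ok": True, "result": tools[name](**args)}
    except Exception as exc:                               # bad arguments, tool failure
        return {"ok": False, "error": str(exc)}


tools = {
    "get_balance": lambda: 120,
    "send_payment": lambda amount: f"paid {amount}",
}


def deny_all(name, args):
    # Stand-in for a real approval step (a UI prompt, a ticket, a chat message).
    return False


balance = execute_tool_call("get_balance", {}, tools, deny_all)
payment = execute_tool_call("send_payment", {"amount": 50}, tools, deny_all)
print(balance)  # {'ok': True, 'result': 120}
print(payment)  # {'ok': False, 'error': 'rejected by human reviewer'}
```

Returning a structured error instead of raising keeps the loop alive: the rejection becomes the next observation, and the model can pick a different action.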
What the loop looks like in practice
Take a coding agent asked to fix a failing test:
- Gather: read the failing test and the function under test.
- Reason: the loop bound looks off by one.
- Act: edit the function and run the test suite.
- Observe: one assertion is still red — the fix was wrong.
- Reason + Act: adjust the boundary, run the suite again.
- Observe: all tests green → stop.
No human wrote those six steps. The agent generated each one from the observation before it. Swap the tools and you get a sales agent qualifying a lead, or a support agent resolving a ticket — same loop, different tools and stop conditions.
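The trace above can be replayed with a toy setup. Everything in this sketch is invented for illustration: `make_sum_first_n` builds a function whose loop bound is off by a given offset, `run_tests` plays the test suite, and `propose` is a scripted stand-in for the model that deliberately overcorrects on its first attempt.

```python
def make_sum_first_n(bound_offset):
    """Build a candidate implementation; offset 0 is the correct loop bound."""
    def sum_first_n(values, n):
        total = 0
        for i in range(n + bound_offset):
            if i < len(values):
                total += values[i]
        return total
    return sum_first_n


def run_tests(func):
    """Return the names of the failing tests (the observation)."""
    failures = []
    if func([1, 2, 3, 4], 2) != 3:
        failures.append("test_two_items")
    if func([1, 2, 3, 4], 0) != 0:
        failures.append("test_zero_items")
    return failures


def propose(offset, failures):
    """Scripted stand-in for the model: pick the next patch from the last observation."""
    if "test_zero_items" in failures:
        return offset - 2   # sums too many items; this first guess overcorrects
    return offset + 1       # only the two-item test is red: sums too few items


def fix_failing_test(start_offset, max_steps=5):
    offset = start_offset
    failures = run_tests(make_sum_first_n(offset))       # gather: see the red tests
    trace = [(offset, list(failures))]
    for _ in range(max_steps):
        if not failures:                                 # stop: all tests green
            break
        offset = propose(offset, failures)               # reason
        failures = run_tests(make_sum_first_n(offset))   # act + observe
        trace.append((offset, list(failures)))
    return trace


trace = fix_failing_test(start_offset=1)
print(trace)
# [(1, ['test_two_items', 'test_zero_items']), (-1, ['test_two_items']), (0, [])]
```

The middle entry is the wrong first fix: one test is still red, and that observation is what drives the second, correct patch.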
What this means if you're deploying agents
If you're evaluating or building agents, the loop is your mental model. Ask: what tools does it have, what does it observe between steps, and — most importantly — what makes it stop? A vendor who can answer those three questions has thought about reliability. One who only shows you a happy-path demo has not.
For the full guide (the canonical cycle, all six patterns with examples, and the complete source list), read What Are Agentic Loops? If you'd rather skip to shipping one, tell us your use case and we'll scope the right agent, loop control and guardrails included.
Sources
- Building Effective Agents — Anthropic, 2024
- How the Agent Loop Works (Claude Agent SDK) — Anthropic, 2026
- Unrolling the Codex agent loop — OpenAI, 2025
- ReAct: Synergizing Reasoning and Acting in Language Models — Yao et al., 2022
- Reflexion: Language Agents with Verbal Reinforcement Learning — Shinn et al., 2023
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models — Yao et al., 2023
- LLM Powered Autonomous Agents — Lilian Weng, 2023
- Designing agentic loops — Simon Willison, 2025
- What Is the AI Agent Loop? — Oracle, 2025
- Agent Loop (runtime lifecycle) — OpenClaw, 2026