Context Rot
Written by Max Zeshut
Founder at Agentmelt
The gradual degradation of an AI agent's response quality as conversation context grows longer. Even when a model technically supports a million-token context window, response quality typically peaks at much shorter context lengths. As context grows, the model's attention spreads thinner across more information, instruction following weakens, and earlier instructions get diluted by later content. Context rot is why long-running agent conversations need periodic context compression, not just larger context windows.
A coding agent works on a long debugging session, accumulating 200K tokens of context. Quality is excellent at 20K tokens, good at 60K, mediocre at 120K, and unreliable at 200K—even though the model 'supports' 200K. The team implements context compression that summarizes earlier turns when context exceeds 80K, restoring high-quality outputs.
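The compression step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production implementation: `count_tokens` is a rough characters-per-token heuristic standing in for a real tokenizer, and `summarize` is a stub where a real agent would call the model to condense the older turns.

```python
def count_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token (assumption, not a real tokenizer).
    return max(1, len(text) // 4)

def summarize(messages: list[str]) -> str:
    # Placeholder: a real agent would ask the model to summarize these turns.
    return f"[summary of {len(messages)} earlier turns]"

def compress_context(messages: list[str], limit: int = 80_000) -> list[str]:
    """If total tokens exceed `limit`, replace the oldest turns with a summary."""
    total = sum(count_tokens(m) for m in messages)
    if total <= limit:
        return messages  # still under budget, nothing to do
    kept: list[str] = []
    running = 0
    # Walk backwards, keeping the most recent turns that fit within half the
    # budget, so the summary plus recent turns sit well under the limit.
    for msg in reversed(messages):
        running += count_tokens(msg)
        if running > limit // 2:
            break
        kept.append(msg)
    kept.reverse()
    older = messages[: len(messages) - len(kept)]
    return [summarize(older)] + kept
```

Triggering compression at a threshold below the model's hard limit (80K here, mirroring the scenario above) keeps the agent operating in the context range where its quality is still high, rather than waiting until the window is full.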