Written by Max Zeshut
Founder at Agentmelt
The maximum number of tokens a language model can process in a single call, encompassing the system prompt, conversation history, retrieved documents, tool outputs, and the model's response. Context windows range from 8K tokens (older models) to 200K+ tokens (Claude, Gemini). A larger context window lets an agent reason over more information at once, but cost scales linearly with input tokens. Understanding your model's context window is essential for designing retrieval strategies and conversation management.
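One common piece of conversation management is trimming older messages so a request stays within the window. The sketch below illustrates the idea with a rough 4-characters-per-token estimate; this ratio and the `trim_history` helper are illustrative assumptions, since real token counts come from the model's own tokenizer.

```python
# Minimal sketch of conversation trimming to fit a token budget.
# CHARS_PER_TOKEN is a rough heuristic, NOT a real tokenizer.
CHARS_PER_TOKEN = 4

def estimate_tokens(text: str) -> int:
    # Crude estimate: roughly 4 characters per token in English text.
    return max(1, len(text) // CHARS_PER_TOKEN)

def trim_history(messages: list[dict], budget_tokens: int) -> list[dict]:
    """Keep the most recent messages that fit within budget_tokens."""
    kept, used = [], 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(msg["content"])
        if used + cost > budget_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))  # restore chronological order

history = [
    {"role": "user", "content": "a" * 400},       # ~100 estimated tokens
    {"role": "assistant", "content": "b" * 400},  # ~100 estimated tokens
    {"role": "user", "content": "c" * 400},       # ~100 estimated tokens
]
trimmed = trim_history(history, budget_tokens=250)  # keeps the two newest
```

Production systems would use the provider's tokenizer for exact counts and often summarize dropped messages rather than discarding them outright.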