Written by Max Zeshut
Founder at Agentmelt
Tokens that an AI model generates during an extended reasoning process and uses internally for step-by-step problem-solving; they are typically not shown to the end user. Also called 'reasoning tokens', they represent the model's intermediate reasoning, similar to scratch work on a math problem. The related term 'thinking budget' usually refers to the cap on how many such tokens a model may spend on a request. Models like Claude (extended thinking) and OpenAI's o-series use thinking tokens to improve accuracy on complex tasks. Thinking tokens count toward the model's output token usage and billing; in exchange, they give the model space to reason through multi-step problems, check its work, and consider alternatives before producing a final answer.
A legal agent asked to analyze a complex contract clause uses 2,000 thinking tokens to work through the implications: identifying relevant precedents, considering edge cases, evaluating risk from multiple angles, and checking its reasoning. It then produces a concise 200-token analysis for the lawyer. The lawyer sees only the 200-token analysis, but all 2,200 tokens count as output, and the hidden reasoning can make the short answer noticeably more accurate.
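To make the billing side of this example concrete, here is a minimal sketch of the arithmetic. The `OutputUsage` class and the price of 10.00 per million output tokens are hypothetical and chosen only for illustration; real prices vary by provider and model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputUsage:
    """Hypothetical helper: splits output tokens into hidden and visible parts."""

    thinking_tokens: int  # intermediate reasoning, not shown to the user
    answer_tokens: int    # the final answer the user actually sees

    @property
    def billed_output_tokens(self) -> int:
        # Thinking tokens count toward output usage, so both parts are billed.
        return self.thinking_tokens + self.answer_tokens

    @property
    def thinking_share(self) -> float:
        # Fraction of billed output spent on hidden reasoning.
        total = self.billed_output_tokens
        return self.thinking_tokens / total if total else 0.0

    def output_cost(self, price_per_million: float) -> float:
        # Cost of the output tokens at a given price per one million tokens.
        return self.billed_output_tokens * price_per_million / 1_000_000


# Assumption: an illustrative price, not any provider's actual rate.
HYPOTHETICAL_PRICE_PER_MILLION = 10.0

# Numbers from the legal-agent example: 2,000 thinking + 200 answer tokens.
usage = OutputUsage(thinking_tokens=2_000, answer_tokens=200)

print(usage.billed_output_tokens)        # 2200
print(f"{usage.thinking_share:.0%}")     # 91%
print(usage.output_cost(HYPOTHETICAL_PRICE_PER_MILLION))
```

Under these assumptions, roughly nine out of every ten billed output tokens go to reasoning the lawyer never reads, which is worth keeping in mind when estimating the cost of an agent that relies on extended reasoning.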