Written by Max Zeshut
Founder at Agentmelt
Directing AI requests to different models or endpoints based on task complexity, cost, or latency requirements. A router might send simple classification tasks to a small, fast model and complex reasoning tasks to a larger, more capable one. Routing this way can reduce costs by 40–70% compared to sending everything through the most powerful model, while maintaining quality where it matters.
A support agent routes simple FAQ lookups to Haiku (fast, cheap) and complex troubleshooting to Opus (accurate, slower). The router decides based on the estimated complexity of each incoming ticket.
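The support-agent example above can be sketched as a small routing function. This is a minimal illustration, not a production implementation: the keyword-and-length heuristic, the `0.5` threshold, and the function names are all assumptions for the sketch; a real router might instead use a lightweight classifier model to score complexity.

```python
def estimate_complexity(ticket: str) -> float:
    """Crude complexity heuristic (hypothetical): longer tickets and
    troubleshooting keywords score higher. Returns a value in [0, 1]."""
    keywords = {"error", "traceback", "crash", "timeout", "intermittent"}
    score = min(len(ticket) / 500, 1.0)  # longer tickets lean complex
    if any(k in ticket.lower() for k in keywords):
        score += 0.5  # troubleshooting language pushes toward the big model
    return min(score, 1.0)

def route(ticket: str, threshold: float = 0.5) -> str:
    """Pick a model tier for the ticket: below the threshold goes to the
    small, fast model; at or above it goes to the larger one."""
    if estimate_complexity(ticket) >= threshold:
        return "claude-opus"    # accurate, slower, more expensive
    return "claude-haiku"       # fast, cheap

# Simple FAQ lookup -> small model; troubleshooting -> large model
print(route("What are your business hours?"))                      # claude-haiku
print(route("Intermittent timeout error, traceback attached..."))  # claude-opus
```

In practice the threshold is tuned against real traffic: set it too low and cheap tickets leak to the expensive model, too high and hard tickets get underpowered answers.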