Written by Max Zeshut
Founder at Agentmelt
A two-stage search process in which an initial retrieval step finds candidate results (using keyword or vector search) and a reranking model then scores each candidate by its relevance to the specific query. Reranking can substantially improve search quality for AI agents because the two stages are complementary: the first is fast but approximate and narrows a large collection to a short list, while the reranker is slower but more precise and only has to score that short list. Support agents use it to find the most relevant KB article; legal agents use it to surface the most pertinent precedent.
A support agent searches 10,000 KB articles. Vector search retrieves the top 50 candidates in 20 ms. A cross-encoder reranker then scores all 50 against the customer's exact question in 100 ms and surfaces the single best article, for about 120 ms in total. The top result is typically more accurate than the one vector search alone would return.
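The two stages can be sketched in a few lines of Python. This is a toy illustration, not a production setup: the first stage uses bag-of-words cosine similarity as a stand-in for vector search, and `rerank_score` is a hypothetical stand-in for a cross-encoder that rewards query phrases appearing intact in the document. A real system would call an embedding index and a trained reranking model instead. The function names, `DOCS`, and `QUERY` are all invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k):
    """Stage 1 (stand-in for vector search): cheap score for every doc, keep top k."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), i) for i, d in enumerate(docs)]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best first, index breaks ties
    return [i for _, i in scored[:k]]


def rerank_score(query, doc):
    """Stage 2 (toy stand-in for a cross-encoder): looks at query and doc together.

    Counts shared words, plus a bonus for each query bigram found intact in the doc.
    """
    q, d = tokenize(query), tokenize(doc)
    doc_bigrams = set(zip(d, d[1:]))
    shared_bigrams = sum(1 for bg in zip(q, q[1:]) if bg in doc_bigrams)
    shared_words = len(set(q) & set(d))
    return 2 * shared_bigrams + shared_words


def search(query, docs, k=3):
    """Retrieve k candidates cheaply, then reorder only those k with the reranker."""
    candidates = retrieve(query, docs, k)
    return sorted(candidates, key=lambda i: (-rerank_score(query, docs[i]), i))


# Hypothetical mini knowledge base.
DOCS = [
    "password password password policy and password rules",
    "if you forgot it, open the login page, choose the option to reset my "
    "password, and follow the emailed link",
    "billing and invoice questions",
    "change my email address",
]
QUERY = "reset my password"
```

In this toy data, the first stage ranks the keyword-heavy policy article (index 0) highest, while the reranker moves the article that actually answers the question (index 1) to the top. The reranker only ever sees the k candidates, which is the point of the design: if reranking cost grows roughly linearly with the number of candidates (an assumption), the 100 ms for 50 candidates in the example above works out to about 2 ms per pair, so scoring all 10,000 articles directly would take on the order of 20 seconds.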