Transferring knowledge from a large, expensive AI model (the teacher) to a smaller, faster model (the student) that approximates the teacher's performance at lower cost. Distillation lets production agents run affordably at scale: for example, a distilled model might handle 90% of support tickets at 10% of the inference cost, while complex cases route to the full model.
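The core training signal in distillation can be sketched concretely. Below is a minimal pure-Python illustration of the classic soft-label distillation loss (Hinton et al.): the teacher's logits are softened with a temperature, and the student is penalized by the KL divergence between the two softened distributions. All names and the example logits here are illustrative, not from the source.

```python
import math

def softmax(logits, temperature=1.0):
    # Higher temperature flattens the distribution, exposing the
    # teacher's relative confidence across wrong answers ("dark knowledge").
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL divergence between the teacher's softened distribution and the
    # student's, scaled by T^2 so gradient magnitudes stay comparable
    # across temperatures (as in the original distillation paper).
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return (temperature ** 2) * kl

# Illustrative logits over three classes.
teacher = [4.0, 1.0, 0.2]
student = [3.5, 1.2, 0.3]
print(distillation_loss(teacher, student))
```

In practice this term is combined with the ordinary cross-entropy loss on hard labels, and the student is trained to minimize the weighted sum; the loss is zero exactly when the student reproduces the teacher's distribution.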