Loading…
Loading…
The process of running a trained AI model to generate predictions or outputs from new input data. Every time an agent answers a question, writes an email, or classifies a ticket, it's performing inference. Inference cost and speed directly affect agent operating expenses and user experience—faster inference means snappier agents.
Back to glossary