The time delay between sending a request to an AI model and receiving its response. Low latency is critical for real-time applications such as voice agents, where delays feel unnatural, and live chat support. Contributing factors include model size, infrastructure, and whether the agent must call external tools before responding.
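
As a rough illustration of the definition above, latency can be measured by timing a single request from send to response. The sketch below is a minimal example, not a reference implementation: `measure_latency` and `fake_model` are hypothetical names, and `fake_model` only sleeps briefly to stand in for a real model call.

```python
import time
from typing import Callable, Tuple


def measure_latency(send_request: Callable[[str], str], prompt: str) -> Tuple[str, float]:
    """Call send_request(prompt) and return (response, elapsed seconds)."""
    start = time.perf_counter()  # monotonic clock suited to short intervals
    response = send_request(prompt)
    elapsed = time.perf_counter() - start
    return response, elapsed


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; the sleep simulates
    # model inference plus any external tool calls made before responding.
    time.sleep(0.05)
    return f"echo: {prompt}"


if __name__ == "__main__":
    response, seconds = measure_latency(fake_model, "hello")
    print(f"latency: {seconds * 1000:.0f} ms")
```

In practice, a single measurement is noisy, so latency is usually summarized over many requests rather than read from one call.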