Delivering AI model output token by token as it's generated, rather than waiting for the complete response. Streaming reduces perceived latency for users—chat interfaces and voice agents feel more responsive when text appears incrementally. For voice agents, streaming is essential: text-to-speech begins before the full response is generated, cutting response time by 50–80%.
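A minimal sketch of consuming a streamed response, using the OpenAI Python SDK's `stream=True` option; the model name and prompt here are placeholders, and other provider SDKs expose a similar chunk-iteration pattern:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# With stream=True, the API yields chunks as tokens are generated
# instead of returning one response after generation completes.
stream = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Explain streaming in one paragraph."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:  # some chunks (e.g. the final one) carry no text
        print(delta, end="", flush=True)  # render tokens as they arrive
print()
```

The same loop is where a voice agent would hand each text fragment to a text-to-speech engine, so audio playback starts well before the model finishes generating.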