Streaming is when AI sends its response incrementally — word by word or chunk by chunk — rather than waiting until the entire response is generated. This creates the typing effect you see in ChatGPT and other AI interfaces. For developers building AI products, streaming reduces perceived latency and creates a more responsive user experience.
Streaming is a UX pattern that makes AI feel fast. Even though the total generation time is the same, seeing words appear immediately transforms the experience.
| Without Streaming | With Streaming |
|---|---|
| Wait... wait... full response | Words appear immediately |
| Feels slow and unresponsive | Feels fast and interactive |
| User wonders if it's working | User sees progress in real time |
| All-or-nothing | Can stop early if off-track |
Most AI SDKs support streaming natively, typically exposing the response as an async iterable:

```ts
// Pseudocode: the exact API varies by SDK, but the shape is similar
const stream = await ai.streamText({
  model: "claude-sonnet",
  prompt: "Explain vibe coding",
})

for await (const chunk of stream) {
  // Display each chunk as it arrives, e.g. append it to the UI
  process.stdout.write(chunk)
}
```
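To see the consumption pattern without depending on any particular SDK, here is a minimal self-contained sketch. The `fakeModelStream` generator, its chunk size, and the delay are illustrative assumptions standing in for a real model API; the point is that the consumer renders each chunk as it arrives rather than waiting for the whole string.

```typescript
// Hypothetical stand-in for a model's streaming endpoint:
// an async generator that yields the response in small chunks.
async function* fakeModelStream(
  text: string,
  chunkSize = 8
): AsyncGenerator<string> {
  for (let i = 0; i < text.length; i += chunkSize) {
    // Simulate generation/network delay between chunks
    await new Promise((resolve) => setTimeout(resolve, 10));
    yield text.slice(i, i + chunkSize);
  }
}

async function main() {
  let rendered = "";
  for await (const chunk of fakeModelStream("Streaming makes AI feel fast.")) {
    rendered += chunk; // in a real UI, append the chunk to the visible message
    console.log(`partial: ${rendered}`);
  }
  console.log(`final: ${rendered}`);
}

main();
```

The consumer loop is identical whether the chunks come from this fake generator or a real SDK stream, which is why swapping in real streaming later is usually a one-line change.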
- **Stream:** chat interfaces, long responses, creative content
- **Don't stream:** API calls expecting structured data, background processing, batch operations