Build a Streaming Chat Endpoint

Waiting 5 seconds for a complete AI response before showing anything makes your app feel broken. Streaming delivers tokens as they're generated, so users see the response forming in real time. Let's build one from scratch so you understand exactly what's happening between the browser and the AI.

Outcome

Build a SvelteKit server endpoint that streams AI responses using the AI SDK and Server-Sent Events.

Fast Track

  1. Set up the AI Gateway provider with your API key
  2. Use streamText() to get a streaming response from Claude
  3. Pipe the stream into an SSE response using ReadableStream

How Streaming Works

Browser                 SvelteKit               Claude API
   │                       │                       │
   │── POST /api/chat ───→ │                       │
   │                       │── streamText() ─────→ │
   │                       │                       │
   │                       │ ←─ token: "I" ─────── │
   │ ←─ SSE: "I" ───────── │                       │
   │                       │ ←─ token: "'ll" ───── │
   │ ←─ SSE: "'ll" ─────── │                       │
   │                       │ ←─ token: " help" ─── │
   │ ←─ SSE: " help" ───── │                       │
   │                       │ ←─ [done] ─────────── │
   │ ←─ SSE: [DONE] ────── │                       │

The AI SDK handles the Claude API connection. You handle turning it into Server-Sent Events for the browser.
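On the wire, each token becomes one SSE frame, which is plain string formatting. A small sketch of that framing (the `sseFrame` helper is hypothetical, not part of the starter code):

```typescript
// Sketch: wrap one token in the SSE frame the browser expects.
// Each event is a "data: <payload>" line followed by a blank line.
function sseFrame(token: string): string {
  return `data: ${JSON.stringify({ type: 'text', content: token })}\n\n`;
}
```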

Hands-on exercise 2.1

Let's replace the placeholder in src/routes/api/chat/+server.ts with a streaming implementation:

Requirements:

  1. Import and configure the AI Gateway provider from ai
  2. Use streamText() from the ai package with a system prompt about ski resorts
  3. Iterate over result.fullStream and emit text-delta events as SSE
  4. Return a ReadableStream response with the correct SSE headers

Implementation hints:

  • The gateway client needs AI_GATEWAY_API_KEY from $env/static/private
  • Use anthropic/claude-sonnet-4 as the model
  • The system prompt should list available resorts so the AI knows what to talk about
  • The Chat.svelte component already handles SSE parsing; it expects the data: {"type": "text", "content": "..."} format
  • Don't add tools yet; that's the next lesson
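One way to assemble that system prompt is a small helper over the resort data. A sketch, assuming each resort has an id and a name as in $lib/data/resorts (the `Resort` shape and `buildSystemPrompt` name are illustrative, not part of the starter code):

```typescript
// Sketch: build the ski-assistant system prompt from resort data.
// Assumes each resort exposes an id and a name.
interface Resort {
  id: string;
  name: string;
}

function buildSystemPrompt(resorts: Resort[]): string {
  // One bullet per resort so the model knows exactly what it can discuss.
  const resortList = resorts.map((r) => `- ${r.name} (id: ${r.id})`).join('\n');
  return `You are a helpful ski conditions assistant. Users want to learn about ski resort conditions.

Available resorts:
${resortList}

Provide helpful information about these resorts and current conditions.`;
}
```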

SSE format the frontend expects:

data: {"type": "text", "content": "I"}

data: {"type": "text", "content": "'ll"}

data: {"type": "text", "content": " help"}

data: [DONE]
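For reference, the parsing Chat.svelte performs on those frames boils down to something like this (a hypothetical sketch, not the actual component code):

```typescript
// Sketch: parse one SSE line into a chat event.
// A frame is "data: <payload>", where the payload is either the literal
// "[DONE]" sentinel or JSON like {"type":"text","content":"..."}.
type ChatEvent = { type: 'text'; content: string } | { type: 'done' };

function parseSSELine(line: string): ChatEvent | null {
  if (!line.startsWith('data: ')) return null; // skip blank lines and comments
  const payload = line.slice('data: '.length);
  if (payload === '[DONE]') return { type: 'done' };
  const parsed = JSON.parse(payload);
  if (parsed.type === 'text') return { type: 'text', content: parsed.content };
  return null; // unknown event types are ignored
}
```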

Try It

  1. Start the dev server and open the app

  2. Type a message in the chat panel:

    What resorts do you know about?
    
  3. Watch the response stream in: The AI should respond with information about the 5 available resorts (Mammoth Mountain, Palisades Tahoe, Grand Targhee, Steamboat, Mt. Bachelor). You'll see tokens appear one by one.

  4. Check the Network tab:

    • The request to /api/chat should show Content-Type: text/event-stream
    • The response streams in chunks rather than arriving all at once

Tools don't work yet

If you ask "alert me when Mammoth gets powder," the AI will respond with text but can't create an alert. You'll add that in the next lesson.

Commit and Deploy

git add -A
git commit -m "feat(chat): implement streaming AI chat endpoint"
git push

Pushing triggers a new deployment on Vercel so you can test streaming in production.

Done-When

  • Chat endpoint returns streaming SSE responses
  • AI responses appear token-by-token in the chat UI
  • The AI knows about the 5 ski resorts from the system prompt
  • No errors in the browser console or server logs

Solution

src/routes/api/chat/+server.ts
import { createGateway, streamText } from 'ai';
import { resorts } from '$lib/data/resorts';
import { AI_GATEWAY_API_KEY } from '$env/static/private';
import type { RequestHandler } from './$types';
 
const gateway = createGateway({
  apiKey: AI_GATEWAY_API_KEY
});
 
export const POST: RequestHandler = async ({ request }) => {
  const { message } = await request.json();
 
  const resortList = resorts.map((r) => `- ${r.name} (id: ${r.id})`).join('\n');
 
  const result = streamText({
    model: gateway('anthropic/claude-sonnet-4'),
    system: `You are a helpful ski conditions assistant. Users want to learn about ski resort conditions.
 
Available resorts:
${resortList}
 
Provide helpful information about these resorts and current conditions.`,
    messages: [{ role: 'user', content: message }]
  });
 
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      for await (const part of result.fullStream) {
        if (part.type === 'text-delta') {
          controller.enqueue(
            encoder.encode(
              `data: ${JSON.stringify({ type: 'text', content: part.text })}\n\n`
            )
          );
        }
      }
      controller.enqueue(encoder.encode('data: [DONE]\n\n'));
      controller.close();
    }
  });
 
  return new Response(stream, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      Connection: 'keep-alive'
    }
  });
};

The createGateway() call sets up the connection to the AI Gateway with your API key. From there, streamText() starts the conversation and hands back an async iterable, result.fullStream, that emits tokens as they arrive. We wrap each text-delta in the SSE format the frontend expects and push it through a ReadableStream. The SSE headers tell the browser to keep the connection open and parse events as they flow in.

Troubleshooting

Blank chat with no response

Check that AI_GATEWAY_API_KEY is set in your .env file. If you skipped lesson 1.2 or the key is missing, the gateway client will fail silently and the stream will never start.

Response arrives all at once instead of streaming

Verify the Content-Type header is text/event-stream. A missing or wrong header causes the browser to buffer the entire response before rendering. Also check you're not running behind a proxy that buffers SSE connections.
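If the proxy in front of your app is nginx (or nginx-compatible), an X-Accel-Buffering: no header asks it not to buffer the stream; it's harmless elsewhere. A sketch of the header set with that addition (the `sseHeaders` helper is a hypothetical name):

```typescript
// Sketch: response headers for an SSE stream.
// 'X-Accel-Buffering: no' disables buffering in nginx-style proxies.
function sseHeaders(): Record<string, string> {
  return {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    Connection: 'keep-alive',
    'X-Accel-Buffering': 'no'
  };
}
```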

Advanced: The Chat Class from @ai-sdk/svelte

This lesson builds manual SSE streaming so you understand how it works. In production, AI SDK v6 provides a Chat class in @ai-sdk/svelte that handles all the client-side stream parsing for you:

<script lang="ts">
  import { Chat } from '@ai-sdk/svelte';
 
  let input = $state('');
  const chat = new Chat({});
</script>
 
{#each chat.messages as message}
  {#each message.parts as part}
    {#if part.type === 'text'}
      <p>{part.text}</p>
    {/if}
  {/each}
{/each}
 
<form onsubmit={(e) => { e.preventDefault(); chat.sendMessage({ text: input }); input = ''; }}>
  <input bind:value={input} />
</form>

The server endpoint would use toUIMessageStreamResponse() instead of manual SSE:

return result.toUIMessageStreamResponse();

The manual approach in this lesson gives you full control over the SSE format and event types (like the custom alert_created event in the next lesson). Use Chat when you don't need custom event handling.

Advanced: Error Handling in Streams

If the API key is missing or the request fails, the stream will error. Add a try/catch inside the start() function:

const stream = new ReadableStream({
  async start(controller) {
    try {
      for await (const part of result.fullStream) {
        if (part.type === 'text-delta') {
          controller.enqueue(
            encoder.encode(
              `data: ${JSON.stringify({ type: 'text', content: part.text })}\n\n`
            )
          );
        }
      }
    } catch (error) {
      controller.enqueue(
        encoder.encode(
          `data: ${JSON.stringify({ type: 'error', content: 'Stream failed' })}\n\n`
        )
      );
    }
    controller.enqueue(encoder.encode('data: [DONE]\n\n'));
    controller.close();
  }
});
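On the client, that error frame can be handled alongside text frames by branching on the event type (a hypothetical sketch; `renderEvent` and its callbacks are illustrative names, not part of Chat.svelte):

```typescript
// Sketch: dispatch one SSE payload to the UI.
// The 'error' branch matches the {"type":"error","content":"..."} frame
// emitted by the server's catch block.
function renderEvent(
  payload: string,
  appendText: (token: string) => void,
  showError: (message: string) => void
): void {
  if (payload === '[DONE]') return; // stream finished, nothing to render
  const event = JSON.parse(payload) as { type: string; content: string };
  if (event.type === 'text') appendText(event.content);
  else if (event.type === 'error') showError(event.content);
}
```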