Streaming Examples
Explore streaming on Vercel with code samples that work out of the box. Vercel supports streaming in Edge and Serverless Functions, with some limitations. This page provides examples of techniques for processing data from streams.
To learn how streaming on Vercel works, see Streaming.
To use these examples, you should know how to create a Function with your preferred framework, which you can learn in the following quickstarts:
You should also have a fundamental understanding of how streaming works on Vercel. See the following docs to learn more:
AI providers can be slow when producing responses, but many make their responses available in chunks as they're processed. Streaming enables you to show users those chunks of data as they arrive rather than waiting for the full response, improving the perceived speed of AI-powered apps.
Vercel recommends using Vercel's AI SDK to stream responses from LLMs and AI APIs. It reduces the boilerplate necessary for streaming responses from AI providers.
The following example demonstrates a Function that sends a message to one of OpenAI's GPT models and streams the response:
import OpenAI from 'openai';
import { OpenAIStream, StreamingTextResponse } from 'ai';
// Can be 'nodejs', but Vercel recommends using 'edge'
export const runtime = 'edge';
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY,
});
// This method must be named GET
export async function GET() {
// Make a request to OpenAI's API based on
// a placeholder prompt
const response = await openai.chat.completions.create({
model: 'gpt-3.5-turbo',
stream: true,
messages: [{ role: 'user', content: 'Say this is a test.' }],
});
// Convert the response into a friendly text-stream.
// Note: the response stream can only be read once, so
// don't iterate over it before passing it to OpenAIStream
const stream = OpenAIStream(response);
// Respond with the stream
return new StreamingTextResponse(stream);
}
Build your app and visit localhost:3000/api/chat-example. You should see the text "This is a test." in the browser.
Chunks in web streams are fundamental data units whose type depends on the content, such as String for text or Uint8Array for binary data. While standard Function responses contain the full payload of data processed on the server, streamed responses typically send data in chunks over time.
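For instance, encoded text arrives as Uint8Array chunks that need to be decoded back into strings. A minimal sketch (the chunk boundaries are arbitrary):

```typescript
// Chunks of encoded text are Uint8Array values, not strings
const encoder = new TextEncoder();
const chunks: Uint8Array[] = [
  encoder.encode('Hello, '),
  encoder.encode('stream!'),
];
console.log(chunks[0] instanceof Uint8Array); // true

// Passing { stream: true } tells the decoder that more chunks
// may follow, so multi-byte characters split across chunk
// boundaries are handled correctly
const decoder = new TextDecoder();
let text = '';
for (const chunk of chunks) {
  text += decoder.decode(chunk, { stream: true });
}
text += decoder.decode(); // flush any buffered bytes
console.log(text); // 'Hello, stream!'
```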
This example will demonstrate how to:
Create a ReadableStream and add a data source. In this case, you'll create your own data by encoding text with TextEncoder:
// TextEncoder objects turn text content
// into streams of UTF-8 characters.
// You'll add this encoder to your stream
const encoder = new TextEncoder();
// This is the stream object, which clients can read from
// when you send it as a Function response
const readableStream = new ReadableStream({
  // The start method is where you'll add the stream's content
  start(controller) {
    const text = 'Stream me!';
    // Queue the encoded content into the stream
    controller.enqueue(encoder.encode(text));
    // Prevent more content from being
    // added to the stream
    controller.close();
  },
});
Transform the stream's data chunks before they're read by the client. First, you'll decode the chunks with TextDecoder, then transform the text to uppercase before encoding it again:
// TextDecoders can decode streams of
// encoded content. You'll use this to
// transform the streamed content before
// it's read by the client
const decoder = new TextDecoder();
// TransformStreams can transform a stream's chunks
// before they're read in the client
const transformStream = new TransformStream({
  transform(chunk, controller) {
    // Decode the content, so it can be transformed
    const text = decoder.decode(chunk);
    // Make the text uppercase, then encode it and
    // add it back to the stream
    controller.enqueue(encoder.encode(text.toUpperCase()));
  },
});
Finally, stream the data chunk by chunk as a Function response:
// Finally, send the streamed response. Result:
// "STREAM ME!" will be displayed in the client
return new Response(readableStream.pipeThrough(transformStream), {
  headers: {
    'Content-Type': 'text/html; charset=utf-8',
  },
});
The final file will look like this:
// Must be 'edge' in non-Node.js frameworks
export const runtime = 'edge';
// This method must be named GET
export async function GET() {
// TextEncoder objects turn text content
// into streams of UTF-8 characters.
// You'll add this encoder to your stream
const encoder = new TextEncoder();
// This is the stream object, which clients can read from
// when you send it as a Function response
const readableStream = new ReadableStream({
// The start method is where you'll add the stream's content
start(controller) {
const text = 'Stream me!';
// Queue the encoded content into the stream
controller.enqueue(encoder.encode(text));
// Prevent more content from being
// added to the stream
controller.close();
},
});
// TextDecoders can decode streams of
// encoded content. You'll use this to
// transform the streamed content before
// it's read by the client
const decoder = new TextDecoder();
// TransformStreams can transform a stream's chunks
// before they're read in the client
const transformStream = new TransformStream({
transform(chunk, controller) {
// Decode the content, so it can be transformed
const text = decoder.decode(chunk);
// Make the text uppercase, then encode it and
// add it back to the stream
controller.enqueue(encoder.encode(text.toUpperCase()));
},
});
// Finally, send the streamed response. Result:
// "STREAM ME!" will be displayed in the client
return new Response(readableStream.pipeThrough(transformStream), {
headers: {
'Content-Type': 'text/html; charset=utf-8',
},
});
}
Build your app and visit localhost:3000/api/chunk-example. You should see the text "STREAM ME!" in the browser.
See Understanding Chunks to learn more.
When the server produces data faster than the client can consume it, excess data queues up in memory. This issue is called backpressure, and it can lead to memory overflow errors, or to data loss when the buffer reaches capacity.
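Web streams surface this pressure through the controller's desiredSize, which drops as the stream's internal queue fills. The following is a minimal sketch; the queue size of 2 is an arbitrary choice for illustration:

```typescript
// Record desiredSize as the queue fills up
const observed: (number | null)[] = [];

// A stream that buffers at most 2 chunks before signaling
// backpressure via the controller's desiredSize
const stream = new ReadableStream(
  {
    start(controller) {
      observed.push(controller.desiredSize); // room for two chunks
      controller.enqueue('a');
      controller.enqueue('b');
      // The queue is now full; a desiredSize of 0 or less signals
      // that a well-behaved producer should pause
      observed.push(controller.desiredSize);
      controller.close();
    },
  },
  new CountQueuingStrategy({ highWaterMark: 2 }),
);

console.log(observed); // [ 2, 0 ]
```

A producer that checks desiredSize before enqueueing, instead of pushing data as fast as it can, is the core of the pull-based pattern this example builds.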
This example will demonstrate how to:
- Simulate backpressure by creating a function that generates data faster than a stream can read it
- Handle backpressure by pushing data into a stream as it's needed, rather than as it's ready
To create this example:
Create the function that will generate the data. In this case, it will be a generator function that yields a new integer indefinitely:
// For Serverless, set this to 'nodejs'
export const runtime = 'edge';
// A generator that will yield positive integers
async function* integers() {
  let i = 1;
  while (true) {
    console.log(`yielding ${i}`);
    yield i++;
    await sleep(100);
  }
}
// Add a custom sleep function to create
// a delay that simulates how slow some
// Function responses are.
function sleep(ms: number) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
Next, create a method that adds the generator function to a ReadableStream. Using the pull handler, you can stop pulling new data from the generator into the stream when no more data is being requested:
// Wraps a generator into a ReadableStream
function createStream(iterator: AsyncGenerator<number, void, unknown>) {
  return new ReadableStream({
    // The pull method controls what happens
    // when data is added to a stream.
    async pull(controller) {
      const { value, done } = await iterator.next();
      // done == true when the generator will yield
      // no more new values. If that's the case,
      // close the stream.
      if (done) {
        controller.close();
      } else {
        controller.enqueue(value);
      }
    },
  });
}
Finally, iterate through a loop and read data from the stream. Without the code that checks if the generator is done, the stream would continue taking values from integers() indefinitely, filling up memory. Because the code checks if the generator is done, the stream closes after you iterate as many times as loopCount:
// Demonstrate handling backpressure
async function backpressureDemo() {
  // Set up a stream of integers
  const stream = createStream(integers());
  // Read values from the stream
  const reader = stream.getReader();
  const loopCount = 5;
  // Read as much data as you want
  for (let i = 0; i < loopCount; i++) {
    // Get the newest value added to the stream
    const { value } = await reader.read();
    console.log(`Stream value: ${value}`);
    await sleep(1000);
  }
}
The final file, including the route handler function, will look like this:
// For Serverless, set this to 'nodejs'
export const runtime = 'edge';
// A generator that will yield positive integers
async function* integers() {
let i = 1;
while (true) {
console.log(`yielding ${i}`);
yield i++;
await sleep(100);
}
}
// Add a custom sleep function to create
// a delay that simulates how slow some
// Function responses are.
function sleep(ms: number) {
return new Promise((resolve) => setTimeout(resolve, ms));
}
// Wraps a generator into a ReadableStream
function createStream(iterator: AsyncGenerator<number, void, unknown>) {
return new ReadableStream({
// The pull method controls what happens
// when data is added to a stream.
async pull(controller) {
const { value, done } = await iterator.next();
// done == true when the generator will yield
// no more new values. If that's the case,
// close the stream.
if (done) {
controller.close();
} else {
controller.enqueue(value);
}
},
});
}
// Demonstrate handling backpressure
async function backpressureDemo() {
// Set up a stream of integers
const stream = createStream(integers());
// Read values from the stream
const reader = stream.getReader();
const loopCount = 5;
// Read as much data as you want
for (let i = 0; i < loopCount; i++) {
// Get the newest value added to the stream
const { value } = await reader.read();
console.log(`Stream value: ${value}`);
await sleep(1000);
}
}
export async function GET() {
backpressureDemo();
return new Response('Check your console to see the result!');
}