7 min read
Introducing image generation, non-blocking data streaming, improved tool calling, and more.
The AI SDK is an open-source toolkit for building AI applications with JavaScript and TypeScript. Its unified provider API allows you to use any language model and enables powerful UI integrations into leading web frameworks such as Next.js and Svelte.
Since our 4.0 release, we've seen some incredible products powered by the AI SDK:
Languine is an open-source CLI tool that automates application localization, detecting translation changes and maintaining consistent tone across all major i18n libraries.
Scira is a minimalist AI-powered search engine, using the AI SDK to power search-grounded LLM responses and powerful generative UI interactions.
Fullmoon enables cross-platform chat with private local LLMs, bringing secure AI conversations to everyone.
Each of these projects is open-source (Languine, Scira, Fullmoon), giving you the opportunity to explore how these AI-powered applications are built.
Today, we're announcing the release of AI SDK 4.1, which introduces image generation capabilities. This update allows developers to generate images through a unified API that works seamlessly across providers like Replicate, OpenAI, Google Vertex, and Fireworks.
Along with image generation, this release includes:
Let's explore these new features and improvements.
Image generation
Generating images from text prompts is a novel generative AI capability that enables new types of applications and workflows. The ecosystem is growing rapidly, with providers such as Replicate supporting hundreds of different image generation models, adding more and more every day.
With AI SDK 4.1, we're taking our first step towards enabling multi-modal outputs by introducing support for image generation through the new experimental generateImage
function.
import { experimental_generateImage as generateImage } from 'ai';import { replicate } from '@ai-sdk/replicate';
const { image } = await generateImage({ model: replicate.image('black-forest-labs/flux-1.1-pro-ultra'), prompt: 'A futuristic cityscape at sunset',});
Switching between AI providers is as simple as changing 2 lines of code - your prompt and settings remain unchanged:
import { experimental_generateImage as generateImage } from 'ai';import { fireworks } from '@ai-sdk/openai';
const { image } = await generateImage({ model: fireworks.image('accounts/fireworks/models/SSD-1B'), prompt: 'A futuristic cityscape at sunset',});
With the generateImage
function, you have full control of parameters like:
Control image dimensions with
size
oraspectRatio
Generate multiple images in parallel with
n
Access images in both base64 and uint8Array formats
Control randomness with
seed
Provider-specific options are also supported through the providerOptions
parameter:
const { image } = await generateImage({ model: replicate.image('black-forest-labs/flux-1.1-pro-ultra'), prompt: 'A futuristic cityscape at sunset', size: "16:9", n: 3, seed: 0, providerOptions: { replicate: { style: 'realistic_image' }, },});
The AI SDK supports image generation across multiple providers including Replicate, OpenAI, Google Vertex AI, and Fireworks.
Check out our image generation demo to see how different providers handle the same prompts and explore the capabilities of each model.
Stream transformation & smoothing
AI SDK 4.1 introduces new capabilities for transforming stream output on the server. This enables powerful use cases such as:
Creating smoother streaming experiences with the built-in
smoothStream
transform (with custom chunking options such as by character, word, line)Filtering content and applying safety guardrails
Any custom transformation (eg. uppercase)
For example, the built-in smoothStream
function helps create more natural text streaming by smoothing chunky or inconsistent provider responses into a smoother flow:
import { smoothStream, streamText } from 'ai';
const result = streamText({ model, prompt, experimental_transform: smoothStream(),});
Multiple transformations can be applied by passing them as an array:
const result = streamText({ model, prompt, experimental_transform: [firstTransform, secondTransform],});
Check out our stream transformation documentation to learn more about configuring chunking patterns, implementing content filtering, and creating your own transformations.
Simplified persistence with useChat
We heard your feedback that adding persistence to useChat
is too complicated. To address this, we've added three key improvements:
Chat ID can be forwarded from the client to the server
Response message IDs can be forwarded from the server to the client
The new
appendResponseMessages
utility unifies messages for simple saving
Check out our chat persistence guide to learn more, or start with our minimal example if you prefer to dive straight into code.
Non-blocking data streaming
AI SDK 4.1 introduces powerful new streaming functionality with the createDataStreamResponse
function enabling powerful use cases like streaming retrieval-augmented generation (RAG) context and search results to the client before the LLM’s response begins. Previously, streaming was limited to returning the result of a single LLM call (eg. streamText().toDataStreamResponse()
). Now, you can create non-blocking data streams that:
Return immediately and allows you to stream data on-demand
Provide full control over when and what data is streamed
Support adding annotations and metadata to messages
Here's an example of using createDataStreamResponse
to stream custom data alongside LLM output:
import { openai } from "@ai-sdk/openai";import { createDataStreamResponse, Message, streamText } from "ai";import { getRelevantContent } from "./get-relevant-content"; // user-defined
export async function POST(req: Request) { const { messages }: { messages: Message[] } = await req.json();
const lastMessage = messages.pop();
return createDataStreamResponse({ execute: async (dataStream) => { const relevantContent = await getRelevantContent(lastMessage.content); for (const content of relevantContent) { dataStream.writeData({ type: "source", url: content.url, title: content.title, }); }
lastMessage.content = lastMessage.content + "\n\nUse the following information to answer the question: " + relevantContent.join("\n");
const result = streamText({ model: openai("gpt-4o"), messages: [...messages, lastMessage], onFinish: async ({}) => { dataStream.writeMessageAnnotation({ sources: relevantContent }); }, });
result.mergeIntoDataStream(dataStream); }, });}
The streamed data is automatically handled by the useChat
hook on the client, making it simple to access both the message content and any additional streamed data:
"use client";
import { useChat } from "ai/react";
export default function Chat() { const { messages, data } = useChat(); // Access streamed data console.log(data); // Access message annotations messages.forEach(m => console.log(m.annotations));
return (/* ... */);}
To learn more, check out the streaming custom data documentation.
Tool-calling improvements
Tools are a core building block of production AI applications - they enable language models to interact with real-world systems and data. However, getting tools to work reliably can be challenging. With AI SDK 4.1, we've invested significantly in making tool calling more robust:
Improved context within tool calls
When executing a tool call, the execute
function now has access to helpful context accessible through a second parameter:
toolCallId
for tracking specific executions and adding tool-related annotationsmessages
array containing full conversation history, including previous tool calls and resultsabortSignal
for canceling long-running operations and forwarding to fetch calls
Here's an example using these context options:
const result = await generateText({ model, abortSignal, tools: { weather: tool({ parameters: z.object({ location: z.string() }), execute: async ({ location }, { toolCallId, messages, abortSignal }) => { // Use toolCallId for tracking data.appendMessageAnnotation({ type: 'tool-status', toolCallId, status: 'in-progress', });
// Forward abort signal const response = await fetch( `https://api.weatherapi.com/v1/current.json?q=${location}`, { signal: abortSignal }, );
return response.json(); }, }), },});
To learn more, check out the tool-calling documentation.
Tool call repair
When tool calls fail, you can now use the experimental_toToolCallRepair
function to attempt repairs to:
Use a model with structured outputs to generate the arguments.
Send the messages, system prompt, and tool schema to a stronger model to generate the arguments.
Provide more specific repair instructions based on which tool was called.
import { openai } from '@ai-sdk/openai';import { generateObject, generateText, NoSuchToolError, tool } from 'ai';
const result = await generateText({ model, tools, prompt,
// example approach: use a model with structured outputs for repair. // (you can use other strategies as well) experimental_repairToolCall: async ({ toolCall, tools, parameterSchema, error, }) => { if (NoSuchToolError.isInstance(error)) { return null; // do not attempt to fix invalid tool names }
const tool = tools[toolCall.toolName as keyof typeof tools];
const { object: repairedArgs } = await generateObject({ model: openai('gpt-4o', { structuredOutputs: true }), schema: tool.parameters, prompt: [ `The model tried to call the tool "${toolCall.toolName}"` + ` with the following arguments:`, JSON.stringify(toolCall.args), `The tool accepts the following schema:`, JSON.stringify(parameterSchema(toolCall)), 'Please fix the arguments.', ].join('\\n'), });
return { ...toolCall, args: JSON.stringify(repairedArgs) }; },});
Granular error handling
To help ship more resilient tool calls, the AI SDK now provides granular error types that make debugging and error handling more precise. Each error type exposes detailed information about what went wrong and includes contextual data to help diagnose and fix issues:
NoSuchToolError
: Handles cases where the model attempts to call an undefined tool.InvalidToolArgumentsError
: Catches schema validation failures when tool arguments don't match the expected parameters.ToolExecutionError
: Identifies runtime issues during tool execution.ToolCallRepairError
: Tracks failures during automatic tool call repair attempts.
These specific error types allow you to implement targeted error handling strategies and provide better feedback to users when tool execution fails. To learn more, check out the error handling documentation.
Structured output improvements
We've expanded structured output capabilities to enable more dynamic and resilient AI applications:
Structured outputs with tools
One of our most requested features is now available: the ability to combine structured outputs with tool usage. With the new experimental_output
option in both generateText
and streamText
, you can build sophisticated large-language model (LLM) calls that can both interact with external systems and return predictably structured data.
Here's an example showing how structured outputs and tools work together:
import { openai } from '@ai-sdk/openai';import { generateText, tool, Output } from 'ai';import { z } from 'zod';
const result = await generateText({ model: openai('gpt-4o', { structuredOutputs: true }), prompt: "What's the weather like in London and New York?", maxSteps: 5, tools: { getWeather: tool({ parameters: z.object({ city: z.string(), units: z.enum(['celsius', 'fahrenheit']), }), execute: async ({ city, units }) => { // Fetch weather data }, }), }, experimental_output: Output.object({ schema: z.object({ cities: z.array( z.object({ name: z.string(), temperature: z.number(), conditions: z.string(), }), ), }), }),});
Instead of making separate calls to determine which cities to check and then calling weather tools for each one, the model can handle the entire workflow in a single function. This results in more efficient and maintainable code, especially for complex scenarios with unpredictable inputs or multiple potential tool paths. To learn more, check out the structured outputs with generateText
and streamText
documentation.
Currently, structured outputs with tools is only available with OpenAI models.
Enhanced error handling
Error handling for structured outputs has been significantly improved in 4.1. Previously, when structure parsing or validation failed, you only received an error – with no access to the underlying response. This meant your only option was to retry the request. With the new NoObjectGeneratedError
, you now have access to:
Raw model output for debugging or salvaging partial responses
Complete request context (response ID, timestamp, model)
Token usage and cost analytics
Here's how to implement the enhanced error handling:
try { const result = await generateObject({ model, schema, prompt, });} catch (error) { if (error instanceof NoObjectGeneratedError) { console.log('Generated text:', error.text); console.log('Response metadata:', error.response); console.log('Token usage:', error.usage); console.log('Error cause:', error.cause); }}
This granular error information makes it easier to diagnose and fix issues with structured output generation, whether they occur during parsing, validation, or model generation phases.
Check out the structured output error handling documentation to learn more about implementing these patterns.
New and updated providers
The AI SDK provider ecosystem continues to grow with new and improved providers:
Google Vertex AI 2.0: A complete refresh of the Vertex AI integration that introduces enhanced performance, improved error handling, and search-grounding support.
OpenAI: Fully revamped support for latest reasoning models.
OpenAI Compatible: A new dedicated provider for OpenAI-compatible APIs.
Replicate: Adds first-party provider for Replicate (image models).
Fireworks: Adds first-party provider for Fireworks (language and image models).
Cohere: Adds first-party provider for Cohere (language and embedding models).
Together AI: Adds first-party provider for Together AI (language models).
DeepInfra: Adds first-party provider for DeepInfra (language models).
DeepSeek: Adds first-party provider for DeepSeek (language models).
Cerebras: Adds first-party provider for Cerebras (language models).
Getting started
With powerful new features like image generation, non-blocking data streaming, and improved tool calling, there's never been a better time to start building AI applications with the AI SDK.
Start a new AI project: Ready to build something new? Check out our latest guides
Explore our templates: Visit our Template Gallery to see the AI SDK in action
Join the community: Share what you're building in our GitHub Discussions
Contributors
AI SDK 4.1 is the result of the combined work of our core team at Vercel (Lars, Jeremy, Walter, and Nico) and many community contributors. Special thanks for contributing merged pull requests:
patelvivekdev, zeke, daviddkkim, klren0312, viktorlarsson, richhuth, dragos-cojocaru, olyaiy, minpeter, nathanwijaya, timconnorz, palmm, Ojansen, ggallon, williamlmao, nasjp, ManuLpz4, aaronccasanova, marcklingen, aaishikasb, michael-hhai, jeremypress, yoshinorisano.
Your feedback and contributions are invaluable as we continue to evolve the AI SDK.