
Tuesday, June 18th 2024

Introducing Vercel AI SDK 3.2

Posted by


Lars Grammel

Software Engineer


Jared Palmer

VP of Product, AI


Aryaman Khandelwal

Software Engineer

We’ve been listening to your feedback and working hard to expand the capabilities of the AI SDK while improving its existing functionality. Today, we’re launching AI SDK 3.2.

This release advances the SDK across four major areas:

  • Agents: Extended generateText and streamText for multi-step workflows
  • Providers: Added new providers and expanded capabilities for Anthropic and Google models
  • Embeddings: Introduced embeddings support to power use cases like retrieval augmented generation (RAG) and semantic search
  • DX improvements: Improved AI SDK observability and enabled client-side tool calls

Agents

As AI models become more capable, they’ve been charged with increasingly complex tasks — some of which require multiple steps. With the AI SDK 3.2, we’ve taken the first step towards making these agentic workflows much easier to build.

Let’s say you were building an application to analyze product feedback with an LLM.

await generateText({
  model: openai('gpt-4o'),
  system:
    "You are a product feedback analyzer. You summarize a piece of feedback, then determine whether it is positive, negative, or neutral.",
  prompt: userFeedback,
});

To ensure the model generates useful information, you’ll probably want to first clean the data, then analyze it, and finally send it to a platform where your coworkers can also take a look. In other words, you want your LLM to act as a basic agent.

With just a few short additions, we can use the AI SDK to implement this agent.

await generateText({
  model: openai('gpt-4-turbo'),
  system:
    "You are a product feedback analyzer. You summarize feedback, then determine whether it is positive, negative, or neutral. If the feedback is not neutral, you send a message to our Slack channel with that feedback. Always clean the feedback before summarizing or categorizing.",
  prompt: userFeedback,
  tools: {
    cleanUserFeedback: tool({
      description: "Removes spam, PII, and profanity from raw user feedback",
      parameters: z.object({ userFeedback: z.string() }),
      execute: async ({ userFeedback }) => cleanUserFeedback(userFeedback),
    }),
    sendMessageToSlack: tool({
      description: "Sends feedback to Slack",
      parameters: z.object({
        sentiment: z.enum(["positive", "negative", "neutral"]),
        feedbackSummary: z.string(),
      }),
      execute: async ({ sentiment, feedbackSummary }) => {
        sendMessageToSlack(sentiment, feedbackSummary);
        process.exit(0);
      },
    }),
  },
  maxToolRoundtrips: 10,
});

The agent will first call the cleanUserFeedback tool with userFeedback as input. It will then summarize the cleaned feedback and determine its sentiment. Finally, it will call the sendMessageToSlack tool to send the sentiment and feedback summary to Slack before exiting.

This release is only the beginning of what’s necessary to support complex, autonomous agents, and we’ll continue to build on this work in future releases.

Providers

We’ve been adding support for new model providers, aided by the work of our wonderful community, and today’s 3.2 release expands that list further.

As providers upgrade their models’ capabilities, we’ve been updating the SDK’s functionality too: image input is now supported for additional providers.

The up-to-date list of supported providers and capabilities can be found in the AI SDK documentation.

We’ve also built an adapter to allow you to use LangChain’s AI tools and abstractions with the UI and streaming capabilities of the AI SDK. Check out our docs for more information.

Embeddings

AI applications often need to represent model inputs and outputs as vectors in a high-dimensional space, especially for use cases like RAG. Embeddings provide exactly that: semantically meaningful vector representations of content.

The most common way to compare two embedding vectors is cosine similarity, a score that is higher the more semantically similar two pieces of embedded content are.

This release extends the unified API of AI SDK Core to generating embeddings with the OpenAI, Mistral, Azure OpenAI, and Ollama providers.

// 'embedding' is a single embedding object (number[])
const { embedding } = await embed({
  model: openai.embedding('text-embedding-3-small'),
  value: 'sunny day at the beach',
});

When loading data, it can often be useful to embed many values at once.

// 'embeddings' is an array of embedding objects (number[][]).
// It is sorted in the same order as the input values.
const { embeddings } = await embedMany({
  model: openai.embedding('text-embedding-3-small'),
  values: [
    'sunny day at the beach',
    'rainy afternoon in the city',
  ],
});

console.log(`Similarity from -1 to 1: ${cosineSimilarity(embeddings[0], embeddings[1])}`);

The AI SDK now also provides the cosineSimilarity helper function to help you determine semantic similarity between embedded values.
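
For intuition, cosine similarity is the dot product of the two vectors divided by the product of their magnitudes. Here is a rough sketch of that formula in TypeScript; this is not the SDK's implementation, only the underlying math:

// Cosine similarity: dot(a, b) / (|a| * |b|), ranging from -1 to 1.
// A sketch of the math; prefer the SDK's cosineSimilarity helper in practice.
function cosineSimilaritySketch(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}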

Pairing a vision model with semantic similarity using the AI SDK unlocks new applications like semantic image search.

To get an in-depth look at the code for this example, check out our semantic image search template!
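
If you want a feel for the approach before diving into the template, here is a rough sketch under some assumptions: the model names are illustrative, and indexImages / searchImages are hypothetical helpers rather than the template's actual code.

import { cosineSimilarity, embed, embedMany, generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

// Caption each image with a vision model, then embed the captions.
async function indexImages(imageUrls: string[]) {
  const captions = await Promise.all(
    imageUrls.map(async (url) => {
      const { text } = await generateText({
        model: openai('gpt-4o'),
        messages: [
          {
            role: 'user',
            content: [
              { type: 'text', text: 'Describe this image in one sentence.' },
              { type: 'image', image: new URL(url) },
            ],
          },
        ],
      });
      return text;
    }),
  );

  const { embeddings } = await embedMany({
    model: openai.embedding('text-embedding-3-small'),
    values: captions,
  });

  return imageUrls.map((url, i) => ({ url, embedding: embeddings[i] }));
}

// Embed the query and rank the indexed images by cosine similarity.
async function searchImages(
  index: { url: string; embedding: number[] }[],
  query: string,
) {
  const { embedding: queryEmbedding } = await embed({
    model: openai.embedding('text-embedding-3-small'),
    value: query,
  });

  return index
    .map((entry) => ({
      url: entry.url,
      score: cosineSimilarity(entry.embedding, queryEmbedding),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, 3);
}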

DX Improvements

The 3.2 release lays the groundwork for AI observability in the AI SDK. We’ve made it easier to understand token usage and errors when using streamText and streamObject, and we’ve introduced the onFinish callback, which is invoked once the stream has finished and receives the final token usage.

const result = await streamObject({
  model: openai('gpt-4-turbo'),
  schema: z.object({
    name: z.object({
      firstName: z.string(),
      lastName: z.string(),
    }),
  }),
  prompt: "Generate a random name",
  onFinish({ object, error, usage, ...rest }) {
    console.log("Token usage:", usage);
    if (object === undefined) {
      console.error("Error:", error);
    } else {
      console.log("Success!", JSON.stringify(object, null, 2));
    }
  },
});
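
The same callback is available on streamText. A minimal sketch, assuming the callback receives the final text alongside the token usage (mirroring the streamObject example above):

const result = await streamText({
  model: openai('gpt-4-turbo'),
  prompt: "Write a haiku about the ocean.",
  // Assumption: onFinish exposes the generated text and the token usage.
  onFinish({ text, usage }) {
    console.log("Token usage:", usage);
    console.log("Final text:", text);
  },
});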

You can now also access the final, typed object as a promise from the streamObject result, allowing you to log and use the finished result with guaranteed type-safety.

const result = await streamObject({
  model: openai('gpt-4-turbo'),
  schema: z.object({
    name: z.object({
      firstName: z.string(),
      lastName: z.string(),
    }),
  }),
  prompt: "Generate a random name",
});

result.object.then(({ name }) => {
  // Use the fully typed, final object with no ts-ignore needed
  console.log("Name:", name.firstName, name.lastName);
});

To reduce the AI SDK’s bundle size, we’ve also split AI SDK UI by framework. Our 3.2 release will be backwards compatible, but we recommend migrating to @ai-sdk/react, @ai-sdk/vue, @ai-sdk/svelte, or @ai-sdk/solid.
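
In a React project, for example, the migration is typically just a package install and an import-path change. A minimal sketch, assuming your hooks currently come from ai/react:

// Before: hooks bundled with the core package
// import { useChat } from 'ai/react';

// After: framework-specific package (npm install @ai-sdk/react)
import { useChat } from '@ai-sdk/react';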

With this release, you can now build generative UI chatbots client-side with just useChat and streamText in your React projects. We’ve enabled client- and server-side tool execution with streamText and the new toolInvocations and onToolCall utilities, which let you conditionally render UI based on which tools the LLM calls.

Here’s a simple example of a chatbot that tells the user where they’re chatting from.

app/api/chat/route.ts
import { convertToCoreMessages, streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await streamText({
    model: openai('gpt-4-turbo'),
    messages: convertToCoreMessages(messages),
    tools: {
      // client-side tool that starts user interaction:
      askForConfirmation: {
        description: "Ask the user for confirmation",
        parameters: z.object({
          message: z.string().describe("The message to ask for confirmation"),
        }),
      },
      // client-side tool that gets the user's location:
      getLocation: {
        description:
          "Get the user location. Always ask for confirmation before using this tool.",
        parameters: z.object({}),
      },
    },
  });

  // Stream the result back to the client.
  return result.toAIStreamResponse();
}

In your streamText call, omitting a tool’s execute function tells the SDK that the tool should be executed on the client side.

app/page.tsx
'use client';

import { useChat } from '@ai-sdk/react';
import type { Message, ToolInvocation } from 'ai';

export default function Chat() {
  const {
    messages,
    input,
    handleInputChange,
    handleSubmit,
    addToolResult,
  } = useChat({
    maxToolRoundtrips: 5,
    // run client-side tools that are automatically executed:
    async onToolCall({ toolCall }) {
      if (toolCall.toolName === 'getLocation') {
        // getUserLocation is an app-specific helper (e.g. using the Geolocation API)
        return getUserLocation();
      }
    },
  });

  return (
    <div>
      {messages?.map((m: Message) => (
        <div key={m.id}>
          <strong>{m.role}:</strong>
          {m.content}
          {m.toolInvocations?.map((toolInvocation: ToolInvocation) => {
            const toolCallId = toolInvocation.toolCallId;
            const addResult = (result: string) =>
              addToolResult({ toolCallId, result });

            // render confirmation tool (client-side tool with user interaction)
            if (toolInvocation.toolName === 'askForConfirmation') {
              return (
                <div key={toolCallId}>
                  {'result' in toolInvocation ? (
                    <b>
                      {toolInvocation.args.message}: {toolInvocation.result}
                    </b>
                  ) : (
                    <>
                      {toolInvocation.args.message}:{' '}
                      <button onClick={() => addResult('Yes')}>Yes</button>
                      <button onClick={() => addResult('No')}>No</button>
                    </>
                  )}
                </div>
              );
            }
          })}
        </div>
      ))}

      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
      </form>
    </div>
  );
}

You can use onToolCall within useChat to define functions that run for client-side tools. toolInvocations gives you access to the tool calls the LLM has made on the client, which lets you conditionally render UI components to handle them, implementing probabilistic routing much like the streamUI function does in AI SDK RSC. addToolResult lets you pass user-provided information back to the LLM for use in future responses.

Conclusion

With the AI SDK 3.2, we’re taking the first steps towards supporting two new use cases: embeddings and agents. We’ll continue to update the SDK so you have access to the cutting edge of model providers and a seamless development experience when building with AI.

We can’t wait to see what you’ll build. Get started by deploying your own image search app or experimenting with SDK model providers!
