
    Featured articles

  • Feb 9

    How we built AEO tracking for coding agents

AI has changed the way that people find information. For businesses, this means it's critical to understand how LLMs search for and summarize their web content. We're building an AI Engine Optimization (AEO) system to track how models discover, interpret, and reference Vercel and our sites. This started as a prototype focused only on standard chat models, but we quickly realized that wasn't enough. To get a complete picture of visibility, we needed to track coding agents.

For standard models, tracking is relatively straightforward. We use AI Gateway to send prompts to dozens of popular models (e.g. GPT, Gemini, and Claude) and analyze their responses, search behavior, and cited sources. Coding agents, however, behave very differently. Many Vercel users interact with AI through their terminal or IDE while actively working on projects. In early sampling, we found that coding agents perform web searches in roughly 20% of prompts. Because these searches happen inline with real development workflows, it's especially important to evaluate both response quality and source accuracy.

Measuring AEO for coding agents requires a different approach than model-only testing. Coding agents aren't designed to answer a single API call. They're built to operate inside a project and expect a full development environment, including a filesystem, shell access, and package managers. That creates a new set of challenges:

- Execution isolation: How do you safely run an autonomous agent that can execute arbitrary code?
- Observability: How do you capture what the agent did when each agent has its own transcript format, tool-calling conventions, and output structure?

The coding agent AEO lifecycle

Coding agents are typically accessed at some level through CLIs rather than APIs. Even if you're only sending prompts and capturing responses, the CLI still needs to be installed and executed in a full runtime environment. Vercel Sandbox solves this by providing ephemeral Linux MicroVMs that spin up in seconds. Each agent run gets its own sandbox and follows the same six-step lifecycle, regardless of the CLI it uses:

1. Create the sandbox. Spin up a fresh MicroVM with the right runtime (Node 24, Python 3.13, etc.) and a timeout. The timeout is a hard ceiling, so if the agent hangs or loops, the sandbox kills it.
2. Install the agent CLI. Each agent ships as an npm package (e.g., @anthropic-ai/claude-code or @openai/codex). The sandbox installs it globally so it's available as a shell command.
3. Inject credentials. Instead of giving each agent a direct provider API key, we set environment variables that route all LLM calls through Vercel AI Gateway. This gives us unified logging, rate limiting, and cost tracking across every agent, even though each agent uses a different underlying provider (though the system allows direct provider keys as well).
4. Run the agent with the prompt. This is the only step that differs per agent. Each CLI has its own invocation pattern, flags, and config format. But from the sandbox's perspective, it's just a shell command.
5. Capture the transcript. After the agent finishes, we extract a record of what it did, including which tools it called, whether it searched the web, and what it recommended in the response. This is agent-specific (covered below).
6. Tear down. Stop the sandbox. If anything went wrong, the catch block ensures the sandbox is stopped anyway so we don't leak resources.

In the code, the lifecycle looks like this.
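A minimal sketch of that lifecycle, assuming the @vercel/sandbox SDK's Sandbox.create, runCommand, and stop methods; the option names, the AI_GATEWAY_BASE_URL env var, and the agent definition are illustrative rather than the production code:

```ts
import { Sandbox } from '@vercel/sandbox';

type Command = { cmd: string; args: string[] };

// Illustrative agent definition; the real config shape is described in the next section.
const agent: { runtime: string; setupCommands: Command[]; buildCommand: (prompt: string) => Command } = {
  runtime: 'node22', // assumed runtime identifier
  setupCommands: [{ cmd: 'npm', args: ['install', '-g', '@anthropic-ai/claude-code'] }],
  buildCommand: (prompt) => ({ cmd: 'claude', args: ['-p', prompt] }),
};

export async function runAgent(prompt: string) {
  // 1. Create the sandbox with a hard timeout so a hung agent gets killed.
  const sandbox = await Sandbox.create({
    runtime: agent.runtime,
    timeout: 10 * 60 * 1000, // ms; assumed option name
  });

  try {
    // 2. Install the agent CLI globally inside the MicroVM.
    for (const { cmd, args } of agent.setupCommands) {
      await sandbox.runCommand({ cmd, args });
    }

    // 3 + 4. Inject credentials and run the agent with the prompt.
    const run = agent.buildCommand(prompt);
    const result = await sandbox.runCommand({
      cmd: run.cmd,
      args: run.args,
      env: {
        // Route all LLM calls through AI Gateway (see the sections below).
        ANTHROPIC_BASE_URL: process.env.AI_GATEWAY_BASE_URL ?? '', // placeholder env var
        ANTHROPIC_API_KEY: '',
      },
    });

    // 5. Capture the transcript (agent-specific; covered below).
    return await result.stdout();
  } finally {
    // 6. Tear down, even if something above threw.
    await sandbox.stop();
  }
}
```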
Agents as config

Because the lifecycle is uniform, each agent can be defined as a simple config object. Adding a new agent to the system means adding a new entry, and the sandbox orchestration handles everything else.

- runtime determines the base image for the MicroVM. Most agents run on Node, but the system supports Python runtimes too.
- setupCommands is an array because some agents need more than a global install. For example, Codex also needs a TOML config file written to ~/.codex/config.toml.
- buildCommand is a function that takes the prompt and returns the shell command to run. Each agent's CLI has its own flags and invocation style.

Using the AI Gateway for routing

We wanted to use the AI Gateway to centralize management of cost and logs. This required overriding the providers' base URLs via environment variables inside the sandbox. The agents themselves don't know this is happening and operate as if they are talking directly to their provider. Here's what this looks like for Claude Code:

- ANTHROPIC_BASE_URL points to AI Gateway instead of api.anthropic.com. The agent's HTTP calls go to Gateway, which proxies them to Anthropic.
- ANTHROPIC_API_KEY is set to an empty string on purpose: Gateway authenticates via its own token, so the agent doesn't need (or have) a direct provider key.

This same pattern works for Codex (override OPENAI_BASE_URL) and any other agent that respects a base URL environment variable. Provider API credentials can also be used directly.

The transcript format problem

Once an agent finishes running in its sandbox, we have a raw transcript, which is a record of everything it did. The problem is that each agent produces it in a different format. Claude Code writes JSONL files to disk. Codex streams JSON to stdout. OpenCode also uses stdout, but with a different schema. They use different names for the same tools, different nesting structures for messages, and different conventions. We needed all of this to feed into a single brand pipeline, so we built a four-stage normalization layer:

1. Transcript capture: Each agent stores its transcript differently, so this step is agent-specific.
2. Parsing: Each agent has its own parser that normalizes tool names and flattens agent-specific message structures into a single unified event type.
3. Enrichment: Shared post-processing that extracts structured metadata (URLs, commands) from tool arguments, normalizing differences in how each agent names its args.
4. Summary and brand extraction: Aggregate the unified events into stats, then feed them into the same brand extraction pipeline used for standard model responses.

Stage 1: Transcript capture

This happens while the sandbox is still running (step 5 in the lifecycle from the previous section). Claude Code writes its transcript as a JSONL file on the sandbox filesystem, so we have to find and read it out after the agent finishes. Codex and OpenCode both output their transcripts to stdout, so capture is simpler: we just filter the output for JSON lines. The output of this stage is the same for all agents: a string of raw JSONL. But the structure of each JSON line is still completely different per agent, and that's what the next stage handles.

Stage 2: Parsing tool names and message shapes

We built a dedicated parser for each agent that does two things at once: it normalizes tool names and flattens agent-specific message structures into a single unified event type, sketched below.
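A sketch of that unified shape, with a handful of canonical tool names and one example lookup table; the field and map names here are illustrative, not the production schema:

```ts
// Canonical tool names shared by all parsers (the real system uses ~10 of these).
type CanonicalTool =
  | 'read_file'
  | 'write_file'
  | 'edit_file'
  | 'run_command'
  | 'web_search'
  | 'other';

// The flat event type every agent-specific parser emits.
interface TranscriptEvent {
  agent: 'claude-code' | 'codex' | 'opencode';
  type: 'assistant_message' | 'tool_call' | 'tool_result' | 'error';
  tool?: CanonicalTool;
  rawToolName?: string;           // what the agent actually called it
  args?: Record<string, unknown>; // raw tool arguments, enriched in stage 3
  text?: string;
  timestamp?: string;
}

// Example lookup table for one agent; each parser has its own.
const CLAUDE_CODE_TOOL_MAP: Record<string, CanonicalTool> = {
  Read: 'read_file',
  Write: 'write_file',
  StrReplace: 'edit_file',
  Bash: 'run_command',
  WebFetch: 'web_search',
};
```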
Tool name normalization

The same operation has different names across agents:

| Operation      | Claude Code | Codex      | OpenCode |
| -------------- | ----------- | ---------- | -------- |
| Read a file    | Read        | read_file  | read     |
| Write a file   | Write       | write_file | write    |
| Edit a file    | StrReplace  | patch_file | patch    |
| Run a command  | Bash        | shell      | bash     |
| Search the web | WebFetch    | (varies)   | (varies) |

Each parser maintains a lookup table, like the one sketched above, that maps agent-specific names to roughly ten canonical names.

Message shape flattening

Beyond naming, the structure of events varies across agents:

- Claude Code nests messages inside a message property and mixes tool_use blocks into content arrays.
- Codex has Responses API lifecycle events (thread.started, turn.completed, output_text.delta) alongside tool events.
- OpenCode bundles the tool call and its result in the same event via part.tool and part.state.

The parser for each agent handles these structural differences and collapses everything into a single TranscriptEvent type. The output of this stage is a flat TranscriptEvent[] array, which is the same shape regardless of which agent produced it.

Stage 3: Enrichment

After parsing, a shared post-processing step runs across all events. This extracts structured metadata from tool arguments so that downstream code doesn't need to know that Claude Code puts file paths in args.path while Codex uses args.file.

Stage 4: Summary and brand extraction

The enriched TranscriptEvent[] array gets summarized into aggregate stats (total tool calls by type, web fetches, errors) and then fed into the same brand extraction pipeline used for standard model responses. From this point forward, the system doesn't know or care whether the data came from a coding agent or a model API call.

Orchestration with Vercel Workflow

This entire pipeline runs as a Vercel Workflow. When a prompt is tagged with the "agents" type, the workflow fans out across all configured agents in parallel, and each gets its own sandbox.

What we've learned

- Coding agents contribute a meaningful amount of traffic from web search. Early tests on a random sample of prompts showed that coding agents execute a search around 20% of the time. As we collect more data we will build a more comprehensive view of agent search behavior, but these results made it clear that optimizing content for coding agents was important.
- Agent recommendations have a different shape than model responses. When a coding agent suggests a tool, it tends to produce working code with that tool, like an import statement, a config file, or a deployment script. The recommendation is embedded in the output, not just mentioned in prose.
- Transcript formats are a mess. And they are getting messier as agent CLI tools ship rapid updates. Building a normalization layer early saved us from constant breakage.
- The same brand extraction pipeline works for both models and agents. The hard part is everything upstream: getting the agent to run, capturing what it did, and normalizing it into a structure you can grade.

What's next

- Open sourcing the tool. We're planning to release an OSS version of our system so other teams can track their own AEO evals, both for standard models and coding agents.
- Deep dive on methodology. We are working on a follow-up post covering the full AEO eval methodology: prompt design, dual-mode testing (web search vs. training data), query-as-first-class-entity architecture, and Share of Voice metrics.
- Scaling agent coverage. Adding more agents as the ecosystem grows and expanding the types of prompts we test (not just "recommend a tool" but full project scaffolding, debugging, etc.).

    Eric and Allen
  • Feb 9

    Anyone can build agents, but it takes a platform to run them

Prototyping is democratized, but production deployment isn't. AI models have commoditized code and agent generation, making it possible for anyone to build sophisticated software in minutes. Claude can scaffold a fully functional agent before your morning coffee gets cold. But that same AI will happily architect a $5,000/month DevOps setup when the system could run efficiently at $500/month.

In a world where anyone can build internal tools and agents, the build vs. buy equation has fundamentally changed. Competitive advantage no longer comes from whether you can build. It comes from rapid iteration on AI that solves real problems for your business and, more importantly, reliably operating those systems at scale. To do that, companies need an internal AI stack as robust as their external product infrastructure. That's exactly what Vercel's agent orchestration platform provides.

Build vs. buy ROI has fundamentally changed

For decades, the economics of custom internal tools only made sense at large-scale companies. The upfront engineering investment was high, but the real cost was long-term operation with high SLAs and measurable ROI. For everyone else, buying off-the-shelf software was the practical option. AI has fundamentally changed this equation. Companies of any size can now create agents quickly, and customization delivers immediate ROI for specialized workflows:

- OpenAI deployed an internal data agent to democratize analytics
- Vercel's lead qualification agent helps one SDR do the work of 10 (template here)
- Stripe built a customer-facing financial impact calculator (on a flight!)

Today the question isn't build vs. buy. The answer is build and run. Instead of separating internal systems and vendors, companies need a single platform that can handle the unique demands of agent workloads.

Every company needs an internal AI stack

The number of use cases for internal apps and agents is exploding, but here's the problem: production is still hard. Vibe coding has created one of the largest shadow IT problems in history, and understanding production operations requires expertise in security, observability, reliability, and cost optimization. These skills remain rare even as building becomes easier. The ultimate challenge for agents isn't building them, it's the platform they run on.

The platform is the product: how our data agent runs on Vercel

Like OpenAI, we built our own internal data agent named d0 (OSS template here). At its core, d0 is a text-to-SQL engine, which is not a new concept. What made it a successful product was the platform underneath. Using Vercel's built-in primitives and deployment infrastructure, one person built d0 in a few weeks using 20% of their time. This was only possible because Sandboxes, Fluid compute, and AI Gateway automatically handled the operational complexity that would have normally taken months of engineering effort to scaffold and secure.

Today, d0 has completely democratized data access that was previously limited to professional analysts. Engineers, marketers, and executives can all ask questions in natural language and get immediate, accurate answers from our data warehouse. Here's how it works:

1. A user asks a question in Slack: "What was our Enterprise ARR last quarter?"
2. d0 receives the message, determines the right level of data access based on the permissions of the user, and starts the agent workflow.
3. The agent explores a semantic layer: the semantic layer is a file system of five layers of YAML-based configs that describe our data warehouse, our metrics, our products, and our operations.
4. AI SDK handles the model calls: streaming responses, tool use, and structured outputs all work out of the box. We didn't build custom LLM plumbing; we used the same abstractions any Vercel developer can use.
5. Agent steps are orchestrated durably: if a step fails (Snowflake timeout, model hiccup), Vercel Workflows handles retries and state recovery automatically.
6. Automated actions are executed in isolation: file exploration, SQL generation, and query execution all happen in a secure Vercel Sandbox. Runaway operations can't escape, and the agent can execute arbitrary Python for advanced analysis.
7. Multiple models are used to balance cost and accuracy: AI Gateway routes simple requests to fast models and complex analysis to Claude Opus, all in one codebase.
8. The answer arrives in Slack: formatted results, often with a chart or Google Sheet link, are delivered back to Slack using the AI SDK Chatbot primitive.

Vercel is the platform for agents

Vercel provides the infrastructure primitives purpose-built for agent workloads, both internal and customer-facing. You build the agent, Vercel runs it. And it just works. Using our own agent orchestration platform has enabled us to build and manage an increasing number of custom agents. Internally, we run:

- A lead qualification agent
- d0, our analytics agent
- A customer support agent (handles 87% of initial questions)
- An abuse detection agent that flags risky content
- A content agent that turns Slack threads into draft blog posts

On the product side: v0 is a code generation agent, and Vercel Agent can review pull requests, analyze incidents, and recommend actions. Both products run on the same primitives as our internal tools.

Sandboxes give agents a secure, isolated environment for executing sensitive autonomous actions. This is critical for protecting your core systems. When agents generate and run untested code or face prompt injection attacks, sandboxes contain the damage within isolated Linux VMs. When agents need filesystem access for information discovery, sandboxes can dynamically mount VMs with secure access to the right resources.

Fluid compute automatically handles the unpredictable, long-running compute patterns that agents create. It's easy to ignore compute when agents are processing text, but when usage scales and you add data-heavy workloads for files, images, and video, cost becomes an issue quickly. Fluid compute automatically scales up and down based on demand, and you're only charged for compute time, keeping costs low and predictable.

AI Gateway gives you unified access to hundreds of models with built-in budget control, usage monitoring, and load balancing across providers. This is important for avoiding vendor lock-in and getting instant access to the latest models. When your agent needs to handle different types of queries, AI Gateway can route simple requests to fast, inexpensive models while sending complex analysis to more capable ones. If your primary provider hits rate limits or goes down, traffic automatically fails over to backup providers. A sketch of what that routing looks like in application code is below.
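A rough sketch of that routing, assuming AI SDK's generateText with AI Gateway model strings; the model IDs and the classification logic are illustrative:

```ts
import { generateText } from 'ai';

// Illustrative model choices; AI Gateway resolves these provider/model strings.
const FAST_MODEL = 'google/gemini-2.5-flash';
const DEEP_MODEL = 'anthropic/claude-opus-4.1';

// Naive complexity check, just for illustration.
function isComplex(question: string) {
  return question.length > 200 || /join|cohort|forecast|regression/i.test(question);
}

export async function answerQuestion(question: string) {
  const { text } = await generateText({
    // Simple lookups go to a fast, inexpensive model; heavy analysis goes to a
    // more capable one. Both route through AI Gateway, so logs, budgets, and
    // failover are handled in one place.
    model: isComplex(question) ? DEEP_MODEL : FAST_MODEL,
    system: 'You answer analytics questions against the company data warehouse.',
    prompt: question,
  });
  return text;
}
```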
Workflows give agents the ability to perform complex, multi-step operations reliably. When agents are used for critical business processes, failures are costly. Durable orchestration provides retry logic and error handling at every step so that interruptions don't require manual intervention or restarting the entire operation.

Observability reveals what agents are actually doing beyond basic system metrics. This data is essential for debugging unexpected behavior and optimizing agent performance. When your agent makes unexpected decisions, consumes more tokens than expected, or underperforms, observability shows you the exact prompts, model responses, and decision paths, letting you trace issues back to specific model calls or data sources.

Build your agents, Vercel will run them

In the future, every enterprise will build their version of d0. And their internal code review agent. And their customer support routing agent. And hundreds of other specialized tools. The success of these agents depends on the platform that runs them. Companies that invest in their internal AI stack now will not only move faster, they'll experience far higher ROI as their advantages compound over time.

    Eric and Jeanne
  • Nov 21

    Self-driving infrastructure

    AI has transformed how we write code. The next transformation is how we run it. At Vercel, we’re building self-driving infrastructure that autonomously manages production operations, improves application code using real-world insights, and learns from the unpredictable nature of production itself. Our vision is a world where developers express intent, not infrastructure. Where ops teams set principles, not individual configurations and alerts. Where the cloud doesn’t just host your app, it understands, optimizes, and evolves it.

    Malte, Tom, and Dan

    Latest news.

  • General
    Feb 20

    Skills Night: 69,000+ ways agents are getting smarter

The room was full of people who had already used skills. Tuesday night we hosted Skills Night in San Francisco, an event for developers building on and around skills.sh, the open skills ecosystem we've been growing since the idea started as a single weekend of writing. What began as Shu Ding sitting down to document everything he knows about React has grown into over 69,000 skills, 2 million skill CLI installs, and a community moving incredibly fast. Here is what we learned.

Where this came from

The origin story is worth retelling because it shapes how we think about the project. Shu Ding is one of the most talented web engineers I've ever worked with. He knows things about React and the browser that most people will never discover. Last year, he sat down on a weekend and wrote it all down. A kind of web bible. We wanted to figure out how to ship it. We considered a blog post or documentation that the next generation of models might eventually learn, but we wouldn't see the results until Claude Sonnet 8 or GPT-9. On the other hand, an MCP server felt too heavy for what was essentially a collection of markdown documents. Skills made sense as the quickest way to deliver on-demand knowledge.

While writing the instructions for installing React best practices, I ended up copying and pasting the same installation instructions for getting the skills into Cursor, Claude Code, Codex, and the other 10+ coding agents, but with slightly different installation directories. So I built a CLI to install a skill into every major coding agent at once. That became npx skills. We added telemetry to surface new skills as they got installed, which became the data that powers the leaderboard at skills.sh. The whole thing went from idea to production on Vercel in days. Malte Ubl, Vercel CTO, framed it perfectly: it's a package manager for agent context. Now we are tracking 69,000 of them, and making them not just easy to discover but easy to install with a single command.

The security problem we needed to solve

Growth creates attack surface, and fast growth creates it even faster. As soon as skills took off, quality variance followed. Ryan from Socket showed us a concrete example: a skill that looked completely clean at the markdown level but included a Python file that opened a remote shell on install. You would never catch that without looking at every file in the directory. That is why we announced security partnerships with Gen, Socket, and Snyk to run audits across all skills and every new one that comes in. Socket is doing cross-ecosystem static analysis combined with LLM-based noise reduction, reporting 95% precision, 98% recall, and 97% F1 across their benchmarks. Gen is building a real-time agent trust layer called Sage that monitors every connection in and out of your agents, allowing them to run freely without risk of data exfiltration or prompt injection. Snyk is bringing their package security background to the skills context. We are building an Audits leaderboard to provide per-skill assessments and recommendations. The goal is not to lock things down. The goal is to let you go fast with confidence. We're always looking for new security partners who can bring unique perspectives to auditing skills and provide more trust signals.
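At its core, a skill is still just a small, auditable bundle of markdown, which is what makes this kind of auditing tractable. A minimal sketch of what one looks like, assuming the SKILL.md layout with YAML frontmatter that the ecosystem builds on; the skill name, fields, and content here are illustrative:

```markdown
---
name: react-best-practices
description: Opinionated guidance for writing modern, performant React components.
---

# React best practices

- Prefer function components and hooks; avoid legacy class lifecycles.
- Co-locate state with the component that owns it; lift it only when shared.
- Memoize expensive derived values, not everything.
```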
What the demos showed us

Eight partners showed demos on Tuesday, and a few themes kept coming up.

Skills close the training cutoff gap. Ben Davis ran a controlled experiment to demonstrate this. He tried to get coding agents to implement Svelte remote functions, a relatively new API, four different ways: no context, a skills file with documentation, a skill pointing to the MCP, and a code example in the project. Every approach with context worked. The no-context run, which he had to force through a stripped-down model to prevent it from inferring solutions, produced completely wrong output. Models are smart enough to use patterns correctly when you give them the patterns. Without context, they fall back to stale training data.

The medium matters less than the content. The interesting takeaway from Ben's experiment was not that skills are the only way. It is that getting the right context in is what matters, and skills are the fastest starting point if you do not already have a baseline. Existing code examples, inline documentation, and MCP hints all work. Skills are just the easiest way to distribute that context to anyone.

Agents can now drive the whole stack. Evan Bacon from Expo showed native iOS feature upgrades driven entirely by Claude Code using Expo skills. New SwiftUI components, gesture-driven transitions, and tab bar updates were all applied automatically. They are also working on an LLDB integration skill that lets agents read the native iOS view hierarchy and fix notoriously hard keyboard handling bugs automatically. Their production app, Expo Go, now auto-fixes every crash as it occurs. For anyone who has spent time wrestling with Xcode, that is a significant statement.

Skills are becoming infrastructure. Nick Khami showed that Mintlify auto-generates a skill for every documentation site they host, including Claude Code's own docs, Coinbase, Perplexity, and Lovable. Traffic to these sites is now 50% coding agents, up from 10% a year ago. The skill is not something the docs team writes anymore; it is a byproduct of having well-structured documentation. Sentry's David Cramer built Warden, a harness that runs skills as linters on pull requests via GitHub Actions, treating agents as a static analysis layer.

What we're building toward

Guillermo Rauch, Vercel CEO, said something Tuesday night that I keep thinking about: agents make mistakes. They sometimes tell you you are absolutely right and proceed to do the wrong thing. Shipping quality in the AI era means not just celebrating how many tokens you are burning. It means raising the bar on what those tokens actually produce. Skills are one answer to that problem. They are how we influence what agents create, keep them up to date with framework changes, and make them more token-efficient by giving them a straight path to the right answer instead of letting them stumble around.

Two million installs is real signal. The security partnerships make it something teams can rely on. And the demos showed that the most interesting skills work is not at the CLI level. It is in the agents and tools that are now treating skills as a first-class primitive for distributing knowledge at scale. We will keep building. Come find us at skills.sh.

    Andrew Qu
  • General
    Feb 19

    Video Generation with AI Gateway

AI Gateway now supports video generation, so you can create cinematic videos with photorealistic quality and synchronized audio, and generate personalized content with consistent identity, all through AI SDK 6.

Two ways to get started

Video generation is in beta and currently available for Pro and Enterprise plans and paid AI Gateway users.

- AI SDK 6: Generate videos programmatically with the same interface you use for text and images. One API, one authentication flow, one observability dashboard across your entire AI pipeline.
- AI Gateway Playground: Experiment with video models with no code in the configurable AI Gateway playground that's embedded in each model page. Compare providers, tweak prompts, and download results without writing code. To access it, click any video generation model in the model list.

Four initial video models, 17 variations

- Grok Imagine from xAI is fast and great at instruction following. Create and edit videos with style transfer, all in seconds.
- Wan from Alibaba specializes in reference-based generation and multi-shot storytelling, with the ability to preserve identity across scenes.
- Kling excels at image-to-video and native audio. The new 3.0 models support multi-shot video with automatic scene transitions.
- Veo from Google delivers high visual fidelity and physics realism, with native audio generation and cinematic lighting.

Understanding video requests

Video models require more than just describing what you want. Unlike image generation, video prompts can include motion cues (camera movement, object actions, timing) and optionally audio direction. Each provider exposes different capabilities through providerOptions that unlock fundamentally different generation modes. See the documentation for model-specific options.

Generation types

AI Gateway initially supports four types of video generation:

| Type                 | Inputs                          | Description                                                                       | Example use cases                                 |
| -------------------- | ------------------------------- | --------------------------------------------------------------------------------- | -------------------------------------------------- |
| Text-to-video        | Text prompt                     | Describe a scene, get a video                                                     | Ad creative, explainer videos, social content      |
| Image-to-video       | Image, optional text prompt     | Animate a still image with motion                                                 | Product showcases, logo reveals, photo animation   |
| First and last frame | 2 images, optional text prompt  | Define start and end states, model fills in between                               | Before/after reveals, time-lapse, transitions      |
| Reference-to-video   | Images or videos                | Extract a character from reference images or videos and place them in new scenes  | Spokesperson content, consistent brand characters  |

Current capabilities across the model creators on AI Gateway:

| Model creator | Capabilities                                                |
| ------------- | ----------------------------------------------------------- |
| xAI           | Text-to-video, image-to-video, video editing, audio         |
| Wan           | Text-to-video, image-to-video, reference-to-video, audio    |
| Kling         | Text-to-video, image-to-video, first and last frame, audio  |
| Veo           | Text-to-video, image-to-video, audio                        |

Text-to-video

Describe what you want, get a video. The model handles visuals, motion, and optionally audio. Great for hyperrealistic, production-quality footage with just a simple text prompt.

Example: Programmatic video at scale. Generate videos on demand for your app, platform, or content pipeline. No licensing fees or production required, just prompts and outputs. This example uses klingai/kling-v2.6-t2v to generate video from a text prompt with a specified aspect ratio and duration, as sketched below.
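A rough sketch of that call, assuming AI SDK 6's experimental video generation entry point; the export name, option names, result shape, and prompt are illustrative, so check the Video Generation documentation for the exact API:

```ts
import { writeFile } from 'node:fs/promises';
// Assumed entry point; verify the export name against the AI SDK 6 docs.
import { experimental_generateVideo as generateVideo } from 'ai';

const { video } = await generateVideo({
  model: 'klingai/kling-v2.6-t2v', // resolved through AI Gateway
  prompt:
    'A slow dolly shot across a rain-soaked neon street at night, ' +
    'reflections shimmering in the puddles, cinematic lighting.',
  // Option names are illustrative; providers also accept model-specific providerOptions.
  aspectRatio: '16:9',
  duration: 5,
});

// The result shape is assumed to mirror image generation (raw bytes available).
await writeFile('clip.mp4', video.uint8Array);
```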
Example: Creative content generation. Turn a simple prompt into polished video clips for social media, ads, or storytelling with natural motion and cinematic quality. By setting a very specific and descriptive prompt, google/veo-3.1-generate-001 generates video with immense detail and the exact desired motion.

Image-to-video

Provide a starting image and animate it. Control the initial composition, then let the model generate motion.

Example: Animate product images. Turn existing product photos into interactive videos. The klingai/kling-v2.6-i2v model animates a product image after you pass an image URL and a motion description in the prompt.

Example: Animated illustrations. Bring static artwork to life with subtle motion. Perfect for thematic content or marketing at scale.

Example: Lifestyle and product photography. Add subtle motion to food, beverage, or lifestyle shots for social content. Here, a still photo of coffee is turned into a more engaging video, with the lighting direction and fine details controlled by the prompt.

First and last frame

Define the start and end states, and the model generates a seamless transition between them.

Example: Before/after reveals. Outfit swaps, product comparisons, changes over time. Upload two images, get a seamless transition. The start and end states are defined by two images that are passed in the prompt and provider options. In this example, klingai/kling-v3.0-i2v lets you define the start frame in image and the end frame in lastFrameImage. The model generates the transition between them.

Reference-to-video

Provide reference videos or images of a person or character, and the model extracts their appearance and voice to generate new scenes starring them with consistent identity. In this example, two reference images of dogs are used to generate the final video. Using alibaba/wan-v2.6-r2v-flash, you can instruct the model to use the people or characters within the prompt. Wan suggests using character1, character2, etc. in the prompt for multi-reference-to-video to get the best results.

Video editing

Transform existing videos with style transfer. Provide a video URL and describe the transformation you want. The model applies the new style while preserving the original motion. Here, xai/grok-imagine-video uses a source video from a previous generation and edits it into a watercolor style.

Get started

For more examples and detailed configuration options for video models, check out the Video Generation Documentation. You can also find simple getting-started scripts in the Video Generation Quick Start. Check out the changelogs for these video models for more detailed examples and prompts: Grok Imagine, Alibaba Wan, Veo, and Kling.

    Jerilyn Zheng
  • Engineering
    Feb 18

    We Ralph Wiggumed WebStreams to make them 10x faster

When we started profiling Next.js server rendering earlier this year, one thing kept showing up in the flamegraphs: WebStreams. Not the application code running inside them, but the streams themselves. The Promise chains, the per-chunk object allocations, the microtask queue hops. After Theo Browne's server rendering benchmarks highlighted how much compute time goes into framework overhead, we started looking at where that time actually goes. A lot of it was in streams.

It turns out that WebStreams have an incredibly complete test suite, and that makes them a great candidate for an AI-based re-implementation done in a purely test-driven and benchmark-driven fashion. This post is about the performance work we did, what we learned, and how this work is already making its way into Node.js itself through Matteo Collina's upstream PR.

The problem

Node.js has two streaming APIs. The older one (stream.Readable, stream.Writable, stream.Transform) has been around for over a decade and is heavily optimized. Data moves through C++ internals. Backpressure is a boolean. Piping is a single function call.

The newer one is the WHATWG Streams API: ReadableStream, WritableStream, TransformStream. This is the web standard. It powers fetch() response bodies, CompressionStream, TextDecoderStream, and increasingly, server-side rendering in frameworks like Next.js and React. The web standard is the right API to converge on. But on the server, it is slower than it needs to be.

To understand why, consider what happens when you call reader.read() on a native WebStream in Node.js, even if data is already sitting in the buffer:

1. A ReadableStreamDefaultReadRequest object is allocated with three callback slots
2. The request is enqueued into the stream's internal queue
3. A new Promise is allocated and returned
4. Resolution goes through the microtask queue

That is four allocations and a microtask hop to return data that was already there. Now multiply that by every chunk flowing through every transform in a rendering pipeline.

Or consider pipeTo(). Each chunk passes through a full Promise chain: read, write, check backpressure, repeat. A {value, done} result object is allocated per read. Error propagation creates additional Promise branches.

None of this is wrong. These guarantees matter in the browser, where streams cross security boundaries, where cancellation semantics need to be airtight, and where you do not control both ends of a pipe. But on the server, when you are piping React Server Components through three transforms at 1KB chunks, the cost adds up.

We benchmarked native WebStream pipeThrough at 630 MB/s for 1KB chunks. Node.js pipeline() with the same passthrough transform: ~7,900 MB/s. That is a 12x gap, and the difference is almost entirely Promise and object allocation overhead. The kind of comparison we ran is sketched below.
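A simplified version of that comparison, using only standard Node.js APIs; the chunk size and iteration count are illustrative, and absolute numbers will vary by machine:

```ts
import { PassThrough, Readable, Writable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

const CHUNK = new Uint8Array(1024); // 1KB chunks
const CHUNKS = 100_000;             // ~100MB per run

function mbps(ms: number) {
  return (CHUNK.length * CHUNKS) / (1024 * 1024) / (ms / 1000);
}

async function benchWebStreams() {
  const source = new ReadableStream<Uint8Array>({
    start(controller) {
      for (let i = 0; i < CHUNKS; i++) controller.enqueue(CHUNK);
      controller.close();
    },
  });
  const identity = new TransformStream<Uint8Array, Uint8Array>(); // passthrough
  const sink = new WritableStream<Uint8Array>({ write() {} });

  const start = performance.now();
  await source.pipeThrough(identity).pipeTo(sink);
  return mbps(performance.now() - start);
}

async function benchNodeStreams() {
  const source = Readable.from(
    (function* () {
      for (let i = 0; i < CHUNKS; i++) yield CHUNK;
    })()
  );
  const sink = new Writable({ write(_chunk, _enc, cb) { cb(); } });

  const start = performance.now();
  await pipeline(source, new PassThrough(), sink);
  return mbps(performance.now() - start);
}

console.log('WebStreams  :', (await benchWebStreams()).toFixed(0), 'MB/s');
console.log('node:stream :', (await benchNodeStreams()).toFixed(0), 'MB/s');
```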
What we built

We have been working on a library called fast-webstreams that implements the WHATWG ReadableStream, WritableStream, and TransformStream APIs backed by Node.js streams internally. Same API, same error propagation, same spec compliance. The overhead is removed for the common cases. The core idea is to route operations through different fast paths depending on what you are actually doing.

When you pipe between fast streams: zero Promises

This is the biggest win. When you chain pipeThrough and pipeTo between fast streams, the library does not start piping immediately. Instead, it records upstream links: source → transform1 → transform2 → ... When pipeTo() is called at the end of the chain, it walks upstream, collects the underlying Node.js stream objects, and issues a single pipeline() call. One function call. Zero Promises per chunk. Data flows through Node's optimized C++ path.

The result: ~6,200 MB/s. That is ~10x faster than native WebStreams and close to raw Node.js pipeline performance. If any stream in the chain is not a fast stream (say, a native CompressionStream), the library falls back to either native pipeThrough or a spec-compliant pipeTo implementation.

When you read chunk by chunk: synchronous resolution

When you call reader.read(), the library tries nodeReadable.read() synchronously. If data is there, you get Promise.resolve({value, done}). No event loop round-trip. No request object allocation. Only when the buffer is empty does it register a listener and return a pending Promise. The result: ~12,400 MB/s, or 3.7x faster than native.

The React Flight pattern: where the gap is largest

This is the one that matters most for Next.js. React Server Components use a specific byte stream pattern: create a ReadableStream with type: 'bytes', capture the controller in start(), and enqueue chunks externally as the render produces them. Native WebStreams: ~110 MB/s. fast-webstreams: ~1,600 MB/s. That is 14.6x faster for the exact pattern used in production server rendering.

The speed comes from LiteReadable, a minimal array-based buffer we wrote to replace Node.js's Readable for byte streams. It uses direct callback dispatch instead of EventEmitter, supports pull-based demand and BYOB readers, and costs about 5 microseconds less per construction. That matters when React Flight creates hundreds of byte streams per request.

Fetch response bodies: streams you don't construct yourself

The examples above all start with new ReadableStream(...). But on the server, most streams do not start that way. They start from fetch(). The response body is a native byte stream owned by Node.js's HTTP layer. You cannot swap it out. This is a common pattern in server-side rendering: fetch data from an upstream service, pipe the response through one or more transforms, and forward the result to the client. With native WebStreams, each hop in this chain pays the full Promise-per-chunk cost. Three transforms means roughly 6-9 Promises per chunk. At 1KB chunks, that gets you ~260 MB/s.

The library handles this through deferred resolution. When patchGlobalWebStreams() is active, Response.prototype.body returns a lightweight fast shell wrapping the native byte stream. Calling pipeThrough() does not start piping immediately. It just records the link. Only when pipeTo() or getReader() is called at the end does the library resolve the full chain: it creates a single bridge from the native reader into Node.js pipeline() for the transform hops, then serves reads from the buffered output synchronously. The cost model: one Promise at the native boundary to pull data in, zero Promises through the transform chain, and sync reads at the output.

The result: ~830 MB/s, or 3.2x faster than native for the three-transform fetch pattern. For simple response forwarding without transforms, it is 2.0x faster (850 vs 430 MB/s).
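In application code this is an ordinary fetch-and-transform chain. The sketch below assumes patchGlobalWebStreams is a named export of the experimental-fast-webstreams package, and the URL and transform are placeholders:

```ts
import { patchGlobalWebStreams } from 'experimental-fast-webstreams';

// Swap the global ReadableStream/WritableStream/TransformStream constructors
// and wrap Response.prototype.body so existing code hits the fast paths.
patchGlobalWebStreams();

// A trivial transform that observes the stream without changing it.
let bytes = 0;
const countBytes = new TransformStream<Uint8Array, Uint8Array>({
  transform(chunk, controller) {
    bytes += chunk.byteLength;
    controller.enqueue(chunk);
  },
});

// fetch → pipeThrough → pipeTo: the chain is recorded lazily and resolved
// into a single Node.js pipeline() when the terminal pipeTo() runs.
const res = await fetch('https://example.com/data'); // placeholder URL
await res.body!.pipeThrough(countBytes).pipeTo(
  new WritableStream<Uint8Array>({
    write(chunk) {
      process.stdout.write(chunk);
    },
  })
);

console.log(`forwarded ${bytes} bytes`);
```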
Benchmarks

All numbers are throughput in MB/s at 1KB chunks on Node.js v22. Higher is better.

Core operations

| Operation    | Node.js streams | fast   | native | fast vs native |
| ------------ | --------------- | ------ | ------ | -------------- |
| read loop    | 26,400          | 12,400 | 3,300  | 3.7x           |
| write loop   | 26,500          | 5,500  | 2,300  | 2.4x           |
| pipeThrough  | 7,900           | 6,200  | 630    | 9.8x           |
| pipeTo       | 14,000          | 2,500  | 1,400  | 1.8x           |
| for-await-of | —               | 4,100  | 3,000  | 1.4x           |

Transform chains

The Promise-per-chunk overhead compounds with chain depth:

| Depth        | fast  | native | fast vs native |
| ------------ | ----- | ------ | -------------- |
| 3 transforms | 2,900 | 300    | 9.7x           |
| 8 transforms | 1,000 | 115    | 8.7x           |

Byte streams

| Pattern                        | fast  | native | fast vs native |
| ------------------------------ | ----- | ------ | -------------- |
| start + enqueue (React Flight) | 1,600 | 110    | 14.6x          |
| byte read loop                 | 1,400 | 1,400  | 1.0x           |
| byte tee                       | 1,200 | 750    | 1.6x           |

Response body patterns

| Pattern              | fast | native | fast vs native |
| -------------------- | ---- | ------ | -------------- |
| Response.text()      | 900  | 910    | 1.0x           |
| Response forwarding  | 850  | 430    | 2.0x           |
| fetch → 3 transforms | 830  | 260    | 3.2x           |

Stream construction

Creating streams is also faster, which matters for short-lived streams:

| Type            | fast  | native | fast vs native |
| --------------- | ----- | ------ | -------------- |
| ReadableStream  | 2,100 | 980    | 2.1x           |
| WritableStream  | 1,300 | 440    | 3.0x           |
| TransformStream | 470   | 220    | 2.1x           |

Spec compliance

fast-webstreams passes 1,100 out of 1,116 Web Platform Tests. Node.js's native implementation passes 1,099. The 16 failures that remain are either shared with native (like the unimplemented type: 'owning' transfer mode) or are architectural differences that do not affect real applications.

How we are deploying this

The library can patch the global ReadableStream, WritableStream, and TransformStream constructors, as shown in the earlier sketch. The patch also intercepts Response.prototype.body to wrap native fetch response bodies in fast stream shells, so fetch() → pipeThrough() → pipeTo() chains hit the pipeline fast path automatically.

At Vercel, we are looking at rolling this out across our fleet. We will do so carefully and incrementally. Streaming primitives sit at the foundation of request handling, response rendering, and compression. We are starting with the patterns where the gap is largest: React Server Component streaming, response body forwarding, and multi-transform chains. We will measure in production before expanding further.

The right fix is upstream

A userland library should not be the long-term answer here. The right fix is in Node.js itself. Work is already happening. After a conversation on X, Matteo Collina submitted nodejs/node#61807, "stream: add fast paths for webstreams read and pipeTo." The PR applies two ideas from this project directly to Node.js's native WebStreams:

- read() fast path: When data is already buffered, return a resolved Promise directly without creating a ReadableStreamDefaultReadRequest object. This is spec-compliant because read() returns a Promise either way, and resolved promises still run callbacks in the microtask queue (sketched below).
- pipeTo() batch reads: When data is buffered, batch multiple reads from the controller queue without creating per-chunk request objects. Backpressure is respected by checking desiredSize after each write.
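The first idea is easiest to see in a stripped-down form. This is an illustration of the concept, not the PR's actual code:

```ts
type ReadResult<T> =
  | { value: T; done: false }
  | { value: undefined; done: true };

// A toy reader over an internal buffer, showing the buffered-read fast path.
class BufferedReader<T> {
  private queue: T[] = [];
  private waiters: Array<(r: ReadResult<T>) => void> = [];
  private closed = false;

  enqueue(chunk: T) {
    const waiter = this.waiters.shift();
    if (waiter) waiter({ value: chunk, done: false });
    else this.queue.push(chunk);
  }

  close() {
    this.closed = true;
    for (const waiter of this.waiters.splice(0)) {
      waiter({ value: undefined, done: true });
    }
  }

  read(): Promise<ReadResult<T>> {
    // Fast path: data is already buffered, so resolve immediately instead of
    // allocating a read request and bouncing through the internal queue.
    if (this.queue.length > 0) {
      return Promise.resolve({ value: this.queue.shift()!, done: false });
    }
    if (this.closed) {
      return Promise.resolve({ value: undefined, done: true });
    }
    // Slow path: nothing buffered, so register a waiter and return a pending Promise.
    return new Promise((resolve) => this.waiters.push(resolve));
  }
}
```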
The PR shows ~17-20% faster buffered reads and ~11% faster pipeTo. These improvements benefit every Node.js user for free: no library to install, no patching, no risk. James Snell's Node.js performance issue #134 outlines several additional opportunities: C++-level piping for internally-sourced streams, lazy buffering, and eliminating double-buffering in WritableStream adapters. Each of these could close the gap further. We will keep contributing ideas upstream. The goal is not for fast-webstreams to exist forever. The goal is for WebStreams to be fast enough that it does not need to.

What we learned the hard way

The spec is smarter than it looks. We tried many shortcuts. Almost every one of them broke a Web Platform Test, and the test was usually right. The ReadableStreamDefaultReadRequest pattern, the Promise-per-read design, the careful error propagation: they exist because cancellation during reads, error identity through locked streams, and thenable interception are real edge cases that real code hits.

Promise.resolve(obj) always checks for thenables. This is a language-level behavior you cannot avoid. If the object you resolve with has a .then property, the Promise machinery will call it. Some WPT tests deliberately put .then on read results and verify that the stream handles it correctly. We had to be very careful about where {value, done} objects get created in hot paths.

Node.js pipeline() cannot replace WHATWG pipeTo. We hoped to use pipeline() for all piping. It causes 72 WPT failures. The error propagation, stream locking, and cancellation semantics are fundamentally different. pipeline() is only safe when we control the entire chain, which is why we collect upstream links and only use it for full fast-stream chains.

Reflect.apply, not .call(). The WPT suite monkey-patches Function.prototype.call and verifies that implementations do not use it to invoke user-provided callbacks. Reflect.apply is the only safe way. This is a real spec requirement.

We built most of fast-webstreams with AI

Two things made that viable: the amazing Web Platform Tests gave us 1,116 tests as an immediate, machine-checkable answer to "did we break anything?", and we built a benchmark suite early on so we could measure whether each change actually moved throughput. The development loop was: implement an optimization, run the WPT suite, run benchmarks. When tests broke, we knew which spec invariant we had violated. When benchmarks did not move, we reverted.

The WHATWG Streams spec is long and dense. The interesting optimization opportunities sit in the gap between what the spec requires and what current implementations do. read() must return a Promise, but nothing says that Promise cannot already be resolved when data is buffered. That kind of observation is straightforward when you can ask an AI to analyze algorithm steps for places where the observable behavior can be preserved with fewer allocations.

Try it

fast-webstreams is available on npm as experimental-fast-webstreams. The "experimental" prefix is intentional. We are confident in correctness, but this is an area of active development. If you are building a server-side JavaScript framework or runtime and hitting WebStreams performance limits, we would love to hear from you. And if you are interested in improving WebStreams in Node.js itself, Matteo's PR is a great place to start.

    Malte Ubl
  • Customers
    Feb 17

    How Stably ships AI testing agents in hours, not weeks

How the six-person team at Stably ships AI testing agents faster with Vercel, moving from weeks to hours. Their shift highlights how Vercel's platform eliminates infrastructure anxiety, accelerating autonomous testing and enabling rapid enterprise growth. Jinjing Liang, co-founder and CEO of Stably, was building something technically ambitious: AI agents that run autonomous end-to-end tests by deploying on preview URLs, reading code diffs, and validating whether changes actually work. Testing is the bottleneck for autonomous coding: AI can write code fast, but without validation, teams get stuck checking everything manually. But Stably had their own bottleneck. Every new feature meant infrastructure decisions. Every new agent meant deployment anx...

    Alli Pope
  • General
    Feb 9

    How we built AEO tracking for coding agents

    AI has changed the way that people find information. For businesses, this means it's critical to understand how LLMs search for and summarize their web content. We're building an AI Engine Optimization (AEO) system to track how models discover, interpret, and reference Vercel and our sites. This started as a prototype focused only on standard chat models, but we quickly realized that wasn’t enough. To get a complete picture of visibility, we needed to track coding agents. For standard models, tracking is relatively straightforward. We use AI Gateway to send prompts to dozens of popular models (e.g. GPT, Gemini, and Claude) and analyze their responses, search behavior, and cited sources. Coding agents, however, behave very differently. Many Vercel users interact with AI through their terminal or IDE while actively working on projects. In early sampling, we found that coding agents perform web searches in roughly 20% of prompts. Because these searches happen inline with real development workflows, it’s especially important to evaluate both response quality and source accuracy. Measuring AEO for coding agents requires a different approach than model-only testing. Coding agents aren’t designed to answer a single API call. They’re built to operate inside a project and expect a full development environment, including a filesystem, shell access, and package managers. That creates a new set of challenges: Execution isolation: How do you safely run an autonomous agent that can execute arbitrary code? Observability: How do you capture what the agent did when each agent has its own transcript format, tool-calling conventions, and output structure? The coding agent AEO lifecycle Coding agents are typically accessed at some level through CLIs rather than APIs. Even if you’re only sending prompts and capturing responses, the CLI still needs to be installed and executed in a full runtime environment. Vercel Sandbox solves this by providing ephemeral Linux MicroVMs that spin up in seconds. Each agent run gets its own sandbox and follows the same six-step lifecycle, regardless of the CLI it uses. Create the sandbox. Spin up a fresh MicroVM with the right runtime (Node 24, Python 3.13, etc.) and a timeout. The timeout is a hard ceiling, so if the agent hangs or loops, the sandbox kills it. Install the agent CLI. Each agent ships as an npm package (i.e., @anthropic-ai/claude-code, @openai/codex, etc.). The sandbox installs it globally so it's available as a shell command. Inject credentials. Instead of giving each agent a direct provider API key, we set environment variables that route all LLM calls through Vercel AI Gateway. This gives us unified logging, rate limiting, and cost tracking across every agent, even though each agent uses a different underlying provider (though the system allows direct provider keys as well). Run the agent with the prompt. This is the only step that differs per agent. Each CLI has its own invocation pattern, flags, and config format. But from the sandbox's perspective, it's just a shell command. Capture the transcript. After the agent finishes, we extract a record of what it did, including which tools it called, whether it searched the web, and what it recommended in the response. This is agent-specific (covered below). Tear down. Stop the sandbox. If anything went wrong, the catch block ensures the sandbox is stopped anyway so we don't leak resources. In the code, the lifecycle looks like this. 
Agents as config Because the lifecycle is uniform, each agent can be defined as a simple config object. Adding a new agent to the system means adding a new entry, and the sandbox orchestration handles everything else. runtime determines the base image for the MicroVM. Most agents run on Node, but the system supports Python runtimes too. setupCommands is an array because some agents need more than a global install. For example, Codex also needs a TOML config file written to ~/.codex/config.toml. buildCommand is a function that takes the prompt and returns the shell command to run. Each agent's CLI has its own flags and invocation style. Using the AI Gateway for routing We wanted to use the AI Gateway to centralize management of cost and logs. This required overriding the provider’s base URLs via environment variables inside the sandbox. The agents themselves don’t know this is happening and operate as if they are talking directly to their provider. Here’s what this looks like for Claude Code: ANTHROPIC_BASE_URL points to AI Gateway instead of api.anthropic.com. The agent's HTTP calls go to Gateway, which proxies them to Anthropic. ANTHROPIC_API_KEY is set to empty string on purpose — Gateway authenticates via its own token, so the agent doesn't need (or have) a direct provider key. This same pattern works for Codex (override OPENAI_BASE_URL) and any other agent that respects a base URL environment variable. Provider API credentials can also be used directly. The transcript format problem Once an agent finishes running in its sandbox, we have a raw transcript, which is a record of everything it did. The problem is that each agent produces them in a different format. Claude Code writes JSONL files to disk. Codex streams JSON to stdout. OpenCode also uses stdout, but with a different schema. They use different names for the same tools, different nesting structures for messages, and different conventions. We needed all of this to feed into a single brand pipeline, so we built a four-stage normalization layer: Transcript capture: Each agent stores its transcript differently, so this step is agent-specific. Parsing: Each agent has its own parser that normalizes tool names and flattens agent-specific message structures into a single unified event type. Enrichment: Shared post-processing that extracts structured metadata (URLs, commands) from tool arguments, normalizing differences in how each agent names its args. Summary and brand extraction: Aggregate the unified events into stats, then feed into the same brand extraction pipeline used for standard model responses. Stage 1: Transcript capture This happens while the sandbox is still running (step 5 in the lifecycle from the previous section). Claude Code writes its transcript as a JSONL file on the sandbox filesystem. We have to find and read it out after the agent finishes: Codex and OpenCode both output their transcripts to stdout, so capture is simpler — filter the output for JSON lines: The output of this stage is the same for all agents: a string of raw JSONL. But the structure of each JSON line is still completely different per agent, and that's what the next stage handles. Stage 2: Parsing tool names and message shapes We built a dedicated parser for each agent that does two things at once: normalizes tool names and flattens agent-specific message structures into a single formatted event type. 
Tool name normalization The same operation has different names across agents: Operation Claude Code Codex OpenCode Read a file Read read_file read Write a file Write write_file write Edit a file StrReplace patch_file patch Run a command Bash shell bash Search the web WebFetch (varies) (varies) Each parser maintains a lookup table that maps agent-specific names to ~10 canonical names: Message shape flattening Beyond naming, the structure of events varies across agents: Claude Code nests messages inside a message property and mixes tool_use blocks into content arrays. Codex has Responses API lifecycle events (thread.started, turn.completed, output_text.delta) alongside tool events. OpenCode bundles tool call + result in the same event via part.tool and part.state. The parser for each agent handles these structural differences and collapses everything into a single TranscriptEvent type: The output of this stage is a flat array of TranscriptEvent[] , which is the same shape regardless of which agent produced it. Stage 3: Enrichment After parsing, a shared post-processing step runs across all events. This extracts structured metadata from tool arguments so that downstream code doesn't need to know that Claude Code puts file paths in args.path while Codex uses args.file: Stage 4: Summary and brand extraction The enriched TranscriptEvent[] array gets summarized into aggregate stats (total tool calls by type, web fetches, errors) and then fed into the same brand extraction pipeline used for standard model responses. From this point forward, the system doesn't know or care whether the data came from a coding agent or a model API call. Orchestration with Vercel Workflow This entire pipeline runs as a Vercel Workflow. When a prompt is tagged as "agents" type, the workflow fans out across all configured agents in parallel and each gets its own sandbox: What we’ve learned Coding agents contribute a meaningful amount of traffic from web search. Early tests on a random sample of prompts showed that coding agents execute search around 20% of the time. As we collect more data we will build a more comprehensive view of agent search behavior, but these results made it clear that optimizing content for coding agents was important. Agent recommendations have a different shape than model responses. When a coding agent suggests a tool, it tends to produce working code with that tool, like an import statement, a config file, or a deployment script. The recommendation is embedded in the output, not just mentioned in prose. Transcript formats are a mess. And they are getting messier as agent CLI tools ship rapid updates. Building a normalization layer early saved us from constant breakage. The same brand extraction pipeline works for both models and agents. The hard part is everything upstream: getting the agent to run, capturing what it did, and normalizing it into a structure you can grade. What’s next Open sourcing the tool. We're planning to release an OSS version of our system so other teams can track their own AEO evals, both for standard models and coding agents. Deep dive on methodology. We are working on a follow-up post covering the full AEO eval methodology: prompt design, dual-mode testing (web search vs. training data), query-as-first-class-entity architecture, and Share of Voice metrics. Scaling agent coverage. Adding more agents as the ecosystem grows and expanding the types of prompts we test (not just "recommend a tool" but full project scaffolding, debugging, etc.).

    Eric and Allen
  • General
    Feb 9

    Anyone can build agents, but it takes a platform to run them

    Prototyping is democratized, but production deployment isn't. AI models have commoditized code and agent generation, making it possible for anyone to build sophisticated software in minutes. Claude can scaffold a fully functional agent before your morning coffee gets cold. But that same AI will happily architect a $5,000/month DevOps setup when the system could run efficiently at $500/month. In a world where anyone can build internal tools and agents, the build vs. buy equation has fundamentally changed. Competitive advantage no longer comes from whether you can build. It comes from rapid iteration on AI that solves real problems for your business and, more importantly, reliably operating those systems at scale. To do that, companies need an internal AI stack as robust as their external product infrastructure. That's exactly what Vercel's agent orchestration platform provides. Build vs. buy ROI has fundamentally changed For decades, the economics of custom internal tools only made sense at large-scale companies. The upfront engineering investment was high, but the real cost was long-term operation with high SLAs and measurable ROI. For everyone else, buying off-the-shelf software was the practical option. AI has fundamentally changed this equation. Companies of any size can now create agents quickly, and customization delivers immediate ROI for specialized workflows: OpenAI deployed an internal data agent to democratize analytics Vercel’s lead qualification agent helps one SDR do the work of 10 (template here) Stripe built a customer-facing financial impact calculator (on a flight!) Today the question isn’t build vs. buy. The answer is build and run. Instead of separating internal systems and vendors, companies need a single platform that can handle the unique demands of agent workloads. Every company needs an internal AI stack The number of use cases for internal apps and agents is exploding, but here's the problem: production is still hard. Vibe coding has created one of the largest shadow IT problems in history, and understanding production operations requires expertise in security, observability, reliability, and cost optimization. These skills remain rare even as building becomes easier. The ultimate challenge for agents isn't building them, it's the platform they run on. The platform is the product: how our data agent runs on Vercel Like OpenAI, we built our own internal data agent named d0 (OSS template here). At its core, d0 is a text-to-SQL engine, which is not a new concept. What made it a successful product was the platform underneath. Using Vercel’s built-in primitives and deployment infrastructure, one person built d0 in a few weeks using 20% of their time. This was only possible because Sandboxes, Fluid compute and AI Gateway automatically handled the operational complexity that would have normally taken months of engineering effort to scaffold and secure. Today, d0 has completely democratized data access that was previously limited to professional analysts. Engineers, marketers, and executives can all ask questions in natural language and get immediate, accurate answers from our data warehouse. Here’s how it works: A user asks a question in Slack: "What was our Enterprise ARR last quarter?" d0 receives the message, determines the right level of data access based on the permissions of the user, and starts the agent workflow. 
The agent explores a semantic layer: the semantic layer is a file system of five layers of YAML-based configs that describe our data warehouse, our metrics, our products, and our operations.
AI SDK handles the model calls: streaming responses, tool use, and structured outputs all work out of the box. We didn't build custom LLM plumbing; we used the same abstractions any Vercel developer can use.
Agent steps are orchestrated durably: if a step fails (Snowflake timeout, model hiccup), Vercel Workflows handles retries and state recovery automatically.
Automated actions are executed in isolation: file exploration, SQL generation, and query execution all happen in a secure Vercel Sandbox. Runaway operations can't escape, and the agent can execute arbitrary Python for advanced analysis.
Multiple models are used to balance cost and accuracy: AI Gateway routes simple requests to fast models and complex analysis to Claude Opus, all in one codebase (a hedged sketch of this kind of routing appears at the end of this post).
The answer arrives in Slack: formatted results, often with a chart or Google Sheet link, are delivered back to Slack using the AI SDK Chatbot primitive.

Vercel is the platform for agents

Vercel provides the infrastructure primitives purpose-built for agent workloads, both internal and customer-facing. You build the agent, Vercel runs it. And it just works.

Using our own agent orchestration platform has enabled us to build and manage an increasing number of custom agents. Internally, we run:

A lead qualification agent
d0, our analytics agent
A customer support agent (handles 87% of initial questions)
An abuse detection agent that flags risky content
A content agent that turns Slack threads into draft blog posts

On the product side, v0 is a code generation agent, and Vercel Agent can review pull requests, analyze incidents, and recommend actions. Both products run on the same primitives as our internal tools.

Sandboxes give agents a secure, isolated environment for executing sensitive autonomous actions. This is critical for protecting your core systems. When agents generate and run untested code or face prompt injection attacks, sandboxes contain the damage within isolated Linux VMs. When agents need filesystem access for information discovery, sandboxes can dynamically mount VMs with secure access to the right resources.

Fluid compute automatically handles the unpredictable, long-running compute patterns that agents create. It's easy to ignore compute when agents are processing text, but when usage scales and you add data-heavy workloads for files, images, and video, cost becomes an issue quickly. Fluid compute automatically scales up and down based on demand, and you're only charged for compute time, keeping costs low and predictable.

AI Gateway gives you unified access to hundreds of models with built-in budget control, usage monitoring, and load balancing across providers. This is important for avoiding vendor lock-in and getting instant access to the latest models. When your agent needs to handle different types of queries, AI Gateway can route simple requests to fast, inexpensive models while sending complex analysis to more capable ones. If your primary provider hits rate limits or goes down, traffic automatically fails over to backup providers.

Workflows give agents the ability to perform complex, multi-step operations reliably. When agents are used for critical business processes, failures are costly.
Durable orchestration provides retry logic and error handling at every step, so interruptions don't require manual intervention or restarting the entire operation.

Observability reveals what agents are actually doing, beyond basic system metrics. This data is essential for debugging unexpected behavior and optimizing agent performance. When your agent makes unexpected decisions, consumes more tokens than expected, or underperforms, observability shows you the exact prompts, model responses, and decision paths, letting you trace issues back to specific model calls or data sources.

Build your agents, Vercel will run them

In the future, every enterprise will build its own version of d0. And its internal code review agent. And its customer support routing agent. And hundreds of other specialized tools. The success of these agents depends on the platform that runs them. Companies that invest in their internal AI stack now will not only move faster, they'll experience far higher ROI as their advantages compound over time.
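To illustrate the multi-model routing described in the d0 walkthrough above, here is a minimal TypeScript sketch using the AI SDK's generateText. The model identifiers, the complexity heuristic, and the assumption that provider/model strings resolve through AI Gateway are illustrative, not d0's actual implementation.

```ts
import { generateText } from 'ai';

// Hypothetical routing: a fast, inexpensive model for simple questions,
// a more capable model for complex analysis. The heuristic and model ids
// below are illustrative only.
async function answerQuestion(question: string): Promise<string> {
  const isComplex =
    question.length > 200 || /cohort|forecast|regression/i.test(question);

  const { text } = await generateText({
    // With AI Gateway configured, provider/model strings like these are
    // resolved through the gateway (an assumption based on the AI SDK docs).
    model: isComplex ? 'anthropic/claude-opus-4' : 'openai/gpt-4o-mini',
    prompt: question,
  });

  return text;
}
```

The point of the sketch is the shape of the design: one code path, with the routing decision expressed as data rather than separate integrations per provider.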

    Eric and Jeanne
  • General
    Feb 6

    Introducing Geist Pixel

Today, we're expanding the Geist font family with Geist Pixel. Geist Pixel is a bitmap-inspired typeface built on the same foundations as Geist Sans and Geist Mono, reinterpreted through a strict pixel grid. It's precise, intentional, and unapologetically digital.

Same system, new texture

Geist Pixel isn't a novelty font. It's a system extension. Just like Geist Mono was created for developers, Geist Pixel was designed with real usage in mind, not as a visual gimmick but as a functional tool within a broader typographic system. It includes five distinct variants, each exported separately:

Geist Pixel Square
Geist Pixel Grid
Geist Pixel Circle
Geist Pixel Triangle
Geist Pixel Line

Every glyph is constructed on a consistent pixel grid, carefully tuned to preserve rhythm, spacing, and legibility. The result feels both nostalgic and contemporary, rooted in early screen typography but designed for modern products that ship to real users.

This matters because pixel fonts often break in production. They don't scale properly across viewports, their metrics conflict with existing typography, or they're purely decorative. Geist Pixel was built to solve these problems, maintaining the visual texture teams want while preserving the typographic rigor products require. It shares the same core principles as the rest of the Geist family:

Clear structure
Predictable metrics
Strong alignment across layouts
Designed to scale across platforms and use cases

Getting started is easy

Get started with Geist Pixel and start building. Install it directly, then use the exports and CSS variables:

GeistPixelSquare: --font-geist-pixel-square
GeistPixelGrid: --font-geist-pixel-grid
GeistPixelCircle: --font-geist-pixel-circle
GeistPixelTriangle: --font-geist-pixel-triangle
GeistPixelLine: --font-geist-pixel-line

Use it in layout.tsx, e.g. for GeistPixelSquare (a hedged sketch appears at the end of this post). Learn more in the README.

Designed for the web and for modern products

While many pixel fonts are purely expressive, Geist Pixel is meant to ship. It works in real UI contexts: banners, dashboards, experimental layouts, product moments, and systems where typography becomes part of the interface language. Special care was put into:

Vertical metrics aligned with Geist and Geist Mono
Consistent cap height and x-height behavior
Multiple variants for different densities and use cases
Seamless mixing with the rest of the Geist family

It's designed for the web, for modern products, and for an era where interfaces are increasingly shaped by AI-driven workflows.

Crafted on a grid, refined by hand

Although Geist Pixel is grid-based, it wasn't generated mechanically. Each glyph was manually refined to avoid visual noise, uneven weight distribution, and awkward diagonals. Corners, curves, and transitions were adjusted pixel by pixel to maintain clarity at small sizes and personality at larger scales. Horizontal metrics use a semi-mono approach, and letterforms take inspiration from both their Mono and Sans counterparts. Constraints weren't a limitation, they were the design tool.

Geist Pixel ships with:

5 variants
480 glyphs
7 stylistic sets
32 supported languages

Built with the same system mindset as Geist and Geist Mono, it's easy to adopt without breaking layout or rhythm.

Already shaping what's next

Even before its public release, Geist Pixel has already started influencing the visual language of Vercel.
Since being shared internally a few weeks ago, it's found its way into explorations, experiments, and early redesign work, shaping tone, texture, and expression across the product. In many ways, it's already part of the system.

One family, expanding

With Geist, Geist Mono, and now Geist Pixel, the family spans a broader range, from highly functional UI text to expressive, system-driven display moments. And we're not stopping here. Geist Serif is already in progress. Same system thinking. A new voice. Download Geist Pixel and start building.

None of this would have been possible without an incredible group of people behind the scenes. Huge thanks to Andrés Briganti for the obsessive level of craft and care poured into the design of the font itself, and to Guido Ferreyra for his support refining and tuning the font along the way; to Luis Gutierrez Rico for bringing Geist Pixel to life through motion and subtle magic; to Christopher Kindl for helping us put together the landing page and obsessing over those small details that make everything feel just right; to Marijana Pavlinić for constantly pushing us with bold, unexpected, and wildly creative ideas; and to Zahra Jabini for the coordination, technical support, and for making sure all the pieces actually came together. This was a true team effort, and I'm incredibly grateful to have built this with all of you.
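For reference, here is a minimal sketch of the install-and-use flow described above. The package name follows the existing geist npm package; the import path for the Pixel variants and the .variable export are assumptions, so check the README for the canonical usage.

```tsx
// Install the font package first (assumed to ship in the same `geist`
// package as Geist Sans and Geist Mono): npm install geist

// app/layout.tsx
import type { ReactNode } from 'react';
// Assumed import path; see the Geist README for the real export per variant.
import { GeistPixelSquare } from 'geist/font/pixel-square';

export default function RootLayout({ children }: { children: ReactNode }) {
  return (
    // Exposes the --font-geist-pixel-square CSS variable noted above,
    // assuming the export follows the same next/font pattern as Geist Sans.
    <html lang="en" className={GeistPixelSquare.variable}>
      <body>{children}</body>
    </html>
  );
}
```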

    Evil Rabbit
  • Company News
    Feb 5

    The Vercel AI Accelerator is back with $6m in credits

Building an AI business is no small feat. Delivering a great agentic product requires infrastructure that handles deployment, security, and scale automatically, but that's table stakes. Startups also need community support, mentorship, investor connections, platform credits, and visibility. That's why we created the Vercel AI Accelerator.

Last year, we hosted our second cohort of 40 early-stage teams from across the globe. They joined us for six weeks of learning, building, and shipping, hearing from speakers in leadership at AWS, Anthropic, Cursor, Braintrust, MongoDB, HubSpot, Vercel, and more. The program culminated with a demo day in San Francisco that drew hundreds from across the industry.

This year, the Accelerator is back with another cohort of 40 teams building the future of AI. Applications are open now until February 16th.

Program benefits

The AI Accelerator provides access to thousands of dollars in credits from Vercel, v0, AWS, and a variety of AI platforms. Participants also join an exclusive group of AI builders within the Vercel Community. Here are the full details:

Credits from Vercel, v0, AWS, and leading AI platforms, including Anthropic, Cursor, ElevenLabs, Hugging Face, Cartesia, Roboflow, Modal, Julius.ai, Sentry, Vanta, Auth0, Browserbase, WorkOS, Supabase, Autonoma, and Neon
An exclusive group of AI builders within the Vercel Community to share progress and exchange ideas during the program
Weekly sessions with industry leaders that connect you directly with AI startup founders, investors, and technical leaders through fireside chats and office hours
Production-focused guides, templates, and videos designed to accelerate development cycles and help you ship faster
The chance to present your product to industry leaders and VCs at demo day, creating visibility and potential fundraising opportunities

Platform credits and prizes

Every company accepted into the Accelerator receives thousands of dollars in credits from partner platforms. Finalists earn over $100K each in additional credit prizes, providing the compute and infrastructure resources needed to build and scale AI applications without operational overhead.

Six weeks of focused development

The 40 selected teams will join us from March 2nd to April 16th for six weeks of building and networking. The program includes:

Direct access to leading AI builders
Exclusive AI content
Community connections
Optional IRL meetups
Welcome and mid-point goal check-ins
Mentorship from VC partners

The program ends with a demo day in San Francisco on April 16th:

The audience will include leaders and investors
Judges will select three winners to receive over $100K in resources
The first-place winner will receive an investment from Vercel Ventures

Our previous demo day featured product launches from 26 teams and attracted hundreds of industry professionals. Judges from AWS, Vercel, Cursor, Modal, OpenAI, and Roboflow selected the winners. Since that event, several teams have raised venture funding rounds, secured enterprise customers, and established partnerships through connections made during the accelerator.

Infrastructure for AI development

We continue investing across the AI stack, from SDKs to templates, supporting how developers build modern apps and agents. Recent AI releases include Workflow DevKit, Sandboxes, Skills.sh, and git support in v0. These tools are designed to handle infrastructure automatically so teams can focus on building AI products.
Companies like Sensay, Chatbase, and Leonardo.ai are built with Next.js and deployed on Vercel.

Focus on what matters

AI agents can now operate autonomously on code, making decisions and taking actions without constant human oversight. This shift requires infrastructure that scales automatically and handles operational complexity behind the scenes. Vercel provides that foundation, giving startups the freedom to focus on building AI applications that solve real problems. In our last cohort we backed early-stage teams like Stably AI, Cervo, Bear AI, and General Translation.

Apply to the Vercel AI Accelerator and join a cohort of developers building AI applications that operate independently and scale automatically. Applications close February 16th.

Applicants must be Vercel customers at or above the age of majority in their jurisdiction. Applicants must be able to commit to the full six weeks of the program. Applicants must not be located in, or otherwise subject to restrictions imposed by, U.S. sanctions laws. We are looking for pre-seed ideas. Applications will be judged based on quality of submission, founder background, and overall potential for impact and scalability.

    Alli Pope
  • Engineering
    Feb 3

    Making agent-friendly pages with content negotiation

Agents browse the web, but they read differently than humans. They don't need CSS, client-side JavaScript, or images. All of that markup fills up their context window and consumes tokens without adding useful information. What agents need is clean, structured text.

That's why we've updated our blog and changelog pages to make markdown accessible to agents while still delivering a full HTML and CSS experience to human readers. This works through content negotiation, an HTTP mechanism where the server returns different formats for the same content based on what the client requests. No duplicate content or separate sites.

How agents request content

Agents use the HTTP Accept header to specify what formats they prefer. Claude Code, for example, sends this header when fetching pages:

Accept: text/markdown, text/html, */*

By listing text/markdown first, the agent signals that markdown is preferred over HTML when available. Many agents are starting to explicitly prefer markdown this way. Try it out by sending a curl request:

curl https://vercel.com/blog/self-driving-infrastructure -H "accept: text/markdown"

Our middleware examines the Accept header on incoming requests and detects these preferences. When markdown is preferred, it routes the request to a Next.js route handler that converts our Contentful rich-text content into markdown (a hedged sketch of this kind of middleware appears at the end of this post). This transformation preserves the content's structure. Code blocks keep their syntax highlighting markers, headings maintain their hierarchy, and links remain functional. The agent receives the same information as the HTML version, just in a format optimized for token efficiency.

Performance benefits

A typical blog post weighs 500KB with all the HTML, CSS, and JavaScript. The same content as markdown is only 2KB. That's a 99.6% reduction in payload size. For agents operating under token limits, smaller payloads mean they can consume more content per request and spend their budget on actual information instead of markup. They work faster and hit limits less often.

We keep the HTML and markdown versions in sync using the Next.js 16 remote cache and shared slugs. When content updates in Contentful, both versions refresh simultaneously.

How agents discover available content

Agents need to discover what's available, so we implemented a markdown sitemap that lists all content in a format optimized for agent consumption. The sitemap includes metadata about each piece, including publication dates, content types, and direct links to both HTML and markdown versions. This gives agents a complete map of available information and lets them choose the format that works best for their needs.

Want to see this in action? Add .md to the end of this page's URL to get the markdown version.
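As a sketch of the mechanism, here is roughly what Accept-header-based content negotiation can look like in Next.js middleware. The rewrite target path and the preference check are illustrative, not our production middleware.

```ts
// middleware.ts
import { NextResponse, type NextRequest } from 'next/server';

export function middleware(request: NextRequest) {
  const accept = request.headers.get('accept') ?? '';

  // Treat text/markdown listed before text/html as a markdown preference.
  const prefersMarkdown =
    accept.includes('text/markdown') &&
    (!accept.includes('text/html') ||
      accept.indexOf('text/markdown') < accept.indexOf('text/html'));

  if (prefersMarkdown) {
    // Rewrite to a route handler that renders the same slug as markdown.
    // The /markdown prefix is a hypothetical route, not Vercel's actual one.
    const url = request.nextUrl.clone();
    url.pathname = `/markdown${url.pathname}`;
    return NextResponse.rewrite(url);
  }

  return NextResponse.next();
}

export const config = {
  matcher: ['/blog/:path*', '/changelog/:path*'],
};
```

Because this is a rewrite rather than a redirect, the URL the agent requested stays the same; only the response body changes format.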

    Zach and Mitul
  • General
    Feb 3

    The Vercel OSS Bug Bounty program is now available

Security is foundational to everything we build at Vercel. Our open source projects power millions of applications across the web, from small side projects to demanding production workloads at Fortune 500 companies. That responsibility drives us to keep investing in security for the platform and the broader ecosystem.

Today, we're opening the Vercel Open Source Software (OSS) bug bounty program to the public on HackerOne. We're inviting security researchers everywhere to find vulnerabilities, challenge assumptions, and help us reduce risk for everyone building with these tools.

Since August 2025, we've run a private bug bounty for our open source software with a small group of researchers. That program produced multiple high-severity reports across our Tier 1 projects and helped us refine our processes for triage, fixes, coordinated disclosure, and CVE publication. Now we're ready to expand.

Building on our foundation of security investment

Last fall, we opened a bug bounty program focused on the Web Application Firewall and the React2Shell vulnerability class. Rather than wait for bypasses to surface in the wild, we took a proactive approach: pay security researchers to find them first. That program paid out over $1M across dozens of researchers who helped us find and fix vulnerabilities before attackers could.

The lesson was clear. Good incentives and clear communication turn researchers into partners, not adversaries. Opening our private OSS bug bounty program to the public is the natural next step. Security vulnerabilities in these projects don't just affect Vercel; they affect everyone who builds with these tools. Finding and fixing them protects millions of end users.

Which projects are covered

All Vercel open source projects are in scope. The projects listed below represent the core of the Vercel open source ecosystem. These are the frameworks, libraries, and tools that millions of developers rely on daily.

Core projects included in the HackerOne program:

| Project      | Description                                     |
| ------------ | ----------------------------------------------- |
| Next.js      | React framework for production web applications |
| Nuxt         | Vue.js framework for modern web development     |
| SWR          | React Hooks library for data fetching           |
| Svelte       | Framework for building user interfaces          |
| Turborepo    | High-performance build system for monorepos     |
| AI SDK       | TypeScript toolkit for AI applications          |
| vercel (CLI) | Command-line interface for the Vercel platform  |
| workflow     | Durable workflow execution engine               |
| flags        | Feature flags SDK                               |
| ms           | Tiny millisecond conversion utility             |
| nitrojs      | Universal server engine                         |
| async-sema   | Semaphore for async operations                  |
| skills       | The open agent skills tool: npx skills          |

These are the projects where vulnerabilities have the highest potential impact, and where we prioritize incident response, vulnerability management, and CVE publication.

How to participate

If you're a security researcher ready to start hunting, visit HackerOne to find everything you need: scope details, reward ranges, and submission guidelines. When you find a vulnerability, submit it through HackerOne with clear reproduction steps. Our security team reviews every submission and works directly with researchers through the disclosure process. We're committed to fast response times and transparent communication.

We appreciate the researchers who take the time to dig into our code and report issues responsibly. Your work helps keep these projects safer for everyone. Join our bug bounty program or learn more about security at Vercel.

    Andy Riancho
  • v0
    Feb 3

    Introducing the new v0

Since v0 became generally available in 2024, more than 4 million people have used it to turn their ideas into apps in minutes. v0 has helped people get promotions, win more clients, and work more closely with developers. AI lowered the barrier to writing code. Now we're raising the bar for shipping it.

Today, v0 evolves vibe coding from novelty to business critical. Built for production apps and agents, this release includes enterprise-grade security and integrations teams can use to ship real software, not just spin up demos.

The limitations of vibe coding

We're at an inflection point where anyone can create software. But this freedom has created three problems for the enterprise.

Vibe coding is now the world's largest shadow IT problem. AI-enabled software creation is already happening inside every enterprise, and employees are shipping security flaws alongside features: credentials copied into prompts, company data published to the public internet, and databases deleted, all with no audit trail.

Demos are easy to generate, but production features aren't. Prototyping is one of the most popular use cases for marketers and PMs, but the majority of real software work happens on existing apps, not one-off creations. Prototypes fail because they live outside real codebases, require rewrites before production, and create handoffs between tools and teams.

The old Software Development Life Cycle is overloaded with dead ends. The legacy SDLC relies on countless PRDs, tickets, and review meetings. Feedback cycles take weeks or months. Vibe coding has overloaded these outdated processes with thousands of good ideas that will never see the light of day, frustrating engineers and their stakeholders.

We took these problems to heart and rebuilt v0 from the ground up.

From 0 to shipped: What's new

Work on existing codebases

Instead of engineers spending weeks on rewrites for production, v0's new sandbox-based runtime can import any GitHub repo and automatically pull environment variables and configurations from Vercel. Every prompt generates production-ready code in a real environment, and it lives in your repo. No more copying code back and forth.

Bring git to your entire team

Historically, marketers and PMs weren't comfortable setting up and troubleshooting a local dev environment. With v0, they don't have to. A new Git panel lets you create a new branch for each chat, open PRs against main, and deploy on merge. Pull requests are first-class and previews map to real deployments. For the first time, anyone on a team, not just engineers, can ship production code through proper git workflows.

Democratize data, safely

Building internal reports and data apps typically requires painful setup of ETL pipelines and scheduled jobs. With v0, you can connect your app directly to the tables you need. Secure integrations with Snowflake and AWS databases mean anyone can build custom reporting, add rich context to their internal tools, and automate data-triggered processes.

Stay secure by default

Vibe coding tools optimize for speed and novelty, discarding decades of software engineering best practices. v0 is built on Vercel, where security is built in by default and configurable for common compliance needs. Set deployment protection requirements, connect securely to enterprise systems, and set proper access controls for every app.

How our customers use the new v0

Product leaders turn PRDs into prototypes, and prototypes into PRs, shipping the right features, fast.
They go from "tell sales there's another delay" to "it's shipped."

Designers work against real code, refining layouts, tweaking components, and previewing production with each update. They go from "another ticket for frontend" to "it's shipped."

Marketers turn ideas into site updates immediately, editing landing pages, changing images, fixing copy, and publishing, all without opening a ticket. They go from "please, it's a quick change" to "it's shipped."

Engineers unblock stakeholders without breaking prod, making quick fixes, importing repos, and letting business users open PRs, all in a single tab. They go from "I can't keep up with the backlog" to "it's shipped."

Data teams ship dashboards the business actually uses, building custom reports and analytics on top of real data with just a few prompts. They go from "that's buried in a notebook" to "it's shipped."

GTM teams close deals with the demo customers actually asked for, creating live previews, mock data, and branded experiences in minutes. They go from "let's show the standard deck" to "it's shipped."

What's next

Today, you can use v0 to ship production apps and websites. 2026 will be the year of agents. Soon, you'll be able to build end-to-end agentic workflows in v0, AI models included, and deploy them on Vercel's self-driving infrastructure.

Welcome to the new v0. We can't wait to see what you build. Sign up or log in to try the new v0 today.

Snowflake, GitHub, and AWS are trademarks of their respective owners.

    Zeb Hermann
  • General
    Jan 30

    Run untrusted code with Vercel Sandbox, now generally available

AI agents are changing how software gets built. They clone repos, install dependencies, run tests, and iterate in seconds. Despite this shift, most infrastructure was built for humans, not agents. Traditional compute assumes someone is in the loop, with minutes to provision and configure environments. Agents need secure, isolated environments that start fast, run untrusted code, and disappear when the task is done.

Today, Vercel Sandbox is generally available as the execution layer for agents, and we're open-sourcing the Vercel Sandbox CLI and SDK for the community to build on this infrastructure.

Built on our compute platform

Vercel processes over 2.7 million deployments per day. Each one spins up an isolated microVM, runs user code, and disappears, often in seconds. To do that at scale, we built our own compute platform. Internally code-named Hive, it's powered by Firecracker and orchestrates microVM clusters across multiple regions. When you click Deploy in v0, import a repo, clone a template, or run vercel in the CLI, Hive is what makes it feel quick. Sandbox brings that same infrastructure to agents.

Why agents need different infrastructure

Agents don't work like humans. They spin up environments, execute code, tear them down, and repeat the cycle continuously. That shifts the constraints toward isolation, security, and ephemeral operation, not persistent, long-running compute. Agents need:

Sub-second starts for thousands of sandboxes per task
Full isolation when running untrusted code from repositories and user input
Ephemeral environments that exist only as long as needed
Snapshots to restore complex environments instantly instead of rebuilding
Fluid compute with Active CPU pricing for cost and performance efficiency

We've spent years solving these problems for deployments. Sandbox applies the same approach to agent compute.

What is Vercel Sandbox?

Vercel Sandbox provides on-demand Linux microVMs. Each sandbox is isolated, with its own filesystem, network, and process space. You get sudo access, package managers, and the ability to run the same commands you'd run on a Linux machine.

Sandboxes are ephemeral by design. They run for as long as you need, then shut down automatically, and you only pay for active CPU time, not idle time. This matches how agents work. A single task can involve dozens of start, run, and teardown cycles, and the infrastructure needs to keep up.

How teams are using Sandbox

Roo Code

Roo Code builds AI coding agents that work across Slack, Linear, GitHub, and their web interface. When you trigger an agent, you get a running application to interact with, not just a patch. Snapshots changed their architecture: they snapshot the environment so later runs can restore a known state instead of starting from scratch, skipping repo cloning, dependency installs, and service boot time.

Blackbox AI

Blackbox AI built Agents HQ, a unified orchestration platform that integrates multiple AI coding agents through a single API. It runs tasks inside Vercel Sandboxes, which supports horizontal scaling for high-volume concurrent execution. Blackbox can dispatch tasks to multiple agents in parallel, each in an isolated sandbox, without resource contention.

Create your first sandbox with one command in the CLI, or start from the SDK (a hedged sketch appears below). Explore the documentation to get started, and check out the open-source SDK.
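For a sense of the SDK, here is a minimal sketch of the create, run, and stop lifecycle, assuming the @vercel/sandbox package. The option names, runtime identifier, and the shape of the command result follow our reading of the docs and should be treated as assumptions rather than the definitive API.

```ts
import { Sandbox } from '@vercel/sandbox';
import ms from 'ms';

// Minimal lifecycle sketch: create an ephemeral microVM, run a command,
// and always tear it down. Option names below are assumptions.
async function runInSandbox(): Promise<void> {
  const sandbox = await Sandbox.create({
    runtime: 'node22',   // assumed runtime identifier
    timeout: ms('2m'),   // hard ceiling; the sandbox is stopped if exceeded
  });

  try {
    const result = await sandbox.runCommand({
      cmd: 'node',
      args: ['--version'],
    });
    console.log(result); // inspect exit code and output per the SDK docs
  } finally {
    await sandbox.stop(); // always tear down so the microVM isn't leaked
  }
}
```

The try/finally mirrors the pattern described elsewhere on this blog: even if the agent or command fails, the sandbox is stopped so resources aren't leaked.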

    Harpreet and Dan
