
Build a Claude Managed Agent with Vercel Sandbox

Build a Claude Managed Agent with Vercel Sandbox: each session runs in a fresh microVM with credential brokering and a webhook-driven control plane on Vercel.

Last updated May 19, 2026

Claude Managed Agents (CMA) handles the agent and infrastructure for you: the harness, the session state, the tools, and the execution environment. What if you want to plug in your own environment instead? This guide shows you how, using Vercel Sandbox as the execution layer.

Check out the demo app source code, or follow the walkthrough below to build it yourself.

Anthropic hosts the brain: Claude, the tool-calling loop, skills, and memory. The brain has no hands, so when Claude calls a tool, something on your side has to run it and post the result back. With Vercel, that "something" splits into two planes:

  • Control plane (Vercel Function): receives session.status_run_started webhooks from Anthropic and spawns one Vercel Sandbox per session.
  • Compute plane (Vercel Sandbox): the spawned VM attaches to the session's event stream, executes tool calls (run_shell, read_file, etc.), posts results back, and exits when the session ends.

Each session runs in a fresh isolated microVM that exits when the session ends. The environment key never enters the VM: Vercel Sandbox's credential brokering injects it on outbound requests scoped to this session, so a compromised sandbox can't extract the key or use it to act on other sessions.

Prerequisites:

  • A Vercel account with Sandbox access
  • An Anthropic account with environments access
  • Node.js 22+
  • Vercel CLI (pnpm add -g vercel)

The control plane webhook, the UI, and the setup scripts all live in the same Next.js app. Scaffold it with:

Terminal
pnpm create next-app cma-private-sandbox
cd cma-private-sandbox
pnpm add @anthropic-ai/sdk @vercel/sandbox@beta @vercel/functions ms
pnpm add -D tsx @types/node @types/ms
mkdir scripts sandbox

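tsx does not load .env.local on its own. Pass Node's --env-file flag when running a script (for example, pnpm tsx --env-file=.env.local scripts/create-environment.ts) so the scripts see these variables without a dotenv import.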

Link the project to Vercel and pull credentials:

Terminal
vercel link
vercel env pull .env.local

This writes a VERCEL_OIDC_TOKEN to .env.local so @vercel/sandbox can authenticate without a long-lived Vercel token.

All API calls below require two headers, which the SDK adds automatically when you pass the beta tag:

anthropic-version: 2023-06-01
anthropic-beta: managed-agents-2026-04-01
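
If you ever call the API without the SDK (for example from curl or a plain fetch), you have to set these headers yourself. A minimal sketch, assuming API-key authentication through the x-api-key header; the helper name managedAgentsHeaders is illustrative and not part of any SDK:

```typescript
// Header values are the ones listed above; the function name is hypothetical.
const ANTHROPIC_VERSION = "2023-06-01";
const MANAGED_AGENTS_BETA = "managed-agents-2026-04-01";

function managedAgentsHeaders(apiKey: string): Record<string, string> {
  return {
    "x-api-key": apiKey,
    "anthropic-version": ANTHROPIC_VERSION,
    "anthropic-beta": MANAGED_AGENTS_BETA,
    "content-type": "application/json",
  };
}
```

Requests authenticated with the environment key use an Authorization: Bearer header instead of x-api-key, as the network policy later in this guide shows.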

Create the environment in the Anthropic dashboard (Workspace → Environments → New → Self-hosted) or in code:

scripts/create-environment.ts
import Anthropic from "@anthropic-ai/sdk";
async function main() {
  const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
  const env = await client.beta.environments.create({
    name: "vercel-sandbox",
    config: { type: "self_hosted" },
    betas: ["managed-agents-2026-04-01"],
  });
  console.log("ANTHROPIC_ENVIRONMENT_ID=" + env.id);
}
main();

Run it once and add the printed ID to .env.local. In the console, open the environment and click Generate environment key. Save it as ANTHROPIC_ENVIRONMENT_KEY in .env.local. This key authenticates the whole worker flow: poll, ack, heartbeat, stop, and the session event stream. Ignore the on-screen instructions about an env_manager binary: Vercel Sandbox is the runtime.

Create an agent with the custom tools your runner will handle. The tools array here must match exactly what runTool implements in the sandbox:

scripts/create-agent.ts
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
async function main() {
  const agent = await client.beta.agents.create({
    name: "Vercel Sandbox Agent",
    model: "claude-opus-4-7",
    system: "You are a coding assistant with a Linux environment.",
    tools: [
      {
        type: "custom",
        name: "run_shell",
        description: "Run a shell command in the sandbox. Returns stdout.",
        input_schema: {
          type: "object",
          properties: {
            command: { type: "string" },
          },
          required: ["command"],
        },
      },

Add a read_file tool the same way:

      {
        type: "custom",
        name: "read_file",
        description: "Read the contents of a file at the given path.",
        input_schema: {
          type: "object",
          properties: {
            path: { type: "string" },
          },
          required: ["path"],
        },
      },
    ],
    betas: ["managed-agents-2026-04-01"],
  });
  console.log("ANTHROPIC_AGENT_ID=" + agent.id);
}
main();

Save the printed ID as ANTHROPIC_AGENT_ID in .env.local.

This is the code that runs inside each spawned sandbox. It reconciles any tool calls that arrived before it attached, then streams new ones and posts results back.

Create sandbox/runner.ts:

sandbox/runner.ts
import Anthropic from "@anthropic-ai/sdk";
import { execSync } from "node:child_process";
import { readFile } from "node:fs/promises";
const ENV_ID = process.env.ENVIRONMENT_ID!;
const WORK_ID = process.env.WORK_ID!;
const SESSION_ID = process.env.SESSION_ID!;
const BETA = "managed-agents-2026-04-01";
// Auth is injected at the sandbox firewall via credential brokering,
// so the SDK only needs a placeholder here. The real key never enters the VM.
const client = new Anthropic({ authToken: "_brokered_" });
const handled = new Set<string>();

The released SDK uses one credential for everything: poll, ack, heartbeat, stop, and session events. The control plane will configure the sandbox's network policy to inject that credential on outbound requests to api.anthropic.com, so this code never sees the raw token.

Define your tool implementations:

async function runTool(name: string, input: unknown): Promise<string> {
  if (name === "run_shell") {
    const cmd = (input as { command: string }).command;
    return execSync(cmd, { encoding: "utf8", timeout: 30_000 });
  }
  if (name === "read_file") {
    return await readFile((input as { path: string }).path, "utf8");
  }
  return `unknown tool: ${name}`;
}
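
Because the agent's tools array and runTool live in different files, they can drift apart without any error until Claude calls a tool the runner does not know. A minimal sketch of a drift check you could run in a test or setup script; findToolDrift and ToolDrift are hypothetical names, and the two name lists are assumed to be maintained by you:

```typescript
interface ToolDrift {
  missingHandlers: string[]; // declared on the agent, not implemented in runTool
  undeclared: string[];      // implemented in runTool, not declared on the agent
}

function findToolDrift(
  declared: readonly string[],
  implemented: readonly string[],
): ToolDrift {
  const declaredSet = new Set(declared);
  const implementedSet = new Set(implemented);
  return {
    missingHandlers: declared.filter((name) => !implementedSet.has(name)),
    undeclared: implemented.filter((name) => !declaredSet.has(name)),
  };
}
```

For the two tools in this guide, findToolDrift(["run_shell", "read_file"], ["run_shell", "read_file"]) returns two empty arrays.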

Start the heartbeat to keep the work-item lease alive:

let last: string | undefined;
const hb = setInterval(async () => {
  try {
    const r = await client.beta.environments.work.heartbeat(
      WORK_ID,
      { environment_id: ENV_ID, expected_last_heartbeat: last, betas: [BETA] },
    );
    last = r.last_heartbeat;
  } catch {
    // Ignore a failed beat; the next tick retries with the same `last` value.
  }
}, 30_000);
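
The loop above swallows heartbeat errors, which keeps a transient network failure from crashing the runner but also hides a lease that is permanently lost. One possible refinement, sketched here with a hypothetical HeartbeatTracker class, is to count consecutive failures and give up after a threshold you choose:

```typescript
// Hypothetical helper: tracks consecutive heartbeat failures.
class HeartbeatTracker {
  last: string | undefined;
  private failures = 0;
  private readonly maxFailures: number;

  constructor(maxFailures: number) {
    this.maxFailures = maxFailures;
  }

  // Call after a successful heartbeat with the returned timestamp.
  recordSuccess(lastHeartbeat: string): void {
    this.failures = 0;
    this.last = lastHeartbeat;
  }

  // Call after a failed heartbeat. Returns true once the runner should stop.
  recordFailure(): boolean {
    this.failures += 1;
    return this.failures >= this.maxFailures;
  }
}
```

Inside the interval callback you would call recordSuccess(r.last_heartbeat) in the try branch and exit the process when recordFailure() returns true.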

Post results back for each tool call:

async function handleTool(ev: { id: string; name: string; input: unknown }) {
  const output = await runTool(ev.name, ev.input).catch(
    (e: Error) => `error: ${e.message}`
  );
  await client.beta.sessions.events.send(SESSION_ID, {
    events: [{
      type: "user.custom_tool_result",
      custom_tool_use_id: ev.id,
      content: [{ type: "text", text: output || "(no output)" }],
    }],
  });
  handled.add(ev.id);
}

List existing events to catch up on anything emitted while the sandbox was booting, then switch to the live stream:

try {
  // Read the whole history before running anything: a tool call whose result
  // appears later in the list was already handled and must not run again.
  const pending: { id: string; name: string; input: unknown }[] = [];
  for await (const ev of client.beta.sessions.events.list(
    SESSION_ID, { limit: 1000 }
  )) {
    if (ev.type === "agent.custom_tool_use") pending.push(ev);
    else if (ev.type === "user.custom_tool_result")
      handled.add(ev.custom_tool_use_id);
  }
  for (const ev of pending) {
    if (!handled.has(ev.id)) await handleTool(ev);
  }
  const stream = await client.beta.sessions.events.stream(SESSION_ID);
  for await (const ev of stream) {
    if (ev.type === "agent.custom_tool_use" && !handled.has(ev.id))
      await handleTool(ev);
  }
} finally {
  clearInterval(hb);
  await client.beta.environments.work
    .stop(WORK_ID, { environment_id: ENV_ID, betas: [BETA] })
    .catch((e) => { if (e?.status !== 409) throw e; });
}

The reconcile pass matters because the webhook may take a moment to spawn the sandbox. Listing first and deduplicating with handled ensures no tool call is dropped or processed twice. A 409 from work.stop means another runner already stopped it, which is safe to swallow.
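
To make the deduplication concrete, here is a small model of the reconcile step over plain objects. The SessionEvent type is a simplified stand-in for the SDK's event union (only the fields the runner reads), and pendingToolCalls is a hypothetical helper:

```typescript
// Simplified stand-in for the SDK event union; not the real SDK type.
type SessionEvent =
  | { type: "agent.custom_tool_use"; id: string; name: string; input: unknown }
  | { type: "user.custom_tool_result"; custom_tool_use_id: string }
  | { type: "agent.message" };

type ToolUse = Extract<SessionEvent, { type: "agent.custom_tool_use" }>;

// Returns tool calls that have no matching result anywhere in the history,
// regardless of the order in which calls and results are listed.
function pendingToolCalls(events: readonly SessionEvent[]): ToolUse[] {
  const answered = new Set<string>();
  for (const ev of events) {
    if (ev.type === "user.custom_tool_result") answered.add(ev.custom_tool_use_id);
  }
  return events.filter(
    (ev): ev is ToolUse =>
      ev.type === "agent.custom_tool_use" && !answered.has(ev.id),
  );
}
```

Given a history with call a, the result for a, and call b, only b is returned, so a runner that attaches late executes exactly the unanswered call.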

Installing the Anthropic SDK and copying in the runner on every spawn would add noticeable latency to each session. Build a snapshot once, then every sandbox boots from that prebuilt image with no install step:

scripts/build-snapshot.ts
import { Sandbox } from "@vercel/sandbox";
import { readFileSync } from "node:fs";
async function main() {
  const sandbox = await Sandbox.create({ runtime: "node24" });
  await sandbox.writeFiles([
    { path: "/vercel/sandbox/package.json",
      content: Buffer.from('{"type":"module"}') },
    { path: "/vercel/sandbox/runner.ts",
      content: readFileSync("./sandbox/runner.ts") },
  ]);
  await sandbox.runCommand("npm", ["install", "@anthropic-ai/sdk", "tsx"]);
  const snapshot = await sandbox.snapshot();
  console.log("SANDBOX_SNAPSHOT_ID=" + snapshot.snapshotId);
  await sandbox.stop();
}
main();

Run it and save the printed ID to .env.local:

pnpm tsx scripts/build-snapshot.ts

Rebuild the snapshot whenever your tool implementations or the SDK version change.
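
One way to notice that a rebuild is due is to fingerprint the inputs that go into the snapshot and compare the value with the one recorded at the last build. A minimal sketch, assuming you store the fingerprint yourself (for example next to SANDBOX_SNAPSHOT_ID); snapshotFingerprint is a hypothetical helper, not part of @vercel/sandbox:

```typescript
import { createHash } from "node:crypto";

// Hashes the runner source together with the SDK version string.
// The NUL separator keeps ("ab", "c") and ("a", "bc") from colliding.
function snapshotFingerprint(runnerSource: string, sdkVersion: string): string {
  return createHash("sha256")
    .update(runnerSource)
    .update("\0")
    .update(sdkVersion)
    .digest("hex");
}
```

If the fingerprint of the current sandbox/runner.ts and installed SDK version differs from the stored one, rerun scripts/build-snapshot.ts.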

Register a webhook in the Anthropic dashboard for session.status_run_started. Each delivery triggers one poll, ack, and spawn pass.

Create app/api/webhook/route.ts:

import Anthropic from "@anthropic-ai/sdk";
import { Sandbox } from "@vercel/sandbox";
import { waitUntil } from "@vercel/functions";
import ms from "ms";
const ENV_ID = process.env.ANTHROPIC_ENVIRONMENT_ID!;
const ENV_KEY = process.env.ANTHROPIC_ENVIRONMENT_KEY!;
const SNAPSHOT_ID = process.env.SANDBOX_SNAPSHOT_ID!;
const WEBHOOK_SECRET = process.env.ANTHROPIC_WEBHOOK_SECRET!;
const BETA = "managed-agents-2026-04-01";
const client = new Anthropic({ authToken: ENV_KEY });

Poll and ack. The client is already authenticated with the environment key, so ack needs no per-call headers:

async function pollAndAck() {
  const work = await client.beta.environments.work.poll(ENV_ID, {
    reclaim_older_than_ms: 2000,
    betas: [BETA],
  });
  if (!work || work.data.type !== "session") return null;
  await client.beta.environments.work.ack(work.id, {
    environment_id: ENV_ID,
    betas: [BETA],
  });
  return { workId: work.id, sessionId: work.data.id };
}

Spawn a sandbox from the snapshot and run the tool runner detached. The networkPolicy brokers the environment key at the firewall, scoped to just this session and work item: outbound calls to /v1/sessions/<sessionId>/... and /v1/environments/<envId>/work/<workId>/... get the Authorization: Bearer <key> header injected on the wire; anything else (e.g. work/poll or another session ID) gets no auth and is rejected by Anthropic. Matchers require @vercel/sandbox@beta:

async function spawn(sessionId: string, workId: string) {
  const inject = [{ headers: { authorization: `Bearer ${ENV_KEY}` } }];
  const sandbox = await Sandbox.create({
    source: { type: "snapshot", snapshotId: SNAPSHOT_ID },
    runtime: "node24",
    timeout: ms("1h"),
    networkPolicy: {
      allow: {
        "api.anthropic.com": [
          {
            match: { path: { startsWith: `/v1/sessions/${sessionId}/` } },
            transform: inject,
          },
          {
            match: {
              path: {
                startsWith: `/v1/environments/${ENV_ID}/work/${workId}/`,
              },
            },
            transform: inject,
          },
        ],
      },
    },
  });
  await sandbox.runCommand({
    cmd: "npx",
    args: ["tsx", "runner.ts"],
    cwd: "/vercel/sandbox",
    env: {
      ENVIRONMENT_ID: ENV_ID,
      WORK_ID: workId,
      SESSION_ID: sessionId,
    },
    detached: true,
  });
}

process.env.ANTHROPIC_ENVIRONMENT_KEY is undefined inside the spawned VM. Even if an agent jailbreak or compromised tool ran console.log(process.env), there's no key to leak, and the scoped matchers mean a malicious request to work/poll or another session ID won't be authenticated. Adding more domains to the runner (e.g. a customer API) means extending the allow map. The default mode is deny-all once you set a network policy, so anything not in allow is blocked at the firewall.
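
The scoping depends entirely on the two path prefixes, so it can be worth unit-testing how you build them. The sketch below is a local model of prefix matching, not Vercel's firewall implementation; scopedPrefixes and wouldInject are hypothetical names, and the IDs and request paths in the usage note are made up:

```typescript
// Builds the same two prefixes the spawn() function passes to the network policy.
function scopedPrefixes(envId: string, sessionId: string, workId: string): string[] {
  return [
    `/v1/sessions/${sessionId}/`,
    `/v1/environments/${envId}/work/${workId}/`,
  ];
}

// Local model: would a request to `url` match one of the prefixes?
function wouldInject(url: string, prefixes: readonly string[]): boolean {
  const parsed = new URL(url);
  if (parsed.hostname !== "api.anthropic.com") return false;
  return prefixes.some((prefix) => parsed.pathname.startsWith(prefix));
}
```

With prefixes built for session sesn_A, a request under /v1/sessions/sesn_A/ matches, while /v1/sessions/sesn_AB/ and /v1/environments/env_1/work/poll do not. The trailing slash in each prefix is what prevents one ID from matching another ID that merely starts with the same characters.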

detached: true returns immediately, leaving the runner running inside the VM. The webhook handler ties it together. client.beta.webhooks.unwrap() verifies the HMAC signature, checks the timestamp, and parses the event in one call, so there's no hand-rolled crypto:

export async function POST(req: Request): Promise<Response> {
  const body = await req.text();
  let event;
  try {
    event = client.beta.webhooks.unwrap(body, {
      headers: Object.fromEntries(req.headers),
      key: WEBHOOK_SECRET,
    });
  } catch {
    return new Response("bad signature", { status: 401 });
  }
  if (event.data.type !== "session.status_run_started")
    return new Response("ignored");
  const item = await pollAndAck();
  if (!item) return new Response("no_work");
  waitUntil(spawn(item.sessionId, item.workId));
  return new Response("ok");
}

waitUntil hands the spawn off so the function returns 200 immediately while Sandbox.create finishes in the background.
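
For intuition about what a call like unwrap() does, here is a generic sketch of timestamped HMAC verification. It is not Anthropic's signing scheme: the signed payload format (timestamp, a dot, then the body), the hex encoding, and the 300-second tolerance are all assumptions chosen for illustration. Use unwrap() in real code.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Illustrative signature: HMAC-SHA256 over "<timestamp>.<body>", hex-encoded.
function sign(secret: string, timestamp: number, body: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${body}`).digest("hex");
}

function verify(
  secret: string,
  timestamp: number,
  body: string,
  signature: string,
  nowSeconds: number,
  toleranceSeconds = 300,
): boolean {
  // Reject stale or future-dated deliveries to limit replay.
  if (Math.abs(nowSeconds - timestamp) > toleranceSeconds) return false;
  const expected = Buffer.from(sign(secret, timestamp, body), "hex");
  const given = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so check length first.
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

The constant-time comparison and the timestamp window are the two details hand-rolled verifiers most often get wrong, which is the reason to prefer the SDK helper.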

Push the project to Vercel and set the production environment variables:

vercel deploy
vercel env add ANTHROPIC_API_KEY
vercel env add ANTHROPIC_ENVIRONMENT_ID
vercel env add ANTHROPIC_AGENT_ID
vercel env add ANTHROPIC_ENVIRONMENT_KEY
vercel env add SANDBOX_SNAPSHOT_ID
vercel env add ANTHROPIC_WEBHOOK_SECRET
vercel deploy --prod
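
A missing variable otherwise only surfaces when the first webhook arrives. A minimal sketch of a fail-fast check over the six names listed above; missingEnv is a hypothetical helper you could call at the top of the webhook route or in a setup script:

```typescript
// The variable names come from the vercel env add commands above.
const REQUIRED_ENV = [
  "ANTHROPIC_API_KEY",
  "ANTHROPIC_ENVIRONMENT_ID",
  "ANTHROPIC_AGENT_ID",
  "ANTHROPIC_ENVIRONMENT_KEY",
  "SANDBOX_SNAPSHOT_ID",
  "ANTHROPIC_WEBHOOK_SECRET",
] as const;

// Returns the names that are unset or empty in the given environment.
function missingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((name) => !env[name]);
}
```

Calling missingEnv(process.env) and throwing when the result is non-empty turns a confusing runtime failure into a clear startup error.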

In the Anthropic dashboard, add a webhook subscribed to session.status_run_started pointing at https://your-project.vercel.app/api/webhook. Save the webhook signing secret as ANTHROPIC_WEBHOOK_SECRET.

If your Vercel project has Deployment Protection enabled, Anthropic's delivery will be blocked with a 401. Append a bypass token to the URL so it gets through:

https://your-project.vercel.app/api/webhook?x-vercel-protection-bypass=<bypass-secret>

The bypass secret is in your Vercel project settings under Deployment Protection.
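
If you register the webhook from a script, building the URL with the URL API avoids encoding mistakes when the secret contains reserved characters. A small sketch; webhookUrl is a hypothetical helper, and the base URL and secret are placeholders:

```typescript
// Builds the webhook URL, adding the bypass parameter only when a secret is given.
function webhookUrl(baseUrl: string, bypassSecret?: string): string {
  const url = new URL("/api/webhook", baseUrl);
  if (bypassSecret) {
    url.searchParams.set("x-vercel-protection-bypass", bypassSecret);
  }
  return url.toString();
}
```

The query parameter name is the one shown above; searchParams.set percent-encodes the value for you.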

With the Next.js project in place, the UI is two API routes and a client page.

app/api/session/route.ts creates the session and sends the first message:

import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
export async function POST(req: Request) {
  const { message } = await req.json();
  const session = await client.beta.sessions.create({
    agent: process.env.ANTHROPIC_AGENT_ID!,
    environment_id: process.env.ANTHROPIC_ENVIRONMENT_ID!,
    betas: ["managed-agents-2026-04-01"],
  });
  await client.beta.sessions.events.send(session.id, {
    events: [{
      type: "user.message",
      content: [{ type: "text", text: message }],
    }],
  });
  return Response.json({ sessionId: session.id });
}

app/api/session/[id]/route.ts streams session events as SSE. It first catches up on events that already exist, then switches to live streaming.

Set up the route. The SDK exports BetaManagedAgentsSessionEvent as a discriminated union, so a switch on ev.type narrows each branch to the right payload shape without manual casts:

import Anthropic from "@anthropic-ai/sdk";
import type { BetaManagedAgentsSessionEvent } from
  "@anthropic-ai/sdk/resources/beta/sessions/events";
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
export const dynamic = "force-dynamic";
export async function GET(
  _req: Request,
  { params }: { params: Promise<{ id: string }> },
) {
  const { id } = await params;
  const stream = new ReadableStream({
    async start(ctrl) {
      const send = (event: string, data: unknown) =>
        ctrl.enqueue(new TextEncoder().encode(
          `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`,
        ));

Translate each session event into an SSE frame. Returning true from forward signals the turn is over:

      const forward = (ev: BetaManagedAgentsSessionEvent): boolean => {
        switch (ev.type) {
          case "agent.message":
            send("message", {
              text: ev.content
                .map(c => "text" in c ? c.text : "").join(""),
            });
            return false;
          case "agent.custom_tool_use":
            send("tool_use", { name: ev.name, input: ev.input });
            return false;
          case "user.custom_tool_result":
            send("tool_result", { content: ev.content });
            return false;
          case "session.status_idle":
            if (ev.stop_reason?.type === "end_turn") {
              send("done", { sessionId: id });
              return true;
            }
            return false;
          default:
            return false;
        }
      };

Catch up on history, then stream live events until the turn ends:

      try {
        for await (const ev of client.beta.sessions.events.list(
          id, { limit: 1000 }
        )) {
          if (forward(ev)) return;
        }
        const evStream = await client.beta.sessions.events.stream(id);
        for await (const ev of evStream) {
          if (forward(ev)) break;
        }
      } finally { ctrl.close(); }
    },
  });
  return new Response(stream, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
    },
  });
}

The client page subscribes to the SSE stream and renders tool calls and the agent's reply as they arrive.
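
If the page reads the response with fetch rather than EventSource, it has to split the byte stream into frames itself. A minimal sketch of that parsing for the frame shape the route emits (one event line, one data line holding JSON, then a blank line); parseFrames and SseFrame are hypothetical names, and the parser ignores SSE features the route does not use, such as id and retry fields:

```typescript
interface SseFrame {
  event: string;
  data: unknown;
}

// Splits buffered text into complete frames and returns the unfinished tail,
// which the caller prepends to the next chunk.
function parseFrames(buffer: string): { frames: SseFrame[]; rest: string } {
  const parts = buffer.split("\n\n");
  const rest = parts.pop() ?? "";
  const frames: SseFrame[] = [];
  for (const part of parts) {
    let event = "message";
    const dataLines: string[] = [];
    for (const line of part.split("\n")) {
      if (line.startsWith("event: ")) event = line.slice("event: ".length);
      else if (line.startsWith("data: ")) dataLines.push(line.slice("data: ".length));
    }
    if (dataLines.length > 0) {
      frames.push({ event, data: JSON.parse(dataLines.join("\n")) });
    }
  }
  return { frames, rest };
}
```

The page can then switch on frame.event ("message", "tool_use", "tool_result", "done") to decide what to render.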

The UI above attaches SSE directly to Anthropic's session event stream. That is enough for a demo, but serverless functions can time out on long sessions, and you lose durable replay on refresh.

If you need a production chat UI with durable polling, multi-turn conversations, and full event replay, see Build a Claude Managed Agent with Vercel Workflow: a Vercel Workflow run polls session events, writes them to a durable stream, and the client reads them over SSE. The workflow run is both the execution engine and the event log.

For local testing without deploying the webhook, use the direct session runner. It streams events and handles tool calls on your machine:

scripts/run-session.ts
import Anthropic from "@anthropic-ai/sdk";
import { execSync } from "node:child_process";
const SESSION_ID = process.argv[2];
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const handled = new Set<string>();
// This local runner only implements run_shell; it treats every tool call
// as a shell command.
async function handleTool(ev: { id: string; name: string; input: unknown }) {
  const cmd = (ev.input as { command: string }).command;
  const output = execSync(cmd, { encoding: "utf8", timeout: 30_000 });
  await client.beta.sessions.events.send(SESSION_ID, {
    events: [{
      type: "user.custom_tool_result",
      custom_tool_use_id: ev.id,
      content: [{ type: "text", text: output }],
    }],
  });
  handled.add(ev.id);
}

Catch up on existing events, then stream new ones until the session ends:

async function main() {
  const isEnd = (ev: { type: string; stop_reason?: { type: string } }) =>
    ev.type === "session.status_idle" && ev.stop_reason?.type === "end_turn";
  for await (const ev of client.beta.sessions.events.list(
    SESSION_ID, { limit: 1000 }
  )) {
    if (ev.type === "agent.custom_tool_use" && !handled.has(ev.id))
      await handleTool(ev);
    if (ev.type === "user.custom_tool_result")
      handled.add(ev.custom_tool_use_id);
    if (isEnd(ev as { type: string; stop_reason?: { type: string } })) return;
  }
  const stream = await client.beta.sessions.events.stream(SESSION_ID);
  for await (const ev of stream) {
    if (ev.type === "agent.custom_tool_use" && !handled.has(ev.id))
      await handleTool(ev);
    if (ev.type === "agent.message") {
      const text = (ev as { content: Array<{ text?: string }> })
        .content.map(c => c.text ?? "").join("");
      console.log(text);
    }
    if (isEnd(ev as { type: string; stop_reason?: { type: string } })) break;
  }
}
main();

Run it in two terminals:

# Terminal 1: create a session
pnpm tsx scripts/test-session.ts
# → Session ID: sesn_01...
# Terminal 2: handle tool calls locally (no sandbox)
pnpm tsx scripts/run-session.ts sesn_01...
# Or: full E2E through Vercel Sandbox
pnpm tsx scripts/test-e2e.ts

Self-hosting the CMA compute plane with Vercel Sandbox is the right choice when:

  • Tools touch private infrastructure: your runner needs to reach internal databases, private APIs, or services not reachable from Anthropic's compute. Vercel Sandbox lets you run the compute inside or adjacent to your own network with low-latency, secure connectivity.
  • You are handling per-customer credentials: in a SaaS context each user has their own API tokens. Passing those tokens as env vars to the runner works, but any code that runs in the sandbox can read them. Vercel Sandbox's credential brokering injects tokens at the firewall level instead: the sandbox calls fetch("https://api.example.com/...") with no auth header, and the firewall adds it before forwarding. console.log(process.env) inside the sandbox reveals nothing.
  • You need egress control: Vercel Sandbox lets you define a domain allowlist and deny everything else, which matters when your runner processes private data and you want to prevent exfiltration.

The platform itself is also a good fit for this kind of work:

  • Battle-tested infrastructure: Vercel has been running microVM sandboxes for 10 years to power its build system. The same infrastructure handles over a billion deployments and has hardened defenses against the kinds of attacks agent code can run into, like cryptominer abuse and container escapes.
  • Built for TypeScript developers: the Sandbox SDK and CLI follow the same DX principles as Next.js, AI SDK, and Turborepo. Secure OIDC authentication, no long-lived tokens, and a small surface area that fits cleanly into the rest of your toolchain.
  • Low-latency connectivity to your cloud: sandboxes have direct egress to your AWS workloads with low data transfer costs, which matters more here than with general-purpose sandbox providers like Daytona or Cloudflare when your agent's tools need to reach private services.

Instead of passing tokens as env vars:

env: {
  SESSION_ID: sessionId,
  CUSTOMER_API_KEY: customerToken, // readable by any sandbox code
}

Configure injection on the sandbox's network policy:

networkPolicy: {
  // Same allow-map shape as spawn() in the webhook route:
  // domain -> list of rules, each with an optional match and a transform.
  allow: {
    "api.anthropic.com": [],
    "api.example.com": [
      {
        transform: [{
          headers: { authorization: `Bearer ${customerToken}` },
        }],
      },
    ],
  },
}

The runner's fetch calls to api.example.com are authenticated. The token never enters the sandbox. You can also scope injection to specific paths and methods using matchers:

"api.example.com": [
  {
    match: {
      path: { startsWith: "/v1/write" },
      method: ["POST", "PUT", "PATCH"],
    },
    transform: [{
      headers: { authorization: `Bearer ${writeToken}` },
    }],
  },
]

Lock down egress to exactly what the runner needs:

networkPolicy: {
  allow: {
    "api.anthropic.com": [],
    "your-internal-db.example.com": [],
  },
}

Policies can be updated on running sandboxes without restarting. A useful pattern: start with allow-all to install dependencies, then tighten the policy before running agent-generated code or processing sensitive data.

Check out the source or deploy the complete working implementation to Vercel in one click, then run the setup scripts locally (create-environment.ts, create-agent.ts, build-snapshot.ts) to fill in the environment variables before Anthropic's first webhook fires.
