Tools like Cursor, Copilot, and Claude Code all let you choose between models. Cursor even has an "Auto" mode and its own Composer model. Copilot supports Claude, GPT, and Gemini. But in all of them, you pick a model (or let the tool pick one for you) and that model runs everything in your session. There's no way to say "use Opus for planning, Haiku for file search, and Gemini for screenshots" and have that happen automatically.
That's what I wanted. Planning and reasoning need a strong model. File exploration doesn't. Screenshot analysis doesn't. Boilerplate generation doesn't. Running the same expensive model for all of these wastes tokens.
I use a stack that does this routing automatically. I configure which model handles which task once, and every agent gets the right model without me switching anything mid-session. The Vercel AI Gateway serves the models through a single endpoint, and oh-my-opencode handles the orchestration.
Here's how I set it up and how it works in practice.
This guide covers four tools that work together:
- OpenCode: An open-source AI coding agent that runs in your terminal. It's model-agnostic and supports plugins. Think of it as the foundation.
- oh-my-opencode: A plugin that turns OpenCode into a multi-agent system. It adds specialized agents (planner, researcher, debugger) and lets you assign different models to each one.
- Vercel AI Gateway: A single API endpoint that lets you call models from Anthropic, OpenAI, Google, and 40+ other providers with one API key. No separate accounts or billing per provider.
- agent-browser: A browser automation CLI by Vercel. Your agents use it to test web flows, take screenshots, and verify changes in a real browser.
The layers: OpenCode runs the session. oh-my-opencode adds orchestration. The AI Gateway routes model calls. agent-browser handles anything that needs a browser.
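Stacked up, that looks like:

```
you
 └─ OpenCode (runs the terminal session)
     └─ oh-my-opencode (specialized agents + model routing)
         ├─ Vercel AI Gateway (one endpoint, one key -> Anthropic, OpenAI, Google, ...)
         └─ agent-browser (browser checks, screenshots, web flows)
```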
You'll need:
- A Vercel account (for AI Gateway access)
- Node.js 18+ and npm
- Homebrew (on macOS) or another way to install CLI tools
- A terminal you're comfortable in
OpenCode is open source, built by the team at Anomaly. I install it with Homebrew:
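```sh
# Formula name as of this writing; the OpenCode docs list the current tap if it moves
brew install sst/tap/opencode
```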
Verify it works with opencode --version. See the OpenCode docs for other install methods (npm, curl, etc.).
OpenCode stores its config at ~/.config/opencode/opencode.json by default. I rename mine to opencode.jsonc so I can add comments. OpenCode reads both formats. The examples in this guide use .jsonc for the same reason. I keep the whole config folder in a dotfiles repo and symlink it.
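If you do the same, the symlink is a one-liner (assuming the repo lives at ~/dotfiles):

```sh
# Point ~/.config/opencode at the copy tracked in your dotfiles repo
ln -sfn ~/dotfiles/opencode ~/.config/opencode
```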
oh-my-opencode is a plugin that turns OpenCode into a multi-agent system. The recommended way to install it is to let an agent do it for you. Paste this prompt into an OpenCode session:
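```
Install the oh-my-opencode plugin for me: find its official installation
instructions, follow them, and confirm the plugin is registered in my OpenCode
config when you're done.
```

That's a paraphrase rather than the canonical prompt from the oh-my-opencode README; any wording that points the agent at the install instructions does the job.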
The agent will run the CLI installer and ask which model providers you have (Claude, OpenAI, Gemini, GitHub Copilot). Since this guide uses the Vercel AI Gateway as the provider for all models, you can say "no" to all of these. The installer still registers the plugin and sets up the agent structure. You'll configure the actual model routing through the AI Gateway in Step 3.
If you'd rather do it yourself, run the interactive installer:
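```sh
# Invocation may differ by version; the oh-my-opencode install guide has the current command
npx oh-my-opencode install
```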
This walks you through provider selection and writes the config files. Since we're using the Vercel AI Gateway, decline all provider subscriptions when the installer asks.
Either way, verify it worked by opening OpenCode. You should see additional agents available. Press Tab to cycle through them.
oh-my-opencode does a lot out of the box. The feature we care about here is assigning different models to different agents and task categories.
Reference: oh-my-opencode installation guide
This is where multi-model routing happens. The Vercel AI Gateway gives you a single endpoint for models from Anthropic, OpenAI, Google, and 40+ other providers. You use one API key and specify models in the format provider/model-name.
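You can see the whole idea in one request against the Gateway's OpenAI-compatible endpoint. The model slug below is one of this guide's illustrative IDs, so double-check the endpoint and model list against the Gateway docs:

```sh
curl https://ai-gateway.vercel.sh/v1/chat/completions \
  -H "Authorization: Bearer $AI_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-haiku-4.5",
    "messages": [{"role": "user", "content": "Say hi in five words."}]
  }'
```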
Here's what my Vercel provider block looks like in ~/.config/opencode/opencode.jsonc:
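```jsonc
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "vercel": {
      // One provider entry for the whole Gateway. Field names follow OpenCode's
      // custom-provider schema; the model slugs are illustrative, so use the IDs
      // from the Gateway's model catalog.
      "npm": "@ai-sdk/gateway",
      "name": "Vercel AI Gateway",
      "options": {
        "apiKey": "{env:AI_GATEWAY_API_KEY}"
      },
      "models": {
        "anthropic/claude-opus-4.6": { "name": "Claude Opus 4.6" },
        "anthropic/claude-sonnet-4.6": { "name": "Claude Sonnet 4.6" },
        "anthropic/claude-haiku-4.5": { "name": "Claude Haiku 4.5" },
        "openai/gpt-5.2-codex": { "name": "GPT-5.2 Codex" },
        "google/gemini-3-flash": { "name": "Gemini 3 Flash" }
      }
    }
  }
}
```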
Each entry in the models block registers a model that OpenCode (and oh-my-opencode) can use. The key is the model identifier in provider/model-name format. Adding a new model is just adding a line.
Pricing: the AI Gateway charges tokens at the upstream provider's list price with zero markup. Every Vercel account gets $5 in free credits that reset every 30 days, with no restrictions on which models you can use. That's enough to run through this guide and test the full multi-model setup before committing. You can also bring your own API keys if you already have provider accounts.
Reference: Vercel AI Gateway documentation
agent-browser is a browser automation CLI built for AI agents. It uses semantic element references instead of raw DOM trees, which cuts context token usage by roughly 93%.
Install it as an OpenCode skill by running its installer (the agent-browser docs have the current command).
Select "Copy" mode when prompted. This copies the skill files into your OpenCode config rather than symlinking them, which is more reliable if you manage your config through dotfiles.
You'll rarely call agent-browser directly. The agents invoke it automatically when they need to verify something in a browser, take a screenshot, or test a web flow.
When a single model handles everything, you're paying the same rate for a codebase search as you are for architecture planning. That means either overspending on simple tasks or underperforming on complex ones.
Multi-model routing lets you assign models based on what each task actually needs:
| Tier | Models | Best for |
|---|---|---|
| Expensive (high reasoning) | Opus 4.6, GPT-5.2 Codex | Planning, architecture, complex debugging |
| Balanced (workhorse) | Sonnet 4.6 | General coding, research, documentation search |
| Cheap (fast) | Haiku 4.5, Gemini 3 Flash | Quick tasks, screenshot analysis, boilerplate |
Opus is roughly 10-20x more expensive per token than Haiku. If an explore agent runs 50 times during a refactoring session, routing it through Haiku instead of Opus saves real money. And the exploration quality is the same because that task doesn't need deep reasoning.
oh-my-opencode lets you assign models at two levels: per agent and per task category. I configure both in ~/.config/opencode/oh-my-opencode.jsonc.
Each specialized agent gets the model that fits its job:
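```jsonc
{
  // Per-agent overrides. Agent keys and nesting can change between
  // oh-my-opencode versions, so treat this as the shape rather than the
  // exact schema.
  "agents": {
    "sisyphus":   { "model": "vercel/anthropic/claude-opus-4.6" },   // orchestrator
    "oracle":     { "model": "vercel/anthropic/claude-opus-4.6" },   // debugging, architecture
    "hephaestus": { "model": "vercel/openai/gpt-5.2-codex" },        // autonomous deep worker
    "explore":    { "model": "vercel/anthropic/claude-sonnet-4.6" }, // codebase search
    "librarian":  { "model": "vercel/anthropic/claude-sonnet-4.6" }, // docs lookup
    "looker":     { "model": "vercel/google/gemini-3-flash" }        // screenshots, PDFs
  }
}
```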
The logic: Sisyphus orchestrates everything, so it gets the most capable model. Oracle handles debugging and architecture, which also needs deep reasoning. Hephaestus is the autonomous deep worker. You give it a goal and it figures out the steps itself, so it gets a strong reasoning model too. Explore and Librarian just search codebases and documentation, so they run on the balanced tier. The multimodal looker analyzes screenshots and PDFs, so it uses Gemini Flash, which handles vision tasks well at low cost.
oh-my-opencode also routes delegated tasks by category. When Sisyphus delegates a subtask, it picks a category, and that category maps to a model:
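```jsonc
{
  // Category -> model routing for delegated subtasks. The category names are
  // placeholders to show the idea; use the ones the oh-my-opencode docs define.
  "categories": {
    "deep-reasoning": { "model": "vercel/openai/gpt-5.2-codex" },        // tricky logic, security-sensitive code
    "general":        { "model": "vercel/anthropic/claude-sonnet-4.6" }, // ordinary coding and research
    "quick":          { "model": "vercel/anthropic/claude-haiku-4.5" },  // mechanical edits, boilerplate
    "vision":         { "model": "vercel/google/gemini-3-flash" }        // screenshots, visual checks
  }
}
```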
This means a single refactoring session might touch four or five different models without you doing anything. Sisyphus decides what each subtask needs, picks the category, and the right model handles it.
These are my assignments as of February 2026. The config schema may change, so check the oh-my-opencode docs if something looks off.
Here's how this looks in practice. I needed to refactor an authentication module from session-based auth to JWT tokens. I opened OpenCode and typed:
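```
ulw Refactor the authentication module from session-based auth to JWT tokens.
Keep the existing login and logout flows working.
```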
ulw is short for "ultrawork." Adding it to your prompt tells oh-my-opencode to use full orchestration: parallel background agents, automatic task delegation, and continuous execution until the job is done. Here's what happened next.
Sisyphus, the main orchestrator, received the prompt. This is a complex architectural change, so it needed context before making decisions. It fired two background agents in parallel:
- Explore (Sonnet 4.6): "Find all files related to authentication. Map the current session flow."
- Librarian (Sonnet 4.6): "Search for JWT implementation patterns in the project's framework."
Both ran simultaneously. While they worked, Sisyphus analyzed the high-level structure of the request. This is why the orchestrator needs a top-tier model: it's making strategic decisions about how to break down the work.
Within seconds, the background agents returned their findings. Explore mapped 12 files involved in the auth flow. Librarian found the framework's recommended JWT pattern in the official docs.
These agents don't need Opus-level reasoning. They're doing search and retrieval, which Sonnet handles well at a fraction of the cost.
With full context, Sisyphus (still running on Opus 4.6) created an implementation plan and started handing subtasks to other models. The JWT token generation logic went to a GPT-5.2 Codex worker because the cryptographic code needed careful reasoning. The middleware updates went to a Haiku 4.5 worker since they were straightforward pattern replacements. Sisyphus picked the model for each subtask based on its category. No manual switching involved.
After implementation, Sisyphus invoked agent-browser to verify the login flow still worked.
The browser test confirmed that login, token generation, and protected routes all worked correctly.
The session used five different models. Opus and GPT-5.2 Codex only ran for the tasks that needed them. Everything else ran on cheaper models. Same result, roughly 70% lower cost.
I don't track exact per-token costs because model pricing changes frequently. But the math is straightforward.
A session like the refactoring walkthrough above might cost around $3 in tokens if run entirely on Opus. With multi-model routing, the same session costs closer to $0.80. The explore agents, librarian lookups, and simple file edits all run on cheaper models. Only the planning and complex logic hit the expensive ones.
The Vercel AI Gateway passes through provider pricing with zero markup. There's a free tier with $5/month in credits that resets every 30 days after your first request. Enough to try the setup and run a few sessions before you decide to commit.
The exact numbers will vary. The point is you stop paying premium prices for tasks that don't need premium models.
You now have a setup where different agents use different models based on what each task needs. From here:
- Customize the model assignments in ~/.config/opencode/oh-my-opencode.jsonc to match your preferences
- Try the ulw keyword on a real task in your own project and watch the agents coordinate
- Experiment with which models work best for which categories in your workflow
- Add more models to your ~/.config/opencode/opencode.jsonc provider block as new ones become available
Official documentation: