Anthropic-Compatible API

AI Gateway provides Anthropic-compatible API endpoints, so you can use the Anthropic SDK and tools like Claude Code through a unified gateway with only a URL change.

The Anthropic-compatible API implements the same specification as the Anthropic Messages API.

The Anthropic-compatible API is available at the following base URL:
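
The examples in this guide assume the base URL below; treat it as an assumption and confirm the exact URL in your AI Gateway dashboard:

```
https://ai-gateway.vercel.sh
```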

The Anthropic-compatible API supports the same authentication methods as the main AI Gateway:

  • API key: Use your AI Gateway API key with the `Authorization: Bearer <key>` header or the `x-api-key` header
  • OIDC token: Use your Vercel OIDC token with the `Authorization: Bearer <token>` header

You only need to use one of these forms of authentication. If an API key is specified, it takes precedence over any OIDC token, even if the API key is invalid.
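
As an illustrative sketch of the two options with a raw request (the base URL and model slug are assumptions; substitute your own values):

```typescript
// Authenticate a raw Messages request against the gateway.
const response = await fetch('https://ai-gateway.vercel.sh/v1/messages', {
  method: 'POST',
  headers: {
    // Option 1: AI Gateway API key
    'x-api-key': process.env.AI_GATEWAY_API_KEY ?? '',
    // Option 2 (use instead of x-api-key): Vercel OIDC token
    // 'Authorization': `Bearer ${process.env.VERCEL_OIDC_TOKEN}`,
    'anthropic-version': '2023-06-01',
    'content-type': 'application/json',
  },
  body: JSON.stringify({
    model: 'anthropic/claude-sonnet-4', // assumed AI Gateway model slug
    max_tokens: 64,
    messages: [{ role: 'user', content: 'Hello' }],
  }),
});
console.log(await response.json());
```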

The AI Gateway supports the following Anthropic-compatible endpoint:

  • `POST /v1/messages` - Create messages with support for streaming, tool calls, extended thinking, and file attachments

Claude Code is Anthropic's agentic coding tool. You can configure it to use Vercel AI Gateway, enabling you to:

  • Route requests through multiple AI providers
  • Monitor traffic and spend in your AI Gateway Overview
  • View detailed traces in Vercel Observability under AI
  • Use any model available through the gateway
  1. Configure Claude Code to use the AI Gateway by setting these environment variables:

    | Variable | Value |
    | --- | --- |
    | `ANTHROPIC_AUTH_TOKEN` | Your AI Gateway API key |
    | `ANTHROPIC_API_KEY` | (empty string) |

    Setting `ANTHROPIC_API_KEY` to an empty string is important. Claude Code checks this variable first, and if it is set to a non-empty value, it will use it instead of `ANTHROPIC_AUTH_TOKEN`.

    Add this alias to your shell configuration file, such as `~/.zshrc` (or `~/.bashrc`):
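
    For example (the `claude-gateway` name is only a placeholder, and the base URL is an assumption to confirm in your AI Gateway dashboard):

    ```bash
    # Launch Claude Code with AI Gateway credentials applied only to this invocation
    alias claude-gateway='ANTHROPIC_BASE_URL="https://ai-gateway.vercel.sh" ANTHROPIC_AUTH_TOKEN="your-ai-gateway-api-key" ANTHROPIC_API_KEY="" claude'
    ```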

    Then reload your shell:
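
    For example, if you added the alias to `~/.zshrc`:

    ```bash
    source ~/.zshrc
    ```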

    For more flexibility (e.g., adding additional logic), create a wrapper script in a directory on your PATH:
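
    A minimal sketch, using a hypothetical `~/.local/bin/claude-gateway` path and the assumed base URL from above:

    ```bash
    #!/usr/bin/env bash
    # Hypothetical wrapper script: ~/.local/bin/claude-gateway
    # Routes Claude Code through the AI Gateway and forwards any extra arguments.
    export ANTHROPIC_BASE_URL="https://ai-gateway.vercel.sh"   # assumed gateway base URL
    export ANTHROPIC_AUTH_TOKEN="your-ai-gateway-api-key"
    export ANTHROPIC_API_KEY=""                                # must stay empty so the token above is used
    exec claude "$@"
    ```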

    Make it executable and ensure its directory is in your PATH:
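
    Assuming the hypothetical path above:

    ```bash
    chmod +x ~/.local/bin/claude-gateway
    # Add ~/.local/bin to PATH if it is not already there (e.g., in ~/.zshrc)
    export PATH="$HOME/.local/bin:$PATH"
    ```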

  2. Run your alias or wrapper script to start Claude Code with AI Gateway:
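
    Using the placeholder name from step 1:

    ```bash
    claude-gateway
    ```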

    Your requests will now be routed through Vercel AI Gateway.

  3. You can override the default models that Claude Code uses by setting additional environment variables:
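
    For example, exported before launching or added to the wrapper script above (`ANTHROPIC_MODEL` and `ANTHROPIC_SMALL_FAST_MODEL` are the variables Claude Code reads for its primary and fast background models; the slugs below are placeholder assumptions):

    ```bash
    # Primary model used for most requests
    export ANTHROPIC_MODEL="anthropic/claude-sonnet-4"
    # Smaller model used for fast, lightweight background tasks
    export ANTHROPIC_SMALL_FAST_MODEL="anthropic/claude-3-5-haiku"
    ```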

    This allows you to use any model available through the AI Gateway while still using Claude Code's familiar interface.

    Models vary widely in their support for tools, extended thinking, and other features that Claude Code relies on. Performance may differ significantly depending on the model and provider you select.

You can use the AI Gateway's Anthropic-compatible API with the official Anthropic SDK. Point your client to the AI Gateway's base URL and use your AI Gateway API key or OIDC token for authentication.
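
A minimal setup sketch with the official TypeScript SDK, assuming the `https://ai-gateway.vercel.sh` base URL and a key stored in an `AI_GATEWAY_API_KEY` environment variable:

```typescript
import Anthropic from '@anthropic-ai/sdk';

// Point the official SDK at the AI Gateway instead of api.anthropic.com.
const anthropic = new Anthropic({
  baseURL: 'https://ai-gateway.vercel.sh', // assumed gateway base URL
  apiKey: process.env.AI_GATEWAY_API_KEY,  // or use authToken for Bearer / OIDC auth
});
```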

The examples and content in this section are not comprehensive. For complete documentation on available parameters, response formats, and advanced features, refer to the Anthropic Messages API documentation.

Create messages using the Anthropic Messages API format.

Endpoint

`POST /v1/messages`

Create a non-streaming message.

Example request
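
A minimal sketch with the TypeScript SDK (the base URL and model slug are assumptions):

```typescript
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic({ baseURL: 'https://ai-gateway.vercel.sh', apiKey: process.env.AI_GATEWAY_API_KEY });

const message = await anthropic.messages.create({
  model: 'anthropic/claude-sonnet-4', // assumed AI Gateway model slug
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a haiku about the ocean.' }],
});

// The assistant's reply is returned as an array of content blocks.
console.log(message.content);
```
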
Response format
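
An illustrative, abridged response body in the standard Anthropic Messages shape (all values are placeholders):

```json
{
  "id": "msg_01...",
  "type": "message",
  "role": "assistant",
  "model": "anthropic/claude-sonnet-4",
  "content": [
    { "type": "text", "text": "Salt spray on the wind..." }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": { "input_tokens": 14, "output_tokens": 21 }
}
```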

Create a streaming message that delivers tokens as they are generated.

Example request
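
A sketch using the TypeScript SDK (base URL and model slug are assumptions); with `stream: true` the SDK returns an async iterable of SSE events:

```typescript
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic({ baseURL: 'https://ai-gateway.vercel.sh', apiKey: process.env.AI_GATEWAY_API_KEY });

const stream = await anthropic.messages.create({
  model: 'anthropic/claude-sonnet-4',
  max_tokens: 1024,
  stream: true,
  messages: [{ role: 'user', content: 'Tell me a short story.' }],
});

for await (const event of stream) {
  // Print text as it arrives; other event types carry metadata.
  if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
    process.stdout.write(event.delta.text);
  }
}
```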

Streaming responses use Server-Sent Events (SSE). The key event types are:

  • `message_start` - Initial message metadata
  • `content_block_start` - Start of a content block (text, tool use, etc.)
  • `content_block_delta` - Incremental content updates
  • `content_block_stop` - End of a content block
  • `message_delta` - Final message metadata (stop reason, usage)
  • `message_stop` - End of the message

The AI Gateway supports Anthropic-compatible function calling, allowing models to call tools and functions.

Example request
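
A sketch with a hypothetical `get_weather` tool (base URL and model slug are assumptions):

```typescript
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic({ baseURL: 'https://ai-gateway.vercel.sh', apiKey: process.env.AI_GATEWAY_API_KEY });

const message = await anthropic.messages.create({
  model: 'anthropic/claude-sonnet-4',
  max_tokens: 1024,
  tools: [
    {
      name: 'get_weather',
      description: 'Get the current weather for a location',
      input_schema: {
        type: 'object',
        properties: {
          location: { type: 'string', description: 'City and state, e.g. San Francisco, CA' },
        },
        required: ['location'],
      },
    },
  ],
  messages: [{ role: 'user', content: 'What is the weather in San Francisco?' }],
});
```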

Tool call response format

When the model makes tool calls, the response includes tool use blocks:
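
For example (abridged, with placeholder values), following the standard Anthropic tool-use shape:

```json
{
  "id": "msg_01...",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "tool_use",
      "id": "toolu_01...",
      "name": "get_weather",
      "input": { "location": "San Francisco, CA" }
    }
  ],
  "stop_reason": "tool_use"
}
```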

Configure extended thinking for models that support chain-of-thought reasoning. The `thinking` parameter allows you to control how reasoning tokens are generated and returned.

Example request
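
A sketch with the TypeScript SDK (base URL, model slug, and token budgets are assumptions):

```typescript
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic({ baseURL: 'https://ai-gateway.vercel.sh', apiKey: process.env.AI_GATEWAY_API_KEY });

const message = await anthropic.messages.create({
  model: 'anthropic/claude-sonnet-4',
  max_tokens: 16000, // must be larger than the thinking budget
  thinking: { type: 'enabled', budget_tokens: 8000 },
  messages: [{ role: 'user', content: 'Are there an infinite number of twin primes?' }],
});
```
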
  • `type`: Set to `enabled` to enable extended thinking
  • `budget_tokens`: Maximum number of tokens to allocate for thinking

When thinking is enabled, the response includes thinking blocks:
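
For example (abridged, placeholder values):

```json
{
  "content": [
    {
      "type": "thinking",
      "thinking": "The twin prime conjecture asks whether...",
      "signature": "EuYBCkQYAiJA..."
    },
    {
      "type": "text",
      "text": "This is an open problem in number theory..."
    }
  ]
}
```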

Use the built-in web search tool to give the model access to current information from the web.

Example request
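
A sketch assuming Anthropic's server-side web search tool definition (the `web_search_20250305` version string, base URL, and model slug are assumptions):

```typescript
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic({ baseURL: 'https://ai-gateway.vercel.sh', apiKey: process.env.AI_GATEWAY_API_KEY });

const message = await anthropic.messages.create({
  model: 'anthropic/claude-sonnet-4',
  max_tokens: 1024,
  tools: [
    {
      type: 'web_search_20250305', // server-side web search tool (version string may differ)
      name: 'web_search',
      max_uses: 3,
    },
  ],
  messages: [{ role: 'user', content: 'What are the latest developments in AI gateways?' }],
});
```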

Send images and PDF documents as part of your message request.

Example request
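
A sketch sending a base64-encoded image alongside a text prompt (base URL, model slug, and file name are assumptions):

```typescript
import fs from 'node:fs';
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic({ baseURL: 'https://ai-gateway.vercel.sh', apiKey: process.env.AI_GATEWAY_API_KEY });

const message = await anthropic.messages.create({
  model: 'anthropic/claude-sonnet-4',
  max_tokens: 1024,
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'image',
          source: {
            type: 'base64',
            media_type: 'image/png',
            data: fs.readFileSync('chart.png', 'base64'), // read the file as a base64 string
          },
        },
        { type: 'text', text: 'Summarize this chart.' },
      ],
    },
  ],
});
```
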
  • Images: `image/jpeg`, `image/png`, `image/gif`, `image/webp`
  • Documents: `application/pdf`

The messages endpoint supports the following parameters:

  • `model` (string): The model to use (e.g., `anthropic/claude-sonnet-4`)
  • `max_tokens` (integer): Maximum number of tokens to generate
  • `messages` (array): Array of message objects with `role` and `content` fields
  • `stream` (boolean): Whether to stream the response. Defaults to `false`
  • `temperature` (number): Controls randomness in the output. Range: 0-1
  • `top_p` (number): Nucleus sampling parameter. Range: 0-1
  • `top_k` (integer): Top-k sampling parameter
  • `stop_sequences` (array): Stop sequences for the generation
  • `tools` (array): Array of tool definitions for function calling
  • `tool_choice` (object): Controls which tools are called
  • `thinking` (object): Extended thinking configuration
  • `system` (string or array): System prompt

The API returns standard HTTP status codes and error responses:

  • `400`: Invalid request parameters
  • `401`: Invalid or missing authentication
  • `403`: Insufficient permissions
  • `404`: Model or endpoint not found
  • `429`: Rate limit exceeded
  • `500`: Server error
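
A sketch of handling these errors with the TypeScript SDK, which surfaces non-2xx responses as `APIError` instances (base URL and model slug are assumptions):

```typescript
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic({ baseURL: 'https://ai-gateway.vercel.sh', apiKey: process.env.AI_GATEWAY_API_KEY });

try {
  await anthropic.messages.create({
    model: 'anthropic/claude-sonnet-4',
    max_tokens: 1024,
    messages: [{ role: 'user', content: 'Hello' }],
  });
} catch (error) {
  if (error instanceof Anthropic.APIError) {
    // e.g. 401 for a bad key, 429 when rate limited
    console.error(error.status, error.message);
  } else {
    throw error;
  }
}
```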
