Anthropic-Compatible API
AI Gateway provides Anthropic-compatible API endpoints, so you can use the Anthropic SDK and tools like Claude Code through a unified gateway with only a URL change.
The Anthropic-compatible API implements the same specification as the Anthropic Messages API.
The Anthropic-compatible API is available at the following base URL:
The Anthropic-compatible API supports the same authentication methods as the main AI Gateway:
- API key: Use your AI Gateway API key with the header or header
- OIDC token: Use your Vercel OIDC token with the header
You only need to use one of these forms of authentication. If an API key is specified it will take precedence over any OIDC token, even if the API key is invalid.
The AI Gateway supports the following Anthropic-compatible endpoint:
Claude Code is Anthropic's agentic coding tool. You can configure it to use Vercel AI Gateway, enabling you to:
- Route requests through multiple AI providers
- Monitor traffic and spend in your AI Gateway Overview
- View detailed traces in Vercel Observability under AI
- Use any model available through the gateway
Configure Claude Code to use the AI Gateway by setting these environment variables:
Variable Value Your AI Gateway API key (empty string) Setting to an empty string is important. Claude Code checks this variable first, and if it's set to a non-empty value, it will use that instead of .
Add this alias to your (or ):
Then reload your shell:
For more flexibility (e.g., adding additional logic), create a wrapper script at :
Make it executable and ensure is in your PATH:
Run to start Claude Code with AI Gateway:
Your requests will now be routed through Vercel AI Gateway.
You can override the default models that Claude Code uses by setting additional environment variables:
This allows you to use any model available through the AI Gateway while still using Claude Code's familiar interface.
Models vary widely in their support for tools, extended thinking, and other features that Claude Code relies on. Performance may differ significantly depending on the model and provider you select.
You can use the AI Gateway's Anthropic-compatible API with the official Anthropic SDK. Point your client to the AI Gateway's base URL and use your AI Gateway API key or OIDC token for authentication.
The examples and content in this section are not comprehensive. For complete documentation on available parameters, response formats, and advanced features, refer to the Anthropic Messages API documentation.
Create messages using the Anthropic Messages API format.
Create a non-streaming message.
Create a streaming message that delivers tokens as they are generated.
Streaming responses use Server-Sent Events (SSE). The key event types are:
- - Initial message metadata
- - Start of a content block (text, tool use, etc.)
- - Incremental content updates
- - End of a content block
- - Final message metadata (stop reason, usage)
- - End of the message
The AI Gateway supports Anthropic-compatible function calling, allowing models to call tools and functions.
Tool call response format
When the model makes tool calls, the response includes tool use blocks:
Configure extended thinking for models that support chain-of-thought reasoning. The parameter allows you to control how reasoning tokens are generated and returned.
- : Set to to enable extended thinking
- : Maximum number of tokens to allocate for thinking
When thinking is enabled, the response includes thinking blocks:
Use the built-in web search tool to give the model access to current information from the web.
Send images and PDF documents as part of your message request.
- Images: , , ,
- Documents:
The messages endpoint supports the following parameters:
- (string): The model to use (e.g., )
- (integer): Maximum number of tokens to generate
- (array): Array of message objects with and fields
- (boolean): Whether to stream the response. Defaults to
- (number): Controls randomness in the output. Range: 0-1
- (number): Nucleus sampling parameter. Range: 0-1
- (integer): Top-k sampling parameter
- (array): Stop sequences for the generation
- (array): Array of tool definitions for function calling
- (object): Controls which tools are called
- (object): Extended thinking configuration
- (string or array): System prompt
The API returns standard HTTP status codes and error responses:
- : Invalid request parameters
- : Invalid or missing authentication
- : Insufficient permissions
- : Model or endpoint not found
- : Rate limit exceeded
- : Server error
Was this helpful?