
Responses API

Last updated February 24, 2026

AI Gateway supports the OpenAI Responses API, letting you use the official OpenAI SDKs with any provider supported by the gateway. Point your client to the AI Gateway base URL and use provider/model identifiers to route requests to OpenAI, Anthropic, Google, and more.

Set your SDK's base URL to the AI Gateway and use your API key for authentication:

basic.ts
import OpenAI from 'openai';
 
const client = new OpenAI({
  apiKey: process.env.AI_GATEWAY_API_KEY,
  baseURL: 'https://ai-gateway.vercel.sh/v1',
});
 
const response = await client.responses.create({
  model: 'anthropic/claude-sonnet-4.5',
  input: 'What is the capital of France?',
});
 
console.log(response.output_text);
basic.py
import os
from openai import OpenAI
 
client = OpenAI(
    api_key=os.getenv('AI_GATEWAY_API_KEY'),
    base_url='https://ai-gateway.vercel.sh/v1',
)
 
response = client.responses.create(
    model='anthropic/claude-sonnet-4.5',
    input='What is the capital of France?',
)
 
print(response.output_text)

Set stream: true to receive tokens as they're generated. The SDK returns an iterator of server-sent events (async in TypeScript, synchronous in Python):

stream.ts
import OpenAI from 'openai';
 
const client = new OpenAI({
  apiKey: process.env.AI_GATEWAY_API_KEY,
  baseURL: 'https://ai-gateway.vercel.sh/v1',
});
 
const stream = await client.responses.create({
  model: 'openai/gpt-5.2',
  input: 'Write a haiku about programming.',
  stream: true,
});
 
for await (const event of stream) {
  if (event.type === 'response.output_text.delta') {
    process.stdout.write(event.delta);
  }
}
stream.py
import os
from openai import OpenAI
 
client = OpenAI(
    api_key=os.getenv('AI_GATEWAY_API_KEY'),
    base_url='https://ai-gateway.vercel.sh/v1',
)
 
stream = client.responses.create(
    model='openai/gpt-5.2',
    input='Write a haiku about programming.',
    stream=True,
)
 
for event in stream:
    if event.type == 'response.output_text.delta':
        print(event.delta, end='', flush=True)
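The delta events can be accumulated into the complete output text. A minimal sketch, using simulated events in place of a live stream (the event shapes follow the examples above):

```typescript
// Simulated stream events; a live stream yields objects with the same
// shape from `for await (const event of stream)`.
type StreamEvent =
  | { type: 'response.output_text.delta'; delta: string }
  | { type: 'response.completed' };

// Accumulate delta events into the full output text.
function collectText(events: Iterable<StreamEvent>): string {
  let text = '';
  for (const event of events) {
    if (event.type === 'response.output_text.delta') {
      text += event.delta;
    }
  }
  return text;
}

const events: StreamEvent[] = [
  { type: 'response.output_text.delta', delta: 'Code flows like ' },
  { type: 'response.output_text.delta', delta: 'water' },
  { type: 'response.completed' },
];
console.log(collectText(events)); // "Code flows like water"
```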

Define tools with JSON Schema parameters. The model can call them, and you can feed the results back in a follow-up request:

tools.ts
import OpenAI from 'openai';
 
const client = new OpenAI({
  apiKey: process.env.AI_GATEWAY_API_KEY,
  baseURL: 'https://ai-gateway.vercel.sh/v1',
});
 
const response = await client.responses.create({
  model: 'openai/gpt-5.2',
  input: 'What is the weather in San Francisco?',
  tools: [
    {
      type: 'function',
      name: 'get_weather',
      description: 'Get the current weather for a location',
      strict: true,
      parameters: {
        type: 'object',
        properties: {
          location: { type: 'string' },
        },
        required: ['location'],
        additionalProperties: false,
      },
    },
  ],
});
 
// The model returns function_call items in the output
for (const item of response.output) {
  if (item.type === 'function_call') {
    console.log(`Call: ${item.name}(${item.arguments})`);
  }
}

To continue the conversation with tool results, include the function call and its output in the next request's input array:

tool-followup.ts
const functionCall = response.output.find(
  (item) => item.type === 'function_call',
);
if (!functionCall) throw new Error('No function call in the response output');
 
const followup = await client.responses.create({
  model: 'openai/gpt-5.2',
  input: [
    { role: 'user', content: 'What is the weather in San Francisco?' },
    {
      type: 'function_call',
      id: functionCall.id,
      call_id: functionCall.call_id,
      name: functionCall.name,
      arguments: functionCall.arguments,
    },
    {
      type: 'function_call_output',
      call_id: functionCall.call_id,
      output: JSON.stringify({ temperature: 68, condition: 'Sunny' }),
    },
  ],
  tools: [
    /* same tools as above */
  ],
});
 
console.log(followup.output_text);
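The step between the two requests, running the tool locally, can be sketched as a dispatch from function_call items to function_call_output items. The get_weather implementation and its stubbed return value below are illustrative, matching the example above:

```typescript
// Shape of a function_call item from response.output (fields as used above).
type FunctionCall = { call_id: string; name: string; arguments: string };

// Local implementations keyed by tool name; get_weather is a stub.
const implementations: Record<string, (args: Record<string, unknown>) => unknown> = {
  get_weather: (args) => ({
    location: args.location,
    temperature: 68,
    condition: 'Sunny',
  }),
};

// Run the tool and build the function_call_output item for the next request.
function runTool(call: FunctionCall) {
  const fn = implementations[call.name];
  if (!fn) throw new Error(`No implementation for tool: ${call.name}`);
  return {
    type: 'function_call_output' as const,
    call_id: call.call_id,
    output: JSON.stringify(fn(JSON.parse(call.arguments))),
  };
}

const result = runTool({
  call_id: 'call_123',
  name: 'get_weather',
  arguments: '{"location":"San Francisco"}',
});
console.log(result.output);
```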

Use text.format to constrain the model's output to a JSON schema:

structured.ts
import OpenAI from 'openai';
 
const client = new OpenAI({
  apiKey: process.env.AI_GATEWAY_API_KEY,
  baseURL: 'https://ai-gateway.vercel.sh/v1',
});
 
const response = await client.responses.create({
  model: 'openai/gpt-5.2',
  input: 'List 3 colors with their hex codes.',
  text: {
    format: {
      type: 'json_schema',
      name: 'colors',
      strict: true,
      schema: {
        type: 'object',
        properties: {
          colors: {
            type: 'array',
            items: {
              type: 'object',
              properties: {
                name: { type: 'string' },
                hex: { type: 'string' },
              },
              required: ['name', 'hex'],
              additionalProperties: false,
            },
          },
        },
        required: ['colors'],
        additionalProperties: false,
      },
    },
  },
});
 
const data = JSON.parse(response.output_text);
console.log(data.colors);
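Even with strict mode, validating the parsed result at runtime is a sensible safeguard. A minimal hand-rolled check for the schema above (a schema-validation library would be more robust):

```typescript
type Color = { name: string; hex: string };

// Parse output_text and verify it matches the colors schema.
function parseColors(outputText: string): Color[] {
  const data = JSON.parse(outputText);
  if (!Array.isArray(data?.colors)) {
    throw new Error('Expected a colors array');
  }
  for (const c of data.colors) {
    if (typeof c.name !== 'string' || typeof c.hex !== 'string') {
      throw new Error('Each color needs string name and hex fields');
    }
  }
  return data.colors;
}

const colors = parseColors(
  '{"colors":[{"name":"red","hex":"#ff0000"},{"name":"blue","hex":"#0000ff"}]}',
);
console.log(colors.length); // 2
```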

For models that support reasoning, set the reasoning parameter to control how much effort the model spends thinking:

reasoning.ts
import OpenAI from 'openai';
 
const client = new OpenAI({
  apiKey: process.env.AI_GATEWAY_API_KEY,
  baseURL: 'https://ai-gateway.vercel.sh/v1',
});
 
const response = await client.responses.create({
  model: 'anthropic/claude-sonnet-4.5',
  input: 'Explain the Monty Hall problem step by step.',
  reasoning: {
    effort: 'high',
  },
  max_output_tokens: 2048,
});
 
console.log(response.output_text);

The effort parameter accepts low, medium, or high. AI Gateway maps this to provider-specific reasoning settings automatically.

Required parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| model | string | Model ID in provider/model format (e.g., openai/gpt-5.2, anthropic/claude-sonnet-4.5) |
| input | string or array | A text string or an array of input items (messages, function calls, function call outputs) |

Optional parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| stream | boolean | Stream tokens via server-sent events. Defaults to false |
| max_output_tokens | integer | Maximum number of tokens to generate |
| temperature | number | Controls randomness (0-2). Lower values are more deterministic |
| top_p | number | Nucleus sampling (0-1) |
| instructions | string | System-level instructions for the model |
| tools | array | Tool definitions for function calling |
| tool_choice | string or object | Tool selection: auto, required, none, or a specific function |
| reasoning | object | Reasoning config with effort (low, medium, high) |
| text | object | Output format config, including json_schema for structured output |
| metadata | object | Up to 16 key-value pairs for tracking (keys max 64 chars, values max 512 chars) |
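Several of these parameters combined in one request body. This is a sketch of the shape only; the prompt and metadata values are illustrative, and the object is what you would pass to client.responses.create:

```typescript
// A request body combining several optional parameters from the table above.
const body = {
  model: 'openai/gpt-5.2',
  input: 'Summarize the Responses API in one sentence.',
  instructions: 'You are a terse technical writer.',
  temperature: 0.2,
  top_p: 0.9,
  max_output_tokens: 200,
  metadata: { source: 'docs-example' }, // up to 16 pairs; keys max 64 chars
};
console.log(Object.keys(body).join(', '));
```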

The API returns standard HTTP status codes and error responses.

  • 400 Bad Request - Invalid request parameters
  • 401 Unauthorized - Invalid or missing authentication
  • 403 Forbidden - Insufficient permissions
  • 404 Not Found - Model or endpoint not found
  • 429 Too Many Requests - Rate limit exceeded
  • 500 Internal Server Error - Server error

When an error occurs, the API returns a JSON object with details about what went wrong.

{
  "error": {
    "type": "invalid_request_error",
    "message": "At least one user message is required in the input"
  }
}
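A small helper for surfacing this envelope in logs, with field names taken from the example response (the official SDKs also raise typed errors for non-2xx responses, which you can catch instead of parsing the body yourself):

```typescript
// Error envelope shape from the example above.
type ErrorBody = { error: { type: string; message: string } };

// Format an error response for logging.
function describeError(status: number, body: ErrorBody): string {
  return `${status} ${body.error.type}: ${body.error.message}`;
}

console.log(
  describeError(400, {
    error: {
      type: 'invalid_request_error',
      message: 'At least one user message is required in the input',
    },
  }),
);
// "400 invalid_request_error: At least one user message is required in the input"
```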
