
Responses API

Last updated February 24, 2026

AI Gateway supports the OpenAI Responses API, letting you use the official OpenAI SDKs with any provider supported by the gateway. Point your client to the AI Gateway base URL and use provider/model identifiers to route requests to OpenAI, Anthropic, Google, and more.

Set your SDK's base URL to the AI Gateway and use your API key for authentication:

basic.ts
import OpenAI from 'openai';
 
const client = new OpenAI({
  apiKey: process.env.AI_GATEWAY_API_KEY,
  baseURL: 'https://ai-gateway.vercel.sh/v1',
});
 
const response = await client.responses.create({
  model: 'anthropic/claude-sonnet-4.5',
  input: 'What is the capital of France?',
});
 
console.log(response.output_text);
basic.py
import os
from openai import OpenAI
 
client = OpenAI(
    api_key=os.getenv('AI_GATEWAY_API_KEY'),
    base_url='https://ai-gateway.vercel.sh/v1',
)
 
response = client.responses.create(
    model='anthropic/claude-sonnet-4.5',
    input='What is the capital of France?',
)
 
print(response.output_text)

Set stream: true to receive tokens as they're generated. The SDK returns an iterator of server-sent events (async in TypeScript, synchronous in Python):

stream.ts
import OpenAI from 'openai';
 
const client = new OpenAI({
  apiKey: process.env.AI_GATEWAY_API_KEY,
  baseURL: 'https://ai-gateway.vercel.sh/v1',
});
 
const stream = await client.responses.create({
  model: 'openai/gpt-5.2',
  input: 'Write a haiku about programming.',
  stream: true,
});
 
for await (const event of stream) {
  if (event.type === 'response.output_text.delta') {
    process.stdout.write(event.delta);
  }
}
stream.py
import os
from openai import OpenAI
 
client = OpenAI(
    api_key=os.getenv('AI_GATEWAY_API_KEY'),
    base_url='https://ai-gateway.vercel.sh/v1',
)
 
stream = client.responses.create(
    model='openai/gpt-5.2',
    input='Write a haiku about programming.',
    stream=True,
)
 
for event in stream:
    if event.type == 'response.output_text.delta':
        print(event.delta, end='', flush=True)
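The delta events can be accumulated into the complete output text. A minimal sketch, using simulated events in place of a live stream (the event shapes follow the examples above):

```typescript
// Simulated stream events; a live stream yields objects with the same
// shape from `for await (const event of stream)`.
type StreamEvent =
  | { type: 'response.output_text.delta'; delta: string }
  | { type: 'response.completed' };

// Accumulate delta events into the full output text.
function collectText(events: Iterable<StreamEvent>): string {
  let text = '';
  for (const event of events) {
    if (event.type === 'response.output_text.delta') {
      text += event.delta;
    }
  }
  return text;
}

const events: StreamEvent[] = [
  { type: 'response.output_text.delta', delta: 'Code flows like ' },
  { type: 'response.output_text.delta', delta: 'water' },
  { type: 'response.completed' },
];
console.log(collectText(events)); // "Code flows like water"
```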

Define tools with JSON Schema parameters. The model can call them, and you can feed the results back in a follow-up request:

tools.ts
import OpenAI from 'openai';
 
const client = new OpenAI({
  apiKey: process.env.AI_GATEWAY_API_KEY,
  baseURL: 'https://ai-gateway.vercel.sh/v1',
});
 
const response = await client.responses.create({
  model: 'openai/gpt-5.2',
  input: 'What is the weather in San Francisco?',
  tools: [
    {
      type: 'function',
      name: 'get_weather',
      description: 'Get the current weather for a location',
      strict: true,
      parameters: {
        type: 'object',
        properties: {
          location: { type: 'string' },
        },
        required: ['location'],
        additionalProperties: false,
      },
    },
  ],
});
 
// The model returns function_call items in the output
for (const item of response.output) {
  if (item.type === 'function_call') {
    console.log(`Call: ${item.name}(${item.arguments})`);
  }
}

To continue the conversation with tool results, include the function call and its output in the next request's input array:

tool-followup.ts
const functionCall = response.output.find(
  (item) => item.type === 'function_call',
);
if (!functionCall) throw new Error('No function call in the response output');
 
const followup = await client.responses.create({
  model: 'openai/gpt-5.2',
  input: [
    { role: 'user', content: 'What is the weather in San Francisco?' },
    {
      type: 'function_call',
      id: functionCall.id,
      call_id: functionCall.call_id,
      name: functionCall.name,
      arguments: functionCall.arguments,
    },
    {
      type: 'function_call_output',
      call_id: functionCall.call_id,
      output: JSON.stringify({ temperature: 68, condition: 'Sunny' }),
    },
  ],
  tools: [
    /* same tools as above */
  ],
});
 
console.log(followup.output_text);
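The step between the two requests, running the tool locally, can be sketched as a dispatch from function_call items to function_call_output items. The get_weather implementation and its stubbed return value below are illustrative, matching the example above:

```typescript
// Shape of a function_call item from response.output (fields as used above).
type FunctionCall = { call_id: string; name: string; arguments: string };

// Local implementations keyed by tool name; get_weather is a stub.
const implementations: Record<string, (args: Record<string, unknown>) => unknown> = {
  get_weather: (args) => ({
    location: args.location,
    temperature: 68,
    condition: 'Sunny',
  }),
};

// Run the tool and build the function_call_output item for the next request.
function runTool(call: FunctionCall) {
  const fn = implementations[call.name];
  if (!fn) throw new Error(`No implementation for tool: ${call.name}`);
  return {
    type: 'function_call_output' as const,
    call_id: call.call_id,
    output: JSON.stringify(fn(JSON.parse(call.arguments))),
  };
}

const result = runTool({
  call_id: 'call_123',
  name: 'get_weather',
  arguments: '{"location":"San Francisco"}',
});
console.log(result.output);
```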

Use text.format to constrain the model's output to a JSON schema:

structured.ts
import OpenAI from 'openai';
 
const client = new OpenAI({
  apiKey: process.env.AI_GATEWAY_API_KEY,
  baseURL: 'https://ai-gateway.vercel.sh/v1',
});
 
const response = await client.responses.create({
  model: 'openai/gpt-5.2',
  input: 'List 3 colors with their hex codes.',
  text: {
    format: {
      type: 'json_schema',
      name: 'colors',
      strict: true,
      schema: {
        type: 'object',
        properties: {
          colors: {
            type: 'array',
            items: {
              type: 'object',
              properties: {
                name: { type: 'string' },
                hex: { type: 'string' },
              },
              required: ['name', 'hex'],
              additionalProperties: false,
            },
          },
        },
        required: ['colors'],
        additionalProperties: false,
      },
    },
  },
});
 
const data = JSON.parse(response.output_text);
console.log(data.colors);
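Even with strict mode, validating the parsed result at runtime is a sensible safeguard. A minimal hand-rolled check for the schema above (a schema-validation library would be more robust):

```typescript
type Color = { name: string; hex: string };

// Parse output_text and verify it matches the colors schema.
function parseColors(outputText: string): Color[] {
  const data = JSON.parse(outputText);
  if (!Array.isArray(data?.colors)) {
    throw new Error('Expected a colors array');
  }
  for (const c of data.colors) {
    if (typeof c.name !== 'string' || typeof c.hex !== 'string') {
      throw new Error('Each color needs string name and hex fields');
    }
  }
  return data.colors;
}

const colors = parseColors(
  '{"colors":[{"name":"red","hex":"#ff0000"},{"name":"blue","hex":"#0000ff"}]}',
);
console.log(colors.length); // 2
```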

For models that support reasoning, set the reasoning parameter to control how much effort the model spends thinking:

reasoning.ts
import OpenAI from 'openai';
 
const client = new OpenAI({
  apiKey: process.env.AI_GATEWAY_API_KEY,
  baseURL: 'https://ai-gateway.vercel.sh/v1',
});
 
const response = await client.responses.create({
  model: 'anthropic/claude-sonnet-4.5',
  input: 'Explain the Monty Hall problem step by step.',
  reasoning: {
    effort: 'high',
  },
  max_output_tokens: 2048,
});
 
console.log(response.output_text);

The effort parameter accepts low, medium, or high. AI Gateway maps this to provider-specific reasoning settings automatically.

Required parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| model | string | Model ID in provider/model format (e.g., openai/gpt-5.2, anthropic/claude-sonnet-4.5) |
| input | string or array | A text string or an array of input items (messages, function calls, function call outputs) |

Optional parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| stream | boolean | Stream tokens via server-sent events. Defaults to false |
| max_output_tokens | integer | Maximum number of tokens to generate |
| temperature | number | Controls randomness (0-2). Lower values are more deterministic |
| top_p | number | Nucleus sampling (0-1) |
| instructions | string | System-level instructions for the model |
| tools | array | Tool definitions for function calling |
| tool_choice | string or object | Tool selection: auto, required, none, or a specific function |
| reasoning | object | Reasoning config with effort (low, medium, high) |
| text | object | Output format config, including json_schema for structured output |
| metadata | object | Up to 16 key-value pairs for tracking (keys max 64 chars, values max 512 chars) |
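Several of these parameters combined in one request body. This is a sketch of the shape only; the prompt and metadata values are illustrative, and the object is what you would pass to client.responses.create:

```typescript
// A request body combining several optional parameters from the table above.
const body = {
  model: 'openai/gpt-5.2',
  input: 'Summarize the Responses API in one sentence.',
  instructions: 'You are a terse technical writer.',
  temperature: 0.2,
  top_p: 0.9,
  max_output_tokens: 200,
  metadata: { source: 'docs-example' }, // up to 16 pairs; keys max 64 chars
};
console.log(Object.keys(body).join(', '));
```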

The API returns standard HTTP status codes and error responses.

  • 400 Bad Request - Invalid request parameters
  • 401 Unauthorized - Invalid or missing authentication
  • 403 Forbidden - Insufficient permissions
  • 404 Not Found - Model or endpoint not found
  • 429 Too Many Requests - Rate limit exceeded
  • 500 Internal Server Error - Server error

When an error occurs, the API returns a JSON object with details about what went wrong.

{
  "error": {
    "type": "invalid_request_error",
    "message": "At least one user message is required in the input"
  }
}
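A small helper for surfacing this envelope in logs, with field names taken from the example response (the official SDKs also raise typed errors for non-2xx responses, which you can catch instead of parsing the body yourself):

```typescript
// Error envelope shape from the example above.
type ErrorBody = { error: { type: string; message: string } };

// Format an error response for logging.
function describeError(status: number, body: ErrorBody): string {
  return `${status} ${body.error.type}: ${body.error.message}`;
}

console.log(
  describeError(400, {
    error: {
      type: 'invalid_request_error',
      message: 'At least one user message is required in the input',
    },
  }),
);
// "400 invalid_request_error: At least one user message is required in the input"
```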
