
GPT-4 Turbo

openai/gpt-4-turbo

GPT-4 Turbo launched at OpenAI DevDay 2023 with a context window of 128K tokens, built-in vision, JSON mode, and a knowledge cutoff of April 2023, all at reduced input prices compared to the original GPT-4.

Capabilities: Tool Use, Vision (Image)
index.ts

import { streamText } from 'ai'

// Stream a completion from GPT-4 Turbo through AI Gateway
const result = streamText({
  model: 'openai/gpt-4-turbo',
  prompt: 'Why is the sky blue?',
})

// Print tokens as they arrive
for await (const textPart of result.textStream) {
  process.stdout.write(textPart)
}

What To Consider When Choosing a Provider

  • Zero Data Retention

    AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.

  • Authentication

    AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

If you built integrations around GPT-4 Turbo's specific JSON mode behavior or vision API surface, verify provider-level feature parity before routing production traffic. Some capabilities behave consistently across providers while others may have provider-specific nuances.

When to Use GPT-4 Turbo

Best For

  • Long-document analysis:

    Legal review or research tasks that benefit from holding an entire lengthy document in context at once

  • Reliable JSON output:

    Applications requiring structured responses through the built-in JSON mode

  • Vision-capable pipelines:

    Image captioning, document processing with figures, and screenshot analysis

  • Early-2023 knowledge:

    Use cases that depend on knowledge of events up to April 2023

  • Cost-efficient GPT-4 reasoning:

    Complex reasoning where GPT-4 class depth is required at a cost tier below the original GPT-4

Consider Alternatives When

  • Larger context needed:

    You need a context window beyond 128K tokens; GPT-4.1 supports up to 1M tokens

  • Native audio required:

    GPT-4o handles audio input and output natively

  • Later snapshot features:

    You want structured outputs guarantees or creative writing enhancements introduced in later GPT-4o snapshots

  • Cost-driven workloads:

    GPT-4o or GPT-4o mini provide sufficient quality at lower cost

Conclusion

GPT-4 Turbo marked the point where GPT-4 class reasoning became broadly economical, pairing a context window of 128K tokens and vision input with reduced pricing for production deployment. For document-heavy, vision-enabled, or JSON-structured workloads that were designed around its specific feature set, it remains a well-understood and reliable option through AI Gateway.

FAQ

What does the 128K context window enable?

At 128K tokens it holds roughly 300 pages of text, enabling workflows like full-codebase review, long legal document analysis, and extended multi-session conversation history without chunking.
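A rough way to check whether a document fits in the window is the common ~4 characters per token heuristic for English text. This is a sketch only; the helper names are illustrative, and a real tokenizer should be used for exact counts:

```typescript
// Heuristic only: English text averages roughly 4 characters per token.
// Use a real tokenizer for exact counts before relying on this in production.
const CONTEXT_WINDOW = 128_000

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

// Reserve headroom for the model's response tokens
function fitsInContext(text: string, reservedForOutput = 4_096): boolean {
  return estimateTokens(text) + reservedForOutput <= CONTEXT_WINDOW
}

// ~300 pages at ~1,600 characters per page ≈ 480,000 chars ≈ 120K tokens
const doc = 'x'.repeat(480_000)
console.log(estimateTokens(doc)) // 120000
console.log(fitsInContext(doc))  // true (120,000 + 4,096 ≤ 128,000)
```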

How do I get structured JSON output?

Set response_format to { type: "json_object" } and the model constrains itself to produce valid JSON. This differs from the stricter JSON Schema-based Structured Outputs introduced later with gpt-4o-2024-08-06.

Does GPT-4 Turbo support image inputs?

Yes. You can pass image URLs or base64-encoded images alongside text in the messages array. The model can analyze photographs, diagrams, screenshots, and documents with embedded figures.

What is the knowledge cutoff?

April 2023. This was an update from earlier GPT-4 models and makes the model aware of events through early 2023.

How is pricing determined?

Rates are listed on this page. They reflect the pricing of the providers routing through AI Gateway and shift when providers update their pricing.

Can I use GPT-4 Turbo without managing OpenAI credentials?

Yes. AI Gateway handles authentication using its own API key or OIDC token system, so you don't need to embed OpenAI credentials in your deployment environment.

What performance can I expect?

This page shows live throughput and time-to-first-token metrics measured across real AI Gateway traffic.