
GLM 4.5

GLM 4.5 is Z.ai's full-scale model released July 28, 2025, unifying reasoning, coding, and agentic capabilities in a single endpoint. Available through AI Gateway with built-in observability and intelligent provider routing.

Reasoning · Tool Use · Implicit Caching
index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'zai/glm-4.5',
  prompt: 'Why is the sky blue?',
})

for await (const text of result.textStream) {
  process.stdout.write(text)
}

Frequently Asked Questions

  • What is the difference between GLM 4.5 and GLM-4.5-Air?

    GLM 4.5 is the full-scale model optimized for maximum capability across reasoning, coding, and agentic tasks. GLM-4.5-Air is a lighter variant designed for lower latency and cost on less demanding workloads.

  • Does GLM 4.5 support configurable thinking?

    Yes. You can enable or disable chain-of-thought reasoning per request. Thinking mode improves accuracy on complex tasks but increases output length and latency.
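    In the AI SDK this kind of per-request toggle would be passed via `providerOptions`. The sketch below is a hypothetical helper: the `zai` and `thinking` option keys are assumptions based on Z.ai's API shape, not something this page confirms.

    ```typescript
    // Hypothetical helper: builds per-request provider options that toggle
    // GLM 4.5's chain-of-thought. The 'zai'/'thinking' keys are assumptions,
    // not a documented AI Gateway contract.
    function thinkingOptions(enabled: boolean) {
      return {
        zai: { thinking: { type: enabled ? 'enabled' : 'disabled' } },
      }
    }

    // Passed alongside the prompt, e.g.:
    // streamText({ model: 'zai/glm-4.5', prompt, providerOptions: thinkingOptions(false) })
    ```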

  • What is the context window for GLM 4.5?

    131.1K tokens, supporting long documents, extended conversations, and multi-file code analysis in a single request.

  • How much does GLM 4.5 cost through AI Gateway?

    Current input and output token rates are listed on this page and update as providers adjust their pricing. AI Gateway routes each request through the configured provider, so you pay that provider's rate.

  • How do I authenticate with GLM 4.5 through AI Gateway?

    AI Gateway provides a unified API key. You don't need a separate Z.ai account. Configure your API key in your environment, then use the model identifier to route requests. BYOK is also supported if you have a direct provider account.
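    Setup then reduces to one environment variable. The variable name below is an assumption based on AI Gateway defaults; check your gateway's docs for the exact name, and keep the key value itself out of source control.

    ```shell
    # Assumed default variable name read by AI Gateway; substitute your real key.
    export AI_GATEWAY_API_KEY="..."
    ```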

  • Can I use GLM 4.5 for agentic applications with tool use?

    Yes. GLM 4.5 supports agentic workflows with multi-step planning and tool use. The configurable thinking mode lets you control reasoning depth per step in your pipeline.
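    The shape of such a multi-step loop can be sketched in plain TypeScript. This is an illustration of the tool-use pattern only, not the AI SDK's actual tool API, and `getWeather` is a hypothetical tool.

    ```typescript
    // Illustrative sketch of one agentic step, not the AI SDK's tool API.
    // The model either returns a final answer or requests a tool call; the
    // app runs the tool and feeds the result back as the next model input.
    type ToolArgs = Record<string, string>

    type ModelTurn =
      | { type: 'final'; text: string }
      | { type: 'tool-call'; tool: string; args: ToolArgs }

    // Hypothetical tool registry.
    const tools: Record<string, (args: ToolArgs) => string> = {
      getWeather: (args) => `Sunny in ${args.city}`,
    }

    function runStep(turn: ModelTurn): { done: boolean; output: string } {
      if (turn.type === 'final') return { done: true, output: turn.text }
      const fn = tools[turn.tool]
      if (!fn) throw new Error(`unknown tool: ${turn.tool}`)
      // In a real pipeline this result is sent back to GLM 4.5 as a tool result.
      return { done: false, output: fn(turn.args) }
    }
    ```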

  • What providers serve GLM 4.5 through AI Gateway?

    GLM 4.5 is available through the zai and novita providers. AI Gateway handles intelligent routing and automatic retries across configured providers.
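    If you want to pin or prioritize providers rather than rely on automatic routing, a provider-order preference could be expressed per request. The `gateway`/`order` option shape below is an assumption about the gateway's API, shown as a hypothetical helper.

    ```typescript
    // Hypothetical helper: prefer the first provider in the list and fall
    // back down it. The 'gateway.order' option shape is an assumption,
    // not confirmed by this page.
    function routingOptions(order: string[]) {
      return { gateway: { order } }
    }

    // e.g. streamText({
    //   model: 'zai/glm-4.5',
    //   prompt,
    //   providerOptions: routingOptions(['zai', 'novita']),
    // })
    ```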