
GLM 4.5

GLM 4.5 is Z.ai's full-scale model released July 28, 2025, unifying reasoning, coding, and agentic capabilities in a single endpoint. Available through AI Gateway with built-in observability and intelligent provider routing.

Reasoning · Tool Use · Implicit Caching
index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'zai/glm-4.5',
  prompt: 'Why is the sky blue?',
})

for await (const text of result.textStream) {
  process.stdout.write(text)
}

Frequently Asked Questions

  • What is the difference between GLM 4.5 and GLM-4.5-Air?

    GLM 4.5 is the full-scale model optimized for maximum capability across reasoning, coding, and agentic tasks. GLM-4.5-Air is a lighter variant designed for lower latency and cost on less demanding workloads.

  • Does GLM 4.5 support configurable thinking?

    Yes. You can enable or disable chain-of-thought reasoning per request. Thinking mode improves accuracy on complex tasks but increases output length and latency.
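    In the AI SDK this kind of per-request toggle would be passed via `providerOptions`. The sketch below is a hypothetical helper: the `zai` and `thinking` option keys are assumptions based on Z.ai's API shape, not something this page confirms.

    ```typescript
    // Hypothetical helper: builds per-request provider options that toggle
    // GLM 4.5's chain-of-thought. The 'zai'/'thinking' keys are assumptions,
    // not a documented AI Gateway contract.
    function thinkingOptions(enabled: boolean) {
      return {
        zai: { thinking: { type: enabled ? 'enabled' : 'disabled' } },
      }
    }

    // Passed alongside the prompt, e.g.:
    // streamText({ model: 'zai/glm-4.5', prompt, providerOptions: thinkingOptions(false) })
    ```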

  • What is the context window for GLM 4.5?

    131.1K tokens, supporting long documents, extended conversations, and multi-file code analysis in a single request.

  • How much does GLM 4.5 cost through AI Gateway?

    Current input and output token rates are listed on this page and update as providers adjust their pricing. AI Gateway routes each request through the configured provider, so you pay that provider's rate.

  • How do I authenticate with GLM 4.5 through AI Gateway?

    AI Gateway provides a unified API key. You don't need a separate Z.ai account. Configure your API key in your environment, then use the model identifier to route requests. BYOK is also supported if you have a direct provider account.
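    Setup then reduces to one environment variable. The variable name below is an assumption based on AI Gateway defaults; check your gateway's docs for the exact name, and keep the key value itself out of source control.

    ```shell
    # Assumed default variable name read by AI Gateway; substitute your real key.
    export AI_GATEWAY_API_KEY="..."
    ```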

  • Can I use GLM 4.5 for agentic applications with tool use?

    Yes. GLM 4.5 supports agentic workflows with multi-step planning and tool use. The configurable thinking mode lets you control reasoning depth per step in your pipeline.
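    The shape of such a multi-step loop can be sketched in plain TypeScript. This is an illustration of the tool-use pattern only, not the AI SDK's actual tool API, and `getWeather` is a hypothetical tool.

    ```typescript
    // Illustrative sketch of one agentic step, not the AI SDK's tool API.
    // The model either returns a final answer or requests a tool call; the
    // app runs the tool and feeds the result back as the next model input.
    type ToolArgs = Record<string, string>

    type ModelTurn =
      | { type: 'final'; text: string }
      | { type: 'tool-call'; tool: string; args: ToolArgs }

    // Hypothetical tool registry.
    const tools: Record<string, (args: ToolArgs) => string> = {
      getWeather: (args) => `Sunny in ${args.city}`,
    }

    function runStep(turn: ModelTurn): { done: boolean; output: string } {
      if (turn.type === 'final') return { done: true, output: turn.text }
      const fn = tools[turn.tool]
      if (!fn) throw new Error(`unknown tool: ${turn.tool}`)
      // In a real pipeline this result is sent back to GLM 4.5 as a tool result.
      return { done: false, output: fn(turn.args) }
    }
    ```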

  • What providers serve GLM 4.5 through AI Gateway?

    GLM 4.5 is available through the zai and novita providers. AI Gateway handles intelligent routing and automatic retries across configured providers.
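    If you want to pin or prioritize providers rather than rely on automatic routing, a provider-order preference could be expressed per request. The `gateway`/`order` option shape below is an assumption about the gateway's API, shown as a hypothetical helper.

    ```typescript
    // Hypothetical helper: prefer the first provider in the list and fall
    // back down it. The 'gateway.order' option shape is an assumption,
    // not confirmed by this page.
    function routingOptions(order: string[]) {
      return { gateway: { order } }
    }

    // e.g. streamText({
    //   model: 'zai/glm-4.5',
    //   prompt,
    //   providerOptions: routingOptions(['zai', 'novita']),
    // })
    ```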