GLM 4.5 Air

GLM 4.5 Air is Z.ai's efficiency-focused model, released July 28, 2025. It targets fast inference for high-volume workloads and retains reasoning and coding capability at lower cost than the full GLM-4.5.

Reasoning · Tool Use · Implicit Caching

index.ts

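import { streamText } from 'ai'

// Request a streamed completion from GLM 4.5 Air.
const result = streamText({
  model: 'zai/glm-4.5-air',
  prompt: 'Why is the sky blue?'
})

// Print the text as it arrives.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}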


About GLM 4.5 Air

GLM 4.5 Air was released July 28, 2025 as the efficiency-optimized variant in Z.ai's GLM-4.5 generation. Where GLM-4.5 targets maximum capability, GLM 4.5 Air trades a degree of depth for faster inference and lower per-token cost, making it practical for high-throughput production pipelines.

The model retains the core reasoning, coding, and agentic capabilities of the GLM-4.5 family while operating at reduced computational overhead. This positions it for workloads where response latency and cost per request are primary constraints: classification, extraction, summarization, and conversational applications that process high volumes of requests.
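A high-volume classification task of the kind described above could be sketched as follows. This is a minimal illustration, not an official pattern: the label set and the helper names (`buildClassificationPrompt`, `parseLabel`, `classify`) are hypothetical, and the model call is passed in as a function so the helpers do not depend on any particular SDK.

```typescript
// Hypothetical label set for illustration; replace with your own categories.
type Label = 'billing' | 'technical' | 'other'
const LABELS: readonly Label[] = ['billing', 'technical', 'other']

// Build a short prompt that asks for exactly one label.
function buildClassificationPrompt(text: string): string {
  return [
    `Classify the message into one of: ${LABELS.join(', ')}.`,
    'Reply with the label only.',
    `Message: ${text}`
  ].join('\n')
}

// Normalize raw model output to a known label, falling back to 'other'.
function parseLabel(raw: string): Label {
  const cleaned = raw.trim().toLowerCase().replace(/[^a-z]/g, '')
  return LABELS.find((label) => label === cleaned) ?? 'other'
}

// The model call is injected (assumption: it returns the completion text).
async function classify(
  text: string,
  complete: (prompt: string) => Promise<string>
): Promise<Label> {
  return parseLabel(await complete(buildClassificationPrompt(text)))
}
```

With the AI SDK, `complete` might wrap `generateText({ model: 'zai/glm-4.5-air', prompt })` and return its `text` field. Keeping the prompt short and the expected output to a single word limits tokens per request, which matters most when cost per request is the constraint.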

GLM 4.5 Air supports a 128K-token context window, the same as the full GLM-4.5 model. Through AI Gateway, it benefits from unified API access, built-in observability, and intelligent provider routing with automatic retries.
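To stay inside the 128K-token window, a rough pre-flight check can be run before sending a request. The sketch below rests on loud assumptions: it treats 128K as 128,000 tokens, estimates tokens at about four characters each (a common rule of thumb for English text, not the model's actual tokenizer), and uses hypothetical helper names.

```typescript
// Assumption: "128K" is taken conservatively as 128,000 tokens.
const CONTEXT_WINDOW_TOKENS = 128_000
// Assumption: ~4 characters per token, a heuristic rather than a real tokenizer.
const CHARS_PER_TOKEN = 4

// Rough token estimate from character count.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN)
}

// True if the prompt plus a reserved output budget fits the assumed window.
function fitsContext(prompt: string, reservedOutputTokens = 4_000): boolean {
  return estimateTokens(prompt) + reservedOutputTokens <= CONTEXT_WINDOW_TOKENS
}
```

Because the estimate is approximate, it is best used as an early filter for clearly oversized inputs, leaving a safety margin for text that tokenizes less efficiently.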
