Skip to content
Dashboard

GLM 5.2 Fast

Fast version of GLM 5.2 with 120-250 TPS.

ReasoningTool UseImplicit Caching
index.ts
import { streamText } from 'ai'
const result = streamText({
model: 'zai/glm-5.2-fast',
prompt: 'Why is the sky blue?'
})

Playground

Try out GLM 5.2 Fast by Z.ai. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

zai logo
zai logo

Ask GLM 5.2 Fast anything to try it out.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider
Context
Latency
Throughput
Input
Output
Cache
Web Search
Per Query
Capabilities
ZDR
No Training
Release Date
Wafer
1M
1.1s
186tps
$3.00/M$10.25/M
Read:$0.5/M
Write:—
——
+1
06/23/2026
Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.

More models by Z.ai

Model
Context
Latency
Throughput
Input
Output
Cache
Web Search
Per Query
Capabilities
Providers
ZDR
No Training
Release Date
1M
0.7s
213tps
$1.40/M$4.40/M
Read:$0.26/M
Write:—
——
+1
baseten logo
fireworks logo
wafer logo
+1
06/16/2026
205K
0.2s
116tps
$1.30/M$4.30/M
Read:$0.26/M
Write:—
——
+1
baseten logo
deepinfra logo
fireworks logo
+3
04/07/2026
203K
5.5s
34tps
$1.20/M$4.00/M
Read:$0.24/M
Write:—
——
+1
zai logo
03/15/2026
203K
0.3s
74tps
$0.80/M$2.56/M
Read:$0.16/M
Write:—
——
+1
baseten logo
bedrock logo
deepinfra logo
+4
02/12/2026
205K
0.1s
917tps
$2.25/M$2.75/M
Read:$2.25/M
Write:—
——
+1
baseten logo
bedrock logo
cerebras logo
+3
12/22/2025
200K
$0.06/M$0.40/M
Read:$0.01/M
Write:—
——
+1
zai logo
01/01/2025