GLM 5 Turbo
GLM 5 Turbo is the speed-optimized variant of Z.ai's GLM-5, released March 15, 2026. It trades some reasoning depth for faster throughput and lower latency while retaining GLM-5's multiple thinking modes and agentic capabilities.
import { streamText } from 'ai'

const result = streamText({
  model: 'zai/glm-5-turbo',
  prompt: 'Why is the sky blue?',
})

Frequently Asked Questions
How does GLM 5 Turbo compare to the full GLM-5?
GLM 5 Turbo shares GLM-5's core capabilities, including multiple thinking modes and enhanced agentic coding. It's optimized for faster inference at lower cost, with some reduction in peak reasoning depth on the most complex tasks.
Does GLM 5 Turbo support multiple thinking modes?
Yes. It retains GLM-5's multiple thinking modes, letting you select the reasoning depth per request. All modes run faster than their equivalents on the full GLM-5.
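The exact option name for selecting a thinking mode is provider-specific; the sketch below uses a hypothetical `providerOptions.zai.thinking` field only to illustrate the shape of a per-request setting — check the provider documentation for the real field name and accepted values.

```typescript
// Sketch of a per-request thinking-mode selection.
// `providerOptions.zai.thinking` and the value 'deep' are assumptions,
// not confirmed API fields.
const request = {
  model: 'zai/glm-5-turbo',
  prompt: 'Plan a refactor of the auth module.',
  providerOptions: {
    zai: { thinking: 'deep' }, // hypothetical: deeper reasoning, slower response
  },
}
```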
What is the context window for GLM 5 Turbo?
GLM 5 Turbo supports a 202.8K-token context window.
Can I switch between GLM-5 and GLM 5 Turbo easily?
Yes. Both share the same API surface. Change the model identifier to switch between them without any other integration changes.
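Since only the model identifier differs, the switch can be isolated to a single string. A minimal sketch, using a hypothetical helper (not part of any SDK) to make the choice explicit:

```typescript
// Hypothetical helper: pick a model identifier based on whether
// throughput or peak reasoning depth matters for this request.
function pickGlmModel(prioritizeSpeed: boolean): string {
  return prioritizeSpeed ? 'zai/glm-5-turbo' : 'zai/glm-5'
}
```

The rest of the call — prompt, options, response handling — stays identical for both models.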
How do I authenticate with GLM 5 Turbo through AI Gateway?
AI Gateway provides a unified API key. No separate Z.ai account is needed. Use the model identifier to route requests. BYOK is also supported for direct provider access.
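A typical setup exports the gateway key once and lets the SDK pick it up from the environment. The variable name below is an assumption — confirm the exact name in your gateway dashboard:

```shell
# Assumed environment variable name; check your AI Gateway settings
# for the exact key name and value.
export AI_GATEWAY_API_KEY="your-gateway-key"
```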
Is GLM 5 Turbo good for agentic coding?
Yes. It inherits GLM-5's improvements in autonomous tool use, code planning, and multi-step iteration. The faster inference makes it practical for agent loops where speed compounds across many steps.
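Because an agent loop serializes many model calls, per-step latency multiplies across the run. A self-contained sketch of that loop shape, with `callModel` as a mocked stand-in for the real API call (not an actual SDK function):

```typescript
// Minimal agent-loop sketch. `callModel` is a mock that "finishes"
// after three tool calls; a real loop would call the model API here.
type Step = { action: 'tool' | 'finish'; output: string }

async function callModel(history: string[]): Promise<Step> {
  return history.length < 3
    ? { action: 'tool', output: `result-${history.length}` }
    : { action: 'finish', output: 'done' }
}

async function runAgent(): Promise<string[]> {
  const history: string[] = []
  for (let i = 0; i < 10; i++) {
    // Each iteration pays one round of model latency,
    // which is why a faster model compounds across steps.
    const step = await callModel(history)
    if (step.action === 'finish') break
    history.push(step.output)
  }
  return history
}
```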
What is the pricing for GLM 5 Turbo?
See the pricing section on this page for today's rates. AI Gateway exposes each provider's pricing for GLM 5 Turbo.