GLM 5 Turbo

GLM 5 Turbo is the speed-optimized variant of Z.ai's GLM-5, released March 15, 2026. It trades some reasoning depth for faster throughput and lower latency while retaining GLM-5's multiple thinking modes and agentic capabilities.

Reasoning · Tool Use · Implicit Caching
index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'zai/glm-5-turbo',
  prompt: 'Why is the sky blue?',
})

// Consume the response as it streams in.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}

Frequently Asked Questions

  • How does GLM 5 Turbo compare to the full GLM-5?

    GLM 5 Turbo shares GLM-5's core capabilities, including multiple thinking modes and enhanced agentic coding. It's optimized for faster inference at lower cost, with some reduction in peak reasoning depth on the most complex tasks.

  • Does GLM 5 Turbo support multiple thinking modes?

    Yes. It retains GLM-5's multiple thinking modes, letting you select the reasoning depth per request. All modes run faster than their equivalents on the full GLM-5.
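As a minimal sketch, per-request mode selection could be expressed as a small options builder. The `providerOptions` shape below (`zai`, `thinking`, `type`) is an assumption for illustration, not a confirmed parameter name:

```typescript
// Hedged sketch: build request options that select a thinking mode
// per call. The 'zai'/'thinking'/'type' option names are assumptions.
type ThinkingMode = 'disabled' | 'enabled' | 'auto'

function requestOptions(mode: ThinkingMode) {
  return {
    model: 'zai/glm-5-turbo',
    prompt: 'Why is the sky blue?',
    providerOptions: { zai: { thinking: { type: mode } } },
  }
}

// Pick deeper reasoning for hard prompts, faster replies otherwise.
console.log(requestOptions('enabled').providerOptions.zai.thinking.type)
```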

  • What is the context window for GLM 5 Turbo?

    202.8K tokens.

  • Can I switch between GLM-5 and GLM 5 Turbo easily?

    Yes. Both share the same API surface. Change the model identifier to switch between them without any other integration changes.
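Because only the identifier changes, the switch can be a one-line toggle. A sketch (the `pickModel` helper is ours, not part of the SDK):

```typescript
// Map a latency preference to a model identifier; 'pickModel' is a
// hypothetical helper, not part of the AI SDK.
const MODELS = {
  full: 'zai/glm-5',
  turbo: 'zai/glm-5-turbo',
} as const

function pickModel(preferSpeed: boolean): string {
  return preferSpeed ? MODELS.turbo : MODELS.full
}

console.log(pickModel(true)) // → zai/glm-5-turbo
```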

  • How do I authenticate with GLM 5 Turbo through AI Gateway?

    AI Gateway provides a unified API key. No separate Z.ai account is needed. Use the model identifier to route requests. BYOK is also supported for direct provider access.
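A sketch of key resolution under two assumptions: the gateway key lives in an `AI_GATEWAY_API_KEY` environment variable, and a BYOK key is passed explicitly (both names are illustrative):

```typescript
// Hypothetical: prefer an explicit BYOK key, fall back to the
// gateway key from the environment (variable name assumed).
function resolveApiKey(byokKey?: string): string {
  return byokKey ?? process.env.AI_GATEWAY_API_KEY ?? ''
}

function authHeaders(byokKey?: string): Record<string, string> {
  return { Authorization: `Bearer ${resolveApiKey(byokKey)}` }
}

console.log(authHeaders('my-provider-key').Authorization)
```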

  • Is GLM 5 Turbo good for agentic coding?

    Yes. It inherits GLM-5's improvements in autonomous tool use, code planning, and multi-step iteration. The faster inference makes it practical for agent loops where speed compounds across many steps.
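The "speed compounds" point is simple arithmetic: a sequential agent loop pays the per-step latency once per tool call, so total wall-clock time scales with step count. A sketch with illustrative numbers (the latencies below are assumptions, not benchmarks):

```typescript
// Total wall-clock time for a sequential agent loop is roughly
// per-step latency times step count (latencies here are made up).
function loopLatencyMs(perStepMs: number, steps: number): number {
  return perStepMs * steps
}

const steps = 20
console.log(loopLatencyMs(900, steps)) // slower model: 18000
console.log(loopLatencyMs(350, steps)) // faster model: 7000
```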

  • What is the pricing for GLM 5 Turbo?

See the pricing section on this page for current rates; AI Gateway lists per-provider pricing for GLM 5 Turbo.