GLM 5V Turbo

GLM 5V Turbo is Z.ai's vision-enabled turbo model released April 1, 2026. It turns screenshots and designs into code, debugs visually, and operates GUIs autonomously, combining GLM-5's agentic capabilities with multimodal vision input at a compact parameter size.

Reasoning · Tool Use · Implicit Caching · Vision (Image) · File Input
index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'zai/glm-5v-turbo',
  prompt: 'Why is the sky blue?',
})

// Stream the response tokens to stdout as they arrive.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}

Playground

Try out GLM 5V Turbo by Z.ai. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

About GLM 5V Turbo

GLM 5V Turbo was released April 1, 2026 as the vision-enabled turbo variant in Z.ai's GLM-5 generation. It combines GLM-5's agentic capabilities with multimodal vision input, purpose-built for workflows where visual understanding drives code generation and UI interaction.

The model focuses on design-to-code generation. Given a screenshot or design mockup, GLM 5V Turbo produces responsive components that match the original layout. It can debug visually by examining screenshots of rendered output and identifying discrepancies, then generating fixes. The model also navigates real GUI environments autonomously, reading screen elements and performing actions without manual scripting.

Despite these multimodal capabilities, GLM 5V Turbo operates at a smaller parameter size than comparable vision-language models. This translates to faster inference and lower cost per request, making high-volume visual coding workflows economically viable. Through AI Gateway, it's accessible via the same unified API with built-in observability and provider routing.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider | Context | Latency | Throughput | Input | Output | Cache Read | Cache Write | Release Date
Z.ai (Legal: Terms, Privacy) | 200K | 0.8s | 209 tps | $1.20/M | $4.00/M | $0.24/M | | 04/01/2026
Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.

More models by Z.ai

Model | Context | Latency | Throughput | Input | Output | Cache Read | Providers | Release Date
 | 205K | 0.6s | 51 tps | $1.40/M | $4.40/M | $0.26/M | deepinfra, fireworks, novita, +1 | 04/07/2026
 | 203K | 0.9s | 129 tps | $1.20/M | $4.00/M | $0.24/M | zai | 03/15/2026
 | 203K | 0.5s | 86 tps | $0.80/M | $2.56/M | $0.16/M | bedrock, deepinfra, fireworks, +3 | 02/12/2026
 | 205K | 0.1s | 596 tps | $2.25/M | $2.75/M | $2.25/M | bedrock, cerebras, deepinfra, +2 | 12/22/2025
 | 205K | 0.4s | 93 tps | $0.60/M | $2.20/M | $0.11/M | baseten, deepinfra, novita, +1 | 09/30/2025
 | 200K | 0.1s | 240 tps | $0.07/M | $0.40/M | $0.01/M | bedrock, zai |

What To Consider When Choosing a Provider

  • Configuration: GLM 5V Turbo converts visual designs to code. For best results, provide clean screenshots at sufficient resolution and specify the target framework (React, HTML/CSS, etc.) in your prompt.
  • Configuration: You can use GLM 5V Turbo in an iterative loop: render code, screenshot the result, feed it back to the model for corrections. This workflow leverages both vision and coding capabilities.
  • Configuration: The compact parameter size means faster inference, but the most complex visual reasoning tasks may benefit from the full GLM-4.6V (106B). Benchmark on your specific use cases.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

When to Use GLM 5V Turbo

Best For

  • Design-to-code generation: Converts screenshots and mockups into responsive React components, HTML, and CSS
  • Visual debugging: Examines rendered output, identifies layout issues, and generates fixes
  • GUI automation: Navigates real screen environments autonomously for testing and interaction workflows
  • Agentic visual coding pipelines: Combines image understanding with autonomous code planning and iteration
  • High-volume visual processing: The compact parameter size keeps inference fast and cost-effective

Consider Alternatives When

  • Maximum visual reasoning depth: GLM-4.6V (106B) is the largest vision-language model in the lineup, suited to workloads where speed is not a constraint
  • Text-only workloads: GLM-5-Turbo offers the same generation's speed without vision overhead
  • Simple captioning or classification: A lighter vision model may be more cost-effective for basic image tasks
  • Deepest text reasoning: The full GLM-5 text-only variant provides multiple thinking modes without vision input

Conclusion

GLM 5V Turbo bridges vision and code generation in a fast, compact package. For teams building design-to-code pipelines, visual debugging loops, or autonomous GUI agents, it delivers the GLM-5 generation's agentic capabilities with multimodal input at a practical speed and cost profile.

Frequently Asked Questions

  • What can GLM 5V Turbo do with screenshots?

    It converts screenshots and design mockups into responsive code, identifies visual bugs in rendered output, and navigates GUI environments by reading screen elements and performing actions.

  • How does GLM 5V Turbo compare to GLM-4.6V?

    GLM 5V Turbo is a newer, compact vision model focused on coding and GUI tasks. GLM-4.6V is a larger 106B parameter model with broader vision-language capabilities including native multimodal function calling and interleaved image-text generation.

  • Does GLM 5V Turbo support design-to-code generation?

    Yes. It's specifically built for this workflow. Provide a screenshot or design mockup and specify the target framework. The model generates matching responsive components.

  • What is the context window for GLM 5V Turbo?

    200K tokens.

  • How do I authenticate with GLM 5V Turbo through AI Gateway?

    AI Gateway provides a unified API key. No separate Z.ai account is needed. Use the zai/glm-5v-turbo model identifier to route requests. BYOK is also supported.

  • Can GLM 5V Turbo operate GUIs autonomously?

    Yes. It reads screen elements, interprets visual context, and performs navigation actions in real GUI environments. This makes it useful for automated testing and UI interaction workflows.

  • What is the pricing for GLM 5V Turbo?

    Check the pricing panel on this page for today's numbers. AI Gateway tracks rates across every provider that serves GLM 5V Turbo.