Grok 2 Vision

The Grok 2 Vision model excels at vision-based tasks, delivering state-of-the-art performance in visual math reasoning (MathVista) and document-based question answering (DocVQA). It can process a wide variety of visual inputs, including documents, diagrams, charts, screenshots, and photographs.

Tool Use, Vision (Image)

index.ts

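import { streamText } from 'ai'

const result = streamText({
  model: 'xai/grok-2-vision',
  prompt: 'Why is the sky blue?'
})

// Print the response as it streams in.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}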
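
The snippet above sends a text-only prompt, so it does not exercise the model's image input. As a minimal sketch, an image question could be expressed as a user message whose content mixes a text part and an image part, which is the shape the AI SDK accepts under `messages`. The helper name `buildVisionMessages`, the local types, and the example URL are assumptions made for illustration; they are not part of this page or of the SDK.

```typescript
// Local types that mirror the content-part shape used for multimodal messages.
// They are declared here only so the sketch is self-contained.
type TextPart = { type: 'text'; text: string }
type ImagePart = { type: 'image'; image: URL }
type UserMessage = { role: 'user'; content: Array<TextPart | ImagePart> }

// Hypothetical helper: pairs a question with an image in one user message.
function buildVisionMessages(question: string, imageUrl: string): UserMessage[] {
  // new URL() throws on a malformed address, so bad input fails early.
  const image = new URL(imageUrl)
  return [
    {
      role: 'user',
      content: [
        { type: 'text', text: question },
        { type: 'image', image },
      ],
    },
  ]
}

// Placeholder image address, for illustration only.
const messages = buildVisionMessages(
  'What does this chart show?',
  'https://example.com/chart.png',
)
```

Passing `messages` to `streamText` in place of `prompt`, with the same `model` string, should send both parts in a single request. Whether a remote URL is fetched for you or the image bytes need to be supplied inline is worth checking against the provider's documentation.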


More models by xAI

Context	Latency	Throughput	Input	Output	Cache Read	Cache Write	Web Search	Per Query	Release Date
	1.2s	79tps	$1.25/M	$2.50/M	$0.2/M	—	$5/K + input costs	—	04/30/2026
	4.8s	571tps	$1.25/M	$2.50/M	$0.2/M	—	$5/K + input costs	—	03/11/2026
	0.7s	65tps	$0.20/M	$0.50/M	$0.05/M	—	$5/K + input costs	—	09/19/2025
256K	0.4s	93tps	$0.20/M	$1.50/M	$0.02/M	—	—		08/28/2025
	0.2s	189tps	$0.20/M	$0.50/M	$0.05/M	—	$5/K + input costs	—	07/09/2025
	0.7s	342tps	$0.20/M	$0.50/M	$0.05/M	—	$5/K + input costs	—	07/09/2025
