Grok 4

Grok 4 is xAI's high-capability reasoning model. It uses chain-of-thought reasoning for mathematics, science, and coding workloads and has a context window of 256K tokens.

Capabilities: Reasoning, Tool Use, Vision (Image), Web Search
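index.ts
import { streamText } from 'ai'

const result = streamText({
  model: 'xai/grok-4',
  prompt: 'Why is the sky blue?',
})

// Print the response as it streams in.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}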

Playground

Try out Grok 4 by xAI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

About Grok 4

Grok 4 is xAI's full-scale reasoning model, released July 9, 2025. It represents a generational advance over Grok 3, with substantial improvements in reasoning depth, instruction following, and scores on published benchmarks. The model builds on xAI's Colossus training infrastructure.

Grok 4 targets tasks that need extended multi-step reasoning, including competition-level mathematics, graduate-level science questions, and complex software engineering challenges. It produces detailed chain-of-thought traces you can inspect, which suits workflows where you need to verify reasoning steps.

The model supports a context window of 256K tokens and up to 256K tokens per response. It's available through Vercel AI Gateway at $3.00 per million input tokens and $15.00 per million output tokens. For lower latency, the Grok 4 Fast variants offer speed-optimized alternatives.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
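
As a rough sketch of setting a provider preference, the helper below builds a per-request options object from one or more provider slugs. The helper name `preferProviders` is hypothetical, and the `gateway.order` shape is an assumption about how AI Gateway reads provider preferences through the AI SDK's `providerOptions`; confirm the exact field names in the docs before relying on it.

```typescript
// Hypothetical helper: wraps provider slugs in the options shape that
// AI Gateway is assumed to read from `providerOptions`.
type GatewayProviderOptions = { gateway: { order: string[] } };

function preferProviders(...slugs: string[]): GatewayProviderOptions {
  if (slugs.length === 0) {
    throw new Error('At least one provider slug is required');
  }
  // Earlier slugs are listed first, so they would be tried first.
  return { gateway: { order: slugs } };
}

const providerOptions = preferProviders('xai');
console.log(JSON.stringify(providerOptions)); // {"gateway":{"order":["xai"]}}

// Assumed usage with the AI SDK:
// streamText({ model: 'xai/grok-4', prompt: '...', providerOptions })
```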

| Provider | Context | Latency | Throughput | Input | Output | Cache | Web Search (per query) | Release Date |
|---|---|---|---|---|---|---|---|---|
| xAI (Legal: Terms, Privacy) | 256K | 3.8s | 64tps | $3.00/M | $15.00/M | Read: $0.75/M; Write: not listed | $5/K + input costs | 07/09/2025 |

Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.
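
To see how latency and throughput combine, a back-of-the-envelope estimate adds the time to first token to the time needed to stream the output. The sketch below uses the xAI figures from the provider table (3.8s, 64tps); the function name is hypothetical, and because both inputs are medians the result is only a rough guide, not a guarantee for any single request.

```typescript
// Rough estimate: total time = time to first token + output tokens / throughput.
function estimateResponseSeconds(
  outputTokens: number,
  ttftSeconds: number,
  tokensPerSecond: number,
): number {
  if (tokensPerSecond <= 0) {
    throw new RangeError('tokensPerSecond must be positive');
  }
  return ttftSeconds + outputTokens / tokensPerSecond;
}

// 1,920 output tokens at 64 tps stream in 30 s, plus 3.8 s to first token.
const seconds = estimateResponseSeconds(1920, 3.8, 64);
console.log(seconds.toFixed(1)); // 33.8
```

Reasoning tokens count toward the output, so a response with a long reasoning trace takes longer than its visible length suggests.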

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.

More models by xAI

| Context | Latency | Throughput | Input | Output | Cache | Web Search (per query) | Providers | Release Date |
|---|---|---|---|---|---|---|---|---|
| 1M | 0.7s | 84tps | $1.25/M | $2.50/M | Read: $0.2/M; Write: not listed | $5/K + input costs | xAI | 04/30/2026 |
| 2M | 3.4s | 900tps | $1.25/M | $2.50/M | Read: $0.2/M; Write: not listed | $5/K + input costs | xAI | 03/11/2026 |
| 2M | 0.7s | 55tps | $0.20/M | $0.50/M | Read: $0.05/M; Write: not listed | $5/K + input costs | xAI | 09/19/2025 |
| 256K | 0.3s | 104tps | $0.20/M | $1.50/M | Read: $0.02/M; Write: not listed | not listed | xAI | 08/28/2025 |
| 2M | 0.2s | 224tps | $0.20/M | $0.50/M | Read: $0.05/M; Write: not listed | $5/K + input costs | Vertex, xAI | 07/09/2025 |
| 2M | 0.8s | 150tps | $0.20/M | $0.50/M | Read: $0.05/M; Write: not listed | $5/K + input costs | Vertex, xAI | 07/09/2025 |

What To Consider When Choosing a Provider

  • Configuration: Grok 4 generates extended reasoning traces that increase output token consumption. Budget output tokens generously and factor reasoning overhead into cost estimates.
  • Configuration: Deep reasoning takes time. For interactive applications, evaluate whether Grok 4 Fast variants provide sufficient quality at acceptable latency.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
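
To factor reasoning overhead into a cost estimate, the sketch below applies the rates listed on this page ($3.00 per million input tokens, $15.00 per million output tokens). It assumes reasoning tokens are billed at the output rate, which matches the note above that reasoning traces increase output token consumption; the function name and the token counts are illustrative only.

```typescript
// Rates listed on this page for xai/grok-4, in USD per million tokens.
const INPUT_USD_PER_MILLION = 3.0;
const OUTPUT_USD_PER_MILLION = 15.0;

// Assumption: reasoning tokens are billed as output tokens.
function estimateCostUsd(
  inputTokens: number,
  visibleOutputTokens: number,
  reasoningTokens: number,
): number {
  const billedOutputTokens = visibleOutputTokens + reasoningTokens;
  return (
    (inputTokens * INPUT_USD_PER_MILLION +
      billedOutputTokens * OUTPUT_USD_PER_MILLION) /
    1_000_000
  );
}

// 10,000 input tokens, 1,000 visible output tokens, 4,000 reasoning tokens.
const cost = estimateCostUsd(10_000, 1_000, 4_000);
console.log(cost.toFixed(3)); // 0.105
```

In this example the reasoning trace accounts for $0.06 of the $0.105 total, which is why the output budget deserves the most attention.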

When to Use Grok 4

Best For

  • Competition-grade mathematical reasoning: Competition problems, formal proofs, and advanced quantitative analysis
  • Graduate-level science and research: Tasks where the model must synthesize concepts across disciplines and reason through complex hypotheses
  • Complex software engineering: Architecture design, debugging intricate systems, and reasoning about concurrent or distributed code
  • Multi-step analytical workflows: Work in finance, law, or consulting where each reasoning step needs to be traceable and verifiable
  • Agentic systems requiring deep planning: Cases where the model must reason through multi-step tool use and long-horizon goals

Consider Alternatives When

  • Speed-sensitive applications: Grok 4 Fast Non-Reasoning or Grok 4 Fast Reasoning provide better latency at reduced cost
  • Simple text tasks: Classification, extraction, or basic Q&A, where Grok 3 Mini Fast is dramatically more cost-efficient
  • High-volume production pipelines: Workloads where the per-request cost of deep reasoning becomes uneconomical
  • Coding-specific tasks: Grok Code Fast 1 may provide comparable quality at lower latency for pure code generation
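
The guidance in the two lists above can be sketched as a simple lookup from task type to model. Only `xai/grok-4` appears as a model identifier on this page; the other slugs and the `TaskKind` categories are assumptions made for illustration, so check the actual identifiers in the model list before using them.

```typescript
type TaskKind =
  | 'deep-reasoning'
  | 'latency-sensitive'
  | 'simple-text'
  | 'code-generation';

// Only 'xai/grok-4' is confirmed by this page; the other slugs are assumed.
const MODEL_BY_TASK: Record<TaskKind, string> = {
  'deep-reasoning': 'xai/grok-4',
  'latency-sensitive': 'xai/grok-4-fast-reasoning',
  'simple-text': 'xai/grok-3-mini-fast',
  'code-generation': 'xai/grok-code-fast-1',
};

function pickModel(kind: TaskKind): string {
  return MODEL_BY_TASK[kind];
}

console.log(pickModel('deep-reasoning')); // xai/grok-4
```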

Conclusion

Grok 4 is xAI's full-scale reasoning model for math, science, and engineering tasks that need long chain-of-thought traces. Teams that can accept higher latency and cost in exchange for traceable, multi-step reasoning should evaluate Grok 4 against the Grok 4 Fast variants.

Frequently Asked Questions

  • How does Grok 4 compare to Grok 3?

    Grok 4 is the full-scale model in the Grok 4 family. It scores higher than Grok 3 on reasoning benchmarks, follows instructions more reliably, and produces deeper chain-of-thought output.

  • Does Grok 4 show its reasoning process?

    Yes. Grok 4 generates chain-of-thought reasoning traces visible in the API response. You can inspect and verify the model's reasoning steps.

  • What is the context window for Grok 4?

    The context window is 256K tokens.

  • What does Grok 4 cost?

    See the pricing section on this page for today's rates. AI Gateway exposes each provider's pricing for Grok 4.

  • How do I authenticate with Grok 4 through Vercel AI Gateway?

    Use your Vercel AI Gateway API key with xai/grok-4 as the model identifier. AI Gateway handles provider routing automatically.

  • When should I use Grok 4 versus Grok 4 Fast?

    Use Grok 4 when reasoning depth and accuracy are paramount. Use Grok 4 Fast variants when you need faster responses and can accept some quality tradeoff on the hardest reasoning tasks.

  • Does Vercel AI Gateway support Zero Data Retention for Grok 4?

    Zero Data Retention is not currently available for this model. ZDR on AI Gateway applies to direct gateway requests; BYOK flows aren't covered. See https://vercel.com/docs/ai-gateway/capabilities/zdr for details.
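
For the authentication question above, a minimal sketch of an authenticated request might look like the following. The endpoint URL, the bearer-token header, and the OpenAI-style request body are assumptions about AI Gateway's OpenAI-compatible API, and `buildChatRequest` is a hypothetical helper; only the `xai/grok-4` identifier comes from this page.

```typescript
// Assumed OpenAI-compatible chat endpoint for AI Gateway; confirm in the docs.
const GATEWAY_CHAT_URL = 'https://ai-gateway.vercel.sh/v1/chat/completions';

interface ChatRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds the request without sending it.
function buildChatRequest(apiKey: string, prompt: string): ChatRequest {
  if (apiKey.trim() === '') {
    throw new Error('An AI Gateway API key is required');
  }
  return {
    url: GATEWAY_CHAT_URL,
    init: {
      method: 'POST',
      headers: {
        // Assumed bearer-token scheme using the AI Gateway API key.
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        model: 'xai/grok-4',
        messages: [{ role: 'user', content: prompt }],
      }),
    },
  };
}

// Usage: const { url, init } = buildChatRequest(key, 'Why is the sky blue?');
//        const response = await fetch(url, init);
```

Building the request separately from sending it keeps the API key out of shared code and makes the headers easy to inspect.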