Grok 4
Grok 4 is xAI's high-capability reasoning model. It uses chain-of-thought reasoning for mathematics, science, and coding workloads and has a context window of 256K tokens.
import { streamText } from 'ai'
const result = streamText({ model: 'xai/grok-4', prompt: 'Why is the sky blue?' })
Playground
Try out Grok 4 by xAI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
About Grok 4
Grok 4 is xAI's full-scale reasoning model, released July 9, 2025. It represents a generational advancement over Grok 3, with substantial improvements in reasoning depth, instruction following, and scores on published benchmarks. The model builds on xAI's Colossus training infrastructure.
Grok 4 targets tasks that need extended multi-step reasoning, including competition-level mathematics, graduate-level science questions, and complex software engineering challenges. It produces detailed chain-of-thought traces you can inspect, which suits workflows where you need to verify reasoning steps.
The model supports a context window of 256K tokens and up to 256K tokens per response. It's available through Vercel AI Gateway at $3.0 per million input tokens and $15.0 per million output tokens. For lower latency, the Grok 4 Fast variants offer speed-optimized alternatives.
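The per-token rates above translate into a simple cost estimate. The sketch below is illustrative only: the function name `estimateCostUsd` is hypothetical, and it assumes reasoning tokens are billed as output tokens, so include them in the output count.

```typescript
// Rates quoted above for Grok 4 on AI Gateway (USD per million tokens).
const INPUT_USD_PER_MILLION = 3.0;
const OUTPUT_USD_PER_MILLION = 15.0;

// Estimate the cost of one request in USD.
// Assumption: reasoning tokens count toward `outputTokens`.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError('token counts must be non-negative');
  }
  return (
    (inputTokens / 1_000_000) * INPUT_USD_PER_MILLION +
    (outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION
  );
}

// Example: 20,000 input tokens and 8,000 output tokens.
// 0.02 * $3 + 0.008 * $15 = $0.06 + $0.12
const exampleCost = estimateCostUsd(20_000, 8_000);
console.log(exampleCost.toFixed(2)); // prints "0.18"
```

Because output tokens cost five times as much as input tokens at these rates, long reasoning traces tend to dominate the bill for short prompts.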
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
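As a rough sketch of setting a provider preference, the helper below builds a per-request options object from provider slugs. It assumes AI Gateway reads an ordered list at `providerOptions.gateway.order`; treat that shape, the helper name `preferProviders`, and the example slug `'xai'` as assumptions to check against the AI Gateway docs.

```typescript
// Shape of the per-request options this sketch assumes AI Gateway reads.
type GatewayProviderOptions = {
  gateway: { order: string[] };
};

// Build provider options listing provider slugs in order of preference.
function preferProviders(...slugs: string[]): GatewayProviderOptions {
  if (slugs.length === 0) {
    throw new Error('at least one provider slug is required');
  }
  // Copy the array so later changes to the caller's list have no effect.
  return { gateway: { order: [...slugs] } };
}

// 'xai' is an example slug; copy the real slug from the provider list.
const providerOptions = preferProviders('xai');
console.log(JSON.stringify(providerOptions));
```

Under these assumptions, the resulting object would be passed as `providerOptions` alongside `model` and `prompt` in the `streamText` call.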
P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.
P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.
Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.
What To Consider When Choosing a Provider
- Output token budget: Grok 4 generates extended reasoning traces that increase output token consumption. Budget output tokens generously and factor reasoning overhead into cost estimates.
- Latency: Deep reasoning takes time. For interactive applications, evaluate whether Grok 4 Fast variants provide sufficient quality at acceptable latency.
- Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
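The authentication and output-budget points above can be sketched as a plain request builder. This is a minimal illustration, not the documented client: the endpoint URL, the OpenAI-style `max_tokens` field, and the helper name `buildGrok4Request` are assumptions, and the bearer token stands in for either an API key or an OIDC token.

```typescript
// Assumed OpenAI-compatible endpoint; confirm the URL in the AI Gateway docs.
const GATEWAY_CHAT_URL = 'https://ai-gateway.vercel.sh/v1/chat/completions';

type ChatRequest = {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
};

// Build a request that authenticates with a bearer token and caps output
// tokens. Only the gateway credential is needed, not a provider credential.
function buildGrok4Request(
  token: string,
  prompt: string,
  maxOutputTokens: number,
): ChatRequest {
  if (token.length === 0) {
    throw new Error('missing AI Gateway token');
  }
  if (!Number.isInteger(maxOutputTokens) || maxOutputTokens <= 0) {
    throw new RangeError('maxOutputTokens must be a positive integer');
  }
  return {
    url: GATEWAY_CHAT_URL,
    method: 'POST',
    headers: {
      Authorization: `Bearer ${token}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'xai/grok-4',
      messages: [{ role: 'user', content: prompt }],
      // Leave generous room: reasoning traces consume output tokens.
      max_tokens: maxOutputTokens,
    }),
  };
}

const exampleRequest = buildGrok4Request('YOUR_API_KEY', 'Why is the sky blue?', 8192);
console.log(exampleRequest.url);
```

To send it, destructure the URL from the rest and pass both to `fetch`, for example `const { url, ...init } = exampleRequest; await fetch(url, init)`.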
When to Use Grok 4
Best For
- Competition-grade mathematical reasoning: Competition problems, formal proofs, and advanced quantitative analysis
- Graduate-level science and research: Tasks where the model needs to synthesize concepts across disciplines and reason through complex hypotheses
- Complex software engineering: Architecture design, debugging intricate systems, and reasoning about concurrent or distributed code
- Multi-step analytical workflows: Work in finance, law, or consulting where each reasoning step needs to be traceable and verifiable
- Agentic systems requiring deep planning: Agents that must reason through multi-step tool use and long-horizon goals
Consider Alternatives When
- Speed-sensitive applications: Grok 4 Fast Non-Reasoning or Grok 4 Fast Reasoning provide better latency at reduced cost
- Simple text tasks: Classification, extraction, or basic Q&A, where Grok 3 Mini Fast is dramatically more cost-efficient
- High-volume production pipelines: Workloads where the per-request cost of deep reasoning becomes uneconomical
- Coding-specific tasks: Grok Code Fast 1 may provide comparable quality at lower latency for pure code generation
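The guidance in the two lists above can be condensed into a small lookup. This is only an illustrative sketch: the `Workload` categories and the `suggestModel` helper are hypothetical, and the values are the model names mentioned on this page rather than API model identifiers.

```typescript
// Hypothetical workload categories, loosely following the lists above.
type Workload =
  | 'deep-reasoning'
  | 'latency-sensitive'
  | 'simple-text'
  | 'code-generation';

// Model names as they appear on this page (not API slugs).
const SUGGESTED_MODEL: Record<Workload, string> = {
  'deep-reasoning': 'Grok 4',
  'latency-sensitive': 'Grok 4 Fast Reasoning',
  'simple-text': 'Grok 3 Mini Fast',
  'code-generation': 'Grok Code Fast 1',
};

// Return a starting-point model for a workload; evaluate before committing.
function suggestModel(workload: Workload): string {
  return SUGGESTED_MODEL[workload];
}

console.log(suggestModel('deep-reasoning')); // prints "Grok 4"
```

A lookup like this is only a first pass; the quality and latency tradeoffs still need to be measured on your own workload.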
Conclusion
Grok 4 is xAI's full-scale reasoning model for math, science, and engineering tasks that need long chain-of-thought traces. Teams that can accept higher latency and cost for traceable, multi-step reasoning should evaluate Grok 4 against Grok 4 Fast variants.
Frequently Asked Questions
How does Grok 4 compare to Grok 3?
Grok 4 is the successor to Grok 3. It scores higher than Grok 3 on reasoning benchmarks, follows instructions more reliably, and produces deeper chain-of-thought output.
Does Grok 4 show its reasoning process?
Yes. Grok 4 generates chain-of-thought reasoning traces visible in the API response. You can inspect and verify the model's reasoning steps.
What is the context window for Grok 4?
The context window is 256K tokens.
What does Grok 4 cost?
See the pricing section on this page for today's rates. AI Gateway exposes each provider's pricing for Grok 4.
How do I authenticate with Grok 4 through Vercel AI Gateway?
Use your Vercel AI Gateway API key with xai/grok-4 as the model identifier. AI Gateway handles provider routing automatically.
When should I use Grok 4 versus Grok 4 Fast?
Use Grok 4 when reasoning depth and accuracy are paramount. Use Grok 4 Fast variants when you need faster responses and can accept some quality tradeoff on the hardest reasoning tasks.
Does Vercel AI Gateway support Zero Data Retention for Grok 4?
Zero Data Retention is not currently available for this model. ZDR on AI Gateway applies to direct gateway requests; BYOK flows aren't covered. See https://vercel.com/docs/ai-gateway/capabilities/zdr for details.