Grok 4.20 Beta Non-Reasoning

Grok 4.20 Beta Non-Reasoning is xAI's non-reasoning model in the Grok 4.20 beta generation, optimized for speed and direct responses with low hallucination rates and strict prompt adherence.

Tool Use, Implicit Caching, Vision (Image), File Input, Web Search
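index.ts
```ts
import { streamText } from 'ai'

// Stream a response through AI Gateway using the model's slug.
const result = streamText({
  model: 'xai/grok-4.20-non-reasoning-beta',
  prompt: 'Why is the sky blue?',
})
// Print each text chunk as it arrives.
for await (const chunk of result.textStream) process.stdout.write(chunk)
```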

Playground

Try out Grok 4.20 Beta Non-Reasoning by xAI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.


Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
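As a rough sketch of how a provider preference might be expressed, the helper below builds the options object you would pass to `streamText`. The `providerOptions.gateway.order` field and the `xai` provider slug are assumptions not stated on this page, and `buildRequest` is a hypothetical helper name; confirm the exact option shape in the docs before relying on it.

```typescript
// Hypothetical helper: builds options for streamText() with a provider preference.
// ASSUMPTION: the preference is passed as providerOptions.gateway.order,
// an ordered list of provider slugs (first entry is tried first).
interface GatewayRequest {
  model: string
  prompt: string
  providerOptions: { gateway: { order: string[] } }
}

function buildRequest(prompt: string, providerOrder: string[]): GatewayRequest {
  if (providerOrder.length === 0) {
    throw new Error('At least one provider slug is required')
  }
  return {
    model: 'xai/grok-4.20-non-reasoning-beta',
    prompt,
    // Copy the array so later changes by the caller do not affect the request.
    providerOptions: { gateway: { order: [...providerOrder] } },
  }
}

// 'xai' is an assumed slug for the xAI provider.
const request = buildRequest('Why is the sky blue?', ['xai'])
console.log(JSON.stringify(request.providerOptions))
```

Building the options in one place keeps the provider order out of individual call sites, so it can be changed without touching each request.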

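| Provider | Context | Latency | Throughput | Input | Output | Cache | Web Search | Per Query | Capabilities | ZDR | No Training | Release Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| xAI | 2M | 0.4s | 161tps | $1.25/M | $2.50/M | Read: $0.2/M, Write: — | $5/K + input costs | — | +3 | | | 03/11/2026 |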
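To make the xAI provider rates listed above concrete, here is a minimal cost-estimate sketch. It assumes cached input tokens are billed at the cache-read rate instead of the input rate, and that any tokens added by web search results are already counted in `inputTokens` (the page only says "+ input costs"). The `Usage` type and `estimateCostUsd` function are hypothetical names, and the result is an estimate, not an invoice.

```typescript
// Rates in USD, copied from the xAI provider listing above.
const RATES = {
  inputPerMillion: 1.25,
  outputPerMillion: 2.5,
  cacheReadPerMillion: 0.2,
  webSearchPerThousand: 5,
}

interface Usage {
  inputTokens: number        // total input tokens, including cached ones
  cachedInputTokens?: number // portion of inputTokens served from cache
  outputTokens: number
  webSearchQueries?: number
}

// ASSUMPTION: cached tokens replace (rather than add to) the normal input charge.
function estimateCostUsd(u: Usage): number {
  const cached = u.cachedInputTokens ?? 0
  const uncached = u.inputTokens - cached
  const queries = u.webSearchQueries ?? 0
  return (
    (uncached / 1_000_000) * RATES.inputPerMillion +
    (cached / 1_000_000) * RATES.cacheReadPerMillion +
    (u.outputTokens / 1_000_000) * RATES.outputPerMillion +
    (queries / 1_000) * RATES.webSearchPerThousand
  )
}

// One million input tokens plus one million output tokens: 1.25 + 2.50 = 3.75
console.log(estimateCostUsd({ inputTokens: 1_000_000, outputTokens: 1_000_000 }))
```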
Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.
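The three metrics above can be illustrated with a small sketch. It computes a P50 (median) time to first token, a P50 throughput in tokens per second, and a success rate from a list of request samples. The `RequestSample` shape is hypothetical, and restricting latency and throughput to successful requests is an assumption; the page does not say how AI Gateway aggregates its traffic.

```typescript
interface RequestSample {
  ttftMs: number            // time to first token, in milliseconds
  outputTokens: number      // tokens generated
  generationSeconds: number // time spent generating output
  ok: boolean               // whether the request succeeded
}

// Median of a non-empty list (averages the two middle values for even counts).
function p50(values: number[]): number {
  if (values.length === 0) throw new Error('no samples')
  const sorted = [...values].sort((a, b) => a - b)
  const mid = Math.floor(sorted.length / 2)
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2
}

function summarize(samples: RequestSample[]) {
  // ASSUMPTION: latency and throughput are measured on successful requests only.
  const succeeded = samples.filter((s) => s.ok)
  return {
    p50TtftMs: p50(succeeded.map((s) => s.ttftMs)),
    p50Tps: p50(succeeded.map((s) => s.outputTokens / s.generationSeconds)),
    successRate: succeeded.length / samples.length,
  }
}

const summary = summarize([
  { ttftMs: 300, outputTokens: 100, generationSeconds: 1, ok: true },
  { ttftMs: 400, outputTokens: 320, generationSeconds: 2, ok: true },
  { ttftMs: 500, outputTokens: 450, generationSeconds: 3, ok: true },
  { ttftMs: 0, outputTokens: 0, generationSeconds: 0, ok: false },
])
console.log(summary)
```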

More models by xAI

| Model | Context | Latency | Throughput | Input | Output | Cache | Web Search | Per Query | Capabilities | Providers | ZDR | No Training | Release Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 256K | 0.4s | 131tps | $1.00/M | $2.00/M | Read: $0.2/M, Write: — | $5/K + input costs | — | +3 | xai | | | 05/20/2026 |
| | 1M | 1.0s | 151tps | $1.25/M | $2.50/M | Read: $0.2/M, Write: — | $5/K + input costs | — | +4 | vertex, xai | | | 04/30/2026 |
| | 2M | 0.4s | 152tps | $1.25/M | $2.50/M | Read: $0.2/M, Write: — | $5/K + input costs | — | +3 | vertex, xai | | | 03/10/2026 |
| | 2M | 0.4s | 140tps | $1.25/M | $2.50/M | Read: $0.2/M, Write: — | $5/K + input costs | — | +4 | vertex, xai | | | 03/10/2026 |
| | 1M | 0.3s | 122tps | $0.20/M | $0.50/M | Read: $0.05/M, Write: — | — | — | +2 | vertex | | | 11/19/2025 |
| | 1M | 1.1s | 121tps | $0.20/M | $0.50/M | Read: $0.05/M, Write: — | — | — | +3 | vertex | | | 11/19/2025 |

About Grok 4.20 Beta Non-Reasoning

Grok 4.20 Beta Non-Reasoning was released March 11, 2026 as the beta-tagged variant of xAI's Grok 4.20 non-reasoning model. It produces direct answers without chain-of-thought overhead, optimizing for speed and strict prompt adherence rather than deliberation depth.

The non-reasoning architecture makes Grok 4.20 Beta Non-Reasoning well suited for agentic pipelines where individual steps need fast, deterministic responses: tool selection, structured data extraction, classification, and routing decisions. Low hallucination rates and tight prompt following reduce the need for output validation in automated workflows.
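A classification or routing step of the kind described above might look like the following sketch. The label set, prompt wording, and helper names are all hypothetical. The parser still rejects anything outside the allowed labels, since strong prompt adherence reduces the need for validation but a cheap check remains useful in automated pipelines.

```typescript
// Hypothetical label set for a support-ticket router.
const LABELS = ['billing', 'technical', 'sales'] as const
type Label = (typeof LABELS)[number]

// Build a prompt that asks for exactly one label and nothing else.
function buildClassificationPrompt(message: string): string {
  return [
    'Classify the support message into exactly one label.',
    `Allowed labels: ${LABELS.join(', ')}.`,
    'Reply with the label only.',
    '',
    `Message: ${message}`,
  ].join('\n')
}

// Normalize the model's reply and accept it only if it is an allowed label.
function parseLabel(raw: string): Label | null {
  const normalized = raw.trim().toLowerCase().replace(/[.!"']/g, '')
  return (LABELS as readonly string[]).includes(normalized)
    ? (normalized as Label)
    : null
}

console.log(buildClassificationPrompt('I was charged twice this month.'))
console.log(parseLabel(' Billing.\n'))
```

The prompt string would be passed as `prompt` to `streamText` or a similar call; a `null` result from `parseLabel` is the signal to retry or fall back to a default route.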

As a beta release, Grok 4.20 Beta Non-Reasoning may receive weight updates or behavior changes before xAI stabilizes it. Teams running production traffic should use the stable grok-4.20-non-reasoning and reserve Grok 4.20 Beta Non-Reasoning for evaluation and staging.
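One way to apply this split is to pick the model by deployment environment, as in the sketch below. The beta slug comes from the snippet at the top of this page; the stable slug `xai/grok-4.20-non-reasoning` is an assumption that the same `xai/` prefix applies, and `modelFor` is a hypothetical helper.

```typescript
type Environment = 'production' | 'staging' | 'development'

// Beta slug as shown in this page's index.ts example.
const BETA_MODEL = 'xai/grok-4.20-non-reasoning-beta'
// ASSUMPTION: the stable model uses the same 'xai/' prefix on AI Gateway.
const STABLE_MODEL = 'xai/grok-4.20-non-reasoning'

// Production traffic gets the stable model; everything else evaluates the beta.
function modelFor(env: Environment): string {
  return env === 'production' ? STABLE_MODEL : BETA_MODEL
}

console.log(modelFor('production'))
console.log(modelFor('staging'))
```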

What To Consider When Choosing a Provider

  • Beta status: Grok 4.20 Beta Non-Reasoning is in beta. Expect potential changes to behavior, pricing, or availability before general availability.
  • Reasoning depth: This variant produces direct answers. If you need the model to reason through complex problems step by step, use the Grok 4.20 Reasoning variant instead.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
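The authentication point can be sketched as a small credential resolver. The environment variable names `AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN` are assumptions not stated on this page, as is the choice to prefer the API key when both are set; `resolveCredential` is a hypothetical helper.

```typescript
type Credential = { kind: 'api-key' | 'oidc'; token: string }

// ASSUMPTION: these variable names and the API-key-first precedence are
// illustrative; check the AI Gateway docs for the names your setup uses.
function resolveCredential(env: Record<string, string | undefined>): Credential {
  const apiKey = env.AI_GATEWAY_API_KEY
  if (apiKey) return { kind: 'api-key', token: apiKey }

  const oidcToken = env.VERCEL_OIDC_TOKEN
  if (oidcToken) return { kind: 'oidc', token: oidcToken }

  throw new Error('No AI Gateway credential found (API key or OIDC token)')
}

// Demo with a literal environment object; in Node you might pass process.env.
const credential = resolveCredential({ AI_GATEWAY_API_KEY: 'example-key' })
console.log(credential.kind)
```

Either credential identifies you to AI Gateway only; no xAI provider key appears anywhere in the sketch, which matches the point above.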

When to Use Grok 4.20 Beta Non-Reasoning

Best For

  • High-throughput production APIs: Direct, precise answers at low latency serve end users best
  • Agentic tool-calling workflows: Benefit from fast decision-making with low hallucination rates
  • Classification and routing pipelines: Need reliable, prompt-adherent output for downstream processing
  • Chat and conversational interfaces: Low-hallucination, prompt-adherent responses arrive quickly without chain-of-thought overhead
  • Content generation tasks: Strict prompt adherence matters more than deep reasoning

Consider Alternatives When

  • Complex analytical tasks: These require multi-step reasoning. Use the Grok 4.20 Reasoning variant
  • Multi-agent orchestration: The Grok 4.20 Multi-Agent variant is purpose-built for agent collaboration
  • Stable production deployments: Beta models can change without notice. Use the stable grok-4.20-non-reasoning instead
  • Maximum cost efficiency on simple tasks: Grok 3 Mini Fast offers lower per-token costs

Conclusion

Grok 4.20 Beta Non-Reasoning trades reasoning depth for speed. Use it in agentic pipelines where fast, direct responses matter more than extended deliberation. For production stability, prefer the non-beta grok-4.20-non-reasoning.