Grok 3 Mini Beta

Grok 3 Mini Beta is xAI's compact reasoning model in the Grok 3 family. It provides efficient inference for tasks that need solid reasoning without the full computational overhead of the full-scale Grok 3, with a context window of 131.1K tokens.

Tool Use
index.ts
import { streamText } from 'ai'

const result = streamText({
  model: 'xai/grok-3-mini',
  prompt: 'Why is the sky blue?',
})

// Consume the response as it streams in
for await (const textPart of result.textStream) {
  process.stdout.write(textPart)
}

Playground

Try out Grok 3 Mini Beta by xAI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

About Grok 3 Mini Beta

Grok 3 Mini Beta is the compact variant in the Grok 3 model family, released February 17, 2025. It distills the reasoning capabilities of the full Grok 3 into a smaller, more efficient architecture that reduces both latency and cost per token.

The model supports a context window of 131.1K tokens and handles general-purpose tasks including summarization, question answering, code generation, and instruction following. Where the full Grok 3 targets the hardest benchmark-style reasoning tasks, Grok 3 Mini Beta fits the broad middle ground of production workloads where lower cost matters more than pushing benchmark limits.
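Before sending large documents, it can help to check that a prompt will fit the window. A minimal sketch, assuming the 131.1K figure rounds the common 131,072-token limit and using a crude ~4 characters-per-token heuristic (a real tokenizer would be more accurate); `fitsContext` is a hypothetical helper, not part of any SDK:

```typescript
// Assumption: the 131.1K window corresponds to 131,072 tokens.
const CONTEXT_WINDOW = 131_072

// Crude heuristic: roughly 4 characters per token for English text.
function fitsContext(promptChars: number, reservedOutputTokens: number): boolean {
  const approxPromptTokens = Math.ceil(promptChars / 4)
  return approxPromptTokens + reservedOutputTokens <= CONTEXT_WINDOW
}

// A 40K-character prompt plus an 8,192-token reply budget fits comfortably.
console.log(fitsContext(40_000, 8_192)) // → true
```

Reserving output tokens up front avoids requests that get truncated mid-generation once the prompt has consumed most of the window.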

Grok 3 Mini Beta is available at $0.30 per million input tokens and $0.50 per million output tokens through Vercel AI Gateway. For workloads that prioritize speed above all else, the Grok 3 Mini Fast variant adds further latency optimization.
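At those rates, per-request cost can be estimated directly. A small sketch using the input and output prices above; `estimateCostUSD` is a hypothetical helper for illustration:

```typescript
const INPUT_USD_PER_M = 0.3  // $0.30 per million input tokens
const OUTPUT_USD_PER_M = 0.5 // $0.50 per million output tokens

// Hypothetical helper: blended USD cost for one request.
function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  return (inputTokens / 1e6) * INPUT_USD_PER_M + (outputTokens / 1e6) * OUTPUT_USD_PER_M
}

// One million tokens in and one million out costs $0.80 total.
console.log(estimateCostUSD(1_000_000, 1_000_000).toFixed(2)) // → "0.80"
```

For high-volume pipelines, multiplying this per-request figure by daily request counts makes the cost gap against the full Grok 3 ($3/M input, per xAI's published rates at the time of writing) concrete.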

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

  • Provider: xAI (Legal: Terms, Privacy)
  • Context: 131K
  • Latency: 0.3s
  • Throughput: 101 tps
  • Input: $0.30/M tokens
  • Output: $0.50/M tokens
  • Cache read: $0.07/M tokens
  • Release date: 02/17/2025
Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.


What To Consider When Choosing a Provider

  • Cost: Grok 3 Mini Beta significantly reduces per-token costs compared to the full Grok 3. For high-volume pipelines, this difference compounds quickly.
  • Evaluation: Test Grok 3 Mini Beta on your specific tasks before defaulting to the larger Grok 3. Many production workloads see negligible quality differences at substantially lower cost.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

When to Use Grok 3 Mini Beta

Best For

  • High-volume production workloads: Per-token cost is a primary concern and mid-tier reasoning is sufficient
  • General-purpose chat and instruction following: Conversational workloads that don't require the deepest analytical reasoning
  • Summarization and content generation: Pipelines processing large document volumes
  • Code generation for standard tasks: Boilerplate, unit tests, and routine development work
  • Prototyping and development: Fast iteration matters more than maximum model capability

Consider Alternatives When

  • Hardest reasoning tasks: Mathematics, science, or complex multi-step analysis where the full Grok 3 or Grok 4 models return measurably better results
  • Maximum speed needed: The lower-latency Grok 3 Mini Fast variant better serves strict response-time requirements
  • Image or video generation: Use the Grok Imagine family for media generation

Conclusion

Grok 3 Mini Beta makes the Grok 3 reasoning foundation accessible at production-friendly economics. For teams scaling AI-powered features where the full Grok 3 is more capability than needed, it provides the practical balance of quality and cost that most real-world applications require.

Frequently Asked Questions

  • How does Grok 3 Mini Beta compare to the full Grok 3?

    Grok 3 Mini Beta is a smaller, more efficient model that trades some reasoning depth for lower latency and cost. The full Grok 3 still fits better when you need the longest multi-step reasoning chains.

  • What is the difference between Grok 3 Mini Beta and Grok 3 Mini Fast?

    Both share the same compact architecture, but Grok 3 Mini Fast adds further latency optimization for applications where response speed is the top priority.

  • What is the context window for Grok 3 Mini Beta?

    The context window is 131.1K tokens.

  • What does Grok 3 Mini Beta cost?

    See the pricing section on this page for today's rates. AI Gateway exposes each provider's pricing for Grok 3 Mini Beta.

  • How do I authenticate with Grok 3 Mini Beta through Vercel AI Gateway?

    Use your Vercel AI Gateway API key with xai/grok-3-mini as the model identifier. AI Gateway handles routing and provider management automatically.

  • Is Grok 3 Mini Beta suitable for high-volume batch processing?

    Yes. Its lower per-token cost makes it well-suited for batch workloads like document summarization, data extraction, and content classification at scale.

  • Does Vercel AI Gateway support Zero Data Retention for Grok 3 Mini Beta?

    Zero Data Retention is not currently available for this model. ZDR on AI Gateway applies to direct gateway requests; BYOK flows aren't covered. See https://vercel.com/docs/ai-gateway/capabilities/zdr for details.