Grok 3 Mini Beta
Grok 3 Mini Beta is xAI's compact reasoning model in the Grok 3 family. It provides efficient inference for tasks that need solid reasoning without the computational overhead of the full-scale Grok 3, and supports a context window of 131.1K tokens.
```typescript
import { streamText } from 'ai'

const result = streamText({
  model: 'xai/grok-3-mini',
  prompt: 'Why is the sky blue?',
})

// Print the response as it streams in
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}
```
Playground
Try out Grok 3 Mini Beta by xAI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
About Grok 3 Mini Beta
Grok 3 Mini Beta is the compact variant in the Grok 3 model family, released February 17, 2025. It distills the reasoning capabilities of the full Grok 3 into a smaller, more efficient architecture that reduces both latency and cost per token while still handling standard language tasks across summarization, Q&A, code, and instruction following.
The model supports a context window of 131.1K tokens and handles general-purpose tasks including summarization, question answering, code generation, and instruction following. Where the full Grok 3 targets the hardest benchmark-style reasoning tasks, Grok 3 Mini Beta fits the broad middle ground of production workloads where lower cost matters more than pushing benchmark limits.
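When feeding large documents into that 131.1K-token window, a rough pre-flight check can catch oversized prompts before they hit the API. The sketch below assumes the advertised 131.1K figure corresponds to 131,072 tokens and uses the common ~4-characters-per-token heuristic; both are approximations, not values from the actual tokenizer.

```typescript
// Rough pre-flight check that a prompt fits the context window.
// ASSUMPTIONS: 131.1K is taken as 131,072 tokens, and token count is
// estimated as length / 4 — a heuristic, not the real tokenizer.
const CONTEXT_WINDOW = 131_072

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

// Reserve headroom for the model's output when budgeting the prompt
function fitsContext(prompt: string, reservedForOutput = 4_096): boolean {
  return estimateTokens(prompt) + reservedForOutput <= CONTEXT_WINDOW
}

console.log(fitsContext('Why is the sky blue?')) // short prompts fit easily
```

For production budgeting, replace the heuristic with a real tokenizer count; the structure of the check stays the same.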
Grok 3 Mini Beta is available at $0.3 per million input tokens and $0.5 per million output tokens through Vercel AI Gateway. For workloads that prioritize speed above all else, the Grok 3 Mini Fast variant adds further latency optimization.
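At those listed rates, a back-of-envelope helper makes it easy to see how token volume translates to spend. The constants below are a snapshot of the prices quoted on this page, not a live price feed; check the Gateway pricing for current values.

```typescript
// Estimate Grok 3 Mini Beta spend from token counts.
// ASSUMPTION: rates are the ones quoted on this page
// ($0.30 / 1M input, $0.50 / 1M output) and may change.
const INPUT_PER_MILLION = 0.3
const OUTPUT_PER_MILLION = 0.5

function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * INPUT_PER_MILLION +
    (outputTokens / 1_000_000) * OUTPUT_PER_MILLION
  )
}

// e.g. a day of 10M input + 2M output tokens: $3.00 + $1.00
console.log(estimateCostUSD(10_000_000, 2_000_000)) // 4
```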
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.
P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.
Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.
More models by xAI
What To Consider When Choosing a Provider
- Cost: Grok 3 Mini Beta significantly reduces per-token costs compared to the full Grok 3. For high-volume pipelines, this difference compounds quickly.
- Quality: Evaluate Grok 3 Mini Beta on your specific tasks before defaulting to the larger Grok 3. Many production workloads see negligible quality differences at substantially lower cost.
- Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
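Because the Gateway handles provider credentials, application code only needs the model slug plus a Gateway credential in the environment. The sketch below is illustrative: it assumes the `AI_GATEWAY_API_KEY` environment variable for local use and that OIDC is handled automatically on Vercel deployments (where `VERCEL` is set); the guard function is a hypothetical helper, not part of any SDK.

```typescript
// Minimal sketch, assuming AI Gateway reads AI_GATEWAY_API_KEY from the
// environment outside Vercel and uses OIDC on Vercel deployments.
// gatewayModelSlug is a hypothetical helper for illustration only.
function gatewayModelSlug(): string {
  if (!process.env.AI_GATEWAY_API_KEY && !process.env.VERCEL) {
    throw new Error('Set AI_GATEWAY_API_KEY to authenticate with AI Gateway')
  }
  return 'xai/grok-3-mini'
}
```

The returned slug is what you pass as the `model` value in AI SDK calls.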
When to Use Grok 3 Mini Beta
Best For
- High-volume production workloads: Per-token cost is a primary concern and mid-tier reasoning is sufficient
- General-purpose chat and instruction following: Conversational tasks that don't require the deepest analytical capability
- Summarization and content generation: Pipelines processing large document volumes
- Code generation for standard tasks: Boilerplate, unit tests, and routine development work
- Prototyping and development: Fast iteration matters more than maximum model capability
Consider Alternatives When
- Hardest reasoning tasks: In mathematics, science, or complex multi-step analysis where the full Grok 3 or Grok 4 models return measurably better results
- Maximum speed needed: The faster Grok 3 Mini Fast variant is the better fit for strict latency requirements
- Image or video generation: Use the Grok Imagine family for media generation
Conclusion
Grok 3 Mini Beta makes the Grok 3 reasoning foundation accessible at production-friendly economics. For teams scaling AI-powered features where the full Grok 3 is more capability than needed, it provides the practical balance of quality and cost that most real-world applications require.
Frequently Asked Questions
How does Grok 3 Mini Beta compare to the full Grok 3?
Grok 3 Mini Beta is a smaller, more efficient model that trades some reasoning depth for lower latency and cost. The full Grok 3 still fits better when you need the longest multi-step reasoning chains.
What is the difference between Grok 3 Mini Beta and Grok 3 Mini Fast?
Both share the same compact architecture, but Grok 3 Mini Fast adds further latency optimization for applications where response speed is the top priority.
What is the context window for Grok 3 Mini Beta?
The context window is 131.1K tokens.
What does Grok 3 Mini Beta cost?
See the pricing section on this page for today's rates. AI Gateway exposes each provider's pricing for Grok 3 Mini Beta.
How do I authenticate with Grok 3 Mini Beta through Vercel AI Gateway?
Use your Vercel AI Gateway API key with `xai/grok-3-mini` as the model identifier. AI Gateway handles routing and provider management automatically.
Is Grok 3 Mini Beta suitable for high-volume batch processing?
Yes. Its lower per-token cost makes it well-suited for batch workloads like document summarization, data extraction, and content classification at scale.
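A typical batch pipeline splits each document into context-sized chunks and summarizes them in turn. In the sketch below the summarize callback is a stand-in for a real `xai/grok-3-mini` call through the AI SDK; injecting it keeps the batching logic itself testable, and the 400,000-character chunk size is an assumed rough fit for the context window, not an official limit.

```typescript
// Sketch of a batch summarization pipeline. The Summarizer callback is
// a placeholder for a real AI SDK call to xai/grok-3-mini; the chunk
// size is an assumed heuristic, not a documented limit.
type Summarizer = (text: string) => Promise<string>

function chunkDocument(text: string, maxChars = 400_000): string[] {
  const chunks: string[] = []
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars))
  }
  return chunks
}

async function summarizeBatch(
  docs: string[],
  summarize: Summarizer,
): Promise<string[]> {
  const results: string[] = []
  // Documents run sequentially; swap in a concurrency limiter for scale
  for (const doc of docs) {
    const parts = await Promise.all(chunkDocument(doc).map(summarize))
    results.push(parts.join('\n'))
  }
  return results
}
```

In production, the callback would wrap `generateText` with the Gateway model slug and the per-chunk prompt.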
Does Vercel AI Gateway support Zero Data Retention for Grok 3 Mini Beta?
Zero Data Retention is not currently available for this model. ZDR on AI Gateway applies to direct gateway requests; BYOK flows aren't covered. See https://vercel.com/docs/ai-gateway/capabilities/zdr for details.