Ministral 3B
Ministral 3B is Mistral AI's smallest production model, a 3B-parameter, edge-optimized model that benchmarks above the larger Mistral 7B and is priced at $0.10 per million input tokens.
import { streamText } from 'ai'
const result = streamText({ model: 'mistral/ministral-3b', prompt: 'Why is the sky blue?' })
Playground
Try out Ministral 3B by Mistral AI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
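As a rough sketch of what setting a provider preference could look like in code, the helper below builds a routing fragment to spread into a `streamText` call. The `providerOptions.gateway.order` shape and the `'mistral'` slug are assumptions based on the gateway's provider options, and `preferProviders` is a hypothetical helper name; check the docs for the exact keys and slugs before relying on it.

```typescript
// Assumed shape of the gateway routing options; verify against the AI Gateway docs.
interface GatewayRoutingOptions {
  providerOptions: { gateway: { order: string[] } };
}

// Hypothetical helper: lists provider slugs in order of preference.
function preferProviders(...slugs: string[]): GatewayRoutingOptions {
  return { providerOptions: { gateway: { order: slugs } } };
}

// 'mistral' is an assumed slug; copy the real one from the provider table.
const routing = preferProviders('mistral');

// Intended use (not executed here):
// streamText({ model: 'mistral/ministral-3b', prompt: '...', ...routing })
console.log(JSON.stringify(routing));
```

Keeping the preference in one small object makes it easy to reuse the same routing across every call that targets this model.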
- Throughput: P50 throughput on live AI Gateway traffic, in tokens per second (TPS).
- Time to first token: P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds.
- Success rate: Direct request success rate on AI Gateway and per provider.
Visit the docs for more info on each metric.
About Ministral 3B
Mistral AI released Ministral 3B on October 1, 2024 as the most compact member of the Ministral edge family. Ministral 3B outperforms the older Mistral 7B, a model with more than twice its parameter count, in most evaluation categories. Mistral AI attributes this to architectural refinements specific to the Ministral generation.
Ministral 3B ships with a context window of 128K tokens and supports function calling out of the box. That combination makes it a practical lightweight tool-dispatch agent in multi-step pipelines where a larger model would be wasteful for structured routing work.
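The tool-dispatch role described above can be illustrated with a minimal sketch. It assumes the model has already returned a structured tool call; the `ToolCall` shape, the tool names, and the `dispatch` function are all hypothetical and stand in for whatever your pipeline actually uses. No model request is made here.

```typescript
// Hypothetical shape of a tool call as returned by a function-calling model.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

type ToolHandler = (args: Record<string, unknown>) => string;

// Hypothetical tool registry; a Map avoids accidental hits on object prototype keys.
const tools = new Map<string, ToolHandler>([
  ['getWeather', (args) => `Weather lookup for ${String(args.city)}`],
  ['createTicket', (args) => `Ticket created: ${String(args.title)}`],
]);

// Routes a tool call to its handler, or reports an unknown tool name.
function dispatch(call: ToolCall): string {
  const handler = tools.get(call.name);
  if (!handler) {
    return `Unknown tool: ${call.name}`;
  }
  return handler(call.args);
}

// Stand-in for a tool call the model might emit.
const example: ToolCall = { name: 'getWeather', args: { city: 'Paris' } };
console.log(dispatch(example)); // prints "Weather lookup for Paris"
```

In this arrangement the small model only picks the tool and fills in arguments, while the handlers do the actual work, which is why a 3B model can be enough for the routing step.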
Production use falls under the Mistral AI Commercial License.
What To Consider When Choosing a Provider
- Configuration: Symmetrical input/output pricing simplifies cost estimation. You don't need to model prompt-to-completion ratios separately.
- Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
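To illustrate the cost-estimation point under Configuration, here is a small sketch of the arithmetic. It assumes the $0.10 per million token rate applies equally to input and output tokens, which follows from the symmetrical pricing noted above but should be checked against current rates; `estimateCostUsd` and the traffic figures are hypothetical.

```typescript
// Assumed flat rate in USD per million tokens, applied to input and output alike.
const USD_PER_MILLION_TOKENS = 0.1;

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// With symmetrical pricing, only the total token count matters.
function estimateCostUsd(usage: Usage, ratePerMillion: number = USD_PER_MILLION_TOKENS): number {
  const totalTokens = usage.inputTokens + usage.outputTokens;
  return (totalTokens / 1_000_000) * ratePerMillion;
}

// Hypothetical workload: one million requests, each with 400 input and 100 output tokens.
const requests = 1_000_000;
const batchCost = estimateCostUsd({
  inputTokens: 400 * requests,
  outputTokens: 100 * requests,
});
console.log(batchCost.toFixed(2)); // 500M tokens at $0.10 per million: prints "50.00"
```

Because both directions share one rate, the estimate does not change if the same total is split differently between prompt and completion.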
When to Use Ministral 3B
Best For
- Lightweight function-calling agents: Agents that dispatch tools in multi-step workflows at minimal cost
- High-frequency classification: Classification, simple summarization, or knowledge retrieval at high request volumes
- Cost-sensitive production APIs: Services processing millions of requests where per-call cost dominates
Consider Alternatives When
- Long-context memory efficiency: You need the sliding-window attention Ministral 8B provides
- Complex reasoning or code generation: These workloads exceed what a 3B model handles well
- Image understanding: Your workload involves vision (consider Ministral 14B)
Conclusion
Ministral 3B scores above Mistral 7B on most benchmarks despite a smaller footprint. For cost-sensitive or latency-critical applications, it keeps inference cheap relative to larger alternatives.
Frequently Asked Questions
How can Ministral 3B outperform the larger Mistral 7B?
Mistral AI credits architectural refinements in the Ministral generation.
What types of tasks is Ministral 3B well suited for?
Structured function calling, classification, simple summarization, and knowledge retrieval. Use Ministral 3B as a lightweight dispatch layer in agentic systems where the work is routing, not deep reasoning.
What context length does Ministral 3B support?
128K tokens.
What licensing options are available?
The Mistral AI Commercial License covers production use.