Ministral 3B
Ministral 3B is Mistral AI's smallest production model: a 3-billion-parameter, edge-optimized architecture that benchmarks above the larger Mistral 7B while costing $0.10 per million input tokens.
```typescript
import { streamText } from 'ai'

const result = streamText({ model: 'mistral/ministral-3b', prompt: 'Why is the sky blue?' })
for await (const text of result.textStream) process.stdout.write(text)
```

What To Consider When Choosing a Provider
Zero Data Retention
AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, see the documentation.

Authentication
AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
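A minimal sketch of the API-key path, assuming the gateway reads the `AI_GATEWAY_API_KEY` environment variable (the variable name and the OIDC flow may differ in your deployment, so check the gateway documentation):

```shell
# Assumed variable name; the AI SDK picks it up automatically,
# so no provider credentials appear in application code.
export AI_GATEWAY_API_KEY="your-api-key"
```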
Symmetrical input/output pricing simplifies cost estimation. You don't need to model prompt-to-completion ratios separately.
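With symmetrical pricing, a back-of-the-envelope estimate needs only total token volume, not the prompt-to-completion split. A minimal sketch, assuming the $0.10-per-million rate applies to both input and output tokens:

```typescript
// Flat rate per million tokens, identical for input and output (assumed).
const PRICE_PER_MILLION_TOKENS = 0.1

// Total cost is just combined token volume times the flat rate.
function estimateCost(inputTokens: number, outputTokens: number): number {
  return ((inputTokens + outputTokens) / 1_000_000) * PRICE_PER_MILLION_TOKENS
}

// One million combined tokens costs the flat rate.
console.log(estimateCost(500_000, 500_000)) // → 0.1
```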
When to Use Ministral 3B
Best For
Lightweight function-calling agents:
Agents that dispatch tools in multi-step workflows at minimal cost
High-frequency classification:
Classification, summarization, or knowledge retrieval at high volume and minimal cost
Cost-sensitive production APIs:
Services processing millions of requests where per-call cost dominates
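The dispatch pattern above can be sketched without any model call: in a lightweight agent, the model's only job is to pick a tool and its arguments, and a thin router executes the choice. A minimal sketch in which the tool names (`getWeather`, `createTicket`) and the call shape are hypothetical illustrations, not part of any API:

```typescript
// Shape of a tool call as a small dispatch model would emit it.
type ToolCall = { name: string; args: Record<string, string> }

// Registry of handlers; the model only chooses which one runs.
const tools: Record<string, (args: Record<string, string>) => string> = {
  getWeather: (args) => `weather:${args.city}`,   // hypothetical tool
  createTicket: (args) => `ticket:${args.title}`, // hypothetical tool
}

// Route a model-emitted tool call to its registered handler.
function dispatch(call: ToolCall): string {
  const handler = tools[call.name]
  if (!handler) throw new Error(`unknown tool: ${call.name}`)
  return handler(call.args)
}

console.log(dispatch({ name: 'getWeather', args: { city: 'Paris' } })) // → weather:Paris
```

Because the heavy lifting stays in the handlers, the model invocation per step remains small, which is where a 3B model's per-call cost advantage compounds.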
Consider Alternatives When
Long-context memory efficiency:
You need the sliding-window attention Ministral 8B provides
Complex reasoning or code generation:
These workloads exceed the 3B capability
Image understanding:
Your workload involves vision (consider Ministral 14B)
Conclusion
Ministral 3B scores above Mistral 7B on most benchmarks despite a smaller footprint. For cost-sensitive or latency-critical applications, it keeps inference costs low relative to larger alternatives.
FAQ
Why does Ministral 3B outperform the larger Mistral 7B?
Mistral AI credits architectural refinements in the Ministral generation.

What workloads suit Ministral 3B best?
Structured function calling, classification, simple summarization, and knowledge retrieval. Use Ministral 3B as a lightweight dispatch layer in agentic systems where the work is routing, not deep reasoning.

What is the context window?
128K tokens.

Can Ministral 3B be used in production?
The Mistral AI Commercial License covers production use.