Ministral 3B

Ministral 3B is Mistral AI's smallest production model: a 3B-parameter, edge-optimized model that benchmarks above the larger Mistral 7B and costs $0.10 per million input tokens.

Tool Use
index.ts
import { streamText } from 'ai'

const result = streamText({
  model: 'mistral/ministral-3b',
  prompt: 'Why is the sky blue?',
})

Playground

Try out Ministral 3B by Mistral AI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider     Context   Latency   Input     Output    Release Date
Mistral AI   128K      0.2s      $0.10/M   $0.10/M   10/01/2024

Legal (Mistral AI): Terms, Privacy

Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.

More models by Mistral AI

Context   Latency   Throughput   Input     Output    Providers                    Release Date
256K      0.4s      51tps        $0.40/M   $2.00/M   mistral                      12/09/2025
256K      0.3s      67tps        $0.20/M   $0.20/M   mistral                      12/01/2025
128K      1.0s      30tps        $0.40/M   $2.00/M   mistral                      05/07/2025
32K       0.6s      159tps       $0.10/M   $0.30/M   mistral                      09/01/2024
131K      0.2s      83tps        $0.15/M   $0.15/M   deepinfra, mistral, novita   07/01/2024
-         -         -            $0.10/M   -         mistral                      12/11/2023

About Ministral 3B

Mistral AI released Ministral 3B on October 1, 2024 as the most compact member of the Ministral edge family. Ministral 3B outperforms the older Mistral 7B, a model with more than twice as many parameters, in most evaluation categories. Mistral AI attributes this to architectural refinements specific to the Ministral generation.

Ministral 3B ships with a context window of 128K tokens and supports function calling out of the box. That combination makes it a practical lightweight tool-dispatch agent in multi-step pipelines where a larger model would be wasteful for structured routing work.
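
The dispatch-layer role described above can be sketched without any SDK. This is a minimal illustration, assuming the model returns a tool call as a tool name plus JSON-encoded arguments. The `ToolCall` shape, the `handlers` registry, and both tool names are hypothetical and are not part of the AI SDK or the Mistral AI API. The model request itself is left out so the routing logic stands alone.

```typescript
// Hypothetical shape of a tool call emitted by the model (an assumption for
// illustration; real SDKs define their own types).
type ToolCall = { name: string; arguments: string };

type ToolHandler = (args: Record<string, unknown>) => string;

// Hypothetical tool registry. The small model only picks a tool and fills in
// its arguments; the handlers do the actual work.
const handlers = new Map<string, ToolHandler>([
  ['getWeather', (args) => `Weather lookup for ${String(args['city'])}`],
  ['createTicket', (args) => `Ticket created: ${String(args['title'])}`],
]);

// Routes one tool call to its handler and reports malformed calls as strings
// so the caller can feed the message back to the model.
function dispatchToolCall(call: ToolCall): string {
  const handler = handlers.get(call.name);
  if (handler === undefined) {
    return `Unknown tool: ${call.name}`;
  }

  let parsed: unknown;
  try {
    parsed = JSON.parse(call.arguments);
  } catch {
    return `Invalid JSON arguments for ${call.name}`;
  }

  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    return `Arguments for ${call.name} must be a JSON object`;
  }
  return handler(parsed as Record<string, unknown>);
}
```

Returning error strings instead of throwing is one possible design choice: in a multi-step pipeline, the message can be passed back to the model as the tool result so it can retry with corrected arguments.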

Production use falls under the Mistral AI Commercial License.

What To Consider When Choosing a Provider

  • Configuration: Input and output tokens are priced identically ($0.10 per million each), which simplifies cost estimation. You don't need to model prompt-to-completion ratios separately.
  • Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
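
The cost-estimation point above can be made concrete with a short sketch. It assumes the $0.10 per million tokens listed on this page for both input and output; the function name and the sample token counts are hypothetical. Because the two rates match, only the total token count matters.

```typescript
// Rate listed on this page for Ministral 3B, identical for input and output.
const PRICE_PER_MILLION_TOKENS_USD = 0.1;

// With symmetrical pricing, cost depends only on the total token count,
// not on how the tokens split between prompt and completion.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  const totalTokens = inputTokens + outputTokens;
  return (totalTokens / 1_000_000) * PRICE_PER_MILLION_TOKENS_USD;
}

// Hypothetical request shapes: same total, different prompt/completion split.
const promptHeavy = estimateCostUsd(800, 200);
const completionHeavy = estimateCostUsd(200, 800);
console.log(promptHeavy === completionHeavy); // both cover 1,000 tokens at one rate
```

For models with different input and output rates, the same estimate would need two multipliers and a forecast of the prompt-to-completion ratio.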

When to Use Ministral 3B

Best For

  • Lightweight function-calling agents: Agents that dispatch tools in multi-step workflows at minimal cost
  • High-frequency classification: Classification, simple summarization, or knowledge retrieval run at high request volumes
  • Cost-sensitive production APIs: Services processing millions of requests where per-call cost dominates

Consider Alternatives When

  • Long-context memory efficiency: You need the sliding-window attention Ministral 8B provides
  • Complex reasoning or code generation: These workloads exceed what a 3B-parameter model handles well
  • Image understanding: Your workload involves vision (consider Ministral 14B)
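
The guidance in the two lists above can be read as a small decision rule. The sketch below is one hedged way to encode it; the `Requirements` type, the function name, and the order of the checks are assumptions for illustration, and the return values are display names, not provider slugs.

```typescript
// Hypothetical description of a workload, mirroring the criteria listed above.
type Requirements = {
  needsVision: boolean;
  needsSlidingWindowAttention: boolean;
  needsComplexReasoningOrCodeGen: boolean;
};

// Returns a suggested model name. The check order is an assumption:
// hard capability gaps (vision) are tested before efficiency concerns.
function suggestModel(req: Requirements): string {
  if (req.needsVision) {
    return 'Ministral 14B';
  }
  if (req.needsSlidingWindowAttention) {
    return 'Ministral 8B';
  }
  if (req.needsComplexReasoningOrCodeGen) {
    return 'a larger model';
  }
  // Routing, classification, and other lightweight work stays on the 3B model.
  return 'Ministral 3B';
}
```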

Conclusion

Ministral 3B scores above Mistral 7B on most benchmarks despite having fewer than half the parameters. For cost-sensitive or latency-critical applications, it keeps inference cheap relative to larger alternatives.

Frequently Asked Questions

  • How can Ministral 3B outperform the larger Mistral 7B?

    Mistral AI credits architectural refinements in the Ministral generation.

  • What types of tasks is Ministral 3B well suited for?

    Structured function calling, classification, simple summarization, and knowledge retrieval. Use Ministral 3B as a lightweight dispatch layer in agentic systems where the work is routing, not deep reasoning.

  • What context length does Ministral 3B support?

    128K tokens.

  • What licensing options are available?

    The Mistral AI Commercial License covers production use.