Skip to content

Ministral 8B

Ministral 8B brings an interleaved sliding-window attention architecture to edge inference, delivering faster and more memory-efficient processing across its full context window of 128K tokens at $0.15 per million tokens.

Tool Use
index.ts
import { streamText } from 'ai'
const result = streamText({
model: 'mistral/ministral-8b',
prompt: 'Why is the sky blue?'
})

More models by Mistral AI

Model
Context
Latency
Throughput
Input
Output
Cache
Web Search
Per Query
Capabilities
Providers
ZDR
No Training
Release Date
256K
0.3s
58tps
$0.40/M$2.00/M
mistral logo
12/09/2025
256K
0.4s
53tps
$0.50/M$1.50/M
mistral logo
12/02/2025
256K
0.2s
87tps
$0.20/M$0.20/M
mistral logo
12/01/2025
128K
0.2s
256tps
$0.10/M$0.10/M
mistral logo
10/01/2024
32K
0.3s
162tps
$0.10/M$0.30/M
mistral logo
09/01/2024
$0.10/M
mistral logo
12/11/2023