Skip to content

Ministral 8B

Ministral 8B brings an interleaved sliding-window attention architecture to edge inference, delivering faster and more memory-efficient processing across its full context window of 128K tokens at $0.15 per million tokens.

Tool Use
index.ts
import { streamText } from 'ai'
const result = streamText({
model: 'mistral/ministral-8b',
prompt: 'Why is the sky blue?'
})

More models by Mistral AI

Model
Context
Latency
Throughput
Input
Output
Cache
Web Search
Per Query
Capabilities
Providers
ZDR
No Training
Release Date
256K
0.4s
60tps
$0.50/M$1.50/M
mistral logo
12/02/2025
128K
0.3s
198tps
$0.10/M$0.10/M
mistral logo
10/01/2024
32K
0.4s
59tps
$0.10/M$0.30/M
mistral logo
09/01/2024
128K
0.3s
157tps
$0.30/M$0.90/M
mistral logo
05/29/2024
$0.10/M
mistral logo
12/11/2023
256K
0.3s
73tps
$1.50/M$7.50/M
mistral logo