Skip to content

Ministral 8B

Ministral 8B brings an interleaved sliding-window attention architecture to edge inference, delivering faster and more memory-efficient processing across its full context window of 128K tokens at $0.15 per million tokens.

Tool Use
index.ts
import { streamText } from 'ai'
const result = streamText({
model: 'mistral/ministral-8b',
prompt: 'Why is the sky blue?'
})

More models by Mistral AI

Model
Context
Latency
Throughput
Input
Output
Cache
Web Search
Per Query
Capabilities
Providers
ZDR
No Training
Release Date
256K
0.4s
64tps
$0.50/M$1.50/M
mistral logo
12/02/2025
256K
0.2s
63tps
$0.20/M$0.20/M
mistral logo
12/01/2025
128K
0.2s
149tps
$0.10/M$0.10/M
mistral logo
10/01/2024
32K
0.3s
$0.10/M$0.30/M
mistral logo
09/01/2024
$0.10/M
mistral logo
12/11/2023
256K
0.3s
74tps
$1.50/M$7.50/M
mistral logo