Ministral 8B

Ministral 8B brings an interleaved sliding-window attention architecture to edge inference, delivering faster and more memory-efficient processing across its full context window of 128K tokens at $0.15 per million tokens.

Tool Use

index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'mistral/ministral-8b',
  prompt: 'Why is the sky blue?'
})

Overview About Providers Throughput Latency Uptime Status Similar FAQ

More models by Mistral AI

Model

Context	Latency	Throughput	Input	Output	Cache	Web Search	Per Query	Capabilities	Providers	ZDR	No Training	Release Date

mistral/mistral-medium-3.5

256K

0.3s

56tps

$1.50/M

$7.50/M

—

04/29/2026

mistral/devstral-small-2

256K

0.4s

59tps

$0.10/M

$0.30/M

—

12/09/2025

mistral/mistral-large-3

256K

0.5s

67tps

$0.50/M

$1.50/M

—

12/02/2025

mistral/ministral-3b

128K

0.3s

135tps

$0.10/M

—

10/16/2024

mistral/mistral-small

32K

0.4s

$0.10/M

$0.30/M

—

09/17/2024

mistral/mistral-embed

$0.10/M

—

12/11/2023

Agent Stack

Core Platform

Tools

Learn

Build

Explore

Ministral 8B

More models by Mistral AI