Skip to content

MiniMax M2.5

MiniMax M2.5 is a third-generation agentic model from MiniMax that handles full-stack development across Web, Android, iOS, Windows, and Mac platforms. It supports a context window of 1M tokens, a max output of 196K tokens, and completes tasks about 37% faster than M2.1.

ReasoningTool UseImplicit Caching
index.ts
import { streamText } from 'ai'
const result = streamText({
model: 'minimax/minimax-m2.5',
prompt: 'Why is the sky blue?'
})

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider
Context
Latency
Throughput
Input
Output
Cache
Web Search
Per Query
Capabilities
ZDR
No Training
Release Date
MiniMax
205K
3.2s
291tps
$0.30/M$1.20/M
Read:$0.03/M
Write:$0.38/M
+1
02/12/2026
Nebius
196K
0.9s
98tps
$0.30/M$1.20/M
02/12/2026
Parasail
197K
0.4s
118tps
$0.30/M$1.20/M
02/12/2026
DeepInfra
197K
0.5s
45tps
$0.27/M$0.95/M
Read:$0.03/M
Write:
02/12/2026
Amazon Bedrock
1M
0.7s
71tps
$0.30/M$1.20/M
02/12/2026
Blackbox
128K
2.3s
324tps
$0.07/M$0.57/M
+1
02/12/2026