MiMo M2.5

MiMo M2.5 is the mid-tier model in Xiaomi's MiMo v2.5 family, a Mixture-of-Experts (MoE) stack with reasoning, tool use, and multimodal input. It supports a context window of 1.1M tokens and up to 131.1K output tokens.

Reasoning, Tool Use, Implicit Caching, File Input, Vision (Image)

index.ts

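import { streamText } from 'ai'

// Request a streamed completion from MiMo M2.5 through AI Gateway.
const result = streamText({
  model: 'xiaomi/mimo-v2.5',
  prompt: 'Why is the sky blue?'
})

// Print each text chunk as it arrives.
for await (const textPart of result.textStream) {
  process.stdout.write(textPart)
}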


About MiMo M2.5

MiMo M2.5 is a MoE language model from Xiaomi, released April 22, 2026 under the MIT license. Each forward pass activates only a subset of the total parameters, which keeps per-token compute lower than a dense model at the same parameter count.
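To make the "subset of parameters" idea concrete, here is a minimal sketch of top-k expert routing, the general technique behind most MoE layers. The expert count, the value of k, the toy experts, and the names `topK` and `moeForward` are all hypothetical and chosen for illustration; MiMo M2.5's actual router and expert configuration are not described here.

```typescript
// Toy MoE layer: hypothetical sizes and names, not MiMo M2.5's real configuration.
type Expert = (x: number[]) => number[];

// Numerically stable softmax over a small list of logits.
function softmax(logits: number[]): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((l) => Math.exp(l - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Indices of the k largest router scores (ties keep their original order).
function topK(scores: number[], k: number): number[] {
  return scores
    .map((score, index) => ({ score, index }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((entry) => entry.index);
}

// Run only the k selected experts and mix their outputs by renormalized router weight.
function moeForward(
  x: number[],
  routerLogits: number[],
  experts: Expert[],
  k: number
): { output: number[]; active: number[] } {
  const active = topK(routerLogits, k);
  const weights = softmax(active.map((i) => routerLogits[i]));
  const output = new Array<number>(x.length).fill(0);
  active.forEach((expertIndex, j) => {
    const y = experts[expertIndex](x);
    y.forEach((v, d) => {
      output[d] += weights[j] * v;
    });
  });
  return { output, active };
}

// Four toy experts that scale their input by 1, 2, 3, and 4.
const experts: Expert[] = [1, 2, 3, 4].map((s) => (x: number[]) => x.map((v) => v * s));
const { output, active } = moeForward([1, 1], [0.1, 2.0, -1.0, 2.0], experts, 2);
console.log(active, output);
```

With k = 2 of 4 experts, only half of the expert parameters are touched for this token, which is the source of the compute saving relative to a dense layer of the same total size.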

The architecture uses hybrid attention, interleaving sliding-window and full attention to cut KV-cache storage at long sequence lengths. A multi-token prediction (MTP) block raises output tokens per step during inference. The full window of 1.1M tokens lets MiMo M2.5 reason over large documents, repos, or long agent trajectories.
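The KV-cache saving from interleaving can be estimated with simple arithmetic. The sketch below assumes a made-up layout (48 layers, every sixth layer using full attention, a 4,096-token sliding window); these numbers and the names `HybridConfig` and `cachedPositions` are assumptions for illustration, not MiMo M2.5's published configuration.

```typescript
// Hypothetical layer layout for illustration only.
interface HybridConfig {
  layers: number;       // total attention layers
  fullEvery: number;    // every Nth layer uses full attention; the rest use a sliding window
  windowTokens: number; // sliding-window size in tokens
}

// Count cached token positions, summed across all layers, for one sequence.
// Full-attention layers cache every position; sliding-window layers cache at most the window.
function cachedPositions(seqLen: number, cfg: HybridConfig): number {
  let total = 0;
  for (let layer = 0; layer < cfg.layers; layer++) {
    const isFull = (layer + 1) % cfg.fullEvery === 0;
    total += isFull ? seqLen : Math.min(seqLen, cfg.windowTokens);
  }
  return total;
}

const cfg: HybridConfig = { layers: 48, fullEvery: 6, windowTokens: 4096 };
const seqLen = 1_000_000;

const hybrid = cachedPositions(seqLen, cfg); // 8 full layers + 40 windowed layers
const allFull = seqLen * cfg.layers;         // baseline: every layer caches everything
const ratio = hybrid / allFull;

console.log(hybrid);           // 8163840
console.log(ratio.toFixed(3)); // "0.170"
```

Under these assumed numbers, the hybrid layout stores about 17% of the positions that an all-full-attention stack would at a one-million-token sequence, and the gap widens as the sequence grows because the windowed layers stop growing once the window is full.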

MiMo M2.5 supports reasoning, tool calling, file input, vision, and implicit prompt caching. Call it through Xiaomi or DeepInfra via AI Gateway. For the higher-capability tier, see mimo-v2.5-pro.
