MiniMax M2.5 High Speed targets autonomous coding agents that run for extended periods and need fast token generation. See live metrics on this page for current throughput. For cost estimates, use and your expected token volumes rather than a fixed hourly figure.
The "highspeed" label doesn't indicate a distilled or reduced-capability model. MiniMax M2.5 High Speed retains the full architectural planning mode of standard M2.5. It decomposes problems into specifications before writing code, handles the complete development lifecycle across Web, Android, iOS, Windows, and Mac platforms, and matches the same reported SWE-Bench Verified score as standard M2.5.
The tradeoff is straightforward: you pay roughly twice as much per token in exchange for generating tokens roughly twice as fast on paper. For batch jobs where wall-clock time doesn't matter, standard M2.5 is more economical. For interactive sessions, streaming UIs, and agent loops where latency compounds, the highspeed variant can win.