MiniMax M2.5 High Speed
MiniMax M2.5 High Speed is the throughput-optimized variant that retains M2.5's full planning and software engineering capabilities.
```typescript
import { streamText } from 'ai'

const result = streamText({
  model: 'minimax/minimax-m2.5-highspeed',
  prompt: 'Why is the sky blue?',
})
```
Playground
Try out MiniMax M2.5 High Speed by MiniMax. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
About MiniMax M2.5 High Speed
MiniMax M2.5 High Speed targets autonomous coding agents that run for extended periods and need fast token generation. See the live metrics on this page for current throughput. For cost estimates, combine the live pricing on this page with your expected token volumes rather than relying on a fixed hourly figure.
The "highspeed" label doesn't indicate a distilled or reduced-capability model. MiniMax M2.5 High Speed retains the full architectural planning mode of standard M2.5. It decomposes problems into specifications before writing code, handles the complete development lifecycle across Web, Android, iOS, Windows, and Mac platforms, and matches the same reported SWE-Bench Verified score as standard M2.5.
The tradeoff is straightforward: you pay roughly twice as much per token in exchange for generating tokens roughly twice as fast on paper. For batch jobs where wall-clock time doesn't matter, standard M2.5 is more economical. For interactive sessions, streaming UIs, and agent loops where latency compounds, the highspeed variant can win.
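As a back-of-the-envelope sketch of that tradeoff (every price and throughput figure below is a hypothetical placeholder, not a published rate), you can estimate the wall-clock time and dollar cost of a job under each variant:

```typescript
// Hypothetical figures for illustration only -- check the live pricing
// panel and throughput metrics on this page for real numbers.
interface Variant {
  costPerMTok: number // USD per million output tokens
  tokensPerSec: number // sustained generation throughput
}

// Wall-clock seconds and dollar cost to generate `tokens` output tokens.
function jobEstimate(v: Variant, tokens: number) {
  return {
    seconds: tokens / v.tokensPerSec,
    cost: (tokens / 1_000_000) * v.costPerMTok,
  }
}

const standard: Variant = { costPerMTok: 1.0, tokensPerSec: 50 } // placeholder
const highspeed: Variant = { costPerMTok: 2.0, tokensPerSec: 100 } // placeholder

const job = 2_000_000 // two million output tokens
console.log(jobEstimate(standard, job)) // { seconds: 40000, cost: 2 }
console.log(jobEstimate(highspeed, job)) // { seconds: 20000, cost: 4 }
```

Twice the price for half the wall-clock time: whether that is a good deal depends entirely on what the saved seconds are worth in your workload.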
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
Metrics shown on this page:
- P50 throughput on live AI Gateway traffic, in tokens per second (TPS)
- P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds
- Direct request success rate, on AI Gateway overall and per provider

Visit the docs for more info on how these are measured.
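The P50 figures are medians over live traffic. As a minimal sketch of how such a percentile is computed from raw per-request samples (the sample values below are made up, not real gateway data):

```typescript
// Nearest-rank percentile: sort the samples, then pick the value at
// rank ceil(p/100 * n). For p = 50 this yields the P50 used above.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b)
  const rank = Math.ceil((p / 100) * sorted.length)
  return sorted[Math.max(0, rank - 1)]
}

// Hypothetical per-request measurements for illustration.
const tps = [82, 95, 110, 74, 101] // tokens per second
const ttftMs = [230, 310, 180, 400, 260] // time to first token, ms

console.log(percentile(tps, 50)) // 95
console.log(percentile(ttftMs, 50)) // 260
```

The median is used rather than the mean so that a few unusually slow or fast requests don't skew the headline number.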
What To Consider When Choosing a Provider
- Pricing: The highspeed variant lists at roughly double the standard M2.5 input and output rates on many providers. AI Gateway's per-request cost tracking lets you measure whether the throughput gain pays for itself in your workload before you commit at scale.
- Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
When to Use MiniMax M2.5 High Speed
Best For
- Long-running coding agents: Autonomous sessions that span multi-minute or multi-hour workflows
- Streaming pair-programming: Chat interfaces where token delivery speed is user-visible
- High-concurrency services: Per-request latency accumulates across simultaneous users
- Wall-clock time priority: Halving inference time justifies a 2x per-token cost increase
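To see why latency compounds in agent loops, consider an agent that makes many sequential model calls: per-step generation time adds up directly (all figures below are hypothetical placeholders, not measured values):

```typescript
// Total wall-clock generation time for an agent that runs `steps`
// sequential model calls, each producing `tokensPerStep` output tokens.
function agentWallClockSec(
  steps: number,
  tokensPerStep: number,
  tokensPerSec: number,
  ttftSec: number,
): number {
  return steps * (ttftSec + tokensPerStep / tokensPerSec)
}

// Placeholder throughput numbers -- consult the live metrics on this page.
const slower = agentWallClockSec(100, 1500, 50, 0.5) // 100 steps at 50 TPS
const faster = agentWallClockSec(100, 1500, 100, 0.5) // same agent at 100 TPS

console.log(slower) // 3050 seconds
console.log(faster) // 1550 seconds
```

A one-off query barely notices the difference, but over a hundred sequential steps the faster variant saves on the order of minutes to hours, which is where the per-token premium can pay for itself.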
Consider Alternatives When
- Fast tasks anyway: Your tasks complete in seconds regardless of throughput, so the speed premium has no practical impact
- Cost over latency: Per-token cost is a harder constraint than latency and standard M2.5 delivers the same output for less
- Multi-agent coordination: You need the features introduced in M2.7
Conclusion
MiniMax M2.5 High Speed occupies a clear niche: same model, faster output, higher price. It's the right pick when your agent or application is bottlenecked on token generation speed and the cost difference is justified by reduced wall-clock time or improved user experience.
Frequently Asked Questions
Is there any quality difference between MiniMax M2.5 High Speed and standard M2.5?
None. Both variants share the same architecture, planning capabilities, and benchmark scores. The "highspeed" designation reflects inference throughput only.
What development tasks does MiniMax M2.5 High Speed cover?
MiniMax M2.5 High Speed handles the full development lifecycle: specification writing, code generation, debugging, and deployment across Web, Android, iOS, Windows, and Mac platforms.
How does the pricing compare to standard M2.5?
Check the pricing panel on this page for today's numbers. AI Gateway tracks rates across every provider that serves MiniMax M2.5 High Speed.
What throughput can I expect?
Live throughput metrics display on this page and update based on real traffic.
Should I use MiniMax M2.5 High Speed or M2.7 Highspeed?
If you don't need M2.7's multi-agent orchestration or dynamic tool search, MiniMax M2.5 High Speed is more cost-effective at comparable speeds.
Can I try this model before integrating it?
Yes. Open https://ai-sdk.dev/playground/minimax:minimax-m2.5-highspeed to evaluate output quality and perceived speed interactively.