Kimi K2 0905
Kimi K2 0905 is Moonshot AI's September 2025 K2 checkpoint, a refined release focused on agentic coding workflows. It has a context window of 262.1K tokens and is available through AI Gateway via baseten, fireworks, groq, and moonshotai.
import { streamText } from 'ai'
const result = streamText({ model: 'moonshotai/kimi-k2-0905', prompt: 'Why is the sky blue?' })
Playground
Try out Kimi K2 0905 by Moonshot AI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
About Kimi K2 0905
Kimi K2 0905 carries a date stamp (September 5, 2025), following Moonshot AI's convention of identifying checkpoints by release date. The 0905 release is a distinct checkpoint from the original K2, not a silent in-place update.
The context window of 262.1K tokens is the main structural change. Agentic coding sessions accumulate context quickly: task descriptions, file contents, tool outputs, reasoning steps, and error messages stack up over many turns. A window this size can hold project-scale context without truncation, which matters when an agent reviews multiple files, runs tests, and iterates on fixes across a long session.
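To make the accumulation concrete, here is a minimal sketch of a budget check an agent loop might run before each call. It rests on two assumptions that are not from this page: that 262.1K means 262,144 tokens, and that roughly four characters make one token. The names `ContextItem`, `estimateTokens`, and `contextBudget` are hypothetical, and a real tokenizer would give different counts.

```typescript
// Assumption: "262.1K" on this page is read as 262,144 tokens (2^18).
const CONTEXT_WINDOW_TOKENS = 262144;

// Assumption: ~4 characters per token. This is a crude heuristic,
// not the model's actual tokenizer.
const CHARS_PER_TOKEN = 4;

// Hypothetical shape for the kinds of context an agent accumulates.
interface ContextItem {
  kind: 'task' | 'file' | 'tool-output' | 'reasoning' | 'error';
  text: string;
}

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Sums the estimated tokens and reports the headroom left after
// reserving space for the model's reply.
function contextBudget(
  items: ContextItem[],
  reservedForOutput: number = 8192,
): { used: number; remaining: number; fits: boolean } {
  const used = items.reduce((sum, item) => sum + estimateTokens(item.text), 0);
  const remaining = CONTEXT_WINDOW_TOKENS - reservedForOutput - used;
  return { used, remaining, fits: remaining >= 0 };
}
```

An agent could call `contextBudget` before each turn and summarize or drop old tool outputs only when `fits` turns false.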
For teams already using the base K2, the 0905 checkpoint is a drop-in upgrade. It brings training refinements and the extended context window. The API interface, tool-calling format, and integration patterns stay the same. Switch by updating the model string to moonshotai/kimi-k2-0905.
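One way to keep that switch to a single-line change is to centralize the model string. The sketch below is illustrative only: `buildRequest` and `RequestOptions` are hypothetical names, and the base-K2 slug `moonshotai/kimi-k2` is an assumption that should be checked against the gateway's model list.

```typescript
// Assumption: slug of the base K2 model; verify it before relying on it.
const BASE_K2 = 'moonshotai/kimi-k2';
// Model string given on this page for the 0905 checkpoint.
const K2_0905 = 'moonshotai/kimi-k2-0905';

// Hypothetical options shape, mirroring the fields in the page's
// streamText example (model and prompt).
interface RequestOptions {
  model: string;
  prompt: string;
}

// Every call site goes through this helper, so upgrading the checkpoint
// means changing only the default value below.
function buildRequest(prompt: string, model: string = K2_0905): RequestOptions {
  return { model, prompt };
}

// Before the upgrade: buildRequest('Refactor utils.ts', BASE_K2)
// After the upgrade:  buildRequest('Refactor utils.ts')
const upgraded = buildRequest('Refactor utils.ts');
```

The returned object can then be passed to `streamText` from the `ai` package, as in the example at the top of this page.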
The narrower provider set is the main operational difference. This checkpoint routes across baseten, fireworks, groq, and moonshotai, while the base K2 covers a wider set. If you need the largest failover coverage, weigh that against the 0905 training improvements.
Kimi K2 0905 is available through AI Gateway at $1.0 per million input tokens and $3.0 per million output tokens.
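As a worked example of those rates, the sketch below estimates the cost of a single request. The function name `estimateCostUSD` is hypothetical, and the figures are the two prices quoted above; actual billing may differ.

```typescript
// Rates quoted on this page, in USD per million tokens.
const INPUT_USD_PER_MILLION = 1.0;
const OUTPUT_USD_PER_MILLION = 3.0;

// Estimated cost of one request: input and output tokens are priced separately.
function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1000000) * INPUT_USD_PER_MILLION;
  const outputCost = (outputTokens / 1000000) * OUTPUT_USD_PER_MILLION;
  return inputCost + outputCost;
}
```

For instance, a 200,000-token prompt with a 4,000-token reply comes to about $0.20 + $0.012 = $0.212. Long agentic sessions resend accumulated context on every turn, so input tokens usually dominate the total.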
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
| Provider |
|---|
| baseten |
| fireworks |
| groq |
| moonshotai |
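A provider preference might be expressed as in the sketch below. It assumes the gateway accepts an ordered list of slugs under `providerOptions.gateway.order`; that option name is an assumption here and should be confirmed in the AI Gateway docs. `gatewayProviderOptions` and `ProviderSlug` are hypothetical names.

```typescript
// Provider slugs listed on this page for Kimi K2 0905.
type ProviderSlug = 'baseten' | 'fireworks' | 'groq' | 'moonshotai';

// Builds a provider-preference object.
// Assumption: the gateway reads an ordered slug list from `gateway.order`.
function gatewayProviderOptions(
  preferred: ProviderSlug[],
): { gateway: { order: ProviderSlug[] } } {
  // Drop duplicate slugs while keeping the caller's ordering.
  const order = preferred.filter((slug, i) => preferred.indexOf(slug) === i);
  return { gateway: { order } };
}

// Prefer groq first, then fireworks.
const providerOptions = gatewayProviderOptions(['groq', 'fireworks']);
```

The resulting object would be passed as the `providerOptions` field alongside `model` and `prompt` in a `streamText` call.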
AI Gateway reports three metrics from live traffic, overall and per provider: P50 throughput in tokens per second (TPS), P50 time to first token (TTFT) in milliseconds, and direct request success rate. Visit the docs for more info.
What To Consider When Choosing a Provider
- Configuration: This checkpoint routes across fewer providers than the base K2. Monitor provider-level status during high-demand periods if you observe elevated latency.
- Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
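The authentication point can be sketched as a small credential lookup. The environment variable names `AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN`, and the choice to prefer the API key, are assumptions in this sketch and not taken from this page; `resolveGatewayCredential` is a hypothetical helper, and an SDK may already do this for you.

```typescript
// The two credential kinds the page mentions: an API key or an OIDC token.
type GatewayCredential =
  | { kind: 'api-key'; token: string }
  | { kind: 'oidc'; token: string };

// Environment passed in explicitly so the function is easy to test.
type Env = Record<string, string | undefined>;

// Assumption: variable names and API-key-first precedence are illustrative.
function resolveGatewayCredential(env: Env): GatewayCredential {
  const apiKey = env['AI_GATEWAY_API_KEY'];
  if (apiKey) return { kind: 'api-key', token: apiKey };

  const oidc = env['VERCEL_OIDC_TOKEN'];
  if (oidc) return { kind: 'oidc', token: oidc };

  throw new Error('No AI Gateway credential found: set an API key or an OIDC token.');
}
```

Either way, only a gateway credential is involved; no baseten, fireworks, groq, or moonshotai keys appear in application code.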
When to Use Kimi K2 0905
Best For
- Long agentic sessions: Accumulated context (tool outputs, file contents, multi-turn history) pushes beyond the base K2 context window
- Newer training refinements: Workloads that target the agentic coding improvements in the September 5, 2025 checkpoint
- Full-codebase review: Multi-file code review or generation where the context of 262.1K tokens enables a complete codebase view in one call
- Drop-in upgrade: Existing base K2 integrations seeking a direct upgrade to the newer checkpoint
Consider Alternatives When
- Chain-of-thought traces: Kimi K2 Thinking variants are designed for explicit reasoning output
- Maximum routing redundancy: Base Kimi K2 routes across a wider provider set than this checkpoint
- Fastest K2 inference: Kimi K2 Turbo is the speed-optimized variant
- Shorter context needs: Tasks that don't require the full 262.1K tokens benefit from base K2's broader failover pool
Conclusion
Kimi K2 0905 pairs the agentic coding refinements of the September 5, 2025 checkpoint with a context window of 262.1K tokens, enough to hold the long histories of extended coding agent sessions. For teams running base K2 in agentic coding workflows, it is the updated checkpoint with the larger context window. Switch by changing the model string to moonshotai/kimi-k2-0905; no other integration changes are needed.
Frequently Asked Questions
What was the focus of the 0905 checkpoint update?
Agentic coding. The checkpoint refines how the model handles multi-step development tasks, tool use in coding workflows, and sustained context across long coding sessions.
Why does the context window of 262.1K tokens matter for agentic coding specifically?
Coding agents accumulate context rapidly: file contents, function signatures, test outputs, error logs, and multi-turn reasoning traces all consume tokens. A window of 262.1K tokens keeps a much larger project scope in context at once, which cuts truncation workarounds.
How does switching from base K2 to kimi-k2-0905 work?
Update the model string in your API call to moonshotai/kimi-k2-0905. Authentication, tool-calling format, and the rest of the integration stay the same.
What providers serve this checkpoint through AI Gateway?
AI Gateway routes Kimi K2 0905 across baseten, fireworks, groq, and moonshotai. Failover between them is automatic.
Is the 0905 checkpoint open-weight?
Yes, in the same lineage as other open-weight K2-family models. Check Moonshot AI's Hugging Face repository for license terms specific to this checkpoint.
Does kimi-k2-0905 support tool calling?
Yes. Tool calling through the standard function-calling interface matches the agentic coding focus of the 0905 training refinements.
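To illustrate the application side of function calling, the sketch below defines one tool and dispatches a model-issued call to it. The shapes `ToolDefinition` and `ToolCall` follow the common JSON-schema function-calling pattern and are assumptions, not this model's documented wire format. The `read_file` tool and the in-memory `files` map are hypothetical.

```typescript
// Assumed shape: a named function with a JSON-schema-style parameter list.
interface ToolDefinition {
  name: string;
  description: string;
  parameters: {
    type: 'object';
    properties: Record<string, { type: string }>;
    required: string[];
  };
  execute: (args: Record<string, unknown>) => string;
}

// Assumed shape of a call emitted by the model: name plus JSON-encoded arguments.
interface ToolCall {
  name: string;
  arguments: string;
}

// Hypothetical in-memory workspace standing in for a real file system.
const files: Record<string, string> = {
  'src/index.ts': 'export const answer = 42;',
};

const readFileTool: ToolDefinition = {
  name: 'read_file',
  description: 'Return the contents of a file in the workspace.',
  parameters: { type: 'object', properties: { path: { type: 'string' } }, required: ['path'] },
  execute: (args) => files[String(args.path)] ?? 'error: no such file ' + String(args.path),
};

// Looks up the requested tool, validates required arguments, and runs it.
// Errors are returned as strings so they can be fed back to the model.
function dispatch(call: ToolCall, tools: ToolDefinition[]): string {
  const tool = tools.find((t) => t.name === call.name);
  if (!tool) return 'error: unknown tool ' + call.name;

  const args = JSON.parse(call.arguments) as Record<string, unknown>;
  for (const key of tool.parameters.required) {
    if (!(key in args)) return 'error: missing argument ' + key;
  }
  return tool.execute(args);
}
```

In an agent loop, each result from `dispatch` is appended to the conversation as a tool message, which is one reason context grows quickly over a long session.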
What if the context of 262.1K tokens is more than my tasks need?
If context length isn't a constraint, the base Kimi K2 routes across a wider provider set and may give more availability headroom for high-uptime production use.