Standard language models produce answers directly. Input goes in, output comes out, and whatever reasoning occurred stays invisible. Kimi K2 Thinking changes the output structure. Before generating its final answer, the model produces an explicit chain-of-thought (CoT) trace: a written record of how it decomposes the problem, what options it considers, and how it reaches its conclusion.
This isn't a prompting trick. The thinking behavior is trained into the model. When K2 Thinking encounters a hard problem, its reasoning trace can run for hundreds or thousands of tokens as the model works through sub-problems, backtracks from dead ends, and synthesizes intermediate results. The final answer follows the trace.
Two practical consequences follow. First, explicit decomposition helps on problems where the answer depends on getting intermediate steps right: multi-step mathematical proofs, algorithmic design, and debugging sessions where the root cause isn't obvious. Second, the reasoning trace is an output in its own right: you can log it, audit it, or score it in evaluations.
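As a sketch of that second point, the snippet below splits a chat response into its reasoning trace and final answer and builds a log entry for later auditing. It assumes an OpenAI-style response dict where the trace is exposed as a `reasoning_content` field next to `content`; field names vary by provider, so treat this as illustrative rather than the exact payload shape.

```python
import json
import time

def audit_record(response: dict) -> dict:
    """Split a chat response into reasoning trace and final answer,
    and build a log entry suitable for auditing or evaluation.

    Assumes an OpenAI-style response where the thinking trace appears
    as `reasoning_content` alongside `content` (an assumption here --
    adapt to the payload your provider actually returns)."""
    message = response["choices"][0]["message"]
    trace = message.get("reasoning_content", "")
    answer = message.get("content", "")
    return {
        "timestamp": time.time(),
        "trace_words": len(trace.split()),  # rough size; not true token count
        "trace": trace,
        "answer": answer,
    }

# Mocked response payload (hypothetical content, no network call):
mock = {
    "choices": [{
        "message": {
            "reasoning_content": "Decompose: 17 * 23 = 17 * 20 + 17 * 3 = 340 + 51.",
            "content": "391",
        }
    }]
}
record = audit_record(mock)
print(json.dumps({"trace_words": record["trace_words"], "answer": record["answer"]}))
```

Because the record keeps the trace and the answer separate, an evaluation harness can score the final answer while archiving the trace for failure analysis.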
K2 Thinking supports long chains of sequential tool calls within a single agentic session. The model reasons about what tool to call next, observes the result, reasons about the implications, and continues. It maintains coherent task state across more interaction steps than many non-thinking models handle.
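The reason-act-observe loop described above can be sketched in a few lines. Everything here is a stand-in: `model_step` is a scripted policy rather than a real model call, and the `search` tool is a stub. A real session would call the provider's chat API with tool schemas, but the control flow is the same.

```python
def run_agent(model_step, tools, task, max_steps=10):
    """Drive a sequential tool-calling session until the model answers.

    `model_step` maps the transcript so far to either
    ("call", tool_name, argument) or ("answer", text).
    All names here are illustrative, not a provider API."""
    transcript = [("task", task)]
    for _ in range(max_steps):
        action = model_step(transcript)
        if action[0] == "answer":
            return action[1], transcript
        _, name, arg = action
        observation = tools[name](arg)          # execute the chosen tool
        transcript.append(("call", name, arg))  # record the action...
        transcript.append(("observe", observation))  # ...and its result
    return None, transcript  # step budget exhausted

# Scripted stand-in policy: look something up, then answer with the result.
def scripted_model(transcript):
    observations = [step for step in transcript if step[0] == "observe"]
    if not observations:
        return ("call", "search", "capital of France")
    return ("answer", observations[-1][1])

tools = {"search": lambda query: "Paris"}  # stub for a real search tool
answer, transcript = run_agent(scripted_model, tools, "What is the capital of France?")
print(answer)
```

The transcript is the task state: each call and observation is appended, so the model's next decision always conditions on the full interaction history, which is what long sequential tool chains depend on.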
The model is open source under Moonshot AI's license terms.
Kimi K2 Thinking is available through AI Gateway at $0.60 per million input tokens and $2.50 per million output tokens.
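The quoted rates make cost estimation a one-line calculation. The token counts in the example are made up for illustration; note that if the reasoning trace is billed as output tokens (common for thinking models), long traces dominate the bill.

```python
INPUT_RATE = 0.60 / 1_000_000   # dollars per input token
OUTPUT_RATE = 2.50 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the quoted per-token rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Hypothetical request: 2,000 input tokens, 8,000 output tokens
# (trace plus answer) -> $0.0012 + $0.0200 = $0.0212
print(f"${request_cost(2_000, 8_000):.4f}")
```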