LongCat Flash Thinking
LongCat Flash Thinking is Meituan's 560B MoE reasoning model. It combines Lean4 formal proof capability, agentic tool use, and an ARC-AGI score of 50.3 in a single architecture.
```typescript
import { streamText } from 'ai'

const result = streamText({
  model: 'meituan/longcat-flash-thinking',
  prompt: 'Why is the sky blue?',
})
```
Playground
Try out LongCat Flash Thinking by Meituan. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
About LongCat Flash Thinking
LongCat Flash Thinking unifies deep thinking, tool calling, and formal mathematical reasoning in a single architecture. The reasoning process itself can invoke tools mid-thought rather than thinking first and calling tools as a separate phase.
The Agentic Reasoning Framework implements dual-path inference. The model evaluates each task and autonomously chooses between direct reasoning and tool-augmented reasoning based on complexity. Callers don't configure this routing. The model applies tool invocation where it helps and direct reasoning where it doesn't, without caller overhead.
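From the caller's side, tool-augmented reasoning only requires registering tools; the model decides per request whether to invoke them mid-reasoning. A minimal sketch using the AI SDK's `tools` option — the `calculator` tool, its schema, and its handler are hypothetical, the option is named `inputSchema` in AI SDK 5 (older releases use `parameters`), and a live call requires the `ai` and `zod` packages plus AI Gateway credentials:

```typescript
import { streamText, tool } from 'ai'
import { z } from 'zod'

// Hypothetical tool: the model may call it mid-reasoning, or answer directly.
const result = streamText({
  model: 'meituan/longcat-flash-thinking',
  prompt: 'What is 37! divided by 35!?',
  tools: {
    calculator: tool({
      description: 'Evaluate a basic arithmetic expression',
      inputSchema: z.object({ expression: z.string() }),
      execute: async ({ expression }) => {
        // Stand-in handler: wire this to a real evaluator in production.
        return `result of ${expression}`
      },
    }),
  },
})
```

With tools registered, the dual-path routing decides per request whether to call `calculator` during thinking or reason directly; no extra routing configuration is needed.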
Benchmarks at release: ARC-AGI 50.3 (abstract pattern reasoning), LiveCodeBench 79.4 (competitive programming), τ²-Bench 74.0 (agentic tool use, as reported), and MiniF2F-test 67.6 pass@1 (formal mathematical proof via Lean4). Meituan also reported a 64.5% token-efficiency improvement in agentic tool-use settings while retaining 90% task accuracy. For methodology and updates, see the technical post.
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
Provider listings report the following metrics; visit the docs for more info:
- P50 throughput on live AI Gateway traffic, in tokens per second (TPS).
- P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds.
- Direct request success rate, on AI Gateway overall and per provider.
What To Consider When Choosing a Provider
- Configuration: Flash Thinking's extended reasoning traces increase response latency and per-response token consumption compared to Flash Chat. Configure request timeouts and per-session cost budgets accordingly.
- Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
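Given the longer reasoning traces, it helps to bound each call explicitly. One pattern is `AbortSignal.timeout` (Node 17.3+), whose signal can be passed to `streamText({ ..., abortSignal })` in the AI SDK; the sketch below demonstrates the abort behavior against a stand-in long-running call, and the millisecond values are illustrative, not recommendations:

```typescript
// Stand-in for a long-running model call; a real call would be
// streamText({ model: 'meituan/longcat-flash-thinking', abortSignal: signal, ... }).
function slowCall(signal: AbortSignal): Promise<string> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve('done'), 10_000)
    signal.addEventListener('abort', () => {
      clearTimeout(timer)
      reject(new Error('request timed out'))
    })
  })
}

async function main() {
  try {
    // Illustrative 100 ms budget; reasoning workloads warrant far longer.
    await slowCall(AbortSignal.timeout(100))
  } catch (err) {
    console.log((err as Error).message) // "request timed out"
  }
}
main()
```

Pair the wall-clock cap with a per-response token cap (the AI SDK exposes `maxOutputTokens`) to keep per-session cost budgets predictable.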
When to Use LongCat Flash Thinking
Best For
- Complex STEM reasoning: Physics, mathematics, and algorithmic problems where deliberative chain-of-thought improves accuracy over direct response
- Formal proof workflows: Lean4 verification where proof generation and verification need to integrate
- Competitive programming: Algorithmic challenges where LiveCodeBench-level performance is the benchmark
- Autonomous tool selection: Agentic workflows where dual-path inference mid-reasoning matters
- Traceable reasoning chains: Research tasks that need explicit reasoning over opaque responses
Consider Alternatives When
- Low-latency conversations: Conversational speed and low per-response latency are the priority (LongCat Flash Chat is the high-throughput direct-response variant)
- Simple instruction following: Tasks don't benefit from extended deliberation overhead
- Multimodal input required: You need image, audio, or video input alongside text
Conclusion
LongCat Flash Thinking combines multi-domain reasoning breadth with Lean4 integration for formally verified mathematical proofs. For workloads that require reasoning depth, formal verification, and autonomous tool-augmented thinking, it is the reasoning-focused option in the LongCat-Flash family. See Meituan's release notes.
Frequently Asked Questions
What does Lean4 formal proof capability enable in practice?
Lean4 is a proof assistant: mathematical claims are stated in its formal language and machine-checked. LongCat Flash Thinking scores 67.6 pass@1 on MiniF2F-test when generating Lean4 proofs, meaning it produces formally verifiable proofs rather than informal natural-language arguments. This matters for theorem proving, formal verification, and rigorous mathematical research.
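For concreteness, MiniF2F-style tasks ask the model to close a formally stated goal; the toy example below (far simpler than actual benchmark problems, which are olympiad-level) shows the shape of a Lean4 statement and a machine-checkable proof:

```lean
-- Toy goal: the sum of two even naturals is even.
-- Lean's kernel verifies the proof term; there is nothing to trust informally.
theorem even_add_even (a b : Nat)
    (ha : ∃ k, a = 2 * k) (hb : ∃ k, b = 2 * k) :
    ∃ k, a + b = 2 * k := by
  cases ha with
  | intro m hm =>
    cases hb with
    | intro n hn =>
      exact ⟨m + n, by rw [hm, hn, Nat.mul_add]⟩
```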
What is the Agentic Reasoning Framework's dual-path inference?
The model autonomously decides whether each task benefits from direct reasoning or tool invocation during the thinking process. Callers don't configure this routing. Meituan reported a 64.5% token efficiency gain in agent tool-use settings while retaining 90% task accuracy.
What are the key benchmark scores for LongCat Flash Thinking?
ARC-AGI: 50.3; LiveCodeBench: 79.4; τ²-Bench: 74.0 (reported at release); MiniF2F-test: 67.6 pass@1 on formal mathematical proof. Full tables are in the technical post.
Is LongCat Flash Thinking open-source?
Yes. Weights and licensing are published alongside Meituan's technical post.