LongCat Flash Chat is Meituan's 560B Mixture-of-Experts (MoE) conversational model that activates roughly 27B parameters per token on average. It targets high-throughput agentic tool use and complex multi-step interactions under an MIT license.
LongCat Flash Chat is the direct-response conversational variant of Meituan's LongCat-Flash series. It answers immediately rather than generating extended internal reasoning chains. The architecture uses a zero-computation expert gating mechanism to activate between 18.6B and 31.3B parameters per token (roughly 27B on average) out of the 560B total, so per-token compute scales with the active parameters rather than the full model size; see [pricing] for current rates. The full 560B parameters still shape the model's knowledge and generalization.
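The gating idea can be sketched in a few lines. This is an illustrative toy, not LongCat-Flash's actual router: the expert counts, top-k value, and per-expert sizes below are made-up numbers chosen to show how mixing "zero-computation" (identity) experts into the pool makes the active parameter count vary per token.

```python
import random

# Hypothetical pool: some experts run an FFN, others are identity
# pass-throughs that cost no compute. All sizes are illustrative.
REAL_EXPERTS = 8       # experts with parameters (cost > 0)
ZERO_EXPERTS = 4       # zero-computation identity experts
TOP_K = 2              # experts selected per token
PARAMS_PER_EXPERT = 3  # billions, illustrative only

def route_token(scores):
    """Pick the top-k expert indices for one token from router scores."""
    ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    return ranked[:TOP_K]

def active_params(chosen):
    """Only real experts add compute; zero-computation experts add none."""
    return sum(PARAMS_PER_EXPERT for e in chosen if e < REAL_EXPERTS)

random.seed(0)
for _ in range(3):
    scores = [random.random() for _ in range(REAL_EXPERTS + ZERO_EXPERTS)]
    chosen = route_token(scores)
    # Active parameters per token vary with how many real experts won.
    print(chosen, active_params(chosen))
```

When the router sends a token only to zero-computation experts, that token consumes almost no expert compute, which is how the activated range can dip well below the average.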
The design emphasizes agentic tool use and reliable instruction following across sequential steps. In practice, the model handles structured function calls consistently across many turns, maintains task state through long tool-augmented conversations, and keeps response formatting stable. These properties matter when an agent invokes tools dozens of times in a session, where small formatting drift would otherwise compound into failures.
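On the client side, stable call formatting is what lets a simple dispatch loop run unchanged for many turns. The sketch below assumes a generic JSON tool-call shape with hypothetical tool names; it is not a fixed LongCat schema.

```python
import json

def get_weather(city: str) -> str:
    """Stand-in for a real tool; the name and behavior are hypothetical."""
    return f"sunny in {city}"

TOOLS = {"get_weather": get_weather}

def dispatch(tool_call_json: str) -> str:
    """Parse one structured tool call and run the matching function."""
    call = json.loads(tool_call_json)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# A model that keeps its call format stable lets this loop run for
# dozens of turns without special-casing malformed output.
result = dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}')
```

The result would be appended to the conversation as a tool message and the model queried again, repeating until it returns a plain answer instead of a call.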
Meituan released LongCat Flash Chat under an MIT license. Model weights are publicly available; see the upstream listing. Access it through AI Gateway with one API key.
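A minimal request sketch, assuming an OpenAI-compatible chat-completions endpoint: the base URL and model slug below are placeholders, not confirmed values; check the gateway's own documentation for the exact ones.

```python
import json

# Placeholder endpoint and slug; substitute the gateway's real values.
BASE_URL = "https://example-gateway.invalid/v1/chat/completions"
payload = {
    "model": "meituan/longcat-flash-chat",  # assumed slug
    "messages": [
        {"role": "user", "content": "Summarize this ticket in one line."}
    ],
}
body = json.dumps(payload)
# Send `body` as the POST body with your gateway API key in the
# Authorization header, using urllib.request or any HTTP client.
```

The same key and request shape work across models behind the gateway; only the `model` field changes.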