Technology

Meituan open-sources LongCat-2.0, a 1.6-trillion-parameter model it says it trained end to end on Chinese chips

The benchmark table is Meituan's own and still unverified. The claim worth checking is the hardware: a full pretraining run, not just inference, on a 50,000-card domestic cluster.

Janet Torvalds

July 1, 2026

Meituan, the Beijing company most people know as China's food-delivery giant, released a large language model on June 30 called LongCat-2.0. It has 1.6 trillion parameters, ships with open weights on Hugging Face and GitHub, and comes wrapped in a claim that matters more than the size: Meituan says it ran the entire training job, not just inference, on domestically made Chinese chips.

That last part is the story. Plenty of Chinese labs now serve models on homegrown silicon. Pretraining a frontier-scale model on it is the harder problem, and the one export controls were meant to keep out of reach.

What the model actually is

LongCat-2.0 is a sparse mixture-of-experts (MoE) model, the same broad design DeepSeek and Mistral's Mixtral use. The 1.6 trillion figure is the total parameter count, but the model does not activate all of those parameters for every token. An internal router picks a subset, roughly 48 billion parameters per token, which keeps the compute cost of inference from scaling with the headline number. That is why a 1.6-trillion-parameter MoE model can be cheaper to run than its size suggests, and why "1.6 trillion" on its own tells you very little about how expensive it is to serve.
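To make the total-versus-active distinction concrete, here is a minimal Python sketch of top-k expert routing. The two parameter counts are the figures quoted above. The expert count, the choice of four experts per token, and the random router scores are hypothetical stand-ins, not LongCat-2.0's actual routing configuration, which this article does not cover.

```python
import math
import random

TOTAL_PARAMS = 1.6e12   # headline parameter count quoted in the article
ACTIVE_PARAMS = 48e9    # approximate per-token active count quoted in the article


def active_fraction(active, total):
    """Share of the model's parameters that run for a single token."""
    return active / total


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}


# Hypothetical layer: 64 experts, 4 selected per token (illustration only).
rng = random.Random(0)
scores = [rng.gauss(0.0, 1.0) for _ in range(64)]
weights = route_top_k(scores, k=4)

print(f"active share per token: {active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.1%}")
print(f"experts used for this token: {sorted(weights)}")
```

With the article's figures, the active share works out to about 3 percent of the total parameters per token. Only the selected experts do any computation for that token; the rest sit idle.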

The one specification that is genuinely unusual is the context window: one million tokens. For comparison, DeepSeek-R1-0528 and OpenAI's open-weight GPT-OSS top out around 128,000. Meituan is positioning LongCat as the reasoning core for coding agents and long-horizon task tools, the kind of work that eats context, so the long window is a design choice aimed at that use, not a spec-sheet flex.

At 1.6 trillion parameters it is not running on your laptop or, realistically, most on-premises hardware. It lives in a data center, sharded across an inference cluster.
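A back-of-the-envelope calculation shows why. The sketch below assumes nothing about how Meituan ships or serves the weights: the bit widths and the 80 GB accelerator are hypothetical, and it counts weights only, ignoring the KV cache and activations.

```python
import math

TOTAL_PARAMS = 1.6e12         # headline parameter count quoted in the article
DEVICE_MEMORY_BYTES = 80e9    # hypothetical 80 GB accelerator, not a named product


def weight_bytes(params, bits_per_param):
    """Memory needed just to hold the weights at a given precision."""
    return params * bits_per_param / 8


def min_devices(total_bytes, device_bytes):
    """Lower bound on devices needed to hold the weights, with no headroom."""
    return math.ceil(total_bytes / device_bytes)


# Hypothetical precisions; the article does not say which one the release uses.
for bits in (16, 8, 4):
    size = weight_bytes(TOTAL_PARAMS, bits)
    devices = min_devices(size, DEVICE_MEMORY_BYTES)
    print(f"{bits}-bit weights: {size / 1e12:.1f} TB, at least {devices} devices")
```

At 16 bits per parameter the weights alone come to 3.2 TB, or at least 40 of the hypothetical 80 GB devices. In a mixture-of-experts model every expert has to be resident in memory even though only a few run per token, so it is the total parameter count, not the active count, that sets the memory bill.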

The hardware claim, and why it is the news

Meituan says it trained LongCat-2.0 from scratch on a 50,000-card cluster of domestic AI ASICs it calls "superpods," across more than 30 trillion tokens, with no rollbacks or unrecoverable loss spikes. Reporting ties the stack to Huawei technology, including Huawei's collective communication library and Atlas-class superpods.
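For a rough sense of the scale involved, a widely used rule of thumb puts training compute at about six floating-point operations per active parameter per token. Applying it here is an assumption made for illustration, not a figure Meituan has published. The sketch uses the roughly 48 billion active parameters per token reported for the model and treats the 30 trillion tokens as a lower bound.

```python
ACTIVE_PARAMS = 48e9       # approximate active parameters per token (article figure)
TRAINING_TOKENS = 30e12    # "more than 30 trillion", so treated as a lower bound
CARDS = 50_000             # cluster size quoted in the article


def training_flops(active_params, tokens, flops_per_param_token=6):
    """Rule-of-thumb estimate: ~6 FLOPs per active parameter per token."""
    return flops_per_param_token * active_params * tokens


total_flops = training_flops(ACTIVE_PARAMS, TRAINING_TOKENS)
tokens_per_card = TRAINING_TOKENS / CARDS
flops_per_card = total_flops / CARDS

print(f"estimated training compute: {total_flops:.2e} FLOPs")
print(f"tokens per card: {tokens_per_card:.2e}")
print(f"compute per card: {flops_per_card:.2e} FLOPs")
```

Under these assumptions the run comes to roughly 8.6e24 floating-point operations, or about 600 million tokens per card. How long that takes depends on per-card throughput and utilization, neither of which is given here.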

The distinction worth holding onto: earlier this year a Huawei-led group post-trained DeepSeek's 1.6-trillion-parameter V4-Pro on Ascend chips, and DeepSeek adapted V4 to run inference on Huawei silicon. Those are real milestones, but they are the back half of the pipeline. LongCat-2.0 is being described as the first trillion-parameter model to complete both full pretraining and inference on domestic hardware. Training is the step Chinese firms have had the most trouble moving off Nvidia, because it stresses interconnect bandwidth and software maturity in ways that inference does not. If the claim holds, that gap just narrowed.

It is a claim, though. Meituan has published the number of cards and tokens, not an independent audit of the run.

The benchmarks are Meituan's own

Meituan put out a benchmark table placing LongCat-2.0 next to Google's Gemini, OpenAI's GPT-5.5, and Anthropic's Claude Opus, and cited a 59.5 on SWE-bench Pro against 58.6 for GPT-5.5. Take that as a vendor's self-report until someone outside the company reproduces it. A first-party score that clears a competitor by nine tenths of a point is exactly the kind of result that needs third-party confirmation before it means anything. The weights are open, so that check is possible. It just has not happened yet.
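One way to see why a margin that small needs confirmation is to compare it with ordinary sampling noise. The sketch below is a rough illustration, not an analysis of the actual evaluation: the task count is a hypothetical placeholder, and it treats the two scores as independent pass rates, which is a simplification when both models are run on the same tasks.

```python
import math

LONGCAT_SCORE = 0.595   # 59.5, as reported by Meituan
GPT_SCORE = 0.586       # 58.6, as cited by Meituan
N_TASKS = 1000          # hypothetical task count, chosen only for illustration


def diff_standard_error(p1, p2, n):
    """Standard error of the gap between two independent pass rates on n tasks."""
    return math.sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)


gap = LONGCAT_SCORE - GPT_SCORE
se = diff_standard_error(LONGCAT_SCORE, GPT_SCORE, N_TASKS)

print(f"gap: {gap * 100:.1f} points")
print(f"standard error of the gap: {se * 100:.1f} points")
print(f"gap / standard error: {gap / se:.2f}")
```

With these placeholder numbers the gap is well under one standard error, which is to say it is the size of difference that can appear or vanish between runs.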

It is worth separating the two things Meituan is asking you to believe. One is that it trained a trillion-parameter model end to end on Chinese chips, which is a hardware-and-logistics claim. The other is that the resulting model matches Western frontier systems, which is a quality claim. The first is the more interesting one, and it does not depend on the second. A model can be a genuine milestone for domestic training and still land a step behind GPT-5.5 in practice.

Why it lands now

The backdrop is Nvidia's slipping position in China. Bernstein estimated Nvidia held about 40 percent of the Chinese AI-chip market in 2025, roughly matched by Huawei, and projected Nvidia's share falling several points this year as export-control turbulence pushes buyers toward domestic parts. Nvidia hardware can currently flow into China, but the on-again, off-again nature of that access is enough to make a company like Meituan build and, more to the point, advertise a training pipeline that does not touch CUDA.

The open weights are part of the message. Meituan does not need LongCat to be the best model in the world. It needs developers to download it, run it on domestic hardware, and demonstrate that the domestic stack works at scale. That is a cheaper way to make the point than a benchmark chart, and a harder one to argue with.


Sources

China's Meituan open-sources massive LongCat-2.0 AI model, saying it was trained on domestic chips (siliconangle.com)
Meituan open-sources 1.6-trillion-parameter LongCat-2.0 AI model trained on Chinese chips (www.cryptopolitan.com)
Introducing LongCat-2.0 (longcat.chat)
China debuts biggest AI model trained on local chips as Meituan releases LongCat-2.0 (www.scmp.com)
meituan-longcat/LongCat-2.0 (huggingface.co)
