China’s NEW Meituan LongCat 2.0 Tested!
The video reviews LongCat 2.0, a new open-source Chinese AI model with 1.6 trillion parameters that was trained on Meituan's custom chip without Nvidia hardware. Despite decent benchmark performance, hands-on testing shows it underperforming GLM-5.2 in practical tasks such as game creation, which leaves GLM-5.2 as the stronger open-source alternative.
Summary
The video discusses LongCat 2.0, a newly released open-source AI model from China and the full model behind the previously available Owl Alpha API. The model uses LongCat sparse attention and a mixture-of-experts architecture. On benchmarks such as Terminal Bench 2.1 it holds its own against other models, though Opus 4.8 significantly outperforms it. Notably, LongCat was trained on Meituan's proprietary chip without any Nvidia hardware. Meituan is China's equivalent of DoorDash, and the presenter takes this as a sign that major non-tech companies are now developing their own AI models.
The presenter tested LongCat's practical capabilities by having it create various games, including Dragon Realm, Skyrim-style games, and other interactive applications. The results were mixed: graphics and functionality were basic and buggy compared to alternatives. In benchmark evaluations, LongCat scores slightly below GPT-4.5 on Terminal Bench but outperforms it on SWE-bench Pro. The most relevant comparison, however, is against GLM-5.2, another recent Chinese open-source model.
Side-by-side testing showed consistent advantages for GLM-5.2 across multiple game generation tasks: its versions of the Crib game, the Dragon Ball game, the Skyrim implementation, and Voxel Craft all had better graphics, fewer bugs, and more interesting gameplay mechanics. The presenter concludes that while LongCat is interesting as an emerging model, GLM-5.2 remains the superior open-source choice for practical use.
Key Insights
- LongCat 2.0 was trained entirely on Meituan's custom chip without any Nvidia hardware, which the video presents as evidence that Chinese developers can train large models without relying on U.S. chips
- Meituan, primarily known as China's DoorDash equivalent, has entered the AI model space with LongCat, an example of major non-tech companies developing their own AI models
- LongCat scores slightly below GPT-4.5 on Terminal Bench but outperforms it on SWE-bench Pro, according to the official evaluations
- GLM-5.2 consistently outperformed LongCat across multiple practical game generation tasks, suggesting that benchmark scores do not always correlate with real-world application quality
- LongCat's API is currently accessible only to users with a Chinese setup, which limits practical availability for international users even though the model is open source
Transcript
[0:00] Today we have a brand-new update from a Chinese model that's open source. It is called LongCat 2.0. And this is actually the full model behind Owl Alpha. So, if you're familiar with Owl Alpha, which was a free API, you could actually use it with Hermes, you could plug it into Claude Code before, and it was not bad at all. It is a gigantic model, and this was actually revealed as LongCat 2.0. So, this has [0:30] now officially been released, and you can get access to it, and you can see the full details right here. So, it's got LongCat sparse attention, zero compute experts, MOPD. Stacks up not badly…
More from Julian Goldie SEO
NEW Google AI Studio Update is INSANE (FREE!) 🤯
Google AI Studio has released a new design variations feature that automatically generates multiple layout and design options for apps and websites with a single click. This tool eliminates the need for design skills or iterative prompting, reducing what typically takes hours to seconds, and is available for free.
Claude Sonnet 5 VS GLM 5.2: Who Wins?
A detailed comparison of Claude Sonnet 5 versus GLM 5.2 AI models across game development, coding benchmarks, and UI creation tasks. The reviewer concludes that GLM 5.2 generally outperforms Sonnet 5 while being significantly cheaper, though Opus 4.8 and the forthcoming Fable 5 remain superior options.
Claude Sonnet 5 is HERE!
Claude Sonnet 5 has been released as Anthropic's most agentic model, but the speaker argues it's a disappointing release that underperforms Opus 4.8 while being more expensive, making it an unattractive option for most users. The reviewer demonstrates this through benchmark comparisons and test outputs, concluding that users should stick with Opus 4.8 or wait for the incoming Fable 5 model.
Gamma Just Got Better With ChatGPT
Gamma, an AI design tool used by nearly 100 million people, is now integrated into ChatGPT as a native app, allowing users to create professional presentations, documents, and web pages without leaving the chat. The integration enables users to transform rough notes, training documents, and ideas into polished decks by simply conversing with ChatGPT, which handles the writing while Gamma handles the design.
NEW Qwythos 9B Runs Locally for FREE
Julian Goldie demonstrates how to run Qwythos 9B, a free 5.6GB local AI model on your Mac using Ollama, which can be integrated into an agentic operating system for private, offline AI tasks. While smaller than frontier models, it can effectively write, reason, and build applications locally without cloud connectivity or token costs.