Yao Shunyu: Let Me Go a Little Crazy! Training Models at Anthropic & Gemini, Heroism Is Over
Yao Shunyu, a researcher who moved from Anthropic to Google DeepMind, discusses the current state of AI model development, the competitive landscape between major AI labs, and his personal journey from theoretical physics to AI research. He shares candid views on why individual heroism has ended in AI, the importance of reliability over brilliance, and his technical perspectives on pre-training, post-training, and long-horizon tasks.
Summary
The interview features Yao Shunyu, a researcher at Google DeepMind who previously worked at Anthropic, discussing the current AI landscape and his personal career journey. He begins by clarifying the difference between himself and the other famous Yao Shunyu (now at Tencent), noting that his own background is in theoretical physics rather than computer science.
On the state of AI models, Yao argues that the major labs (Anthropic, OpenAI, and Google DeepMind's Gemini team) have largely converged in capabilities, with benchmark differences now representing mostly noise rather than signal. He observes that the harder problem has shifted from 'can AI do this?' to 'what should we actually build?' In his view, Claude maintains an edge in agentic tool use and Gemini in pure reasoning, while coding remains competitive across all three.
Regarding the AI startup ecosystem, he discusses how wrappers like Manus and OpenClaw ultimately sold to larger companies because model moats remain dominant. He identifies two survival strategies: grow fast enough to build user mindshare before model companies copy you (Cursor's approach, though increasingly precarious), or stay small enough that big companies won't bother competing (Midjourney's approach). He describes the Cursor-Anthropic relationship as having entered a 'delicate competitive phase' now that Claude Code has launched.
On technical progress, Yao pushes back against the narrative that scaling laws have plateaued, attributing apparent plateaus mostly to bugs or flawed experimental assumptions rather than fundamental limits. He emphasizes that pre-training has continued to improve in recent months and that the primary drivers remain compute and data. He predicts that within 6-12 months, AI will begin completing full research cycles autonomously.
His career narrative runs from condensed matter physics at Tsinghua (where he co-discovered the non-Hermitian skin effect), to theoretical high-energy physics at Stanford, to a brief Berkeley postdoc, then to Anthropic's reinforcement learning team, where he worked on scaling post-training for what became Claude 3.5 (new) and 3.7. He joined Gemini in late September 2025, partly because he disagreed with Dario Amodei's anti-China stance (a factor to which he assigns roughly 40% weight), but primarily to broaden his learning beyond Anthropic's focused coding/agentic scope.
He reflects that the era of individual heroism in AI has passed—the Transformer moment was the last true heroic discovery, and now progress is fundamentally collective. He argues AI is 'essentially simple' compared to physics because every experiment is runnable and there's no fundamental energy-scale barrier to understanding. The most important trait in the field, he claims, is reliability and responsibility rather than brilliance.
For the future, he highlights two key research directions: ML coding (enabling AI to run complete research cycles) and long-horizon context (training with finite context but operating with effectively infinite context). He is skeptical of the chatbot as the ultimate AI interface, suggesting the form factor needs a product manager to unlock the model's true capabilities.
Key Insights
- Yao argues that the most important trait in AI research is reliability and being detail-oriented, not intelligence — claiming the field 'doesn't really require much brains' and that 'doing simple things cleaner than anyone else is the most critical thing,' because anyone can think of the ideas but few can execute them stably.
- Yao claims that the majority of apparent scaling law plateaus he has observed in the industry are caused by bugs or flawed experimental assumptions rather than fundamental limits, stating 'the vast majority of people who hit a wall, it's because of the third reason — there's a bug,' and that fixing a single bug often brings more progress than any fancy technique.
- Yao describes Anthropic's key organizational advantage as having its top technical decision-makers (Jared Kaplan, Sam McCandlish) also be cofounders with full authority, which enables fast top-down bets. He says OpenAI lost this when Ilya Sutskever departed and that Google DeepMind, as a large company, structurally cannot replicate it.
- Yao reveals he left Anthropic partly (roughly 40% weight) due to disagreement with Dario Amodei's anti-China stance, which he characterizes as 'a very emotional reaction' that was inappropriate for a company CEO to push to such an extreme, though he frames his primary motivation as wanting to learn different things like multimodal generation that Anthropic doesn't pursue.
- Yao predicts that within 6-12 months AI will complete full autonomous research cycles — not just writing code but also running experiments, analyzing results, forming new hypotheses, and designing follow-up experiments — describing this chain as 'the next thing to gradually become complete' and noting it is already partially happening.
Transcript
[0:00] English subtitles were generated by AI and are for reference only. Hello everyone, I'm Xiaojun. Today our guest is Yao Shunyu, a researcher at Google DeepMind. There are two famous Yao Shunyus in Silicon Valley. One previously worked at OpenAI, then jumped ship to Tencent to become their Chief AI Scientist. He's been on our show before. Today I've invited the other Yao Shunyu. He was previously at Anthropic. Now he's at Google DeepMind.
[0:30] We'll start by talking about the recent series of massive model changes. So next is my interview with Shunyu. That Anthropic, as a company, is able to implement this kind of relatively top-down mechanism is something quite unique. But is this difficult for other model…