
Google DeepMind’s powerful AI co-mathematician

The Rundown AI

Google DeepMind released an AI co-mathematician built on Gemini 3.1 that uses agentic pipelines to assist researchers with unsolved math problems, achieving a 48% score on FrontierMath Tier 4. The newsletter also covers AI discoveries in exoplanet detection, practical AI use cases from staff, and various AI industry news. A key highlight is Oxford professor Marc Lackenby solving an open mathematical problem using a strategy found in a rejected AI output.

Summary

Google DeepMind published research on its AI co-mathematician, an agentic system based on Gemini 3.1 modeled after AI coding environments like Claude Code. The system uses a coordinator agent to break research into parallel workstreams, with sub-agents handling code writing, literature search, and proof attempts. It set a new high on Epoch AI's FrontierMath Tier 4 benchmark at 48%, more than doubling Gemini 3.1 Pro's raw score of 19%. Notably, Oxford professor Marc Lackenby used the system to resolve an open problem from the Kourovka Notebook after identifying a clever proof strategy buried within a proof the system's own reviewers had rejected.
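
The coordinator-and-sub-agent layout described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not DeepMind's implementation: every name here (`coordinate`, `WorkstreamResult`, the three stub sub-agents, and `stub_reviewer`) is hypothetical, the sub-agents return canned strings instead of calling a model, and the reviewer is a trivial keyword check. It shows parallel workstreams being fanned out and reviewed, with rejected outputs kept rather than discarded, since the summary notes that a rejected proof still contained a useful strategy.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable

# Hypothetical type aliases: a sub-agent takes a problem statement and
# returns text; a reviewer decides whether to accept that text.
SubAgent = Callable[[str], Awaitable[str]]
Reviewer = Callable[[str], bool]


@dataclass
class WorkstreamResult:
    name: str
    output: str
    accepted: bool


async def coordinate(
    problem: str, sub_agents: dict[str, SubAgent], reviewer: Reviewer
) -> list[WorkstreamResult]:
    """Run every sub-agent concurrently, then review each output.

    Rejected outputs are returned alongside accepted ones so a human
    can still inspect them.
    """
    names = list(sub_agents)
    outputs = await asyncio.gather(*(sub_agents[n](problem) for n in names))
    return [WorkstreamResult(n, out, reviewer(out)) for n, out in zip(names, outputs)]


# Stub sub-agents (assumptions): real ones would call a model or a tool.
async def write_code(problem: str) -> str:
    return f"code experiment for {problem}"


async def search_literature(problem: str) -> str:
    return f"related papers on {problem}"


async def attempt_proof(problem: str) -> str:
    return f"proof sketch for {problem} (gap in step 2)"


def stub_reviewer(output: str) -> bool:
    # Stand-in rule: reject anything that admits a gap.
    return "gap" not in output


if __name__ == "__main__":
    agents = {
        "code": write_code,
        "literature": search_literature,
        "proof": attempt_proof,
    }
    for result in asyncio.run(coordinate("an open problem", agents, stub_reviewer)):
        status = "accepted" if result.accepted else "rejected (kept for inspection)"
        print(f"{result.name}: {status}: {result.output}")
```

Returning the rejected proof attempt with its text intact, rather than filtering it out, is the design choice that mirrors the Lackenby anecdote: the reviewer's verdict is recorded, but the human researcher still sees the full output.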

The newsletter's Rundown Roundtable section featured staff AI use cases: a developer built an async Magic: The Gathering app using OpenAI Codex's /goal command, and a partnerships team member used Claude to plan an entire Greece itinerary, claiming it rivaled professional travel agents. An AI training guide detailed how to use Codex's Computer Use plugin to automate repetitive local tasks like Photoshop exports and file renaming.

In astronomy, University of Warwick researchers confirmed 100+ exoplanets using an AI system called RAVEN, which scanned 4 years of NASA TESS data covering 2.2 million stars. RAVEN also identified 2,000+ additional candidates, including 31 never-before-spotted exoplanets and planets in the 'Neptunian Desert' — a region where Neptune-sized planets were thought unable to survive. The system achieves 10x greater precision in measuring planet-type frequency compared to previous methods.

Additional news covered includes Google's Isomorphic Labs reportedly raising $2B+ for its Drug Design Engine, Greece proposing AI protections in its constitution, Baidu releasing ERNIE 5.1 at 6% of rival training costs, and OpenRouter launching Pareto Code for cost-optimized AI routing. A reader submission highlighted using ChatGPT to train four dogs, avoiding thousands in professional trainer costs.

About this episode

PLUS: Automate any manual task with Codex

Key Insights

  • Oxford professor Marc Lackenby solved an open problem in the Kourovka Notebook not from a successful AI output, but by extracting a proof strategy from a proof the AI's own review system had rejected — suggesting value exists even in AI failures.
  • DeepMind's co-mathematician achieved 48% on FrontierMath Tier 4 by adopting the agentic pipeline architecture used in AI coding environments, more than doubling the raw model score of 19%, indicating that architectural design matters as much as raw model capability.
  • RAVEN's exoplanet detection achieves 10x greater precision in measuring planet-type frequency using smarter AI integration alone — not new telescope hardware — implying existing astronomical datasets contain far more discoverable knowledge than previously extracted.
  • The newsletter frames the co-mathematician's value as augmenting expert researchers rather than replacing them, pointing to Lackenby's discovery as evidence that the most significant near-term AI math contribution may be accelerating human insight rather than autonomous problem-solving.
  • Baidu claims ERNIE 5.1 cost just 6% as much to train as rival models while ranking No. 4 on Arena's Search Leaderboard, suggesting that training efficiency gaps between frontier labs and challengers are narrowing significantly.

Topics

  • Google DeepMind AI Co-Mathematician
  • RAVEN AI Exoplanet Discovery
  • Agentic AI Systems and Pipelines
  • Staff AI Use Cases and Workflows
  • AI Industry News and Funding

Transcript

Good morning, AI enthusiasts. Google DeepMind just took AI’s coding strategy and applied it to math: don't ask a model for the answer, give a team of agents the workspace.

The company’s AI co-mathematician just scored a new high on a benchmark built to stump AI for decades, with one professor even cracking an unsolved problem using a strategy buried inside a proof the system's own reviewers had rejected.

  • Google DeepMind’s AI co-mathematician
  • The Rundown Roundtable: Our AI use cases
  • Automate any manual task with Codex
  • AI finds 100+ new exoplanets from NASA data
  • 4 new AI tools, community workflows, and more

GOOGLE DEEPMIND

Image source: Pushmeet Kohli (@pushmeet on X)

The Rundown: Google DeepMind just…

