Insightful · Technical

Anjney Midha's Plan to Radically Lower the Price of Compute

Odd Lots · 50m 21s

Anjney Midha, founder of AMP PBC and early Anthropic investor, discusses the physical and economic inefficiencies in AI compute infrastructure, arguing that most data centers run at dangerously low utilization rates. He explains how AMP is building a software-based 'grid' to standardize and optimize compute across heterogeneous chip types, similar to how AC/DC standardization unlocked electricity distribution. He also challenges the notion that only three frontier AI labs exist, arguing instead for a 'jagged frontier' with many specialized leaders.

Summary

The episode opens with hosts Tracy Alloway and Joe Weisenthal framing AI as a technology defined by physical constraints (energy, GPUs, real estate) despite its seemingly ephemeral nature. They introduce Anjney Midha, founder of AMP PBC, former a16z general partner, Stanford visiting scientist, and one of the earliest investors in Anthropic.

Midha recounts his origin story: born in India, educated in Singapore and at Stanford, he got swept up in the early deep learning wave under figures like Andrew Ng and Andrej Karpathy. After a stint at Kleiner Perkins, he founded Ubiquity6, a 3D mapping startup that was derailed by the pandemic but ultimately acquired by Discord. Shortly after, he was recruited into early conversations with the founders of Anthropic (Dario Amodei and others), who had just trained GPT-3 at OpenAI and wanted to spin out. Midha wrote the first angel check and helped workshop fundraising strategy, only to find that nearly every major VC on Sand Hill Road passed. The team raised $100M from angels before eventually securing a $4B compute and capital partnership with Amazon.

On the 'frontier model' question, Midha pushes back on the framing that only three labs (OpenAI, Anthropic, Google DeepMind) are at the frontier. He argues the frontier is 'jagged' — there are 17 different frontiers with four different players each, and the models differ meaningfully when used hands-on. He uses his Stanford course, 'Frontier Systems,' to illustrate the four-step pipeline of model creation: pre-training, mid-training, post-training, and the continuous verifiable feedback loop. He emphasizes that AI progress is fastest where feedback is objectively verifiable — like software engineering (code passes or fails unit tests) or materials science (physical experiments confirm or deny predicted properties). Subjective tasks like creative writing or therapy remain much harder.
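The "code passes or fails unit tests" idea can be illustrated with a minimal sketch. This is not any lab's training pipeline; the function name `verifiable_reward`, the toy candidates, and the test cases are all hypothetical, chosen only to show what an objectively checkable feedback signal looks like.

```python
from typing import Any, Callable, Iterable, Tuple


def verifiable_reward(
    candidate: Callable[..., Any],
    cases: Iterable[Tuple[tuple, Any]],
) -> float:
    """Return the fraction of test cases the candidate passes (0.0 to 1.0).

    Hypothetical helper: each case is (args, expected_output). A crash
    counts as a failure, so the score never depends on human judgment.
    """
    case_list = list(cases)
    if not case_list:
        raise ValueError("at least one test case is required")
    passed = 0
    for args, expected in case_list:
        try:
            if candidate(*args) == expected:
                passed += 1
        except Exception:
            # Any exception is treated as a failed test.
            pass
    return passed / len(case_list)


# Two toy "model outputs" for the task "add two numbers".
def good_add(a, b):
    return a + b


def buggy_add(a, b):
    return a - b


CASES = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]

if __name__ == "__main__":
    print("good_add reward:", verifiable_reward(good_add, CASES))
    print("buggy_add reward:", verifiable_reward(buggy_add, CASES))
```

A subjective task such as creative writing has no equivalent of `CASES`, which is one way to read the claim that those domains are harder to improve.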

The bulk of the conversation focuses on AMP PBC's core mission: standardizing and optimizing compute infrastructure. Midha argues that the current compute market is built on long-term leases where labs pay for chips 24/7 regardless of usage, leading to massive waste. Average independent data centers run at less than 70% node utilization, and model FLOPs utilization (MFU, how much of each chip is actually being used during a workload) can be as low as 11%; he cites xAI's Colossus 2 as an example. This waste inflates the effective cost from the advertised ~$2.50/GPU hour to an actual ~$25-28/GPU hour. AMP's solution is a software translation layer (inspired by Google's internal 'Borg' system, co-built with Midha's Stanford roommate Sebastian Lobo) that makes heterogeneous compute fungible through standardized 'grid credits,' improving utilization to 95-96% and dramatically lowering effective costs.
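The pricing gap above can be checked with simple arithmetic. The sketch below assumes, as a simplification not stated in the episode, that the entire markup comes from paid-for GPU time that does no useful work, so effective cost equals advertised price divided by the useful fraction. The function names are hypothetical, and treating the 95-96% figure as the useful fraction is likewise an assumption.

```python
def implied_useful_fraction(advertised_per_hour: float, effective_per_hour: float) -> float:
    """Fraction of paid GPU time doing useful work, assuming the whole
    gap between advertised and effective price is idle or wasted capacity."""
    if advertised_per_hour <= 0 or effective_per_hour <= 0:
        raise ValueError("prices must be positive")
    return advertised_per_hour / effective_per_hour


def effective_cost(advertised_per_hour: float, useful_fraction: float) -> float:
    """Cost per useful GPU-hour under the same simplifying assumption."""
    if not 0 < useful_fraction <= 1:
        raise ValueError("useful_fraction must be in (0, 1]")
    return advertised_per_hour / useful_fraction


ADVERTISED = 2.50  # advertised $/GPU-hour quoted in the summary

if __name__ == "__main__":
    # What useful fraction would the quoted $25-28 effective cost imply?
    for quoted in (25.0, 28.0):
        frac = implied_useful_fraction(ADVERTISED, quoted)
        print(f"${quoted:.2f}/GPU-hour implies {frac:.1%} useful time")

    # What would the effective cost be at the quoted 95-96% utilization?
    for util in (0.95, 0.96):
        print(f"{util:.0%} utilization gives ${effective_cost(ADVERTISED, util):.2f}/GPU-hour")
```

Under this model, the quoted $25-28 range corresponds to roughly 9-10% of paid time being useful, which is in the same neighborhood as the ~11% MFU figure, though the episode does not spell out how the two numbers were combined.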

Midha also explains why so many companies are designing custom chips: economic independence (reducing the ~80 cents of every dollar that flows to NVIDIA) and supply chain control (TSMC effectively decides which labs grow by allocating production capacity). He is skeptical of financializing compute into speculative markets, preferring efficient allocation through demand forecasting similar to a corporate trading desk.

On corporate AI adoption, Midha describes a barbell distribution: CEOs who use the tools themselves understand the jagged frontier and ask sharp questions, while those who don't are flying blind. He argues technical literacy should be non-negotiable for leaders — outsourcing understanding to AI systems, rather than just tedious workflows, is the real risk. He predicts that over time, companies won't care which model is running under the hood; they'll just want the cheapest, most efficient service, with routing across hundreds of models abstracted away by trusted brands.

Key Insights

  • Midha argues that the effective cost of GPU compute is ~$25-28/hour despite advertised rates of ~$2.50/hour, because labs over-provision for peak demand and pay for idle chips around the clock — a massive hidden deadweight loss.
  • Midha claims xAI's Colossus 2 cluster of 500,000 GB300s was running at less than 60% node utilization and less than 11% model FLOPs utilization (MFU), suggesting public chip-count announcements are a poor proxy for actual AI capability.
  • Midha argues AI progress is fastest and most predictable wherever feedback is objectively verifiable — software engineering and materials science — and that subjective domains like creative writing will remain a persistent weakness of current models.
  • Midha contends there is no single 'frontier' in AI; instead there are roughly 17 different frontiers (e.g., software engineering, consumer chat, video generation) each with multiple competitive leaders, making the 'three labs at parity' framing a fundamental misunderstanding.
  • Midha describes AMP PBC as an 'independent system operator' of a compute grid — analogous to AC/DC standardization unlocking the electrical grid — using software (not hardware) to make heterogeneous chips from different manufacturers fungible via 'grid credits.'
  • Midha argues that the breakthroughs in tools like Claude Code were not purely 'harness innovations' but the result of deliberate co-design between model researchers and tooling engineers, who coordinate months in advance on what capabilities the next model will have.
  • Midha warns that leaders who outsource their understanding to AI — rather than just tedious workflows — are the most at risk, because without technical literacy they project inaccurate capabilities onto the models and make poor strategic and deployment decisions.
  • Midha expresses concern about the financialization of compute markets, arguing that allowing speculative trading on compute capacity would create artificial scarcity and harm research teams, preferring instead efficient allocation through demand forecasting similar to an internal corporate trading desk.

Topics

  • AI compute infrastructure and inefficiency
  • AMP PBC's software-based compute grid
  • The 'jagged frontier' of AI model development
  • Verifiable feedback loops in AI training
  • Early history and fundraising of Anthropic
  • Custom chip design economics and supply chain control
  • Corporate AI adoption and technical literacy
  • Financialization of compute markets
