The AI Token Shortage Begins [AI Monthly Recap]

The AI Daily Brief: Artificial Intelligence News and Analysis | June 1, 2026 | 28m 41s

The AI Daily Brief recaps May 2026 as a pivotal month marked by explosive revenue growth at OpenAI and Anthropic, driven by token-based API consumption rather than seat subscriptions. The host argues the industry is transitioning from an 'AI subsidy era' to a 'token scarcity era,' with major implications for enterprise AI budgets, business models, and infrastructure investment. Key developments include Elon Musk repositioning SpaceX as a compute provider for Anthropic, shifting enterprise billing models, and growing recognition that agentic AI demands far more resources than previously anticipated.

Summary

The episode opens by framing May 2026 as the second major AI transitional moment of the year, following the emergence of the true agent era at the start of 2026. The host traces how tools like Claude Code and Codex shifted mainstream software engineers from prototype vibe coding to pushing agent-created code into production, fundamentally changing how AI value was consumed and measured.

The most significant economic shift discussed is the move from seat-based billing to token-based consumption as the dominant revenue model. Anthropic's annualized revenue surged from $3 billion at the start of 2025 to $47 billion, while OpenAI reached $30 billion ARR. The host uses a personal anecdote — spending $5,000 in six weeks on a single API-driven project versus $200/month for a Claude Max seat — to illustrate just how differently token-based usage scales compared to flat subscriptions.
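To make the scale of that gap concrete, here is a rough back-of-the-envelope sketch using only the figures quoted above. The weeks-per-month conversion is an assumption added for illustration, and the two spending figures cover different workloads, so the ratio is indicative rather than a like-for-like comparison.

```python
# Figures quoted in the episode summary.
api_spend_usd = 5_000          # spent on a single API-driven project
api_period_weeks = 6           # over this many weeks
seat_price_per_month = 200     # Claude Max seat, USD per month

# Assumption (not from the episode): average number of weeks in a month.
WEEKS_PER_MONTH = 52 / 12

# Normalize the API spend to a monthly rate, then compare with the seat price.
api_monthly_rate = api_spend_usd / api_period_weeks * WEEKS_PER_MONTH
cost_multiple = api_monthly_rate / seat_price_per_month

# Anthropic's annualized revenue as quoted, in billions of USD.
revenue_start_bn = 3
revenue_now_bn = 47
revenue_multiple = revenue_now_bn / revenue_start_bn

print(f"API spend per month: ${api_monthly_rate:,.0f}")
print(f"Multiple of seat price: {cost_multiple:.1f}x")
print(f"Revenue growth multiple: {revenue_multiple:.1f}x")
```

Under these assumptions the API project runs at about $3,600 per month, roughly 18 times the seat price, and the quoted revenue figures imply growth of a little under 16x.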

The host introduces the central thesis of the episode: the industry is transitioning from an 'AI subsidy era,' where power users extracted 10–20x the value of their subscription costs, to a 'token scarcity era,' defined by structural compute shortages and rising costs. This shift is evidenced by companies like GitHub Copilot, Google (Gemini), and Anthropic all moving toward usage-based billing and introducing limits on previously flat-rate plans.

Enterprise AI ROI comes under scrutiny, with Uber's CTO revealing the company burned through its entire 2026 AI budget in four months, and Uber's COO later expressing skepticism about actual value received. This feeds into broader 'AI sticker shock' coverage from outlets like Axios. The host acknowledges the 'token maxing' trend — internal leaderboards incentivizing maximum token consumption — as a symptom of the subsidy era that is now becoming financially untenable, with companies like Amazon scrapping such programs.

Both OpenAI and Anthropic responded to the enterprise capability overhang by launching deployment and consulting services: OpenAI via a majority-owned deployment company embedding forward-deployed engineers in large clients, and Anthropic through a partnership with Blackstone, Hellman & Friedman, and Goldman Sachs to form a separate enterprise AI services firm.

A major geopolitical and infrastructure story emerged with Elon Musk repositioning xAI/SpaceX as a compute provider. SpaceX granted Anthropic access to both the Colossus 1 and Colossus 2 data centers, effectively making SpaceX a 'neocloud.' The host argues this reframes the SpaceX IPO narrative from an also-ran AI model company to a dominant AI infrastructure play, with implications extending to orbital data centers that both Musk and Jeff Bezos are now publicly discussing as near-term realities.

On the model release front, the host notes relative quiet, with Claude Opus 4.8 launching at month's end but generating muted excitement. Commentators like Greg Isenberg compared model releases to iPhone updates (incremental and hard to distinguish), arguing that what matters more now is the harness and surface around models, such as Claude Code's Dynamic Workflows and the Slash Goal primitive.

Narrative shifts from AI leaders were also noted: both Sam Altman and Dario Amodei have begun walking back apocalyptic labor displacement rhetoric, with Altman articulating that he overestimated the speed and nature of transformation. On the policy side, the host highlights diverging Democratic approaches — from Bernie Sanders and AOC calling for data center moratoriums to Elizabeth Warren advocating for a token tax — and notes White House involvement in restricting model releases like Anthropic's Mythos partly due to token shortage concerns.

The episode closes by forecasting June will be shaped by the SpaceX IPO, expected new model releases from OpenAI and Anthropic (including Mythos), and enterprise recalibration to the token scarcity reality. The host frames this as an opportunity for enterprises that move quickly to manage AI costs efficiently.

About this episode

One of the most consequential AI months of 2026, May marked a major shift from the AI subsidy era into a new period defined by token scarcity, usage-based pricing, enterprise sticker shock, and a broader scramble for compute. NLW argues that the next phase of AI competition will be shaped by who can access, afford, optimize, and deploy AI tokens most effectively.

Sign up for AI Executive Catchup: https://aiexecutivecatchup.com/

Brought to you by:
KPMG - Research from KPMG and the University of Texas at Austin shows the highest-impact AI users treat AI like a reasoning partner, and those skills can be taught at scale. Learn more at kpmg.com/us/Sophisticated
OutSystems - Stop wondering how AI will change your business and start building the agents that will lead it - http://outsystems.com/
Scrunch - The AI customer experience platform - https://scrunch.com/
Zenflow Work - Agents for knowledge work - https://zenflow.free/
Blitzy - Want to accelerate enterprise software development velocity by 5x? https://blitzy.com/
AssemblyAI - The best way to build Voice AI apps - https://www.assemblyai.com/brief
Robots & Pencils - Cloud-native AI solutions that power results - https://robotsandpencils.com/

The AI Daily Brief helps you understand the most important news and discussions in AI.
Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Our Newsletter is BACK: https://aidailybrief.beehiiv.com/
Interested in sponsoring the show? [email protected]

Key Insights

The host argues that Anthropic's revenue explosion — from $3B to $47B annualized in roughly one year — was driven not by seat subscriptions but by token-based API consumption from agentic workflows, fundamentally disproving earlier AI bubble narratives that assumed revenue couldn't keep pace with infrastructure costs.
The host contends that the 'AI subsidy era,' in which power users received 10–20x the value of their subscription fees, is structurally ending as providers like GitHub Copilot, Google, and Anthropic shift to usage-based billing — a change he frames as inevitable given that agentic AI sessions consume orders of magnitude more compute than simple chat queries.
The host argues that Elon Musk's decision to provide SpaceX's Colossus 1 and Colossus 2 data centers to Anthropic represents a strategic repositioning away from competing on model quality (Grok) toward dominating AI infrastructure supply — a move the host says makes the SpaceX IPO far more compelling to investors as an AI infrastructure play rather than an AI model company.
The host claims that the 'token maxing' trend — companies creating internal leaderboards to incentivize maximum AI consumption — is being rapidly abandoned not primarily due to Goodhart's Law concerns about gaming metrics, but because the end of the subsidy era has made indiscriminate token consumption financially unsustainable at an enterprise scale.
The host observes that AI model releases are losing cultural and commercial significance relative to improvements in the harnesses surrounding them, citing commentators who ignored Claude Opus 4.8's release while getting excited about Claude Code's Dynamic Workflows — suggesting the competitive battleground has shifted from model benchmarks to developer surfaces and agentic tooling.

Topics

Transition from AI subsidy era to token scarcity era
Explosive revenue growth at OpenAI and Anthropic
Shift from seat-based to usage-based billing models
Enterprise AI ROI skepticism and sticker shock
SpaceX/Elon Musk repositioning as AI compute infrastructure provider
Agentic AI adoption and the capabilities overhang
Token maxing experiments and their unraveling
AI model releases vs. harness/surface innovation
Policy responses including token taxes and data center moratoriums
AI infrastructure investment surge

Transcript

Today on the AI Daily Brief, we're recapping the month of May, one of the single most consequential AI months we've had in a very, very long time. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right, friends, quick announcements before we dive in. First of all, thank you to today's sponsors, KPMG, Robots and Pencils, Zencoder, and OutSystems. To get an ad-free version of the show, go to patreon.com slash ai-dailybrief, or you can subscribe on Apple Podcasts. If you want to learn more about sponsoring the show, send us a note at sponsors at ai-dailybrief.ai. Today is the first day of June. And while I don't…
