Premium: Modular inference Summary — HHHYPERGROWTH

Summary

This transcript introduces a multi-part analysis series focused on NVIDIA's Vera Rubin platform, framed within a broader 'NVIDIA Week' of coverage. The author sets the stage by noting that Vera Rubin was first unveiled at CES in January, significantly expanded at GTC in March, and further refined at GTC Taipei. The platform is positioned as a response to the rise of agentic AI, which the author describes as a new scaling law driving demand not just for GPUs but also CPUs for orchestration, tool use, and code execution.

The Vera Rubin line consists of 6 frontier chips designed across GPU, CPU, DPU, and three networking layers. A key development was NVIDIA's acquisition of Groq approximately two weeks before CES. Groq's SRAM-based serialized architecture is being integrated as a 7th chip in the ecosystem, not to replace GPUs but to complement them by reducing latency in specific inference stages. This is being positioned to enable premium-priced, low-latency inference tiers for end users.

By GTC in March, the platform had grown into 5 new rack systems, allowing customers to mix and match modular components across disaggregated compute, networking, storage, and agentic orchestration. Revenue from Vera Rubin is expected to begin appearing at the tail end of Q3 2027 and ramp heavily in Q4 2027 and Q1 2028. Management has provided rough projections — termed 'Jensen Math' — estimating 25% revenue uplift from Groq, 20% from AI storage, and 5% from Vera CPUs. Vera CPUs are expected to represent a $200B TAM expansion, with Vera systems contributing approximately $20B in FY27, roughly 5% of the overall revenue mix.
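As a quick sanity check on the figures above, the overall revenue base they imply can be backed out. This is only a sketch: it takes the roughly $20B and roughly 5% figures at face value, the function name `implied_total` is illustrative, and the result is an implied number rather than one stated by management.

```python
def implied_total(part_revenue_b: float, share: float) -> float:
    """Return the total revenue (in $B) implied by one part and its share of the mix."""
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    return part_revenue_b / share


# Figures quoted in the summary (both approximate).
vera_fy27_b = 20.0  # Vera systems revenue in FY27, in $B
vera_share = 0.05   # Vera systems' share of the overall revenue mix

# Implied overall FY27 revenue; an inference from the two figures, not a reported number.
total_b = implied_total(vera_fy27_b, vera_share)
print(f"Implied overall FY27 revenue: ~${total_b:.0f}B")
```

On these inputs the implied overall base is about $400B. Because both inputs are rounded, the result should be treated as a ballpark figure only.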

The transcript also notes that future iterations, including Rubin Ultra and Feynman, will continue using Oberon-based NVL72 racks, ensuring backward compatibility for existing AI data centers without requiring full power and cooling overhauls. The series is broken into three parts: Part 1 covers the new agentic scaling law, the Groq acquisition, and initial/GTC announcements; Part 2 dives into specific rack systems and hardware changes; Part 3 addresses the agentic software stack, ecosystem investments, and partnerships.

Key Insights

The author argues that agentic AI constitutes an entirely new scaling law that will drive demand for both GPUs and CPUs at the same time, as agents require orchestration, tool use, and code execution beyond pure model inference.

The author claims NVIDIA is positioning Groq's low-latency architecture not as a GPU replacement but as a complementary inference accelerator that enables inference providers to offer premium-priced, lower-latency service tiers.

The author reports that NVIDIA management projects Vera Rubin will add $200B in TAM and that Vera systems alone will represent approximately $20B in FY27 revenue, framed as incremental to an already-expected $1 trillion in GPU sales through 2027.

The author notes that NVIDIA's decision to continue future iterations (Rubin Ultra, Feynman) on Oberon-based NVL72 racks is a deliberate backward-compatibility strategy, allowing existing AI data centers to upgrade without full infrastructure overhauls.

The author highlights that NVIDIA's acquisition of Groq occurred just two weeks before CES, suggesting that Groq was rapidly folded into the Vera Rubin product roadmap and had become a central pillar by GTC in March.

Transcript

Welcome to NVIDIA Week! It's time to catch up on the strategic moves from our favorite AI infrastructure provider. This will soon be followed by a Neocloud Week, to catch up on CoreWeave, Nebius, and IREN. Now that we've looked at NVIDIA's stellar Q1 FY27, let's peek at what comes later this year with Vera Rubin. NVIDIA debuted this new era of AI systems at CES in January, greatly expanded it at GTC in March, and refined their agentic message at GTC Taipei this week. This will also be of interest to AI chip competitors, hyperscaler clouds, neoclouds, and AI providers like Anthropic and OpenAI. This will be a multi-part series. This first post focuses on the strategic and financial…

