Premium: Modular inference

HHHYPERGROWTH

NVIDIA's Vera Rubin platform represents a major expansion into a modular AI factory system, incorporating 7 chips including the acquired Groq architecture. The platform is expected to begin generating revenue in Q3 2027, with management projecting significant TAM expansion and revenue uplift across multiple dimensions including inference, storage, and CPU workloads.

Summary

This transcript introduces a multi-part analysis series focused on NVIDIA's Vera Rubin platform, framed within a broader 'NVIDIA Week' of coverage. The author sets the stage by noting that Vera Rubin was first unveiled at CES in January, significantly expanded at GTC in March, and further refined at GTC Taipei. The platform is positioned as a response to the rise of agentic AI, which the author describes as a new scaling law driving demand not just for GPUs but also CPUs for orchestration, tool use, and code execution.

The Vera Rubin line consists of 6 frontier chips spanning GPU, CPU, DPU, and three networking layers. A key development was NVIDIA's acquisition of Groq approximately two weeks before CES. Groq's SRAM-based serialized architecture is being integrated as a 7th chip in the ecosystem, not to replace GPUs but to complement them by reducing latency in specific inference stages. NVIDIA is positioning this pairing to let inference providers offer premium-priced, low-latency tiers to end users.

By GTC in March, the platform had grown into 5 new rack systems, allowing customers to mix and match modular components across disaggregated compute, networking, storage, and agentic orchestration. Revenue from Vera Rubin is expected to begin appearing at the tail end of Q3 2027 and ramp heavily in Q4 2027 and Q1 2028. Management has provided rough projections — termed 'Jensen Math' — estimating 25% revenue uplift from Groq, 20% from AI storage, and 5% from Vera CPUs. Vera CPUs are expected to represent a $200B TAM expansion, with Vera systems contributing approximately $20B in FY27, roughly 5% of the overall revenue mix.
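The $20B and 5% figures quoted above imply a total revenue base, which can be checked with one line of arithmetic. A minimal sketch in Python, assuming both figures refer to the same fiscal year and that the 5% is a share of total company revenue (the summary does not state this explicitly); the function name is hypothetical:

```python
def implied_total_revenue_bn(segment_revenue_bn: float, share_pct: float) -> float:
    """Back out total revenue (in $B) from a segment's revenue and its share of the mix."""
    if share_pct <= 0:
        raise ValueError("share_pct must be positive")
    return segment_revenue_bn * 100 / share_pct


# Figures quoted in the summary: ~$20B from Vera systems, ~5% of the revenue mix.
total_bn = implied_total_revenue_bn(20, 5)
print(f"Implied total revenue: ~${total_bn:.0f}B")
```

Under those assumptions the two figures imply roughly $400B of total revenue. Both inputs are rounded, so this is only an order-of-magnitude consistency check.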

The transcript also notes that future iterations, including Rubin Ultra and Feynman, will continue using Oberon-based NVL72 racks, ensuring backward compatibility for existing AI data centers without requiring full power and cooling overhauls. The series is broken into three parts: Part 1 covers the new agentic scaling law, the Groq acquisition, and initial/GTC announcements; Part 2 dives into specific rack systems and hardware changes; Part 3 addresses the agentic software stack, ecosystem investments, and partnerships.

Key Insights

  • The author argues that agentic AI constitutes an entirely new scaling law that will drive both GPU and CPU demand simultaneously, as agents require orchestration, tool use, and code execution beyond pure model inference.
  • The author claims NVIDIA is positioning Groq's low-latency architecture not as a GPU replacement but as a complementary inference accelerator that enables inference providers to offer premium-priced, lower-latency service tiers.
  • The author reports that NVIDIA management projects Vera Rubin will add $200B in TAM and that Vera systems alone will represent approximately $20B in FY27 revenue, framed as incremental to an already-expected $1 trillion in GPU sales through 2027.
  • The author notes that NVIDIA's decision to continue future iterations (Rubin Ultra, Feynman) on Oberon-based NVL72 racks is a deliberate backward-compatibility strategy, allowing existing AI data centers to upgrade without full infrastructure overhauls.
  • The author highlights that NVIDIA's acquisition of Groq occurred just two weeks before CES, suggesting the integration into Vera Rubin was rapidly incorporated into the product roadmap and became a central pillar by GTC in March.

Topics

  • NVIDIA Vera Rubin modular AI platform
  • Groq acquisition and SRAM-based inference architecture
  • Agentic AI as a new scaling law
  • Revenue projections and TAM expansion
  • Modular rack systems for disaggregated AI infrastructure
