Insightful Discussion

Jonathan Ross, Founder of Groq

David Senra · July 5, 2026 · 1h 11m

Jonathan Ross, founder of Groq, discusses the $20 billion partnership with NVIDIA integrating GPUs and LPUs for faster AI inference, shares leadership lessons learned over a decade building Groq including the importance of hiring for negatives and manufacturing discontent, and articulates an optimistic vision of AI democratizing software development and education through accessibility.

Summary

Jonathan Ross explains how Groq's partnership with NVIDIA emerged from recognizing that GPUs and LPUs are complementary—GPUs excel at compute-constrained tasks while LPUs handle memory-throughput-constrained operations. The deal came together remarkably fast, with money in the bank within three weeks of the initial pitch, though the underlying integration work had taken 3-4 months. Ross emphasizes that speed in AI inference matters increasingly for AI-to-AI communication, where agents operate exponentially faster than human reading speeds, making latency critical.

Ross shares extensively on leadership evolution at Groq. Early on, he was a poor manager because his natural delegation style conflicted with hiring people who needed direction. This cost the company 3-4 years. He eventually learned to set brutally clear objectives (distilled to a challenge coin: "25 million tokens per second") while minimizing constraints, allowing autonomous teams to innovate and surprise him. He adopted intentional leadership from David Marquet's submarine example: stating "I intend to do X" rather than asking permission, which invites critical feedback without inviting unnecessary pessimism.

On hiring, Ross initially looked for positives (which suits talent development) but realized that selecting talent requires screening for negatives through a people spec. He developed attributes like return on luck, poetic design, and loss bias (hiring people who "book the win early" by immediately committing to possibilities they identify). The most critical near-death moment came when Groq had three weeks of runway remaining. Rather than lay off critical staff needed for the compiler breakthrough, Ross implemented "Groq bonds": salary-for-equity exchanges. Eighty percent of employees participated, many cutting salaries to statutory minimums, extending runway by two months.

Ross discusses why Groq's fast inference thesis was initially dismissed. Four years ago, potential customers couldn't grasp why LLM speed mattered beyond what humans could read. Ross had to let customers experience the speed difference firsthand; a viral video of their fast model on X created immediate demand. This illustrates his broader point about return on luck: he saw three opportunities to build LLMs on Groq chips, was talked out of the first two, but forced the third, hitting the exact performance predictions the team had said were impossible.

On the AI future, Ross expresses optimism about accessibility. Just as literacy democratized writing, AI will democratize software creation. His executive assistant now builds applications. He envisions individual founders solving community problems without large teams—a shift from historical capital and talent constraints. Education should shift from teaching students to answer questions toward teaching them to ask questions, with AI as the tool for learning on demand. The broader opportunity is making code nearly free to produce, shifting from scarcity to abundance in software development.

About this episode

Jonathan Ross is the founder of Groq and the inventor of the Google Tensor Processing Unit (TPU), now a senior executive at NVIDIA following the company's $20 billion partnership with Groq. Before Groq, Ross built something that didn't exist: a custom AI chip at Google called the TPU, which became the backbone of DeepMind's AlphaGo, the system that defeated world Go champion Lee Sedol in 2016. After watching the TPU push AlphaGo's Elo rating up by hundreds of points overnight, Ross grasped a principle that would define his next decade: faster inference produces more capable models. He left Google to act on it.

Groq's first decade was brutal. Early West Coast VCs passed, and would later watch as NVIDIA announced what Ross describes as the firm's largest deal by nearly 3x. Ross came within weeks of running out of money. Rather than lay off the engineers he needed to hit a critical product milestone, he created "Groq bonds": war-bond-style instruments that exchanged salary for equity. About 80% of the team participated; nearly half took statutory minimum wage. They saved two months of runway and kept the company alive.

The core bet Ross made, that fast inference would matter, was widely dismissed, inside Groq and out. When the CEO of GitHub called needing chips to run LLMs, Ross's own engineers told him it couldn't be done. He eventually stopped asking and started declaring: "I intend to do this." He describes that shift, from inviting pessimism to announcing direction, as the most important leadership change he made. Now at NVIDIA, Ross carries what he calls manufactured discontent: a deliberate refusal to rest, convinced that every day without sufficient compute is a day the world waits longer for cures for cancer and aging.

Show notes: https://www.davidsenra.com/episode/jonathan-ross

Made possible by
Ramp: https://ramp.com
AppLovin: https://applovin.com/senra
Deel: https://deel.com/senra

Chapters

(00:00:00) The $20 Billion NVIDIA Deal Closed In 3 Weeks
(00:00:25) Why GPUs And LPUs Are Better Together
(00:01:46) When AI Talks To AI, Speed Wins
(00:03:30) Always Start With A Hobby Project
(00:05:55) Ask The Right Questions, Not Answer Them
(00:08:23) There Are Infinite Ways To Be A Leader
(00:13:00) I Was One Of The World's Worst Leaders
(00:14:34) Fewer Constraints, More Room To Surprise You
(00:16:31) At NVIDIA There Is No Politics
(00:19:44) You Have To Learn Confidence
(00:22:23) East Coast VCs Think, West Coast VCs Follow
(00:23:50) The Keynesian Beauty Contest Of Silicon Valley
(00:26:48) The Autonomy That Created The NVIDIA Deal
(00:30:07) Making A Model Smarter By Making It Faster
(00:34:52) Reality Quotient Beats Intelligence Quotient
(00:35:44) Find The Dominant Game And Play It
(00:37:11) A Founder's Job Is Full-Time Change Management
(00:38:34) Return On Luck: Seize It Better Than Anyone
(00:42:54) You Can't Sell Speed, You Have To Let People Try It
(00:46:32) I Intend To: Intentional Leadership
(00:51:07) Groq Bonds: Trading Salary For Survival
(00:54:13) Hire For Negatives, Grow For Positives
(00:58:46) Loss Aversion And Booking The Win Early
(01:00:37) How Michael Jordan Weaponized Humiliation
(01:03:13) Manufactured Discontent Drives Everything
(01:05:02) Every Day Without Compute Has A Real Cost
(01:07:07) Code Was Rationed, Now It's Nearly Free
(01:10:04) Teach Kids To Ask Questions, Not Answer Them

Key Insights

Ross argues that the $20 billion NVIDIA partnership emerged because combining GPUs for compute-constrained tasks and LPUs for memory-throughput-constrained tasks defeats bottlenecks across different AI models, something neither architecture handles optimally alone.
Ross claims that because AI generates and processes text far faster than humans can read, inference speed is becoming critical for AI-to-AI communication, where agents produce tokens exponentially faster than humans can consume them.
Ross contends that his leadership style cost Groq 3-4 years early on: he naturally delegated to people who needed direction, creating gridlock rather than empowerment until he learned to hire autonomous people.
Ross argues that stating 'I intend to do this' rather than asking 'should I do this' fundamentally changes team dynamics—it invites critical feedback only when truly warranted rather than inviting pessimistic opinions that halt progress.
Ross maintains that hiring for negatives (what to avoid in people) differs fundamentally from hiring for positives (what to develop in people), and successful hiring requires screening against negative traits like squandering luck.
Ross contends that loss bias—the psychological weight of losing something possible versus gaining something new—should be a primary hiring criterion, attracting people who 'book wins early' by immediately committing to identified opportunities.
Ross argues that the Groq bonds crisis, in which 80% of employees took massive salary cuts for equity when the company faced three weeks of runway, created shared ownership that reduced attrition below pre-announcement levels.
Ross claims he passed on the first two opportunities to deploy LLMs on Groq chips because he invited team opinions rather than stating intent, but forced the third opportunity through conviction, achieving exactly the performance metrics the team said were impossible.
Ross contends that four years ago, fast LLM inference was widely dismissed as unnecessary because people couldn't conceive of use cases beyond human reading speed—only viral demonstration of speed created market validation.
Ross argues that the AI age will democratize software creation similarly to how literacy democratized writing, enabling individuals without technical backgrounds to build valuable applications and companies.
Ross maintains that traditional education's problem is force-feeding answers, but the AI age should teach students to ask effective questions, with AI providing immediate feedback and learning on demand around problems that matter to their communities.
Ross contends that exceptional entrepreneurs and athletes manufacture discontent intentionally—never resting on wins but immediately identifying next problems to solve—which distinguishes continuous innovators from those satisfied with past success.

Topics

GPU-LPU integration and complementary architectures
AI inference speed and AI-to-AI communication
Leadership transition from command-control to intentional leadership
Hiring for negatives vs. positives
Manufacturing discontent as motivational driver
People spec and organizational culture fit
Return on luck and seizing opportunities
Groq bonds and crisis capital management
Fast inference market validation
AI democratization of software development
Education reimagined around questioning
Loss bias and competitive drive

Transcript

Let's start with this rumored $20 billion partnership that you have with NVIDIA. Can you talk about the structure of the deal and how it came about? The most interesting part about it is the call where the idea was first floated, was about three weeks before money was in the bank. That's a chance to lose fast. Of course. That's how you stay ahead. How did it come about? So we had been working on integrating GPU and LPUs together. The best way to describe why this helps is if you're building out a logistics network for the United States, and I told you you could have either 18 wheelers or vans for last mile delivery, which one would…
