20VC: Open Models vs Frontier Models: Who Actually Wins? | The $100,000 Token Budget Every Engineer Will Need | Why Forward-Deployed Engineers Are the Future of Enterprise AI with Clay Bavor, Co-Founder of Sierra
Clay Bavor, co-founder of Sierra (a $16B AI enterprise software company), discusses the divergence between frontier and open-source AI models, the importance of forward-deployed engineers in enterprise AI, and how to build high-intensity teams with strong cultural values. He shares insights from 18 years at Google and Sierra's rapid scaling to serve 40% of the Fortune 50.
Summary
Clay Bavor discusses Sierra's positioning in the AI enterprise software market and the strategic question of frontier versus open models. He argues there is unbounded demand for frontier-level intelligence in domains requiring high stakes and complexity (coding, science, legal), while open models will handle routine tasks. Chinese companies' advantage in scaled distillation of frontier models means open weights models will become increasingly capable, but this won't eliminate frontier model demand—instead, companies will use both for different workloads.
On token economics, Bavor notes that costs are not simply declining as predicted; reasoning models like OpenAI's o1 drive token consumption upward through test-time compute. The real constraint is GPU supply, not model capability. He expects token spending on engineering to eventually reach 20% of developer salaries (up from the current 3.8%), with the $100,000 annual token budget per engineer becoming normalized.
Bavor emphasizes Sierra's forward-deployed engineer (FDE) model, adapted from Palantir, as critical to enterprise AI success. Rather than selling software and hoping customers implement it, Sierra embeds engineers inside Fortune 50 companies during deployment, dramatically reducing time-to-value (e.g., six weeks to deployment at Nextiva, 58 days at Cigna). This approach works because no one has deployed AI agents before, requiring deep business understanding and partnership rather than vendor-customer dynamics.
On company building, Bavor describes Sierra's unique operational practices: six-week board cycles instead of quarterly, board memos instead of decks, and explicit focus on what the company sucks at. Every funding round was deliberately priced below market rate to maintain founder control and long-term thinking. The company's three core values—craftsmanship, intensity, and family—directly reflect Bavor's and co-founder Bret's priorities and inform hiring and culture.
Bavor discusses the shift from traditional hiring to AI-native interview processes, where candidates build applications using coding agents with a $150 token budget. He notes that 22-23 year-olds deeply familiar with AI tools are among Sierra's most effective employees. On engineering productivity, he reports top engineers spending $100K+ annually on tokens and estimates 3-20x productivity gains from AI coding agents.
The company has built internal tools including Pinecone (an internal agent for running the company) and an MCP gateway aggregating all company systems, allowing employees to query and interact with all accessible company information. Sierra Brain serves as a strategy thought partner grounded in company knowledge, board letters, and competitive insights.
Bavor discusses the value of sustained founder intensity, being selective about where founders apply detailed involvement (agent architecture, key hiring decisions), and the importance of in-person work for building culture, mentorship, and apprenticeship. He addresses parenting four children alongside building a high-growth company, emphasizing efficiency, clear goals, good habits, and making space for family time and rituals.
About this episode
Clay Bavor is the Co-Founder of Sierra, one of the world's fastest-growing enterprise AI companies. Sierra is valued at approximately $15.8 billion, has raised more than $1.5BN from leading investors including Sequoia, Benchmark, Greenoaks, GV and Tiger Global, and today serves more than 40% of the Fortune 50. The company recently surpassed $150M ARR, making it one of the fastest-growing enterprise software businesses in history.
AGENDA:
00:00 – Why Frontier AI Demand Will Be Unlimited
08:00 – Open Models vs Frontier Models: Who Actually Wins?
17:00 – China's AI Advantage & The Distillation Debate
20:30 – Inside Sierra: The AI Agents Running the Entire Company
24:00 – The $100,000 Token Budget Every Engineer Will Soon Need
29:00 – Building AI for 40% of the Fortune 50
37:00 – Why Forward-Deployed Engineers Are the Future of Enterprise AI
43:00 – Sierra's Unusual Board Meetings & Billion-Dollar Company Playbook
48:00 – The Four Values Behind a $16B Startup: Craftsmanship, Intensity & Family
56:00 – Clay Bavor's Hiring Philosophy, AI-First Teams & What's Coming Next
Key Insights
- Bavor argues that demand for frontier intelligence is effectively unbounded in high-stakes domains like coding, science, and law, meaning open model advancement won't eliminate frontier model demand but will create a portfolio approach where companies use both models for different tasks.
- He claims that Chinese companies' willingness to do scaled distillation of U.S. frontier models is the primary reason Chinese open-weights models appear more advanced than U.S. open models, not superior capability development.
- Bavor states that token costs are increasing rather than decreasing due to the proliferation of reasoning models and test-time compute, contradicting early assumptions about token price deflation.
- He asserts that the fundamental constraint on token costs is GPU supply and power availability, not model capability, meaning even if open models become very capable, prices will remain high due to compute scarcity.
- Bavor claims that forward-deployed engineers are not strictly necessary to sell enterprise AI but are a critical catalyst for rapid value delivery and customer success, enabling 6-week deployments at Fortune 50 companies.
- He argues that no organization has successfully deployed AI agents before Sierra's customers, meaning deep business partnership and embedded engineers matter more than traditional vendor relationships in this nascent category.
- Bavor states that token spending on engineering will eventually reach approximately 20% of developer salaries (five times current levels), making the $100,000 annual token budget per engineer a normalized operating expense.
- He claims that 22-23 year-old AI-native employees are among Sierra's most effective at the entire company, suggesting age and experience matter less than deep AI tool familiarity for this technology wave.
- Bavor asserts that founders must set the pace for intensity and selectively apply direct involvement only to things that won't happen quickly without founder force, avoiding performative founder-mode in areas that don't require it.
- He argues that in-person work is essential for young companies to build culture, shared norms, mentorship, and apprenticeship, and cannot be replicated effectively through remote work regardless of tools.
- Bavor claims that deliberately pricing funding rounds below market rate preserves founder decision-making power and enables long-term thinking unconstrained by investor pressure for rapid returns.
- He states that ambitious goals function as self-fulfilling prophecies: they reveal what must be true to achieve them and suspend disbelief about timeline constraints (for example, building the Japan business this year rather than next year).
Transcript
We have not yet appreciated the unbounded demand for, call it frontier levels of intelligence. Part of the driver of the difference is probably the willingness of Chinese companies to do scaled distillation of the frontier models. If you can't build frontier models yourself, okay, maybe the next best approach is to distill them and offer them up. Every one of our rounds, we actually guided to and took a lower price than we could have. Some of our most effective employees at the entire company are 22 or 23 years old and have been completely AI-pilled. We completely changed our engineering interview process. When Pat Grady at Sequoia and Neil Mehta at Greenoaks tell you someone is special, well,…
More from The Twenty Minute VC (20VC): Venture Capital | Startup Funding | The Pitch
20VC: Dario and Anthropic Declare War on Open-Source | Coinbase Slash AI Spend by 50% | Kalshi's $40BN Valuation and Impending IPO | Bending Spoons: Smartest IPO of 2026 and the Year for SaaS Roll-Ups
Harry Stebbings, Jason Lemkin, and Rory O'Driscoll discuss major tech developments including Coinbase's 50% AI spend reduction, Anthropic's concerns about Chinese model distillation, Microsoft's declining growth, Bending Spoons' $20B IPO valuation, and emerging opportunities in B2B SaaS consolidation. The conversation explores tensions between frontier model economics, open-source competition, regulatory dynamics, and venture funding standards in the AI era.
20VC: Nikesh Arora on the Frontier Model Problem: Breadth vs Depth | The Future of Token Costs | Memory Becoming the Moat | Where Value Accrues: Infra, Models, or Apps? | Why Enterprise AI is Not Ready & Systems of Record vs Systems of Intelligence
Nikesh Arora, CEO of Palo Alto Networks, discusses the tension between frontier AI models pursuing consumer breadth versus enterprise depth, arguing token pricing will decline 90% long-term and that enterprise transformation requires rethinking workflows with AI-native applications rather than marginal improvements to existing processes.
20VC: Why Remote Work is White Collar Fraud | Why Revenge and Patriotism are the Best Founder Traits | Two Questions Every Founder Needs to Ask | The Wild Story of Raising $1BN from Masa Son in an Hour Long Meeting with Ryan Peterson, Founder @ Flexport
Ryan Peterson, founder of Flexport, discusses venture capital dynamics, remote work skepticism, AI integration in logistics, and his journey building a $450M revenue freight forwarding company. He shares insights on founder mistakes, VC collusion, angel investing lessons, and the future of enterprise automation through AI agents.
20VC: SpaceX Launches Largest Ever IPO | OpenAI Files to Go Public | Uber Cuts 23% of HR | Lovable Hits $500M ARR | Founders Revolt Against VCs: The Fundraising Horror Stories Going Viral
Harry Stebbings, Jason Lemkin, and Rory O'Driscoll discuss the week's biggest tech news including SpaceX's $1.77 trillion IPO roadshow, OpenAI filing to go public, Uber cutting 23% of HR, Lovable hitting $500M ARR, and founders sharing VC fundraising horror stories. The panel debates IPO mechanics, AI efficiency trends, and the structural shift toward leaner startups powered by AI.
20VC: Nebius Co-Founder on AI Infrastructure Bubbles | The Real Impact of Open Source on OpenAI & Anthropic | How Price Elastic is Demand for Compute | Could Nebius Sell 10x More Compute If They Had It & more with Roman Chernin
Roman Chernin, co-founder of Nebius, discusses the company's AI infrastructure strategy, arguing we are at the very beginning of AI adoption rather than in a bubble. He outlines Nebius's four-layer product stack from bare metal capacity to managed inference, and explains why diversification of customers and vertical integration are critical to long-term survival against hyperscaler competition.