Stanford's Method Turns Claude Into a PhD-Level Research Team
A researcher demonstrates the STORM method from Stanford, which uses five expert perspectives (practitioner, academic, skeptic, economist, historian) to create verified research reports. The method produces 25% more organized articles than competing approaches and is packaged as a reusable Claude skill that generates HTML briefings with peer-reviewed citations.
Summary
The transcript presents a detailed walkthrough of the STORM research methodology developed at Stanford, which the speaker has adapted into a Claude skill for AI-assisted research. The core innovation is using multiple agent perspectives simultaneously to identify blind spots that single-perspective research would miss. Each agent role (practitioner, academic, skeptic, economist, historian) brings different expertise and identifies gaps the others overlook.
The speaker compares STORM to Claude's native Deep Research feature. Deep Research spins up 100+ agents to produce a markdown report with limited sources, whereas STORM uses a more structured 5-agent approach followed by 6 verification agents to produce a more reliable, organized HTML briefing. In his head-to-head comparison, another AI model (Codex) rated the STORM output superior in six categories: evidence quality, source diversity, thesis strength, actionability, risk control, and content usability.
The methodology works through four sequential prompts: first spinning up the five expert lenses, second creating a contradiction map to identify where perspectives disagree, third synthesizing everything into a single report, and fourth conducting adversarial peer review that verifies citations against primary sources. The skill incorporates an HTML template for consistency.
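The four stages above can be sketched as a small pipeline. This is only an illustration of the sequencing described in the summary, not the speaker's actual skill: the prompt wording, the `run_storm` function, and the `ask` callable (a stand-in for whatever model call is used) are all assumptions.

```python
from typing import Callable

LENSES = ["practitioner", "academic", "skeptic", "economist", "historian"]


def run_storm(topic: str, ask: Callable[[str], str]) -> dict:
    """Run the four stages in order. `ask` is a hypothetical model call."""
    # Stage 1: one research pass per expert lens.
    findings = {
        lens: ask(f"As a {lens}, research: {topic}")
        for lens in LENSES
    }
    combined = "\n\n".join(
        f"[{lens}]\n{text}" for lens, text in findings.items()
    )

    # Stage 2: contradiction map showing where the lenses disagree.
    contradictions = ask(
        f"List where these perspectives disagree:\n{combined}"
    )

    # Stage 3: synthesize findings and disagreements into one report.
    report = ask(
        f"Synthesize one report on {topic}.\n"
        f"Findings:\n{combined}\nDisagreements:\n{contradictions}"
    )

    # Stage 4: adversarial peer review of the report's citations.
    review = ask(
        "Act as an adversarial peer reviewer. Check every citation "
        f"against its primary source:\n{report}"
    )
    return {
        "findings": findings,
        "contradictions": contradictions,
        "report": report,
        "review": review,
    }


def fake_ask(prompt: str) -> str:
    """Offline stand-in for a real model call so the sketch runs as-is."""
    return f"<response to: {prompt.splitlines()[0][:40]}>"


result = run_storm("example topic", fake_ask)
print(list(result["findings"]))
```

Passing the model call in as a parameter keeps the stage ordering separate from any particular API, so the same skeleton could wrap a real client or a test stub.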
The speaker emphasizes that this skill is freely available in his community and can be installed in Claude's .claude folder. He demonstrates a live execution, showing how subagents run in parallel, how to monitor their research in real time, and how the final output ranks findings by reliability based on which perspectives supported or challenged each claim. He also explains the distinction between subagents (which only communicate with the main session) and agent teams (which can communicate with each other), noting that agent teams are more expensive.
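One way to picture ranking findings by which perspectives supported or challenged them is a simple net-support score. The summary does not say how the skill actually scores reliability, so the `Finding` class, the scoring rule, and the placeholder claims below are purely hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    claim: str
    supported_by: set[str] = field(default_factory=set)
    challenged_by: set[str] = field(default_factory=set)

    def reliability(self) -> int:
        # Assumed rule: lenses backing the claim minus lenses disputing it.
        return len(self.supported_by) - len(self.challenged_by)


def rank(findings: list[Finding]) -> list[Finding]:
    """Order findings from most to least reliable under the assumed rule."""
    return sorted(findings, key=lambda f: f.reliability(), reverse=True)


# Placeholder claims, invented for illustration only.
findings = [
    Finding("claim A", {"practitioner"}, {"skeptic", "economist"}),
    Finding("claim B", {"academic", "historian", "practitioner"}),
    Finding("claim C", {"academic"}),
]

for f in rank(findings):
    print(f.reliability(), f.claim)
```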
The broader lesson goes beyond this specific skill: research conducted from multiple, contradicting perspectives produces more holistic, accurate results than single-perspective approaches, and using agents to borrow subject-matter expertise can help overcome knowledge gaps.
Key Insights
- In peer-reviewed testing, Stanford's STORM method produced articles 25% more organized than the next best method. It works by simulating five distinct expert perspectives, each of which identifies holes the others miss.
- STORM requires a verification phase where sources are not just collected but actively confirmed, corrected, or demoted based on accuracy, whereas comparable methods like Deep Research produce unvetted brain dumps of statistics.
- When STORM's HTML briefing was compared against Claude's Deep Research output on the same topic, a third AI model (Codex) rated STORM superior across six categories, even though STORM used far fewer agents (11 versus 100+) and was cheaper to run.
- The skill automatically identifies its own blind spots by noting missing lenses. For example, all five original perspectives analyzed the research from an ownership/ROI viewpoint, missing customer and frontline employee perspectives entirely.
- The subagent architecture (where multiple agents work for one main session but cannot communicate with each other) differs fundamentally from agent teams (where agents can debate each other to reach consensus). Subagents cost significantly less while still enabling parallel research.
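The subagent topology in the last point can be sketched with `asyncio`: the main session fans work out in parallel and is the only place results meet. This is a toy model of the communication pattern, not how Claude runs subagents internally; `subagent` and `main_session` are hypothetical names, and the research step is a placeholder.

```python
import asyncio

LENSES = ["practitioner", "academic", "skeptic", "economist", "historian"]


async def subagent(lens: str, topic: str) -> str:
    # Each subagent sees only its own inputs. It has no handle to its
    # siblings, so it cannot debate them the way an agent team could.
    await asyncio.sleep(0)  # placeholder for real research work
    return f"{lens} notes on {topic}"


async def main_session(topic: str) -> dict[str, str]:
    # Fan out all lenses concurrently. gather returns results in the
    # same order as the awaitables passed in.
    reports = await asyncio.gather(
        *(subagent(lens, topic) for lens in LENSES)
    )
    # Only the main session ever holds all five reports together.
    return dict(zip(LENSES, reports))


results = asyncio.run(main_session("example topic"))
print(sorted(results))
```

An agent-team design would instead need a shared channel between the workers, which is the extra coordination the summary describes as more expensive.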
Transcript
[0:00] So Stanford has a research method called STORM, which has actually been shown in peer-reviewed testing to produce articles 25% more organized than the next best method. So I put all of those STORM principles into my own Claude skill, which I'm going to give you guys for completely free, and you end up with a result that looks like this. It is an HTML briefing that has been put together by five different perspectives of agents, and it has been verified. Meaning if I scroll down to the bottom, you can see that the different perspectives are giving analysis on each part of the report. But at the very bottom, you can see that we have different sources…
More from Nate Herk | AI Automation
Is Claude Mythos Coming?
A YouTuber analyzes the brief appearance of 'Claude Mythos' on Anthropic's API, arguing it signals a marketing move rather than an imminent public launch. Despite competitive pressure from OpenAI and IPO timing, the creator believes Mythos capabilities will quietly fold into future Opus models rather than release publicly under that name. Three possible scenarios are outlined ranging from a limited gated release to Mythos remaining permanently restricted to vetted security partners.
AGI is Here. Anthropic Just Proved It.
A YouTuber analyzes Anthropic's internal report 'When AI Builds Itself,' arguing that AGI has effectively already arrived based on Claude's ability to solve open-ended problems autonomously. Key data points include Claude achieving a 76% success rate on open-ended coding tasks (up from 26% in six months) and AI-authored code now comprising over 80% of Anthropic's shipped code. The video also addresses the risks of compounding misalignment and the lack of any viable mechanism to slow down AI development globally.
The Skill That 10x’d My Claude Code Projects
The video introduces a Claude Code skill called 'Grill Me' that relentlessly interviews users to extract tacit knowledge from their heads into reusable AI context documents. The creator explains how this front-loaded knowledge extraction leads to higher-quality AI outputs faster than iterative trial-and-error. He shares his enhanced version of the original skill by Matt PCO, which adds automatic checkpointing to preserve Q&A sessions in markdown brainstorm files.
I Tested Every Claude Code Feature, These 12 Are the Best
A content creator with 500+ hours in Claude's ecosystem ranks Claude Code features from D to S tier based on personal productivity impact. The video highlights 12 top features, with Skills ranked #1 for enabling consistent, reusable agent workflows. The ranking prioritizes knowledge work and automation use cases over traditional software development.
100 Years of Artificial Intelligence Explained
This transcript covers 100 years of AI history, from Alan Turing's Enigma-cracking Bombe machine in WWII through the symbolic vs. neural network debates, the breakthroughs of AlexNet and transformers, and culminating in the modern AI gold rush dominated by OpenAI, Google, and Anthropic. The narrative traces how foundational mathematical ideas, hardware advances, and massive datasets converged to produce today's large language models. It ends with Claude Code emerging as the dominant developer tool by late 2025.