AI shows its skills in the emergency room
A Harvard study published in Science found that OpenAI's o1-preview model outperformed two attending ER physicians across 76 real emergency cases, correctly diagnosing patients 67.1% of the time versus 55.3% and 50.0% for the doctors. The newsletter also covers the Pentagon's addition of 8 AI companies to classified networks while excluding Anthropic, and shares staff and reader AI use cases ranging from medical research to real estate investment tools.
Summary
The newsletter opens with coverage of a Harvard study published in Science that tested OpenAI's o1-preview model against two attending emergency room physicians across 76 real ER cases. The AI outperformed both doctors at the initial triage stage, achieving a 67.1% correct diagnosis rate compared to 55.3% and 50.0% for the two physicians. Notably, physician reviewers scoring the diagnoses could not distinguish AI-generated outputs from human ones. In one highlighted case, the AI identified a rare flesh-eating infection in a transplant patient approximately 12–24 hours before the treating physician did. The newsletter frames this as evidence that AI is ready for a formal role alongside doctors, especially given that newer frontier models would likely perform even better.
The Rundown Roundtable section features staff members sharing personal AI use cases. One writer describes using Gemini and ChatGPT to navigate her daughter's rare autoimmune disease diagnosis, using AI to parse medical literature, identify leading specialists, and understand treatment options across countries. An editor shares how he uses ChatGPT to analyze packaged food labels, uploading product photos to flag unhealthy ingredients and decode technical food-industry jargon like INS numbers and emulsifiers.
The newsletter includes a tutorial on using Claude Design, Anthropic's new AI design tool, to generate high-converting landing page mockups. The guide walks through a multi-step process involving wireframe creation, screenshot references from high-traffic pages, and iterative refinement using comments — with a final option to hand off the design to Claude Code for deployment.
On the geopolitical and defense front, the newsletter reports that the Pentagon added 8 AI companies (OpenAI, Google, SpaceX, Nvidia, Microsoft, AWS, Oracle, and Reflection) to its classified networks, while notably excluding Anthropic due to a standing supply-chain risk designation. The newsletter highlights the apparent contradiction in the White House simultaneously blacklisting Anthropic while seeking priority access to its Mythos model. It also flags Reflection's inclusion as notable, given its $2B funding from a Donald Trump Jr.-backed venture fund.
Additional news briefs cover OpenAI shipping Codex Pets (animated desktop companions for tracking agent tasks), Maryland passing the first U.S. ban on AI-driven personalized grocery pricing, SAG-AFTRA securing AI protections in a new studio deal, and a Chinese court ruling that replacing a worker with AI does not legally justify termination. A reader workflow from Finland describes using Gemini Pro and other free AI tools to build a real estate market analysis tool in a country with limited data transparency.
About this episode
PLUS: Create converting landing pages in Claude Design
Key Insights
- The Harvard study found that OpenAI's o1-preview — a model already a generation behind current frontier models — outdiagnosed two attending ER physicians at triage, with the newsletter arguing this suggests even greater potential for newer AI in clinical settings.
- The newsletter highlights that physician reviewers scoring the ER cases could not tell which diagnoses came from the AI and which from humans, suggesting AI-generated medical reasoning has reached a level of qualitative parity with experienced doctors.
- The newsletter frames the Pentagon's simultaneous blacklisting of Anthropic and pursuit of access to its Mythos model as a political contradiction, suggesting the White House wants strategic AI leverage without granting Anthropic formal defense contractor status.
- A staff writer argues that AI tools like ChatGPT and Gemini proved valuable not by replacing doctors, but by helping a patient's family understand complex medical literature, evaluate treatment options across countries, and identify top specialists — describing it as a demystifying resource during a health crisis.
- The newsletter notes that Reflection, one of the eight companies added to Pentagon classified networks, raised $2 billion from a fund backed by Donald Trump Jr., raising implicit questions about the political dimensions of defense AI contracting decisions.
Transcript
Good morning, AI enthusiasts. AI just beat two attending emergency room physicians across real patient cases in a new Harvard study. The model? OpenAI’s o1-preview, released in… 2024. Millions of users are already turning to ChatGPT for health advice every day, but the data shows that AI models (preferably not several generations behind) may be ready for a more formal seat in the exam room alongside the doctor as well.
- Old AI model tops doctors in ER trial
- The Rundown Roundtable: Our AI use cases
- Create converting landing pages in Claude
- Pentagon announces new AI partners
- 4 new AI tools, community workflows, and more
OPENAI
The Rundown: A Harvard…