Codex Shifted Categories Entirely (and Nobody Noticed)
OpenAI transformed Codex from a coding tool into a comprehensive desktop agent that can control any Mac application through visual interface interaction, significantly outperforming Claude's computer use capabilities. This represents a strategic shift toward computer work rather than just knowledge work, enabled by acquiring a specialized team with deep macOS expertise.
Summary
OpenAI completely revamped Codex on April 16th, transforming it from a simple command-line coding tool into a sophisticated desktop agent capable of operating any Mac application through visual interface control. The transformation occurred in stages throughout 2025, with the Mac desktop app launching in February, Windows support in March, and the major computer use capabilities arriving in April. Codex can now see screens, click and type like a human, run multiple agents in parallel in the background, generate images, browse the web, and maintain memory across sessions.
The speaker extensively compares Codex to Claude's computer use capabilities, finding Codex significantly faster and more reliable. While Claude's cursor hesitates and sometimes requires retries, Codex operates at near-human speed and rarely fumbles tasks. This performance advantage stems from both GPT 5.4's native computer use capabilities and what OpenAI calls 'deep OS level wizardry': background agents that don't hijack the user's cursor or steal focus.
The analysis reveals fundamental strategic differences between OpenAI and Anthropic. Anthropic focuses on knowledge work through structured interfaces, MCP servers, and explicit permissions, requiring ecosystem cooperation. OpenAI pursues broader 'computer work' through direct graphical interface control, eliminating the need for vendor cooperation or API development. This approach means any software with a graphical interface becomes automatable, including legacy enterprise software and internal tools.
Codex's computer use capabilities originated from OpenAI's October 2025 acquisition of Software Applications Incorporated, a 12-person team behind an unreleased Mac AI interface called Sky. This team previously built Workflow (acquired by Apple and turned into Shortcuts) and includes former Apple engineers with deep macOS expertise. The speaker emphasizes that both labs are acquiring specialized teams rather than just intellectual property, as model capabilities converge while human expertise remains scarce.
Looking forward, both companies aim for persistent, ambient, event-driven agents. OpenAI's Chronicle feature captures screen activity to improve computer use over time, while Anthropic's leaked Conway system represents an always-on agent environment with webhook triggers and browser control. The speaker predicts OpenAI's approach is more likely to succeed because it doesn't require ecosystem cooperation, though acknowledges enterprise software could move faster than expected in Claude's favor.
Key Insights
- Greg Brockman stated that models have shifted from being the product to being part of the product, with the focus now on building the 'body' around the AI brain
- GPT 5.4 benchmarks in the mid-70s on OSWorld, placing it above the human baseline for graphical user interface control
- Sam Altman admitted that OpenAI started the year behind Anthropic on real-world coding data and only appreciated the gap in hindsight
- OpenAI acquired the entire 12-person team from Software Applications Incorporated, whose members previously built Workflow (which became Apple Shortcuts) and bring deep Apple OS experience
- OpenAI leadership cut popular products like Sora and drug discovery efforts because they didn't align with their three strategic vectors: agentic platform, computer work, and personal AGI