9 Codex Tips From the Codex Team

The AI Daily Brief: Artificial Intelligence News and Analysis · May 19, 2026 · 29m 36s

The AI Daily Brief covers three headline stories: Cursor's launch of Composer 2.5 (a competitive coding model that Cursor claims runs at up to 10x lower cost than rivals), Cloudflare's findings on Anthropic's Mythos security model, and Elon Musk losing his lawsuit against OpenAI. The main segment breaks down nine tips from Jason Liu, a member of OpenAI's Codex team, on using Codex as a persistent work system rather than a simple chat interface.

Summary

The episode opens with three headline stories before diving into a practical guide on using Codex effectively.

The first headline covers Cursor's release of Composer 2.5, a significant upgrade to its in-house coding model, built on Moonshot's Kimi K2.5 base with improved reinforcement learning. The model scores competitively against Claude Opus 4.7 and GPT-5.5 on key benchmarks (69.3% on Terminal-Bench 2.0, 79.8% on SWE-bench Multilingual) while costing half as much as rivals, at 50 cents per million input tokens. Cursor also claims 10x token efficiency, with benchmark runs costing under $1 per task compared to $5-$11 for competitors. Cursor is simultaneously training a new model from scratch on xAI's Colossus 2 cluster. The release is framed within the broader competitive squeeze Cursor faces from both model labs (entering the harness space) and harness labs (building their own models).

The second headline covers Cloudflare's review of Anthropic's Mythos model for security research. Cloudflare found Mythos represents a qualitative leap: unlike previous models that only detected individual bugs, Mythos can chain multiple vulnerabilities into functional exploits, behaving more like a senior security researcher. It can also test and refine exploits iteratively, making it far more useful than models that generate lists of unverified potential vulnerabilities.

The third headline covers the conclusion of Elon Musk's lawsuit against OpenAI and Sam Altman. The jury returned a unanimous verdict in just two hours, finding Musk's claims were barred by the statute of limitations — he had waited too long to file. The trial surfaced internal OpenAI history, including a 2017 proposal by Musk to fold OpenAI into Tesla and a 2018 term sheet describing the for-profit structure Musk later claimed was illegitimate.

The main segment extracts nine tips from Codex team member Jason Liu's 'Codex Maxxing' post:

Tip 1 advocates using long-running, durable 'monothreads' per workstream, relying on Codex's improved context compaction to maintain continuity.
Tip 2 champions voice input, arguing that rambling verbally gives the model access to the messy, uncertain version of one's thinking rather than a polished prompt, leading to better outputs.
Tip 3 covers Codex's 'Steer' feature, which lets users inject feedback mid-task without stopping execution, enabling human-agent parallel work.
Tip 4 is about externalizing memory into structured file systems (like an Obsidian vault synced to GitHub) so that insights from threads survive beyond any single conversation.
Tip 5 covers tool use (computer use, browser use, and connectors) as the mechanism by which Codex becomes an evidence gatherer across live environments.
Tip 6 addresses mobile and remote control, enabling users to steer long-running tasks without being at a desktop.
Tip 7 introduces 'heartbeats,' scheduled or triggered check-ins that keep threads active and cross tool boundaries (e.g., checking Slack, re-rendering video, uploading via computer use).
Tip 8 briefly touches on the 'slash goal' feature for projects with verifiable success criteria, noted as deserving its own dedicated episode.
Tip 9 highlights the side panel as the space where Codex transitions from a chat app to a work environment, allowing artifact inspection and annotation without interrupting the agent's workflow.
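To make Tip 4 concrete, here is a minimal sketch of externalized memory as plain Markdown notes in a vault folder. It is an illustration only, not a Codex feature or the layout Jason Liu uses: the `workstreams/` folder, the one-note-per-workstream convention, and the function names `record_insight` and `load_memory` are all assumptions. Syncing the folder to GitHub is left out.

```python
from datetime import date
from pathlib import Path


def note_path(vault: Path, workstream: str) -> Path:
    # Hypothetical layout: one Markdown note per workstream.
    return vault / "workstreams" / f"{workstream}.md"


def record_insight(vault: Path, workstream: str, insight: str, day: date) -> Path:
    """Append one dated insight to the workstream note, creating it if needed."""
    note = note_path(vault, workstream)
    note.parent.mkdir(parents=True, exist_ok=True)
    if not note.exists():
        # Start the note with a heading so it reads well in a Markdown viewer.
        note.write_text(f"# {workstream}\n\n", encoding="utf-8")
    with note.open("a", encoding="utf-8") as fh:
        fh.write(f"- {day.isoformat()}: {insight.strip()}\n")
    return note


def load_memory(vault: Path, workstream: str) -> str:
    """Return the note's text so it can be pasted into a fresh thread."""
    note = note_path(vault, workstream)
    return note.read_text(encoding="utf-8") if note.exists() else ""
```

Because the notes are ordinary files, a person can inspect and edit them, and a new thread can be seeded with `load_memory(...)` after the old one ends or is compacted.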

About this episode

Codex is quickly becoming a full work environment for agentic building, and today’s episode breaks down nine practical tips from a member of OpenAI’s Codex team for getting more out of it. NLW covers durable long-running threads, voice as a way to give agents richer context, steering while work is still in progress, structured memory, tool access, remote control, heartbeats, goals, and the side panel as the place where human and agent work stay in motion together. In the headlines: Cursor’s Composer 2.5, Cloudflare’s review of Anthropic’s Mythos Preview, and the verdict in Elon Musk’s OpenAI lawsuit.

Source: https://jxnl.github.io/blog/writing/2026/05/10/codex-maxxing/

Key Insights

Jason Liu argues that voice input is valuable not just for speed but because it gives the model access to the 'messy version' of one's thinking — including uncertainty, trade-offs, and half-formed ideas — which leads to better outputs than polished typed prompts.
The host frames Cursor's competitive challenge as a two-sided squeeze: model labs like Anthropic are building their own coding harnesses (Claude Code), while Cursor simultaneously can't afford to keep subsidizing Anthropic model costs — making building their own model an existential priority, not just a strategic one.
Cloudflare's review found that what distinguishes Mythos from other models isn't bug detection (which many models can do) but the ability to synthesize multiple vulnerabilities into functional, iteratively refined exploits — a qualitative shift from automated scanner to senior researcher behavior.
Jason Liu argues that native Codex memory features are insufficient for serious workflows, and that structured external file systems (like an Obsidian vault) are necessary because they force the agent to compress experience into inspectable, editable artifacts that survive thread death or compaction failures.
The Musk vs. OpenAI trial was decided purely on technical grounds — the statute of limitations — without the jury ever considering the substantive merits of the breach of charitable trust claim, meaning the deep questions about OpenAI's for-profit conversion remain legally unresolved.

Topics

Cursor Composer 2.5 model release
Anthropic Mythos security model review by Cloudflare
Elon Musk vs. OpenAI trial verdict
Codex tips from the Codex team (Jason Liu's 'Codex Maxxing')
AI harness vs. model lab competitive dynamics

Transcript

Today on the AI Daily Brief, nine Codex tips from the Codex team. Before that in the headlines, yeah, we got a verdict in the Elon OpenAI trial, but that's much less interesting than Composer 2.5. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right, friends, quick announcements before we dive in. First of all, thank you to today's sponsors, KPMG, Robots and Pencils, Bolt, and Zencoder. To get an ad-free version of the show, go to patreon.com slash ai-dailybrief, or you can subscribe on Apple Podcasts. To learn more about sponsoring the show, head on over to ai-dailybrief.ai, or send us a note at sponsors at ai-dailybrief.ai.…
