Build A Token Dashboard This Weekend. It'll Show The Work You Keep Avoiding.
The speaker built a token burn dashboard using Codex to visualize and analyze his AI usage patterns, arguing that measuring token consumption creates a feedback loop that drives smarter AI use. He emphasizes that token burn correlates with solution quality and that sharing usage data publicly can help communities learn and grow together. The core message is that reimagining your entire computing experience through AI requires tools that reveal behavioral patterns, not just raw statistics.
Summary
The speaker opens by framing his token burn dashboard not as a bragging tool but as a mechanism for self-reflection and behavioral improvement. He burned approximately 800 million tokens in a single day, but his focus is on what that usage reveals about how he deploys AI. He argues that without a feedback loop showing how AI usage maps to outcomes, it is nearly impossible to self-improve in how one uses these tools.
A key technical challenge discussed is the difficulty of measuring token usage outside of direct API access. While Codex provides precise token counts, tools like Claude's chat interface and co-work environment do not surface this data easily. The speaker used Codex itself to approximate his Claude token usage by reasoning from logs and artifacts, noting the irony of using one AI to infer the usage of another.
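The summary does not say how Codex turned logs and artifacts into an estimate, so the following is only a minimal sketch of one way such a range could be produced. It assumes exported conversation text is available as plain strings and uses a rough characters-per-token heuristic in place of a real tokenizer. The function name and both bounds are hypothetical choices for illustration, not the speaker's method.

```python
import math

# Assumed characters-per-token bounds for English prose.
# These are rough heuristics, not values taken from any tokenizer.
CHARS_PER_TOKEN_LOW = 3.0   # denser tokenization, so more tokens
CHARS_PER_TOKEN_HIGH = 5.0  # sparser tokenization, so fewer tokens


def estimate_token_range(texts):
    """Return (low, high) token estimates for an iterable of message strings."""
    total_chars = sum(len(t) for t in texts)
    low = math.floor(total_chars / CHARS_PER_TOKEN_HIGH)
    high = math.ceil(total_chars / CHARS_PER_TOKEN_LOW)
    return low, high


if __name__ == "__main__":
    # Made-up sample messages standing in for an exported chat log.
    sample_log = ["Summarize this report.", "Here is a three-paragraph summary."]
    low, high = estimate_token_range(sample_log)
    print(f"estimated tokens: {low} to {high}")
```

Reporting a low and high bound rather than a single number keeps the estimate honest about how little a character count can say about true token usage.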
The dashboard was built using a Tufte-inspired open-source design skill, featuring a GitHub-style heatmap, logarithmic scaling to handle orders-of-magnitude variation in daily usage, top-10 usage day breakdowns, and multi-model distribution views. The speaker describes prompting Codex in plain English with a clear mental picture rather than precise technical specifications, completing the build in roughly an hour.
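To illustrate why logarithmic scaling matters for a heatmap like this, here is a small sketch of two of the pieces described above: mapping daily token counts to heatmap intensity levels on a log scale, and picking the top usage days. The function names, the four-level scale, and the sample figures are assumptions for illustration. The source does not describe the dashboard's actual code.

```python
import math
from datetime import date


def heat_level(tokens, max_tokens, levels=4):
    """Map a daily token count to an intensity from 0 to `levels` on a log scale."""
    if tokens <= 0 or max_tokens <= 0:
        return 0
    # log10(1 + x) keeps small days visible next to days that are 1000x larger.
    ratio = math.log10(1 + tokens) / math.log10(1 + max_tokens)
    return min(levels, max(1, math.ceil(ratio * levels)))


def top_days(daily, n=10):
    """Return the n (day, tokens) pairs with the highest token counts, largest first."""
    return sorted(daily.items(), key=lambda item: item[1], reverse=True)[:n]


if __name__ == "__main__":
    # Made-up sample data spanning several orders of magnitude.
    daily = {
        date(2025, 1, 1): 40_000,
        date(2025, 1, 2): 1_000_000,
        date(2025, 1, 3): 800_000_000,
    }
    peak = max(daily.values())
    for day, tokens in top_days(daily):
        print(day.isoformat(), tokens, heat_level(tokens, peak))
```

On a linear scale, a one-million-token day would be nearly invisible next to an 800-million-token day. On the log scale it still lands in a middle intensity band.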
He uses a concrete example of adopting the /workflows command from the Opus 4.5 release, which enables multi-agent orchestration. He applied it to research schools for his children, deploying three to four sub-agents to produce a comprehensive report. The dashboard allowed him to visually confirm that this behavioral change corresponded to a spike in token usage and higher-quality outputs.
The speaker draws a broader philosophical point: AI models are grown through reinforcement learning across trillions of parameters, not designed with fully understood capabilities. This means even their creators do not know the full extent of what they can do, and discovery requires active experimentation and community sharing. He contrasts users burning a few million tokens daily with those approaching a billion, a gap he puts at nearly 99% in token volume and, in his view, in practical fluency as well.
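The "near-99%" figure can be checked with simple arithmetic. The sketch below uses the roughly 800 million tokens mentioned earlier as the heavy user's day and assumes 4 million as a stand-in for "a few million". That second number is hypothetical, chosen only to illustrate the calculation.

```python
def relative_gap(light_user_tokens, heavy_user_tokens):
    """Fraction by which the lighter user's daily volume falls short of the heavier user's."""
    return 1 - light_user_tokens / heavy_user_tokens


if __name__ == "__main__":
    # 4 million is an assumed example of "a few million" tokens per day.
    gap = relative_gap(4_000_000, 800_000_000)
    print(f"gap in daily token volume: {gap:.1%}")
```

With these inputs the lighter user's volume is 0.5% of the heavier user's, so the gap is about 99.5%. This only covers token volume. The claim about practical fluency is the speaker's judgment and cannot be derived from the numbers.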
He concludes with a call to action for the community to build their own dashboards, share their token burn data publicly, and contribute to a collective learning culture. He references a future where token burn history could function similarly to a GitHub profile for demonstrating AI competency to employers, and invites viewers to share creative AI use cases so the community can learn from one another.
Key Insights
- The speaker argues that token burn correlates directly with solution quality, citing repeated lab studies showing that spending more tokens produces better results — making it a measurable proxy for 'deployed delegated intelligence' rather than just a cost metric.
- The speaker used Codex to approximate his own Claude token usage by having Codex quiz him about his activity and reason from artifacts and logs to produce a tight estimated range — highlighting that Claude's chat and co-work interfaces provide no native token visibility outside the API.
- The speaker claims that only 0.6% of ChatGPT users are currently using Codex, framing this as evidence of how early the market is and of how much room remains, even among active AI users, to expand what they imagine doing with these tools.
- The speaker asserts that AI models are 'grown, not made' through reinforcement learning across roughly 10 trillion parameters, meaning even their creators do not fully understand their capabilities. He adds that portraying them as traditional software is actively deceptive.
- The speaker describes a personal computing architecture where a 'chief of staff' thread in Codex maintains full project context and spins up child threads for detail work, keeping context windows clean — and argues this parallel multi-agent approach is what drives the highest token burn days and the most successful outcomes.
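The 'chief of staff' pattern in the last insight can be sketched as a parent object that holds project context, hands tasks to isolated workers in parallel, and keeps only their short summaries. Everything here is hypothetical: the class, the method names, and the stub worker are illustrative stand-ins, not Codex's interface or the speaker's actual setup.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class ChiefOfStaff:
    """Parent thread: holds project context plus short summaries from children."""
    project_context: str
    summaries: list = field(default_factory=list)

    def delegate(self, tasks, worker):
        # Each child receives the shared context and one task, works in isolation,
        # and returns only a short summary, so the parent's context stays small.
        with ThreadPoolExecutor(max_workers=len(tasks) or 1) as pool:
            results = list(pool.map(lambda t: worker(self.project_context, t), tasks))
        self.summaries.extend(results)
        return results


def stub_worker(context, task):
    """Stand-in for a real agent call; returns a one-line summary."""
    return f"[{task}] done for project: {context}"


if __name__ == "__main__":
    chief = ChiefOfStaff("school research")
    for line in chief.delegate(["admissions", "curriculum", "commute"], stub_worker):
        print(line)
```

The design choice is that detail work never flows back into the parent in full. The parent sees one line per child, which is the "clean context window" idea described above.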