Claude Mythos is Too Dangerous to Use
Anthropic has released Claude Mythos Preview, a powerful AI model with exceptional coding abilities that's restricted to select partners due to security risks. The model can find zero-day vulnerabilities and exploits, leading to Project Glasswing - a $100 million initiative to address security concerns before wider release.
Summary
Anthropic has unveiled Claude Mythos Preview, marking the first time their frontier model isn't publicly available due to security concerns. The model is restricted to select partners including Apple, Google, Nvidia, and Microsoft under Project Glasswing, named after glasswing butterflies whose transparency represents both the vulnerabilities discussed and the transparent approach being taken. The model demonstrates impressive coding capabilities, scoring 93.9% on SWE Bench Verify tests and 77.8% on SWE Bench Pro, significantly outperforming predecessors by 24 points. However, these capabilities also enable the model to find software vulnerabilities, including discovering a 27-year-old OpenBSD flaw, a 16-year-old FFmpeg vulnerability, and a Linux privilege escalation bug. The model has reportedly found thousands of zero-day vulnerabilities across major operating systems and web browsers. This development has accelerated the 'token maxing' trend, where companies incentivize employees to use more AI tokens, with Meta creating internal leaderboards ranking employees by token consumption. Google's product director suggests that go-to-market skills are becoming crucial as AI democratizes building capabilities. Industry data shows GitHub commits have increased 14x year-over-year to 275 million weekly, while company AI spending has grown 4x, with median companies dedicating 15% of software budgets to AI tools. Despite concerns about job displacement, tech job openings are up 30% this year, though 52,000 tech job cuts were announced with AI cited as a leading reason.
Key Insights
- Anthropic argues that Claude Mythos Preview's coding abilities are so advanced that it can surpass all but the most skilled humans at finding and exploiting software vulnerabilities
- The speaker claims that Project Glasswing represents a proactive response to the reality that such powerful capabilities will eventually spread beyond actors committed to deploying them safely
- Meta has built internal leaderboards that rank 85,000 employees by AI token consumption, turning token usage into a measure of engineering prestige with badges like 'Token Legend'
- Google's product director contends that go-to-market has become the essential skill as AI democratizes building capabilities, shifting the question from 'can you build' to 'should you build'
- The analysis shows that GitHub commits have exploded 14x year-over-year while median companies now dedicate nearly 15% of their total software budget to AI tools
Topics
Full transcript available for MurmurCast members
Sign Up to Access