Opinion, News

Claude Sonnet 5 is HERE!

Julian Goldie SEO

Claude Sonnet 5 has been released as Anthropic's most agentic model, but the speaker argues it's a disappointing release that underperforms Opus 4.8 while being more expensive, making it an unattractive option for most users. The reviewer demonstrates this through benchmark comparisons and test outputs, concluding that users should stick with Opus 4.8 or wait for the incoming Fable 5 model.

Summary

The transcript covers a critical review of Claude Sonnet 5, which Anthropic announced as its most agentic model to date. The speaker begins by acknowledging the release but immediately tempers expectations, noting that while Sonnet 5 is an improvement over Sonnet 4.6, it underperforms Opus 4.8 on the benchmarks shown. On agentic coding, Sonnet 5 scores 63% versus 69% for Opus 4.8, and in the speaker's words Opus wins on "literally every single benchmark."

The speaker then showcases practical examples of Sonnet 5's capabilities through Goldy Bench tests, including a ray caster maze, orbit simulation, synthwave background, and crypt game. While some outputs are aesthetically pleasing and functional, others are completely broken, most notably the orbit test, which fails entirely. Compared side by side with GLM 5.2, Sonnet 5 shows mixed results: it creates smoother maze graphics but fails on the orbit task where GLM 5.2 succeeds.

A critical issue highlighted is pricing: Sonnet 5 costs 1.2x as much as Opus 4.8 while scoring lower, making it objectively worse value. Multiple tweets from industry figures (Lisa, Bridge Mind) are quoted criticizing Anthropic's token efficiency and questioning the release decision. The speaker also compares Sonnet 5 unfavorably to other models such as Sekana Fugu Ultra, which produces noticeably superior visual quality in test outputs.
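To make the value argument concrete, here is a rough worked example using only the two figures quoted in this summary (63% versus 69% on agentic coding, and a 1.2x price ratio). The "points per unit of relative cost" metric and the function name `value_ratio` are assumptions introduced for illustration; the speaker does not compute this number, and it treats Opus 4.8 as the cost baseline of 1.0.

```python
# Illustrative sketch only: "score per unit of relative cost" is an assumed
# metric, not one used in the video. Inputs are the figures quoted above.

def value_ratio(score_pct: float, relative_cost: float) -> float:
    """Benchmark points obtained per unit of relative cost."""
    return score_pct / relative_cost

# Opus 4.8 is taken as the cost baseline (1.0); Sonnet 5 is 1.2x that cost.
opus_value = value_ratio(69.0, 1.0)
sonnet_value = value_ratio(63.0, 1.2)

print(f"Opus 4.8: {opus_value:.1f} points per cost unit")
print(f"Sonnet 5: {sonnet_value:.1f} points per cost unit")
```

Under these assumptions Sonnet 5 comes out lower on both the raw score and the cost-adjusted figure, which is the core of the pricing criticism.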

The speaker's conclusion is unambiguous: there is no compelling reason to use Sonnet 5 over Opus 4.8, and Anthropic's honest benchmarking (rather than inflating numbers) is appreciated but doesn't change the underlying problem. The speaker anticipates Fable 5 will overshadow this release entirely. The final recommendation pivots to a systems-based approach rather than chasing individual model releases—building flexible agent systems that can swap models as needed rather than depending on any single "hot model."
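The systems-based recommendation can be sketched as a small agent wrapper whose model backend is swappable. This is a minimal illustration, not the speaker's implementation: the `Agent` class, the model labels, and the stub backends are all hypothetical, and the stubs stand in for real API clients.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# A "model" here is any callable that maps a prompt to a text reply.
ModelFn = Callable[[str], str]


@dataclass
class Agent:
    """Hypothetical agent that delegates to whichever model is active."""
    models: Dict[str, ModelFn] = field(default_factory=dict)
    active: str = ""

    def register(self, name: str, fn: ModelFn) -> None:
        self.models[name] = fn
        if not self.active:  # the first registered model becomes the default
            self.active = name

    def use(self, name: str) -> None:
        if name not in self.models:
            raise KeyError(f"unknown model: {name}")
        self.active = name

    def run(self, task: str) -> str:
        return self.models[self.active](task)


# Stub backends (assumptions); real ones would call a provider's API.
def stub_opus(prompt: str) -> str:
    return f"[opus-stub] {prompt}"


def stub_sonnet(prompt: str) -> str:
    return f"[sonnet-stub] {prompt}"


agent = Agent()
agent.register("opus-4.8", stub_opus)
agent.register("sonnet-5", stub_sonnet)

first = agent.run("build a maze")   # handled by the default backend
agent.use("sonnet-5")               # swap models without touching the agent
second = agent.run("build a maze")
```

Because the rest of the system only calls `agent.run`, replacing one model with another is a one-line change, which is the property the speaker argues matters more than any single release.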

Key Insights

  • Opus 4.8 outperforms Claude Sonnet 5 on nearly every benchmark, with 69% versus 63% on agentic coding, despite Sonnet 5 being more expensive at 1.2x the cost of Opus 4.8
  • Sonnet 5 demonstrates inconsistent performance across different tasks—succeeding on the maze creation but completely failing on the orbit simulation test, whereas GLM 5.2 achieved the opposite results
  • Multiple industry figures criticized the Sonnet 5 release as fundamentally flawed because the whole point of using Sonnet is that it should be faster and cheaper, but Sonnet 5 violates this by being more expensive than the superior Opus 4.8
  • When comparing visual quality outputs, Sonnet 5 produces darker, less interesting results than Sekana Fugu Ultra and creates noticeably worse liquid simulation graphics compared to competing models
  • The speaker recommends building flexible systems with pluggable models rather than optimizing for any single model release, so that regardless of which model performs best, the underlying architecture remains valuable

Topics

  • Claude Sonnet 5 Release and Performance
  • Model Benchmarking and Comparison
  • Pricing and Value Proposition
  • Agentic AI Capabilities
  • Systems-Based AI Architecture

Transcript

[0:00] So, today we have the release of Claude Sonnet 5, apparently the most agentic model from Claude and Anthropic yet. And you can see the announcement here, just dropped a few hours ago. It can make plans, use tools like browsers and terminals, and run autonomously at a level that just a few months ago required larger and more expensive models. We'll come on to this in a minute. I'm not going to hype it up here because you'll see from my test. I'm just going to tell you the honest [0:30] truth. So, here you can see you've got Sonnet 5, you've got Sonnet 4.6. So, it is a step up from Sonnet 4.6. If you actually look, like Sonnet…
