Claude Sonnet 5 is HERE! Summary — Julian Goldie SEO

Summary

The transcript covers a critical review of Claude Sonnet 5, which Anthropic announced as their most agentic model to date. The speaker begins by acknowledging the release but immediately tempers expectations, noting that while Sonnet 5 represents an improvement over Sonnet 4.6, it significantly underperforms Opus 4.8 across nearly every benchmark. The agentic coding score is 63% for Sonnet 5 versus 69% for Opus 4.8, with Opus outperforming on "literally every single benchmark."

The speaker then showcases practical examples of Sonnet 5's capabilities through Goldy Bench tests, including a ray caster maze, orbit simulation, synthwave background, and crypt game. While some outputs are aesthetically pleasing and functional, others are completely broken—notably the orbit test failing entirely. When compared side-by-side with GLM 5.2, Sonnet 5 shows mixed results: it creates smoother maze graphics but fails on the orbit task where GLM 5.2 succeeds.

A critical issue highlighted is pricing: Sonnet 5 costs 1.2x as much as Opus 4.8, which the speaker argues makes it objectively worse value. Multiple tweets from industry figures (Lisa, Bridge Mind) are quoted criticizing Anthropic's token efficiency and questioning the release decision. The speaker also compares Sonnet 5 unfavorably to other models such as Sekana Fugu Ultra, which produces noticeably superior visual quality in test outputs.

The speaker's conclusion is unambiguous: there is no compelling reason to use Sonnet 5 over Opus 4.8, and Anthropic's honest benchmarking (rather than inflating numbers) is appreciated but doesn't change the underlying problem. The speaker anticipates Fable 5 will overshadow this release entirely. The final recommendation pivots to a systems-based approach rather than chasing individual model releases—building flexible agent systems that can swap models as needed rather than depending on any single "hot model."

Key Insights

Opus 4.8 outperforms Claude Sonnet 5 on nearly every benchmark, with 69% versus 63% on agentic coding, despite Sonnet 5 being more expensive at 1.2x the cost of Opus 4.8

Sonnet 5 demonstrates inconsistent performance across different tasks: it produced the smoother maze but completely failed the orbit simulation test, whereas GLM 5.2 succeeded on the orbit task with a less smooth maze

Multiple industry figures criticized the Sonnet 5 release as fundamentally flawed because the whole point of using Sonnet is that it should be faster and cheaper, but Sonnet 5 violates this by being more expensive than the superior Opus 4.8

When comparing visual quality outputs, Sonnet 5 produces darker, less interesting results than Sekana Fugu Ultra and creates noticeably worse liquid simulation graphics compared to competing models

The speaker recommends building flexible systems with pluggable models rather than optimizing for any single model release, so that regardless of which model performs best, the underlying architecture remains valuable
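
The pluggable-model idea in the last insight can be sketched roughly as follows. This is a minimal illustration, not the speaker's actual system: the `Agent` class, the `ModelFn` type, and the two stub backends are all hypothetical names introduced here, and a real setup would wrap each vendor's client behind the same prompt-in, text-out interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Assumption: a model backend is any callable mapping a prompt to a completion.
ModelFn = Callable[[str], str]


@dataclass
class Agent:
    """Task logic that depends only on the ModelFn interface, not on a vendor."""
    registry: Dict[str, ModelFn]
    active: str

    def swap(self, name: str) -> None:
        # Switch backends without touching any task code.
        if name not in self.registry:
            raise KeyError(f"unknown model: {name}")
        self.active = name

    def run(self, task: str) -> str:
        return self.registry[self.active](task)


# Hypothetical stand-ins for real model clients.
def fake_sonnet(prompt: str) -> str:
    return f"[sonnet-stub] {prompt}"


def fake_opus(prompt: str) -> str:
    return f"[opus-stub] {prompt}"


agent = Agent(registry={"sonnet": fake_sonnet, "opus": fake_opus}, active="sonnet")
first = agent.run("draw a maze")
agent.swap("opus")  # a new "hot model" arrives; only the registry entry changes
second = agent.run("draw a maze")
```

Because the task code calls only `agent.run`, a newly released model becomes one more registry entry rather than a rewrite.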

Transcript

[0:00] So, today we have the release of Claude Sonnet 5, apparently the most agentic model from Claude and Anthropic yet. And apparently, you can see the announcement here, just dropped a few hours ago. It can make plans, use tools like browsers and terminals, run autonomously at a level that just a few months ago required larger and more expensive models. We'll come on to this in a minute. I'm not going to hype it up here because you'll see from my test. I'm just going to tell you the honest [0:30] truth. So, here you can see you got Sonnet 5, you got Sonnet 4.6. So, it is a step up from Sonnet 4.6. If you actually look, like Sonnet…

Full transcript available for MurmurCast members
