Baiting AI [LIVE]
Matthew Berman hosts a casual live stream covering several AI industry topics including app store scams targeting his mother, the jagged nature of AI intelligence, Anthropic's controversial user policies, Meta's employee keystroke monitoring program, and a new mystery model called Owl Alpha. The stream also features live AI model testing and discussion of content creator economics on YouTube vs. X.
Summary
The stream opens with extended technical difficulties involving Matthew's newly recabled XLR microphone setup, which was accidentally set to phantom power despite the mic not requiring it. After resolving the audio issues, Matthew transitions into the main content.
Matthew shares a story about his mother being scammed by a counterfeit ChatGPT app on the Apple App Store. When searching for ChatGPT on the App Store, the first several results are deliberately designed lookalike apps with nearly identical logos, some charging around $40/year while likely serving free-tier API access. Matthew argues Apple bears responsibility for allowing these deceptive apps and demonstrates live how none of the top search results are the actual ChatGPT app. He notes his mother had purchased two fake AI apps before he intervened.
Matthew previews an upcoming video breaking down Andrej Karpathy's talk at Sequoia's AI Ascent event, focusing on the 'jagged intelligence' phenomenon, where AI excels at some tasks yet fails surprisingly at simple ones. He explains this stems from two factors: the verifiability of certain domains (like coding, where you can test outputs and get clear feedback) and the revenue incentives that drive AI labs to optimize for coding and math. He demonstrates live testing across GPT-5.3, GPT-5.5 thinking mode, and Gemini on a classic logic puzzle about whether to walk or drive to a car wash 50 meters away, showing that non-thinking models fail while thinking models and Gemini succeed.
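To illustrate the verifiability point, here is a minimal sketch of the kind of automatic pass/fail check that makes coding such an optimizable domain; the function names and test cases are illustrative and not from the stream.

```python
# Minimal sketch: why coding is a "verifiable" domain.
# A model's output (candidate code) can be executed against tests,
# yielding an unambiguous score that can drive further optimization.
# All names and test cases here are illustrative, not from the stream.

def evaluate_candidate(candidate_src: str, tests: list[tuple[int, int]]) -> float:
    """Run model-generated code and score it against known input/output pairs."""
    namespace: dict = {}
    try:
        exec(candidate_src, namespace)       # execute the generated code
        solution = namespace["solution"]     # expected entry-point function
    except Exception:
        return 0.0                           # code that doesn't run scores zero

    passed = sum(1 for x, expected in tests if solution(x) == expected)
    return passed / len(tests)               # fraction of tests passed = clear feedback

# A hypothetical model output and its verification:
candidate = "def solution(x):\n    return x * x\n"
print(evaluate_candidate(candidate, [(2, 4), (3, 9), (5, 25)]))  # -> 1.0
```

A graded score like this is the short feedback loop Matthew contrasts with creative tasks, where no comparable automatic check exists.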
The stream then covers controversy around Anthropic, citing content creator Theo's viral open letter criticizing the company. Key criticisms include opaque and arbitrary token quota manipulation, preventing subscribers from using tokens in third-party tools like OpenClaw, and a cult-like corporate culture where employees reportedly fear being fired for a single bad tweet. Matthew also notes Anthropic's strong anti-open-source lobbying stance. A data analyst named Powell challenged Theo's claim that his anti-Anthropic content costs him money, showing anti-posts get 2.8x more views on X — but Matthew and Theo both argue X view counts are heavily inflated and don't reflect meaningful conversion compared to YouTube.
Matthew briefly tests a new mystery model called 'Owl Alpha' on OpenRouter, described as a high-performance model for agentic workloads compatible with Claude Code and OpenClaw. The model initially fails the car wash logic puzzle but succeeds after being prompted to reason as a logical expert.
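For anyone who wants to reproduce the test, below is a minimal sketch of querying a model through OpenRouter's OpenAI-compatible chat completions endpoint; the 'openrouter/owl-alpha' model slug and the exact puzzle wording are assumptions for illustration, not identifiers confirmed in the stream.

```python
# Minimal sketch of querying a model on OpenRouter (OpenAI-compatible API).
# The model slug and puzzle wording below are assumptions for illustration;
# the real "Owl Alpha" identifier was not confirmed in the stream.
import os
import requests

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def ask(model: str, system: str, question: str) -> str:
    resp = requests.post(
        OPENROUTER_URL,
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
        json={
            "model": model,
            "messages": [
                {"role": "system", "content": system},  # e.g. the "logical expert" preface
                {"role": "user", "content": question},
            ],
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

puzzle = ("I need to take my car to the car wash, which is 50 meters away. "
          "Should I walk or drive there?")

# Plain prompt vs. the "expert in logical thinking" preface used as a workaround.
print(ask("openrouter/owl-alpha", "You are a helpful assistant.", puzzle))
print(ask("openrouter/owl-alpha", "You are an expert in logical thinking.", puzzle))
```

Running both calls side by side mirrors what Matthew observed: the plain prompt is where the model stumbled, while the 'logical expert' preface nudged it to the correct answer.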
The final segment covers Meta CEO Mark Zuckerberg's announcement at a company-wide meeting that Meta would install a monitoring tool called the 'Model Capability Initiative' to track employee keystrokes and mouse movements to train AI models. Zuckerberg argued Meta employees have higher average intelligence than typical data labeling contractors, making their activity more valuable training data. Matthew notes this practice of employee monitoring already exists at many companies for productivity and security reasons, but the AI training application is new. He speculates this could spread to other AI labs and eventually to non-tech industries if successful.
Key Insights
- Matthew argues that Apple bears direct responsibility for allowing deliberately deceptive AI app clones on the App Store, noting all top ChatGPT search results are counterfeit apps designed to confuse non-technical users.
- Matthew explains the jagged nature of AI intelligence stems primarily from two factors: the verifiability of domains like coding (short feedback loops) and revenue incentives that push labs to optimize for coding and math over creative tasks.
- Matthew demonstrates live that GPT-5.3 in instant (non-thinking) mode fails the car wash logic puzzle but can be corrected by prefacing the prompt with 'you are an expert in logical thinking,' a workaround he argues should never be necessary.
- Matthew argues that Anthropic's practice of segmenting tokens by product (e.g., Claude Design having separate quotas from other models) is arbitrary and frustrating since they are technically the same tokens the user paid for.
- Matthew contends that Theo's criticism of Anthropic appears genuine because Theo has stated it costs him sponsors and money, and Matthew finds Theo to be a consistently genuine person in his interactions.
- Matthew characterizes Anthropic CEO Dario Amodei as 'completely AGI-pilled' rather than deliberately dismissive of engineers, arguing that the company's singular focus on AGI drives its seemingly user-hostile decisions.
- Matthew observes that unlike other AI labs which have experienced significant founder and employee departures, all of Anthropic's original seven founders are reportedly still at the company, which contributes to its cult-like perception.
- Matthew argues that X view counts are heavily gamed and inflated — a view counts even if the post merely appears on screen without being clicked — making X metrics far less meaningful for sponsor conversion compared to YouTube.
- Matthew describes the information flow hierarchy for AI ideas as: X first, then YouTube, then Instagram, then TikTok, then Facebook, then LinkedIn weeks later, positioning X as where influential people soundboard ideas before they reach broader audiences.
- Matthew argues that employees at big tech companies like Meta are already subject to significant computer monitoring for security and productivity reasons, so keystroke logging for AI training is not as novel as it might seem.
- Matthew speculates that if Meta's employee monitoring for AI training succeeds without significant pushback, other AI labs and eventually non-tech companies may be approached by third-party data aggregators to monetize employee activity data.
- Matthew notes that Anthropic is the least open-source-friendly AI company he has seen and is actively lobbying against open-source AI, contrasting it with OpenAI, which at least open-sourced its Codex evaluation harness.