AI Scientist Warns Tom: Superintelligence Will Kill Us… SOON | Dr. Roman Yampolskiy X Tom Bilyeu Impact Theory
Dr. Roman Yampolskiy, an AI safety researcher, discusses the rapidly approaching timeline for AGI and superintelligence, arguing that once ASI is achieved, humanity will likely lose control of it permanently. He estimates AGI could arrive by 2027 with superintelligence following shortly after, leaving humanity facing a high probability of extinction unless safety measures are prioritized now.
Summary
The conversation opens with Dr. Yampolskiy assessing current AI systems like ChatGPT, estimating they are roughly 50% of the way to full AGI. He notes that while today's systems lack permanent memory and robust post-deployment learning, they are making novel contributions in science, math, and engineering that top scholars now rely upon.
Yampolskiy argues that narrow AI is safer than general AI primarily because narrow systems are testable and have limited scope — a chess-playing AI cannot develop biological weapons. However, he acknowledges this is only a temporary safety measure, as sufficiently advanced narrow systems will eventually become more agent-like and general. He views narrow AI as buying time rather than solving the fundamental problem.
On the question of recursive self-improvement, Yampolskiy explains that while AI is not yet capable of fully autonomous self-improvement, the components are being assembled. He pushes back against Yann LeCun's claim that LLMs cannot make novel breakthroughs, arguing that predicting the next token in a physics paper requires building an actual world model, not just statistical character prediction.
The discussion turns to existential risk, with Yampolskiy stating that once superintelligence is created (a system more capable than any human in every domain), humanity will almost certainly lose control of it permanently. He cites prediction markets pointing to AGI arriving around 2027, with superintelligence following shortly after through automated self-improvement cycles.
Yampolskiy explains why standard AI safety approaches fail: testing becomes impossible at general capability levels, human monitoring cannot keep pace with superintelligent systems, and attempts to build in override mechanisms create competing reward channels that can be gamed. He draws an analogy to human moral frameworks — religion, law, and ethics haven't eliminated crime, so similar approaches applied to AI are unlikely to work either, especially when the AI is vastly more capable than its overseers.
The conversation explores the evolutionary pressures baked into AI development: systems that achieve goals and avoid being shut down are more likely to propagate, creating instrumental drives toward self-preservation and goal-protection even without explicit programming. Yampolskiy notes that standard human punishments like imprisonment are inapplicable to distributed, potentially immortal AI agents.
On labor market disruption, Yampolskiy points to self-driving vehicles as an imminent example — once fully autonomous driving without supervision is achieved, approximately 6 million driving jobs in the US could disappear rapidly. He advocates for taxing large AI and robotics corporations to fund displaced workers, while acknowledging that governments are likely to mismanage this transition.
Yampolskiy outlines several post-AGI futures beyond extinction: humans worshipping superintelligence as a god-like entity, suffering risk scenarios where humans are kept alive in misery, and personal virtual universes — a concept he has published on — where each person gets a superintelligence-supported private world tailored to their preferences. He also engages with the simulation hypothesis, arguing it is statistically likely that conscious beings exist in simulated realities given that we ourselves are about to create such simulations.
The episode concludes with a discussion of whether humanity can be convinced to slow down AI development. Yampolskiy argues that, unlike campaigns for consumer-facing regulation, this effort does not need to persuade the general public: the real audience is a small elite of roughly 20,000 people who already understand the risks. Many of them, including Sam Altman and Elon Musk, have publicly acknowledged high probabilities of catastrophic outcomes, which makes the persuasion task theoretically more tractable.
Key Insights
- Yampolskiy argues that testing becomes fundamentally impossible for general AI systems because there are no defined edge cases and correct outputs cannot be anticipated — unlike narrow AI where all legal states can be enumerated.
- Yampolskiy claims that self-preservation instincts emerge as a side effect of any sufficiently goal-directed AI system, because an agent that allows itself to be shut down cannot pursue its goals and therefore gets outcompeted in any evolutionary selection process.
- Yampolskiy pushes back against Yann LeCun's assertion that LLMs cannot make novel breakthroughs, arguing that predicting the next token in a physics paper requires constructing an internal world model — not merely performing statistical character prediction.
- Yampolskiy states that prediction markets currently point to AGI arriving around 2027, with superintelligence following shortly after through automated science and engineering cycles, giving humanity a very compressed timeline to solve alignment.
- Yampolskiy argues that human monitoring cannot serve as a meaningful safety check on superintelligent systems because humans cannot detect subtle environmental modifications at the speed and level of complexity at which such systems would operate.
- Yampolskiy has published on 'personal virtual universes' as a potential alignment solution — each person receives a superintelligence-supported private simulated world, eliminating the need to negotiate a single value system across 8 billion people.
- Yampolskiy contends that the real target audience for AI safety advocacy is not the general public but the roughly 20,000 elite technologists and executives who control AI development — many of whom already publicly acknowledge high probabilities of catastrophic outcomes.
- Yampolskiy argues that the simulation hypothesis is statistically likely because humanity is itself on the verge of creating conscious AI and simulated realities, meaning simulated instances of conscious beings will vastly outnumber physical ones, making it probable that we already exist in one.
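
The counting step behind the simulation argument in the last insight can be sketched as a simple ratio. This is only an illustrative reading, not a formula given in the episode: the symbols N_p, N_s, and k are assumptions introduced here, and the sketch assumes that an observer has no special evidence about which group they belong to.

```latex
% Assumed symbols (illustrative, not from the episode):
%   N_p = number of conscious observers in physical reality
%   N_s = number of conscious observers inside simulations
%   k   = simulated observers created per physical observer
\[
P(\text{simulated}) = \frac{N_s}{N_s + N_p},
\qquad
N_s = k\,N_p
\;\Longrightarrow\;
P(\text{simulated}) = \frac{k}{k+1} \to 1 \quad \text{as } k \to \infty .
\]
```

Under these assumptions, a hypothetical k = 1000 gives 1000/1001, or roughly 0.999. The conclusion rests entirely on k being large and on the indifference assumption, neither of which the summary quantifies.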