Claude Code is now FREE: Here’s how…
According to the video, Google's new Gemma 4 model runs 90% faster on Apple Silicon under Ollama, which makes it practical to run Claude Code locally with no token costs. Setup takes three steps: download Ollama, download Gemma 4, and connect the model to Claude Code. Users without a Mac can use the OpenRouter API instead.
Summary
The video shows how to run Claude Code at no cost using Google's newly updated Gemma 4 model with Ollama, which the presenter says is 90% faster on Apple Silicon. Setup takes three steps, all free: download Ollama, download Gemma 4, and connect the model to Claude Code. The result is a free agentic coding system that can run autonomously around the clock. For users without Apple Silicon, the same setup is available through OpenRouter's free API endpoint, routed through the open-source free Claude Code project.
The presenter builds a to-do list app and a Space Invaders game to demonstrate it. Gemma 4 is not suited to highly complex tasks, but it handles basic projects such as blog posts, landing pages, and scheduled background tasks well. The main advantage over earlier local-model setups is speed, which makes local models a viable no-cost alternative to cloud APIs that consume tokens.
The presenter also argues for open-source, locally runnable models as a way to avoid depending on closed proprietary systems that can be changed or withdrawn at any time, citing earlier model disruptions as an example. The video mentions integrating the setup into an Agent Operating System with extra features such as memory systems, and references Goldiebench, a local leaderboard that tests models across 42 tasks for comparison.
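The three steps above can be sketched in Python. The excerpt does not show the exact commands, so this is only a sketch under stated assumptions: Ollama is listening on its default address `http://localhost:11434`, the installed build can answer Anthropic-style requests at that address, and the model tag is `gemma4` (a hypothetical tag; use whatever name your Ollama install lists). `ANTHROPIC_BASE_URL`, `ANTHROPIC_AUTH_TOKEN`, and `ANTHROPIC_MODEL` are environment variables that Claude Code reads; the helper names are invented for illustration.

```python
import json
import os
import subprocess
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address
MODEL_TAG = "gemma4"  # hypothetical tag; check your Ollama install for the real name


def installed_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Ask the local Ollama server which models it has (GET /api/tags)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        payload = json.load(resp)
    return [m["name"] for m in payload.get("models", [])]


def has_model(names: list[str], tag: str) -> bool:
    """True if `tag` matches a listed name, with or without a ':variant' suffix."""
    return any(n == tag or n.split(":")[0] == tag for n in names)


def claude_code_env(base_url: str = OLLAMA_URL, model: str = MODEL_TAG) -> dict[str, str]:
    """Copy the current environment and redirect Claude Code to another backend."""
    env = dict(os.environ)
    env.update({
        "ANTHROPIC_BASE_URL": base_url,    # where Claude Code sends requests
        "ANTHROPIC_AUTH_TOKEN": "ollama",  # placeholder; assumed to be ignored locally
        "ANTHROPIC_MODEL": model,          # model name passed to the backend
    })
    return env


def launch_claude_code() -> int:
    """Start the `claude` CLI against the local model (assumes it is on PATH)."""
    if not has_model(installed_models(), MODEL_TAG):
        raise RuntimeError(f"{MODEL_TAG} is not installed in Ollama yet")
    return subprocess.run(["claude"], env=claude_code_env()).returncode
```

Calling `launch_claude_code()` would start the CLI against the local model once the model has been downloaded. The CLI itself is unchanged; only the environment it inherits differs.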
Key Insights
- Gemma 4 from Google is now 90% faster on Apple Silicon with Ollama using MLX, making free local models viable as a speed-competitive alternative to traditional cloud APIs
- Claude Code is the same CLI tool regardless of which model backs it; users point it at a different model such as Gemma 4, so they get the full product with a different inference backend rather than a watered-down version
- Gemma 4 is suitable for basic tasks like writing blog posts or creating landing pages, but frontier models should be used for complex projects, creating a tiered approach to model selection based on task complexity
- Local models running on schedules or in the background don't require speed optimization since results can be reviewed hours later, making Hermes and other slower agents viable for asynchronous workflows
- Using open-source local models provides system ownership and protection against disruption, unlike closed proprietary models that can be taken down or removed, breaking dependent workflows
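The tiered approach in these insights (a local model for basic or background work, a frontier model for complex projects) could be expressed as a small routing rule. This is a minimal sketch, not something shown in the video: the task labels, the `Backend` type, and both backend names are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    name: str    # hypothetical label, not a real model identifier
    local: bool  # True when inference runs on the user's own machine


LOCAL = Backend("local-gemma", local=True)
FRONTIER = Backend("frontier-cloud", local=False)

# Task kinds the summary describes as a good fit for the local model.
BASIC_TASKS = {"blog_post", "landing_page", "scheduled_job"}


def pick_backend(task_kind: str) -> Backend:
    """Send basic tasks to the free local model and everything else to a frontier model."""
    return LOCAL if task_kind in BASIC_TASKS else FRONTIER


for kind in ("blog_post", "scheduled_job", "large_refactor"):
    print(kind, "->", pick_backend(kind).name)
```

Defaulting unknown task kinds to the frontier model is a deliberate choice here: it costs tokens, but it avoids handing a complex job to a model the presenter says is not suited to it.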
Transcript
[0:00] Today I'm going to show you how to use Claude Code free forever with a brand new update that just came out from Google that allows you to use free Claude Code with a free local model, and then also it's 90% faster, which is usually the main downfall of local models. So I'm going to run you through exactly how we're running this system right here, what you can build with it, how it works, etc. You can see a bunch of things that we built with it over here. So we've already plugged this into our agent operating system, and this works in three simple steps. So first of [0:32] all the update and why this is…
More from Julian Goldie SEO
This NEW Chinese AI is INSANE! (FREE + Open Source!)
Long Cap 2.0 is a new open-source Chinese AI model from a food delivery app company. It offers 1 million tokens of free context memory, beats GPT-4.5 on the SWE-bench Pro benchmark, and uses efficient parameter activation to reduce computational overhead while maintaining high performance.
X AI MCP Server Just Changed AI Agents
X has launched a hosted MCP (Model Context Protocol) server that gives AI agents direct access to real-time data from X's platform through a standardized connection, eliminating the need for custom API integration work. The setup involves OAuth authentication, the XRL token manager, and access to 200+ X API tools for research, content creation, and trend tracking.
New NotebookLM Update is INSANE!
Google's NotebookLM now features short video overviews that convert documents into engaging 60-second vertical videos using the new Nano Banana 2 Light image model. The feature represents rapid iteration in AI tools and offers practical applications for students, creators, and businesses seeking to transform static documents into shareable video content.
How to Rank #1 with Claude Fable 5 AI SEO!
The speaker demonstrates how to use Claude Fable 5 AI for SEO automation to rank websites, showing real examples of sites growing from zero to hundreds of daily clicks. The strategy emphasizes using Fable 5 for planning and building automation systems, while deploying content creation with cheaper alternative models due to Fable 5's token limitations.
NEW Hermes + Paperclip AI Agent Update Is INSANE
A new Hermes and Paperclip AI agent update enables users to manage a full team of AI agents with organizational structure, built-in Hermes integration, automatic task monitoring, and multi-platform connectivity across Claude, Gemini, Cursor, and other AI systems.