TechnicalNews

How to Run Hermes FREE Forever!

Julian Goldie SEO

The video demonstrates how to run the Hermes AI agent for free using Gemma 4, a local open-source model from Google, with significant speed improvements through MLX optimization. The setup works on Apple Silicon Macs or via free APIs on Open Router, enabling autonomous agents to work offline and privately without subscription costs.

Summary

Julian presents a complete system for running Hermes, one of the most powerful AI agents, without ongoing costs. The core update involves Gemma 4, Google's free local model, which is now 90% faster when run through Olama on Apple Silicon using MLX technology. For those without Apple Silicon, Open Router provides free API access to the 31B Gemma model.

The key advantage of local models is privacy and offline capability—all work stays local without requiring internet connectivity or API subscriptions. Julian demonstrates practical applications including building a to-do list app, using the forward slash learn feature to add skills to Hermes, reading and analyzing emails, and monitoring AI news automatically.

Hermes now includes a sub-agents update that allows multiple Gemma 4 workers to handle parallel tasks simultaneously. The system uses agentic loops, where users set a goal and the agent autonomously completes it, checking its own work and retrying as needed, without expensive token consumption. Julian contrasts this with the old approach where users had to manually prompt the agent repeatedly.

Setup is straightforward: download the latest Olama, select Gemma 4, choose the new MLX version, and connect it to Hermes with a single command. Julian emphasizes that technical expertise isn't required and provides access to the AI Profit Boardroom community with 194 pages of user testimonials, full training courses, playbooks for token optimization, and weekly coaching calls.

Key Insights

  • The new MLX-optimized version of Gemma 4 runs 90% faster than previous versions, making local model execution fast enough for practical use with Hermes, whereas it was previously too slow to be usable.
  • Hermes agents can run autonomous loops where they set a goal, execute tasks, check their own work, and retry automatically without manual intervention or token cost concerns, fundamentally changing how agentic workflows operate compared to traditional prompting.
  • The forward slash learn feature allows Hermes to read tutorials and documentation, then add that knowledge as a permanent skill that it never forgets, enabling continuous skill accumulation from local training materials.
  • Open Router provides a free API alternative to local models for users without Apple Silicon, ensuring Gemma 4 deployment is accessible regardless of hardware limitations.
  • The AI Profit Boardroom community contains 194 pages of documented testimonials from non-technical users successfully implementing agent operating systems, demonstrating that no technical expertise is required to set up and use these systems.

Topics

Gemma 4 local AI modelHermes AI agent automationMLX optimization for Apple SiliconAgentic loops and autonomous workflowsLocal vs. API-based model deploymentZero-cost AI agent systemsSub-agents and parallel processingAI Profit Boardroom community

Transcript

[0:00] Today I'm going to show you how to run Hermes for free forever with a new update to Gemma 4 that actually makes it 90% faster. So this is a new update that just dropped for Gemma 4 with Olama on Apple Silicon using MLX. And so with MLX you can run models 95% faster using Gemma 4. Gemma 4 is a local free model from Google and you can now plug it into Hermes in like one single click and run [0:30] free models forever. So let me show you an example of this. We already plugged it into the agent OS over here and then if we go to the different profiles that we have. I usually…

Full transcript available for MurmurCast members

Sign Up to Access

More from Julian Goldie SEO

Get AI summaries like this delivered to your inbox daily

Get AI summaries delivered to your inbox

MurmurCast summarizes your YouTube channels, podcasts, and newsletters into one daily email digest.