I Fixed Claude's Token Limits. Here's How.

ICOR with Tom | AI Productivity · March 31, 2026 · 5m 4s

The speaker hit Claude's usage limits despite having a $200/month plan and shows how to optimize AI agent setups by using different models for different agents. They demonstrate switching the main orchestrator to Sonnet while keeping specialized agents on Opus, and adjusting effort levels to reduce token consumption.

Summary

The speaker begins by explaining their frustration with constantly hitting Claude's usage limits, even with a premium $200/month plan that includes weekly limits. They note this is a recent change, as they previously ran everything on the Opus model without any issues. The speaker shows their usage dashboard, revealing they're close to session limits and have already spent an additional €70 on top of their plan. They attribute this to being 'very generous' with their AI agents and model usage.

The main solution involves optimizing their multi-agent setup within Claude. The speaker demonstrates how to access and modify individual agents through the /agents command, showing that each agent can be configured to use different models rather than inheriting from the parent session. Their strategy involves using Sonnet for the main orchestrator agent 'Larry,' whose job is simply to understand which team member is best for a task and delegate accordingly. For specialized work like coding, they keep specific agents like 'Felix' on the more powerful Opus model.

Additionally, they show the new effort functionality that can be adjusted with arrow keys, explaining they were previously running Opus with high effort and a 1 million context window, which contributed to excessive token usage. By switching to Sonnet with medium effort for coordination and reserving Opus for specialized tasks, they can significantly reduce costs while maintaining functionality.
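The per-agent assignment described above can be sketched as a small table in Python. This is an illustration only, not the speaker's actual files: the `AgentConfig` class and both helper functions are hypothetical names, the role descriptions are paraphrased from the episode notes, and the front-matter layout is an assumption about how Claude Code stores agent definitions, so compare it against your own agent files before relying on it. Effort is recorded only for Larry because the video states it for no other agent.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical record of one agent's model assignment."""
    name: str
    role: str
    model: str                    # "opus", "sonnet", "haiku", or "inherit"
    effort: Optional[str] = None  # None means the video does not state it


# Assignments as described in the episode; roles are paraphrased.
TEAM = [
    AgentConfig("Larry", "orchestrator that delegates tasks", "sonnet", "medium"),
    AgentConfig("Felix", "coding specialist", "opus"),
    AgentConfig("Quinn", "lightweight tasks", "haiku"),
]


def agents_by_model(team: List[AgentConfig]) -> Dict[str, List[str]]:
    """Group agent names by the model tier they run on."""
    grouped: Dict[str, List[str]] = {}
    for agent in team:
        grouped.setdefault(agent.model, []).append(agent.name)
    return grouped


def render_front_matter(agent: AgentConfig) -> str:
    """Render an assumed YAML front-matter header for an agent file."""
    return "\n".join([
        "---",
        f"name: {agent.name.lower()}",
        f"description: {agent.role}",
        f"model: {agent.model}",
        "---",
    ])


if __name__ == "__main__":
    print(agents_by_model(TEAM))
    print(render_front_matter(TEAM[0]))
```

Keeping the assignments in one place like this makes it easy to see at a glance how many agents still sit on the most expensive tier.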

About this episode

Stop burning through your Claude Max plan in 3 days. 🚀✨ Build your Future-Proof AI Productivity System TODAY: https://myicor.com

If you're running a multi-agent AI team on Claude Code, you've probably hit the same wall I did. This video shows the exact fix. I was running all 20+ agents on Opus 4.6 with high effort and a 1 million token context window. Session limits, weekly limits, extra charges on top of the $200/month Max plan. The solution: assign the right model tier per agent. My orchestrator now runs on Sonnet at medium effort. Specialists like Felix still get Opus for the heavy lifting. I also walk through the /agents UI and the new effort control (arrow keys) so you can do this yourself in 5 minutes.

TIMESTAMPS
0:00 Hitting the wall: session and weekly limits
0:27 What changed: more tokens, more obvious problem
0:52 My actual plan usage (70 euros overage)
1:20 The fix: optimizing model assignments
1:38 Inside the Larry folder and Claude Code agents
1:58 The /agents UI: viewing and editing agent models
2:45 Switching Quinn to Haiku for lightweight tasks
3:29 Switching Larry to Sonnet 4.6 with effort control
4:22 Why I was burning tokens: Opus + high effort + 1M context
4:39 The new setup: Sonnet orchestrator, Opus specialists
5:17 How to apply this to your own AI team

RESOURCES
Claude Code: https://claude.ai/code
Anthropic Max Plans: https://www.anthropic.com/pricing

SUBSCRIBE for more AI productivity walkthroughs every week.

ABOUT
Tom helps professionals build AI productivity systems that work, using the tool-agnostic ICOR Methodology, courses, personal coaching, and a growing community on myICOR.
LinkedIn: https://www.linkedin.com/in/tomsolid/
X: https://x.com/TomSolidPM
Podcast: https://www.youtube.com/@productivitylikeapro

Disclaimer: Some links may be affiliate links. If you click and purchase, I may earn a small commission at no extra cost to you.

#ClaudeAI #claudecode #AIProductivity #claudecowork #myICOR #ICOR #ProductivitySystem #BusyProfessionals

Key Insights

The speaker was previously running Opus 4.6 with a 1 million context window in high effort mode constantly, which explains why they started running out of tokens
Claude automatically creates agents at the project level when you set up AI teams using simple folder structures with instructions
The main orchestrator agent Larry only needs to understand who is the best team member for a job and delegate to them, making Sonnet sufficient for this coordination role
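The last insight, that the orchestrator only has to pick the right team member, can be illustrated with a toy router. This is a sketch under stated assumptions: in the real setup the orchestrator model reads the task and decides, whereas the `ROUTES` keyword table and the `delegate` function here are hypothetical stand-ins, and the keyword hints are invented for illustration. The point is only that delegation is a light classification step, which is why a smaller model can plausibly handle it.

```python
from typing import Dict, Tuple

# Hypothetical keyword hints per specialist (not taken from the video).
ROUTES: Dict[str, Tuple[str, ...]] = {
    "Felix": ("code", "bug", "refactor", "script"),  # coding specialist
    "Quinn": ("rename", "format", "summarize"),      # lightweight tasks
}

# Tasks that match no specialist stay with the orchestrator.
DEFAULT_AGENT = "Larry"


def delegate(task: str) -> str:
    """Return the agent whose hints first match the task text."""
    text = task.lower()
    for agent, hints in ROUTES.items():
        # Plain substring matching: crude, but enough for a sketch.
        if any(hint in text for hint in hints):
            return agent
    return DEFAULT_AGENT


if __name__ == "__main__":
    for task in (
        "Fix the bug in the export script",
        "Summarize this note",
        "Plan next week",
    ):
        print(f"{task!r} -> {delegate(task)}")
```

The heavy work happens after the hand-off, inside the specialist, so that is where the more capable model tier earns its cost.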

Topics

Claude usage limits optimization
Multi-agent AI team configuration
Token cost management strategies

Transcript

[0:00] Like many of you, I also started to constantly hit my limits, and I'm now even running out of my weekly limits on my Max 20x plan at $200 per month. Well, yes, I'm using my AI team for many things: for my PKA, personal knowledge assistance, but also for running the business team here, doing a lot of things like coding and so on. So, I'm really leveraging this nonstop. Nonetheless, there's a total shift, because I never hit any limits in the [0:32] past, and I ran everything on the Opus model all the time with no issues whatsoever. I never hit these limits. So, there's certainly something that changed in the background. So, here is actually…

