How to Save Tokens and Manage AI Agent Memory in 2026 | The Convolife Team
The Convo Life team presents a tool for tracking AI agent token usage and context window status within Telegram bots. The system provides usage thresholds and recommendations for when to compact or reset sessions to avoid exhausting the context window. It can also be visually customized per bot.
Summary
The presentation introduces Convo Life, a system designed to help users monitor and manage the token consumption of AI agents, specifically by tracking how much of the context window has been used and how much remains in the current session.
The tool operates on a threshold-based alert system. At 0–40% context-window usage, the situation is considered fine. From 40–60%, it is acceptable but warrants attention. At 60–75%, the system recommends running a 'compact' command to compress the session context. At 75–90% and above, launching a new chat session ('newchat') is strongly advised to prevent rapid exhaustion of the available token budget.
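The thresholds above can be sketched as a simple lookup function. This is an illustrative reconstruction of the described logic, not the tool's actual code; the function and label names are assumptions.

```python
# Sketch of the threshold-based alert logic: map context-window usage
# (as a percentage) to the recommended action. Boundaries follow the
# 0-40 / 40-60 / 60-75 / 75+ bands described in the summary.

def recommend_action(usage_percent: float) -> str:
    """Return the suggested action for a given context-window load."""
    if usage_percent < 40:
        return "ok"        # 0-40%: everything is fine
    if usage_percent < 60:
        return "watch"     # 40-60%: acceptable, but keep an eye on it
    if usage_percent < 75:
        return "compact"   # 60-75%: compress the session context
    return "newchat"       # 75%+: start a fresh session

print(recommend_action(70))  # → compact
```

A 70% load, as in the demo later in the summary, falls in the 60–75% band and triggers the compaction recommendation.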
The system is implemented as a command sent to a Telegram bot. The presenter shares an example where running the command showed the context window 70% loaded, prompting an immediate recommendation to clear the session and run a checkpoint due to context overload. Implementation requires sending a specific prompt with usage examples through a skill or CLI interface, after which the bot understands the command.
The tool is also customizable in terms of presentation. Some bots display the context load as a plain percentage (e.g., '70%', '5%', '25%'), while others show a visual representation of how loaded the context window is. One example bot named 'Demniy' is shown responding with a detailed message: 'The context window is loaded at 0.2%, you still have 998,000 tokens left. Everything is fine.'
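The two display styles can be sketched as follows. The 1,000,000-token window is inferred from the Demniy example (0.2% used with roughly 998,000 tokens left) and is an assumption, not a confirmed setting; function names are illustrative.

```python
# Sketch of the two presentation styles: a plain percentage versus a
# visual bar indicator of context-window load.

WINDOW = 1_000_000  # assumed total window, inferred from the example

def plain(tokens_used: int) -> str:
    """Plain-percentage style, e.g. '0.2%'."""
    return f"{100 * tokens_used / WINDOW:.1f}%"

def visual(tokens_used: int, width: int = 10) -> str:
    """Visual style: a filled bar plus the remaining token count."""
    filled = round(width * tokens_used / WINDOW)
    bar = "#" * filled + "-" * (width - filled)
    remaining = WINDOW - tokens_used
    return (f"[{bar}] context window {plain(tokens_used)} loaded, "
            f"{remaining:,} tokens left")

print(visual(2_000))
# → [----------] context window 0.2% loaded, 998,000 tokens left
```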
Key Insights
- The Convo Life team argues that context window usage above 75% warrants launching a new chat session ('newchat') to prevent rapid token depletion in the AI agent.
- The presenter demonstrates that a 70% context load triggers an automatic recommendation from the bot to clear the session and run a checkpoint, treating it as an overloaded state.
- The team explains that the tracking command is implemented by sending a specific prompt with usage examples through a skill or CLI interface, after which the AI agent interprets it autonomously.
- The presenter notes that the context window display is customizable per bot — some show a plain percentage while others render a visual indicator of context load.
- An example bot named 'Demniy' is shown responding with granular detail, reporting '0.2% loaded with 998,000 tokens remaining,' illustrating the system's capacity for precise token accounting.
Full transcript available for MurmurCast members