auto_compact
Automatically compress context when the conversation gets too long, preventing token overflow without interrupting the agent.
When context hits 90%, old messages are summarized and replaced — the agent keeps working without hitting token limits.
Quick Start
How it works
After each response, checks context_percent. Requires at least 8 messages before activating.
Calls co/gemini-2.5-flash to generate a compact summary of earlier turns (max 800 words).
Keeps: system prompt + summary message + last 5 messages. Discards the rest. Session continues normally.
| Threshold | Min messages | Messages kept | Summary model |
|---|---|---|---|
| 90% | 8 | system + summary + last 5 | co/gemini-2.5-flash |
When to use
Agents that read many large files fill context quickly
Processing dozens of items in a loop
Already included by default in co ai
Extended sessions browsing and summarizing content
Events used
| Event | Handler | Purpose |
|---|---|---|
after_llm | check_and_compact | Check usage, compact if >= 90% |
