DocsUseful Pluginsauto_compact

auto_compact

Automatically compress context when the conversation gets too long, preventing token overflow without interrupting the agent.

Download

When context hits 90%, old messages are summarized and replaced — the agent keeps working without hitting token limits.

Quick Start

main.py
python
from connectonion import Agent
from connectonion.useful_plugins import auto_compact

agent = Agent("researcher", plugins=[auto_compact], model="co/gemini-2.5-pro")

# Works even on sessions that would normally hit token limits
agent.input("Analyze all 50 files in src/ and write a report")

How it works

Monitor after every LLM call

After each response, checks context_percent. Requires at least 8 messages before activating.

Summarize old messages

Calls co/gemini-2.5-flash to generate a compact summary of earlier turns (max 800 words).

Replace with summary

Keeps: system prompt + summary message + last 5 messages. Discards the rest. Session continues normally.

Threshold	Min messages	Messages kept	Summary model
90%	8	system + summary + last 5	co/gemini-2.5-flash

When to use

Long file analysis

Agents that read many large files fill context quickly

Batch operations

Processing dozens of items in a loop

co ai sessions

Already included by default in co ai

Research agents

Extended sessions browsing and summarizing content

Events used

Event	Handler	Purpose
`after_llm`	check_and_compact	Check usage, compact if >= 90%

Useful Plugins

tool_approval