
auto_compact

Automatically compress context when the conversation gets too long, preventing token overflow without interrupting the agent.

When context hits 90%, old messages are summarized and replaced — the agent keeps working without hitting token limits.

Quick Start

main.py

```python
from connectonion import Agent
from connectonion.useful_plugins import auto_compact

agent = Agent("researcher", plugins=[auto_compact], model="co/gemini-2.5-pro")

# Works even on sessions that would normally hit token limits
agent.input("Analyze all 50 files in src/ and write a report")
```

How it works

1. Monitor after every LLM call. After each response, the plugin checks context_percent. It requires at least 8 messages before activating.

2. Summarize old messages. It calls co/gemini-2.5-flash to generate a compact summary of earlier turns (max 800 words).

3. Replace with summary. It keeps the system prompt, the summary message, and the last 5 messages, and discards the rest. The session continues normally.

| Threshold | Min messages | Messages kept | Summary model |
|---|---|---|---|
| 90% | 8 | system + summary + last 5 | co/gemini-2.5-flash |
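The three steps above can be sketched in plain Python. This is an illustrative sketch of the policy, not the plugin's actual internals; the function signature and message shapes are assumptions:

```python
# Illustrative sketch of the auto_compact policy described above.
# The names here (check_and_compact's signature, the message dicts)
# are hypothetical, not ConnectOnion's real API.
THRESHOLD = 0.90   # compact when context usage reaches 90%
MIN_MESSAGES = 8   # require at least 8 messages before activating
KEEP_LAST = 5      # always keep the 5 most recent messages

def check_and_compact(messages, context_percent, summarize):
    """Replace older turns with a summary when context is nearly full."""
    if context_percent < THRESHOLD or len(messages) < MIN_MESSAGES:
        return messages  # below threshold or too few messages: do nothing

    system, rest = messages[0], messages[1:]
    old, recent = rest[:-KEEP_LAST], rest[-KEEP_LAST:]

    # Step 2: summarize the older turns (the real plugin caps this at 800 words)
    summary = {"role": "user",
               "content": "Summary of earlier turns: " + summarize(old)}

    # Step 3: system prompt + summary message + last 5 messages
    return [system, summary] + recent
```

The key property is that the result is bounded: no matter how long the session runs, each compaction collapses everything except the system prompt and the most recent turns into a single message.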

When to use

- Long file analysis: agents that read many large files fill context quickly
- Batch operations: processing dozens of items in a loop
- co ai sessions: already included by default in co ai
- Research agents: extended sessions browsing and summarizing content

Events used

| Event | Handler | Purpose |
|---|---|---|
| after_llm | check_and_compact | Check usage, compact if >= 90% |
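To show how a single event-to-handler mapping like the one above can drive a plugin, here is a minimal sketch of event dispatch. ConnectOnion's real plugin interface may look quite different; the class, the `events` dict, and the session shape are all assumptions for illustration:

```python
# Hypothetical sketch of an event-driven plugin in the style of auto_compact.
# Not ConnectOnion's actual plugin interface.
class AutoCompactSketch:
    # Maps event names to handler method names, mirroring the table above.
    events = {"after_llm": "check_and_compact"}

    def check_and_compact(self, session):
        # Fires only past the 90% threshold with at least 8 messages.
        if session["context_percent"] >= 0.90 and len(session["messages"]) >= 8:
            session["compacted"] = True  # summarization would happen here
        return session

def dispatch(plugin, event, session):
    """Call the plugin handler registered for this event, if any."""
    handler = getattr(plugin, plugin.events.get(event, ""), None)
    return handler(session) if handler else session
```

Because the handler runs after every LLM call, compaction needs no scheduling of its own: the check is cheap, and the expensive summarization step only triggers once the threshold is crossed.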
