Context Management

Long agent runs accumulate messages: tool results, model replies, and subagent exchanges. Left unmanaged, that history can exceed the model's context window and cause errors or truncation. Dawn manages the context window two ways — tool-output offloading (on by default) and conversation summarization (opt-in) — so agents keep working even when a thread grows long.

Tool-output offloading

When a tool returns a large result, Dawn writes the full output to workspace/tool-outputs/ and replaces the in-context payload with a short stub. The stub contains a configurable number of preview lines plus a file handle. When the agent needs the full content, it calls readFile with that handle; the content is read from disk and fed into the next model turn.

Offloading is active as soon as a workspace/ directory exists at the app root. No additional configuration is required to enable it.

What the model sees

The in-context stub looks like this:

[Tool output offloaded — 12,345 chars exceeded the 1,500-char limit.
Full output saved to: tool-outputs/abc123.txt
Preview (first 10 lines):
line 1 …
line 2 …

Read the full output with the readFile tool at the path above.]

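The model can proceed with the preview or call readFile to fetch more. Because the full content lives on disk, it survives across model turns without consuming context tokens. The limit quoted in the stub reflects the configured offloadThresholdChars: 1,500 in this example, 40,000 by default.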
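
As a rough illustration of how such a stub could be assembled, here is a minimal sketch. The helper name buildStub, its signature, and the exact wording of the first stub line are assumptions made for this example; they are not taken from Dawn's source.

```typescript
// Hypothetical helper: name, signature, and stub wording are assumptions
// for illustration, not Dawn's actual implementation.
interface StubOptions {
  thresholdChars: number; // plays the role of offloadThresholdChars
  previewLines: number;   // plays the role of previewLines
}

function buildStub(output: string, relPath: string, opts: StubOptions): string {
  // Results at or below the threshold stay in context unchanged.
  if (output.length <= opts.thresholdChars) return output;

  // Keep only the leading lines as a preview.
  const preview = output.split("\n").slice(0, opts.previewLines).join("\n");
  const size = output.length.toLocaleString("en-US");
  const limit = opts.thresholdChars.toLocaleString("en-US");

  return [
    `[Tool output offloaded: ${size} chars exceeded the ${limit}-char limit.`,
    `Full output saved to: ${relPath}`,
    `Preview (first ${opts.previewLines} lines):`,
    preview,
    "",
    "Read the full output with the readFile tool at the path above.]",
  ].join("\n");
}
```

The sketch leaves out writing the file to disk; in a real pipeline the full output would be persisted before the stub replaces it in context.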

Configuration

dawn.config.ts
export default {
  toolOutput: {
    offloadThresholdChars: 40000, // default
    previewLines: 10,             // default
    maxBytes: 268435456,          // default (256 MB)
    ttlMs: 10800000,              // default (3 h)
    gcThrottleMs: 10000,          // default (10 s)
    noOffloadTools: [],           // merged with built-in exempt set
  },
}

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `offloadThresholdChars` | `number` | `40000` | Serialized character length above which a result is offloaded. |
| `previewLines` | `number` | `10` | Number of leading lines kept in the in-context stub. |
| `maxBytes` | `number` | `268435456` | Maximum total bytes stored under workspace/tool-outputs/. Oldest files are evicted first when the budget is exceeded. |
| `ttlMs` | `number` | `10800000` | Offloaded files older than this many milliseconds are deleted. Default is 3 hours. |
| `gcThrottleMs` | `number` | `10000` | Minimum milliseconds between GC scans. Default is 10 seconds. |
| `noOffloadTools` | `string[]` | `[]` | Additional tool names whose output is never offloaded. Merged with the built-in exempt set (readFile, listDir). |

Exemptions

readFile and listDir are always exempt from offloading. Retrieval tools must be exempt: otherwise the content an agent reads back would itself be offloaded and replaced with another stub. Use noOffloadTools to add more tool names to the exempt set, for example a tool that already returns a compact summary.
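
The merge-and-check logic described above might look roughly like the following. The helper names exemptSet and shouldOffload are hypothetical and exist only for this sketch; only the two built-in tool names come from the text.

```typescript
// Built-in exempt tools named in the text above.
const BUILT_IN_EXEMPT = ["readFile", "listDir"] as const;

// Hypothetical helper: merges user-supplied names with the built-in set.
function exemptSet(noOffloadTools: string[] = []): Set<string> {
  return new Set<string>([...BUILT_IN_EXEMPT, ...noOffloadTools]);
}

// Hypothetical predicate: a result is offloaded only if its tool is not
// exempt and its serialized length is above the threshold.
function shouldOffload(
  toolName: string,
  serialized: string,
  thresholdChars: number,
  exempt: Set<string>,
): boolean {
  return !exempt.has(toolName) && serialized.length > thresholdChars;
}
```

Because the user list is merged rather than substituted, passing an empty array still leaves readFile and listDir exempt.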

Garbage collection

Dawn runs a GC pass (at most once per gcThrottleMs) that removes files older than ttlMs and, if the total size still exceeds maxBytes, deletes oldest files first until the budget is satisfied. The GC runs in the background and does not block agent turns.
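
The two-phase eviction and the throttle can be sketched as pure functions over file metadata. All names here (StoredFile, selectForDeletion, makeThrottle) are assumptions for illustration, and using modification time as the age of a file is also an assumption.

```typescript
// Hypothetical metadata record for one offloaded file.
interface StoredFile {
  path: string;
  mtimeMs: number; // assumed: age is measured from last modification
  bytes: number;
}

interface GcOptions {
  ttlMs: number;
  maxBytes: number;
}

// Returns the paths one GC pass would delete, in deletion order.
function selectForDeletion(files: StoredFile[], now: number, opts: GcOptions): string[] {
  const doomed: string[] = [];

  // Phase 1: drop everything older than the TTL.
  const live = files.filter((f) => {
    const expired = now - f.mtimeMs > opts.ttlMs;
    if (expired) doomed.push(f.path);
    return !expired;
  });

  // Phase 2: if still over budget, evict oldest first until it fits.
  live.sort((a, b) => a.mtimeMs - b.mtimeMs);
  let total = live.reduce((sum, f) => sum + f.bytes, 0);
  for (const f of live) {
    if (total <= opts.maxBytes) break;
    doomed.push(f.path);
    total -= f.bytes;
  }
  return doomed;
}

// Returns a gate that allows at most one pass per gcThrottleMs.
function makeThrottle(gcThrottleMs: number): (now: number) => boolean {
  let lastRun = -Infinity;
  return (now) => {
    if (now - lastRun < gcThrottleMs) return false;
    lastRun = now;
    return true;
  };
}
```

Keeping selection separate from the actual file deletion makes the policy easy to test without touching the disk.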

Conversation summarization

Summarization compresses older message history once a thread's token count exceeds a threshold. The most recent turns stay verbatim; everything older is folded into a rolling summary that is prepended to the conversation on the next model call.
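
A minimal sketch of that split, assuming a simplified message shape: the Msg type, the function names, and the choice to carry the summary as a system message are all assumptions for this example. A turn is taken to start at each human message.

```typescript
// Simplified, hypothetical message shape.
type Msg = { role: "system" | "human" | "ai" | "tool"; content: string };

// Splits history so the last `keepRecentTurns` turns stay verbatim.
function splitHistory(messages: Msg[], keepRecentTurns: number): { older: Msg[]; recent: Msg[] } {
  if (keepRecentTurns <= 0) return { older: messages, recent: [] };
  // Indices where a turn begins (each human message opens a turn).
  const turnStarts = messages.flatMap((m, i) => (m.role === "human" ? [i] : []));
  if (turnStarts.length <= keepRecentTurns) return { older: [], recent: messages };
  const cut = turnStarts[turnStarts.length - keepRecentTurns];
  return { older: messages.slice(0, cut), recent: messages.slice(cut) };
}

interface CompactOptions {
  maxTokens: number;
  keepRecentTurns: number;
  countTokens: (text: string) => number;
  summarize: (older: Msg[]) => string;
}

// Folds older history into one summary message once the thread is too long.
function compact(messages: Msg[], opts: CompactOptions): Msg[] {
  const total = messages.reduce((n, m) => n + opts.countTokens(m.content), 0);
  if (total <= opts.maxTokens) return messages;
  const { older, recent } = splitHistory(messages, opts.keepRecentTurns);
  if (older.length === 0) return messages; // nothing old enough to fold
  const summary: Msg = { role: "system", content: opts.summarize(older) };
  return [summary, ...recent];
}
```

The sketch is synchronous for brevity; a real summarizer would make an asynchronous model call and would also fold in the previous summary.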

Summarization is opt-in. Enable it in dawn.config.ts:

dawn.config.ts
export default {
  summarization: {
    enabled: true,
  },
}

Configuration

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `boolean` | `false` | Enable conversation summarization. Off by default. |
| `maxTokens` | `number` | `12000` | Token count above which older history is summarized. |
| `keepRecentTurns` | `number` | `6` | Most-recent turns (each starting at a HumanMessage) kept verbatim, never summarized. |
| `model` | `string` | Route's model | Model used for the summary LLM call. Defaults to the same model the route uses. |
| `tokenCounter` | `(text: string) => number \| Promise<number>` | Lazy gpt-tokenizer (o200k_base) | Custom token-counting function. |
| `summarize` | `(args) => Promise<string>` | Built-in single-LLM-call summarizer | Custom summary generator. Receives messages, model, previousSummary, and signal. |

The tokenCounter hook lets you plug in a different tokenizer. The summarize hook replaces the entire summary generation step — useful if you want to use a cheaper model, apply domain-specific compression, or route through a different provider.
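
The following sketch shows what wiring in both hooks might look like. The SummarizeArgs shape is inferred from the four argument names listed in the table and is an assumption, as is the message shape; the character-based token estimate is a crude stand-in for a real tokenizer, and the extractive summarizer makes no LLM call.

```typescript
// Assumed argument shape, inferred from the names in the table above.
interface SummarizeArgs {
  messages: { role: string; content: string }[];
  model: string;
  previousSummary?: string;
  signal?: AbortSignal;
}

// Crude estimate: roughly four characters per token. Not a real tokenizer.
const tokenCounter = (text: string): number => Math.ceil(text.length / 4);

// Extractive stand-in: keeps the first 80 characters of each message and
// appends them to the previous summary instead of calling a model.
const summarize = async ({ messages, previousSummary, signal }: SummarizeArgs): Promise<string> => {
  if (signal?.aborted) throw new Error("summarization aborted");
  const lines = messages.map((m) => `${m.role}: ${m.content.slice(0, 80)}`);
  return [previousSummary, ...lines].filter(Boolean).join("\n");
};

// In dawn.config.ts this object would be the default export.
const config = {
  summarization: {
    enabled: true,
    tokenCounter,
    summarize,
  },
};
```

Passing previousSummary through keeps the summary rolling: each call extends the earlier digest rather than starting over.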

How they compose

Offloading and summarization address different axes of context growth:

  • Tool-output offloading acts per tool result — it prevents a single large result from blowing up the context.
  • Conversation summarization acts per conversation — it compresses history that has accumulated over many turns.

Both can be active at the same time. A typical configuration for a long-running research agent:

dawn.config.ts
export default {
  toolOutput: {
    offloadThresholdChars: 1500, // tight threshold for a research agent
    previewLines: 10,
  },
  summarization: {
    enabled: true,
    maxTokens: 12000,
    keepRecentTurns: 6,
  },
}

Related