Summarization And Context

Summarization in Agent SDK is built for autonomous agents that may run long enough to outgrow the model-facing context window.

Why this exists

Autonomous agents tend to accumulate:

large tool outputs
repeated tool call traces
intermediate assistant turns
memory and planning context

Without compaction, that state becomes expensive to send, noisy for the model to read, and eventually too large for the context window.

What summarization actually does

The smart runtime compacts conversation and tool history once the conversation's token count exceeds the configured trigger threshold.

The important part is what it does not do:

it does not blindly erase the past
it does not require the application to manually rewrite messages
it does not remove recovery options for archived tool outputs

Key knobs

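const agent = createSmartAgent({
  model,
  tools,
  summarization: {
    enable: true,
    maxTokens: 24000, // output ceiling; also the fallback trigger if summaryTriggerTokens is omitted
    summaryTriggerTokens: 17000, // compaction fires once the conversation exceeds this count
    summaryMode: "incremental", // chain the prior summary (the default)
  },
  context: {
    policy: "hybrid",
    toolResponsePolicy: "summarize_archive", // retention applied when tool messages are rewritten
  },
});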

Important fields:

summarization.enable
summarization.maxTokens — output ceiling (and the fallback trigger when summaryTriggerTokens is omitted)
summarization.summaryTriggerTokens — token threshold that triggers compaction; falls back to maxTokens, then to limits.maxContextTokens
summarization.summaryMode — "incremental" (default) chains the prior summary; "full_rewrite" discards it
summarization.summaryPromptMaxTokens — caps the summarization prompt size
summarization.integrityCheck — verifies stable facts survive across passes (default true)
context.toolResponsePolicy — retention applied when the summarizer rewrites tool messages
toolResponses.defaultPolicy — same as above; takes precedence over context.toolResponsePolicy
toolResponses.toolResponseRetentionByTool — per-tool override, highest priority
toolResponses.criticalTools — tool names that are never reduced (defaults include response, manage_todo_list, get_tool_response)
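The fallback and precedence rules in the field list can be sketched as two small lookups. This is only an illustration under stated assumptions: the option names mirror the fields above, but `RetentionOptions`, `resolveTriggerTokens`, `resolveRetention`, `isNeverReduced`, and the `search_docs` tool name are hypothetical and are not part of the SDK.

```typescript
// Hypothetical option shape: field names follow the list above,
// everything else in this sketch is illustrative.
interface RetentionOptions {
  summarization?: { maxTokens?: number; summaryTriggerTokens?: number };
  limits?: { maxContextTokens?: number };
  context?: { toolResponsePolicy?: string };
  toolResponses?: {
    defaultPolicy?: string;
    toolResponseRetentionByTool?: Record<string, string>;
    criticalTools?: string[];
  };
}

// summaryTriggerTokens falls back to maxTokens, then to limits.maxContextTokens.
function resolveTriggerTokens(opts: RetentionOptions): number | undefined {
  return (
    opts.summarization?.summaryTriggerTokens ??
    opts.summarization?.maxTokens ??
    opts.limits?.maxContextTokens
  );
}

// Per-tool override first, then toolResponses.defaultPolicy,
// then context.toolResponsePolicy.
function resolveRetention(opts: RetentionOptions, toolName: string): string | undefined {
  const tr = opts.toolResponses;
  return (
    tr?.toolResponseRetentionByTool?.[toolName] ??
    tr?.defaultPolicy ??
    opts.context?.toolResponsePolicy
  );
}

// Critical tools are never reduced, whatever policy would otherwise apply.
function isNeverReduced(opts: RetentionOptions, toolName: string): boolean {
  return opts.toolResponses?.criticalTools?.includes(toolName) ?? false;
}

const exampleOptions: RetentionOptions = {
  summarization: { maxTokens: 24000, summaryTriggerTokens: 17000 },
  context: { toolResponsePolicy: "summarize_archive" },
  toolResponses: {
    toolResponseRetentionByTool: { search_docs: "keep_full" }, // hypothetical tool name
    criticalTools: ["response", "manage_todo_list", "get_tool_response"],
  },
};

const triggerTokens = resolveTriggerTokens(exampleOptions); // 17000
const searchPolicy = resolveRetention(exampleOptions, "search_docs"); // "keep_full"
const fallbackPolicy = resolveRetention(exampleOptions, "fetch_page"); // "summarize_archive"
```

Keeping the critical-tool check separate from the policy lookup avoids guessing how the runtime orders the two; the source only states that critical tools are never reduced and that the per-tool override has the highest priority among policies.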

How the trigger interacts with the loop

The base agent checks the threshold on every iteration before invoking the model. When the conversation token count exceeds summaryTriggerTokens:

The base agent sets __needsSummarization and breaks out so the smart wrapper can run compaction.
The smart wrapper runs the summarizer.
If the summarizer cannot compress anything (e.g. every remaining tool response is keep_full), the runtime sets __summarizationExhausted and falls through to a normal model call instead of deadlocking.
As soon as a new compactable tool result is appended on a later turn, the runtime automatically clears __summarizationExhausted so future compaction can fire again.

This keeps the loop making progress even when retention policies temporarily block compaction.
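The flag handling described above can be modeled as a small state machine. The sketch below is a toy simulation, not the runtime's code: `LoopState`, `runIteration`, and `appendToolResult` are hypothetical, token counts are plain numbers, and the summarizer is assumed to replace compactable content with a summary one tenth of its size.

```typescript
type IterationOutcome = "compacted" | "model_call";

// Hypothetical loop state; only the two flag names come from the text above.
interface LoopState {
  tokenCount: number;        // tokens currently in live context
  compactableTokens: number; // tokens the retention policies allow the summarizer to reduce
  __needsSummarization: boolean;
  __summarizationExhausted: boolean;
}

// One loop iteration: check the threshold before the model call.
function runIteration(state: LoopState, triggerTokens: number): IterationOutcome {
  if (state.tokenCount > triggerTokens && !state.__summarizationExhausted) {
    // Base agent: flag the need and break out to the smart wrapper.
    state.__needsSummarization = true;

    // Smart wrapper: run the summarizer (assumed 10:1 compression).
    const reclaimable = state.compactableTokens;
    state.__needsSummarization = false;
    if (reclaimable > 0) {
      const summaryTokens = Math.ceil(reclaimable * 0.1);
      state.tokenCount = state.tokenCount - reclaimable + summaryTokens;
      state.compactableTokens = 0;
      return "compacted";
    }

    // Nothing could be compressed: mark exhaustion instead of deadlocking.
    state.__summarizationExhausted = true;
  }
  return "model_call";
}

// Appending a compactable tool result clears the exhausted flag.
function appendToolResult(state: LoopState, tokens: number, compactable: boolean): void {
  state.tokenCount += tokens;
  if (compactable) {
    state.compactableTokens += tokens;
    state.__summarizationExhausted = false;
  }
}
```

In this model, an over-budget state with nothing compactable still returns "model_call", and compaction becomes possible again as soon as `appendToolResult` adds compactable content.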

Recovery after compaction

When large tool outputs are summarized and archived, the runtime keeps a recovery path through get_tool_response.

That means an autonomous agent can continue with compact context, then fetch raw evidence again if it later needs the original output.
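To make the archive-and-recover idea concrete, here is a minimal in-memory sketch. It assumes archived outputs are keyed by a tool call id; the `ToolMessage` shape, the `ToolResponseArchive` class, and its method names are hypothetical stand-ins and do not describe how the SDK stores archives or what arguments get_tool_response actually takes.

```typescript
// Hypothetical message shape for illustration only.
interface ToolMessage {
  toolCallId: string;
  toolName: string;
  content: string;
  archived?: boolean;
}

// Toy archive: keeps raw outputs out of live context but retrievable by id.
class ToolResponseArchive {
  private readonly rawByCallId = new Map<string, string>();

  // Store the raw output and return a compact replacement message.
  archive(message: ToolMessage, summary: string): ToolMessage {
    this.rawByCallId.set(message.toolCallId, message.content);
    return { ...message, content: summary, archived: true };
  }

  // Stand-in for what a get_tool_response lookup might return.
  getToolResponse(toolCallId: string): string | undefined {
    return this.rawByCallId.get(toolCallId);
  }
}

const archive = new ToolResponseArchive();
const rawMessage: ToolMessage = {
  toolCallId: "call_1", // hypothetical id
  toolName: "search_docs", // hypothetical tool name
  content: "full search results: " + "x".repeat(5000),
};

// Live context carries only the summary; the raw text stays recoverable.
const compactMessage = archive.archive(rawMessage, "Search returned 12 matching pages.");
const recovered = archive.getToolResponse("call_1");
```

The point of the design is that compaction changes what the model sees on each turn, not what evidence exists: the summary travels in context while the original output waits behind a lookup.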

State surfaces to inspect

state.summaryRecords
state.toolHistoryArchived

These surfaces tell you not only that summarization happened, but also which evidence remains available for later recovery.
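A small helper can turn those two surfaces into a quick status report. This sketch assumes both fields are arrays and treats their entries as opaque; the entry shapes are not documented here, and `CompactionSurfaces` and `describeCompaction` are hypothetical names.

```typescript
// Assumption: both surfaces are arrays; entry shapes are left opaque.
interface CompactionSurfaces {
  summaryRecords?: unknown[];
  toolHistoryArchived?: unknown[];
}

interface CompactionReport {
  summarized: boolean;   // at least one summarization pass was recorded
  summaryCount: number;  // number of summary records
  archivedCount: number; // number of archived tool history entries
}

function describeCompaction(state: CompactionSurfaces): CompactionReport {
  const summaryCount = state.summaryRecords?.length ?? 0;
  const archivedCount = state.toolHistoryArchived?.length ?? 0;
  return { summarized: summaryCount > 0, summaryCount, archivedCount };
}
```

Calling `describeCompaction(state)` after a run gives a cheap way to log whether compaction fired and how much evidence is still available for recovery.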

When autonomous agents benefit most

Summarization is especially helpful when the agent:

performs repeated search or retrieval
calls MCP tools that return large payloads
uses multi-step planning over a longer session
needs resume and recovery without replaying every raw tool result into live context

When to disable it

You can turn summarization off for debugging or short deterministic runs:

const agent = createSmartAgent({
  model,
  tools,
  summarization: false,
});

That is usually appropriate only when the task is short or when you are inspecting exact raw transcripts for debugging.

Design rule to remember

Summarization is not a cosmetic feature. It is a runtime survival mechanism for agents that need to think and act across longer horizons.
