blue/.blue/docs/spikes/2026-01-26T1843Z-read-tool-token-limit-on-assembled-dialogue-documents.wip.md
Eric Garcia 02901dfec7 chore: batch commit - ADRs, RFCs, dialogues, spikes, and code updates
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 16:28:31 -05:00


Spike: Read tool token limit on assembled dialogue documents

Status: In Progress
Date: 2026-01-26
Time Box: 30 minutes

Question

Why does the alignment dialogue fail with token limit errors when using file-based subagent output?


Root Cause

The error occurs when the Judge agent tries to read the assembled dialogue document after completing all rounds. Individual agent output files are small (~2-3KB each, ~400 words), but the combined dialogue document accumulates:

  • 3-6 expert perspectives per round
  • Multiple rounds (typically 2-3)
  • Each perspective ~400 words
  • Plus judge synthesis, tension markers, and metadata

Result: A 4-round dialogue with 5 experts produces ~10KB per round × 4 = ~40KB+ of assembled text. The failing read recorded under Evidence was measured at 31,767 tokens, over the Read tool's 25,000-token limit.
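The arithmetic above can be sketched as a small estimator. This is a rough illustration only: the function names and the default of 2 KB per perspective are assumptions chosen to reproduce the ~10KB-per-round figure, and the sketch stays in kilobytes because the KB-to-token ratio depends on the tokenizer and is not modeled here.

```python
def estimate_round_kb(experts: int,
                      kb_per_perspective: float = 2.0,
                      overhead_kb: float = 0.0) -> float:
    """Approximate size of one round: expert perspectives plus judge overhead.

    kb_per_perspective and overhead_kb are illustrative assumptions,
    not measured values.
    """
    return experts * kb_per_perspective + overhead_kb


def estimate_assembled_kb(rounds: int,
                          experts: int,
                          kb_per_perspective: float = 2.0,
                          overhead_kb: float = 0.0) -> float:
    """Approximate size of the fully assembled dialogue across all rounds."""
    per_round = estimate_round_kb(experts, kb_per_perspective, overhead_kb)
    return rounds * per_round


# 4 rounds x 5 experts x 2 KB each, matching the ~40KB figure in the text.
print(estimate_assembled_kb(rounds=4, experts=5))  # 40.0
```

The growth is linear in both rounds and experts, so any design that re-reads the whole assembled document will eventually cross a fixed per-read limit.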

Evidence

Error observed:

Read(~/.claude/projects/-Users-ericg-letemcook-fungal-image-analysis/acd9a1b2-29fd-437c-a1
Error: File content (31767 tokens) exceeds maximum allowed tokens (25000)

The path ~/.claude/projects/... is where Claude stores Task output, suggesting the Judge was reading back its own assembled document (not the individual /tmp/blue-dialogue/{slug}/round-N/{agent}.md files).

Already Documented

RFC 0029 (file-based-subagent-output) captured this as Churro T02 (open question at line 159):

When agent output exceeds Write tool buffer limits, should the Task system JSONL approach serve as fallback?

The original dialogue noted:

TENSION T02: Stream vs document modes — when agent output exceeds buffer

What Works

  • Individual agent files in /tmp/blue-dialogue/{slug}/round-N/{agent}.md (~2-3KB each)
  • Write tool successfully stores agent perspectives
  • Round-scoped paths prevent collisions
  • Fallback to blue_extract_dialogue(task_id=...) exists for missing files

What Breaks

  1. Assembled dialogue documents can exceed Read tool's 25K token limit
  2. Judge can't verify its own writes to large dialogue files
  3. No paginated read strategy in the judge protocol

Options

A. Paginated reading

Judge reads dialogue with offset/limit parameters. Requires tracking document structure to know what to skip.
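A minimal sketch of what paginated reading could look like, assuming that offset and limit are counted in lines. The helpers `read_window` and `iter_windows` are hypothetical names for illustration and are not part of the Read tool or the judge protocol.

```python
from itertools import islice
from pathlib import Path
from typing import Iterator, List


def read_window(path: Path, offset: int, limit: int) -> List[str]:
    """Return up to `limit` lines starting at 0-based line `offset`."""
    with path.open(encoding="utf-8") as fh:
        return list(islice(fh, offset, offset + limit))


def iter_windows(path: Path, limit: int) -> Iterator[List[str]]:
    """Yield consecutive line windows until the file is exhausted."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    offset = 0
    while True:
        window = read_window(path, offset, limit)
        if not window:
            return
        yield window
        offset += len(window)
```

A fixed line window does not bound the token count of a window by itself, and a window can split a perspective in half. That is the structure-tracking cost this option carries.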

B. Streaming writes, chunk reads

Each round writes to a separate section file. Judge assembles by reading chunks. More complexity.

C. Trust-but-verify pattern

Judge writes without reading back the full document. Only reads individual agent files which stay small. Final document assembly happens at dialogue completion, not during.

D. Summary-based continuation

After each round, Judge writes a summary of accumulated state rather than re-reading the full document. Avoids needing to read large files.

Recommendation

Option C (trust-but-verify) aligns with the file-based approach:

  1. Judge reads individual agent output files (always small)
  2. Judge appends to dialogue document without re-reading it
  3. blue_dialogue_save handles final assembly and validation
  4. Remove any Judge instructions that require reading the full assembled document mid-dialogue

This requires updating build_judge_protocol in dialogue.rs so that it no longer instructs the Judge to read back its own document.
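The append-without-re-read flow in steps 1 and 2 might be sketched as follows. This is an illustration under assumptions, not the project's implementation: `append_round`, `grew_by`, and the section layout are hypothetical, and the size check is one possible way to confirm a write without reading the document back.

```python
from pathlib import Path
from typing import Iterable


def append_round(dialogue_path: Path, round_index: int,
                 agent_files: Iterable[Path], synthesis: str) -> int:
    """Append one round to the dialogue file; return the bytes written.

    Only the small per-agent files are read. The dialogue file itself
    is opened in append mode and never read back.
    """
    sections = [f"Round {round_index}\n"]
    for agent_file in agent_files:
        body = agent_file.read_text(encoding="utf-8").strip()
        sections.append(f"[{agent_file.stem}]\n{body}\n")
    sections.append(f"[judge synthesis]\n{synthesis.strip()}\n")
    block = "\n".join(sections) + "\n"
    # newline="\n" disables newline translation so the byte count is exact.
    with dialogue_path.open("a", encoding="utf-8", newline="\n") as fh:
        fh.write(block)
    return len(block.encode("utf-8"))


def grew_by(dialogue_path: Path, size_before: int, expected: int) -> bool:
    """Check the write via file metadata instead of re-reading content."""
    return dialogue_path.stat().st_size - size_before == expected
```

Comparing the file size before and after the append gives a cheap sanity check that stays independent of how large the document has grown.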


Alignment Dialogue Outcome

A 3-expert alignment dialogue reached 100% convergence on an improved architecture:

Dialogue: .blue/docs/dialogues/2026-01-26T1850Z-round-scoped-file-architecture-for-alignment-dialogues.dialogue.recorded.md

Final Architecture

/tmp/blue-dialogue/{slug}/
├─ round-0/
│  ├─ muffin.md          ← Agents write (working artifacts)
│  ├─ cupcake.md
│  └─ scone.md
├─ round-0.dialogue.md   ← Judge assembles (continuity artifact)
├─ round-1/
│  └─ {agent}.md
├─ round-1.dialogue.md
└─ .archive/             ← Post-round archive (optional)
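The layout above can be expressed as two path helpers. The function names are hypothetical and used only for illustration. The directory and file patterns are taken from the tree.

```python
from pathlib import PurePosixPath

# Root directory as shown in the tree above.
ROOT = PurePosixPath("/tmp/blue-dialogue")


def agent_file(slug: str, round_index: int, agent: str) -> PurePosixPath:
    """Working artifact written by one agent in one round."""
    return ROOT / slug / f"round-{round_index}" / f"{agent}.md"


def round_dialogue_file(slug: str, round_index: int) -> PurePosixPath:
    """Continuity artifact assembled by the Judge after a round."""
    return ROOT / slug / f"round-{round_index}.dialogue.md"


print(agent_file("demo", 0, "muffin"))
# /tmp/blue-dialogue/demo/round-0/muffin.md
print(round_dialogue_file("demo", 1))
# /tmp/blue-dialogue/demo/round-1.dialogue.md
```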

Key Resolutions

| Tension | Resolution |
|---|---|
| Stateless vs stateful synthesis | Stateful by reference: global tension IDs (T01, T02...) enable cross-round references without copying content |
| What content in synthesis | Full round content: synthesis + all expert perspectives + metadata (~8-12KB per round, safely under the 25K-token Read limit) |
| Cross-round tension references | Global namespace: T01, T02, T03... never reused across rounds |
| Dual-write burden on Judge | Necessary separation of concerns: prompt templating (pre-round) and synthesis assembly (post-round) serve different consumers |
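The global tension namespace could be kept with a single counter that outlives any one round. The sketch below is an assumption about how such a registry might look. `TensionRegistry` is a hypothetical name, and the two-digit format is taken from the T01, T02 examples.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TensionRegistry:
    """Hands out tension IDs from one counter shared by all rounds."""

    _next_number: int = 1
    opened_in_round: Dict[str, int] = field(default_factory=dict)

    def open(self, round_index: int) -> str:
        """Allocate the next ID and remember which round raised it."""
        tension_id = f"T{self._next_number:02d}"
        self._next_number += 1
        self.opened_in_round[tension_id] = round_index
        return tension_id


registry = TensionRegistry()
print(registry.open(0))  # T01
print(registry.open(0))  # T02
print(registry.open(1))  # T03
```

Because the counter never resets, a later round can cite T01 by ID instead of copying the earlier round's content.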

Implementation Changes Required

  1. Judge reads per round: ~15-20KB max
    • Current round agent files (~2-3KB × agents)
    • Prior round's round-N.dialogue.md only (~8-12KB), NOT the full history
  2. Judge writes per round:
    • Agent prompt files (pre-round, templated)
    • Round dialogue file (post-round, synthesis + perspectives)
  3. Agents read per round:
    • All prior round-N.dialogue.md files for context
    • Source grounding files specified in prompt

This eliminates the token overflow by ensuring no single Read exceeds 25K tokens.
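The read sets described above can be sketched as two functions that return the files each party would open in a given round. The function names are hypothetical, and source grounding files are omitted because they depend on the prompt.

```python
from pathlib import PurePosixPath
from typing import List, Sequence

ROOT = PurePosixPath("/tmp/blue-dialogue")


def judge_read_set(slug: str, round_index: int,
                   agents: Sequence[str]) -> List[PurePosixPath]:
    """Current round's agent files plus the prior round's dialogue only."""
    base = ROOT / slug
    paths = [base / f"round-{round_index}" / f"{agent}.md" for agent in agents]
    if round_index > 0:
        # Only the immediately preceding round, never the full history.
        paths.append(base / f"round-{round_index - 1}.dialogue.md")
    return paths


def agent_context_set(slug: str, round_index: int) -> List[PurePosixPath]:
    """All prior round dialogue files, each read as a separate file."""
    return [ROOT / slug / f"round-{n}.dialogue.md" for n in range(round_index)]
```

The Judge's read set stays bounded by the agent count plus one file, while agents read more files as rounds accumulate but each file stays at per-round size.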