Eric Garcia 0fea499957 feat: lifecycle suffixes for all document states + resolve all clippy warnings

Every document filename now mirrors its lifecycle state with a status
suffix (e.g., .draft.md, .wip.md, .accepted.md). No more bare .md for
tracked document types. Also renamed all from_str methods to parse to
avoid FromStr trait confusion, introduced StagingDeploymentParams struct,
and fixed all 19 clippy warnings across the codebase.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-26 12:19:46 -05:00


Spike: Dialogue Generation Linter Mismatch


Status	In Progress
Date	2026-01-26
Time Box	30 minutes

Question

Why does the alignment dialogue generation code + Judge protocol produce output that fails the dialogue linter? What specific mismatches exist between what the generation scaffold creates, what the Judge protocol instructs agents to produce, and what the linter validates?

Findings

Three components interact to produce a dialogue: the scaffold generator (generate_alignment_dialogue_markdown in dialogue.rs:805-885), the Judge protocol (build_judge_protocol in dialogue.rs:887-1015), and the linter (dialogue_lint.rs). They disagree on format in several places. The linter is meant to be the authority, but the generator and Judge protocol don't always produce what it expects, and for round numbering the linter itself was wrong.

Mismatch 1: Round numbering — generator and linter wrong, agents correct

Round numbering starts at 0. Opening Arguments is Round 0.

| Component | Behavior | Correct? |
| --- | --- | --- |
| Generator | Created `## Round 1: Opening Arguments` (line 878) | No (fixed to Round 0) |
| Judge protocol | Says nothing about round numbering | Missing (should specify 0-based) |
| Linter | `check_round_numbering` required first round `1`; `check_round_sequencing` required `[1, 2, ..., N]` | No (fixed to 0-based) |
| Agents | Wrote `## Round 0: Opening Arguments` | Yes |

Root cause: Generator and linter both hardcoded 1-based round numbering. The Judge protocol was silent. Agents naturally used 0-based numbering, which is correct. Fixed in generator (dialogue.rs:878), linter (dialogue_lint.rs:486, 595), and associated tests.
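The 0-based rule can be sketched as a small check. This is a minimal illustration and not the code in dialogue_lint.rs: the names `parse_round_heading` and `check_round_sequence` are hypothetical, and the sketch uses only the standard library instead of the linter's regex. It is also slightly stricter than the regex, since it only accepts h2 headings.

```rust
/// Hypothetical helper: extract N from a `## Round N ...` heading.
/// Returns None for any other line, including h3 agent headers.
fn parse_round_heading(line: &str) -> Option<u32> {
    let rest = line.trim_start().strip_prefix("##")?;
    // Reject h3 and deeper headings such as `### Muffin 🧁`.
    if rest.starts_with('#') {
        return None;
    }
    let rest = rest.trim_start();
    // Case-insensitive match on the word "Round" (5 ASCII bytes).
    let head = rest.get(..5)?;
    if !head.eq_ignore_ascii_case("round") {
        return None;
    }
    // At least one whitespace character must separate "Round" from the number.
    let tail = &rest[5..];
    if !tail.starts_with(char::is_whitespace) {
        return None;
    }
    let digits: String = tail
        .trim_start()
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    digits.parse().ok()
}

/// Check that round numbers form the 0-based sequence [0, 1, ..., N].
fn check_round_sequence(rounds: &[u32]) -> Result<(), String> {
    for (expected, &actual) in rounds.iter().enumerate() {
        if actual as usize != expected {
            return Err(format!(
                "expected Round {}, found Round {}",
                expected, actual
            ));
        }
    }
    Ok(())
}
```

With this sketch, a dialogue whose rounds are numbered 1, 2, 3 is rejected at the first heading, while 0, 1, 2 passes.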

Mismatch 2: Agent header emoji order — generator vs Judge protocol vs linter

| Component | Format |
| --- | --- |
| Generator | `### {} {}` → `### Muffin 🧁` (name first, line 880) |
| Judge protocol | "Round headers use emoji prefix (### 🧁 Muffin)" (line 989) |
| Linter | Regex `r"###\s(\w+)\s([🧁💙]?)"`: expects name first, emoji second |

Root cause: The Judge protocol instructs emoji-first (### 🧁 Muffin) but the generator scaffold and the linter regex both assume name-first (### Muffin 🧁). If agents follow the Judge protocol, the linter's regex fails to match the header, because (\w+) cannot match an emoji, so the emoji-consistency check can't parse the emoji. If they follow the scaffold, the Judge protocol is violated.
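One way to make this mismatch visible instead of silent is to classify which order a header uses. The sketch below is an assumption about how such a check could look, not the linter's implementation: `HeaderOrder`, `is_name`, and `classify_agent_header` are hypothetical names, and any leading token that is not a word is treated as the emoji.

```rust
#[derive(Debug, PartialEq)]
enum HeaderOrder {
    NameFirst,  // `### Muffin 🧁`
    EmojiFirst, // `### 🧁 Muffin`
}

/// True if `s` is a non-empty run of word characters (letters, digits, `_`).
fn is_name(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_alphanumeric() || c == '_')
}

/// Hypothetical helper: return the agent name and the header order,
/// or None if the line is not an h3 agent header.
fn classify_agent_header(line: &str) -> Option<(String, HeaderOrder)> {
    let rest = line.trim().strip_prefix("###")?;
    // h4 and deeper headings are not agent headers.
    if rest.starts_with('#') {
        return None;
    }
    let mut parts = rest.split_whitespace();
    let first = parts.next()?;
    let second = parts.next();
    if is_name(first) {
        // Name comes first; the emoji (if any) follows it.
        Some((first.to_string(), HeaderOrder::NameFirst))
    } else {
        // Assume the first token is an emoji; a word must follow it.
        let name = second.filter(|s| is_name(s))?;
        Some((name.to_string(), HeaderOrder::EmojiFirst))
    }
}
```

A linter built this way could report "emoji-first header, expected name-first" as an explicit diagnostic once the order is settled.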

Mismatch 3: Judge assessment sections not in linter's model

The Judge adds ## 💙 Judge: Round N Assessment sections as h2 headings. These are neither rounds (## Round N) nor agent responses (### Name 🧁). The linter's round regex r"(?i)##\s*Round\s+(\d+)" doesn't match them (correct), but it also doesn't account for them in the document structure model. If a Judge writes ## Round 0 Assessment without the 💙 Judge: prefix, it would be parsed as a round and break sequencing.

Root cause: No explicit format specification for Judge assessment sections in either the protocol or the linter.
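A document structure model that knows about Judge sections could classify every h2 heading explicitly. The following is a minimal sketch under assumed names (`H2Kind`, `classify_h2`); it treats `## Round N Assessment` without the Judge prefix as its own ambiguous case so that it is reported and not counted as a round.

```rust
#[derive(Debug, PartialEq)]
enum H2Kind {
    Round(u32),
    JudgeAssessment(u32),
    /// `## Round N Assessment` without the `💙 Judge:` prefix.
    AmbiguousAssessment(u32),
    Other,
}

/// Hypothetical helper: classify one h2 heading line.
fn classify_h2(line: &str) -> H2Kind {
    let rest = match line.trim().strip_prefix("## ") {
        Some(r) => r.trim_start(),
        None => return H2Kind::Other,
    };
    // Optional Judge prefix.
    let (is_judge, rest) = match rest.strip_prefix("💙 Judge:") {
        Some(r) => (true, r.trim_start()),
        None => (false, rest),
    };
    let after_round = match rest.strip_prefix("Round ") {
        Some(r) => r,
        None => return H2Kind::Other,
    };
    let digits: String = after_round
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    let n: u32 = match digits.parse() {
        Ok(n) => n,
        Err(_) => return H2Kind::Other,
    };
    // Whatever follows the number decides between round and assessment.
    let tail = after_round[digits.len()..].trim();
    match (is_judge, tail == "Assessment") {
        (true, true) => H2Kind::JudgeAssessment(n),
        (false, true) => H2Kind::AmbiguousAssessment(n),
        (false, false) => H2Kind::Round(n),
        (true, false) => H2Kind::Other,
    }
}
```

Only `H2Kind::Round` values would feed the sequencing check; the ambiguous case can be surfaced as a lint warning.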

Mismatch 4: Perspective ID width — protocol vs linter

| Component | Behavior |
| --- | --- |
| Agent prompt | Says `[PERSPECTIVE Pnn: brief label]`, which implies 2-digit |
| Linter | Regex `r"(?i)\[\sPERSPECTIVE\s+P(\d{2})\s:"`: strictly 2-digit |
| Risk | If an agent writes `[PERSPECTIVE P1: ...]` (1 digit), the linter silently ignores it |

Root cause: The agent prompt template uses Pnn which looks like a template placeholder, not a format directive. Agents may use P1 instead of P01. The linter won't parse single-digit IDs, leading to missing entries in the Perspectives Inventory.
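Independently of the prompt wording, the linter could flag a wrong-width ID instead of dropping it. This sketch assumes hypothetical names (`PerspectiveId`, `parse_perspective_marker`), looks only at the first `[` on the line, and uses the standard library in place of the linter's regex.

```rust
#[derive(Debug, PartialEq)]
enum PerspectiveId {
    /// Exactly two digits, e.g. `P01`.
    Valid(u8),
    /// Recognized as a marker, but the ID is not two digits (e.g. `P1`).
    WrongWidth(String),
}

/// Hypothetical helper: parse `[PERSPECTIVE Pnn: label]` from a line.
fn parse_perspective_marker(line: &str) -> Option<PerspectiveId> {
    let start = line.find('[')?;
    let inner = line[start + 1..].trim_start();
    // Case-insensitive keyword, mirroring the regex's `(?i)` flag.
    let kw = "PERSPECTIVE";
    let head = inner.get(..kw.len())?;
    if !head.eq_ignore_ascii_case(kw) {
        return None;
    }
    let after_kw = &inner[kw.len()..];
    // Whitespace is required between the keyword and the ID.
    if !after_kw.starts_with(char::is_whitespace) {
        return None;
    }
    let rest = after_kw.trim_start();
    let rest = rest.strip_prefix('P').or_else(|| rest.strip_prefix('p'))?;
    let digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
    if digits.is_empty() {
        return None;
    }
    // The ID must be terminated by a colon.
    if !rest[digits.len()..].trim_start().starts_with(':') {
        return None;
    }
    if digits.len() == 2 {
        Some(PerspectiveId::Valid(digits.parse().ok()?))
    } else {
        Some(PerspectiveId::WrongWidth(format!("P{}", digits)))
    }
}
```

A `WrongWidth` result gives the linter something concrete to report, so a `P1` entry no longer disappears from the Perspectives Inventory without a message.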

Mismatch 5: Scoreboard row regex fragility

The linter's scoreboard row regex:

r"\|\s*([🧁💙]?\s*\w+)\s*\|\s*(\d+)\s*\|\s*(\d+)\s*\|\s*(\d+)\s*\|\s*(\d+)\s*\|\s*\*\*(\d+)\*\*\s*\|"

This assumes:

- Agent name is a single \w+ word: fails for "Multi Word" names or emoji-prefixed without space
- Score columns are bare digits: fails if agent writes 3/3 or adds notes
- Total is bold **N**: fails if agent doesn't bold it

Root cause: The regex is tightly coupled to the scaffold's exact output format. Any variation by the Judge when updating scores breaks parsing. The scoreboard-math check then silently passes (no data to verify) rather than failing.
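A less brittle alternative is to split the row on `|` and return an error for anything that cannot be parsed, so the scoreboard-math check fails loudly. The sketch below is an assumption, not the linter's code: `ScoreRow` and `parse_score_row` are hypothetical names, the four score columns plus a total are inferred from the capture groups in the regex above, and the function is meant for data rows only (the caller would skip the header and separator rows).

```rust
#[derive(Debug, PartialEq)]
struct ScoreRow {
    agent: String,
    scores: [u32; 4],
    total: u32,
}

/// Hypothetical helper: parse one scoreboard data row and verify its total.
fn parse_score_row(line: &str) -> Result<ScoreRow, String> {
    let inner = line
        .trim()
        .strip_prefix('|')
        .and_then(|s| s.strip_suffix('|'))
        .ok_or_else(|| format!("not a table row: {:?}", line))?;
    let cells: Vec<&str> = inner.split('|').map(str::trim).collect();
    if cells.len() != 6 {
        return Err(format!("expected 6 cells, found {}", cells.len()));
    }
    let mut scores = [0u32; 4];
    for (slot, cell) in scores.iter_mut().zip(&cells[1..5]) {
        *slot = cell
            .parse()
            .map_err(|_| format!("bad score cell {:?}", cell))?;
    }
    // Accept the total with or without bold markers.
    let total: u32 = cells[5]
        .trim_matches('*')
        .parse()
        .map_err(|_| format!("bad total cell {:?}", cells[5]))?;
    let sum: u32 = scores.iter().sum();
    if sum != total {
        return Err(format!("total {} does not match sum {}", total, sum));
    }
    // The agent cell is kept verbatim, so multi-word names are fine.
    Ok(ScoreRow {
        agent: cells[0].to_string(),
        scores,
        total,
    })
}
```

Because every failure path returns `Err`, a row written as `3/3` or with a wrong total produces a lint error, and the check no longer passes on empty data.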

Mismatch 6: No format contract between components

The deepest root cause: there is no shared format contract. The generator, Judge protocol, and linter were built independently. Each encodes its own assumptions:

- Generator assumes its scaffold format is canonical
- Judge protocol instructs a slightly different format (emoji-first headers)
- Linter validates against its own regex patterns

There is no single source of truth for "what a valid dialogue looks like."
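One possible shape for such a contract is a small shared module that owns both the rendering and the parsing of each element, so the generator and the linter cannot drift apart. Everything below is hypothetical: the module name `dialogue_format`, its functions, and the choice of name-first headers (which follows the generator and linter as described above) are assumptions for illustration.

```rust
/// Hypothetical shared module imported by both generator and linter.
mod dialogue_format {
    /// Number of the first round (Opening Arguments).
    pub const FIRST_ROUND: u32 = 0;

    /// Render a round heading, e.g. `## Round 0: Opening Arguments`.
    pub fn round_heading(n: u32, title: &str) -> String {
        format!("## Round {}: {}", n, title)
    }

    /// Render a name-first agent header, e.g. `### Muffin 🧁`.
    pub fn agent_header(name: &str, emoji: &str) -> String {
        format!("### {} {}", name, emoji)
    }

    /// Parse an agent header back into (name, emoji).
    /// Assumes single-word names, as the rendered form above produces.
    pub fn parse_agent_header(line: &str) -> Option<(&str, &str)> {
        let rest = line.strip_prefix("### ")?;
        let (name, emoji) = rest.split_once(' ')?;
        Some((name, emoji.trim()))
    }
}
```

A round-trip test (render, then parse, then compare) over each element would then act as the executable definition of a valid dialogue, and the Judge protocol text could be generated from the same functions.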

Summary

| # | Mismatch | Severity | Status |
| --- | --- | --- | --- |
| 1 | Round numbering: generator and linter used 1-based, should be 0-based | Major ×2 | Fixed |
| 2 | Agent header emoji order | Minor | Open |
| 3 | Judge assessment section format | Minor | Open |
| 4 | Perspective ID width (P1 vs P01) | Major | Open |
| 5 | Scoreboard regex fragility | Minor | Open |
| 6 | No shared format contract | Systemic | Open |

Recommendation

Recommend implementing fixes for the remaining open mismatches:

1. ~~Round numbering~~: Fixed. Generator, linter, and tests updated to 0-based.
2. Settle emoji order: Pick name-first (matches generator + linter regex) and update Judge protocol line 989 to match.
3. Specify Judge assessment format: Add ## 💙 Judge: Round N Assessment to the protocol as the mandatory format.
4. Clarify perspective ID format: Change agent prompt from Pnn to explicit "P01, P02, ... (always two digits)".
5. Loosen scoreboard regex: Handle multi-word agent names and emoji-prefixed names.
