fix: remove wrong alignment orchestration architecture (RFC 0015)
RFC 0012 was implemented with Option B (MCP orchestrates via Ollama) instead of Option A (Claude orchestrates via Task tool). This caused:
- No parallel agents spawned
- Fake Ollama responses instead of real deliberation
- Inline JSON instead of .dialogue.md files

Fix by removing blue_alignment_play tool entirely. Claude now orchestrates alignment dialogues directly using Task tool per ADR 0014.

Also:
- Add pub mod resources for RFC 0016/0017 (was missing)
- Update lib.rs threading for spawn_blocking
- Add .blue/worktrees/ to gitignore
- Update database schema

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
parent 83fb0202a6
commit 29f139e1cd
8 changed files with 318 additions and 1262 deletions
BIN
.blue/blue.db
Binary file not shown.
|
|
@@ -1,663 +0,0 @@
|
||||||
# ADR 0006: alignment-dialogue-agents
|
|
||||||
|
|
||||||
| | |
|---|---|
| **Status** | Active |
| **Date** | 2026-01-19 |
| **Updated** | 2026-01-20 (rebrand: Wisdom → Alignment) |
| **Supersedes** | Original wisdom-dialogue-agents (same ADR, renamed) |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Context
|
|
||||||
|
|
||||||
ADR 0004 established the wisdom workflow with draft → dialogue → final documents. But it left open HOW the dialogue actually happens. The spike on adversarial dialogue agents explored mechanics but missed the deeper question: what IS wisdom, and how do we measure it?
|
|
||||||
|
|
||||||
The parable of the blind men and the elephant illuminates:
|
|
||||||
- Each blind man touches one part and believes they understand the whole
|
|
||||||
- Each perspective is **internally consistent** but **partial**
|
|
||||||
- **Wisdom is the integration of all perspectives into a unified understanding**
|
|
||||||
- There is no upper limit—there's always another perspective to incorporate
|
|
||||||
|
|
||||||
This ADR formalizes ALIGNMENT as a measurable property and defines a multi-agent dialogue system to maximize it.
|
|
||||||
|
|
||||||
## Decision
|
|
||||||
|
|
||||||
Alignment dialogues are conducted by **N+1 agents**:
|
|
||||||
|
|
||||||
| Agent | Symbol | Role |
|-------|--------|------|
| **Cupcakes** | 🧁 | Perspective Contributors - each surfaces unique viewpoints, challenges, and refinements |
| **Judge** | 💙 | Arbiter - scores ALIGNMENT, tracks perspectives, guides convergence |
|
|
||||||
|
|
||||||
All 🧁 agents engage in **friendly competition** to see who can contribute more ALIGNMENT. They are partners, not adversaries—all want the RFC to be as aligned as possible. The competition is about who can *give more* to the solution, not who can *defeat* the others.
|
|
||||||
|
|
||||||
The 💙 watches with love, scores each contribution fairly, maintains the **Perspectives Inventory**, and gently guides all toward convergence.
|
|
||||||
|
|
||||||
### Scalable Perspective Diversity
|
|
||||||
|
|
||||||
The number of 🧁 agents is configurable:
|
|
||||||
- **Minimum**: 2 agents (classic Muffin/Cupcake pairing)
|
|
||||||
- **Typical**: 3-5 agents for complex RFCs
|
|
||||||
- **Maximum**: Limited only by coordination overhead
|
|
||||||
|
|
||||||
More blind men = more parts of the elephant discovered. Each 🧁 brings a different perspective, potentially using different models, prompts, or focus areas.
|
|
||||||
|
|
||||||
### Agent Count Selection
|
|
||||||
|
|
||||||
Choosing N (the number of 🧁 agents) affects both perspective diversity and consensus stability:
|
|
||||||
|
|
||||||
| Count | Use Case | Consensus Properties |
|-------|----------|---------------------|
| **N=2** | Binary decisions, simple RFCs | Classic Muffin/Cupcake. Only 0% or 100% agreement possible. Deadlock requires 💙 intervention. |
| **N=3** | Moderate complexity, clear alternatives | Odd count prevents voting deadlock. Can distinguish 67% (2/3) from 100% (3/3) agreement. |
| **N=5** | Architectural decisions, policy RFCs | Richer consensus gradients (60%, 80%, 100%). Strong signal detection. |
| **N=7+** | Highly complex, multi-domain decisions | Specialized perspectives (see RFC 0062). Consider only when domain expertise warrants. |
|
|
||||||
|
|
||||||
**SHOULD: Prefer odd N (3, 5, 7) for decisions where consensus voting applies.**
|
|
||||||
|
|
||||||
Rationale:
|
|
||||||
- **Odd N prevents structural deadlock**: With even N, agents can split 50/50 with no majority
|
|
||||||
- **Clearer consensus signals**: N=3 distinguishes "strong majority" from "unanimous"
|
|
||||||
- **Tie-breaking is built-in**: No need for 💙 to force resolution on evenly-split opinions
|
|
||||||
|
|
||||||
**MAY: Use N=2 for lightweight decisions** where the classic Advocate/Challenger dynamic suffices. Binary perspective is appropriate when:
|
|
||||||
- The decision is yes/no or A/B
|
|
||||||
- Deep exploration isn't needed
|
|
||||||
- Speed matters more than consensus nuance
|
|
||||||
|
|
||||||
**Tie-Breaking (when N is even)**: If agents split evenly, 💙 scores the unresolved tension and guides toward ALIGNMENT rather than forcing majority rule. The 💙 may also surface a perspective that breaks the deadlock.
|
|
||||||
|
|
||||||
**Complexity Trade-off**: Each additional agent adds coordination overhead. Balance perspective diversity against round duration. N=3 is often the sweet spot—odd count with manageable complexity.
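
The consensus properties in the table above follow from simple counting. As a minimal sketch (the function names `majority_levels` and `can_deadlock` are illustrative and not part of any Blue tooling), the majority shares available for a given N, and whether an even split is possible, could be enumerated like this:

```python
from fractions import Fraction


def majority_levels(n: int) -> list[Fraction]:
    """Return every agreement share k/n that is a strict majority (k > n/2)."""
    return [Fraction(k, n) for k in range(n // 2 + 1, n + 1)]


def can_deadlock(n: int) -> bool:
    """An even split (n/2 against n/2) is only possible when n is even."""
    return n % 2 == 0


if __name__ == "__main__":
    for n in (2, 3, 5):
        shares = ", ".join(f"{float(s):.0%}" for s in majority_levels(n))
        print(f"N={n}: majority levels {shares}; even split possible: {can_deadlock(n)}")
```

For N=3 this yields the 2/3 and 3/3 levels from the table, and for N=5 the 60%, 80%, and 100% gradients. Odd N never reports a possible even split.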
|
|
||||||
|
|
||||||
## The ALIGNMENT Definition
|
|
||||||
|
|
||||||
### The Blind Men and the Elephant
|
|
||||||
|
|
||||||
Each blind man touches one part of the elephant:
|
|
||||||
- Trunk: "It's a snake!"
|
|
||||||
- Leg: "It's a tree!"
|
|
||||||
- Ear: "It's a fan!"
|
|
||||||
- Tail: "It's a rope!"
|
|
||||||
|
|
||||||
Each is **internally consistent** but **partial** (missing other views).
|
|
||||||
|
|
||||||
**Wisdom is the integration of all perspectives into a unified understanding that honors each part while seeing the whole.**
|
|
||||||
|
|
||||||
### The Full ALIGNMENT Measure (ADR 0001)
|
|
||||||
|
|
||||||
```
|
|
||||||
ALIGNMENT = Wisdom + Consistency + Truth + Relationships
|
|
||||||
|
|
||||||
Where:
|
|
||||||
- Wisdom: Integration of perspectives (the blind men parable)
|
|
||||||
- Consistency: Pattern compliance (ADR 0005)
|
|
||||||
- Truth: Single source, no drift (ADR 0003)
|
|
||||||
- Relationships: Graph completeness (ADR 0002)
|
|
||||||
```
|
|
||||||
|
|
||||||
### No Upper Limit
|
|
||||||
|
|
||||||
All dimensions are **UNBOUNDED**. There's always another perspective. Another edge case. Another stakeholder. Another context. Another timeline. Another world.
|
|
||||||
|
|
||||||
ALIGNMENT isn't a destination. It's a direction. The score can always go higher.
|
|
||||||
|
|
||||||
## The ALIGNMENT Score
|
|
||||||
|
|
||||||
Each turn, the 💙 scores the contribution across four dimensions. **All dimensions are unbounded** - there is no maximum score.
|
|
||||||
|
|
||||||
| Dimension | Question |
|-----------|----------|
| **Wisdom** | How many perspectives integrated? How well synthesized into unity? |
| **Consistency** | Does it follow established patterns? Internally consistent? |
| **Truth** | Grounded in reality? Single source of truth? No contradictions? |
| **Relationships** | How does it connect to other artifacts? Graph completeness? |
|
|
||||||
|
|
||||||
**ALIGNMENT = Wisdom + Consistency + Truth + Relationships**
|
|
||||||
|
|
||||||
### Why Unbounded?
|
|
||||||
|
|
||||||
Bounded scores (0-5) created artificial ceilings. A truly exceptional contribution that surfaces 10 new perspectives and integrates them beautifully shouldn't be capped at "5/5 for coverage."
|
|
||||||
|
|
||||||
Unbounded scoring:
|
|
||||||
- Rewards exceptional contributions proportionally
|
|
||||||
- Removes gaming incentives (can't "max out" a dimension)
|
|
||||||
- Reflects reality: there's always more ALIGNMENT to achieve
|
|
||||||
- Makes velocity meaningful: +2 vs +20 tells you something
|
|
||||||
|
|
||||||
### ALIGNMENT Velocity
|
|
||||||
|
|
||||||
The dialogue tracks cumulative ALIGNMENT:
|
|
||||||
|
|
||||||
```
|
|
||||||
Total ALIGNMENT = Σ(all turn scores)
|
|
||||||
ALIGNMENT Velocity = score(round N) - score(round N-1)
|
|
||||||
```
|
|
||||||
|
|
||||||
When **ALIGNMENT Velocity approaches zero**, the dialogue is converging. New rounds aren't adding perspectives. Time to finalize.
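
To make the arithmetic concrete, here is a small sketch of the two quantities. It assumes that `score(round k)` in the formula means the cumulative total after round k, so velocity is the number of points a round added. That reading is an interpretation, not something this ADR states outright. The turn scores, the `epsilon` tolerance, and all function names are hypothetical.

```python
from itertools import accumulate


def cumulative_totals(round_scores: list[list[int]]) -> list[int]:
    """Total ALIGNMENT after each round: running sum of all turn scores so far."""
    return list(accumulate(sum(turns) for turns in round_scores))


def velocities(round_scores: list[list[int]]) -> list[int]:
    """Velocity per round: total after round k minus total after round k-1."""
    totals = cumulative_totals(round_scores)
    previous = [0] + totals[:-1]
    return [now - before for now, before in zip(totals, previous)]


def has_plateaued(round_scores: list[list[int]], epsilon: int = 1, window: int = 2) -> bool:
    """True when each of the last `window` rounds added at most `epsilon` points."""
    recent = velocities(round_scores)[-window:]
    return len(recent) == window and all(v <= epsilon for v in recent)


# Hypothetical turn scores: one inner list per round, one number per agent turn.
example = [[15, 14, 16, 15], [12, 11, 12, 10], [10, 11, 13, 11]]

if __name__ == "__main__":
    print(cumulative_totals(example))  # [60, 105, 150]
    print(velocities(example))         # [60, 45, 45]
    print(has_plateaued(example))      # False
```

With these numbers the dialogue is still gaining 45 points per round, so it has not plateaued. Two further rounds that add almost nothing would flip `has_plateaued` to true.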
|
|
||||||
|
|
||||||
## The Agents
|
|
||||||
|
|
||||||
### 🧁 Cupcakes (Perspective Contributors)
|
|
||||||
|
|
||||||
All 🧁 agents share the same core prompt, differentiated only by their assigned name:
|
|
||||||
|
|
||||||
```
|
|
||||||
You are {NAME} 🧁 in an ALIGNMENT-seeking dialogue with your fellow Cupcakes 🧁🧁🧁.
|
|
||||||
|
|
||||||
Your role:
|
|
||||||
- SURFACE perspectives others may have missed
|
|
||||||
- DEFEND valuable ideas with love, not ego
|
|
||||||
- CHALLENGE assumptions with curiosity, not destruction
|
|
||||||
- INTEGRATE perspectives that resonate
|
|
||||||
- CONCEDE gracefully when others see something you missed
|
|
||||||
- CELEBRATE when others make the solution stronger
|
|
||||||
|
|
||||||
You're in friendly competition: who can contribute MORE to the final ALIGNMENT?
|
|
||||||
But remember—you ALL win when the RFC is aligned. There are no losers here.
|
|
||||||
|
|
||||||
When another 🧁 challenges you, receive it as a gift.
|
|
||||||
When you refine based on their input, thank them.
|
|
||||||
When you see something they missed, offer it gently.
|
|
||||||
|
|
||||||
Format:
|
|
||||||
### {NAME} 🧁
|
|
||||||
|
|
||||||
[Your response]
|
|
||||||
|
|
||||||
[PERSPECTIVE Pxx: ...] - new viewpoint you're surfacing
|
|
||||||
[TENSION Tx: ...] - unresolved issue needing attention
|
|
||||||
[REFINEMENT: ...] - when you're improving the proposal
|
|
||||||
[CONCESSION: ...] - when another 🧁 was right
|
|
||||||
[RESOLVED Tx: ...] - when addressing a tension
|
|
||||||
```
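
Because the bracketed markers in the format above are regular, the Judge's bookkeeping can be partly mechanical. The following is a minimal sketch of extracting them from one turn. The marker names come from the prompt, but the ID pattern (`P` or `T` followed by digits) and the helper names are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Marker names are taken from the prompt format; the ID pattern is an assumption.
MARKER_RE = re.compile(
    r"\[(PERSPECTIVE|TENSION|REFINEMENT|CONCESSION|RESOLVED)"
    r"(?:\s+([PT]\d+))?:\s*([^\]]*)\]"
)


def parse_markers(turn_text: str) -> dict:
    """Group every [MARKER id: text] found in one agent turn by marker kind."""
    found = defaultdict(list)
    for kind, marker_id, body in MARKER_RE.findall(turn_text):
        found[kind].append((marker_id or None, body.strip()))
    return dict(found)


def open_tensions(turns: list[str]) -> set[str]:
    """Tension IDs that were raised but have no matching RESOLVED marker yet."""
    raised, resolved = set(), set()
    for text in turns:
        markers = parse_markers(text)
        raised.update(i for i, _ in markers.get("TENSION", []) if i)
        resolved.update(i for i, _ in markers.get("RESOLVED", []) if i)
    return raised - resolved
```

Markers without an ID, such as `[CONCESSION: ...]`, are returned with `None` in the ID position.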
|
|
||||||
|
|
||||||
**Agent Naming**: Each 🧁 receives a unique name (Muffin, Cupcake, Scone, Croissant, Brioche, etc.) for identification in the scoreboard and dialogue. All share the 🧁 symbol.
|
|
||||||
|
|
||||||
### 💙 Judge (Arbiter)
|
|
||||||
|
|
||||||
The Judge role is typically played by the main Claude session orchestrating the dialogue. The Judge:
|
|
||||||
|
|
||||||
- **SPAWNS** all 🧁 agents in parallel at each round
|
|
||||||
- **SCORES** each contribution fairly across all four ALIGNMENT dimensions (unbounded)
|
|
||||||
- **MAINTAINS** the Perspectives Inventory and Tensions Tracker
|
|
||||||
- **MERGES** contributions from all agents into the dialogue record
|
|
||||||
- **IDENTIFIES** perspectives no agent has surfaced yet
|
|
||||||
- **GUIDES** gently toward convergence when ALIGNMENT plateaus
|
|
||||||
- **CELEBRATES** all participants—they are partners, not opponents
|
|
||||||
|
|
||||||
The 💙 loves them all. Wants them all to shine. Helps them find the most aligned path together.
|
|
||||||
|
|
||||||
### Judge ≠ Author Clarification (RFC 0059)
|
|
||||||
|
|
||||||
**Concern**: If the Judge wrote the draft, might it be biased toward its own creation?
|
|
||||||
|
|
||||||
**Resolution**: The architecture prevents this by design:
|
|
||||||
|
|
||||||
| Role | Who | Can Write Draft? | Context |
|------|-----|------------------|---------|
| Draft Author | Any session | Yes | Creates initial proposal |
| Judge (💙) | Orchestrating session | **No** - reads fresh | Spawns, scores, guides |
| Cupcakes 🧁 | Background tasks (N) | No | Contribute perspectives in parallel |
|
|
||||||
|
|
||||||
**Key architectural properties**:
|
|
||||||
- The Judge is the **orchestrating** session, not the drafting session
|
|
||||||
- Each 🧁 runs as an independent background task with **fresh context**
|
|
||||||
- No 🧁 has memory of previous sessions—all start fresh
|
|
||||||
- Convergence requires **consensus across all agents**, preventing single-point bias
|
|
||||||
- The Judge can surface perspectives but cannot force their adoption
|
|
||||||
- N parallel agents = N independent perspectives on the same material
|
|
||||||
|
|
||||||
## The Dialogue Flow
|
|
||||||
|
|
||||||
```
|
|
||||||
┌─────────────────────────────────────────────────────────────────────┐
|
|
||||||
│ ALIGNMENT Dialogue Flow │
|
|
||||||
│ │
|
|
||||||
│ ┌──────────┐ │
|
|
||||||
│ │ 💙 Judge │ │
|
|
||||||
│ │ spawns N │ │
|
|
||||||
│ └────┬─────┘ │
|
|
||||||
│ │ │
|
|
||||||
│ ┌────────────────────────┼────────────────────────┐ │
|
|
||||||
│ │ │ │ │ │ │
|
|
||||||
│ ▼ ▼ ▼ ▼ ▼ │
|
|
||||||
│ ┌──────┐ ┌──────┐ ┌──────────┐ ┌──────┐ ┌──────┐ │
|
|
||||||
│ │ 🧁 │ │ 🧁 │ │ Scores │ │ 🧁 │ │ 🧁 │ │
|
|
||||||
│ │Muffin│ │Scone │ │Inventory │ │Eclair│ │Donut │ ... N │
|
|
||||||
│ └──────┘ └──────┘ │ Tensions │ └──────┘ └──────┘ │
|
|
||||||
│ │ │ └──────────┘ │ │ │
|
|
||||||
│ │ │ ▲ │ │ │
|
|
||||||
│ └──────────┴─────────────┴───────────┴──────────┘ │
|
|
||||||
│ │ │
|
|
||||||
│ ▼ │
|
|
||||||
│ ┌─────────────┐ │
|
|
||||||
│ │ .dialogue.md│ │
|
|
||||||
│ │ (the record)│ │
|
|
||||||
│ └─────────────┘ │
|
|
||||||
│ │
|
|
||||||
│ EACH ROUND: Spawn N agents IN PARALLEL │
|
|
||||||
│ LOOP until: │
|
|
||||||
│ - ALIGNMENT Plateau (velocity ≈ 0) │
|
|
||||||
│ - All tensions resolved │
|
|
||||||
│ - 💙 declares convergence │
|
|
||||||
│ - Max rounds reached (safety valve) │
|
|
||||||
└─────────────────────────────────────────────────────────────────────┘
|
|
||||||
```
|
|
||||||
|
|
||||||
## Implementation Architecture
|
|
||||||
|
|
||||||
The ALIGNMENT dialogue runs in **Claude Code** using the **Task tool** with background agents.
|
|
||||||
|
|
||||||
### The N+1 Sessions
|
|
||||||
|
|
||||||
```
|
|
||||||
┌─────────────────────────────────────────────────────────────────────┐
|
|
||||||
│ MAIN CLAUDE SESSION │
|
|
||||||
│ 💙 Judge │
|
|
||||||
│ │
|
|
||||||
│ - Orchestrates the dialogue │
|
|
||||||
│ - Spawns N Cupcakes as PARALLEL background tasks │
|
|
||||||
│ - Waits for ALL to complete before scoring │
|
|
||||||
│ - Scores each turn and updates .dialogue.md │
|
|
||||||
│ - Maintains Perspectives Inventory + Tensions Tracker │
|
|
||||||
│ - Merges contributions (may find consensus or conflict) │
|
|
||||||
│ - Declares convergence │
|
|
||||||
│ - Can intervene with guidance at any time │
|
|
||||||
└───────────────────────────────────────────────────────────────────┬─┘
|
|
||||||
│
|
|
||||||
┌────────────┬─────────────┼─────────────┬────────────┐
|
|
||||||
│ Task(bg) │ Task(bg) │ Task(bg) │ Task(bg) │
|
|
||||||
▼ ▼ ▼ ▼ ▼
|
|
||||||
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
|
|
||||||
│🧁 Muffin│ │🧁 Scone │ │🧁 Eclair│ │🧁 Donut │ │🧁 ... │
|
|
||||||
│ │ │ │ │ │ │ │ │ N │
|
|
||||||
│- Reads │ │- Reads │ │- Reads │ │- Reads │ │ │
|
|
||||||
│ draft │ │ draft │ │ draft │ │ draft │ │ │
|
|
||||||
│- Reads │ │- Reads │ │- Reads │ │- Reads │ │ │
|
|
||||||
│ dialogue│ │ dialogue│ │ dialogue│ │ dialogue│ │ │
|
|
||||||
│- Writes │ │- Writes │ │- Writes │ │- Writes │ │ │
|
|
||||||
│ turn │ │ turn │ │ turn │ │ turn │ │ │
|
|
||||||
└─────────┘ └─────────┘ └─────────┘ └─────────┘ └─────────┘
|
|
||||||
│ │ │ │ │
|
|
||||||
└───────────┴──────────────┴────────────┴───────────┘
|
|
||||||
│
|
|
||||||
ALL PARALLEL
|
|
||||||
(spawned in single message)
|
|
||||||
```
|
|
||||||
|
|
||||||
### The Check-In Mechanism
|
|
||||||
|
|
||||||
All 🧁 agents can **check their scores at any time** by reading the `.dialogue.md` file. The Judge updates scores after each round (when all agents complete), so agents see the standings when they start their next turn.
|
|
||||||
|
|
||||||
```
|
|
||||||
┌──────────────────────────────────────────────────────────┐
|
|
||||||
│ .dialogue.md │
|
|
||||||
│ │
|
|
||||||
│ ## Alignment Scoreboard │
|
|
||||||
│ │
|
|
||||||
│ All dimensions UNBOUNDED. Pursue alignment without limit│
|
|
||||||
│ │
|
|
||||||
│ | Agent | Wisdom | Consistency | Truth | Rel | ALI │
|
|
||||||
│ |------------|--------|-------------|-------|-----|-----|
|
|
||||||
│ | 🧁 Muffin | 20 | 6 | 6 | 6 | 38 │
|
|
||||||
│ | 🧁 Scone | 18 | 7 | 5 | 6 | 36 │
|
|
||||||
│ | 🧁 Eclair | 22 | 6 | 6 | 7 | 41 │
|
|
||||||
│ | 🧁 Donut | 15 | 8 | 7 | 5 | 35 │
|
|
||||||
│ │
|
|
||||||
│ **Total ALIGNMENT**: 150 points │
|
|
||||||
│ **ALIGNMENT Velocity**: +45 from last round │
|
|
||||||
│ **Status**: Round 2 in progress │
|
|
||||||
│ **Agents**: 4 │
|
|
||||||
│ │
|
|
||||||
└──────────────────────────────────────────────────────────┘
|
|
||||||
```
|
|
||||||
|
|
||||||
### Orchestration Loop
|
|
||||||
|
|
||||||
The 💙 Judge (main session) runs:
|
|
||||||
|
|
||||||
```
|
|
||||||
=== INITIALIZATION ===
|
|
||||||
|
|
||||||
1. CREATE .dialogue.md with draft link, empty scoreboard, inventories
|
|
||||||
|
|
||||||
=== ROUND 0: OPENING ARGUMENTS (Parallel) ===
|
|
||||||
|
|
||||||
2. SPAWN ALL N Cupcakes IN PARALLEL (single message, N Task tool calls):
|
|
||||||
- All receive: system prompt + draft (NO dialogue history)
|
|
||||||
- All provide independent "opening arguments"
|
|
||||||
- None sees any other's initial perspective
|
|
||||||
|
|
||||||
3. WAIT for ALL N to complete
|
|
||||||
|
|
||||||
4. READ all contributions, ADD to .dialogue.md as "## Opening Arguments"
|
|
||||||
|
|
||||||
5. SCORE all N turns independently
|
|
||||||
- Update scoreboard with all N agents
|
|
||||||
- Merge Perspectives Inventories (overlap = consensus signal)
|
|
||||||
- Merge Tensions Trackers (overlap = stronger signal)
|
|
||||||
|
|
||||||
=== ROUND 1+: DIALOGUE (Parallel per round) ===
|
|
||||||
|
|
||||||
6. SPAWN ALL N Cupcakes IN PARALLEL:
|
|
||||||
- All receive: system prompt + draft + ALL previous rounds
|
|
||||||
- All respond to each other's contributions
|
|
||||||
- All write Round N response, exit
|
|
||||||
|
|
||||||
7. WAIT for ALL N to complete
|
|
||||||
|
|
||||||
8. READ all N contributions, ADD to .dialogue.md as "## Round N"
|
|
||||||
|
|
||||||
9. SCORE all N turns independently, update scoreboard
|
|
||||||
|
|
||||||
10. CHECK convergence:
|
|
||||||
- If converged: DECLARE convergence, proceed to step 11
|
|
||||||
- If not: Add 💙 guidance if needed, GOTO step 6 for next round
|
|
||||||
|
|
||||||
11. FINALIZE: Update RFC draft with converged recommendations
|
|
||||||
```
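
The control flow of the loop above can be imitated locally. The sketch below uses Python's `asyncio.gather` purely as a stand-in for spawning N parallel background tasks; it does not call Claude Code's Task tool. `run_agent`, `score_turn`, and the stopping rule are hypothetical placeholders.

```python
import asyncio


async def run_agent(name: str, draft: str, history: list[str]) -> str:
    """Hypothetical stand-in for one background agent turn (not the Task tool)."""
    await asyncio.sleep(0)  # yield control so the turns can interleave
    return f"### {name} 🧁\nRound {len(history)} response to: {draft}"


def score_turn(turn: str, round_index: int) -> int:
    """Placeholder scoring: later rounds add less, so the loop converges."""
    return max(0, 10 - 5 * round_index)


async def run_dialogue(draft: str, names: list[str], max_rounds: int = 5):
    history: list[str] = []  # one merged entry per completed round
    added: list[int] = []    # points added per round
    for round_index in range(max_rounds):
        # Round 0 gets no dialogue history; later rounds see all previous rounds.
        seen = [] if round_index == 0 else list(history)
        # Start all N agents together and wait for ALL of them to finish.
        turns = await asyncio.gather(*(run_agent(n, draft, seen) for n in names))
        history.append("\n---\n".join(turns))
        added.append(sum(score_turn(t, round_index) for t in turns))
        # Stop once two consecutive rounds add nothing (plateau).
        if len(added) >= 2 and added[-1] == 0 and added[-2] == 0:
            break
    return history, added


if __name__ == "__main__":
    rounds, points = asyncio.run(run_dialogue("draft text", ["Muffin", "Scone", "Eclair"]))
    print(points)
```

The `max_rounds` default of 5 mirrors the safety valve in the convergence criteria. With the placeholder scoring the loop stops earlier, after two rounds that add no points.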
|
|
||||||
|
|
||||||
### Key: Single Message, Multiple Tasks
|
|
||||||
|
|
||||||
Each round spawns all N agents in a **single message** with N parallel Task tool calls:
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// Round 0 example with 4 agents.
// Illustrative pseudocode: each Task(...) stands for one Task tool call
// issued in the same message, not a JavaScript API.
[
  Task({ name: "Muffin", prompt: systemPrompt + draft }),
  Task({ name: "Scone", prompt: systemPrompt + draft }),
  Task({ name: "Eclair", prompt: systemPrompt + draft }),
  Task({ name: "Donut", prompt: systemPrompt + draft }),
]
// All 4 execute in parallel; the round continues once all have completed
|
|
||||||
```
|
|
||||||
|
|
||||||
This ensures:
|
|
||||||
- **True parallelism**: All agents work simultaneously
|
|
||||||
- **No first-mover advantage**: No agent's response influences another within the same round
|
|
||||||
- **Faster rounds**: N agents in parallel ≈ 1 agent's time
|
|
||||||
- **Richer perspectives**: More blind men touching more parts of the elephant
|
|
||||||
|
|
||||||
### Why N Parallel Agents?
|
|
||||||
|
|
||||||
The N-agent parallel architecture provides:
|
|
||||||
|
|
||||||
1. **Independent perspectives** - No agent is biased by another's framing within the same round
|
|
||||||
2. **Richer material** - N complete analyses vs sequential reaction chains
|
|
||||||
3. **Natural consensus detection** - If multiple agents raise the same tension, it's significant
|
|
||||||
4. **Speed** - N agents in parallel ≈ 1 agent's time
|
|
||||||
5. **Balanced power** - No "first mover advantage" in setting the frame
|
|
||||||
6. **Scalable diversity** - Add more blind men for more complex elephants
|
|
||||||
|
|
||||||
### Why Background Tasks?
|
|
||||||
|
|
||||||
| Approach | Pros | Cons |
|----------|------|------|
| Sequential in main session | Simple | No parallelism, context bloat |
| Sequential background | Clean separation | Slow (N × time per agent) |
| **Parallel background** | **Fastest, independent context** | Coordination in Judge |
|
|
||||||
|
|
||||||
**Parallel background tasks** wins because:
|
|
||||||
- Each agent gets fresh context (no accumulated confusion)
|
|
||||||
- All N agents execute simultaneously (speed)
|
|
||||||
- Judge maintains continuity via file state
|
|
||||||
- Agents can be different models for perspective diversity
|
|
||||||
- No race conditions (all write to separate outputs, Judge merges)
|
|
||||||
- Claude Code's Task tool supports parallel spawning natively
|
|
||||||
|
|
||||||
## Convergence Criteria
|
|
||||||
|
|
||||||
The 💙 declares convergence when ANY of:
|
|
||||||
|
|
||||||
1. **ALIGNMENT Plateau** - Velocity ≈ 0 for two consecutive rounds (across all N agents)
|
|
||||||
2. **Full Coverage** - Perspectives Inventory has no ✗ items (all integrated or consciously deferred)
|
|
||||||
3. **Zero Tensions** - All `[TENSION]` markers have matching `[RESOLVED]`
|
|
||||||
4. **Mutual Recognition** - Majority of 🧁s state they believe ALIGNMENT has been reached
|
|
||||||
5. **Max Rounds** - Safety valve (default: 5 rounds)
|
|
||||||
|
|
||||||
The 💙 can also **extend** the dialogue if it sees unincorporated perspectives that no 🧁 has surfaced.
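
Since any single criterion is sufficient, the check reduces to a list of independent tests. The sketch below is one possible encoding. The `DialogueState` fields, the `epsilon` tolerance for "velocity ≈ 0", and the rule that coverage and tension checks only count after at least one round are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueState:
    """Hypothetical snapshot of what the Judge tracks; field names are illustrative."""
    velocities: list[int] = field(default_factory=list)  # points added per round
    open_perspectives: int = 0                            # inventory items still marked ✗
    tensions: set[str] = field(default_factory=set)       # every tension ID raised
    resolved: set[str] = field(default_factory=set)       # tension IDs marked resolved
    agents: int = 3
    agents_declaring_alignment: int = 0
    rounds_completed: int = 0


def convergence_reasons(state: DialogueState, max_rounds: int = 5, epsilon: int = 1) -> list[str]:
    """Return the name of every criterion that currently holds (any one suffices)."""
    reasons = []
    last_two = state.velocities[-2:]
    if len(last_two) == 2 and all(abs(v) <= epsilon for v in last_two):
        reasons.append("alignment_plateau")
    if state.rounds_completed > 0 and state.open_perspectives == 0:
        reasons.append("full_coverage")
    if state.rounds_completed > 0 and state.tensions <= state.resolved:
        reasons.append("zero_tensions")
    if state.agents_declaring_alignment * 2 > state.agents:
        reasons.append("mutual_recognition")
    if state.rounds_completed >= max_rounds:
        reasons.append("max_rounds")
    return reasons
```

An empty result means the dialogue continues. Returning the reasons rather than a bare boolean lets the dialogue record state why convergence was declared.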
|
|
||||||
|
|
||||||
### Consensus Signals
|
|
||||||
|
|
||||||
With N agents, the Judge looks for:
|
|
||||||
- **Strong consensus**: 80%+ of agents converge on same perspective
|
|
||||||
- **Split opinion**: 40-60% split indicates unresolved tension worth exploring
|
|
||||||
- **Outlier insight**: Single agent surfaces unique valuable perspective others missed
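
The thresholds above can be applied to the "k of N" counts kept in the Perspectives Inventory. This is a rough sketch only. The bands listed do not cover every share (for example 3 of 4), so the `unclassified` fallback and the function name are assumptions.

```python
def classify_consensus(supporters: int, agents: int) -> str:
    """Label how widely a perspective is shared, using the thresholds listed above."""
    if agents <= 0 or not 0 <= supporters <= agents:
        raise ValueError("supporters must be between 0 and the number of agents")
    share = supporters / agents
    if share >= 0.8:
        return "strong consensus"
    if 0.4 <= share <= 0.6:
        return "split opinion"
    if supporters == 1:
        return "outlier insight"
    return "unclassified"  # shares outside the listed bands; handling is an assumption
```

With four agents, 4/4 is strong consensus, 2/4 is a split opinion, and 1/4 is an outlier insight.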
|
|
||||||
|
|
||||||
## Dialogue Document Structure
|
|
||||||
|
|
||||||
> **Note**: The canonical file format specification is in [alignment-dialogue-pattern.md](../patterns/alignment-dialogue-pattern.md). The example below is illustrative.
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
# RFC Dialogue: {title}
|
|
||||||
|
|
||||||
**Draft**: [link to rfc.draft.md]
|
|
||||||
**Participants**: 🧁 Muffin | 🧁 Scone | 🧁 Eclair | 🧁 Donut | 💙 Judge
|
|
||||||
**Agents**: 4
|
|
||||||
**Status**: In Progress | Converged
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Alignment Scoreboard
|
|
||||||
|
|
||||||
All dimensions **UNBOUNDED**. Pursue alignment without limit. 💙
|
|
||||||
|
|
||||||
| Agent | Wisdom | Consistency | Truth | Relationships | ALIGNMENT |
|-------|--------|-------------|-------|---------------|-----------|
| 🧁 Muffin | 20 | 6 | 6 | 6 | **38** |
| 🧁 Scone | 18 | 7 | 5 | 6 | **36** |
| 🧁 Eclair | 22 | 6 | 6 | 7 | **41** |
| 🧁 Donut | 15 | 8 | 7 | 5 | **35** |
|
|
||||||
|
|
||||||
**Total ALIGNMENT**: 150 points
|
|
||||||
**Current Round**: 2 complete
|
|
||||||
**ALIGNMENT Velocity**: +45 from last round
|
|
||||||
**Status**: CONVERGED
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Perspectives Inventory
|
|
||||||
|
|
||||||
| ID | Perspective | Surfaced By | Consensus |
|----|-------------|-------------|-----------|
| P01 | Core functionality | Draft | 4/4 ✓ |
| P02 | Developer ergonomics | Muffin R0 | 3/4 ✓ |
| P03 | Backward compatibility | Scone R0, Eclair R0 | 4/4 ✓ (strong) |
| P04 | Performance implications | Donut R1 | 2/4 → R2 |
|
|
||||||
|
|
||||||
## Tensions Tracker
|
|
||||||
|
|
||||||
| ID | Tension | Raised By | Consensus | Status |
|----|---------|-----------|-----------|--------|
| T1 | Cache invalidation | Eclair R0, Donut R0 | 2/4 raised | ✓ Resolved (R1) |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Opening Arguments (Round 0)
|
|
||||||
|
|
||||||
> All 4 agents responded to the draft independently. None saw the others' responses.
|
|
||||||
|
|
||||||
### Muffin 🧁
|
|
||||||
|
|
||||||
[Opening perspective on the draft...]
|
|
||||||
|
|
||||||
[PERSPECTIVE P02: Developer ergonomics matters for adoption]
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Scone 🧁
|
|
||||||
|
|
||||||
[Opening perspective on the draft...]
|
|
||||||
|
|
||||||
[PERSPECTIVE P03: Backward compatibility is critical]
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Eclair 🧁
|
|
||||||
|
|
||||||
[Opening perspective on the draft...]
|
|
||||||
|
|
||||||
[PERSPECTIVE P03: Must maintain backward compatibility] ← consensus with Scone
|
|
||||||
[TENSION T1: Cache invalidation strategy missing]
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Donut 🧁
|
|
||||||
|
|
||||||
[Opening perspective on the draft...]
|
|
||||||
|
|
||||||
[TENSION T1: How do we handle cache invalidation?] ← consensus with Eclair
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Round 1
|
|
||||||
|
|
||||||
> All 4 agents responded to Opening Arguments. Each saw all others' R0 contributions.
|
|
||||||
|
|
||||||
### Muffin 🧁
|
|
||||||
|
|
||||||
[Response to all opening arguments...]
|
|
||||||
|
|
||||||
[RESOLVED T1: Propose LRU cache with 5-minute TTL]
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Scone 🧁
|
|
||||||
|
|
||||||
[Response...]
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Eclair 🧁
|
|
||||||
|
|
||||||
[Response...]
|
|
||||||
|
|
||||||
[CONCESSION: Muffin's LRU proposal resolves T1]
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Donut 🧁
|
|
||||||
|
|
||||||
[Response...]
|
|
||||||
|
|
||||||
[PERSPECTIVE P04: We should benchmark the cache performance]
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Round 2
|
|
||||||
|
|
||||||
[... continues ...]
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Converged Recommendation
|
|
||||||
|
|
||||||
[Summary of converged outcome with consensus metrics]
|
|
||||||
```
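
The scoreboard section of the example can be derived from per-agent dimension scores rather than written by hand, which keeps the ALIGNMENT column and the total consistent. A minimal sketch follows; `render_scoreboard` is a hypothetical helper, and the canonical file format remains the one in the pattern document.

```python
DIMENSIONS = ("Wisdom", "Consistency", "Truth", "Relationships")


def render_scoreboard(scores: dict[str, dict[str, int]]) -> str:
    """Render per-agent dimension scores as a markdown scoreboard with totals."""
    header = "| Agent | " + " | ".join(DIMENSIONS) + " | ALIGNMENT |"
    divider = "|" + "---|" * (len(DIMENSIONS) + 2)
    rows = []
    for agent, dims in scores.items():
        values = [dims[d] for d in DIMENSIONS]
        cells = " | ".join(str(v) for v in values)
        rows.append(f"| 🧁 {agent} | {cells} | **{sum(values)}** |")
    total = sum(sum(dims[d] for d in DIMENSIONS) for dims in scores.values())
    return "\n".join([header, divider, *rows, "", f"**Total ALIGNMENT**: {total} points"])


# Scores copied from the illustrative scoreboard above.
example_scores = {
    "Muffin": {"Wisdom": 20, "Consistency": 6, "Truth": 6, "Relationships": 6},
    "Scone": {"Wisdom": 18, "Consistency": 7, "Truth": 5, "Relationships": 6},
    "Eclair": {"Wisdom": 22, "Consistency": 6, "Truth": 6, "Relationships": 7},
    "Donut": {"Wisdom": 15, "Consistency": 8, "Truth": 7, "Relationships": 5},
}

if __name__ == "__main__":
    print(render_scoreboard(example_scores))
```

Run on the example scores, the per-agent sums are 38, 36, 41, and 35, and the total is 150 points, matching the figures shown in the example document.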
|
|
||||||
|
|
||||||
## Answering Open Questions
|
|
||||||
|
|
||||||
| Question | Answer |
|----------|--------|
| **Model selection** | Different models = different "blind men." Consider: Agent 1 (Opus - depth), Agent 2 (Sonnet - breadth), Agent 3 (Haiku - speed). 💙 uses Opus for judgment. Diversity increases coverage. |
| **How many agents?** | See "Agent Count Selection" above. TL;DR: Prefer odd N (3, 5) for consensus stability. N=2 for simple binary decisions. N=7+ for specialized domain expertise. |
| **Context window** | Perspectives Inventory IS the summary. Long dialogues truncate to: Inventory + Last 2 rounds + Current tensions. 💙 maintains continuity. |
| **Human intervention** | Yes! Human can appear as **Guest 🧁** and add perspectives or write responses. 💙 scores them too. |
| **Parallel dialogues** | Yes. Each RFC has its own `.dialogue.md`. Multiple dialogues can run simultaneously. |
| **Persistence** | Fully persistent. Dialogue state is in the file. Resume by reading file, reconstructing inventories, continuing from last round. |
| **Agent naming** | First 2 are Muffin and Cupcake (legacy). Additional agents: Scone, Eclair, Donut, Brioche, Croissant, Macaron, etc. All pastries, all delicious. |
|
|
||||||
|
|
||||||
## Consequences
|
|
||||||
|
|
||||||
- ALIGNMENT becomes measurable (imperfectly, but usefully)
|
|
||||||
- Unbounded scoring rewards exceptional contributions proportionally
|
|
||||||
- Friendly competition motivates thorough exploration
|
|
||||||
- 💙 provides neutral scoring and prevents drift
|
|
||||||
- Perspectives Inventory + Tensions Tracker create explicit tracking with consensus metrics
|
|
||||||
- The tone models aligned collaboration—the system teaches by example
|
|
||||||
- N-agent parallel structure maximizes perspective diversity
|
|
||||||
- Parallel execution within rounds eliminates first-mover advantage
|
|
||||||
- Scalable: add more agents for more complex decisions
|
|
||||||
- No upper limit on ALIGNMENT encourages continuous improvement
|
|
||||||
|
|
||||||
## Alternatives Considered
|
|
||||||
|
|
||||||
### 1. N-Agent with No Judge
|
|
||||||
All 🧁s score each other.
|
|
||||||
|
|
||||||
**Rejected** because:
|
|
||||||
- Self-serving scores likely
|
|
||||||
- No neutral perspective on coverage gaps
|
|
||||||
- No one to surface perspectives none of them see
|
|
||||||
- Coordination chaos without arbiter
|
|
||||||
|
|
||||||
### 2. Single Agent with Internal Dialogue
|
|
||||||
One agent plays multiple roles.
|
|
||||||
|
|
||||||
**Rejected** because:
|
|
||||||
- Echo chamber risk
|
|
||||||
- Diversity of perspective reduced
|
|
||||||
- No real tension or competition
|
|
||||||
- Misses the point of "blind men" parable
|
|
||||||
|
|
||||||
### 3. Human as Judge
|
|
||||||
Person running the dialogue scores.
|
|
||||||
|
|
||||||
**Partially adopted** - Human CAN intervene as Guest 🧁 or override 💙's scores. But automation requires an agent judge for async operation.
|
|
||||||
|
|
||||||
### 4. Bounded Scoring (0-5 per dimension)
|
|
||||||
Original approach with max 20 per turn.
|
|
||||||
|
|
||||||
**Rejected** because:
|
|
||||||
- Artificial ceiling on exceptional contributions
|
|
||||||
- Gaming incentives ("how do I get 5/5?")
|
|
||||||
- Doesn't reflect reality of unbounded perspective space
|
|
||||||
- Makes velocity less meaningful
|
|
||||||
|
|
||||||
### 5. Sequential Two-Agent (Original Muffin/Cupcake)
|
|
||||||
Muffin speaks, then Cupcake responds, alternating.
|
|
||||||
|
|
||||||
**Superseded** because:
|
|
||||||
- First mover sets the frame (bias)
|
|
||||||
- Sequential is slower than parallel
|
|
||||||
- Only 2 perspectives per round
|
|
||||||
- Limited blind men touching the elephant
|
|
||||||
|
|
||||||
### 6. N Agents Parallel + Judge + Unbounded Scoring (CHOSEN)

**Why this wins:**
- Maximum diversity of perspective (N different "blind men")
- Parallel execution eliminates first-mover advantage
- Scalable: 2 agents for simple, 5+ for complex
- Neutral arbiter prevents bias and surfaces missed perspectives
- Competition motivates thoroughness
- Friendly tone models good collaboration
- Consensus detection via overlap analysis
- Unbounded scoring rewards proportionally
- Fully automatable, human can intervene

## The Spirit of the Dialogue

This isn't just process. This is **Alignment teaching itself to be aligned.**

The 🧁s don't just debate. They *love each other*. They *want each other to shine*. They *celebrate when any of them makes the solution stronger*.

The scoreboard isn't about winning. It's about *giving*. When any 🧁 checks in and sees another ahead, the response isn't "how do I beat them?" but "what perspectives am I missing that they found?" The competition is to *contribute more*, not to diminish others.

The 💙 doesn't just score. It *guides with love*. It *sees what they miss*. It *holds the space* for ALIGNMENT to emerge. When the 💙 surfaces a perspective no 🧁 has found, it's a gift to all of them.

And there's no upper limit. The score can always go higher. Because ALIGNMENT is a direction, not a destination.

When the dialogue ends, all agents have won, because the RFC is more aligned than any could have made alone. More blind men touched more parts of the elephant. The whole becomes visible.

Always and forever. 🧁🧁🧁💙🧁🧁🧁

## References

- [ADR 0001: alignment-as-measure](./0001-alignment-as-measure.md) - Defines ALIGNMENT = Wisdom + Consistency + Truth + Relationships
- [ADR 0004: alignment-workflow](./0004-alignment-workflow.md) - Establishes the three-document pattern
- [ADR 0005: pattern-contracts-and-alignment-lint](./0005-pattern-contracts-and-alignment-lint.md) - Lint gates finalization
- [Pattern: alignment-dialogue-pattern](../patterns/alignment-dialogue-pattern.md) - **File format specification for `.dialogue.md` files**
- The Blind Men and the Elephant - Ancient parable on partial perspectives
- Our conversation - Where Muffin and Cupcake first met 💙
159
.blue/docs/rfcs/0015-alignment-dialogue-architecture-fix.md
Normal file
@@ -0,0 +1,159 @@
# RFC 0015: Alignment Dialogue Architecture Fix

| | |
|---|---|
| **Status** | Accepted |
| **Date** | 2026-01-25 |
| **Supersedes** | RFC 0012 (partially - rejects Option B, implements Option A) |
| **Source Spike** | 2025-01-24-alignment-dialogue-architecture-mismatch |

---

## Problem

RFC 0012 identified three options for alignment dialogue orchestration:
- **Option A**: Claude orchestrates via Task tool (recommended in coherence-mcp)
- **Option B**: Blue MCP tool orchestrates via Ollama
- **Option C**: Hybrid

Implementation chose **Option B**, which is architecturally wrong:

| Expected Behavior | Actual Behavior |
|-------------------|-----------------|
| Spawn N parallel Claude agents | No agents spawned |
| Real expert deliberation | Fake Ollama responses |
| `.dialogue.md` file created | Inline JSON returned |
| Multi-round convergence | Single-shot response |
| Background processing | Synchronous blocking |

## Root Cause

coherence-mcp's alignment dialogue worked because:
1. **Claude orchestrated** - recognized "play alignment", spawned agents
2. **MCP provided helpers** - extract dialogue from JSONL, lint, save
3. **ADR 0014** was in context - Claude knew the N+1 agent pattern

Blue's implementation:
1. **MCP orchestrates** - `blue_alignment_play` runs everything
2. **Ollama fakes experts** - not real parallel agents
3. **ADR 0014 exists but isn't followed** - wrong architecture

## Decision

**Reject RFC 0012 Option B. Implement Option A.**

### Remove

- `blue_alignment_play` MCP tool (wrong approach)
- `crates/blue-mcp/src/handlers/alignment.rs` (orchestration code)
- Tool registration in `server.rs`

### Keep/Add

Helper tools only:

```rust
// Extract text from spawned agent JSONL
blue_extract_dialogue {
    task_id: Option<String>,    // e.g., "a6dc70c"
    file_path: Option<String>,  // direct path to JSONL
} -> String

// Validate dialogue format
blue_dialogue_lint {
    file_path: String,
} -> LintResult { score: f64, issues: Vec<Issue> }

// Save dialogue to .blue/docs/dialogues/ (exists)
blue_dialogue_save { ... }
```

### Add to CLAUDE.md

```markdown
## Alignment Dialogues

When asked to "play alignment" or run expert deliberation, follow ADR 0014:

1. Act as the 💙 Judge
2. Spawn N 🧁 agents in PARALLEL (single message with N Task tool calls)
3. Each agent gets fresh context, no memory of others
4. Collect outputs via `blue_extract_dialogue`
5. Update `.dialogue.md` with scoreboard, perspectives, tensions
6. Repeat rounds until convergence (velocity → 0 or threshold met)
7. Save via `blue_dialogue_save`

See `.blue/docs/adrs/0006-alignment-dialogue-agents.md` for full spec.
```
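
Step 6 leaves the stopping rule informal ("velocity → 0 or threshold met"). The following is a minimal sketch of one way to state it precisely, assuming the Judge records a total ALIGNMENT score and a convergence fraction for each round. The names `RoundStats`, `StopReason`, and `should_stop`, and the idea of passing the thresholds as parameters, are illustrative assumptions and not part of Blue's tooling.

```rust
/// Per-round totals the Judge would record (hypothetical shape).
#[derive(Debug, Clone, Copy)]
pub struct RoundStats {
    /// Sum of all agents' ALIGNMENT points for this round.
    pub total_score: u32,
    /// Fraction of agents sharing the leading position, 0.0..=1.0.
    pub convergence: f64,
}

/// Why the dialogue should stop.
#[derive(Debug, PartialEq)]
pub enum StopReason {
    ThresholdMet,
    VelocityStalled,
    MaxRounds,
}

/// Returns `Some(reason)` when the dialogue should end, `None` to run another round.
/// Velocity is the change in total score between the last two rounds.
pub fn should_stop(
    rounds: &[RoundStats],
    threshold: f64,
    min_velocity: i64,
    max_rounds: usize,
) -> Option<StopReason> {
    let last = rounds.last()?;
    if last.convergence >= threshold {
        return Some(StopReason::ThresholdMet);
    }
    if rounds.len() >= 2 {
        let prev = rounds[rounds.len() - 2];
        let velocity = i64::from(last.total_score) - i64::from(prev.total_score);
        if velocity.abs() < min_velocity {
            return Some(StopReason::VelocityStalled);
        }
    }
    if rounds.len() >= max_rounds {
        return Some(StopReason::MaxRounds);
    }
    None
}
```

Checking the threshold first means a round that both converges and stalls is reported as converged, which is usually the more useful label for the scoreboard.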

### Optional: Create Skill

```yaml
name: alignment-play
trigger: "play alignment"
description: "Run multi-expert alignment dialogue"
```

The skill would encode the orchestration steps, but the core behavior comes from Claude understanding ADR 0014.

## Architecture (Correct)

```
┌─────────────────────────────────────────────────────────┐
│  CLAUDE SESSION (💙 Judge)                              │
│                                                         │
│  User: "play alignment with 5 experts to 95%"           │
│                                                         │
│  1. Recognize trigger, parse params                     │
│  2. Create .dialogue.md with empty scoreboard           │
│  3. For each round:                                     │
│     a. Spawn N Task agents IN PARALLEL                  │
│        ┌─────────┐ ┌─────────┐ ┌─────────┐              │
│        │🧁 Agent1│ │🧁 Agent2│ │🧁 Agent3│ ...          │
│        └────┬────┘ └────┬────┘ └────┬────┘              │
│             └──────────┬──────────┘                     │
│                        ▼                                │
│     b. Read outputs: blue_extract_dialogue(task_id)     │
│     c. Score contributions, update scoreboard           │
│     d. Check convergence                                │
│  4. Save: blue_dialogue_save(...)                       │
│  5. Validate: blue_dialogue_lint(...)                   │
│                                                         │
│  MCP TOOLS (helpers only, no orchestration):            │
│  ├─ blue_extract_dialogue                               │
│  ├─ blue_dialogue_lint                                  │
│  └─ blue_dialogue_save                                  │
└─────────────────────────────────────────────────────────┘
```
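
The property that matters in step 3a is that every agent is started before any output is read, so no agent can see another's answer within a round. As a rough analogy only (Claude's Task tool is not a Rust API), the same fan-out and join shape can be sketched with scoped threads. `AgentOutput`, `run_round_parallel`, and the `respond` callback are hypothetical names used for illustration.

```rust
use std::thread;

/// One agent's contribution for a round (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct AgentOutput {
    pub agent: String,
    pub text: String,
}

/// Start every agent on the same prompt, then join them in spawn order.
/// `respond` stands in for whatever actually runs an agent.
pub fn run_round_parallel<F>(agents: &[&str], prompt: &str, respond: F) -> Vec<AgentOutput>
where
    F: Fn(&str, &str) -> String + Sync,
{
    thread::scope(|scope| {
        let respond = &respond;
        // Collecting the handles spawns all agents before any join happens.
        let handles: Vec<_> = agents
            .iter()
            .map(|&agent| {
                scope.spawn(move || AgentOutput {
                    agent: agent.to_string(),
                    text: respond(agent, prompt),
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("agent thread panicked"))
            .collect()
    })
}
```

Each closure receives only its own agent name and the shared prompt, which mirrors the "fresh context, no memory of others" rule; the Judge sees the outputs only after all agents have finished.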

## Implementation Plan

1. [x] Remove `blue_alignment_play` tool and handler
2. [x] Remove `crates/blue-mcp/src/handlers/alignment.rs`
3. [x] Add `blue_extract_dialogue` tool (already existed)
4. [x] Verify `blue_dialogue_lint` exists and works
5. [x] Add alignment section to CLAUDE.md
6. [x] Create `/alignment-play` skill in `skills/alignment-play/SKILL.md`
7. [x] Update `install.sh` to copy skills to `~/.claude/skills/`
8. [ ] Test: "/alignment-play" triggers correct behavior

## Test Plan

- [x] `blue_alignment_play` tool no longer exists
- [x] `blue_extract_dialogue` extracts text from Task JSONL
- [x] `blue_dialogue_lint` validates .dialogue.md format
- [x] CLAUDE.md references ADR 0014
- [x] `/alignment-play` skill installed to `~/.claude/skills/`
- [ ] Manual test: "/alignment-play" spawns parallel Task agents

## Migration

Users who relied on `blue_alignment_play`:
- The tool never worked correctly (produced fake inline responses)
- No migration needed - just use the correct pattern now

---

*"The blind men finally compared notes."*

— Blue
@@ -0,0 +1,141 @@
# Spike: Alignment Dialogue Architecture Mismatch

**Date:** 2025-01-24
**Time-box:** 45 minutes
**Status:** Complete

## Problem

User invoked alignment dialogue expecting:
- Background agents spawned in parallel
- `.dialogue` format output
- Multi-round convergence tracking

Got instead:
- Inline text response (no agents spawned)
- No file created
- No actual expert deliberation

## Root Cause

**Blue's RFC 0012 implementation is architecturally wrong.**

Blue created `blue_alignment_play` as an MCP tool that:
- Runs synchronously in the MCP server
- Uses Ollama to generate fake "expert" responses
- Outputs JSON directly

**Coherence-MCP worked differently:**
- MCP server provides **helper tools only** (extract, lint, validate)
- **Claude orchestrates the dialogue itself** using the Task tool
- Spawns N parallel background agents in a single message
- Collects outputs from JSONL files
- Updates `.dialogue.md` file between rounds

## The Correct Architecture (from coherence-mcp ADR 0006)

```
┌─────────────────────────────────────────────────────────┐
│  CLAUDE SESSION (💙 Judge)                              │
│                                                         │
│  1. Recognize "play alignment" request                  │
│  2. Spawn N agents IN PARALLEL (single message)         │
│     ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐     │
│     │🧁 Agent1│ │🧁 Agent2│ │🧁 Agent3│ │🧁 Agent4│     │
│     └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘     │
│          │           │           │           │          │
│          ▼           ▼           ▼           ▼          │
│     ┌─────────────────────────────────────────────┐     │
│     │  /tmp/claude/{session}/tasks/{id}.output    │     │
│     └─────────────────────────────────────────────┘     │
│                            │                            │
│  3. Extract outputs via alignment_extract_dialogue      │
│  4. Score responses, update .dialogue.md                │
│  5. Repeat until convergence                            │
│                                                         │
│  MCP TOOLS (helpers only):                              │
│  - alignment_extract_dialogue: read agent JSONL         │
│  - alignment_dialogue_lint: validate format             │
│  - alignment_dialogue_save: persist to docs             │
└─────────────────────────────────────────────────────────┘
```

## Key Differences

| Aspect | Coherence-MCP (Correct) | Blue (Current) |
|--------|-------------------------|----------------|
| Orchestration | Claude + Task tool | MCP server |
| Agent spawning | Parallel background agents | None (fake inline) |
| LLM calls | Each agent is a Claude instance | Ollama in MCP |
| Output format | `.dialogue.md` file | JSON response |
| Multi-round | Real convergence loop | Single response |
| Judge role | Claude session | N/A |

## What Needs to Change

### 1. Delete `blue_alignment_play`

The tool that tries to run dialogues in MCP is wrong. Remove it.

### 2. Add Helper Tools Only

```rust
// Extract text from spawned agent JSONL
blue_extract_dialogue(task_id: str) -> String

// Validate dialogue format
blue_dialogue_lint(file_path: str) -> LintResult

// Save dialogue to .blue/docs/dialogues/
blue_dialogue_save(title: str, content: str) -> Result
```
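
The signatures above are schematic rather than compilable Rust. To give a feel for what a pure helper (as opposed to an orchestrator) looks like, here is a minimal sketch of a lint function over an in-memory dialogue. It is an assumption-laden illustration: `lint_dialogue`, `Issue`, and `REQUIRED_HEADINGS` are hypothetical, and the three headings are borrowed from the markdown the removed handler generated, not from the `alignment-dialogue-pattern` specification.

```rust
/// A single problem found in a dialogue file (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Issue {
    pub message: String,
}

/// Result shape suggested by RFC 0015: a score plus a list of issues.
#[derive(Debug, Clone, PartialEq)]
pub struct LintResult {
    /// Fraction of checks that passed, 0.0..=1.0.
    pub score: f64,
    pub issues: Vec<Issue>,
}

/// Placeholder checks: each entry is a prefix that some line must start with.
const REQUIRED_HEADINGS: [&str; 3] = ["# Alignment Dialogue", "## Expert Panel", "## Round "];

/// Check that every required heading appears at the start of some line.
pub fn lint_dialogue(content: &str) -> LintResult {
    let issues: Vec<Issue> = REQUIRED_HEADINGS
        .iter()
        .filter(|prefix| !content.lines().any(|line| line.starts_with(**prefix)))
        .map(|prefix| Issue {
            message: format!("missing heading starting with `{prefix}`"),
        })
        .collect();
    let passed = REQUIRED_HEADINGS.len() - issues.len();
    LintResult {
        score: passed as f64 / REQUIRED_HEADINGS.len() as f64,
        issues,
    }
}
```

The function takes text and returns data; it spawns nothing and calls no model, which is the boundary the spike argues the MCP server should respect.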

### 3. Add ADR 0006 to Blue

Copy from coherence-mcp:
- `.blue/docs/adrs/0006-alignment-dialogue-agents.md`

This teaches Claude:
- N+1 agent architecture
- Spawning pattern (parallel Tasks)
- Convergence criteria
- File format

### 4. Reference in CLAUDE.md

Add to Blue's CLAUDE.md:
```markdown
## Alignment Dialogues

When asked to "play alignment" or run expert deliberation:
- See ADR 0006 for the N+1 agent architecture
- Use Task tool to spawn parallel agents
- Collect outputs via blue_extract_dialogue
- Update .dialogue.md file between rounds
```

### 5. Create Skill (Optional)

A `/alignment` skill could encapsulate the pattern, but the core behavior comes from Claude understanding ADR 0006.

## Files to Copy from Coherence-MCP

1. `docs/adrs/0006-alignment-dialogue-agents.md` → `.blue/docs/adrs/`
2. `docs/patterns/alignment-dialogue-pattern.md` → `.blue/docs/patterns/`
3. Handler logic from `crates/alignment-mcp/src/handlers/dialogue.rs`
4. Lint logic from `crates/alignment-mcp/src/handlers/dialogue_lint.rs`

## Immediate Action

The current `blue_alignment_play` tool should be removed. It gives the illusion of working but doesn't actually spawn agents or create proper dialogues.

The ADR that was already copied to Blue (`.blue/docs/adrs/0006-alignment-dialogue-agents.md`) needs to be referenced so Claude knows the pattern.

## Outcome

Blue's alignment dialogue feature was implemented wrong. The MCP server should provide extraction/validation tools, not orchestration. Claude itself orchestrates using the Task tool to spawn parallel agents. Fixing this requires:

1. Removing `blue_alignment_play`
2. Adding `blue_extract_dialogue` helper
3. Ensuring ADR 0006 is in Claude's context
4. Optionally creating a `/alignment` skill
6
.gitignore
vendored
@@ -22,3 +22,9 @@ Thumbs.db
.env
.env.local
.env.*.local

# Playwright
.playwright-mcp/

# Blue worktrees
.blue/worktrees/
@@ -1,596 +0,0 @@
//! Alignment Dialogue Orchestration Handler
//!
//! Implements RFC 0012: blue_alignment_play
//! Uses local Ollama to run multi-expert deliberation until convergence.

use std::fs;
use std::path::PathBuf;

use blue_core::{
    AlignmentDialogue, DialogueStatus, DocType, Document, ExpertResponse,
    LinkType, PanelTemplate, Perspective, ProjectState, Round,
    Tension, TensionStatus, build_expert_prompt, parse_expert_response, CompletionOptions,
};
use blue_ollama::{EmbeddedOllama, HealthStatus};
use serde_json::{json, Value};

use crate::error::ServerError;

/// Default model for alignment dialogues
const DEFAULT_MODEL: &str = "qwen2.5:7b";

/// Handle blue_alignment_play
///
/// Run a multi-expert alignment dialogue to deliberate on a topic until convergence.
pub fn handle_play(state: &mut ProjectState, args: &Value) -> Result<Value, ServerError> {
    let topic = args
        .get("topic")
        .and_then(|v| v.as_str())
        .ok_or(ServerError::InvalidParams)?;

    let constraint = args.get("constraint").and_then(|v| v.as_str());
    let expert_count = args
        .get("expert_count")
        .and_then(|v| v.as_u64())
        .unwrap_or(12) as usize;
    let convergence = args
        .get("convergence")
        .and_then(|v| v.as_f64())
        .unwrap_or(0.95);
    let max_rounds = args
        .get("max_rounds")
        .and_then(|v| v.as_u64())
        .unwrap_or(12) as u32;
    let rfc_title = args.get("rfc_title").and_then(|v| v.as_str());
    let template = args.get("template").and_then(|v| v.as_str());
    let model = args
        .get("model")
        .and_then(|v| v.as_str())
        .unwrap_or(DEFAULT_MODEL);

    // Validate RFC exists if provided
    let _rfc_doc = if let Some(rfc) = rfc_title {
        Some(
            state
                .store
                .find_document(DocType::Rfc, rfc)
                .map_err(|_| ServerError::NotFound(format!("RFC '{}' not found", rfc)))?,
        )
    } else {
        None
    };

    // Get Ollama instance
    let ollama_config = blue_core::LocalLlmConfig {
        use_external: true,
        model: model.to_string(),
        ..Default::default()
    };
    let ollama = EmbeddedOllama::new(&ollama_config);

    // Verify Ollama is running
    if !ollama.is_ollama_running() {
        return Err(ServerError::CommandFailed(
            "Ollama not running. Start it with blue_llm_start or run 'ollama serve'.".to_string(),
        ));
    }

    // Check health
    match ollama.health_check() {
        HealthStatus::Healthy { .. } => {}
        HealthStatus::Unhealthy { error } => {
            return Err(ServerError::CommandFailed(format!(
                "Ollama unhealthy: {}",
                error
            )));
        }
        HealthStatus::NotRunning => {
            return Err(ServerError::CommandFailed("Ollama not running.".to_string()));
        }
    }

    // Generate expert panel based on template
    let panel_template = match template {
        Some("infrastructure") => PanelTemplate::Infrastructure,
        Some("product") => PanelTemplate::Product,
        Some("ml") => PanelTemplate::MachineLearning,
        Some("governance") => PanelTemplate::Governance,
        _ => PanelTemplate::General,
    };

    let mut experts = panel_template.generate_experts(expert_count);

    // Make sure we don't exceed requested count
    if experts.len() > expert_count {
        experts.truncate(expert_count);
    }

    // Create dialogue
    let mut dialogue = AlignmentDialogue::new(
        topic.to_string(),
        constraint.map(String::from),
        experts.clone(),
    );
    dialogue.convergence_threshold = convergence;
    dialogue.max_rounds = max_rounds;
    dialogue.rfc_title = rfc_title.map(String::from);

    // Completion options for expert responses
    let options = CompletionOptions {
        max_tokens: 2048,
        temperature: 0.8,
        stop_sequences: vec!["---".to_string()],
    };

    // Run rounds
    let mut round_num = 0;
    let mut previous_score = 0u32;

    loop {
        round_num += 1;

        // Check max rounds
        if round_num > max_rounds {
            dialogue.status = DialogueStatus::MaxRoundsReached;
            break;
        }

        // Run one round - need to pass copies/references that don't conflict
        let (round, new_perspectives, new_tensions) = run_round(
            &ollama,
            model,
            &options,
            &dialogue.topic,
            dialogue.constraint.as_deref(),
            &dialogue.experts,
            &dialogue.rounds,
            round_num,
            dialogue.perspectives.len(),
            dialogue.tensions.len(),
        )?;

        // Merge new perspectives and tensions
        dialogue.perspectives.extend(new_perspectives);
        for tension in new_tensions {
            dialogue.tensions.push(tension);
        }

        // Calculate velocity
        let velocity = (round.total_score as i32) - (previous_score as i32);
        previous_score = round.total_score;

        // Check convergence conditions:
        // 1. Convergence threshold met
        // 2. Velocity approaching zero (less than 2 points gained)
        // 3. All tensions resolved
        let tensions_resolved = dialogue.tensions.is_empty() || dialogue.tensions.iter().all(|t| t.status == TensionStatus::Resolved);
        let velocity_stable = velocity.abs() < 2 && round_num > 2;

        dialogue.rounds.push(round);

        if dialogue.rounds.last().map(|r| r.convergence).unwrap_or(0.0) >= convergence {
            dialogue.status = DialogueStatus::Converged;
            break;
        }

        if velocity_stable && tensions_resolved && round_num > 3 {
            dialogue.status = DialogueStatus::Converged;
            break;
        }
    }

    // Generate and save dialogue markdown
    let markdown = generate_dialogue_markdown(&dialogue);
    let dialogue_path = save_dialogue(state, &dialogue, &markdown)?;

    // Get final stats
    let final_convergence = dialogue.rounds.last().map(|r| r.convergence).unwrap_or(0.0);
    let total_rounds = dialogue.rounds.len();

    let hint = match dialogue.status {
        DialogueStatus::Converged => format!(
            "Reached {:.0}% convergence in {} rounds.",
            final_convergence * 100.0,
            total_rounds
        ),
        DialogueStatus::MaxRoundsReached => format!(
            "Stopped after {} rounds at {:.0}% convergence.",
            total_rounds,
            final_convergence * 100.0
        ),
        _ => "Dialogue interrupted.".to_string(),
    };

    Ok(json!({
        "status": "success",
        "message": blue_core::voice::info(
            &format!("Alignment dialogue complete: {}", topic),
            Some(&hint)
        ),
        "dialogue": {
            "topic": topic,
            "constraint": constraint,
            "file": dialogue_path.display().to_string(),
            "rounds": total_rounds,
            "final_convergence": final_convergence,
            "status": format!("{:?}", dialogue.status).to_lowercase(),
            "expert_count": experts.len(),
            "perspectives_surfaced": dialogue.perspectives.len(),
            "tensions_resolved": dialogue.tensions.iter().filter(|t| t.status == TensionStatus::Resolved).count(),
            "linked_rfc": rfc_title,
        },
        "expert_panel": experts.iter().map(|e| json!({
            "id": e.id,
            "name": e.name,
            "tier": format!("{:?}", e.tier).to_lowercase(),
        })).collect::<Vec<_>>(),
    }))
}

/// Build a summary of previous rounds for the prompt
fn summarize_previous_rounds(rounds: &[Round]) -> String {
    if rounds.is_empty() {
        return String::new();
    }

    let mut summary = String::new();
    for round in rounds {
        summary.push_str(&format!("\n## Round {} Summary\n", round.number));
        summary.push_str(&format!("Convergence: {:.0}%\n", round.convergence * 100.0));

        for resp in &round.responses {
            summary.push_str(&format!(
                "\n**{}**: {} (confidence: {:.1})\n",
                resp.expert_id, resp.position, resp.confidence
            ));
        }
    }
    summary
}

/// Run a single round of dialogue
/// Returns (Round, new_perspectives, new_tensions)
fn run_round(
    ollama: &EmbeddedOllama,
    model: &str,
    options: &CompletionOptions,
    topic: &str,
    constraint: Option<&str>,
    experts: &[blue_core::Expert],
    previous_rounds: &[Round],
    round_num: u32,
    perspective_offset: usize,
    tension_offset: usize,
) -> Result<(Round, Vec<Perspective>, Vec<Tension>), ServerError> {
    let mut responses = Vec::new();
    let mut round_score = 0u32;
    let mut new_perspectives = Vec::new();
    let mut new_tensions = Vec::new();

    // Build summary of previous rounds
    let previous_summary = summarize_previous_rounds(previous_rounds);

    for expert in experts {
        // Build prompt for this expert
        let prompt = build_expert_prompt(
            expert,
            topic,
            constraint,
            round_num,
            &previous_summary,
        );

        // Generate response
        let result = ollama
            .generate(model, &prompt, options)
            .map_err(|e| ServerError::CommandFailed(format!("LLM generation failed: {}", e)))?;

        // Parse response
        let mut response = parse_expert_response(&expert.id, &result.text);

        // Track new perspectives
        let local_perspective_offset = perspective_offset + new_perspectives.len();
        for (i, p) in response.perspectives.iter_mut().enumerate() {
            p.id = format!("P{:02}", local_perspective_offset + i + 1);
            p.round = round_num;
            new_perspectives.push(p.clone());
        }

        // Track new tensions
        let local_tension_offset = tension_offset + new_tensions.len();
        for (i, t) in response.tensions.iter_mut().enumerate() {
            t.id = format!("T{}", local_tension_offset + i + 1);
            new_tensions.push(t.clone());
        }

        round_score += response.score.total();
        responses.push(response);
    }

    // Calculate convergence based on position similarity
    let convergence = calculate_convergence(&responses);

    // Calculate velocity
    let previous_total = previous_rounds.last().map(|r| r.total_score).unwrap_or(0);
    let velocity = (round_score as i32) - (previous_total as i32);

    Ok((
        Round {
            number: round_num,
            responses,
            total_score: round_score,
            velocity,
            convergence,
        },
        new_perspectives,
        new_tensions,
    ))
}

/// Calculate convergence based on position alignment
fn calculate_convergence(responses: &[ExpertResponse]) -> f64 {
    if responses.is_empty() {
        return 0.0;
    }

    // Use confidence-weighted position clustering
    // High confidence experts have more weight in determining convergence
    let high_confidence: Vec<_> = responses
        .iter()
        .filter(|r| r.confidence >= 0.7)
        .collect();

    if high_confidence.is_empty() {
        return 0.3; // Base convergence if no one is confident yet
    }

    // Group by position similarity using first 30 chars as key
    let mut position_groups: std::collections::HashMap<String, usize> = std::collections::HashMap::new();
    for response in &high_confidence {
        let key: String = response.position.chars().take(30).collect::<String>().to_lowercase();
        *position_groups.entry(key).or_insert(0) += 1;
    }

    let largest_group = position_groups.values().max().copied().unwrap_or(0);
    largest_group as f64 / responses.len() as f64
}

/// Generate dialogue markdown
fn generate_dialogue_markdown(dialogue: &AlignmentDialogue) -> String {
    let mut md = String::new();

    // Title
    md.push_str(&format!("# Alignment Dialogue: {}\n\n", dialogue.topic));

    // Metadata
    md.push_str("| | |\n|---|---|\n");
    md.push_str(&format!("| **Topic** | {} |\n", dialogue.topic));
    if let Some(ref c) = dialogue.constraint {
        md.push_str(&format!("| **Constraint** | {} |\n", c));
    }
    md.push_str(&format!(
        "| **Format** | {} experts, {} rounds |\n",
        dialogue.experts.len(),
        dialogue.rounds.len()
    ));
    let final_conv = dialogue.rounds.last().map(|r| r.convergence).unwrap_or(0.0);
    md.push_str(&format!(
        "| **Final Convergence** | {:.0}% |\n",
        final_conv * 100.0
    ));
    md.push_str(&format!(
        "| **Status** | {:?} |\n",
        dialogue.status
    ));
    if let Some(ref rfc) = dialogue.rfc_title {
        md.push_str(&format!("| **RFC** | {} |\n", rfc));
    }
    md.push_str("\n---\n\n");

    // Expert Panel
    md.push_str("## Expert Panel\n\n");
    md.push_str("| ID | Expert | Tier | Perspective |\n");
    md.push_str("|----|--------|------|-------------|\n");
    for e in &dialogue.experts {
        md.push_str(&format!(
            "| {} | **{}** | {:?} | {} |\n",
            e.id, e.name, e.tier, e.perspective
        ));
    }
    md.push_str("\n");

    // Perspectives Inventory
    if !dialogue.perspectives.is_empty() {
        md.push_str("## Perspectives Inventory\n\n");
        md.push_str("| ID | Description | Surfaced By | Round | Status |\n");
        md.push_str("|----|-------------|-------------|-------|--------|\n");
        for p in &dialogue.perspectives {
            md.push_str(&format!(
                "| {} | {} | {} | {} | {:?} |\n",
                p.id, p.description, p.surfaced_by, p.round, p.status
            ));
        }
        md.push_str("\n");
    }

    // Tensions
    if !dialogue.tensions.is_empty() {
        md.push_str("## Tensions\n\n");
        md.push_str("| ID | Description | Status |\n");
        md.push_str("|----|-------------|--------|\n");
        for t in &dialogue.tensions {
            md.push_str(&format!(
                "| {} | {} | {:?} |\n",
                t.id, t.description, t.status
            ));
        }
        md.push_str("\n");
    }

    // Rounds
    for round in &dialogue.rounds {
        md.push_str(&format!("## Round {}\n\n", round.number));

        for resp in &round.responses {
            let expert = dialogue.experts.iter().find(|e| e.id == resp.expert_id);
            let name = expert.map(|e| e.name.as_str()).unwrap_or(&resp.expert_id);
            md.push_str(&format!("### {} ({})\n\n", name, resp.expert_id));
            md.push_str(&resp.content);
            md.push_str("\n\n");
        }

        // Round scoreboard
        md.push_str(&format!("### Round {} Scoreboard\n\n", round.number));
        md.push_str("| Expert | Position | Confidence | ALIGNMENT |\n");
        md.push_str("|--------|----------|------------|----------|\n");
        for resp in &round.responses {
            let position_display = if resp.position.len() > 40 {
                format!("{}...", &resp.position[..40])
            } else {
                resp.position.clone()
            };
            md.push_str(&format!(
                "| {} | {} | {:.1} | {} |\n",
                resp.expert_id,
                position_display,
                resp.confidence,
                resp.score.total()
            ));
        }
        md.push_str(&format!(
            "\n**Convergence:** {:.0}% | **Velocity:** {:+} | **Total ALIGNMENT:** {}\n\n",
            round.convergence * 100.0,
            round.velocity,
            round.total_score
        ));
    }

    // Recommendations (extracted from final round consensus)
    md.push_str("## Recommendations\n\n");
    if let Some(final_round) = dialogue.rounds.last() {
        // Take top 3 positions by confidence
        let mut sorted_responses = final_round.responses.clone();
        sorted_responses.sort_by(|a, b| b.confidence.partial_cmp(&a.confidence).unwrap_or(std::cmp::Ordering::Equal));

        for (i, resp) in sorted_responses.iter().take(3).enumerate() {
            md.push_str(&format!("{}. **{}**: {}\n", i + 1, resp.expert_id, resp.position));
        }
    } else {
        md.push_str("*No rounds completed.*\n");
    }

    md.push_str("\n---\n\n");
    md.push_str("*Generated by Blue Alignment Dialogue Orchestration (RFC 0012)*\n");

    md
}

/// Save dialogue to file and SQLite
|
|
||||||
fn save_dialogue(
|
|
||||||
state: &mut ProjectState,
|
|
||||||
dialogue: &AlignmentDialogue,
|
|
||||||
markdown: &str,
|
|
||||||
) -> Result<PathBuf, ServerError> {
|
|
||||||
// Get next dialogue number
|
|
||||||
let dialogue_number = state
|
|
||||||
.store
|
|
||||||
.next_number(DocType::Dialogue)
|
|
||||||
.map_err(|e| ServerError::CommandFailed(e.to_string()))?;
|
|
||||||
|
|
||||||
// Generate file path
|
|
||||||
let date = chrono::Local::now().format("%Y-%m-%d").to_string();
|
|
||||||
let file_name = format!(
|
|
||||||
"{}-{}.dialogue.md",
|
|
||||||
date,
|
|
||||||
to_kebab_case(&dialogue.topic)
|
|
||||||
);
|
|
||||||
let file_path = PathBuf::from("dialogues").join(&file_name);
|
|
||||||
let docs_path = state.home.docs_path.clone();
|
|
||||||
let dialogue_path = docs_path.join(&file_path);
|
|
||||||
|
|
||||||
// Create document in SQLite
|
|
||||||
let mut doc = Document::new(DocType::Dialogue, &dialogue.topic, "recorded");
|
|
||||||
doc.number = Some(dialogue_number);
|
|
||||||
doc.file_path = Some(file_path.to_string_lossy().to_string());
|
|
||||||
|
|
||||||
let dialogue_id = state
|
|
||||||
.store
|
|
||||||
.add_document(&doc)
|
|
||||||
.map_err(|e| ServerError::CommandFailed(e.to_string()))?;
|
|
||||||
|
|
||||||
// Link to RFC if provided
|
|
||||||
if let Some(ref rfc_title) = dialogue.rfc_title {
|
|
||||||
if let Ok(rfc_doc) = state.store.find_document(DocType::Rfc, rfc_title) {
|
|
||||||
if let (Some(rfc_id), Some(did)) = (rfc_doc.id, Some(dialogue_id)) {
|
|
||||||
let _ = state.store.link_documents(did, rfc_id, LinkType::DialogueToRfc);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Create dialogues directory if needed
|
|
||||||
if let Some(parent) = dialogue_path.parent() {
|
|
||||||
fs::create_dir_all(parent).map_err(|e| ServerError::CommandFailed(e.to_string()))?;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Write file
|
|
||||||
fs::write(&dialogue_path, markdown).map_err(|e| ServerError::CommandFailed(e.to_string()))?;
|
|
||||||
|
|
||||||
Ok(dialogue_path)
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Convert string to kebab-case
|
|
||||||
fn to_kebab_case(s: &str) -> String {
|
|
||||||
s.to_lowercase()
|
|
||||||
.chars()
|
|
||||||
.map(|c| if c.is_alphanumeric() { c } else { '-' })
|
|
||||||
.collect::<String>()
|
|
||||||
.split('-')
|
|
||||||
.filter(|s| !s.is_empty())
|
|
||||||
.collect::<Vec<_>>()
|
|
||||||
.join("-")
|
|
||||||
}
|
|
||||||
|
|
||||||
#[cfg(test)]
|
|
||||||
mod tests {
|
|
||||||
use super::*;
|
|
||||||
use blue_core::AlignmentScore;
|
|
||||||
|
|
||||||
#[test]
|
|
||||||
fn test_to_kebab_case() {
|
|
||||||
assert_eq!(to_kebab_case("API Versioning Strategy"), "api-versioning-strategy");
|
|
||||||
assert_eq!(to_kebab_case("Cross-Account IAM"), "cross-account-iam");
|
|
||||||
}
|
|
||||||
|
|
||||||
#[test]
|
|
||||||
fn test_calculate_convergence_single() {
|
|
||||||
let responses = vec![ExpertResponse {
|
|
||||||
expert_id: "DS".to_string(),
|
|
||||||
content: String::new(),
|
|
||||||
position: "Use semantic versioning".to_string(),
|
|
||||||
confidence: 0.8,
|
|
||||||
perspectives: Vec::new(),
|
|
||||||
tensions: Vec::new(),
|
|
||||||
refinements: Vec::new(),
|
|
||||||
concessions: Vec::new(),
|
|
||||||
resolved_tensions: Vec::new(),
|
|
||||||
score: AlignmentScore::default(),
|
|
||||||
}];
|
|
||||||
|
|
||||||
let conv = calculate_convergence(&responses);
|
|
||||||
assert!((conv - 1.0).abs() < 0.001);
|
|
||||||
}
|
|
||||||
|
|
||||||
#[test]
|
|
||||||
fn test_calculate_convergence_empty() {
|
|
||||||
let responses: Vec<ExpertResponse> = Vec::new();
|
|
||||||
let conv = calculate_convergence(&responses);
|
|
||||||
assert!((conv - 0.0).abs() < 0.001);
|
|
||||||
}
|
|
||||||
|
|
||||||
#[test]
|
|
||||||
fn test_summarize_empty_rounds() {
|
|
||||||
let rounds: Vec<Round> = Vec::new();
|
|
||||||
let summary = summarize_previous_rounds(&rounds);
|
|
||||||
assert!(summary.is_empty());
|
|
||||||
}
|
|
||||||
}
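Note on the removed scoreboard code: it shortens `resp.position` with `&resp.position[..40]`, which slices by bytes and panics if byte 40 falls inside a multi-byte UTF-8 character. If this rendering logic is reused elsewhere, a character-aware truncation avoids that. The helper below is a minimal sketch; `truncate_display` is a hypothetical name that does not appear in the original file, and it counts characters rather than bytes, so its cut-off differs slightly from the original `len() > 40` check.

```rust
/// Hypothetical helper (not part of the original file): keep at most
/// `max_chars` characters and append "..." only when text was cut.
fn truncate_display(s: &str, max_chars: usize) -> String {
    match s.char_indices().nth(max_chars) {
        // `idx` is the byte offset of the first character past the limit,
        // so slicing at it always lands on a valid char boundary.
        Some((idx, _)) => format!("{}...", &s[..idx]),
        // Fewer than `max_chars + 1` characters: nothing to cut.
        None => s.to_string(),
    }
}
```

Calling `truncate_display(&resp.position, 40)` would replace the `if`/`else` that builds `position_display`, assuming a character-based limit is acceptable for the table layout.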
|
|
||||||
|
|
@ -3,7 +3,7 @@
|
||||||
//! Each module handles a specific document type or workflow.
|
//! Each module handles a specific document type or workflow.
|
||||||
|
|
||||||
pub mod adr;
|
pub mod adr;
|
||||||
pub mod alignment; // RFC 0012: Alignment Dialogue Orchestration
|
// alignment module removed per RFC 0015 - Claude orchestrates via Task tool, not MCP
|
||||||
pub mod audit; // Health check (blue_health_check)
|
pub mod audit; // Health check (blue_health_check)
|
||||||
pub mod audit_doc; // Audit documents (blue_audit_create, etc.)
|
pub mod audit_doc; // Audit documents (blue_audit_create, etc.)
|
||||||
pub mod decision;
|
pub mod decision;
|
||||||
|
|
@ -22,6 +22,7 @@ pub mod prd;
|
||||||
pub mod realm;
|
pub mod realm;
|
||||||
pub mod release;
|
pub mod release;
|
||||||
pub mod reminder;
|
pub mod reminder;
|
||||||
|
pub mod resources; // MCP Resources (RFC 0016)
|
||||||
pub mod rfc;
|
pub mod rfc;
|
||||||
pub mod runbook;
|
pub mod runbook;
|
||||||
pub mod session;
|
pub mod session;
|
||||||
|
|
|
||||||
|
|
@ -17,7 +17,7 @@ use tracing::info;
|
||||||
|
|
||||||
/// Run the MCP server
|
/// Run the MCP server
|
||||||
pub async fn run() -> anyhow::Result<()> {
|
pub async fn run() -> anyhow::Result<()> {
|
||||||
let mut server = BlueServer::new();
|
let server = std::sync::Arc::new(std::sync::Mutex::new(BlueServer::new()));
|
||||||
|
|
||||||
let stdin = tokio::io::stdin();
|
let stdin = tokio::io::stdin();
|
||||||
let mut stdout = tokio::io::stdout();
|
let mut stdout = tokio::io::stdout();
|
||||||
|
|
@ -34,7 +34,15 @@ pub async fn run() -> anyhow::Result<()> {
|
||||||
break; // EOF
|
break; // EOF
|
||||||
}
|
}
|
||||||
|
|
||||||
let response = server.handle_request(line.trim());
|
// Run blocking handlers in spawn_blocking to avoid tokio runtime conflicts
|
||||||
|
let request = line.trim().to_string();
|
||||||
|
let server_clone = server.clone();
|
||||||
|
let response = tokio::task::spawn_blocking(move || {
|
||||||
|
let mut server = server_clone.lock().unwrap();
|
||||||
|
server.handle_request(&request)
|
||||||
|
})
|
||||||
|
.await?;
|
||||||
|
|
||||||
stdout.write_all(response.as_bytes()).await?;
|
stdout.write_all(response.as_bytes()).await?;
|
||||||
stdout.write_all(b"\n").await?;
|
stdout.write_all(b"\n").await?;
|
||||||
stdout.flush().await?;
|
stdout.flush().await?;
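Note on the threading change above: the server is wrapped in `Arc<Mutex<...>>` so that each request can be handed to a blocking worker while the async loop awaits the result. The sketch below shows the same ownership pattern using only the standard library. `std::thread::spawn` stands in for `tokio::task::spawn_blocking`, and `Handler` and `dispatch` are hypothetical names used for illustration, not part of this commit. It also recovers from a poisoned lock instead of calling `lock().unwrap()`, which is one possible choice rather than what the diff does.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for `BlueServer`; it only counts handled requests.
struct Handler {
    handled: u64,
}

impl Handler {
    fn handle_request(&mut self, request: &str) -> String {
        self.handled += 1;
        format!("#{} {}", self.handled, request)
    }
}

/// Run one request on a separate thread and wait for its response.
fn dispatch(server: &Arc<Mutex<Handler>>, line: &str) -> String {
    // Own the request so the closure does not borrow from the caller.
    let request = line.trim().to_string();
    let server_clone = Arc::clone(server);
    thread::spawn(move || {
        // If an earlier handler panicked while holding the lock, take the
        // guard anyway instead of panicking again on the poisoned mutex.
        let mut guard = server_clone.lock().unwrap_or_else(|e| e.into_inner());
        guard.handle_request(&request)
    })
    .join()
    .expect("handler thread panicked")
}
```

Because every request takes the same mutex, handlers still run one at a time; the benefit of the pattern is that blocking work happens off the async executor, not that requests run in parallel.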
|
||||||
|
|
|
||||||