Eric Garcia d7db9c667d feat: RFC 0048 expert pool implementation and documentation batch

## RFC 0048 Expert Pool Implementation
- Added tiered expert pools (Core/Adjacent/Wildcard) to dialogue handlers
- Implemented weighted random sampling for panel selection
- Added blue_dialogue_sample_panel MCP tool for manual round control
- Updated alignment-play skill with pool design instructions

## New RFCs
- 0044: RFC matching and auto-status (draft)
- 0045: MCP tool enforcement (draft)
- 0046: Judge-defined expert panels (superseded)
- 0047: Expert pool sampling architecture (superseded)
- 0048: Alignment expert pools (implemented)
- 0050: Graduated panel rotation (draft)

## Dialogues Recorded
- 2026-02-01T2026Z: Test expert pool feature
- 2026-02-01T2105Z: SQLite vs flat files
- 2026-02-01T2214Z: Guard command architecture

## Other Changes
- Added TODO.md for tracking work
- Updated expert-pools.md knowledge doc
- Removed deprecated alignment-expert agent
- Added spikes for SQLite assets and SDLC workflow gaps

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-02-01 19:26:41 -05:00


RFC 0047: Expert Pool Sampling Architecture


Status	Superseded
Superseded By	RFC 0048 (Alignment Expert Pools)
Date	2026-02-01
ADRs	0014 (Alignment Dialogue Agents)
Extends	RFC 0046 (Judge Defined Expert Panels)

Summary

Extend the alignment dialogue system to support larger expert pools from which panels are sampled. Currently, blue_dialogue_create(agents=12) creates exactly 12 fixed agents. This RFC introduces a two-phase architecture: (1) Judge defines a domain-appropriate pool of 15-30 experts, (2) MCP server samples N experts from this pool, with optional per-round rotation.

Problem

RFC 0046 addresses the issue of inappropriate auto-selected roles. However, it still creates a fixed panel of exactly N agents who participate in all rounds. This misses opportunities:

Perspective diversity: A larger pool enables rotation, bringing fresh perspectives each round
Stochastic exploration: Weighted random sampling may surface unexpected insights
Tiered expertise: Core experts can be retained while Wildcards rotate
Demonstrable reasoning: Users see the Judge's domain analysis reflected in pool design

Current architecture:

blue_dialogue_create(agents=12, expert_panel=[...12 roles...])
  → Creates 12 fixed agents
  → Same 12 participate in all rounds

Proposed architecture:

blue_dialogue_create(expert_pool=[...24 roles...], panel_size=12, rotation="wildcards")
  → Creates 24-expert domain pool
  → Samples 12 for Round 0
  → Rotates Wildcards each round (retains Core/Adjacent)

Design

Expert Pool Structure

The Judge creates a pool with tiered relevance:

{
  "expert_pool": {
    "domain": "Investment Analysis",
    "question": "Should Acme Trust add NVIDIA by trimming NVAI?",
    "experts": [
      { "role": "Value Analyst", "tier": "Core", "relevance": 0.95 },
      { "role": "Growth Analyst", "tier": "Core", "relevance": 0.90 },
      { "role": "Risk Manager", "tier": "Core", "relevance": 0.85 },
      { "role": "Portfolio Strategist", "tier": "Core", "relevance": 0.80 },
      { "role": "ESG Analyst", "tier": "Adjacent", "relevance": 0.70 },
      { "role": "Quant Strategist", "tier": "Adjacent", "relevance": 0.65 },
      { "role": "Technical Analyst", "tier": "Adjacent", "relevance": 0.60 },
      { "role": "Behavioral Analyst", "tier": "Adjacent", "relevance": 0.55 },
      { "role": "Income Analyst", "tier": "Adjacent", "relevance": 0.50 },
      { "role": "Macro Economist", "tier": "Wildcard", "relevance": 0.40 },
      { "role": "Credit Analyst", "tier": "Wildcard", "relevance": 0.35 },
      { "role": "Contrarian", "tier": "Wildcard", "relevance": 0.30 },
      { "role": "Geopolitical Analyst", "tier": "Wildcard", "relevance": 0.25 },
      { "role": "Market Historian", "tier": "Wildcard", "relevance": 0.22 },
      { "role": "Options Strategist", "tier": "Wildcard", "relevance": 0.20 }
    ]
  },
  "panel_size": 12,
  "rotation": "wildcards"
}
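A pool of this shape can be sanity-checked before any sampling happens. The following is a minimal sketch, not part of the RFC: the `PoolEntry` type, the `validate_pool` name, the error strings, and the specific checks (known tier names, relevance in (0, 1], unique roles, panel no larger than the pool) are all assumptions chosen for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical mirror of one pool entry (tier kept as a plain string here).
#[derive(Debug, Clone)]
struct PoolEntry {
    role: String,
    tier: String,
    relevance: f64,
}

/// Checks assumed for illustration: panel fits in the pool, tiers are known,
/// relevance lies in (0, 1], and no role appears twice.
fn validate_pool(experts: &[PoolEntry], panel_size: usize) -> Result<(), String> {
    if panel_size == 0 || panel_size > experts.len() {
        return Err(format!(
            "panel_size {} must be between 1 and the pool size {}",
            panel_size,
            experts.len()
        ));
    }
    let mut seen: HashSet<&str> = HashSet::new();
    for e in experts {
        if !matches!(e.tier.as_str(), "Core" | "Adjacent" | "Wildcard") {
            return Err(format!("unknown tier '{}' for role '{}'", e.tier, e.role));
        }
        if !(e.relevance > 0.0 && e.relevance <= 1.0) {
            return Err(format!("relevance {} out of range for '{}'", e.relevance, e.role));
        }
        // `insert` returns false when the role was already present.
        if !seen.insert(e.role.as_str()) {
            return Err(format!("duplicate role '{}'", e.role));
        }
    }
    Ok(())
}
```

Rejecting a bad pool up front keeps the sampling code free of edge cases such as zero weights or a panel larger than the pool.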

Tier Distribution

For a pool of P experts with panel size N:

Tier	Pool %	Panel %	Purpose
Core	~25%	~33%	Domain essentials, highest selection probability
Adjacent	~40%	~42%	Related expertise, high selection probability
Wildcard	~35%	~25%	Fresh perspectives, rotation candidates
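The panel percentages can be turned into seat counts with integer arithmetic. This is a sketch of one possible `tier_split`; the RFC names the function but does not define it, so the rounding rule here (Core and Wildcard round to nearest, Adjacent takes the remainder) is an assumption.

```rust
/// Split `panel_size` seats into (core, adjacent, wildcard) counts using the
/// ~33% / ~42% / ~25% targets. Assumed rounding: Core and Wildcard round to
/// the nearest seat, Adjacent absorbs whatever is left so the total is exact.
fn tier_split(panel_size: usize) -> (usize, usize, usize) {
    let core = (panel_size * 33 + 50) / 100;
    let wildcard = (panel_size * 25 + 50) / 100;
    let adjacent = panel_size - core - wildcard;
    (core, adjacent, wildcard)
}
```

For the default panel of 12 this yields 4 Core, 5 Adjacent, and 3 Wildcard seats, matching the 33/42/25 split.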

Sampling Algorithm

use std::collections::HashSet;

/// Sample `panel_size` experts from the pool for the given round.
/// The `load_*` helpers read selections persisted by earlier rounds.
fn sample_panel(pool: &ExpertPool, panel_size: usize, round: usize, rotation: RotationMode) -> Vec<Expert> {
    let (core_n, adj_n, wc_n) = tier_split(panel_size);

    match rotation {
        RotationMode::None => {
            // Round 0 selection persists all rounds (current behavior)
            if round == 0 {
                let mut panel = weighted_sample(&pool.core, core_n);
                panel.extend(weighted_sample(&pool.adjacent, adj_n));
                panel.extend(weighted_sample(&pool.wildcard, wc_n));
                panel
            } else {
                // Return same panel as round 0
                load_round_0_panel()
            }
        }
        RotationMode::Wildcards => {
            // Core/Adjacent persist, Wildcards resample each round
            let mut panel = if round == 0 { weighted_sample(&pool.core, core_n) } else { load_core_panel() };
            panel.extend(if round == 0 { weighted_sample(&pool.adjacent, adj_n) } else { load_adjacent_panel() });

            // Only Wildcards not seated in an earlier round are eligible
            let used_wildcards: HashSet<String> = load_used_wildcards();
            let wc_remaining: Vec<Expert> = pool.wildcard.iter()
                .filter(|e| !used_wildcards.contains(&e.role))
                .cloned()
                .collect();
            panel.extend(weighted_sample(&wc_remaining, wc_n));
            panel
        }
        RotationMode::Full => {
            // Complete resample each round (respects relevance weights)
            weighted_sample(&pool.all, panel_size)
        }
    }
}

/// Weighted random sampling without replacement
fn weighted_sample(experts: &[Expert], n: usize) -> Vec<Expert> {
    let total_weight: f64 = experts.iter().map(|e| e.relevance).sum();
    let probs: Vec<f64> = experts.iter().map(|e| e.relevance / total_weight).collect();

    // Reservoir sampling with weights
    weighted_reservoir_sample(experts, probs, n)
}
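The RFC leaves `weighted_reservoir_sample` undefined. One way to realize weighted sampling without replacement is the key-based method of Efraimidis and Spirakis: each item draws a key `u^(1/weight)` with `u` uniform in (0, 1), and the `n` largest keys are kept. The sketch below swaps in that technique, computed with a full sort rather than a streaming reservoir. The `Expert` stand-in, the extra `rng` parameter, and the small SplitMix64 generator (used only to avoid external crates) are assumptions, not the RFC's implementation.

```rust
/// Hypothetical stand-in for a pool entry; only the fields used here.
#[derive(Debug, Clone, PartialEq)]
struct Expert {
    role: String,
    relevance: f64,
}

/// Tiny SplitMix64 generator so the sketch needs no external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform float strictly between 0 and 1 (52 random bits plus a half step).
    fn next_f64(&mut self) -> f64 {
        ((self.next_u64() >> 12) as f64 + 0.5) / (1u64 << 52) as f64
    }
}

/// Weighted sampling without replacement: draw key = u^(1/relevance) per
/// expert and keep the `n` experts with the largest keys. Experts with a
/// non-positive relevance are skipped to avoid dividing by zero.
fn weighted_sample(experts: &[Expert], n: usize, rng: &mut SplitMix64) -> Vec<Expert> {
    let mut keyed: Vec<(f64, &Expert)> = experts
        .iter()
        .filter(|e| e.relevance > 0.0)
        .map(|e| (rng.next_f64().powf(1.0 / e.relevance), e))
        .collect();
    // Sort descending by key; total_cmp gives a total order on f64.
    keyed.sort_by(|a, b| b.0.total_cmp(&a.0));
    keyed.into_iter().take(n).map(|(_, e)| e.clone()).collect()
}
```

Passing the generator in explicitly makes a round reproducible from a seed, which may help when a sampled panel needs to be audited or replayed.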

API Changes

`blue_dialogue_create` Extended Parameters

{
  "title": "NVIDIA Investment Decision",
  "alignment": true,
  "expert_pool": {
    "domain": "Investment Analysis",
    "experts": [
      { "role": "Value Analyst", "tier": "Core", "relevance": 0.95, "focus": "Intrinsic value, margin of safety" },
      // ... 15-30 experts
    ]
  },
  "panel_size": 12,
  "rotation": "wildcards"  // "none" | "wildcards" | "full"
}

Backward Compatibility

expert_panel (RFC 0046) still works → creates fixed panel, no pool
expert_pool (this RFC) → creates pool with sampling

if let Some(pool) = args.get("expert_pool") {
    // New pool-based architecture
    create_with_pool(pool, panel_size, rotation)
} else if let Some(panel) = args.get("expert_panel") {
    // RFC 0046 behavior - fixed panel
    create_with_fixed_panel(panel)
} else {
    // Error - alignment requires either pool or panel
    Err(ServerError::InvalidParams)
}

New Tool: `blue_dialogue_sample_panel`

For manual round-by-round control:

{
  "name": "blue_dialogue_sample_panel",
  "description": "Sample a new panel from the expert pool for the next round",
  "params": {
    "dialogue_id": "nvidia-investment-decision",
    "round": 1,
    "retain_experts": ["muffin", "cupcake", "scone"],  // Optional: keep specific experts
    "exclude_experts": ["beignet"]  // Optional: exclude specific experts
  }
}
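The retain and exclude lists can be resolved before sampling, leaving only the open seats to fill at random. A minimal sketch under stated assumptions: the `Constrained` type and `apply_constraints` helper are hypothetical, names are agent names as in the example above, and exclusion is assumed to win when a name appears in both lists.

```rust
use std::collections::HashSet;

/// Result of applying retain/exclude lists before sampling.
#[derive(Debug, PartialEq)]
struct Constrained {
    retained: Vec<String>,   // seated without sampling
    candidates: Vec<String>, // still eligible for the open seats
    open_seats: usize,       // seats left to fill by sampling
}

/// Hypothetical helper. Unknown names in `retain` are ignored, and a name
/// listed in both `retain` and `exclude` is excluded (an assumed policy).
fn apply_constraints(
    pool: &[&str],
    retain: &[&str],
    exclude: &[&str],
    panel_size: usize,
) -> Constrained {
    let excluded: HashSet<&str> = exclude.iter().copied().collect();
    let retained: Vec<String> = retain
        .iter()
        .filter(|n| pool.contains(*n) && !excluded.contains(*n))
        .take(panel_size)
        .map(|n| n.to_string())
        .collect();
    let candidates: Vec<String> = pool
        .iter()
        .filter(|n| !excluded.contains(*n) && !retained.iter().any(|r| r.as_str() == **n))
        .map(|n| n.to_string())
        .collect();
    let open_seats = panel_size - retained.len();
    Constrained { retained, candidates, open_seats }
}
```

The weighted sampler then only needs to draw `open_seats` experts from `candidates`.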

Pool Persistence

Expert pools are stored per-dialogue:

{output_dir}/
├── expert-pool.json      ← Full pool definition (Judge writes)
├── round-0/
│   ├── panel.json        ← Sampled panel for this round
│   └── *.md              ← Agent responses
├── round-1/
│   ├── panel.json        ← May differ if rotation enabled
│   └── *.md
└── scoreboard.md
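The layout above maps onto two small path helpers. This is an illustrative sketch; the function names are assumptions, and `output_dir` stands for whatever directory the dialogue was created in.

```rust
use std::path::{Path, PathBuf};

/// Location of the full pool definition for a dialogue.
fn pool_path(output_dir: &Path) -> PathBuf {
    output_dir.join("expert-pool.json")
}

/// Location of the sampled panel for a given round.
fn panel_path(output_dir: &Path, round: usize) -> PathBuf {
    output_dir.join(format!("round-{}", round)).join("panel.json")
}
```

Deriving both paths from one `output_dir` keeps the create and sample handlers pointed at the same files.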

Judge Workflow

Analyze problem: Read RFC/topic, identify required expertise domains
Design pool: Create 15-30 experts across Core/Adjacent/Wildcard tiers
Create dialogue: Call blue_dialogue_create with expert_pool
Run rounds: MCP server handles sampling automatically
Review selections: Pool and panel visible in output files

ADR 0014 Amendment

Add to ADR 0014:

### Expert Pools (RFC 0047)

The Judge may create a **larger expert pool** from which panels are sampled:

| Concept | Description |
|---------|-------------|
| **Pool** | 15-30 domain-appropriate experts defined by Judge |
| **Panel** | N experts sampled from pool for a given round |
| **Sampling** | Weighted random selection respecting relevance scores |
| **Rotation** | Optional: Wildcards may rotate between rounds |

Pool design is a Judge responsibility. The Judge understands the problem domain after reading the RFC/topic and designs experts accordingly.

**Tier Distribution**:
- **Core** (~25% of pool, ~33% of panel): Essential domain experts, highest selection probability
- **Adjacent** (~40% of pool, ~42% of panel): Related expertise, high probability
- **Wildcard** (~35% of pool, ~25% of panel): Fresh perspectives, rotation candidates

**Rotation Modes**:
- `none`: Fixed panel (current behavior)
- `wildcards`: Core/Adjacent persist, Wildcards resample each round
- `full`: Complete resample each round (experimental)

Skill Update: alignment-play

## Phase 1: Pool Design

Before creating the dialogue, the Judge:
1. Reads the topic/RFC thoroughly
2. Identifies the **domain** (e.g., "Investment Analysis", "System Architecture")
3. Designs **15-30 experts** appropriate to the domain:
   - **Core (4-8)**: Essential perspectives for this specific problem
   - **Adjacent (6-12)**: Related expertise that adds depth
   - **Wildcard (5-10)**: Fresh perspectives, contrarians, cross-domain insight
4. Assigns **relevance scores** (0.20-0.95) based on expected contribution
5. Calls `blue_dialogue_create` with the `expert_pool`

## Phase 2: Round Execution

The MCP server:
1. Samples `panel_size` experts from pool using weighted random selection
2. Higher relevance = higher selection probability
3. Core experts almost always selected; Wildcards provide variety
4. If rotation enabled, Wildcards resample each round

## Phase 3: Convergence

Same as current: velocity → 0 or tensions resolved

Implementation

Changes to `dialogue.rs`

/// Expert pool with tiered structure
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ExpertPool {
    pub domain: String,
    pub experts: Vec<PoolExpert>,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct PoolExpert {
    pub role: String,
    pub tier: ExpertTier,
    pub relevance: f64,
    pub focus: Option<String>,
    pub bias: Option<String>,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum ExpertTier {
    Core,
    Adjacent,
    Wildcard,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum RotationMode {
    None,       // Fixed panel all rounds
    Wildcards,  // Core/Adjacent fixed, Wildcards rotate
    Full,       // Complete resample each round
}

/// Handle blue_dialogue_create with pool support
pub fn handle_create(state: &mut ProjectState, args: &Value) -> Result<Value, ServerError> {
    // ... existing validation ...

    if let Some(pool_json) = args.get("expert_pool") {
        let pool: ExpertPool = serde_json::from_value(pool_json.clone())?;
        let panel_size = args.get("panel_size").and_then(|v| v.as_u64()).unwrap_or(12) as usize;
        let rotation: RotationMode = args.get("rotation")
            .and_then(|v| v.as_str())
            .map(|s| match s {
                "wildcards" => RotationMode::Wildcards,
                "full" => RotationMode::Full,
                _ => RotationMode::None,
            })
            .unwrap_or(RotationMode::None);

        // Sample initial panel
        let agents = sample_panel_from_pool(&pool, panel_size);

        // Persist pool to output directory
        let pool_path = format!("{}/expert-pool.json", output_dir);
        fs::write(&pool_path, serde_json::to_string_pretty(&pool)?)?;

        // ... rest of dialogue creation ...
    }
}
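The handler above maps any unrecognized `rotation` string to `RotationMode::None`, so a typo such as `"wildcard"` would silently disable rotation. A stricter alternative, shown here as a sketch rather than the RFC's behavior, is to parse through `FromStr` and reject unknown values; the error type and message are assumptions.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RotationMode {
    None,      // Fixed panel all rounds
    Wildcards, // Core/Adjacent fixed, Wildcards rotate
    Full,      // Complete resample each round
}

impl FromStr for RotationMode {
    type Err = String;

    /// Accepts exactly the three documented values; anything else is an error.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "none" => Ok(RotationMode::None),
            "wildcards" => Ok(RotationMode::Wildcards),
            "full" => Ok(RotationMode::Full),
            other => Err(format!("unknown rotation mode '{}'", other)),
        }
    }
}
```

A caller could still default to `RotationMode::None` when the parameter is absent, while returning an invalid-params error when it is present but misspelled.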

New Handler: `handle_sample_panel`

pub fn handle_sample_panel(state: &ProjectState, args: &Value) -> Result<Value, ServerError> {
    let dialogue_id = args.get("dialogue_id").and_then(|v| v.as_str())
        .ok_or(ServerError::InvalidParams)?;
    let round = args.get("round").and_then(|v| v.as_u64())
        .ok_or(ServerError::InvalidParams)? as usize;

    // Load pool from dialogue directory
    let pool_path = format!("/tmp/blue-dialogue/{}/expert-pool.json", dialogue_id);
    let pool: ExpertPool = serde_json::from_str(&fs::read_to_string(&pool_path)?)?;

    // Parse optional retain/exclude lists (agent names)
    let names = |key: &str| -> Vec<String> {
        args.get(key)
            .and_then(|v| v.as_array())
            .map(|arr| arr.iter().filter_map(|v| v.as_str().map(String::from)).collect())
            .unwrap_or_default()
    };
    let retain = names("retain_experts");
    let exclude = names("exclude_experts");

    // Sample new panel
    let panel = sample_panel_with_constraints(&pool, 12, round, &retain, &exclude);

    // Persist panel for this round, creating the round directory if needed
    let round_dir = format!("/tmp/blue-dialogue/{}/round-{}", dialogue_id, round);
    fs::create_dir_all(&round_dir)?;
    let panel_path = format!("{}/panel.json", round_dir);
    fs::write(&panel_path, serde_json::to_string_pretty(&panel)?)?;

    Ok(json!({
        "status": "success",
        "round": round,
        "panel": panel,
    }))
}

Test Plan

blue_dialogue_create with expert_pool creates pool file
Initial panel respects tier distribution (33/42/25)
Weighted sampling: higher relevance = higher selection probability
rotation: "wildcards" keeps Core/Adjacent, rotates Wildcards
rotation: "none" uses same panel all rounds
blue_dialogue_sample_panel respects retain_experts
Pool persists across rounds in output directory
Backward compatibility: expert_panel (RFC 0046) still works
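The `rotation: "wildcards"` check in the list above can be exercised against a small model of the rotation state. The sketch below is an illustration only: the `Rotation` type is hypothetical, Wildcards are picked in pool order as a deterministic stand-in for weighted sampling, and clearing the used set when too few unused Wildcards remain is an assumed policy that the RFC does not specify.

```rust
use std::collections::HashSet;

/// Hypothetical per-dialogue rotation state.
struct Rotation {
    fixed: Vec<String>,     // Core + Adjacent seats chosen in round 0
    wildcards: Vec<String>, // all Wildcards in the pool
    used: HashSet<String>,  // Wildcards seated in earlier rounds
}

impl Rotation {
    /// Build the next round's panel with `wc_n` Wildcard seats.
    /// Fixed seats come first, followed by Wildcards not yet used.
    fn next_panel(&mut self, wc_n: usize) -> Vec<String> {
        let remaining = self
            .wildcards
            .iter()
            .filter(|w| !self.used.contains(*w))
            .count();
        if remaining < wc_n {
            // Assumed policy: start a fresh cycle once the pool is exhausted.
            self.used.clear();
        }
        let picked: Vec<String> = self
            .wildcards
            .iter()
            .filter(|w| !self.used.contains(*w))
            .take(wc_n)
            .cloned()
            .collect();
        self.used.extend(picked.iter().cloned());
        let mut panel = self.fixed.clone();
        panel.extend(picked);
        panel
    }
}
```

A test built on this model would assert that the fixed seats are identical across rounds and that consecutive rounds share no Wildcards until the pool is exhausted.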

Visualization (for demo)

The demo page can show:

┌─────────────────────────────────────────────────────────────┐
│  INVESTMENT EXPERT POOL                            24 total │
├─────────────────────────────────────────────────────────────┤
│  CORE (6)          ████████████████████  rel: 0.80-0.95     │
│  ✓ Value Analyst   ✓ Growth Analyst   ✓ Risk Manager        │
│  ✓ Portfolio Strat ○ Fundamental      ○ Tax Specialist      │
├─────────────────────────────────────────────────────────────┤
│  ADJACENT (10)     ████████████████      rel: 0.50-0.70     │
│  ✓ ESG Analyst     ✓ Quant Strategist ✓ Technical Analyst   │
│  ✓ Behavioral      ✓ Income Analyst   ○ Credit Analyst      │
│  ○ Governance      ○ Competitive      ○ Regulatory          │
│  ○ Momentum Trader                                          │
├─────────────────────────────────────────────────────────────┤
│  WILDCARD (8)      ████████              rel: 0.20-0.40     │
│  ✓ Macro Economist ✓ Contrarian       ✓ Geopolitical        │
│  ○ Market Historian ○ Options Strat   ○ Ethicist            │
│  ○ Retail Sentiment ○ Academic                              │
└─────────────────────────────────────────────────────────────┘
  ✓ = Selected for Round 0    ○ = Available in pool

  [🎲 Resample Panel] ← Click to see stochastic selection

Philosophy

"The Judge sees the elephant. The Judge summons the right blind men."

The alignment dialogue system embodies the parable of the blind men and the elephant. Each expert touches a different part. Wisdom emerges from integration.

With expert pools:

The Judge designs the population of potential perspectives
The MCP server samples fairly from that population
Rotation refreshes the conversation with new viewpoints
The final verdict reflects multiple samplings of the elephant

This is ALIGNMENT by design: more blind men, more parts touched, more wisdom integrated.

Blue
