## RFC 0048 Expert Pool Implementation - Added tiered expert pools (Core/Adjacent/Wildcard) to dialogue handlers - Implemented weighted random sampling for panel selection - Added blue_dialogue_sample_panel MCP tool for manual round control - Updated alignment-play skill with pool design instructions ## New RFCs - 0044: RFC matching and auto-status (draft) - 0045: MCP tool enforcement (draft) - 0046: Judge-defined expert panels (superseded) - 0047: Expert pool sampling architecture (superseded) - 0048: Alignment expert pools (implemented) - 0050: Graduated panel rotation (draft) ## Dialogues Recorded - 2026-02-01T2026Z: Test expert pool feature - 2026-02-01T2105Z: SQLite vs flat files - 2026-02-01T2214Z: Guard command architecture ## Other Changes - Added TODO.md for tracking work - Updated expert-pools.md knowledge doc - Removed deprecated alignment-expert agent - Added spikes for SQLite assets and SDLC workflow gaps Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
422 lines
16 KiB
Markdown
422 lines
16 KiB
Markdown
# RFC 0047: Expert Pool Sampling Architecture
|
|
|
|
| | |
|
|
|---|---|
|
|
| **Status** | Superseded |
|
|
| **Superseded By** | RFC 0048 (Alignment Expert Pools) |
|
|
| **Date** | 2026-02-01 |
|
|
| **ADRs** | 0014 (Alignment Dialogue Agents) |
|
|
| **Extends** | RFC 0046 (Judge Defined Expert Panels) |
|
|
|
|
---
|
|
|
|
## Summary
|
|
|
|
Extend the alignment dialogue system to support **larger expert pools** from which panels are sampled. Currently, `blue_dialogue_create(agents=12)` creates exactly 12 fixed agents. This RFC introduces a two-phase architecture: (1) Judge defines a domain-appropriate pool of 15-30 experts, (2) MCP server samples N experts from this pool, with optional per-round rotation.
|
|
|
|
## Problem
|
|
|
|
RFC 0046 addresses the issue of inappropriate auto-selected roles. However, it still creates a **fixed panel** of exactly N agents who participate in all rounds. This misses opportunities:
|
|
|
|
1. **Perspective diversity**: A larger pool enables rotation, bringing fresh perspectives each round
|
|
2. **Stochastic exploration**: Weighted random sampling may surface unexpected insights
|
|
3. **Tiered expertise**: Core experts can be retained while Wildcards rotate
|
|
4. **Demonstrable reasoning**: Users see the Judge's domain analysis reflected in pool design
|
|
|
|
Current architecture:
|
|
```
|
|
blue_dialogue_create(agents=12, expert_panel=[...12 roles...])
|
|
→ Creates 12 fixed agents
|
|
→ Same 12 participate in all rounds
|
|
```
|
|
|
|
Proposed architecture:
|
|
```
|
|
blue_dialogue_create(expert_pool=[...24 roles...], panel_size=12, rotation="wildcards")
|
|
→ Creates 24-expert domain pool
|
|
→ Samples 12 for Round 0
|
|
→ Rotates Wildcards each round (retains Core/Adjacent)
|
|
```
|
|
|
|
## Design
|
|
|
|
### Expert Pool Structure
|
|
|
|
The Judge creates a pool with tiered relevance:
|
|
|
|
```json
|
|
{
|
|
"expert_pool": {
|
|
"domain": "Investment Analysis",
|
|
"question": "Should Acme Trust add NVIDIA by trimming NVAI?",
|
|
"experts": [
|
|
{ "role": "Value Analyst", "tier": "Core", "relevance": 0.95 },
|
|
{ "role": "Growth Analyst", "tier": "Core", "relevance": 0.90 },
|
|
{ "role": "Risk Manager", "tier": "Core", "relevance": 0.85 },
|
|
{ "role": "Portfolio Strategist", "tier": "Core", "relevance": 0.80 },
|
|
{ "role": "ESG Analyst", "tier": "Adjacent", "relevance": 0.70 },
|
|
{ "role": "Quant Strategist", "tier": "Adjacent", "relevance": 0.65 },
|
|
{ "role": "Technical Analyst", "tier": "Adjacent", "relevance": 0.60 },
|
|
{ "role": "Behavioral Analyst", "tier": "Adjacent", "relevance": 0.55 },
|
|
{ "role": "Income Analyst", "tier": "Adjacent", "relevance": 0.50 },
|
|
{ "role": "Macro Economist", "tier": "Wildcard", "relevance": 0.40 },
|
|
{ "role": "Credit Analyst", "tier": "Wildcard", "relevance": 0.35 },
|
|
{ "role": "Contrarian", "tier": "Wildcard", "relevance": 0.30 },
|
|
{ "role": "Geopolitical Analyst", "tier": "Wildcard", "relevance": 0.25 },
|
|
{ "role": "Market Historian", "tier": "Wildcard", "relevance": 0.22 },
|
|
{ "role": "Options Strategist", "tier": "Wildcard", "relevance": 0.20 }
|
|
]
|
|
},
|
|
"panel_size": 12,
|
|
"rotation": "wildcards"
|
|
}
|
|
```
|
|
|
|
### Tier Distribution
|
|
|
|
For a pool of P experts with panel size N:
|
|
|
|
| Tier | Pool % | Panel % | Purpose |
|
|
|------|--------|---------|---------|
|
|
| **Core** | ~25% | ~33% | Domain essentials, always selected |
|
|
| **Adjacent** | ~40% | ~42% | Related expertise, high selection probability |
|
|
| **Wildcard** | ~35% | ~25% | Fresh perspectives, rotation candidates |
|
|
|
|
### Sampling Algorithm
|
|
|
|
```rust
|
|
/// Sample N experts from pool for a round
|
|
fn sample_panel(pool: &ExpertPool, panel_size: usize, round: usize, rotation: RotationMode) -> Vec<PastryAgent> {
|
|
let (core_n, adj_n, wc_n) = tier_split(panel_size);
|
|
|
|
match rotation {
|
|
RotationMode::None => {
|
|
// Round 0 selection persists all rounds (current behavior)
|
|
if round == 0 {
|
|
weighted_sample(&pool.core, core_n)
|
|
.chain(weighted_sample(&pool.adjacent, adj_n))
|
|
.chain(weighted_sample(&pool.wildcard, wc_n))
|
|
} else {
|
|
// Return same panel as round 0
|
|
load_round_0_panel()
|
|
}
|
|
}
|
|
RotationMode::Wildcards => {
|
|
// Core/Adjacent persist, Wildcards resample each round
|
|
let core = if round == 0 { weighted_sample(&pool.core, core_n) } else { load_core_panel() };
|
|
let adjacent = if round == 0 { weighted_sample(&pool.adjacent, adj_n) } else { load_adjacent_panel() };
|
|
let wc_remaining = pool.wildcard.iter()
|
|
.filter(|e| !used_wildcards.contains(&e.role))
|
|
.collect();
|
|
let wildcards = weighted_sample(&wc_remaining, wc_n);
|
|
|
|
core.chain(adjacent).chain(wildcards)
|
|
}
|
|
RotationMode::Full => {
|
|
// Complete resample each round (respects relevance weights)
|
|
weighted_sample(&pool.all, panel_size)
|
|
}
|
|
}
|
|
}
|
|
|
|
/// Weighted random sampling without replacement
|
|
fn weighted_sample(experts: &[Expert], n: usize) -> Vec<Expert> {
|
|
let total_weight: f64 = experts.iter().map(|e| e.relevance).sum();
|
|
let probs: Vec<f64> = experts.iter().map(|e| e.relevance / total_weight).collect();
|
|
|
|
// Reservoir sampling with weights
|
|
weighted_reservoir_sample(experts, probs, n)
|
|
}
|
|
```
|
|
|
|
### API Changes
|
|
|
|
#### `blue_dialogue_create` Extended Parameters
|
|
|
|
```json
|
|
{
|
|
"title": "NVIDIA Investment Decision",
|
|
"alignment": true,
|
|
"expert_pool": {
|
|
"domain": "Investment Analysis",
|
|
"experts": [
|
|
{ "role": "Value Analyst", "tier": "Core", "relevance": 0.95, "focus": "Intrinsic value, margin of safety" },
|
|
// ... 15-30 experts
|
|
]
|
|
},
|
|
"panel_size": 12,
|
|
"rotation": "wildcards" // "none" | "wildcards" | "full"
|
|
}
|
|
```
|
|
|
|
#### Backward Compatibility
|
|
|
|
- `expert_panel` (RFC 0046) still works → creates fixed panel, no pool
|
|
- `expert_pool` (this RFC) → creates pool with sampling
|
|
|
|
```rust
|
|
if let Some(pool) = args.get("expert_pool") {
|
|
// New pool-based architecture
|
|
create_with_pool(pool, panel_size, rotation)
|
|
} else if let Some(panel) = args.get("expert_panel") {
|
|
// RFC 0046 behavior - fixed panel
|
|
create_with_fixed_panel(panel)
|
|
} else {
|
|
// Error - alignment requires either pool or panel
|
|
Err(ServerError::InvalidParams)
|
|
}
|
|
```
|
|
|
|
### New Tool: `blue_dialogue_sample_panel`
|
|
|
|
For manual round-by-round control:
|
|
|
|
```json
|
|
{
|
|
"name": "blue_dialogue_sample_panel",
|
|
"description": "Sample a new panel from the expert pool for the next round",
|
|
"params": {
|
|
"dialogue_id": "nvidia-investment-decision",
|
|
"round": 1,
|
|
"retain_experts": ["muffin", "cupcake", "scone"], // Optional: keep specific experts
|
|
"exclude_experts": ["beignet"] // Optional: exclude specific experts
|
|
}
|
|
}
|
|
```
|
|
|
|
### Pool Persistence
|
|
|
|
Expert pools are stored per-dialogue:
|
|
|
|
```
|
|
{output_dir}/
|
|
├── expert-pool.json ← Full pool definition (Judge writes)
|
|
├── round-0/
|
|
│ ├── panel.json ← Sampled panel for this round
|
|
│ └── *.md ← Agent responses
|
|
├── round-1/
|
|
│ ├── panel.json ← May differ if rotation enabled
|
|
│ └── *.md
|
|
└── scoreboard.md
|
|
```
|
|
|
|
### Judge Workflow
|
|
|
|
1. **Analyze problem**: Read RFC/topic, identify required expertise domains
|
|
2. **Design pool**: Create 15-30 experts across Core/Adjacent/Wildcard tiers
|
|
3. **Create dialogue**: Call `blue_dialogue_create` with `expert_pool`
|
|
4. **Run rounds**: MCP server handles sampling automatically
|
|
5. **Review selections**: Pool and panel visible in output files
|
|
|
|
## ADR 0014 Amendment
|
|
|
|
Add to ADR 0014:
|
|
|
|
```markdown
|
|
### Expert Pools (RFC 0047)
|
|
|
|
The Judge may create a **larger expert pool** from which panels are sampled:
|
|
|
|
| Concept | Description |
|
|
|---------|-------------|
|
|
| **Pool** | 15-30 domain-appropriate experts defined by Judge |
|
|
| **Panel** | N experts sampled from pool for a given round |
|
|
| **Sampling** | Weighted random selection respecting relevance scores |
|
|
| **Rotation** | Optional: Wildcards may rotate between rounds |
|
|
|
|
Pool design is a Judge responsibility. The Judge understands the problem domain after reading the RFC/topic and designs experts accordingly.
|
|
|
|
**Tier Distribution**:
|
|
- **Core** (~25% of pool, 33% of panel): Essential domain experts, always selected
|
|
- **Adjacent** (~40% of pool, 42% of panel): Related expertise, high probability
|
|
- **Wildcard** (~35% of pool, 25% of panel): Fresh perspectives, rotation candidates
|
|
|
|
**Rotation Modes**:
|
|
- `none`: Fixed panel (current behavior)
|
|
- `wildcards`: Core/Adjacent persist, Wildcards resample each round
|
|
- `full`: Complete resample each round (experimental)
|
|
```
|
|
|
|
## Skill Update: alignment-play
|
|
|
|
```markdown
|
|
## Phase 1: Pool Design
|
|
|
|
Before creating the dialogue, the Judge:
|
|
1. Reads the topic/RFC thoroughly
|
|
2. Identifies the **domain** (e.g., "Investment Analysis", "System Architecture")
|
|
3. Designs **15-30 experts** appropriate to the domain:
|
|
- **Core (4-8)**: Essential perspectives for this specific problem
|
|
- **Adjacent (6-12)**: Related expertise that adds depth
|
|
- **Wildcard (5-10)**: Fresh perspectives, contrarians, cross-domain insight
|
|
4. Assigns **relevance scores** (0.20-0.95) based on expected contribution
|
|
5. Calls `blue_dialogue_create` with the `expert_pool`
|
|
|
|
## Phase 2: Round Execution
|
|
|
|
The MCP server:
|
|
1. Samples `panel_size` experts from pool using weighted random selection
|
|
2. Higher relevance = higher selection probability
|
|
3. Core experts almost always selected; Wildcards provide variety
|
|
4. If rotation enabled, Wildcards resample each round
|
|
|
|
## Phase 3: Convergence
|
|
|
|
Same as current: velocity → 0 or tensions resolved
|
|
```
|
|
|
|
## Implementation
|
|
|
|
### Changes to `dialogue.rs`
|
|
|
|
```rust
|
|
/// Expert pool with tiered structure
|
|
#[derive(Debug, Clone, Serialize, Deserialize)]
|
|
pub struct ExpertPool {
|
|
pub domain: String,
|
|
pub experts: Vec<PoolExpert>,
|
|
}
|
|
|
|
#[derive(Debug, Clone, Serialize, Deserialize)]
|
|
pub struct PoolExpert {
|
|
pub role: String,
|
|
pub tier: ExpertTier,
|
|
pub relevance: f64,
|
|
pub focus: Option<String>,
|
|
pub bias: Option<String>,
|
|
}
|
|
|
|
#[derive(Debug, Clone, Serialize, Deserialize)]
|
|
pub enum ExpertTier {
|
|
Core,
|
|
Adjacent,
|
|
Wildcard,
|
|
}
|
|
|
|
#[derive(Debug, Clone, Serialize, Deserialize)]
|
|
pub enum RotationMode {
|
|
None, // Fixed panel all rounds
|
|
Wildcards, // Core/Adjacent fixed, Wildcards rotate
|
|
Full, // Complete resample each round
|
|
}
|
|
|
|
/// Handle blue_dialogue_create with pool support
|
|
pub fn handle_create(state: &mut ProjectState, args: &Value) -> Result<Value, ServerError> {
|
|
// ... existing validation ...
|
|
|
|
if let Some(pool_json) = args.get("expert_pool") {
|
|
let pool: ExpertPool = serde_json::from_value(pool_json.clone())?;
|
|
let panel_size = args.get("panel_size").and_then(|v| v.as_u64()).unwrap_or(12) as usize;
|
|
let rotation: RotationMode = args.get("rotation")
|
|
.and_then(|v| v.as_str())
|
|
.map(|s| match s {
|
|
"wildcards" => RotationMode::Wildcards,
|
|
"full" => RotationMode::Full,
|
|
_ => RotationMode::None,
|
|
})
|
|
.unwrap_or(RotationMode::None);
|
|
|
|
// Sample initial panel
|
|
let agents = sample_panel_from_pool(&pool, panel_size);
|
|
|
|
// Persist pool to output directory
|
|
let pool_path = format!("{}/expert-pool.json", output_dir);
|
|
fs::write(&pool_path, serde_json::to_string_pretty(&pool)?)?;
|
|
|
|
// ... rest of dialogue creation ...
|
|
}
|
|
}
|
|
```
|
|
|
|
### New Handler: `handle_sample_panel`
|
|
|
|
```rust
|
|
pub fn handle_sample_panel(state: &ProjectState, args: &Value) -> Result<Value, ServerError> {
|
|
let dialogue_id = args.get("dialogue_id").and_then(|v| v.as_str())
|
|
.ok_or(ServerError::InvalidParams)?;
|
|
let round = args.get("round").and_then(|v| v.as_u64())
|
|
.ok_or(ServerError::InvalidParams)? as usize;
|
|
|
|
// Load pool from dialogue directory
|
|
let pool_path = format!("/tmp/blue-dialogue/{}/expert-pool.json", dialogue_id);
|
|
let pool: ExpertPool = serde_json::from_str(&fs::read_to_string(&pool_path)?)?;
|
|
|
|
// Parse retain/exclude lists
|
|
let retain: Vec<String> = args.get("retain_experts")
|
|
.and_then(|v| v.as_array())
|
|
.map(|arr| arr.iter().filter_map(|v| v.as_str().map(String::from)).collect())
|
|
.unwrap_or_default();
|
|
|
|
// Sample new panel
|
|
let panel = sample_panel_with_constraints(&pool, 12, round, &retain);
|
|
|
|
// Persist panel for this round
|
|
let panel_path = format!("/tmp/blue-dialogue/{}/round-{}/panel.json", dialogue_id, round);
|
|
fs::write(&panel_path, serde_json::to_string_pretty(&panel)?)?;
|
|
|
|
Ok(json!({
|
|
"status": "success",
|
|
"round": round,
|
|
"panel": panel,
|
|
}))
|
|
}
|
|
```
|
|
|
|
## Test Plan
|
|
|
|
- [ ] `blue_dialogue_create` with `expert_pool` creates pool file
|
|
- [ ] Initial panel respects tier distribution (33/42/25)
|
|
- [ ] Weighted sampling: higher relevance = higher selection probability
|
|
- [ ] `rotation: "wildcards"` keeps Core/Adjacent, rotates Wildcards
|
|
- [ ] `rotation: "none"` uses same panel all rounds
|
|
- [ ] `blue_dialogue_sample_panel` respects `retain_experts`
|
|
- [ ] Pool persists across rounds in output directory
|
|
- [ ] Backward compatibility: `expert_panel` (RFC 0046) still works
|
|
|
|
## Visualization (for demo)
|
|
|
|
The demo page can show:
|
|
|
|
```
|
|
┌─────────────────────────────────────────────────────────────┐
|
|
│ INVESTMENT EXPERT POOL 24 total │
|
|
├─────────────────────────────────────────────────────────────┤
|
|
│ CORE (6) ████████████████████ rel: 0.80-0.95 │
|
|
│ ✓ Value Analyst ✓ Growth Analyst ✓ Risk Manager │
|
|
│ ✓ Portfolio Strat ○ Fundamental ○ Tax Specialist │
|
|
├─────────────────────────────────────────────────────────────┤
|
|
│ ADJACENT (10) ████████████████ rel: 0.50-0.70 │
|
|
│ ✓ ESG Analyst ✓ Quant Strategist ✓ Technical Analyst │
|
|
│ ✓ Behavioral ✓ Income Analyst ○ Credit Analyst │
|
|
│ ○ Governance ○ Competitive ○ Regulatory │
|
|
│ ○ Momentum Trader │
|
|
├─────────────────────────────────────────────────────────────┤
|
|
│ WILDCARD (8) ████████ rel: 0.20-0.40 │
|
|
│ ✓ Macro Economist ✓ Contrarian ✓ Geopolitical │
|
|
│ ○ Market Historian ○ Options Strat ○ Ethicist │
|
|
│ ○ Retail Sentiment ○ Academic │
|
|
└─────────────────────────────────────────────────────────────┘
|
|
✓ = Selected for Round 0 ○ = Available in pool
|
|
|
|
[🎲 Resample Panel] ← Click to see stochastic selection
|
|
```
|
|
|
|
---
|
|
|
|
## Philosophy
|
|
|
|
> "The Judge sees the elephant. The Judge summons the right blind men."
|
|
|
|
The alignment dialogue system embodies the parable of the blind men and the elephant. Each expert touches a different part. Wisdom emerges from integration.
|
|
|
|
With expert pools:
|
|
- The Judge **designs** the population of potential perspectives
|
|
- The MCP server **samples** fairly from that population
|
|
- Rotation **refreshes** the conversation with new viewpoints
|
|
- The final verdict reflects **multiple samplings** of the elephant
|
|
|
|
This is ALIGNMENT by design: more blind men, more parts touched, more wisdom integrated.
|
|
|
|
---
|
|
|
|
*Blue*
|