# Dialogue: Realm MCP Integration Design

**RFC**: [0002-realm-mcp-integration](../rfcs/0002-realm-mcp-integration.md)
**Goal**: Reach 95% alignment on open design questions
**Format**: 12 experts, structured rounds

---

## Open Questions

1. **Tool granularity** - One `realm` tool with subcommands, or separate tools?
2. **Notification delivery** - Poll on each tool call, or separate subscription?
3. **Multi-realm** - How to handle repos in multiple realms?

---

## Expert Panel

| Expert | Domain | Perspective |
|--------|--------|-------------|
| **Ada** | API Design | Clean interfaces, discoverability |
| **Ben** | Developer Experience | Friction, learning curve |
| **Carmen** | Systems Architecture | Scalability, performance |
| **David** | MCP Protocol | Tool conventions, client compatibility |
| **Elena** | Claude Integration | LLM tool use patterns |
| **Felix** | Distributed Systems | Consistency, coordination |
| **Grace** | Security | Trust boundaries, access control |
| **Hassan** | Product | User workflows, value delivery |
| **Iris** | Simplicity | Minimalism, YAGNI |
| **James** | Observability | Debugging, transparency |
| **Kim** | Testing | Testability, reliability |
| **Luna** | Documentation | Learnability, examples |

---

## Round 1: Initial Positions

### Question 1: Tool Granularity

**Ada (API Design)**: Separate tools. Each tool has a clear contract. `realm_status` returns status, `realm_check` returns validation results. Easier to document, easier to version independently.

**Ben (DX)**: Separate tools, but not too many. 5-7 tools max in the "realm" namespace. Too many tools overwhelm. Group by workflow: status, validation, session, worktree.

**David (MCP Protocol)**: MCP tools should be atomic operations. One tool = one action. Subcommand patterns work poorly because the LLM has to understand nested schemas. Separate tools with clear names.

**Elena (Claude Integration)**: Claude performs better with focused tools. A tool that does one thing well gets used correctly. A multi-purpose tool with modes leads to parameter confusion.

**Iris (Simplicity)**: Start with 3 tools: `realm_status`, `realm_check`, `realm_action`. The action tool can handle mutations. Expand only when pain is proven.

**Luna (Documentation)**: Separate tools are easier to document with examples. Each tool gets its own "when to use this" section.

**Alignment**: 85% toward separate tools, debate on how many.

### Question 2: Notification Delivery

**Carmen (Systems)**: Polling is simpler and more reliable. MCP doesn't have a push channel. Each tool call can check for pending notifications and include them in the response.

**Felix (Distributed)**: Polling with piggybacking. Don't make a separate notification tool - just include notifications in every response when relevant. The daemon tracks what's been delivered.

**David (MCP Protocol)**: MCP has no subscription model. Polling is the only option. But we can be smart: return notifications with any tool response, mark as delivered, client sees them naturally.

**Hassan (Product)**: Users don't want to manually check notifications. Piggybacking is right - if Claude calls any realm tool and there are notifications, surface them. Claude can then decide to act.

**James (Observability)**: Need a way to explicitly list/ack notifications too. Piggybacking is good for discovery, but sometimes you want to see "what happened while I was away."

**Grace (Security)**: Notifications should only show changes the current session is authorized to see. Scope to domains the repo participates in.

**Alignment**: 90% toward poll-with-piggyback, explicit list/ack as supplement.

### Question 3: Multi-Realm

**Iris (Simplicity)**: Defer. MVP is one realm per repo. Multi-realm is complexity we don't need yet. YAGNI.

**Ada (API Design)**: If we defer, design the API to not preclude it. All tools should accept optional `realm` parameter. Default to detected realm from cwd.

**Ben (DX)**: Agree with Iris. I've never seen a repo that needs to be in multiple realms. Solve real problems first.

**Carmen (Systems)**: One repo, one realm is a reasonable constraint. It simplifies the daemon state model significantly.

**Felix (Distributed)**: Multi-realm creates coordination nightmares. Which realm's contract wins? Keep it simple: one repo belongs to one realm.

**Grace (Security)**: Multi-realm means multi-trust-boundary. That's a security model I'd want to think very carefully about. Defer.

**Kim (Testing)**: Single realm is much easier to test. The state space explodes with multi-realm.

**Hassan (Product)**: What's the user story for multi-realm? I can't think of one. A repo coordinates with its team. That's one realm.

**Alignment**: 95% toward single-realm constraint for MVP.

---

## Round 2: Convergence

### Question 1: How Many Tools?

**Elena**: Let me propose a concrete list based on user workflows:

**Status/Read (3 tools)**:
- `realm_status` - Overview of realm state
- `realm_check` - Validation for CI
- `contract_get` - Get specific contract details

**Session (2 tools)**:
- `session_start` - Begin work session
- `session_stop` - End session (status is included in other responses)

**Workflow (2 tools)**:
- `worktree_create` - Set up RFC worktrees
- `pr_status` - Check PR readiness (prepare is a CLI action, not MCP)

**Notifications (1 tool)**:
- `notifications_list` - Explicit check (ack happens implicitly when acted upon)

**Total: 8 tools**

**Ben**: That's reasonable. Each maps to a clear user intent.

**Ada**: I'd merge `session_start` and `session_stop` into `session` with an action parameter. "Start" and "stop" are two sides of one concept.

**Elena**: Fair, but Claude handles "start a session" better than "call session with action=start". The verb in the tool name guides usage.

**David**: Agree with Elena. MCP tools work best when the name is the action. `session_start` is clearer than `session(action="start")`.

**Iris**: 8 tools feels like a lot. Can we cut?

**Hassan**: Which would you cut? Each serves a distinct workflow.

**Iris**: `contract_get` could be part of `realm_status` with a filter. `notifications_list` could be piggybacked only.

**James**: I want `notifications_list` as an explicit tool. "Show me what changed" is a real user intent.

**Luna**: 8 tools is fine if they're well-documented. The CLI has more commands than that.

**Alignment on Q1**: 90% - 8 tools as proposed, with room to consolidate if usage shows overlap.

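
The naming argument between Ada, Elena, and David can be made concrete by comparing the input schemas the two designs would expose. This is a hypothetical sketch, not the RFC's definitions: only the tool names come from the dialogue, the `realm` property follows Ada's earlier suggestion of an optional override, and the descriptions and the merged `session` tool are assumptions for illustration.

```python
# Hypothetical MCP tool definitions (name / description / inputSchema).
# Only the tool names come from the dialogue; everything else is assumed.

SESSION_START = {
    "name": "session_start",
    "description": "Begin a work session in the current realm.",
    "inputSchema": {
        "type": "object",
        "properties": {
            # Optional override; assumed to default to the realm detected from cwd.
            "realm": {"type": "string"},
        },
        "required": [],
    },
}

# The merged alternative Ada floated: one tool, with the verb moved into a parameter.
SESSION_MERGED = {
    "name": "session",
    "description": "Start or stop a work session.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "action": {"type": "string", "enum": ["start", "stop"]},
            "realm": {"type": "string"},
        },
        "required": ["action"],
    },
}


def required_params(tool: dict) -> list:
    """Return the parameters a caller must supply for a tool."""
    return list(tool["inputSchema"].get("required", []))
```

With the verb in the tool name, a correct call needs no arguments at all. The merged form adds a required mode parameter that the model has to fill in correctly on every call.
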
### Question 2: Notification Details

**Felix**: Proposal for piggybacking:

1. Every tool response includes `notifications: []` field
2. Daemon marks notifications as "delivered" when returned
3. `notifications_list` shows all (including delivered) with filter options
4. No explicit ack needed - acting on a notification is implicit ack

**Carmen**: What triggers a notification? Contract version bump?

**Felix**: Three triggers:
- Contract updated (version change)
- Contract schema changed (even same version - dangerous)
- Binding added/removed in shared domain

**Grace**: Notifications scoped to domains the current repo participates in. If aperture and fungal share s3-access domain, aperture sees fungal's changes to contracts in that domain only.

**Kim**: How do we test piggybacking? Every tool needs to include the notification check.

**Ada**: Extract to middleware. Every MCP handler calls `check_notifications()` and merges into response.

**Alignment on Q2**: 95% - Piggyback with explicit list, middleware pattern, three trigger types.

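
A minimal sketch of the middleware idea, combining Felix's piggyback steps with Grace's domain scoping. Only `check_notifications()` and the `notifications` response field are named in the dialogue; the `Daemon` and `Notification` classes, the `with_notifications` decorator, and the in-memory store are assumptions made to keep the example self-contained.

```python
import functools
from dataclasses import dataclass, field


@dataclass
class Notification:
    id: int
    domain: str
    message: str
    delivered: bool = False


@dataclass
class Daemon:
    """Stand-in for the daemon's notification store (in-memory, illustration only)."""
    repo_domains: set
    notifications: list = field(default_factory=list)

    def check_notifications(self) -> list:
        # Return undelivered notifications in domains this repo participates in,
        # and mark them delivered so they are not piggybacked again.
        pending = [
            n for n in self.notifications
            if not n.delivered and n.domain in self.repo_domains
        ]
        for n in pending:
            n.delivered = True
        return [{"id": n.id, "domain": n.domain, "message": n.message} for n in pending]


def with_notifications(daemon: Daemon):
    """Decorator that merges a `notifications` field into a handler's response."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(*args, **kwargs):
            response = dict(handler(*args, **kwargs))
            response["notifications"] = daemon.check_notifications()
            return response
        return wrapper
    return decorator


# Example wiring: one in-scope and one out-of-scope notification.
daemon = Daemon(repo_domains={"s3-access"})
daemon.notifications.append(Notification(1, "s3-access", "contract updated"))
daemon.notifications.append(Notification(2, "billing", "binding added"))


@with_notifications(daemon)
def realm_status():
    return {"realm": "example"}
```

Because the check lives in one decorator, Kim's testing concern reduces to testing the decorator once and asserting that each handler is wrapped. Handlers themselves stay unaware of notifications.
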
### Question 3: Single Realm Confirmed

**All**: Consensus. One repo, one realm. The `realm` parameter is optional (defaults to cwd detection) but exists for explicit override in edge cases.

**Ada**: Document clearly: "A repo belongs to one realm. To coordinate across organizational boundaries, create a shared realm."

**Alignment on Q3**: 95% - Single realm constraint, documented clearly.

---

## Round 3: Final Positions

### Resolved Design

**Tool Inventory (8 tools)**:

| Tool | Purpose | Notifications |
|------|---------|---------------|
| `realm_status` | Realm overview | Yes |
| `realm_check` | Validation | Yes |
| `contract_get` | Contract details | Yes |
| `session_start` | Begin session | Yes |
| `session_stop` | End session | No (final) |
| `worktree_create` | Create RFC worktrees | Yes |
| `pr_status` | PR readiness | Yes |
| `notifications_list` | Explicit notification check | N/A |

**Notification Model**:
- Piggybacked on tool responses
- Three triggers: version change, schema change, binding change
- Scoped to shared domains
- Middleware pattern for implementation
- Explicit list for "catch up" workflow

**Realm Constraint**:
- One repo belongs to one realm
- Optional `realm` parameter for explicit override
- Detected from `.blue/config.yaml` by default

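
The realm constraint implies a simple precedence rule for every tool call: an explicit `realm` argument wins, otherwise the detected realm is used. The sketch below illustrates that rule under stated assumptions. The function names are hypothetical, and walking up parent directories to locate `.blue/config.yaml` is a guess at how "cwd detection" might work. Reading the realm name out of the file is left out, since the config format is not specified here.

```python
from pathlib import Path
from typing import Optional


def find_blue_config(start: Path) -> Optional[Path]:
    """Walk from `start` up to the filesystem root looking for .blue/config.yaml.

    Assumption: detection searches parent directories, the way git locates .git.
    """
    for directory in [start, *start.parents]:
        candidate = directory / ".blue" / "config.yaml"
        if candidate.is_file():
            return candidate
    return None


def resolve_realm(explicit: Optional[str], detected: Optional[str]) -> str:
    """Pick the realm for a tool call: explicit override first, then detection."""
    if explicit:
        return explicit
    if detected:
        return detected
    raise ValueError("no realm given and none detected from .blue/config.yaml")
```
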

---

## Round 4: Resolving the Deferred 5%

### Question 4: Notification Persistence

**Carmen (Systems)**: Notifications need a lifecycle. Options:
- A) Session-scoped: live until session ends
- B) Time-based: live for N hours
- C) Ack-based: live until explicitly acknowledged
- D) Hybrid: session OR time, whichever comes first

**Felix (Distributed)**: Session-scoped is problematic. What if I start a session, see a notification, don't act on it, end session, start new session - is it gone? That's data loss.

**James (Observability)**: Notifications are events. Events should be durable. I want to see "what changed in the last week" even if I wasn't in a session.

**Hassan (Product)**: User story: "I was on vacation for a week. I come back, start a session. What changed?" Time-based with reasonable window.

**Grace (Security)**: Notifications contain information about what changed. Long retention = larger attack surface if the daemon DB is compromised. Keep it short.

**Iris (Simplicity)**: 7 days, no ack needed. Old notifications auto-expire. Simple to implement, simple to reason about.

**Ben (DX)**: What about "I've seen this, stop showing me"? Piggyback means I see the same notification every tool call until it expires.

**Ada (API Design)**: Two states: `pending` and `seen`. Piggyback only returns `pending`. First piggyback delivery marks as `seen`. `notifications_list` can show both with filter.

**Felix**: So the lifecycle is:
1. Created (pending) - triggered by contract change
2. Seen - first piggybacked delivery
3. Expired - 7 days after creation

**Kim (Testing)**: That's testable. Clear state machine.

**Elena (Claude)**: Claude sees notification once via piggyback, can ask for history via `notifications_list`. Clean.

**Luna (Docs)**: Easy to document: "Notifications appear once automatically, then move to history. History retained 7 days."

**Alignment on Q4**: 95%
- **Lifecycle**: pending → seen → expired
- **Retention**: 7 days from creation
- **Piggyback**: only pending notifications
- **List**: shows all with state filter

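
The lifecycle agreed above is small enough to sketch as a state function plus two accessors. This is an illustration under assumptions, not the daemon's implementation: the class and function names are hypothetical, state is derived from a `seen` flag and the creation time rather than stored, a notification counts as expired at exactly seven days, and the list view omits expired entries (following Luna's "history retained 7 days").

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

RETENTION = timedelta(days=7)  # retention window agreed in Q4


@dataclass
class Notification:
    id: int
    created_at: datetime
    seen: bool = False

    def state(self, now: datetime) -> str:
        # Expiry is measured from creation and takes precedence over pending/seen.
        if now - self.created_at >= RETENTION:
            return "expired"
        return "seen" if self.seen else "pending"


def piggyback(store: list, now: datetime) -> list:
    """Return pending notifications and mark them seen (first delivery)."""
    pending = [n for n in store if n.state(now) == "pending"]
    for n in pending:
        n.seen = True
    return pending


def notifications_list(store: list, now: datetime, state: Optional[str] = None) -> list:
    """History view: everything not yet expired, optionally filtered by state."""
    live = [n for n in store if n.state(now) != "expired"]
    return [n for n in live if state is None or n.state(now) == state]
```
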

---

### Question 5: Schema Change Detection

**Carmen (Systems)**: JSON Schema diffing is hard. Semantic equivalence is undecidable in general. Options:
- A) Hash comparison (fast, false positives on formatting)
- B) Normalized hash (canonicalize then hash)
- C) Structural diff (expensive, accurate)
- D) Don't detect schema changes, only version changes

**Ada (API Design)**: What's the user need? "Contract schema changed" means "you might need to update your code." Version bump should signal that.

**David (MCP)**: If we require version bump for schema changes, we don't need schema diffing. The version IS the signal.

**Iris (Simplicity)**: I like D. A schema change without a version bump is a bug. Don't build tooling for buggy workflows.

**Grace (Security)**: Counter-point: a malicious or careless actor changes the schema without bumping the version. Consumer code breaks silently. Detection is a safety net.

**Felix (Distributed)**: Schema hash as secondary check. If schema hash changes but version doesn't, that's a warning, not a notification. Different severity.

**Ben (DX)**: So we have:
- Version change → notification (normal)
- Schema change without version change → warning in `realm_check` (smells bad)

**Kim (Testing)**: Normalized hash is deterministic. Canonicalize JSON (sorted keys, no whitespace), SHA-256. Same schema always produces same hash.

**Carmen**: Canonicalization is well-defined for JSON. Use RFC 8785 (JSON Canonicalization Scheme) or similar.

**James (Observability)**: Store schema hash in contract metadata. On load, compute hash, compare. Mismatch = warning. No complex diffing needed.

**Hassan (Product)**: I like the split: version changes are notifications (expected), schema-without-version is a check warning (unexpected, possibly buggy).

**Elena (Claude)**: Clear for Claude too. Notifications are "things happened." Warnings are "something might be wrong."

**Alignment on Q5**: 95%
- **Version change**: notification (normal workflow)
- **Schema change without version**: warning in `realm_check` (smells bad)
- **Detection method**: canonical JSON hash (RFC 8785 style)
- **Storage**: hash stored in contract, computed on load, compared

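
A sketch of the hash check and the notification/warning split, assuming Kim's description (sorted keys, no whitespace, SHA-256). Note that `json.dumps` with sorted keys only approximates RFC 8785: it does not apply that spec's number serialization or key-ordering rules, so a real implementation would likely need a dedicated canonicalizer. The function names and the `"unchanged"` outcome are hypothetical.

```python
import hashlib
import json


def schema_hash(schema: dict) -> str:
    """SHA-256 over a canonical-style JSON encoding (sorted keys, no whitespace).

    Approximates RFC 8785; does not implement its number or key-ordering rules.
    """
    canonical = json.dumps(
        schema, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def classify_change(old_version: str, old_hash: str,
                    new_version: str, new_schema: dict) -> str:
    """Map a contract update onto the Q5 outcome."""
    if new_version != old_version:
        return "notification"  # normal workflow: the version bump is the signal
    if schema_hash(new_schema) != old_hash:
        return "warning"       # schema drifted without a version bump
    return "unchanged"
```
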

---

### Question 6: Worktree Tool Scope

**Hassan (Product)**: User stories:
1. "I'm starting RFC work, set up worktrees for all repos in my realm"
2. "I only need to touch aperture and fungal for this RFC, not the others"
3. "I'm in aperture, create a worktree just for this repo"

**Ben (DX)**: Default should be "smart" - create worktrees for repos in domains I participate in, not all repos in realm.

**Ada (API Design)**: Parameters:
- `rfc` (required): branch name
- `repos` (optional): specific list, default = domain peers

**Felix (Distributed)**: "Domain peers" = repos that share at least one domain with current repo. If aperture and fungal share s3-access, they're peers.

**Iris (Simplicity)**: What if I just want current repo? That's the simplest case.

**Luna (Docs)**: Three modes:
1. `worktree_create(rfc="x")` → domain peers (smart default)
2. `worktree_create(rfc="x", repos=["a","b"])` → specific list
3. `worktree_create(rfc="x", repos=["self"])` → just current repo

**Kim (Testing)**: "self" is a magic value. I'd prefer explicit: `repos=["aperture"]` where aperture is current repo.

**Elena (Claude)**: Claude can figure out current repo name from context. Magic values are confusing for LLMs.

**Ada**: Revised:
- `repos` omitted → domain peers
- `repos=[]` (empty) → error, must specify something
- `repos=["aperture"]` → just aperture

**Ben**: What if a repo has no domain peers? Solo repo in realm.

**Felix**: Then domain peers = empty = just self. Natural fallback.

**Carmen**: Edge case: repo in multiple domains with different peer sets. Union of all peers?

**Grace**: Union. If you share any domain, you might need to coordinate.

**James (Observability)**: Log which repos were selected and why. "Creating worktrees for domain peers: aperture, fungal (shared domain: s3-access)"

**Alignment on Q6**: 95%
- **Default**: domain peers (repos sharing at least one domain)
- **Explicit**: `repos` parameter for specific list
- **Solo repo**: defaults to just self
- **Multiple domains**: union of all peers
- **Logging**: explain selection reasoning

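
The selection rules above can be sketched as a single function. This is a hypothetical illustration: `select_repos`, the `domains` mapping (domain name to participating repos), and the reason strings are assumptions. Following James's example log line, the default selection includes the current repo alongside its peers.

```python
from typing import Dict, List, Optional, Set, Tuple


def select_repos(current: str,
                 domains: Dict[str, Set[str]],
                 repos: Optional[List[str]] = None) -> Tuple[List[str], str]:
    """Choose repos for worktree creation and explain the choice.

    Returns (sorted repo names, human-readable reason for logging).
    """
    if repos is not None:
        if not repos:
            # An explicit empty list is rejected rather than treated as "default".
            raise ValueError("repos must not be empty; omit it to use domain peers")
        return sorted(repos), "explicit repos parameter"

    # Domains the current repo shares with at least one other repo.
    shared = {
        name: members for name, members in domains.items()
        if current in members and len(members) > 1
    }
    # Union of peers across all shared domains.
    peers = set().union(*shared.values()) - {current}
    if not peers:
        return [current], "no domain peers; defaulting to current repo"

    selected = sorted(peers | {current})
    reason = "domain peers (shared domains: " + ", ".join(sorted(shared)) + ")"
    return selected, reason
```
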

---

## Remaining 2%: Truly Deferred

1. **Notification aggregation** - If contract changes 5 times in an hour, 5 notifications or 1? (Decide during implementation based on UX testing)

---

## Final Alignment: 98%

**Consensus reached on**:

### Core Design (Rounds 1-3)
- 8 focused tools mapping to user workflows
- Piggyback notifications with explicit list fallback
- Single realm constraint with documented rationale

### Notification Persistence (Round 4, Question 4)
- Lifecycle: pending → seen → expired
- Retention: 7 days from creation
- Piggyback delivers pending only, marks as seen
- List tool shows all with state filter

### Schema Change Detection (Round 4, Question 5)
- Version changes → notifications (normal workflow)
- Schema-without-version → `realm_check` warning (smells bad)
- Detection via canonical JSON hash (RFC 8785 style)

### Worktree Scope (Round 4, Question 6)
- Default: domain peers (repos sharing domains with current repo)
- Explicit: `repos` parameter overrides default
- Solo repos default to self
- Multiple domains: union of all peers
- Log selection reasoning for transparency

### Truly Deferred (2%)
- Notification aggregation (rapid changes: batch or individual?)

**Panel Sign-off**:
- Ada ✓, Ben ✓, Carmen ✓, David ✓, Elena ✓, Felix ✓
- Grace ✓, Hassan ✓, Iris ✓, James ✓, Kim ✓, Luna ✓