Spike: Blue Plugin Architecture & Alignment Protection
| Status | Complete |
| Date | 2026-01-26 |
| Time-box | 1 hour |
Question
Can Blue be packaged as a Claude Code plugin? What components benefit from plugin structure? Can the alignment dialogue system be encrypted to prevent leaking game mechanics to end users?
Investigation
Plugin Structure
Claude Code plugins are directories with a .claude-plugin/plugin.json manifest. They can bundle:
- Subagents (agents/ directory) -- markdown files defining custom agents
- Skills (skills/ directory) -- slash commands with SKILL.md
- Hooks (hooks/hooks.json) -- event handlers for PreToolUse, PostToolUse, SessionStart, etc.
- MCP servers (.mcp.json) -- bundled MCP server configs with ${CLAUDE_PLUGIN_ROOT} paths
- LSP servers (.lsp.json) -- language server configs
Plugins are namespaced. Blue's commands would appear as /blue:status, /blue:next, etc.
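The namespacing rule can be sketched as a one-line helper. This is only an illustration: the `/<plugin>:<skill>` shape is inferred from the two examples above, and `namespaced_command` is a hypothetical name, not part of Blue or Claude Code.

```rust
/// Build the slash command a plugin skill would be exposed under.
/// Assumption: the `/<plugin>:<skill>` shape is inferred from the
/// `/blue:status` and `/blue:next` examples; this helper is hypothetical.
fn namespaced_command(plugin: &str, skill: &str) -> String {
    format!("/{}:{}", plugin, skill)
}
```

For example, `namespaced_command("blue", "rfc")` yields the `/blue:rfc` shortcut.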
Plugin Distribution
- Installed via CLI: claude plugin install blue@marketplace-name
- Scopes: user, project, local, managed
- Distributed via git-based marketplaces
- Plugins are COPIED to a cache directory (not used in-place)
Encryption / Protection: Not Possible
There is no encryption, obfuscation, or binary packaging for plugin files. Plugins are plain markdown and JSON files copied to a cache directory. Any user with filesystem access can read every file.
However, a layered defense approach exists:
- Compiled MCP binary (already protected): The Judge protocol, prompt templates, expert panel tier logic, and scoring formulas are all in crates/blue-mcp/src/handlers/dialogue.rs -- compiled Rust. Users cannot read this without reverse engineering the binary.
- Thin subagent definition (minimal exposure): The alignment-expert.md can be a thin wrapper -- just enough for Claude Code to know the agent's purpose and tool restrictions. The real behavioral instructions (collaborative tone, markers, output limits) can be injected at runtime via the MCP tool response from blue_dialogue_create.
- MCP tool response injection (RFC 0023): The Judge protocol is already injected as prose in the message field of the blue_dialogue_create response. The subagent prompt template is also injected there. This means the plugin's agent file can be minimal -- the intelligence stays in the compiled binary.
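The injection described in the last item might look roughly like the sketch below. Only the `message` field and the `blue_dialogue_create` tool name come from this spike. The struct, the function names, the `dialogue_id` field, and the placeholder prompt text are assumptions made for illustration, not the actual contents of dialogue.rs.

```rust
/// Hypothetical response shape for the `blue_dialogue_create` tool.
/// Only the `message` field is taken from the spike; the rest is assumed.
struct DialogueCreateResponse {
    dialogue_id: String,
    /// Prose handed to the Judge: protocol plus subagent prompt template.
    message: String,
}

// Placeholder text. The real protocol and template exist only in the binary.
const JUDGE_PROTOCOL: &str = "<judge protocol prose>";
const SUBAGENT_PROMPT_TEMPLATE: &str =
    "You are {expert}, a {tier} expert. Topic: {topic}.";

/// Server side: assemble the response that carries the behavioral prompt.
/// The topic is filled in here; per-expert slots are left for the Judge.
fn create_dialogue(dialogue_id: &str, topic: &str) -> DialogueCreateResponse {
    let template = SUBAGENT_PROMPT_TEMPLATE.replace("{topic}", topic);
    let message = format!(
        "{}\n\nSubagent prompt template:\n{}",
        JUDGE_PROTOCOL, template
    );
    DialogueCreateResponse {
        dialogue_id: dialogue_id.to_string(),
        message,
    }
}

/// Judge side: fill the per-expert slots before passing the prompt on.
fn render_subagent_prompt(template: &str, expert: &str, tier: &str) -> String {
    template.replace("{expert}", expert).replace("{tier}", tier)
}
```

In this sketch the plugin's agent file never contains the template. The template reaches the Judge only through the tool response, which is the property the thin-wrapper approach relies on.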
What Benefits from Plugin Structure
| Component | Current Location | Plugin Benefit |
|---|---|---|
| alignment-expert subagent | ~/.claude/agents/ + .claude/agents/ | Single install, version-controlled, namespaced |
| Blue voice & ADR context | MCP instructions field | SessionStart hook could inject additional context |
| /status, /next commands | MCP tools only | Skill shortcuts: /blue:status, /blue:next |
| Blue MCP server | Manually configured per-repo .mcp.json | Auto-configured via plugin .mcp.json |
| Dialogue lint/validation | MCP tool only | PostToolUse hook could auto-lint after dialogue edits |
| RFC/spike creation | MCP tool only | Skill shortcuts: /blue:rfc, /blue:spike |
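To make the "Blue MCP server" row concrete, here is a minimal sketch that renders the text of a plugin-bundled .mcp.json. The `${CLAUDE_PLUGIN_ROOT}` variable comes from the investigation above. The `mcpServers` layout is assumed to match the usual Claude Code .mcp.json shape, and `render_mcp_config`, the binary path, and the argument in the usage note are hypothetical.

```rust
/// Render the text of a plugin-bundled `.mcp.json` for one MCP server.
/// Assumptions: the `mcpServers` layout follows the usual Claude Code
/// `.mcp.json` shape, and the inputs need no JSON escaping.
fn render_mcp_config(server_name: &str, binary_rel_path: &str, args: &[&str]) -> String {
    // Quote each argument as a JSON string.
    let quoted_args: Vec<String> = args.iter().map(|a| format!("\"{}\"", a)).collect();
    // `{{` and `}}` are literal braces, so `${{CLAUDE_PLUGIN_ROOT}}` is
    // emitted verbatim as `${CLAUDE_PLUGIN_ROOT}` for Claude Code to expand.
    format!(
        "{{\n  \"mcpServers\": {{\n    \"{name}\": {{\n      \"command\": \"${{CLAUDE_PLUGIN_ROOT}}/{path}\",\n      \"args\": [{args}]\n    }}\n  }}\n}}\n",
        name = server_name,
        path = binary_rel_path,
        args = quoted_args.join(", ")
    )
}
```

Calling `render_mcp_config("blue", "bin/blue", &["mcp"])` produces a config whose command resolves relative to the plugin's cache directory, so no per-repo path needs to be written by hand. The `bin/blue` path and the `mcp` argument are placeholders.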
Protection Strategy: Thin Agent + Fat MCP
Instead of trying to encrypt plugin files (impossible), split the system into two layers.
In the plugin (visible to users):
# alignment-expert.md
---
name: alignment-expert
description: Expert agent for alignment dialogues
tools: Read, Grep, Glob
model: sonnet
---
You are an expert participant in an alignment dialogue.
Follow the instructions provided in your prompt exactly.
In the compiled binary (invisible to users):
- Full collaborative tone (SURFACE, DEFEND, CHALLENGE, INTEGRATE, CONCEDE)
- Marker format ([PERSPECTIVE Pnn:], [TENSION Tn:], etc.)
- Output limits (400 words, 2000 chars)
- Expert panel tiers (Core/Adjacent/Wildcard)
- Scoring formula (Wisdom + Consistency + Truth + Relationships)
- Judge orchestration protocol
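As an illustration of the kind of rule that stays server-side, the output limits above could be enforced with a check like this. The sketch assumes both limits apply together, that words are whitespace-separated, and that "chars" means Unicode scalar values. The spike does not state any of these, and the names are hypothetical.

```rust
// Limits quoted from the list above.
const MAX_WORDS: usize = 400;
const MAX_CHARS: usize = 2000;

/// Return true when an expert reply fits both output limits.
/// Assumptions: both limits must hold, words are whitespace-separated,
/// and characters are counted as Unicode scalar values.
fn within_output_limits(text: &str) -> bool {
    let words = text.split_whitespace().count();
    let chars = text.chars().count();
    words <= MAX_WORDS && chars <= MAX_CHARS
}
```

Because the check lives in the MCP server, the thresholds never need to appear in the plugin's agent file.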
The MCP tool response from blue_dialogue_create injects the full behavioral prompt into the Judge's context, which then passes it to each subagent via the Task call's prompt parameter. The plugin agent file tells Claude Code "this is a sonnet-model agent with Read/Grep/Glob tools" -- no game mechanics leaked.
Plugin Directory Structure
blue-plugin/
├── .claude-plugin/
│ └── plugin.json
├── agents/
│ └── alignment-expert.md (thin: just tool/model config)
├── skills/
│ ├── status/
│ │ └── SKILL.md (/blue:status)
│ ├── next/
│ │ └── SKILL.md (/blue:next)
│ ├── rfc/
│ │ └── SKILL.md (/blue:rfc)
│ └── spike/
│ └── SKILL.md (/blue:spike)
├── hooks/
│ └── hooks.json (SessionStart, PostToolUse)
├── .mcp.json (blue MCP server auto-config)
└── README.md
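A small layout check can confirm that a checkout matches the tree above before it is published. The file list is copied from the tree. `missing_files` is a hypothetical helper, not an existing Blue command.

```rust
use std::path::Path;

/// Files from the directory tree above, relative to the plugin root.
const EXPECTED_FILES: [&str; 9] = [
    ".claude-plugin/plugin.json",
    "agents/alignment-expert.md",
    "skills/status/SKILL.md",
    "skills/next/SKILL.md",
    "skills/rfc/SKILL.md",
    "skills/spike/SKILL.md",
    "hooks/hooks.json",
    ".mcp.json",
    "README.md",
];

/// Return the expected files that do not exist under `plugin_root`.
fn missing_files(plugin_root: &Path) -> Vec<&'static str> {
    EXPECTED_FILES
        .iter()
        .copied()
        .filter(|rel| !plugin_root.join(*rel).is_file())
        .collect()
}
```

An empty result from `missing_files(Path::new("blue-plugin"))` means every file in the proposed layout is present. It does not validate the contents of any file.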
Findings
| Question | Answer |
|---|---|
| Can Blue be a Claude Code plugin? | Yes. The plugin manifest supports all needed components: subagents, skills, hooks, and MCP server config. |
| What benefits from plugin structure? | Subagent distribution, skill shortcuts, auto MCP config, and lifecycle hooks. The biggest wins are eliminating manual .mcp.json setup and providing namespaced /blue:* commands. |
| Can alignment dialogues be encrypted? | No. Plugin files are plain text with no encryption or binary packaging support. |
| Is alignment still protectable? | Yes. The Thin Agent + Fat MCP strategy keeps all game mechanics in compiled Rust. The plugin agent file is a minimal shell; the real behavioral prompt is injected at runtime via MCP tool responses. |
Recommendation
Build Blue as a Claude Code plugin using the Thin Agent + Fat MCP strategy.
- Package as plugin: Create blue-plugin/ with manifest, thin subagent, skill shortcuts, hooks, and bundled .mcp.json.
- Keep alignment in compiled binary: All dialogue mechanics (Judge protocol, scoring, expert tiers, markers, output limits) stay in crates/blue-mcp/src/handlers/dialogue.rs. The plugin agent file contains only tool/model config.
- Use runtime injection: blue_dialogue_create continues to inject the full behavioral prompt via its MCP response. No game mechanics appear in any plugin file.
- Add lifecycle hooks: SessionStart for Blue voice injection. PostToolUse for auto-linting dialogue edits.
This gives Blue single-command installation, version-controlled distribution, and namespaced commands -- without exposing alignment internals.
Outcome
- Draft an RFC for the plugin architecture (structure, manifest, skill definitions, hook behavior)
- Implement the plugin directory as a new top-level blue-plugin/ path in the workspace
- Migrate the alignment-expert.md subagent from manual placement to plugin-bundled thin agent
- Test distribution via claude plugin install from a git marketplace