# RFC 0004: ADR Adherence
| | |
|---|---|
| **Status** | Implemented |
| **Date** | 2026-01-24 |
| **Source Spike** | adr-adherence |
| **ADRs** | 0004 (Evidence), 0007 (Integrity), 0008 (Honor) |
## Summary

There is currently no mechanism to surface relevant ADRs during work, to track ADR citations, or to verify adherence to testable architectural decisions.
## Philosophy

Guide, don't block. ADRs are beliefs, not bureaucracy. Blue should:

- Help you find relevant ADRs
- Make citing ADRs easy
- Verify testable ADRs optionally
- Never require ADR approval to proceed
## Proposal

### Layer 1: Awareness (Passive)

#### blue_adr_list

List all ADRs with summaries:

```
blue_adr_list
```

Returns:

```json
{
  "adrs": [
    { "number": 0, "title": "Never Give Up", "summary": "The only rule we need" },
    { "number": 4, "title": "Evidence", "summary": "Show, don't tell" },
    ...
  ]
}
```

#### blue_adr_get

Get full ADR content:

```
blue_adr_get number=4
```

Returns the ADR's markdown and metadata.
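A minimal sketch of the data shape behind these two tools; the field names mirror the JSON keys above, while the struct name and lookup helper are hypothetical, not part of Blue's actual API:

```rust
// Illustrative only: AdrSummary mirrors one entry of the blue_adr_list
// response; the lookup models what blue_adr_get number=4 would do.
#[derive(Debug, Clone, PartialEq)]
struct AdrSummary {
    number: u32,
    title: String,
    summary: String,
}

fn find_adr(adrs: &[AdrSummary], number: u32) -> Option<&AdrSummary> {
    adrs.iter().find(|a| a.number == number)
}

fn main() {
    let adrs = vec![
        AdrSummary { number: 0, title: "Never Give Up".into(), summary: "The only rule we need".into() },
        AdrSummary { number: 4, title: "Evidence".into(), summary: "Show, don't tell".into() },
    ];
    // Lookup by number, as blue_adr_get would do.
    let found = find_adr(&adrs, 4).unwrap();
    assert_eq!(found.title, "Evidence");
    println!("ok");
}
```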
### Layer 2: Contextual Relevance (Active)

#### blue_adr_relevant

Given context, use AI to suggest relevant ADRs:

```
blue_adr_relevant context="testing strategy"
```

Returns:

```json
{
  "relevant": [
    {
      "number": 4,
      "title": "Evidence",
      "confidence": 0.95,
      "why": "Testing is the primary form of evidence that code works. This ADR's core principle 'show, don't tell' directly applies to test strategy decisions."
    },
    {
      "number": 7,
      "title": "Integrity",
      "confidence": 0.82,
      "why": "Tests verify structural wholeness - that the system holds together under various conditions."
    }
  ]
}
```
**AI-Powered Relevance:**

Keyword matching fails for philosophical ADRs. "Courage" won't match "deleting legacy code" even though ADR 0009 is highly relevant.

The AI evaluator:

- Receives the full context (RFC title, problem, code diff, etc.)
- Reads all ADR content (cached in the prompt)
- Determines semantic relevance with reasoning
- Returns confidence scores and explanations
**Prompt Structure:**

```
You are evaluating which ADRs are relevant to this work.

Context: {user_context}

ADRs:
{all_adr_summaries}

For each ADR, determine:
1. Is it relevant? (yes/no)
2. Confidence (0.0-1.0)
3. Why is it relevant? (1-2 sentences)

Only return ADRs with confidence > 0.7.
```
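The post-processing of the evaluator's output can be sketched as follows. `AdrSuggestion` and `filter_relevant` are illustrative names, not Blue's actual types; the sketch only shows the thresholding and ordering implied by the prompt:

```rust
// Illustrative only: keep suggestions above the 0.7 confidence
// threshold, highest confidence first.
#[derive(Debug)]
struct AdrSuggestion {
    number: u32,
    confidence: f64,
}

fn filter_relevant(mut suggestions: Vec<AdrSuggestion>) -> Vec<AdrSuggestion> {
    // Drop anything at or below the threshold from the prompt.
    suggestions.retain(|s| s.confidence > 0.7);
    // Present the strongest matches first.
    suggestions.sort_by(|a, b| b.confidence.partial_cmp(&a.confidence).unwrap());
    suggestions
}

fn main() {
    let raw = vec![
        AdrSuggestion { number: 7, confidence: 0.82 },
        AdrSuggestion { number: 2, confidence: 0.40 },
        AdrSuggestion { number: 4, confidence: 0.95 },
    ];
    let kept = filter_relevant(raw);
    let numbers: Vec<u32> = kept.iter().map(|s| s.number).collect();
    assert_eq!(numbers, vec![4, 7]); // 0.95 first, 0.82 second, 0.40 dropped
    println!("ok");
}
```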
**Model Selection:**

- Use a fast, cheap model (Haiku) for relevance checks
- Results are suggestions, not authoritative
- The user can override or ignore them
**Graceful Degradation:**

| Condition | Behavior |
|---|---|
| API key configured, API up | AI relevance (default) |
| API key configured, API down | Fallback to keywords + warning |
| No API key | Keywords only (no warning) |
| `--no-ai` flag | Keywords only (explicit) |
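The degradation rules above can be sketched as a small decision function. The enum and function names here are hypothetical, chosen to mirror the table rather than Blue's actual code:

```rust
// Illustrative only: method selection mirroring the degradation table.
#[derive(Debug, PartialEq)]
enum RelevanceMethod {
    Ai,
    KeywordWithWarning,
    Keyword,
}

fn choose_method(api_key_configured: bool, api_up: bool, no_ai_flag: bool) -> RelevanceMethod {
    if no_ai_flag || !api_key_configured {
        // Explicit opt-out, or no key at all: keywords, no warning.
        RelevanceMethod::Keyword
    } else if api_up {
        RelevanceMethod::Ai
    } else {
        // Key configured but API down: fall back and warn.
        RelevanceMethod::KeywordWithWarning
    }
}

fn main() {
    assert_eq!(choose_method(true, true, false), RelevanceMethod::Ai);
    assert_eq!(choose_method(true, false, false), RelevanceMethod::KeywordWithWarning);
    assert_eq!(choose_method(false, true, false), RelevanceMethod::Keyword);
    assert_eq!(choose_method(true, true, true), RelevanceMethod::Keyword);
    println!("ok");
}
```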
**Response Metadata:**

```json
{
  "method": "ai", // or "keyword"
  "cached": false,
  "latency_ms": 287,
  "relevant": [...]
}
```
**Privacy:**

- Only the context string is sent to the API (not code, not files)
- No PII should be in the context string
- The user controls what context to send
### RFC ADR Suggestions

When creating an RFC, Blue suggests relevant ADRs based on its title and problem statement:

```
blue_rfc_create title="testing-framework" ...
→ "Consider these ADRs: 0004 (Evidence), 0010 (No Dead Code)"
```
### ADR Citations in Documents

RFCs can cite ADRs in frontmatter:

```
| **ADRs** | 0004, 0007, 0010 |
```

Or inline:

```
Per ADR 0004 (Evidence), we require test coverage > 80%.
```
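Parsing the frontmatter row into ADR numbers could look something like the sketch below; `parse_adr_row` is a hypothetical helper, not part of Blue's actual citation parser:

```rust
// Illustrative only: extract ADR numbers from a frontmatter row
// like "| **ADRs** | 0004, 0007, 0010 |".
fn parse_adr_row(row: &str) -> Vec<u32> {
    row.trim()
        .trim_matches('|')
        .replace("**ADRs**", "")
        // Split on any non-digit, so commas, pipes, and spaces all delimit.
        .split(|c: char| !c.is_ascii_digit())
        .filter(|s| !s.is_empty())
        .filter_map(|s| s.parse().ok())
        .collect()
}

fn main() {
    assert_eq!(parse_adr_row("| **ADRs** | 0004, 0007, 0010 |"), vec![4, 7, 10]);
    assert_eq!(parse_adr_row("| **ADRs** | |"), Vec::<u32>::new());
    println!("ok");
}
```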
### Layer 3: Lightweight Verification (Optional)

#### blue_adr_audit

Scan for potential ADR violations. Only testable ADRs are checked:

```
blue_adr_audit
```

Returns:

```json
{
  "findings": [
    {
      "adr": 10,
      "title": "No Dead Code",
      "type": "warning",
      "message": "3 unused exports in src/utils.rs",
      "locations": ["src/utils.rs:45", "src/utils.rs:67", "src/utils.rs:89"]
    },
    {
      "adr": 4,
      "title": "Evidence",
      "type": "info",
      "message": "Test coverage at 72% (threshold: 80%)"
    }
  ],
  "passed": [
    { "adr": 5, "title": "Single Source", "message": "No duplicate definitions found" }
  ]
}
```
**Testable ADRs:**

| ADR | Check |
|---|---|
| 0004 Evidence | Test coverage, assertion ratios |
| 0005 Single Source | Duplicate definitions, copy-paste detection |
| 0010 No Dead Code | Unused exports, unreachable branches |

**Non-testable ADRs (human judgment):**

| ADR | Guidance |
|---|---|
| 0001 Purpose | Does this serve meaning? |
| 0002 Presence | Are we actually here? |
| 0009 Courage | Are we acting rightly? |
| 0013 Overflow | Are we building from fullness? |
### Layer 4: Documentation Trail

#### ADR-Document Links

Store citations in the `document_links` table:

```sql
INSERT INTO document_links (source_id, target_id, link_type)
VALUES (rfc_id, adr_doc_id, 'cites_adr');
```

#### Search by ADR

```
blue_search query="adr:0004"
```

Returns all documents citing ADR 0004.
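Recognizing the `adr:` query prefix is a one-liner; the sketch below is illustrative and `parse_adr_query` is a hypothetical name, not Blue's actual search code:

```rust
// Illustrative only: detect an "adr:NNNN" query and extract the number;
// any other query returns None and falls through to normal search.
fn parse_adr_query(query: &str) -> Option<u32> {
    query.strip_prefix("adr:")?.parse().ok()
}

fn main() {
    assert_eq!(parse_adr_query("adr:0004"), Some(4));
    assert_eq!(parse_adr_query("testing strategy"), None);
    println!("ok");
}
```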
#### ADR "Referenced By"

```
blue_adr_get number=4
```

Includes:

```json
{
  "referenced_by": [
    { "type": "rfc", "title": "testing-framework", "date": "2026-01-20" },
    { "type": "decision", "title": "require-integration-tests", "date": "2026-01-15" }
  ]
}
```
### ADR Metadata Enhancement

Add to each ADR:

```markdown
## Applies When
- Writing or modifying tests
- Reviewing pull requests
- Evaluating technical claims

## Anti-Patterns
- Claiming code works without tests
- Trusting documentation over running code
- Accepting "it works on my machine"
```

This gives the AI richer context for relevance matching. Anti-patterns are particularly useful: they help identify when work might be violating an ADR.
## Implementation

- Add ADR document type and loader
- Implement `blue_adr_list` and `blue_adr_get`
- Implement the AI relevance evaluator:
  - Load all ADRs into prompt context
  - Send context + ADRs to the LLM (Haiku for speed/cost)
  - Parse the structured response with confidence scores
  - Cache ADR summaries to minimize token usage
- Implement `blue_adr_relevant` using the AI evaluator
- Add ADR citation parsing to RFC creation
- Implement `blue_adr_audit` for testable ADRs
- Add "referenced_by" to ADR responses
- Extend `blue_search` for ADR queries
**AI Integration Notes:**

- The Blue MCP server needs LLM access (API key in `.blue/config.yaml`)
- Use streaming for responsiveness
- Fall back to keyword matching if the AI is unavailable
- Cache relevance results per context hash (5 min TTL)
**Caching Strategy:**

```sql
CREATE TABLE adr_relevance_cache (
  context_hash TEXT PRIMARY KEY,
  adr_versions_hash TEXT, -- Invalidate if ADRs change
  result_json TEXT,
  created_at TEXT,
  expires_at TEXT
);
```
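How the cache key and expiry could be computed can be sketched as below. Hashing with std's `DefaultHasher` is an illustrative stand-in (not necessarily the hash Blue uses, and not stable across Rust versions, so a real implementation would pick a fixed algorithm):

```rust
// Illustrative only: derive a cache key from the context string and
// check the 5-minute TTL from the table above.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::time::{Duration, SystemTime};

const TTL: Duration = Duration::from_secs(5 * 60); // 5 min TTL

fn context_hash(context: &str) -> String {
    let mut h = DefaultHasher::new();
    context.hash(&mut h);
    format!("{:016x}", h.finish())
}

fn is_expired(created_at: SystemTime, now: SystemTime) -> bool {
    now.duration_since(created_at)
        .map(|age| age > TTL)
        .unwrap_or(false) // clock skew: treat "created in the future" as fresh
}

fn main() {
    // Same context → same key; different context → different key.
    assert_eq!(context_hash("testing strategy"), context_hash("testing strategy"));
    assert_ne!(context_hash("testing strategy"), context_hash("deleting old code"));

    let created = SystemTime::now();
    assert!(!is_expired(created, created + Duration::from_secs(60)));
    assert!(is_expired(created, created + Duration::from_secs(600)));
    println!("ok");
}
```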
**Testing AI Relevance:**

- Golden test cases with expected ADRs (fuzzy match)
- Confidence thresholds: 0004 should score > 0.8 for "testing"
- Mock AI responses in unit tests
- Integration tests hit the real API (rate limited)
## Test Plan

- List all ADRs returns correct count and summaries
- Get specific ADR returns full content
- AI relevance: "testing" context suggests 0004 (Evidence)
- AI relevance: "deleting old code" suggests 0009 (Courage), 0010 (No Dead Code)
- AI relevance: confidence scores are reasonable (0.7-1.0 range)
- AI relevance: explanations are coherent
- Fallback: keyword matching works when AI is unavailable
- RFC with `| **ADRs** | 0004 |` creates a document link
- Search `adr:0004` finds citing documents
- Audit detects unused exports (ADR 0010)
- Audit reports test coverage (ADR 0004)
- Non-testable ADRs are not included in audit findings
- Caching: repeating the same context uses the cached result
- Cache invalidation: ADR content change clears relevant cache entries
- `--no-ai` flag forces keyword matching
- Response includes method (ai/keyword), cached, latency
- Graceful degradation when API is unavailable
## FAQ

**Q: Will this block my PRs?**
A: No. All ADR features are informational. Nothing blocks.

**Q: Do I have to cite ADRs in every RFC?**
A: No. Citations are optional but encouraged for significant decisions.

**Q: What if I disagree with an ADR?**
A: ADRs can be superseded. Create a new ADR documenting why.

**Q: How do I add a new ADR?**
A: `blue_adr_create` (future work) or manually add to `docs/adrs/`.

**Q: Why use AI for relevance instead of keywords?**
A: Keywords fail for philosophical ADRs. "Courage" won't match "deleting legacy code" but ADR 0009 is highly relevant. AI understands semantic meaning.

**Q: What if I don't have an API key configured?**
A: It falls back to keyword matching. Less accurate but still functional.

**Q: How much does the AI relevance check cost?**
A: It uses Haiku (~$0.00025 per check). Results are cached for 5 minutes per unique context.
> "The beliefs that guide us, made visible."
> — Blue