blue/.blue/docs/spikes/2026-01-30T1503Z-blue-mcp-server-on-superviber-infrastructure.wip.md
Eric Garcia 02901dfec7 chore: batch commit - ADRs, RFCs, dialogues, spikes, and code updates
2026-01-30 16:28:31 -05:00


Spike: Blue MCP Server on Superviber Infrastructure

| | |
|---|---|
| Status | In Progress |
| Date | 2026-01-30 |
| Time Box | 1 hour |

Question

How can we run the Blue MCP server on Superviber infrastructure while maintaining client data sovereignty, encryption, and revocable access?


Context

The current architecture (Appendix A of the financial portfolio doc) assumes Blue MCP runs in the client's AWS account. However, running MCP on Superviber infrastructure offers benefits:

  • Simpler client onboarding: No deployment required in client account
  • Centralized updates: Push new features without client coordination
  • Operational visibility: Better observability and debugging
  • Cost efficiency: Shared infrastructure across clients

The challenge: maintain data sovereignty guarantees while centralizing compute.


Architecture Options

Option A: Proxy Model with Client Data Store

MCP server on Superviber infra acts as stateless compute. All persistent data remains on client infrastructure, accessed via secure API.

flowchart TB
    subgraph CLIENT["Client AWS Account"]
        direction TB
        DS[("Data Store<br/>(S3/DynamoDB)")]
        KMS["Client KMS"]
        API["Client API Gateway"]
        DS --- KMS
        API --- DS
    end

    subgraph SV["Superviber Infrastructure"]
        direction TB
        MCP["Blue MCP Server"]
        INF["Infisical<br/>(Secrets)"]
        MCP --- INF
    end

    subgraph CLAUDE["Claude Code"]
        CC["User Session"]
    end

    CC -->|"MCP Protocol<br/>(TLS 1.3)"| MCP
    MCP -->|"Cross-Account<br/>AssumeRole"| API

    style CLIENT fill:#e8f5e9
    style SV fill:#e3f2fd
    style CLAUDE fill:#fff3e0

Data Flow:

  1. Claude Code connects to Blue MCP on Superviber infra
  2. MCP assumes cross-account role to access client API
  3. Client API reads/writes to encrypted data store
  4. Data is encrypted under the client's KMS key; MCP never sees plaintext keys
  5. MCP processes in memory, never persists client data

Option B: AWS PrivateLink

AWS PrivateLink provides private connectivity between the Superviber VPC and the client VPC without traversing the public internet.

flowchart LR
    subgraph CLIENT["Client VPC"]
        direction TB
        DS[("Encrypted<br/>Data Store")]
        EP["VPC Endpoint<br/>(PrivateLink)"]
        DS --- EP
    end

    subgraph SV["Superviber VPC"]
        direction TB
        MCP["Blue MCP"]
        NLB["Network Load<br/>Balancer"]
        ES["Endpoint<br/>Service"]
        MCP --- NLB --- ES
    end

    EP <-->|"Private<br/>Connection"| ES

    style CLIENT fill:#e8f5e9
    style SV fill:#e3f2fd

Pros: Traffic never leaves the AWS backbone, lower latency

Cons: More complex setup, per-client PrivateLink costs

Option C: Hybrid with Edge Cache

MCP runs on Superviber with optional edge caching for read-heavy ADR/RFC data.

flowchart TB
    subgraph CLIENT["Client Account"]
        DS[("Source of Truth<br/>(Encrypted)")]
        HOOK["Webhook<br/>on Change"]
    end

    subgraph SV["Superviber"]
        direction TB
        MCP["Blue MCP"]
        CACHE[("Edge Cache<br/>(Ephemeral)")]
        MCP --- CACHE
    end

    DS -->|"Sync on<br/>Change"| HOOK
    HOOK -->|"Invalidate"| CACHE
    MCP <-->|"Read/Write"| DS

    style CLIENT fill:#e8f5e9
    style SV fill:#e3f2fd

Pros: Better performance for read-heavy workloads

Cons: Cache adds complexity, eventual consistency
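The invalidation path in Option C can be sketched as a small in-memory cache with a time-to-live and an explicit invalidation hook. This is a minimal sketch under stated assumptions, not Blue's implementation: `EphemeralCache`, `read_through`, `handle_change_webhook`, the webhook payload shape, and the 60-second default TTL are all hypothetical.

```python
import time
from typing import Callable, Dict, Mapping, Optional, Tuple


class EphemeralCache:
    """In-memory read cache; entries expire after ttl_seconds or on invalidation."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def get(self, key: str) -> Optional[bytes]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            # Expired: drop the entry so stale data is never served.
            del self._entries[key]
            return None
        return value

    def put(self, key: str, value: bytes) -> None:
        self._entries[key] = (self._clock(), value)

    def invalidate(self, key: str) -> bool:
        """Remove one entry; returns True if something was cached."""
        return self._entries.pop(key, None) is not None


def read_through(cache: EphemeralCache, key: str,
                 fetch_from_client: Callable[[str], bytes]) -> bytes:
    """Serve from cache when possible, else fetch from the client's source of truth."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = fetch_from_client(key)
    cache.put(key, value)
    return value


def handle_change_webhook(cache: EphemeralCache, payload: Mapping[str, str]) -> bool:
    """Hypothetical webhook body: {"key": "rfcs/0042.md"}."""
    return cache.invalidate(payload["key"])
```

The TTL bounds how long a missed webhook can leave stale data in the cache, which is the eventual-consistency cost noted above.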


The proxy model (Option A) is the simplest option and maintains the strongest data sovereignty guarantees.

Detailed Architecture

flowchart TB
    subgraph CLAUDE["Claude Code (User Machine)"]
        CC["Claude Session"]
    end

    subgraph SV["Superviber Infrastructure"]
        direction TB

        subgraph MCP_CLUSTER["MCP Cluster (EKS)"]
            MCP1["MCP Pod 1"]
            MCP2["MCP Pod 2"]
            MCPN["MCP Pod N"]
        end

        ALB["Application<br/>Load Balancer"]
        INF["Infisical"]

        ALB --> MCP1 & MCP2 & MCPN
        MCP1 & MCP2 & MCPN --> INF
    end

    subgraph CLIENT["Client AWS Account"]
        direction TB

        subgraph VPC["Client VPC"]
            APIGW["API Gateway<br/>(Private)"]
            LAMBDA["Lambda<br/>(Data Access)"]

            subgraph DATA["Data Layer"]
                S3[("S3 Bucket<br/>(Dialogues, RFCs)")]
                DDB[("DynamoDB<br/>(State, Index)")]
            end

            KMS["KMS Key<br/>(Client Owned)"]
        end

        IAM["IAM Role<br/>(Cross-Account)"]

        APIGW --> LAMBDA --> DATA
        DATA --> KMS
    end

    CC -->|"① MCP over TLS 1.3"| ALB
    MCP1 -->|"② AssumeRole"| IAM
    IAM -->|"③ Scoped Access"| APIGW

    style CLAUDE fill:#fff3e0
    style SV fill:#e3f2fd
    style CLIENT fill:#e8f5e9
    style DATA fill:#c8e6c9

Request Flow

sequenceDiagram
    participant CC as Claude Code
    participant MCP as Blue MCP<br/>(Superviber)
    participant INF as Infisical
    participant STS as AWS STS
    participant API as Client API
    participant KMS as Client KMS
    participant S3 as Client S3

    CC->>MCP: blue_rfc_get("0042")

    MCP->>INF: Get client credentials
    INF-->>MCP: Client ID, Role ARN

    MCP->>STS: AssumeRole(client_role_arn)
    STS-->>MCP: Temporary credentials (1hr)

    MCP->>API: GET /rfcs/0042
    API->>S3: GetObject(rfcs/0042.md)
    S3->>KMS: Decrypt(data_key)
    KMS-->>S3: Plaintext key
    S3-->>API: Decrypted content
    API-->>MCP: RFC content

    MCP-->>CC: RFC document

    Note over MCP: Data processed in memory<br/>Never persisted
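The request flow above can be sketched in Python with the three external calls injected as plain callables, so the example runs without AWS or Infisical. This is only an illustration: `ClientConfig`, `TempCredentials`, `get_rfc`, and the `/rfcs/{id}` URL layout are assumptions taken from the diagram, and in a real deployment `assume_role` would wrap an STS AssumeRole call while `fetch` would make a signed HTTPS request.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ClientConfig:
    role_arn: str
    external_id: str
    api_endpoint: str


@dataclass(frozen=True)
class TempCredentials:
    access_key_id: str
    secret_access_key: str
    session_token: str


def get_rfc(
    client_id: str,
    rfc_id: str,
    load_config: Callable[[str], ClientConfig],
    assume_role: Callable[[str, str], TempCredentials],
    fetch: Callable[[str, TempCredentials], bytes],
) -> str:
    """Handle one blue_rfc_get call; nothing is written to disk or kept after return."""
    # Step 1: look up the client's role ARN, external ID and endpoint (Infisical).
    config = load_config(client_id)
    # Step 2: exchange them for short-lived credentials (STS AssumeRole).
    creds = assume_role(config.role_arn, config.external_id)
    # Step 3: call the client API, which reads S3 and decrypts inside the client account.
    url = f"{config.api_endpoint.rstrip('/')}/rfcs/{rfc_id}"
    body = fetch(url, creds)
    # Step 4: return the document to the caller; no copy is stored by this function.
    return body.decode("utf-8")
```

Passing the dependencies in as arguments keeps the handler stateless, which matches the "never persisted" note in the diagram and makes the flow easy to test with fakes.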

Access Control Matrix

| Resource | Superviber Access | Client Control |
|---|---|---|
| Blue MCP Server | Owns & operates | N/A |
| Client API Gateway | Invoke via role | Creates/deletes endpoint |
| Client S3 Bucket | Read/write via role | Owns bucket, sets policy |
| Client DynamoDB | Read/write via role | Owns table, sets policy |
| Client KMS Key | No access | Full control |
| Infisical Secrets | Read (membership) | Owns workspace, can revoke |
| IAM Cross-Account Role | AssumeRole | Creates/deletes role |

Client IAM Role Policy

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowBlueMCPAccess",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::client-blue-data",
        "arn:aws:s3:::client-blue-data/*"
      ]
    },
    {
      "Sid": "AllowDynamoDBAccess",
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:Query",
        "dynamodb:UpdateItem"
      ],
      "Resource": "arn:aws:dynamodb:*:*:table/blue-*"
    },
    {
      "Sid": "DenyKMSAccess",
      "Effect": "Deny",
      "Action": "kms:*",
      "Resource": "*"
    }
  ]
}

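Key point: The DenyKMSAccess statement ensures the cross-account role can never call KMS directly, so Superviber cannot use or manage the client's encryption keys. S3 and DynamoDB use envelope encryption: the services decrypt data with data keys protected by the KMS key, and the KMS key itself never leaves KMS. One caveat: for objects encrypted with SSE-KMS, S3 authorizes the KMS call against the caller's own permissions, so a direct GetObject by this role would be refused by the explicit deny. Reads and writes of KMS-encrypted objects therefore need to go through the client's API and Lambda, whose execution role holds the KMS permissions, as in the detailed architecture above.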
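The effect of the explicit deny can be illustrated with a deliberately simplified policy evaluator. This is a sketch, not AWS's evaluation engine: it only handles `Effect`, `Action`, and `Resource` with wildcard matching, and ignores conditions, `NotAction`, resource policies, and permission boundaries. The `evaluate` helper and the example ARNs are hypothetical; `ROLE_POLICY` mirrors the policy above.

```python
from fnmatch import fnmatchcase
from typing import Any, Dict, List, Union

ROLE_POLICY: Dict[str, Any] = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowBlueMCPAccess",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::client-blue-data",
                "arn:aws:s3:::client-blue-data/*",
            ],
        },
        {
            "Sid": "AllowDynamoDBAccess",
            "Effect": "Allow",
            "Action": [
                "dynamodb:GetItem",
                "dynamodb:PutItem",
                "dynamodb:Query",
                "dynamodb:UpdateItem",
            ],
            "Resource": "arn:aws:dynamodb:*:*:table/blue-*",
        },
        {"Sid": "DenyKMSAccess", "Effect": "Deny", "Action": "kms:*", "Resource": "*"},
    ],
}


def _as_list(value: Union[str, List[str]]) -> List[str]:
    return [value] if isinstance(value, str) else list(value)


def _any_match(patterns: Union[str, List[str]], candidate: str) -> bool:
    return any(fnmatchcase(candidate, pattern) for pattern in _as_list(patterns))


def evaluate(policy: Dict[str, Any], action: str, resource: str) -> str:
    """Return 'ExplicitDeny', 'Allow', or 'ImplicitDeny' for one request."""
    decision = "ImplicitDeny"
    for statement in policy["Statement"]:
        # Action names are compared case-insensitively; resources are not.
        actions = [a.lower() for a in _as_list(statement["Action"])]
        if not _any_match(actions, action.lower()):
            continue
        if not _any_match(statement["Resource"], resource):
            continue
        if statement["Effect"] == "Deny":
            return "ExplicitDeny"  # an explicit deny always wins
        decision = "Allow"
    return decision
```

With this sketch, `evaluate(ROLE_POLICY, "kms:Decrypt", ...)` returns `"ExplicitDeny"` for any key ARN, while `s3:GetObject` on an object in `client-blue-data` returns `"Allow"` and an unlisted action such as `dynamodb:Scan` falls through to `"ImplicitDeny"`.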

Trust Policy (Client Creates)

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::SUPERVIBER_ACCOUNT_ID:role/BlueMCPServiceRole"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "${client_external_id}"
        }
      }
    }
  ]
}

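Revocation: Client removes or modifies this trust policy → new AssumeRole calls fail immediately. Temporary credentials that were already issued stay valid until they expire (up to 1 hour in the request flow above), so for immediate termination the client should also revoke the role's active sessions (or delete the role).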
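A client-side template for this trust policy could be generated as follows. This is a minimal sketch: `new_external_id` and `build_trust_policy` are hypothetical helpers, and the choice of `secrets.token_urlsafe` for the external ID is an assumption (it yields only letters, digits, `-` and `_`), not something the spike specifies.

```python
import json
import secrets
from typing import Any, Dict


def new_external_id() -> str:
    """Generate a random, hard-to-guess external ID for one client."""
    return secrets.token_urlsafe(32)


def build_trust_policy(superviber_account_id: str, external_id: str,
                       role_name: str = "BlueMCPServiceRole") -> Dict[str, Any]:
    """Build the trust policy document the client attaches to its cross-account role."""
    account_ok = (superviber_account_id.isascii()
                  and superviber_account_id.isdigit()
                  and len(superviber_account_id) == 12)
    if not account_ok:
        raise ValueError("AWS account IDs are 12 digits")
    if not external_id:
        raise ValueError("external_id must not be empty")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": f"arn:aws:iam::{superviber_account_id}:role/{role_name}"
                },
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # 123456789012 is a placeholder account ID, not Superviber's.
    print(json.dumps(build_trust_policy("123456789012", new_external_id()), indent=2))
```

Using a distinct external ID per client means a role ARN alone is not enough for another tenant's request to assume the role.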


Infisical Integration

flowchart LR
    subgraph CLIENT_INF["Client's Infisical Workspace"]
        direction TB
        SEC1["AWS_ROLE_ARN"]
        SEC2["EXTERNAL_ID"]
        SEC3["API_ENDPOINT"]
        SEC4["ANTHROPIC_API_KEY"]
    end

    subgraph SV_INF["Superviber Infisical"]
        direction TB
        SVC["Service Token<br/>(Read-Only)"]
    end

    subgraph MCP["Blue MCP"]
        ENV["Runtime Env"]
    end

    SVC -->|"Membership"| CLIENT_INF
    CLIENT_INF -->|"Inject"| ENV

    style CLIENT_INF fill:#e8f5e9
    style SV_INF fill:#e3f2fd

Client onboarding:

  1. Client creates Infisical workspace
  2. Client adds required secrets (role ARN, endpoint, etc.)
  3. Client invites Superviber service account (read-only)
  4. Client can revoke by removing membership
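Once the secrets are injected into the runtime environment, the MCP server could check them before serving a client. The sketch below assumes the four secret names shown in the diagram; `load_client_secrets` and `MissingSecretsError` are hypothetical names, and the check only reports which names are missing, never their values.

```python
from typing import Dict, Mapping

# Secret names taken from the client workspace in the diagram above.
REQUIRED_SECRETS = ("AWS_ROLE_ARN", "EXTERNAL_ID", "API_ENDPOINT", "ANTHROPIC_API_KEY")


class MissingSecretsError(RuntimeError):
    """Raised when the injected environment lacks one or more required secrets."""


def load_client_secrets(env: Mapping[str, str]) -> Dict[str, str]:
    """Return the required secrets, or fail fast listing the missing names."""
    missing = [name for name in REQUIRED_SECRETS if not env.get(name, "").strip()]
    if missing:
        # Only names are included in the message, so nothing sensitive is logged.
        raise MissingSecretsError("missing secrets: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_SECRETS}
```

In a running service this would typically be called as `load_client_secrets(os.environ)`. If the client revokes the workspace membership, the next refresh would surface as missing secrets rather than as a confusing downstream AWS error.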

Data Sovereignty Guarantees (Updated)

| Guarantee | Previous (Client Infra) | New (Superviber Infra) |
|---|---|---|
| Data at rest | Client S3/KMS | Client S3/KMS (unchanged) |
| Data in flight | TLS 1.3 | TLS 1.3 (unchanged) |
| Encryption keys | Client KMS | Client KMS (unchanged) |
| Compute location | Client account | Superviber account |
| Data in memory | Client account | Superviber account (ephemeral) |
| Revocation | IAM + Infisical | IAM + Infisical (unchanged) |
| Audit trail | Client CloudTrail | Client CloudTrail + Superviber logs |

New consideration: Data passes through Superviber memory during processing. Mitigations:

  • No persistence - data only held during request lifecycle
  • Encrypted node storage (EKS nodes with encrypted volumes), so anything that spills to disk is encrypted at rest
  • SOC 2 attestation for Superviber infrastructure
  • Option for dedicated/isolated compute per client
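The "no persistence" mitigation can be made concrete with a request-scoped buffer that is overwritten when the request ends. This is a best-effort sketch only: `request_scoped` is a hypothetical helper, and Python gives no guarantee that other copies of the data (for example the original immutable `bytes` object) are not still in memory.

```python
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def request_scoped(payload: bytes) -> Iterator[bytearray]:
    """Hold client data in a mutable buffer for the duration of one request."""
    buffer = bytearray(payload)
    try:
        yield buffer
    finally:
        # Overwrite the buffer in place with zeros, even if the handler raised.
        buffer[:] = bytes(len(buffer))
```

A handler would wrap its work in `with request_scoped(body) as data:` and avoid storing `data` anywhere that outlives the block.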

Client Onboarding Flow

flowchart TB
    A["1. Client signs agreement"] --> B["2. Client creates<br/>Infisical workspace"]
    B --> C["3. Client provisions<br/>IAM role with trust policy"]
    C --> D["4. Client creates<br/>S3 bucket + DynamoDB"]
    D --> E["5. Client adds secrets<br/>to Infisical"]
    E --> F["6. Client invites<br/>Superviber to workspace"]
    F --> G["7. Superviber configures<br/>MCP for client"]
    G --> H["8. Client connects<br/>Claude Code to MCP"]

    style A fill:#ffecb3
    style H fill:#c8e6c9

Estimated onboarding time: about 30 minutes, assuming Terraform/CDK templates are provided.


Open Questions

  1. Multi-tenancy: Single MCP cluster serving all clients, or isolated per client?
    • Single cluster: Cost efficient, simpler ops
    • Isolated: Stronger security boundary, likely preferred by finance clients
  2. Latency: Cross-account API calls add ~50-100 ms per request. Acceptable?
    • Most MCP operations are not latency-sensitive
    • Dialogue runs are already async
  3. Compliance: Does data-in-memory on Superviber infra affect the client's compliance posture?
    • May need to add SOC 2 Type II for Superviber
    • Some clients may still require fully client-hosted
  4. Failover: If Superviber MCP is down, clients have no access
    • Consider multi-region deployment
    • Or provide fallback to client-hosted MCP
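For the failover question, one possible shape is an ordered list of endpoints tried in turn: primary region, secondary region, then a client-hosted MCP. The sketch below is hypothetical throughout (`call_with_failover`, `AllEndpointsFailed`, and the idea that the caller holds the endpoint list are assumptions) and only retries on connection-level errors.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


class AllEndpointsFailed(RuntimeError):
    """Raised when every configured endpoint was unreachable."""


def call_with_failover(endpoints: Sequence[str], call: Callable[[str], T]) -> T:
    """Try each endpoint in order and return the first successful result."""
    errors: List[str] = []
    for endpoint in endpoints:
        try:
            return call(endpoint)
        except ConnectionError as exc:
            # Record the failure and move on to the next endpoint.
            errors.append(f"{endpoint}: {exc}")
    raise AllEndpointsFailed("; ".join(errors) or "no endpoints configured")
```

Errors other than `ConnectionError` (for example an access-denied response) propagate immediately, since retrying them on another endpoint would not help.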

Recommendation

Proceed with Option A (Proxy Model) with the following implementation:

  1. Deploy Blue MCP on EKS in Superviber AWS account
  2. Use Infisical for per-client credential management
  3. Provide Terraform/CDK module for client-side infrastructure
  4. Offer "dedicated compute" tier for compliance-sensitive clients
  5. Document the memory-processing caveat in security docs

Next steps:

  • Create RFC for this architecture
  • Build Terraform module for client infrastructure
  • Add multi-tenant support to Blue MCP
  • Draft updated security/compliance documentation

Investigation by Blue