hearth/docs/architecture.md
Eric Garcia e78000831e Initial commit: Port infrastructure from coherence-mcp
Hearth is the infrastructure home for the letemcook ecosystem.

Ported from coherence-mcp/infra:
- Terraform modules (VPC, EKS, IAM, NLB, S3, storage)
- Kubernetes manifests (Forgejo, ingress, cert-manager, karpenter)
- Deployment scripts (phased rollout)

Status: Not deployed. EKS cluster needs to be provisioned.

Next steps:
1. Bootstrap terraform backend
2. Deploy phase 1 (foundation)
3. Deploy phase 2 (core services including Forgejo)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 06:06:13 -05:00

Foundation Infrastructure

RFC 0039: ADR-Compliant Foundation Infrastructure

Overview

This directory contains Terraform modules and Kubernetes manifests for deploying the Alignment foundation infrastructure on AWS EKS.

Architecture

                                  Internet
                                     |
                          +---------+----------+
                          |    Shared NLB      |
                          |    (~$16/mo)       |
                          +--------------------+
                          | :53  DNS (PowerDNS)|
                          | :25  SMTP          |
                          | :587 Submission    |
                          | :993 IMAPS         |
                          | :443 HTTPS         |
                          +--------+-----------+
                                   |
              +--------------------+--------------------+
              |                    |                    |
        +-----+------+      +-----+------+      +------+-----+
        |   AZ-a     |      |   AZ-b     |      |   AZ-c     |
        +------------+      +------------+      +------------+
        |            |      |            |      |            |
        | Karpenter  |      | Karpenter  |      | Karpenter  |
        | Spot Nodes |      | Spot Nodes |      | Spot Nodes |
        |            |      |            |      |            |
        +------------+      +------------+      +------------+
        |            |      |            |      |            |
        | CockroachDB|      | CockroachDB|      | CockroachDB|
        | (m6i.large)|      | (m6i.large)|      | (m6i.large)|
        |            |      |            |      |            |
        +------------+      +------------+      +------------+

Cost Breakdown

Component                          Monthly Cost
EKS Control Plane                  $73
CockroachDB (3x m6i.large, 3yr)    $105
NLB                                $16
EFS                                $5
S3                                 $5
Spot nodes (variable)              $0-50
Total                              $204-254
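
The total can be checked by summing the fixed line items and adding the Spot range. The following is a minimal sketch of that arithmetic; the variable and function names are illustrative and do not come from the repository.

```shell
#!/usr/bin/env bash

# Fixed monthly line items from the cost table, in USD:
# EKS control plane, CockroachDB, NLB, EFS, S3.
fixed_costs=(73 105 16 5 5)

# Spot nodes are variable, so they contribute a range rather than a single value.
spot_min=0
spot_max=50

# Sum the fixed items, then add the Spot range to get the total range.
total_cost_range() {
  local sum=0 cost
  for cost in "${fixed_costs[@]}"; do
    sum=$((sum + cost))
  done
  echo "$((sum + spot_min))-$((sum + spot_max))"
}

total_cost_range   # prints 204-254
```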

ADR Compliance

  • ADR 0003: Self-hosted CockroachDB with FIPS 140-2
  • ADR 0004: "Set It and Forget It" auto-scaling with Karpenter
  • ADR 0005: Full-stack self-hosting (no SaaS dependencies)

Prerequisites

  1. AWS CLI configured with appropriate credentials
  2. Terraform >= 1.6.0
  3. kubectl
  4. Helm 3.x
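
A preflight check along these lines could confirm the prerequisites before running any of the steps below. This is a sketch, not part of the repository: the helper names `missing_tools` and `version_ge` are hypothetical, and the version comparison assumes plain numeric versions such as 1.6.0.

```shell
#!/usr/bin/env bash

# Print each named CLI that is not on PATH; return non-zero if any is missing.
missing_tools() {
  local tool missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Return 0 if dotted version $1 is >= $2. Handles numeric fields only
# (for example 1.6.0), not pre-release suffixes.
version_ge() {
  local IFS=.
  local -a have=($1) want=($2)
  local i h w
  for i in 0 1 2; do
    h=${have[i]:-0}
    w=${want[i]:-0}
    if ((10#$h > 10#$w)); then return 0; fi
    if ((10#$h < 10#$w)); then return 1; fi
  done
  return 0
}

# Example wiring (commented out because it needs the real tools installed):
#   missing_tools aws terraform kubectl helm || exit 1
#   tf_version=$(terraform version | head -n 1)   # e.g. "Terraform v1.6.0"
#   version_ge "${tf_version#Terraform v}" 1.6.0 || echo "Terraform too old" >&2
```

Comparing field by field avoids the trap of a plain string comparison, which would rank 1.10.0 below 1.6.0.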

Quick Start

1. Bootstrap Terraform Backend

First, create the S3 bucket and DynamoDB table for Terraform state:

cd terraform/environments/production
# Uncomment the backend.tf bootstrap code and run:
# terraform init && terraform apply

2. Deploy Foundation Infrastructure

cd terraform/environments/production
terraform init
terraform plan
terraform apply

3. Configure kubectl

aws eks update-kubeconfig --region us-east-1 --name alignment-production

4. Deploy Karpenter

# Set environment variables from the Terraform outputs.
# Run from the infra/ root so the kubernetes/ paths below resolve.
TF_DIR=terraform/environments/production
export CLUSTER_NAME="$(terraform -chdir="$TF_DIR" output -raw cluster_name)"
export CLUSTER_ENDPOINT="$(terraform -chdir="$TF_DIR" output -raw cluster_endpoint)"
export KARPENTER_ROLE_ARN="$(terraform -chdir="$TF_DIR" output -raw karpenter_role_arn)"
export INTERRUPTION_QUEUE_NAME="$(terraform -chdir="$TF_DIR" output -raw karpenter_interruption_queue_name)"

# Install Karpenter
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --namespace karpenter --create-namespace \
  -f kubernetes/karpenter/helm-values.yaml \
  --set "settings.clusterName=${CLUSTER_NAME}" \
  --set "settings.clusterEndpoint=${CLUSTER_ENDPOINT}" \
  --set "settings.interruptionQueue=${INTERRUPTION_QUEUE_NAME}" \
  --set "serviceAccount.annotations.eks\.amazonaws\.com/role-arn=${KARPENTER_ROLE_ARN}"

# Apply NodePool and EC2NodeClass
kubectl apply -f kubernetes/karpenter/nodepool.yaml
kubectl apply -f kubernetes/karpenter/ec2nodeclass.yaml

5. Deploy Storage Classes

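# Run from the infra/ root so the kubernetes/ path resolves
export EFS_ID="$(terraform -chdir=terraform/environments/production output -raw efs_id)"
envsubst < kubernetes/storage/classes.yaml | kubectl apply -f -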
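
Because envsubst replaces an unset variable with an empty string, a failed `terraform output` would silently produce a manifest with a blank filesystem ID. A guard like the one below could catch that before anything is applied. It is a sketch under stated assumptions: `require_vars` is a hypothetical helper, and the assumption that `classes.yaml` references `${EFS_ID}` is inferred from the step above.

```shell
#!/usr/bin/env bash

# Fail early if any named variable is unset or empty, so envsubst never
# renders a manifest with a blank value.
require_vars() {
  local name missing=0
  for name in "$@"; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    if [ -z "${!name:-}" ]; then
      echo "error: $name is empty or unset" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example wiring (commented out because it needs a live cluster):
#   require_vars EFS_ID || exit 1
#   envsubst '${EFS_ID}' < kubernetes/storage/classes.yaml | kubectl apply -f -
```

Passing `'${EFS_ID}'` as an argument tells envsubst to substitute only that variable and leave any other `$` sequences in the manifest untouched.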

Directory Structure

infra/
├── terraform/
│   ├── main.tf                 # Root module
│   ├── variables.tf            # Input variables
│   ├── outputs.tf              # Output values
│   ├── versions.tf             # Provider versions
│   ├── modules/
│   │   ├── vpc/                # VPC with multi-AZ subnets
│   │   ├── eks/                # EKS cluster with Fargate
│   │   ├── iam/                # IAM roles and IRSA
│   │   ├── storage/            # EFS and S3
│   │   ├── nlb/                # Shared NLB
│   │   └── cockroachdb/        # CockroachDB (future)
│   └── environments/
│       └── production/         # Production config
├── kubernetes/
│   ├── karpenter/              # Karpenter manifests
│   ├── cockroachdb/            # CockroachDB StatefulSet
│   ├── storage/                # Storage classes
│   ├── ingress/                # Ingress configuration
│   └── cert-manager/           # TLS certificates
└── README.md

Modules

VPC Module

Creates a VPC with:

  • 3 availability zones
  • Public subnets (for NLB, NAT Gateways)
  • Private subnets (for EKS nodes, workloads)
  • Database subnets (isolated, for CockroachDB)
  • NAT Gateway per AZ for HA
  • VPC endpoints for S3, ECR, STS, EC2

EKS Module

Creates an EKS cluster with:

  • Kubernetes 1.29
  • Fargate profiles for Karpenter and kube-system
  • OIDC provider for IRSA
  • KMS encryption for secrets
  • Cluster logging enabled

IAM Module

Creates IAM roles for:

  • Karpenter controller
  • EBS CSI driver
  • EFS CSI driver
  • AWS Load Balancer Controller
  • cert-manager
  • External DNS

Storage Module

Creates storage resources:

  • EFS filesystem with encryption
  • S3 bucket for backups (versioned, encrypted)
  • S3 bucket for blob storage
  • KMS key for encryption

NLB Module

Creates a shared NLB with:

  • HTTPS (443) for web traffic
  • DNS (53 UDP/TCP) for PowerDNS
  • SMTP (25), Submission (587), IMAPS (993) for email
  • Cross-zone load balancing
  • Target groups for each service

Operations

Scaling

Karpenter provisions nodes automatically in response to pending pods; no manual intervention is required.

To adjust limits:

kubectl edit nodepool default
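
For scripted changes, a non-interactive patch may be preferable to `kubectl edit`. The sketch below builds a JSON merge patch for the NodePool's `spec.limits`. The helper name `nodepool_limits_patch` is hypothetical and the limit values are illustrative only, not the values in this repository's nodepool.yaml.

```shell
#!/usr/bin/env bash

# Build a JSON merge patch that sets the NodePool's total CPU and memory limits.
nodepool_limits_patch() {
  local cpu=$1 memory=$2
  printf '{"spec":{"limits":{"cpu":"%s","memory":"%s"}}}\n' "$cpu" "$memory"
}

# Preview the patch without touching the cluster.
nodepool_limits_patch 200 400Gi

# Apply against a live cluster (commented out because it needs cluster access):
#   kubectl patch nodepool default --type merge -p "$(nodepool_limits_patch 200 400Gi)"
```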

Monitoring

Check Karpenter status:

kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter -f

Check node status:

kubectl get nodes -L karpenter.sh/capacity-type,node.kubernetes.io/instance-type

Troubleshooting

View Karpenter events:

kubectl get events -n karpenter --sort-by=.lastTimestamp

Check pending pods:

kubectl get pods --all-namespaces --field-selector=status.phase=Pending

Security

  • All storage encrypted at rest (KMS)
  • TLS required for all connections
  • IMDSv2 required for all nodes
  • VPC Flow Logs enabled
  • Cluster audit logging enabled
  • FIPS 140-2 mode for CockroachDB

Disaster Recovery

Backups

CockroachDB backups are stored in S3 with:

  • Daily full backups
  • 30-day retention in Standard
  • 90-day transition to Glacier
  • 365-day noncurrent version retention

Recovery

To restore from backup:

# Restore CockroachDB from an S3 backup (RESTORE is a SQL statement, run through the SQL shell)
cockroach sql --execute="RESTORE ... FROM 's3://alignment-production-backups/...'"
