docs: Update documentation for minimal k3s architecture

Reflect current state:
- k3s on single EC2 spot instance (~$7.50/month)
- Forgejo, PowerDNS, Traefik running
- Remove outdated EKS/CockroachDB references

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Eric Garcia 2026-01-24 09:03:51 -05:00
parent da40273177
commit f23ea198f0
4 changed files with 221 additions and 305 deletions


@@ -4,28 +4,29 @@ The warm center where infrastructure becomes real.
## What This Is
Hearth is the infrastructure repository for the letemcook ecosystem. It contains:
Hearth is the infrastructure repository for the letemcook ecosystem. It runs a minimal k3s setup on a single EC2 spot instance (~$7.50/month).
- **Terraform modules** for AWS EKS, VPC, IAM, storage
- **Kubernetes manifests** for core services (Forgejo, cert-manager, ingress)
- **Deployment scripts** for phased rollout
Services:
- **Forgejo** - Self-hosted Git
- **PowerDNS** - Authoritative DNS
- **Traefik** - Ingress with Let's Encrypt
## Quick Start
```bash
# 1. Configure AWS
aws sso login --profile muffinlabs
aws sso login --profile hearth
# 2. Bootstrap Terraform backend
cd terraform/environments/production
# 2. Deploy infrastructure
cd terraform/minimal
terraform init
terraform apply -target=module.bootstrap
terraform apply
# 3. Deploy foundation (EKS, VPC, storage)
./scripts/deploy-phase1-foundation.sh
# 3. Deploy PowerDNS (after instance is running)
scp -P 2222 scripts/deploy-powerdns.sh ec2-user@<EIP>:
ssh -p 2222 ec2-user@<EIP> 'sudo bash deploy-powerdns.sh <EIP>'
# 4. Deploy core services (Forgejo)
./scripts/deploy-phase2-core-services.sh
# 4. Update GoDaddy glue records for each domain
```
## Structure
@@ -33,26 +34,28 @@ terraform apply -target=module.bootstrap
```
hearth/
├── terraform/
│ ├── modules/ # Reusable infrastructure modules
│ │ ├── vpc/ # VPC with multi-AZ subnets
│ │ ├── eks/ # EKS cluster
│ │ ├── iam/ # IAM roles and IRSA
│ │ ├── nlb/ # Network Load Balancer
│ │ └── storage/ # EFS, S3
│ ├── main.tf # Root module
│ └── minimal/ # Single EC2 + k3s
│ ├── main.tf # VPC, EC2, security groups
│ ├── variables.tf # Input variables
│ └── outputs.tf # Output values
├── kubernetes/
│ ├── forgejo/ # Git hosting
│ ├── ingress/ # ALB ingress
│ ├── cert-manager/ # TLS certificates
│ ├── karpenter/ # Auto-scaling
│ └── storage/ # Storage classes
│ └── user-data.sh # k3s + Forgejo bootstrap
├── scripts/
│ ├── deploy-phase*.sh # Phased deployment
│ └── validate-*.sh # Validation scripts
│ └── deploy-powerdns.sh # PowerDNS deployment
└── docs/
└── architecture.md # Infrastructure overview
├── architecture.md # Infrastructure overview
└── rfcs/ # Design decisions
```
## Access
```bash
# Admin SSH
ssh -p 2222 ec2-user@3.218.167.115
# kubectl (on server)
kubectl get pods -A
# Forgejo
https://git.beyondtheuniverse.superviber.com
```
## Principles
@@ -62,17 +65,17 @@ From Blue's ADRs:
- **Single Source (0005)**: Infrastructure as code, one truth
- **Evidence (0004)**: Terraform plan before apply
- **No Dead Code (0010)**: Delete unused resources
- **Never Give Up (0000)**: Deploy, fail, learn, redeploy
- **Freedom Through Constraint (0011)**: Minimal viable infrastructure
## AWS Profile
Use `muffinlabs` profile for all AWS operations:
Use `hearth` profile for all AWS operations:
```bash
export AWS_PROFILE=muffinlabs
export AWS_PROFILE=hearth
```
## Related Repos
- **blue** - Philosophy and CLI tooling
- **coherence-mcp** - MCP server (source of these manifests)
- **coherence-mcp** - MCP server (original source)


@@ -1,37 +1,54 @@
# Hearth
Infrastructure for the letemcook ecosystem.
Infrastructure for the letemcook ecosystem. You are home.
## Overview
Hearth deploys and manages:
Hearth runs on a single EC2 spot instance with k3s, hosting:
- **EKS Cluster** - Kubernetes on AWS with Karpenter auto-scaling
- **Forgejo** - Self-hosted Git (git.beyondtheuniverse.superviber.com)
- **Core Services** - Ingress, TLS, storage
- **Forgejo** - Self-hosted Git at git.beyondtheuniverse.superviber.com
- **PowerDNS** - Authoritative DNS for managed domains
- **Traefik** - Ingress with Let's Encrypt TLS
## Status
| Component | Status |
|-----------|--------|
| Terraform modules | Ported from coherence-mcp |
| EKS cluster | Not deployed |
| Forgejo | Not deployed |
| k3s cluster | Running |
| Forgejo | Running |
| PowerDNS | Running |
| TLS | Pending (rate limited until Jan 25) |
## Managed Domains
DNS served by PowerDNS for:
- superviber.com
- muffinlabs.ai
- letemcook.com
- appbasecamp.com
- thanksforborrowing.com
- alignment.coop
## Cost
| Component | Monthly |
|-----------|---------|
| EC2 t4g.small spot | ~$5 |
| EBS gp3 20GB | ~$2 |
| Elastic IP | ~$0.50 |
| **Total** | **~$7.50** |
## Getting Started
See [CLAUDE.md](CLAUDE.md) for setup instructions.
## Cost Estimate
## Architecture
| Component | Monthly |
|-----------|---------|
| EKS Control Plane | $73 |
| Spot nodes (variable) | $0-50 |
| NLB | $16 |
| EFS | $5 |
| S3 | $5 |
| **Total** | **~$100-150** |
See [docs/architecture.md](docs/architecture.md) for details.
## RFCs
- [RFC 0003: PowerDNS Self-Hosted DNS](docs/rfcs/0003-powerdns-self-hosted.md)
## License


@@ -1,269 +1,164 @@
# Foundation Infrastructure
# Hearth Architecture
RFC 0039: ADR-Compliant Foundation Infrastructure
Minimal infrastructure for ~1 user at ~$7.50/month.
## Overview
This directory contains Terraform modules and Kubernetes manifests for deploying
the Alignment foundation infrastructure on AWS EKS.
## Architecture
```
Internet
|
+---------+----------+
| Shared NLB |
| (~$16/mo) |
+--------------------+
| :53 DNS (PowerDNS)|
| :25 SMTP |
| :587 Submission |
| :993 IMAPS |
| :443 HTTPS |
+--------+-----------+
+------------+------------+
| Elastic IP |
| 3.218.167.115 |
+------------+------------+
|
+--------------------+--------------------+
+-------------------+-------------------+
| | |
+-----+------+ +-----+------+ +------+-----+
| AZ-a | | AZ-b | | AZ-c |
+------------+ +------------+ +------------+
| | | | | |
| Karpenter | | Karpenter | | Karpenter |
| Spot Nodes | | Spot Nodes | | Spot Nodes |
| | | | | |
+------------+ +------------+ +------------+
| | | | | |
| CockroachDB| | CockroachDB| | CockroachDB|
| (m6i.large)| | (m6i.large)| | (m6i.large)|
| | | | | |
+------------+ +------------+ +------------+
:22 SSH :53 DNS :443 HTTPS
(Git) (PowerDNS) (Traefik)
| | |
+-------------------+-------------------+
|
+------------+------------+
| EC2 t4g.small (ARM) |
| Amazon Linux 2023 |
| 20GB gp3 EBS |
+------------+------------+
|
+------------+------------+
| k3s |
+-------------------------+
| |
+------+------+ +------+------+
| traefik | | dns |
| namespace | | namespace |
+-------------+ +-------------+
| Traefik | | PowerDNS |
| (ingress) | | (auth DNS) |
+-------------+ +-------------+
|
+------+------+
| forgejo |
| namespace |
+-------------+
| Forgejo |
| (git host) |
+-------------+
```
## Components
### EC2 Instance
- **Type**: t4g.small (2 vCPU, 2GB RAM, ARM64)
- **Pricing**: Spot instance (~$0.007/hr)
- **Storage**: 20GB gp3 EBS (encrypted)
- **OS**: Amazon Linux 2023
### k3s
Lightweight Kubernetes distribution. Single-node cluster with:
- Built-in containerd
- Local storage
- Bundled Traefik disabled (we deploy our own)
### Traefik
Ingress controller with:
- HTTP → HTTPS redirect
- Let's Encrypt ACME (HTTP-01 challenge)
- TCP routing for Git SSH
### PowerDNS
Authoritative DNS server for managed domains:
- superviber.com
- muffinlabs.ai
- letemcook.com
- appbasecamp.com
- thanksforborrowing.com
- alignment.coop
Uses SQLite backend, data persisted to /data/powerdns.
### Forgejo
Self-hosted Git forge (Gitea fork):
- Web UI at git.beyondtheuniverse.superviber.com
- Git SSH on port 22
- SQLite database
- Data persisted to /data/forgejo
## Storage
All persistent data on host filesystem:
```
/data/
├── forgejo/ # Forgejo repos and database
│ └── gitea/
│ ├── gitea.db
│ └── conf/app.ini
└── powerdns/ # PowerDNS database
└── pdns.sqlite3
```
## Networking
### Security Group
| Port | Protocol | Source | Purpose |
|------|----------|--------|---------|
| 22 | TCP | 0.0.0.0/0 | Git SSH |
| 53 | UDP/TCP | 0.0.0.0/0 | DNS |
| 80 | TCP | 0.0.0.0/0 | HTTP (redirect) |
| 443 | TCP | 0.0.0.0/0 | HTTPS |
| 2222 | TCP | Admin IPs | Admin SSH |
| 6443 | TCP | Admin IPs | Kubernetes API |
### DNS Flow
```
User query → GoDaddy NS lookup → ns1/ns2.superviber.com
→ Glue record: 3.218.167.115
→ PowerDNS (port 53)
→ Zone lookup → Response
```
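
Each hop in this flow can be checked by hand with `dig`. The helper below is only a sketch: `dns_checks` is a hypothetical name, it assumes `dig` is installed on the machine running it, and the IP is the Elastic IP shown in the flow. It prints the commands rather than running them, so the output can be reviewed first.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of this repo: prints the dig commands
# that exercise each hop of the DNS flow for one domain.
NS_IP="${NS_IP:-3.218.167.115}"   # Elastic IP from the DNS flow (glue record)

dns_checks() {
  local domain="$1"
  # Hop 1: delegation as seen through a public resolver
  echo "dig +short NS $domain"
  # Hop 2: query PowerDNS directly, bypassing resolver caches
  echo "dig +short @$NS_IP $domain SOA"
}

dns_checks superviber.com
```

To actually run the queries, pipe the output to a shell: `dns_checks superviber.com | sh`.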
## Cost Breakdown
| Component | Monthly Cost |
|-----------|--------------|
| EKS Control Plane | $73 |
| CockroachDB (3x m6i.large, 3yr) | $105 |
| NLB | $16 |
| EFS | $5 |
| S3 | $5 |
| Spot nodes (variable) | $0-50 |
| **Total** | **$204-254** |
| Component | Monthly |
|-----------|---------|
| EC2 t4g.small spot | ~$5.00 |
| EBS gp3 20GB | ~$1.60 |
| Elastic IP | ~$0.50 |
| S3 backups | ~$0.50 |
| **Total** | **~$7.50** |
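
The total can be sanity-checked from the hourly spot rate. This is a rough sketch, assuming a 730-hour month and the ~$0.007/hr spot rate listed under EC2 Instance; `monthly_total` is a hypothetical helper name, and the other figures are copied from the table.

```bash
#!/usr/bin/env bash
# Rough monthly cost estimate using the figures from the cost table.
# Assumption: 730 hours per month (365 * 24 / 12).

monthly_total() {
  local spot_hourly="$1" ebs="$2" eip="$3" s3="$4"
  awk -v h="$spot_hourly" -v ebs="$ebs" -v eip="$eip" -v s3="$s3" \
    'BEGIN { printf "%.2f\n", h * 730 + ebs + eip + s3 }'
}

monthly_total 0.007 1.60 0.50 0.50   # prints 7.71
```

The result lands slightly above the table's rounded ~$7.50 because the table rounds the spot line down to ~$5.00, and spot prices fluctuate in any case.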
## ADR Compliance
## Backups
- **ADR 0003**: Self-hosted CockroachDB with FIPS 140-2
- **ADR 0004**: "Set It and Forget It" auto-scaling with Karpenter
- **ADR 0005**: Full-stack self-hosting (no SaaS dependencies)
Daily cron job at 3 AM:
1. SQLite backup of Forgejo database
2. k3s state backup
3. Upload to S3 (hearth-backups bucket)
4. 60-day retention with lifecycle policy
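
The steps above could look roughly like the following. This is a hypothetical sketch, not the script deployed on the instance: the k3s state path (`/var/lib/rancher/k3s/server`, the k3s default data directory), the S3 key layout, and the `DRY_RUN` switch are all assumptions. It defaults to dry-run mode and only prints the commands it would execute.

```bash
#!/usr/bin/env bash
# Hypothetical daily backup sketch; paths and key layout are assumptions.
set -eu

BUCKET="${BUCKET:-hearth-backups}"                       # bucket named in the list above
FORGEJO_DB="${FORGEJO_DB:-/data/forgejo/gitea/gitea.db}" # path from the Storage section
K3S_STATE="${K3S_STATE:-/var/lib/rancher/k3s/server}"    # assumed k3s default data dir
DRY_RUN="${DRY_RUN:-1}"                                  # 1 = print commands only

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

backup() {
  local stamp="$1" workdir="$2"
  # 1. Consistent SQLite copy via the sqlite3 shell's .backup command
  run sqlite3 "$FORGEJO_DB" ".backup '$workdir/gitea-$stamp.db'"
  # 2. Archive the k3s server state
  run tar -czf "$workdir/k3s-$stamp.tar.gz" -C "$K3S_STATE" .
  # 3. Upload both artifacts; expiry is left to the bucket lifecycle policy
  run aws s3 cp "$workdir/gitea-$stamp.db" "s3://$BUCKET/forgejo/gitea-$stamp.db"
  run aws s3 cp "$workdir/k3s-$stamp.tar.gz" "s3://$BUCKET/k3s/k3s-$stamp.tar.gz"
}

backup "$(date +%F)" "${TMPDIR:-/tmp}"
```

A 3 AM schedule corresponds to the cron expression `0 3 * * *`. Retention is not handled in the script because the 60-day expiry is enforced by the S3 lifecycle policy.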
## Prerequisites
## Limitations
1. AWS CLI configured with appropriate credentials
2. Terraform >= 1.6.0
3. kubectl
4. Helm 3.x
This is personal infrastructure, not production-grade:
## Quick Start
- **No HA**: Single point of failure
- **Spot interruption**: Instance may be reclaimed (data persists on EBS)
- **No monitoring**: Basic healthchecks only
- **Single region**: us-east-1 only
### 1. Bootstrap Terraform Backend
## Future Work
First, create the S3 bucket and DynamoDB table for Terraform state:
```bash
cd terraform/environments/production
# Uncomment the backend.tf bootstrap code and run:
# terraform init && terraform apply
```
### 2. Deploy Foundation Infrastructure
```bash
cd terraform/environments/production
terraform init
terraform plan
terraform apply
```
### 3. Configure kubectl
```bash
aws eks update-kubeconfig --region us-east-1 --name alignment-production
```
### 4. Deploy Karpenter
```bash
# Set environment variables
export CLUSTER_NAME=$(terraform output -raw cluster_name)
export CLUSTER_ENDPOINT=$(terraform output -raw cluster_endpoint)
export KARPENTER_ROLE_ARN=$(terraform output -raw karpenter_role_arn)
export INTERRUPTION_QUEUE_NAME=$(terraform output -raw karpenter_interruption_queue_name)
# Install Karpenter
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
--namespace karpenter --create-namespace \
-f kubernetes/karpenter/helm-values.yaml \
--set settings.clusterName=$CLUSTER_NAME \
--set settings.clusterEndpoint=$CLUSTER_ENDPOINT \
--set settings.interruptionQueue=$INTERRUPTION_QUEUE_NAME \
--set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=$KARPENTER_ROLE_ARN
# Apply NodePool and EC2NodeClass
kubectl apply -f kubernetes/karpenter/nodepool.yaml
kubectl apply -f kubernetes/karpenter/ec2nodeclass.yaml
```
### 5. Deploy Storage Classes
```bash
export EFS_ID=$(terraform output -raw efs_id)
envsubst < kubernetes/storage/classes.yaml | kubectl apply -f -
```
## Directory Structure
```
infra/
├── terraform/
│ ├── main.tf # Root module
│ ├── variables.tf # Input variables
│ ├── outputs.tf # Output values
│ ├── versions.tf # Provider versions
│ ├── modules/
│ │ ├── vpc/ # VPC with multi-AZ subnets
│ │ ├── eks/ # EKS cluster with Fargate
│ │ ├── iam/ # IAM roles and IRSA
│ │ ├── storage/ # EFS and S3
│ │ ├── nlb/ # Shared NLB
│ │ └── cockroachdb/ # CockroachDB (future)
│ └── environments/
│ └── production/ # Production config
├── kubernetes/
│ ├── karpenter/ # Karpenter manifests
│ ├── cockroachdb/ # CockroachDB StatefulSet
│ ├── storage/ # Storage classes
│ ├── ingress/ # Ingress configuration
│ └── cert-manager/ # TLS certificates
└── README.md
```
## Modules
### VPC Module
Creates a VPC with:
- 3 availability zones
- Public subnets (for NLB, NAT Gateways)
- Private subnets (for EKS nodes, workloads)
- Database subnets (isolated, for CockroachDB)
- NAT Gateway per AZ for HA
- VPC endpoints for S3, ECR, STS, EC2
### EKS Module
Creates an EKS cluster with:
- Kubernetes 1.29
- Fargate profiles for Karpenter and kube-system
- OIDC provider for IRSA
- KMS encryption for secrets
- Cluster logging enabled
### IAM Module
Creates IAM roles for:
- Karpenter controller
- EBS CSI driver
- EFS CSI driver
- AWS Load Balancer Controller
- cert-manager
- External DNS
### Storage Module
Creates storage resources:
- EFS filesystem with encryption
- S3 bucket for backups (versioned, encrypted)
- S3 bucket for blob storage
- KMS key for encryption
### NLB Module
Creates a shared NLB with:
- HTTPS (443) for web traffic
- DNS (53 UDP/TCP) for PowerDNS
- SMTP (25), Submission (587), IMAPS (993) for email
- Cross-zone load balancing
- Target groups for each service
## Operations
### Scaling
Karpenter automatically scales nodes based on pending pods. No manual intervention required.
To adjust limits:
```bash
kubectl edit nodepool default
```
### Monitoring
Check Karpenter status:
```bash
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter -f
```
Check node status:
```bash
kubectl get nodes -L karpenter.sh/capacity-type,node.kubernetes.io/instance-type
```
### Troubleshooting
View Karpenter events:
```bash
kubectl get events -n karpenter --sort-by=.lastTimestamp
```
Check pending pods:
```bash
kubectl get pods --all-namespaces --field-selector=status.phase=Pending
```
## Security
- All storage encrypted at rest (KMS)
- TLS required for all connections
- IMDSv2 required for all nodes
- VPC Flow Logs enabled
- Cluster audit logging enabled
- FIPS 140-2 mode for CockroachDB
## Disaster Recovery
### Backups
CockroachDB backups are stored in S3 with:
- Daily full backups
- 30-day retention in Standard
- 90-day transition to Glacier
- 365-day noncurrent version retention
### Recovery
To restore from backup:
```bash
# Restore CockroachDB from S3 backup
cockroach restore ... FROM 's3://alignment-production-backups/...'
```
## References
- [RFC 0039: Foundation Infrastructure](../../../.repos/alignment-mcp/docs/rfcs/0039-foundation-infrastructure.md)
- [ADR 0003: CockroachDB Self-Hosted FIPS](../../../.repos/alignment-mcp/docs/adrs/0003-cockroachdb-self-hosted-fips.md)
- [ADR 0004: Set It and Forget It](../../../.repos/alignment-mcp/docs/adrs/0004-set-it-and-forget-it-architecture.md)
- [ADR 0005: Full-Stack Self-Hosting](../../../.repos/alignment-mcp/docs/adrs/0005-full-stack-self-hosting.md)
- [Karpenter Documentation](https://karpenter.sh/)
- [EKS Best Practices](https://aws.github.io/aws-eks-best-practices/)
See [RFC 0003](rfcs/0003-powerdns-self-hosted.md) for planned improvements:
- HA DNS with separate instance
- DNSSEC
- DNS-over-HTTPS
- PowerDNS-Admin UI


@@ -32,10 +32,11 @@ systemctl enable --now docker
sed -i "s/#Port 22/Port $SSH_PORT/" /etc/ssh/sshd_config
systemctl restart sshd
# Add admin SSH key
if [ -n "${ssh_public_key}" ]; then
# Add admin SSH key (passed from terraform)
SSH_KEY="${ssh_public_key}"
if [ -n "$SSH_KEY" ]; then
mkdir -p /home/ec2-user/.ssh
echo "${ssh_public_key}" >> /home/ec2-user/.ssh/authorized_keys
echo "$SSH_KEY" >> /home/ec2-user/.ssh/authorized_keys
chown -R ec2-user:ec2-user /home/ec2-user/.ssh
chmod 700 /home/ec2-user/.ssh
chmod 600 /home/ec2-user/.ssh/authorized_keys