Infrastructure Track | Hard | 2-3 days

Full Mini-Agent Infrastructure

Deploy a complete (simplified) agent stack: VPC, RDS, EKS/ECS, S3, KMS, Secrets Manager. The capstone project.

🎯 The Mission

This is the capstone exercise. You'll build a miniature version of Tenzai's production infrastructure from scratch. By the end, you'll have deployed every major component we use.

It ties together everything from the track: networking (VPC, ALB), compute (ECS/EKS), storage (S3, RDS), secrets (Secrets Manager), encryption (KMS), and observability (CloudWatch).

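⚠️ Time & Cost Warning: This is a substantial exercise. Budget 2-3 days. Use the smallest resource sizes possible (for example db.t3.micro and a single NAT Gateway). Tear everything down as soon as you finish: the NAT Gateway, ALB, and RDS instance bill by the hour even when idle.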

🏖️ Sandbox Rules

Architecture Overview

📊 MINI-AGENT STACK
┌─────────────────────────────────────────────────────────────────────────────┐
│                                  VPC                                         │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │  Public Subnets (2 AZs)                                              │    │
│  │  - NAT Gateway                                                       │    │
│  │  - ALB                                                               │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │  Private Subnets (2 AZs)                                             │    │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐                   │    │
│  │  │   Platform  │  │   Agent     │  │   RDS       │                   │    │
│  │  │   (ECS/EKS) │  │   (ECS/EKS) │  │   Postgres  │                   │    │
│  │  └─────────────┘  └─────────────┘  └─────────────┘                   │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────────────┘

External Services:
┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│     S3       │  │     KMS      │  │   Secrets    │  │  CloudWatch  │
│  (Artifacts) │  │ (Encryption) │  │   Manager    │  │    (Logs)    │
└──────────────┘  └──────────────┘  └──────────────┘  └──────────────┘
            

Components

🌐 VPC

2 AZs, public/private subnets, NAT Gateway, Internet Gateway

🔐 Security Groups

ALB, Platform, Agent, RDS - least privilege

⚖️ ALB

HTTPS listener, target groups, health checks

🐳 ECS/EKS

Cluster, services, task definitions

🗄️ RDS Postgres

db.t3.micro, encrypted, private subnet

📦 S3

Artifacts bucket with KMS encryption

🔑 KMS

Customer managed key for S3 + RDS

🔒 Secrets Manager

Database credentials, API keys

Implementation Phases
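
The snippets in the phases below reference `var.name` and a `random_password` resource without declaring the variable or the providers. A minimal preamble might look like the following sketch. The version constraints and the validation rule are assumptions chosen for illustration, not project requirements.

```hcl
terraform {
  required_version = ">= 1.5"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    # Needed for random_password in Phase 2
    random = {
      source  = "hashicorp/random"
      version = "~> 3.0"
    }
  }
}

provider "aws" {
  region = "us-east-1" # Matches the AZs used in Phase 1
}

variable "name" {
  description = "Short, unique prefix for all resources (for example your username)"
  type        = string

  # Assumption: keep the prefix valid for S3 bucket names and RDS identifiers.
  validation {
    condition     = can(regex("^[a-z][a-z0-9]{2,20}$", var.name))
    error_message = "Use 3-21 lowercase letters or digits, starting with a letter."
  }
}
```

Restricting the prefix to lowercase letters and digits keeps derived names such as `${var.name}-artifacts` valid everywhere they are used.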

Phase 1: Networking (Day 1 Morning)

# VPC with 2 AZs
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "${var.name}-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["us-east-1a", "us-east-1b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]

  enable_nat_gateway = true
  single_nat_gateway = true  # Cost savings for sandbox

  tags = {
    Project = "onboarding-capstone"
    Owner   = var.name
  }
}
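
The Components list calls for least-privilege security groups for the ALB, Platform, Agent, and RDS, and later phases reference `aws_security_group.platform` and `aws_security_group.rds`. The following is one possible sketch, not the production rule set. Port 8000 comes from the Phase 4 service, 5432 is the Postgres default, and the egress rules (HTTPS out through the NAT Gateway, Postgres inside the VPC) are assumptions about what the services need.

```hcl
# ALB: HTTPS in from the internet, traffic out only to the platform port.
resource "aws_security_group" "alb" {
  name_prefix = "${var.name}-alb-"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  egress {
    from_port   = 8000
    to_port     = 8000
    protocol    = "tcp"
    cidr_blocks = [module.vpc.vpc_cidr_block]
  }
}

# Platform: reachable only from the ALB.
resource "aws_security_group" "platform" {
  name_prefix = "${var.name}-platform-"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port       = 8000
    to_port         = 8000
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }
  egress {
    description = "HTTPS via the NAT Gateway (image pulls, AWS APIs)"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  egress {
    description = "Postgres inside the VPC"
    from_port   = 5432
    to_port     = 5432
    protocol    = "tcp"
    cidr_blocks = [module.vpc.vpc_cidr_block]
  }
}

# Agent: no inbound traffic at all, same outbound needs as the platform.
resource "aws_security_group" "agent" {
  name_prefix = "${var.name}-agent-"
  vpc_id      = module.vpc.vpc_id

  egress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  egress {
    from_port   = 5432
    to_port     = 5432
    protocol    = "tcp"
    cidr_blocks = [module.vpc.vpc_cidr_block]
  }
}

# RDS: Postgres only, and only from the two service security groups.
resource "aws_security_group" "rds" {
  name_prefix = "${var.name}-rds-"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.platform.id, aws_security_group.agent.id]
  }
}
```

Terraform removes the default allow-all egress rule from any security group it manages, so a group with no `egress` block (such as `rds` here) cannot initiate connections. The ALB egress targets the VPC CIDR rather than the platform group to avoid a dependency cycle between the two groups.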

Phase 2: Database (Day 1 Afternoon)

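# Random master password (requires the hashicorp/random provider).
# RDS rejects "/", "@", double quotes, and spaces, so restrict the specials.
resource "random_password" "db" {
  length           = 32
  special          = true
  override_special = "!#$%&*()-_=+[]{}<>:?"
}

# Secrets Manager for DB credentials
resource "aws_secretsmanager_secret" "db_password" {
  name = "${var.name}-db-password"

  recovery_window_in_days = 0  # Sandbox only! Deletes immediately on destroy
}

resource "aws_secretsmanager_secret_version" "db_password" {
  secret_id     = aws_secretsmanager_secret.db_password.id
  secret_string = random_password.db.result
}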

# RDS Postgres
resource "aws_db_instance" "main" {
  identifier        = "${var.name}-postgres"
  engine            = "postgres"
  engine_version    = "15"
  instance_class    = "db.t3.micro"
  allocated_storage = 20

  db_name  = "tenzai"
  username = "tenzai"
  password = random_password.db.result

  vpc_security_group_ids = [aws_security_group.rds.id]
  db_subnet_group_name   = aws_db_subnet_group.main.name

  storage_encrypted = true
  kms_key_id        = aws_kms_key.main.arn

  skip_final_snapshot = true  # Sandbox only!
}
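
The RDS instance above references `aws_db_subnet_group.main`, which is not defined in this phase. A minimal sketch, assuming the private subnets from the Phase 1 VPC module are the intended placement (the group name is an arbitrary choice):

```hcl
# RDS requires a subnet group spanning at least two AZs, even for a
# single-AZ instance. The Phase 1 private subnets satisfy that.
resource "aws_db_subnet_group" "main" {
  name       = "${var.name}-db"
  subnet_ids = module.vpc.private_subnets

  tags = {
    Project = "onboarding-capstone"
    Owner   = var.name
  }
}
```

Because only private subnets are listed, the database has no route from the internet regardless of its security group rules.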

Phase 3: Storage & Encryption (Day 1 End)

# KMS Key
resource "aws_kms_key" "main" {
  description             = "Key for ${var.name} stack"
  deletion_window_in_days = 7
  enable_key_rotation     = true
}

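# S3 Bucket (bucket names are globally unique, so pick a distinctive var.name)
resource "aws_s3_bucket" "artifacts" {
  bucket        = "${var.name}-artifacts"
  force_destroy = true  # Sandbox only! Lets terraform destroy remove a non-empty bucket
}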

resource "aws_s3_bucket_server_side_encryption_configuration" "artifacts" {
  bucket = aws_s3_bucket.artifacts.id

  rule {
    apply_server_side_encryption_by_default {
      kms_master_key_id = aws_kms_key.main.arn
      sse_algorithm     = "aws:kms"
    }
  }
}
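
Encryption at rest covers only part of the bucket's exposure. As an optional hardening sketch (not something the exercise states as a requirement), the bucket can also block all public access and reject requests that do not use TLS:

```hcl
resource "aws_s3_bucket_public_access_block" "artifacts" {
  bucket = aws_s3_bucket.artifacts.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Deny any request made over plain HTTP.
data "aws_iam_policy_document" "artifacts_tls_only" {
  statement {
    sid     = "DenyInsecureTransport"
    effect  = "Deny"
    actions = ["s3:*"]
    resources = [
      aws_s3_bucket.artifacts.arn,
      "${aws_s3_bucket.artifacts.arn}/*",
    ]

    principals {
      type        = "*"
      identifiers = ["*"]
    }

    condition {
      test     = "Bool"
      variable = "aws:SecureTransport"
      values   = ["false"]
    }
  }
}

resource "aws_s3_bucket_policy" "artifacts" {
  bucket = aws_s3_bucket.artifacts.id
  policy = data.aws_iam_policy_document.artifacts_tls_only.json

  # Avoid concurrent modifications of the same bucket during apply.
  depends_on = [aws_s3_bucket_public_access_block.artifacts]
}
```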

Phase 4: Compute - Platform Service (Day 2)

# ECS Cluster
resource "aws_ecs_cluster" "main" {
  name = "${var.name}-cluster"

  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}

# Platform Service
resource "aws_ecs_service" "platform" {
  name            = "platform"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.platform.arn
  desired_count   = 1
  launch_type     = "FARGATE"

  network_configuration {
    subnets         = module.vpc.private_subnets
    security_groups = [aws_security_group.platform.id]
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.platform.arn
    container_name   = "platform"
    container_port   = 8000
  }
}
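
The service above points at `aws_ecs_task_definition.platform`, which is not shown. The sketch below is one way it could look on Fargate, with logs sent to CloudWatch and the database password injected from Secrets Manager. The `platform_image` variable, the `DB_PASSWORD` variable name, the log group name, and the CPU/memory sizes are all assumptions for illustration.

```hcl
# Hypothetical variable: the container image for the platform service.
variable "platform_image" {
  type = string
}

resource "aws_cloudwatch_log_group" "platform" {
  name              = "/ecs/${var.name}/platform"
  retention_in_days = 7
}

# Execution role: lets ECS pull the image, write logs, and read the DB secret.
resource "aws_iam_role" "ecs_execution" {
  name = "${var.name}-ecs-execution"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ecs-tasks.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "ecs_execution" {
  role       = aws_iam_role.ecs_execution.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}

resource "aws_iam_role_policy" "read_db_secret" {
  name = "read-db-secret"
  role = aws_iam_role.ecs_execution.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "secretsmanager:GetSecretValue"
      Resource = aws_secretsmanager_secret.db_password.arn
    }]
  })
}

resource "aws_ecs_task_definition" "platform" {
  family                   = "${var.name}-platform"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = 256 # Smallest Fargate size
  memory                   = 512
  execution_role_arn       = aws_iam_role.ecs_execution.arn

  container_definitions = jsonencode([{
    name         = "platform" # Must match container_name in the service
    image        = var.platform_image
    essential    = true
    portMappings = [{ containerPort = 8000, protocol = "tcp" }]

    # Resolved by ECS at task start and exposed as an environment variable
    secrets = [{
      name      = "DB_PASSWORD"
      valueFrom = aws_secretsmanager_secret.db_password.arn
    }]

    logConfiguration = {
      logDriver = "awslogs"
      options = {
        "awslogs-group"         = aws_cloudwatch_log_group.platform.name
        "awslogs-region"        = "us-east-1" # Matches the Phase 1 AZs
        "awslogs-stream-prefix" = "platform"
      }
    }
  }])
}
```

Passing the secret through `secrets` rather than `environment` keeps the password out of the task definition itself; only the secret's ARN is stored there.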

Phase 5: Verification (Day 3)
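
One way to make verification easier is to expose the key identifiers as Terraform outputs, so that `terraform output` gives a quick inventory of what was deployed. This is a sketch that uses only resources defined in the earlier phases; the output names are arbitrary.

```hcl
output "vpc_id" { value = module.vpc.vpc_id }
output "private_subnets" { value = module.vpc.private_subnets }

output "db_endpoint" { value = aws_db_instance.main.endpoint }
output "db_encrypted" { value = aws_db_instance.main.storage_encrypted }
output "db_publicly_accessible" { value = aws_db_instance.main.publicly_accessible }

output "artifacts_bucket" { value = aws_s3_bucket.artifacts.bucket }
output "kms_key_arn" { value = aws_kms_key.main.arn }
output "db_secret_arn" { value = aws_secretsmanager_secret.db_password.arn }
output "ecs_cluster" { value = aws_ecs_cluster.main.name }
```

None of these outputs contains the password itself, only the ARN of the secret that holds it.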

✓ Success Criteria

Stretch Goals

Cleanup Checklist

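# IMPORTANT: Resources must go away in dependency order. terraform destroy
# works this out from the dependency graph; expect roughly this sequence:
# 1. ECS services (allow time to drain)
# 2. ALB, target groups
# 3. RDS (may take several minutes)
# 4. NAT Gateway
# 5. VPC (last)
# If the destroy fails partway, fix the blocker and run it again.

terraform destroy

# Verify in the console that nothing remains. The KMS key is only scheduled
# for deletion and stays in "Pending deletion" for 7 days.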