diff options
| author | grothedev <grothedev@gmail.com> | 2025-11-02 09:36:26 -0500 |
|---|---|---|
| committer | grothedev <grothedev@gmail.com> | 2025-11-02 09:36:26 -0500 |
| commit | 709c511bb6417e15eb62c75638c50afd9b7143ee (patch) | |
| tree | a5453a14f808ef8dcdb9842023a0cbc0eaa96a3f | |
| parent | 3dadb3aa1920f25a7f6d4b4775a83cabdbd8275b (diff) | |
| -rw-r--r-- | BINARIES.md | 680 | ||||
| -rw-r--r-- | BOOTSTRAP.md | 744 | ||||
| -rw-r--r-- | CERTIFICATES.md | 506 | ||||
| -rw-r--r-- | ISO-BUILDER.md | 849 |
4 files changed, 2779 insertions, 0 deletions
diff --git a/BINARIES.md b/BINARIES.md new file mode 100644 index 0000000..9e9d21b --- /dev/null +++ b/BINARIES.md @@ -0,0 +1,680 @@ +# Binary Acquisition and Packaging + +## Overview + +This document specifies exact versions, download sources, and installation paths for all software components in the cluster-from-systemd project. + +## Design Decision: Hybrid Package Strategy + +**Approach**: Use distro packages where stable, upstream binaries for latest features. + +**Rationale**: +- **Distro packages** (dnf/yum): Base system, containerd, utilities (reliable, security updates) +- **Upstream binaries**: Kubernetes, etcd (need specific versions, faster release cycle) +- **Container images**: Kafka, Ceph (easier to version-lock, upstream-supported) +- **Compiled from source**: Only if unavoidable (minimize complexity) + +## Base Distribution + +**Choice**: Rocky Linux 9.3 (or Fedora 39) + +**Rationale**: +- RHEL-compatible (enterprise acceptance) +- systemd 252+ (needed features) +- SELinux integration +- Long support lifecycle (Rocky: ~10 years) + +**Alternative**: Fedora 39 for bleeding-edge testing, migrate to Rocky for production. + +## Component Versions (2025-01-15 snapshot) + +### Kubernetes Cluster +- **Kubernetes**: v1.29.1 (upstream binaries) +- **etcd**: v3.5.11 (upstream binaries) +- **containerd**: 1.7.11 (distro package) +- **runc**: 1.1.10 (distro package) +- **CNI plugins**: v1.4.0 (upstream binaries) +- **Calico**: v3.27.0 (manifest/container images) + +### Storage (Ceph) +- **Ceph**: Reef 18.2.1 (distro package from CentOS SIG) +- **ceph-common**: 18.2.1 +- **ceph-mon**: 18.2.1 +- **ceph-osd**: 18.2.1 +- **ceph-mds**: 18.2.1 + +### Messaging +- **Kafka**: 3.6.1 (upstream tarball) +- **OpenJDK**: 17.0.9 (distro package - required by Kafka) +- **Mosquitto**: 2.0.18 (distro package) + +### DNS +- **CoreDNS**: v1.11.1 (upstream binary) + +### Utilities +- **yq**: v4.40.5 (YAML processor) +- **jq**: 1.6 (distro package) +- **curl**: latest (distro) +- **openssl**: latest (distro) + +## Binary Installation Paths + +``` +/usr/local/bin/ +├── kubeadm # Kubernetes cluster bootstrapper +├── kubectl # Kubernetes CLI +├── kubelet # Kubernetes node agent +├── kube-apiserver # Kubernetes API server +├── kube-controller-manager # Kubernetes controller +├── kube-scheduler # Kubernetes scheduler +├── etcd # etcd binary +├── etcdctl # etcd CLI +├── coredns # CoreDNS binary +└── yq # YAML processor + +/usr/bin/ (via distro packages) +├── containerd +├── ctr +├── runc +├── ceph +├── ceph-mon +├── ceph-osd +├── mosquitto +├── mosquitto_passwd +├── java (OpenJDK) +└── jq + +/opt/ +├── kafka/ # Kafka installation +│ ├── bin/ +│ │ ├── kafka-server-start.sh +│ │ ├── kafka-topics.sh +│ │ └── ... +│ ├── config/ +│ │ └── kraft/ +│ └── libs/ +└── cni/ # CNI plugins + └── bin/ + ├── bridge + ├── host-local + ├── loopback + └── ... +``` + +## Download Sources and Installation + +### 1. Kubernetes Binaries + +**Version**: v1.29.1 +**Source**: https://github.com/kubernetes/kubernetes/releases +**Architecture**: amd64 (x86_64) + +```bash +#!/bin/bash +# tools/download-kubernetes.sh + +KUBE_VERSION="v1.29.1" +ARCH="amd64" +BASE_URL="https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/${ARCH}" + +BINARIES=( + "kubeadm" + "kubectl" + "kubelet" + "kube-apiserver" + "kube-controller-manager" + "kube-scheduler" + "kube-proxy" +) + +for binary in "${BINARIES[@]}"; do + echo "Downloading $binary..." + curl -L "${BASE_URL}/${binary}" -o "/usr/local/bin/${binary}" + chmod +x "/usr/local/bin/${binary}" +done + +# Verify +kubelet --version +kubectl version --client +``` + +**Checksums**: Download and verify +```bash +curl -L "${BASE_URL}/kubelet.sha256" -o kubelet.sha256 +echo "$(cat kubelet.sha256) /usr/local/bin/kubelet" | sha256sum --check +``` + +### 2. etcd + +**Version**: v3.5.11 +**Source**: https://github.com/etcd-io/etcd/releases + +```bash +#!/bin/bash +# tools/download-etcd.sh + +ETCD_VERSION="v3.5.11" +ARCH="amd64" +TARBALL="etcd-${ETCD_VERSION}-linux-${ARCH}.tar.gz" +URL="https://github.com/etcd-io/etcd/releases/download/${ETCD_VERSION}/${TARBALL}" + +curl -L "$URL" -o "/tmp/${TARBALL}" +tar xzf "/tmp/${TARBALL}" -C /tmp +mv /tmp/etcd-${ETCD_VERSION}-linux-${ARCH}/etcd /usr/local/bin/ +mv /tmp/etcd-${ETCD_VERSION}-linux-${ARCH}/etcdctl /usr/local/bin/ +chmod +x /usr/local/bin/etcd /usr/local/bin/etcdctl + +# Verify +etcd --version +etcdctl version +``` + +### 3. containerd + runc (Distro Packages) + +**Source**: Rocky Linux 9 AppStream repository + +```bash +#!/bin/bash +# tools/install-container-runtime.sh + +dnf install -y \ + containerd.io \ + runc \ + containernetworking-plugins + +# Or from Docker repo for newer version +dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo +dnf install -y containerd.io + +# Configure containerd +mkdir -p /etc/containerd +containerd config default > /etc/containerd/config.toml + +# Enable systemd cgroup driver (required for Kubernetes) +sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml + +systemctl enable containerd +``` + +### 4. CNI Plugins + +**Version**: v1.4.0 +**Source**: https://github.com/containernetworking/plugins/releases + +```bash +#!/bin/bash +# tools/download-cni-plugins.sh + +CNI_VERSION="v1.4.0" +ARCH="amd64" +CNI_TARBALL="cni-plugins-linux-${ARCH}-${CNI_VERSION}.tgz" +URL="https://github.com/containernetworking/plugins/releases/download/${CNI_VERSION}/${CNI_TARBALL}" + +mkdir -p /opt/cni/bin +curl -L "$URL" -o "/tmp/${CNI_TARBALL}" +tar xzf "/tmp/${CNI_TARBALL}" -C /opt/cni/bin + +chmod +x /opt/cni/bin/* +ls -lh /opt/cni/bin/ +``` + +### 5. Calico CNI + +**Version**: v3.27.0 +**Source**: https://docs.tigera.io/calico/latest/getting-started/kubernetes/self-managed-onprem/onpremises + +**Installation method**: Kubernetes manifest (applied after cluster init) + +```bash +#!/bin/bash +# tools/install-calico.sh +# Run AFTER first master is initialized + +CALICO_VERSION="v3.27.0" +kubectl apply -f "https://raw.githubusercontent.com/projectcalico/calico/${CALICO_VERSION}/manifests/calico.yaml" +``` + +**Embedded in ISO**: Download manifest at build time, apply at bootstrap. + +```bash +# At ISO build time +mkdir -p /etc/kubernetes/manifests/calico +curl -L "https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/calico.yaml" \ + -o /etc/kubernetes/manifests/calico/calico.yaml +``` + +### 6. Ceph (Distro Packages) + +**Version**: Reef 18.2.1 +**Source**: CentOS Storage SIG repository + +```bash +#!/bin/bash +# tools/install-ceph.sh + +# Add Ceph repository +cat > /etc/yum.repos.d/ceph.repo <<EOF +[ceph] +name=Ceph packages for \$basearch +baseurl=https://download.ceph.com/rpm-reef/el9/\$basearch +enabled=1 +gpgcheck=1 +gpgkey=https://download.ceph.com/keys/release.asc + +[ceph-noarch] +name=Ceph noarch packages +baseurl=https://download.ceph.com/rpm-reef/el9/noarch +enabled=1 +gpgcheck=1 +gpgkey=https://download.ceph.com/keys/release.asc +EOF + +# Install Ceph packages +dnf install -y \ + ceph-common \ + ceph-mon \ + ceph-osd \ + ceph-mds \ + ceph-mgr \ + python3-ceph-argparse \ + python3-cephfs + +# Verify +ceph --version +``` + +### 7. Kafka + +**Version**: 3.6.1 +**Source**: https://kafka.apache.org/downloads + +```bash +#!/bin/bash +# tools/install-kafka.sh + +KAFKA_VERSION="3.6.1" +SCALA_VERSION="2.13" +TARBALL="kafka_${SCALA_VERSION}-${KAFKA_VERSION}.tgz" +URL="https://archive.apache.org/dist/kafka/${KAFKA_VERSION}/${TARBALL}" + +# Install Java (required) +dnf install -y java-17-openjdk-headless + +# Download and extract Kafka +curl -L "$URL" -o "/tmp/${TARBALL}" +tar xzf "/tmp/${TARBALL}" -C /opt/ +mv "/opt/kafka_${SCALA_VERSION}-${KAFKA_VERSION}" /opt/kafka + +# Create symlink for easy updates +ln -sf /opt/kafka /opt/kafka-current + +# Verify +/opt/kafka/bin/kafka-topics.sh --version +``` + +### 8. Mosquitto (Distro Package) + +**Version**: 2.0.18 +**Source**: EPEL repository + +```bash +#!/bin/bash +# tools/install-mosquitto.sh + +# Enable EPEL +dnf install -y epel-release + +# Install Mosquitto +dnf install -y mosquitto mosquitto-clients + +# Verify +mosquitto -h | head -1 +``` + +### 9. CoreDNS + +**Version**: v1.11.1 +**Source**: https://github.com/coredns/coredns/releases + +```bash +#!/bin/bash +# tools/download-coredns.sh + +COREDNS_VERSION="1.11.1" +ARCH="amd64" +TARBALL="coredns_${COREDNS_VERSION}_linux_${ARCH}.tgz" +URL="https://github.com/coredns/coredns/releases/download/v${COREDNS_VERSION}/${TARBALL}" + +curl -L "$URL" -o "/tmp/${TARBALL}" +tar xzf "/tmp/${TARBALL}" -C /tmp +mv /tmp/coredns /usr/local/bin/ +chmod +x /usr/local/bin/coredns + +# Verify +coredns -version +``` + +### 10. Utilities + +```bash +#!/bin/bash +# tools/install-utilities.sh + +# Install from distro repos +dnf install -y \ + jq \ + curl \ + openssl \ + iproute \ + iputils \ + bind-utils \ + wget \ + vim \ + tmux \ + htop + +# Install yq (YAML processor) +YQ_VERSION="v4.40.5" +YQ_BINARY="yq_linux_amd64" +curl -L "https://github.com/mikefarah/yq/releases/download/${YQ_VERSION}/${YQ_BINARY}" \ + -o /usr/local/bin/yq +chmod +x /usr/local/bin/yq + +# Verify +yq --version +jq --version +``` + +## Complete Installation Script + +**Script**: `tools/install-all-binaries.sh` + +```bash +#!/bin/bash +# Master script to install all binaries in correct order + +set -euo pipefail + +echo "==> Installing base system packages..." +dnf update -y +dnf install -y epel-release + +echo "==> Installing container runtime..." +./tools/install-container-runtime.sh + +echo "==> Installing Kubernetes binaries..." +./tools/download-kubernetes.sh + +echo "==> Installing etcd..." +./tools/download-etcd.sh + +echo "==> Installing CNI plugins..." +./tools/download-cni-plugins.sh + +echo "==> Installing Ceph..." +./tools/install-ceph.sh + +echo "==> Installing Kafka..." +./tools/install-kafka.sh + +echo "==> Installing Mosquitto..." +./tools/install-mosquitto.sh + +echo "==> Installing CoreDNS..." +./tools/download-coredns.sh + +echo "==> Installing utilities..." +./tools/install-utilities.sh + +echo "==> Downloading Calico manifest..." +mkdir -p /etc/kubernetes/manifests/calico +curl -L "https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/calico.yaml" \ + -o /etc/kubernetes/manifests/calico/calico.yaml + +echo "==> Binary installation complete!" +echo "" +echo "Installed versions:" +kubelet --version +etcd --version +containerd --version +ceph --version +java -version 2>&1 | head -1 +mosquitto -h 2>&1 | head -1 +coredns -version +``` + +## Package List for ISO Builder + +**File**: `configs/package-list.txt` + +``` +# Base system +@core +@standard +kernel +kernel-modules +kernel-modules-extra +systemd +systemd-networkd +systemd-resolved + +# Networking +NetworkManager +iproute +iputils +bind-utils +iptables +nftables + +# Container runtime (from Docker repo) +containerd.io + +# Development tools (for troubleshooting) +strace +tcpdump +lsof +vim +tmux +htop + +# Python (for cluster scripts) +python3 +python3-pip +python3-pyyaml + +# Java (for Kafka) +java-17-openjdk-headless + +# MQTT +mosquitto +mosquitto-clients + +# Utilities +jq +curl +wget +openssl +tar +gzip +``` + +## Dependency Tree + +``` +Kubernetes +├── containerd +│ └── runc +├── CNI plugins +└── etcd + +Ceph +├── python3-ceph-argparse +└── ceph-common + +Kafka +└── java-17-openjdk-headless + +Mosquitto +└── (no dependencies) + +CoreDNS +└── (no dependencies) +``` + +## Version Lock Strategy + +**Problem**: Prevent unexpected version changes between builds. + +**Solution**: Pin exact versions in scripts, not "latest". + +**Update process**: +1. Test new versions in dev environment +2. Update version variables in download scripts +3. Rebuild ISO +4. Run integration tests +5. Deploy to production + +## Verification Checklist + +Run after binary installation: + +```bash +#!/bin/bash +# tools/verify-binaries.sh + +ERRORS=0 + +check_binary() { + local binary=$1 + local version_flag=${2:---version} + + if ! command -v "$binary" &>/dev/null; then + echo "ERROR: Binary not found: $binary" + ((ERRORS++)) + else + echo "✓ $binary: $($binary $version_flag 2>&1 | head -1)" + fi +} + +echo "==> Verifying installed binaries..." + +# Kubernetes +check_binary kubelet --version +check_binary kubectl version --client +check_binary kube-apiserver --version +check_binary kube-controller-manager --version +check_binary kube-scheduler --version + +# Container runtime +check_binary containerd --version +check_binary runc --version + +# etcd +check_binary etcd --version +check_binary etcdctl version + +# Ceph +check_binary ceph --version + +# Kafka +if [[ -x /opt/kafka/bin/kafka-topics.sh ]]; then + echo "✓ Kafka: $(/opt/kafka/bin/kafka-topics.sh --version 2>&1)" +else + echo "ERROR: Kafka not found" + ((ERRORS++)) +fi + +# Mosquitto +check_binary mosquitto -h + +# CoreDNS +check_binary coredns -version + +# Utilities +check_binary yq --version +check_binary jq --version + +# CNI plugins +if [[ -d /opt/cni/bin ]]; then + echo "✓ CNI plugins: $(ls /opt/cni/bin | wc -l) binaries installed" +else + echo "ERROR: CNI plugins directory not found" + ((ERRORS++)) +fi + +if [[ $ERRORS -gt 0 ]]; then + echo "==> Verification FAILED with $ERRORS errors" + exit 1 +fi + +echo "==> Verification PASSED" +``` + +## Storage Requirements + +**Disk space needed for binaries**: + +``` +Kubernetes binaries: ~500 MB +etcd: ~30 MB +containerd/runc: ~50 MB +CNI plugins: ~40 MB +Ceph packages: ~150 MB +Kafka: ~100 MB +Mosquitto: ~5 MB +CoreDNS: ~50 MB +Base system: ~2 GB +----------------------------------- +Total: ~3 GB + +Recommended ISO size: 8 GB (includes configs, logs, temp space) +``` + +## Update Strategy + +**Minor version updates** (e.g., 1.29.1 → 1.29.2): +1. Update version variable in download script +2. Rebuild ISO +3. Test in staging + +**Major version updates** (e.g., 1.29 → 1.30): +1. Review Kubernetes changelog +2. Update API versions in manifests +3. Test upgrade path (rolling update) +4. Update documentation + +## Airgapped Installation + +For environments without internet access: + +```bash +#!/bin/bash +# tools/create-binary-bundle.sh +# Download all binaries for offline installation + +BUNDLE_DIR="./binary-bundle" +mkdir -p "$BUNDLE_DIR" + +# Download all binaries to local directory +# Modify download scripts to save to $BUNDLE_DIR instead of /usr/local/bin + +# Create tarball +tar czf binary-bundle.tar.gz "$BUNDLE_DIR" + +# On target system: +# tar xzf binary-bundle.tar.gz +# ./binary-bundle/install-offline.sh +``` + +## Next Steps + +1. Implement all download scripts in `tools/` +2. Test binary installation on clean Rocky 9 VM +3. Create version-lock manifest (SHA256 checksums) +4. Integrate with ISO builder kickstart +5. Add to validation workflow (`verify-binaries.sh`) +6. Document upgrade procedures + +--- + +**P.S.** This is the unglamorous but critical work—getting the right bits in the right place. Using distro packages where possible keeps maintenance sane; upstream binaries for Kubernetes give you version control. The scripts are copy-paste ready. Next doc should be **BOOTSTRAP.md** to define the initialization sequence. diff --git a/BOOTSTRAP.md b/BOOTSTRAP.md new file mode 100644 index 0000000..f200322 --- /dev/null +++ b/BOOTSTRAP.md @@ -0,0 +1,744 @@ +# Cluster Bootstrap and Initialization + +## Overview + +This document defines the **first-boot initialization sequence** that transforms individual nodes into a functioning cluster. Bootstrap is the most complex phase because nodes must coordinate without a working cluster. + +## The Bootstrap Problem + +**Challenge**: Kubernetes requires a control plane to join nodes, but the control plane doesn't exist yet. + +**Solution**: Designate one node as **first master**, which initializes the cluster. All other nodes join the existing cluster. + +## Bootstrap States + +Each node tracks its bootstrap state to avoid re-initialization on reboot. + +``` +/var/lib/cluster-state/ +├── bootstrap-state # Current state: uninitialized|bootstrapped|joined +├── cluster-joined # Timestamp of successful join +├── node-role # master-first|master-additional|worker +└── etcd-member-id # etcd member ID (masters only) +``` + +## State Transitions + +``` +┌─────────────────┐ +│ Uninitialized │ ← Fresh install +└────────┬────────┘ + │ + ▼ + ┌────────────────────┐ + │ Is first master? │ + └─┬────────────────┬─┘ + │ YES │ NO + ▼ ▼ +┌─────────────┐ ┌──────────────┐ +│ Bootstrap │ │ Wait for │ +│ Cluster │ │ First Master │ +└──────┬──────┘ └──────┬───────┘ + │ │ + ▼ ▼ +┌─────────────┐ ┌──────────────┐ +│ Bootstrapped│ │ Join Cluster │ +└──────┬──────┘ └──────┬───────┘ + │ │ + └────────┬────────┘ + ▼ + ┌───────────────┐ + │ Cluster Ready │ + └───────────────┘ +``` + +## Determining First Master + +**Logic**: The first master is the **first master node in cluster.yaml** (sorted by node name). + +**Script**: `tools/am-i-first-master.sh` + +```bash +#!/bin/bash +# Determines if this node should bootstrap the cluster + +set -euo pipefail + +CONFIG_DIR="${CONFIG_DIR:-/etc/cluster-config}" +NODE_NAME=$(cat "$CONFIG_DIR/node-identity") +NODE_CONFIG="$CONFIG_DIR/nodes/$NODE_NAME.yaml" +CLUSTER_CONFIG="$CONFIG_DIR/cluster.yaml" + +# Check if this node has master/control-plane role +ROLES=$(yq eval '.node.roles[]' "$NODE_CONFIG") +IS_MASTER=false +[[ "$ROLES" =~ "master" || "$ROLES" =~ "control-plane" ]] && IS_MASTER=true + +if [[ "$IS_MASTER" != "true" ]]; then + echo "worker" + exit 0 +fi + +# Find all master nodes from cluster.yaml +MASTER_NODES=$(yq eval '.nodes[] | select(.roles[] | contains("master") or contains("control-plane")) | .name' "$CLUSTER_CONFIG" | sort) + +# Get the first master +FIRST_MASTER=$(echo "$MASTER_NODES" | head -1) + +if [[ "$NODE_NAME" == "$FIRST_MASTER" ]]; then + echo "master-first" +else + echo "master-additional" +fi +``` + +## Bootstrap Sequence + +### Phase 1: Network Configuration (All Nodes) + +**Runs**: Before cluster-detect.service +**Script**: `tools/configure-network.sh` + +```bash +#!/bin/bash +# Configure static IP from cluster.yaml + +set -euo pipefail + +CONFIG_DIR="${CONFIG_DIR:-/etc/cluster-config}" +TEMP_NODE_NAME="${1:-}" # During first boot, identity not yet known + +# Detect node identity (simplified version for network setup) +if [[ -z "$TEMP_NODE_NAME" ]]; then + # Try MAC address detection + MY_MAC=$(ip link show | awk '/ether/ {print $2}' | head -1) + + for node_file in "$CONFIG_DIR/nodes"/*.yaml; do + MAC_ADDRS=$(yq eval '.node.hardware.mac_addresses[]?' "$node_file" 2>/dev/null || echo "") + if echo "$MAC_ADDRS" | grep -qi "$MY_MAC"; then + TEMP_NODE_NAME=$(yq eval '.node.name' "$node_file") + break + fi + done +fi + +if [[ -z "$TEMP_NODE_NAME" ]]; then + echo "ERROR: Cannot determine node identity for network config" + exit 1 +fi + +NODE_CONFIG="$CONFIG_DIR/nodes/$TEMP_NODE_NAME.yaml" +NODE_IP=$(yq eval '.node.ip' "$NODE_CONFIG") +NODE_HOSTNAME=$(yq eval '.node.hostname' "$NODE_CONFIG") + +# Get network interface (assume first non-loopback) +IFACE=$(ip route | grep default | awk '{print $5}' | head -1) + +echo "==> Configuring network for $TEMP_NODE_NAME ($NODE_IP)" + +# Create NetworkManager connection +nmcli connection delete "$IFACE" 2>/dev/null || true +nmcli connection add \ + type ethernet \ + con-name "$IFACE" \ + ifname "$IFACE" \ + ip4 "$NODE_IP/24" \ + gw4 "192.168.1.1" \ + ipv4.dns "8.8.8.8,8.8.4.4" \ + ipv4.method manual + +nmcli connection up "$IFACE" + +# Set hostname +hostnamectl set-hostname "$NODE_HOSTNAME" + +# Add to /etc/hosts +grep -q "$NODE_HOSTNAME" /etc/hosts || \ + echo "$NODE_IP $NODE_HOSTNAME" >> /etc/hosts + +echo "==> Network configured: $NODE_IP ($NODE_HOSTNAME)" +``` + +### Phase 2: First Master Bootstrap + +**Runs**: After cluster-detect, if first master and uninitialized +**Script**: `tools/bootstrap-first-master.sh` + +```bash +#!/bin/bash +# Initialize Kubernetes cluster on first master node + +set -euo pipefail + +CONFIG_DIR="${CONFIG_DIR:-/etc/cluster-config}" +STATE_DIR="/var/lib/cluster-state" +mkdir -p "$STATE_DIR" + +# Check if already bootstrapped +if [[ -f "$STATE_DIR/bootstrap-state" ]]; then + STATE=$(cat "$STATE_DIR/bootstrap-state") + if [[ "$STATE" == "bootstrapped" ]]; then + echo "==> Cluster already bootstrapped, skipping" + exit 0 + fi +fi + +NODE_NAME=$(cat "$CONFIG_DIR/node-identity") +NODE_IP=$(yq eval '.node.ip' "$CONFIG_DIR/current-node.yaml") +CLUSTER_NAME=$(yq eval '.cluster.name' "$CONFIG_DIR/cluster.yaml") +POD_CIDR=$(yq eval '.cluster.networking.pod_cidr' "$CONFIG_DIR/cluster.yaml") +SERVICE_CIDR=$(yq eval '.cluster.networking.service_cidr' "$CONFIG_DIR/cluster.yaml") + +echo "==> Bootstrapping Kubernetes cluster on first master: $NODE_NAME" + +# 1. Initialize etcd cluster +echo "==> Initializing etcd..." +mkdir -p /var/lib/etcd + +# Create etcd initial cluster list +MASTER_NODES=$(yq eval '.nodes[] | select(.roles[] | contains("master") or contains("control-plane")) | .name + "=https://" + .ip + ":2380"' "$CONFIG_DIR/cluster.yaml" | tr '\n' ',' | sed 's/,$//') + +cat > /tmp/etcd-bootstrap.env <<EOF +ETCD_NAME=$NODE_NAME +ETCD_DATA_DIR=/var/lib/etcd +ETCD_LISTEN_PEER_URLS=https://$NODE_IP:2380 +ETCD_LISTEN_CLIENT_URLS=https://$NODE_IP:2379,https://127.0.0.1:2379 +ETCD_INITIAL_ADVERTISE_PEER_URLS=https://$NODE_IP:2380 +ETCD_ADVERTISE_CLIENT_URLS=https://$NODE_IP:2379 +ETCD_INITIAL_CLUSTER=$MASTER_NODES +ETCD_INITIAL_CLUSTER_STATE=new +ETCD_INITIAL_CLUSTER_TOKEN=$CLUSTER_NAME-etcd +ETCD_CERT_FILE=/etc/kubernetes/pki/etcd/server.crt +ETCD_KEY_FILE=/etc/kubernetes/pki/etcd/server.key +ETCD_CLIENT_CERT_AUTH=true +ETCD_TRUSTED_CA_FILE=/etc/kubernetes/pki/etcd/ca.crt +ETCD_PEER_CERT_FILE=/etc/kubernetes/pki/etcd/peer.crt +ETCD_PEER_KEY_FILE=/etc/kubernetes/pki/etcd/peer.key +ETCD_PEER_CLIENT_CERT_AUTH=true +ETCD_PEER_TRUSTED_CA_FILE=/etc/kubernetes/pki/etcd/ca.crt +EOF + +# Start etcd +systemctl start etcd +sleep 5 + +# Verify etcd is healthy +etcdctl --endpoints=https://127.0.0.1:2379 \ + --cacert=/etc/kubernetes/pki/etcd/ca.crt \ + --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \ + --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \ + endpoint health + +# 2. Start API server +echo "==> Starting kube-apiserver..." +systemctl start kube-apiserver +sleep 10 + +# Wait for API server to be ready +for i in {1..30}; do + if curl -k https://127.0.0.1:6443/healthz &>/dev/null; then + echo "==> API server is ready" + break + fi + echo "Waiting for API server... ($i/30)" + sleep 2 +done + +# 3. Start controller manager and scheduler +echo "==> Starting control plane components..." +systemctl start kube-controller-manager +systemctl start kube-scheduler + +# 4. Create initial kubeconfig for kubectl +mkdir -p ~/.kube +cat > ~/.kube/config <<EOF +apiVersion: v1 +kind: Config +clusters: +- cluster: + certificate-authority: /etc/kubernetes/pki/ca.crt + server: https://127.0.0.1:6443 + name: $CLUSTER_NAME +contexts: +- context: + cluster: $CLUSTER_NAME + user: kubernetes-admin + name: kubernetes-admin@$CLUSTER_NAME +current-context: kubernetes-admin@$CLUSTER_NAME +users: +- name: kubernetes-admin + user: + client-certificate: /etc/kubernetes/pki/apiserver-kubelet-client.crt + client-key: /etc/kubernetes/pki/apiserver-kubelet-client.key +EOF + +# 5. Install Calico CNI +echo "==> Installing Calico CNI..." +kubectl apply -f /etc/kubernetes/manifests/calico/calico.yaml + +# Wait for Calico pods to be ready +echo "==> Waiting for Calico to be ready..." +kubectl wait --for=condition=ready pod -l k8s-app=calico-node -n kube-system --timeout=300s || true + +# 6. Start kubelet on this node +echo "==> Starting kubelet..." +systemctl start kubelet + +# 7. Create bootstrap tokens for worker nodes +echo "==> Creating bootstrap token..." +TOKEN=$(openssl rand -hex 3).$(openssl rand -hex 8) +echo "$TOKEN" > "$STATE_DIR/bootstrap-token" + +# Create token secret in Kubernetes +kubectl create secret generic bootstrap-token-${TOKEN%.*} \ + --type bootstrap.kubernetes.io/token \ + --namespace kube-system \ + --from-literal=token-id=${TOKEN%.*} \ + --from-literal=token-secret=${TOKEN#*.} \ + --from-literal=usage-bootstrap-authentication=true \ + --from-literal=usage-bootstrap-signing=true + +# 8. Create join command for workers +CA_CERT_HASH=$(openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | \ + openssl rsa -pubin -outform der 2>/dev/null | \ + openssl dgst -sha256 -hex | sed 's/^.* //') + +cat > "$STATE_DIR/worker-join-command" <<EOF +# Run this on worker nodes to join the cluster +kubeadm join $NODE_IP:6443 --token $TOKEN --discovery-token-ca-cert-hash sha256:$CA_CERT_HASH +EOF + +echo "==> Bootstrap complete!" +echo "bootstrap-state: bootstrapped" > "$STATE_DIR/bootstrap-state" +echo "$NODE_NAME" > "$STATE_DIR/cluster-initialized-by" +date -Iseconds > "$STATE_DIR/cluster-joined" + +# 9. Distribute join info to shared location (optional: NFS, HTTP server, etc.) +# For now, workers must manually fetch from first master +mkdir -p /var/www/html/cluster-join +cp "$STATE_DIR/worker-join-command" /var/www/html/cluster-join/ +cp "$STATE_DIR/bootstrap-token" /var/www/html/cluster-join/ + +echo "==> Worker join command saved to: $STATE_DIR/worker-join-command" +``` + +### Phase 3: Additional Master Join + +**Runs**: After cluster-detect, if additional master and first master is ready +**Script**: `tools/join-additional-master.sh` + +```bash +#!/bin/bash +# Join additional master nodes to existing etcd cluster + +set -euo pipefail + +CONFIG_DIR="${CONFIG_DIR:-/etc/cluster-config}" +STATE_DIR="/var/lib/cluster-state" +mkdir -p "$STATE_DIR" + +# Check if already joined +if [[ -f "$STATE_DIR/bootstrap-state" ]]; then + STATE=$(cat "$STATE_DIR/bootstrap-state") + if [[ "$STATE" == "joined" ]]; then + echo "==> Already joined cluster, skipping" + exit 0 + fi +fi + +NODE_NAME=$(cat "$CONFIG_DIR/node-identity") +NODE_IP=$(yq eval '.node.ip' "$CONFIG_DIR/current-node.yaml") +CLUSTER_NAME=$(yq eval '.cluster.name' "$CONFIG_DIR/cluster.yaml") + +# Get first master IP +FIRST_MASTER_IP=$(yq eval '.nodes[] | select(.roles[] | contains("master") or contains("control-plane")) | .ip' "$CONFIG_DIR/cluster.yaml" | head -1) + +echo "==> Joining as additional master: $NODE_NAME" +echo "==> First master: $FIRST_MASTER_IP" + +# Wait for first master API server to be ready +echo "==> Waiting for first master API server..." +for i in {1..60}; do + if curl -k https://$FIRST_MASTER_IP:6443/healthz &>/dev/null; then + echo "==> First master is ready" + break + fi + echo "Waiting for first master... ($i/60)" + sleep 5 +done + +# Add this node to etcd cluster +echo "==> Adding node to etcd cluster..." + +# etcd add member command (run on first master via ssh or API) +# For now, assume etcdctl is available and certs are distributed +ETCD_ENDPOINTS="https://$FIRST_MASTER_IP:2379" + +etcdctl --endpoints=$ETCD_ENDPOINTS \ + --cacert=/etc/kubernetes/pki/etcd/ca.crt \ + --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \ + --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \ + member add "$NODE_NAME" --peer-urls="https://$NODE_IP:2380" + +# Start etcd in "existing" mode +cat > /tmp/etcd-join.env <<EOF +ETCD_NAME=$NODE_NAME +ETCD_DATA_DIR=/var/lib/etcd +ETCD_INITIAL_CLUSTER_STATE=existing +# ... (same as bootstrap but with "existing") +EOF + +systemctl start etcd + +# Start control plane components +systemctl start kube-apiserver +systemctl start kube-controller-manager +systemctl start kube-scheduler +systemctl start kubelet + +echo "joined" > "$STATE_DIR/bootstrap-state" +date -Iseconds > "$STATE_DIR/cluster-joined" + +echo "==> Additional master joined successfully" +``` + +### Phase 4: Worker Join + +**Runs**: After cluster-detect, if worker node +**Script**: `tools/join-worker.sh` + +```bash +#!/bin/bash +# Join worker node to Kubernetes cluster + +set -euo pipefail + +CONFIG_DIR="${CONFIG_DIR:-/etc/cluster-config}" +STATE_DIR="/var/lib/cluster-state" +mkdir -p "$STATE_DIR" + +# Check if already joined +if [[ -f "$STATE_DIR/bootstrap-state" ]]; then + STATE=$(cat "$STATE_DIR/bootstrap-state") + if [[ "$STATE" == "joined" ]]; then + echo "==> Already joined cluster, skipping" + exit 0 + fi +fi + +NODE_NAME=$(cat "$CONFIG_DIR/node-identity") + +# Get first master IP +FIRST_MASTER_IP=$(yq eval '.nodes[] | select(.roles[] | contains("master") or contains("control-plane")) | .ip' "$CONFIG_DIR/cluster.yaml" | head -1) + +echo "==> Joining worker node: $NODE_NAME" +echo "==> API server: https://$FIRST_MASTER_IP:6443" + +# Wait for API server +for i in {1..60}; do + if curl -k https://$FIRST_MASTER_IP:6443/healthz &>/dev/null; then + echo "==> API server is ready" + break + fi + echo "Waiting for API server... ($i/60)" + sleep 5 +done + +# Fetch join token from first master (requires setup) +# Option 1: HTTP endpoint on first master +# Option 2: Shared NFS mount +# Option 3: Embedded in cluster.yaml (less secure) + +# For now, assume token is in cluster.yaml or fetched via curl +TOKEN=$(curl -s http://$FIRST_MASTER_IP/cluster-join/bootstrap-token 2>/dev/null || echo "") + +if [[ -z "$TOKEN" ]]; then + echo "ERROR: Cannot fetch bootstrap token from first master" + echo "Manual join required. Run on first master:" + echo " kubeadm token create --print-join-command" + exit 1 +fi + +CA_CERT_HASH=$(openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | \ + openssl rsa -pubin -outform der 2>/dev/null | \ + openssl dgst -sha256 -hex | sed 's/^.* //') + +# Join cluster (kubeadm-style, but we're using raw kubelet) +# Create kubelet bootstrap kubeconfig +cat > /etc/kubernetes/bootstrap-kubelet.conf <<EOF +apiVersion: v1 +kind: Config +clusters: +- cluster: + certificate-authority: /etc/kubernetes/pki/ca.crt + server: https://$FIRST_MASTER_IP:6443 + name: kubernetes +contexts: +- context: + cluster: kubernetes + user: kubelet-bootstrap + name: kubelet-bootstrap@kubernetes +current-context: kubelet-bootstrap@kubernetes +users: +- name: kubelet-bootstrap + user: + token: $TOKEN +EOF + +# Start kubelet (will auto-generate client cert via CSR) +systemctl start kubelet + +# Auto-approve CSR (on first master, create script to approve) +# kubectl certificate approve <csr-name> + +echo "joined" > "$STATE_DIR/bootstrap-state" +date -Iseconds > "$STATE_DIR/cluster-joined" + +echo "==> Worker joined successfully" +echo "==> Check status on master: kubectl get nodes" +``` + +### Phase 5: Ceph Cluster Bootstrap + +**Runs**: After Kubernetes is ready, on first ceph-mon node +**Script**: `tools/bootstrap-ceph.sh` + +```bash +#!/bin/bash +# Bootstrap Ceph cluster on first monitor node + +set -euo pipefail + +CONFIG_DIR="${CONFIG_DIR:-/etc/cluster-config}" +STATE_DIR="/var/lib/cluster-state" + +# Check if already initialized +if [[ -f "$STATE_DIR/ceph-bootstrapped" ]]; then + echo "==> Ceph already bootstrapped" + exit 0 +fi + +NODE_NAME=$(cat "$CONFIG_DIR/node-identity") +NODE_IP=$(yq eval '.node.ip' "$CONFIG_DIR/current-node.yaml") +CLUSTER_NAME=$(yq eval '.cluster.name' "$CONFIG_DIR/cluster.yaml") + +# Check if this is the first ceph-mon node +CEPH_MON_NODES=$(yq eval '.nodes[] | select(.roles[] | contains("ceph-mon")) | .name' "$CONFIG_DIR/cluster.yaml" | sort) +FIRST_MON=$(echo "$CEPH_MON_NODES" | head -1) + +if [[ "$NODE_NAME" != "$FIRST_MON" ]]; then + echo "==> Not first monitor, skipping Ceph bootstrap" + exit 0 +fi + +echo "==> Bootstrapping Ceph cluster on first monitor: $NODE_NAME" + +# Generate Ceph keys (if not already done) +/usr/local/bin/generate-ceph-keys.sh + +FSID=$(cat /etc/ceph/cluster.fsid) + +# Create ceph.conf +cat > /etc/ceph/ceph.conf <<EOF +[global] +fsid = $FSID +mon initial members = $FIRST_MON +mon host = $NODE_IP +public network = 192.168.1.0/24 +cluster network = 192.168.1.0/24 +auth cluster required = cephx +auth service required = cephx +auth client required = cephx +osd pool default size = 3 +osd pool default min size = 2 +EOF + +# Create monitor map +monmaptool --create --add $FIRST_MON $NODE_IP --fsid $FSID /tmp/monmap + +# Initialize monitor data directory +mkdir -p /var/lib/ceph/mon/ceph-$FIRST_MON +ceph-mon --mkfs -i $FIRST_MON --monmap /tmp/monmap --keyring /etc/ceph/ceph.mon.keyring + +# Start monitor +systemctl start ceph-mon@$FIRST_MON + +# Wait for mon to be ready +sleep 10 + +# Enable manager module +ceph mgr module enable dashboard + +echo "==> Ceph cluster bootstrapped" +date -Iseconds > "$STATE_DIR/ceph-bootstrapped" +``` + +## Bootstrap Coordination Service + +**New systemd unit**: `cluster-bootstrap.service` + +```ini +[Unit] +Description=Cluster Bootstrap Orchestrator +After=cluster-detect.service network-online.target +Wants=network-online.target + +[Service] +Type=oneshot +RemainAfterExit=yes +ExecStart=/usr/local/bin/cluster-bootstrap-orchestrator.sh + +[Install] +WantedBy=multi-user.target +``` + +**Script**: `tools/cluster-bootstrap-orchestrator.sh` + +```bash +#!/bin/bash +# Main orchestrator that calls appropriate bootstrap script based on node role + +set -euo pipefail + +NODE_ROLE=$(/usr/local/bin/am-i-first-master.sh) + +case "$NODE_ROLE" in + master-first) + echo "==> Detected as first master, bootstrapping cluster..." + /usr/local/bin/bootstrap-first-master.sh + ;; + master-additional) + echo "==> Detected as additional master, joining cluster..." + /usr/local/bin/join-additional-master.sh + ;; + worker) + echo "==> Detected as worker, joining cluster..." + /usr/local/bin/join-worker.sh + ;; + *) + echo "ERROR: Unknown node role: $NODE_ROLE" + exit 1 + ;; +esac + +# If node has ceph-mon role, bootstrap Ceph +ROLES=$(yq eval '.node.roles[]' /etc/cluster-config/current-node.yaml) +if [[ "$ROLES" =~ "ceph-mon" ]]; then + /usr/local/bin/bootstrap-ceph.sh +fi +``` + +## Bootstrap Token Distribution + +**Problem**: Workers need the bootstrap token, but first master generates it. + +**Solutions**: + +### Option 1: HTTP Server on First Master +```bash +# On first master, start simple HTTP server +python3 -m http.server 8080 -d /var/www/html/cluster-join & + +# On workers +curl http://<first-master>:8080/bootstrap-token +``` + +### Option 2: Embedded in cluster.yaml (Less Secure) +```yaml +# configs/cluster.yaml +cluster: + bootstrap: + token: "abcdef.0123456789abcdef" # Pre-generated +``` + +### Option 3: External Secret Store +- First master uploads token to Vault/S3 +- Workers fetch using node identity credentials + +## Failure Recovery + +### Bootstrap Failed Midway + +```bash +#!/bin/bash +# tools/reset-bootstrap.sh +# Clean up partial bootstrap to retry + +systemctl stop kubelet kube-apiserver kube-controller-manager kube-scheduler etcd +rm -rf /var/lib/etcd/* +rm -rf /var/lib/kubelet/* +rm -f /var/lib/cluster-state/bootstrap-state +echo "==> Bootstrap state reset. Reboot to retry." +``` + +### Additional Master Can't Join + +1. Check etcd cluster health on first master +2. Verify certificates are correct (CN, SANs) +3. Check network connectivity (port 2380, 6443) +4. Review etcd logs: `journalctl -u etcd` + +## Testing Bootstrap + +### VM Test Scenario + +```bash +# Create 3 VMs: master-01, worker-01, worker-02 +# 1. Boot master-01, wait for bootstrap +# 2. Verify: kubectl get nodes (should show master-01) +# 3. Boot worker-01 +# 4. Verify: kubectl get nodes (should show master-01, worker-01) +# 5. Boot worker-02 +# 6. Verify: kubectl get nodes (should show all 3) +``` + +## Monitoring Bootstrap Progress + +**Script**: `tools/check-bootstrap-status.sh` + +```bash +#!/bin/bash +# Check current bootstrap status + +STATE_DIR="/var/lib/cluster-state" + +if [[ ! -f "$STATE_DIR/bootstrap-state" ]]; then + echo "Status: UNINITIALIZED" + exit 0 +fi + +STATE=$(cat "$STATE_DIR/bootstrap-state") +echo "Bootstrap State: $STATE" + +if [[ -f "$STATE_DIR/cluster-joined" ]]; then + echo "Joined At: $(cat "$STATE_DIR/cluster-joined")" +fi + +# Check service status +echo "" +echo "Service Status:" +systemctl is-active etcd 2>/dev/null && echo " etcd: running" || echo " etcd: stopped" +systemctl is-active kube-apiserver 2>/dev/null && echo " kube-apiserver: running" || echo " kube-apiserver: stopped" +systemctl is-active kubelet 2>/dev/null && echo " kubelet: running" || echo " kubelet: stopped" + +# Check Kubernetes node status +if command -v kubectl &>/dev/null; then + echo "" + echo "Kubernetes Nodes:" + kubectl get nodes 2>/dev/null || echo " API server not reachable" +fi +``` + +## Next Steps + +1. Implement all bootstrap scripts +2. Test first master initialization +3. Test worker join +4. Test multi-master etcd cluster formation +5. Integrate with ISO builder +6. Add bootstrap progress logging +7. Create troubleshooting guide + +--- + +**P.S.** This is where theory meets reality. Bootstrap is the gnarliest part because you're building the control plane with no control plane. The state machine approach (first-master vs additional-master vs worker) keeps it manageable. Token distribution is still a bit hand-wavy—recommend starting with HTTP server approach, upgrade to Vault later. diff --git a/CERTIFICATES.md b/CERTIFICATES.md new file mode 100644 index 0000000..26ca71b --- /dev/null +++ b/CERTIFICATES.md @@ -0,0 +1,506 @@ +# Certificate and Key Management + +## Overview + +This document specifies the complete PKI infrastructure required for the cluster-from-systemd project. All services requiring mutual TLS authentication or encrypted communication need certificates generated before they can start. + +## Design Decision: Hybrid Approach + +**Strategy**: Pre-generate CA and static certs at ISO build time, generate node-specific certs on first boot. + +**Rationale**: +- **Build-time**: CA certificates, service account keys (shared across cluster) +- **Boot-time**: Node-specific certs (kubelet, etcd peer certs tied to IP/hostname) +- **Avoids**: Chicken-and-egg problem with multi-master coordination +- **Security**: CA private key embedded in ISO (acceptable for isolated clusters; can be removed post-bootstrap for production) + +## Certificate Inventory + +### Kubernetes PKI (10 certificate pairs + 1 key) + +``` +/etc/kubernetes/pki/ +├── ca.crt # Kubernetes CA certificate +├── ca.key # Kubernetes CA private key +├── apiserver.crt # API server certificate +├── apiserver.key # API server private key +├── apiserver-kubelet-client.crt # API server → kubelet client cert +├── apiserver-kubelet-client.key +├── front-proxy-ca.crt # Front proxy CA +├── front-proxy-ca.key +├── front-proxy-client.crt # Front proxy client +├── front-proxy-client.key +├── sa.pub # Service account public key +├── sa.key # Service account private key (RSA) +└── etcd/ + ├── ca.crt # etcd CA certificate + ├── ca.key # etcd CA private key + ├── server.crt # etcd server certificate (PER NODE) + ├── server.key # etcd server private key (PER NODE) + ├── peer.crt # etcd peer certificate (PER NODE) + ├── peer.key # etcd peer private key (PER NODE) + ├── healthcheck-client.crt # etcd health check client + └── healthcheck-client.key +``` + +**Per-node certificates** (generated at boot): +``` +/var/lib/kubelet/pki/ +├── kubelet.crt # Node-specific kubelet server cert +├── kubelet.key # Node-specific kubelet private key +└── kubelet-client-current.pem # Kubelet client cert (CSR auto-rotation) +``` + +### Ceph Authentication (4 keyrings) + +``` +/etc/ceph/ +├── ceph.conf # Cluster configuration +├── ceph.client.admin.keyring # Admin keyring +├── ceph.mon.keyring # Monitor keyring +├── ceph.bootstrap-osd.keyring # OSD bootstrap keyring +└── ceph.bootstrap-mds.keyring # MDS bootstrap keyring +``` + +### MQTT Authentication (1 password file) + +``` +/etc/mosquitto/ +└── passwd # Password file (mosquitto_passwd format) +``` + +## Certificate Generation Scripts + +### 1. Build-Time: Generate Cluster-Wide Certs + +**Script**: `tools/generate-build-certs.sh` + +```bash +#!/bin/bash +# Generates all CA certificates and shared keys at ISO build time +# Output: certs/ directory to be embedded in ISO + +set -euo pipefail + +CERT_DIR="${1:-./certs}" +mkdir -p "$CERT_DIR"/{kubernetes/pki/etcd,ceph,mqtt} + +echo "==> Generating Kubernetes CA certificates..." + +# 1. Kubernetes CA +openssl genrsa -out "$CERT_DIR/kubernetes/pki/ca.key" 2048 +openssl req -x509 -new -nodes -key "$CERT_DIR/kubernetes/pki/ca.key" \ + -subj "/CN=kubernetes-ca" \ + -days 3650 -out "$CERT_DIR/kubernetes/pki/ca.crt" + +# 2. Front Proxy CA +openssl genrsa -out "$CERT_DIR/kubernetes/pki/front-proxy-ca.key" 2048 +openssl req -x509 -new -nodes -key "$CERT_DIR/kubernetes/pki/front-proxy-ca.key" \ + -subj "/CN=kubernetes-front-proxy-ca" \ + -days 3650 -out "$CERT_DIR/kubernetes/pki/front-proxy-ca.crt" + +# 3. etcd CA +openssl genrsa -out "$CERT_DIR/kubernetes/pki/etcd/ca.key" 2048 +openssl req -x509 -new -nodes -key "$CERT_DIR/kubernetes/pki/etcd/ca.key" \ + -subj "/CN=etcd-ca" \ + -days 3650 -out "$CERT_DIR/kubernetes/pki/etcd/ca.crt" + +# 4. Service Account Key Pair (RSA for signing JWTs) +openssl genrsa -out "$CERT_DIR/kubernetes/pki/sa.key" 2048 +openssl rsa -in "$CERT_DIR/kubernetes/pki/sa.key" \ + -pubout -out "$CERT_DIR/kubernetes/pki/sa.pub" + +# 5. Front Proxy Client (static - same for all API servers) +openssl genrsa -out "$CERT_DIR/kubernetes/pki/front-proxy-client.key" 2048 +openssl req -new -key "$CERT_DIR/kubernetes/pki/front-proxy-client.key" \ + -subj "/CN=front-proxy-client" \ + -out "$CERT_DIR/kubernetes/pki/front-proxy-client.csr" +openssl x509 -req -in "$CERT_DIR/kubernetes/pki/front-proxy-client.csr" \ + -CA "$CERT_DIR/kubernetes/pki/front-proxy-ca.crt" \ + -CAkey "$CERT_DIR/kubernetes/pki/front-proxy-ca.key" \ + -CAcreateserial -days 3650 \ + -out "$CERT_DIR/kubernetes/pki/front-proxy-client.crt" + +echo "==> Generating Ceph keyrings..." + +# Ceph uses cephx authentication (symmetric keys) +# These will be generated using ceph-authtool at first-boot of first monitor +# For now, create placeholder directory +touch "$CERT_DIR/ceph/.placeholder" + +echo "==> Generating MQTT password file..." + +# Default admin user (change password post-install!) +# Password: "admin" (hashed) +mosquitto_passwd -c -b "$CERT_DIR/mqtt/passwd" admin admin 2>/dev/null || { + echo "WARNING: mosquitto_passwd not found. Creating empty file." + touch "$CERT_DIR/mqtt/passwd" +} + +echo "==> Build-time certificates generated in $CERT_DIR" +ls -lhR "$CERT_DIR" +``` + +### 2. First Boot (Master): Generate Node-Specific Certs + +**Script**: `tools/generate-node-certs.sh` + +```bash +#!/bin/bash +# Generates node-specific certificates on first boot +# Run by cluster-detect.service after node identity established + +set -euo pipefail + +CONFIG_DIR="${CONFIG_DIR:-/etc/cluster-config}" +NODE_NAME=$(cat "$CONFIG_DIR/node-identity") +NODE_CONFIG="$CONFIG_DIR/nodes/$NODE_NAME.yaml" + +# Extract node IP from current-node.yaml +NODE_IP=$(yq eval '.node.ip' "$CONFIG_DIR/current-node.yaml") +CLUSTER_NAME=$(yq eval '.cluster.name' "$CONFIG_DIR/cluster.yaml") +SERVICE_CIDR=$(yq eval '.cluster.networking.service_cidr' "$CONFIG_DIR/cluster.yaml") + +# Kubernetes API VIP (first IP in service CIDR) +KUBERNETES_SVC_IP=$(echo "$SERVICE_CIDR" | awk -F'[./]' '{print $1"."$2"."$3".1"}') + +PKI_DIR="/etc/kubernetes/pki" +ETCD_PKI_DIR="$PKI_DIR/etcd" + +echo "==> Generating certificates for node: $NODE_NAME ($NODE_IP)" + +# Check if node is a master +ROLES=$(yq eval '.node.roles[]' "$NODE_CONFIG") +IS_MASTER=false +[[ "$ROLES" =~ "master" || "$ROLES" =~ "control-plane" ]] && IS_MASTER=true + +if [[ "$IS_MASTER" == "true" ]]; then + echo "==> Generating API server certificate..." + + # Create OpenSSL config for SAN + cat > /tmp/apiserver-csr.conf <<EOF +[req] +req_extensions = v3_req +distinguished_name = req_distinguished_name +[req_distinguished_name] +[v3_req] +basicConstraints = CA:FALSE +keyUsage = nonRepudiation, digitalSignature, keyEncipherment +extendedKeyUsage = serverAuth +subjectAltName = @alt_names +[alt_names] +DNS.1 = kubernetes +DNS.2 = kubernetes.default +DNS.3 = kubernetes.default.svc +DNS.4 = kubernetes.default.svc.cluster.local +DNS.5 = $NODE_NAME +DNS.6 = $CLUSTER_NAME +IP.1 = $KUBERNETES_SVC_IP +IP.2 = $NODE_IP +IP.3 = 127.0.0.1 +EOF + + openssl genrsa -out "$PKI_DIR/apiserver.key" 2048 + openssl req -new -key "$PKI_DIR/apiserver.key" \ + -subj "/CN=kube-apiserver" \ + -config /tmp/apiserver-csr.conf \ + -out "$PKI_DIR/apiserver.csr" + openssl x509 -req -in "$PKI_DIR/apiserver.csr" \ + -CA "$PKI_DIR/ca.crt" -CAkey "$PKI_DIR/ca.key" \ + -CAcreateserial -days 3650 \ + -extensions v3_req -extfile /tmp/apiserver-csr.conf \ + -out "$PKI_DIR/apiserver.crt" + + echo "==> Generating API server kubelet client certificate..." + openssl genrsa -out "$PKI_DIR/apiserver-kubelet-client.key" 2048 + openssl req -new -key "$PKI_DIR/apiserver-kubelet-client.key" \ + -subj "/CN=kube-apiserver-kubelet-client/O=system:masters" \ + -out "$PKI_DIR/apiserver-kubelet-client.csr" + openssl x509 -req -in "$PKI_DIR/apiserver-kubelet-client.csr" \ + -CA "$PKI_DIR/ca.crt" -CAkey "$PKI_DIR/ca.key" \ + -CAcreateserial -days 3650 \ + -out "$PKI_DIR/apiserver-kubelet-client.crt" + + echo "==> Generating etcd server certificate..." + cat > /tmp/etcd-server-csr.conf <<EOF +[req] +req_extensions = v3_req +distinguished_name = req_distinguished_name +[req_distinguished_name] +[v3_req] +basicConstraints = CA:FALSE +keyUsage = nonRepudiation, digitalSignature, keyEncipherment +extendedKeyUsage = serverAuth, clientAuth +subjectAltName = @alt_names +[alt_names] +DNS.1 = $NODE_NAME +DNS.2 = localhost +IP.1 = $NODE_IP +IP.2 = 127.0.0.1 +EOF + + openssl genrsa -out "$ETCD_PKI_DIR/server.key" 2048 + openssl req -new -key "$ETCD_PKI_DIR/server.key" \ + -subj "/CN=$NODE_NAME" \ + -config /tmp/etcd-server-csr.conf \ + -out "$ETCD_PKI_DIR/server.csr" + openssl x509 -req -in "$ETCD_PKI_DIR/server.csr" \ + -CA "$ETCD_PKI_DIR/ca.crt" -CAkey "$ETCD_PKI_DIR/ca.key" \ + -CAcreateserial -days 3650 \ + -extensions v3_req -extfile /tmp/etcd-server-csr.conf \ + -out "$ETCD_PKI_DIR/server.crt" + + echo "==> Generating etcd peer certificate..." + openssl genrsa -out "$ETCD_PKI_DIR/peer.key" 2048 + openssl req -new -key "$ETCD_PKI_DIR/peer.key" \ + -subj "/CN=$NODE_NAME" \ + -config /tmp/etcd-server-csr.conf \ + -out "$ETCD_PKI_DIR/peer.csr" + openssl x509 -req -in "$ETCD_PKI_DIR/peer.csr" \ + -CA "$ETCD_PKI_DIR/ca.crt" -CAkey "$ETCD_PKI_DIR/ca.key" \ + -CAcreateserial -days 3650 \ + -extensions v3_req -extfile /tmp/etcd-server-csr.conf \ + -out "$ETCD_PKI_DIR/peer.crt" + + echo "==> Generating etcd healthcheck client certificate..." + openssl genrsa -out "$ETCD_PKI_DIR/healthcheck-client.key" 2048 + openssl req -new -key "$ETCD_PKI_DIR/healthcheck-client.key" \ + -subj "/CN=kube-etcd-healthcheck-client/O=system:masters" \ + -out "$ETCD_PKI_DIR/healthcheck-client.csr" + openssl x509 -req -in "$ETCD_PKI_DIR/healthcheck-client.csr" \ + -CA "$ETCD_PKI_DIR/ca.crt" -CAkey "$ETCD_PKI_DIR/ca.key" \ + -CAcreateserial -days 3650 \ + -out "$ETCD_PKI_DIR/healthcheck-client.crt" + + # Set permissions + chmod 600 "$PKI_DIR"/*.key "$ETCD_PKI_DIR"/*.key +fi + +echo "==> Generating kubelet certificate..." +mkdir -p /var/lib/kubelet/pki + +cat > /tmp/kubelet-csr.conf <<EOF +[req] +req_extensions = v3_req +distinguished_name = req_distinguished_name +[req_distinguished_name] +[v3_req] +basicConstraints = CA:FALSE +keyUsage = nonRepudiation, digitalSignature, keyEncipherment +extendedKeyUsage = serverAuth +subjectAltName = @alt_names +[alt_names] +DNS.1 = $NODE_NAME +IP.1 = $NODE_IP +EOF + +openssl genrsa -out /var/lib/kubelet/pki/kubelet.key 2048 +openssl req -new -key /var/lib/kubelet/pki/kubelet.key \ + -subj "/CN=system:node:$NODE_NAME/O=system:nodes" \ + -config /tmp/kubelet-csr.conf \ + -out /var/lib/kubelet/pki/kubelet.csr +openssl x509 -req -in /var/lib/kubelet/pki/kubelet.csr \ + -CA "$PKI_DIR/ca.crt" -CAkey "$PKI_DIR/ca.key" \ + -CAcreateserial -days 3650 \ + -extensions v3_req -extfile /tmp/kubelet-csr.conf \ + -out /var/lib/kubelet/pki/kubelet.crt + +chmod 600 /var/lib/kubelet/pki/kubelet.key + +echo "==> Node certificates generated successfully" +``` + +### 3. First Boot (First Master): Generate Ceph Keys + +**Script**: `tools/generate-ceph-keys.sh` + +```bash +#!/bin/bash +# Generates Ceph authentication keys on first monitor node +# Only run on the FIRST ceph-mon node + +set -euo pipefail + +CEPH_DIR="/etc/ceph" +mkdir -p "$CEPH_DIR" + +echo "==> Generating Ceph cluster FSID..." +FSID=$(uuidgen) +echo "$FSID" > "$CEPH_DIR/cluster.fsid" + +echo "==> Generating Ceph monitor keyring..." +ceph-authtool --create-keyring "$CEPH_DIR/ceph.mon.keyring" \ + --gen-key -n mon. --cap mon 'allow *' + +echo "==> Generating Ceph admin keyring..." +ceph-authtool --create-keyring "$CEPH_DIR/ceph.client.admin.keyring" \ + --gen-key -n client.admin \ + --cap mon 'allow *' \ + --cap osd 'allow *' \ + --cap mds 'allow *' \ + --cap mgr 'allow *' + +echo "==> Generating OSD bootstrap keyring..." +ceph-authtool --create-keyring "$CEPH_DIR/ceph.bootstrap-osd.keyring" \ + --gen-key -n client.bootstrap-osd \ + --cap mon 'allow profile bootstrap-osd' \ + --cap mgr 'allow r' + +echo "==> Generating MDS bootstrap keyring..." +ceph-authtool --create-keyring "$CEPH_DIR/ceph.bootstrap-mds.keyring" \ + --gen-key -n client.bootstrap-mds \ + --cap mon 'allow profile bootstrap-mds' \ + --cap mgr 'allow r' + +# Import admin keyring into mon keyring +ceph-authtool "$CEPH_DIR/ceph.mon.keyring" --import-keyring "$CEPH_DIR/ceph.client.admin.keyring" + +chmod 600 "$CEPH_DIR"/*.keyring + +echo "==> Ceph keys generated. FSID: $FSID" +echo "==> These keys must be distributed to all ceph nodes!" +``` + +## Integration with Boot Flow + +### Modified Boot Sequence + +``` +1. cluster-detect.service + ├─> cluster-detect.sh (identify node) + └─> generate-node-certs.sh (NEW - generate node-specific certs) + +2. generate-environment-files.sh + (reads certs, creates .env files with cert paths) + +3. cluster-activate-roles.sh + (enables targets based on roles) + +4. Services start + (now have valid certificates) +``` + +### Updated cluster-detect.service + +```ini +[Unit] +Description=Cluster Node Detection and Certificate Generation +DefaultDependencies=no +Before=network-pre.target +Wants=network-pre.target + +[Service] +Type=oneshot +RemainAfterExit=yes +ExecStart=/usr/local/bin/cluster-detect.sh +ExecStart=/usr/local/bin/generate-node-certs.sh +ExecStart=/usr/local/bin/generate-environment-files.sh +ExecStart=/usr/local/bin/cluster-activate-roles.sh + +[Install] +WantedBy=multi-user.target +``` + +## Certificate Distribution for Multi-Master + +**Problem**: Masters 2+ need the CA private keys to sign their own certs. + +**Solutions** (pick one): + +### Option A: Embed CA keys in ISO (simple, less secure) +- All CA private keys baked into ISO +- Each master generates its own node certs using shared CA +- **Remove CA keys after cluster init** via post-bootstrap script + +### Option B: Fetch from first master (secure, complex) +- First master generates CA, serves via HTTPS +- Additional masters fetch CA bundle during bootstrap +- Requires authentication (shared secret in cluster.yaml) + +### Option C: External secret store (production) +- Build-time: Upload CA to Vault/AWS Secrets Manager +- Boot-time: Fetch CA using node identity proof +- Requires external dependency + +**Recommendation**: Start with Option A, migrate to Option C for production. + +## Security Hardening Checklist + +- [ ] Set file permissions: 600 for `.key`, 644 for `.crt` +- [ ] Remove CA private keys after all masters bootstrapped +- [ ] Rotate service account keys annually +- [ ] Enable Kubernetes certificate rotation (kubelet `--rotate-certificates`) +- [ ] Monitor certificate expiry (90 days before expiration) +- [ ] Store root CA offline (not in ISO for production) +- [ ] Use hardware security module (HSM) for CA keys (production) +- [ ] Implement certificate revocation list (CRL) + +## Validation Script + +**Script**: `tools/validate-certs.sh` + +```bash +#!/bin/bash +# Validates all certificates before service startup + +set -euo pipefail + +PKI_DIR="/etc/kubernetes/pki" +ERRORS=0 + +check_cert() { + local cert=$1 + local key=$2 + + if [[ ! -f "$cert" ]]; then + echo "ERROR: Certificate not found: $cert" + ((ERRORS++)) + return + fi + + if [[ ! -f "$key" ]]; then + echo "ERROR: Private key not found: $key" + ((ERRORS++)) + return + fi + + # Check expiry + if ! openssl x509 -in "$cert" -noout -checkend 86400; then + echo "WARNING: Certificate expires within 24 hours: $cert" + fi + + # Verify key matches cert + cert_modulus=$(openssl x509 -in "$cert" -noout -modulus) + key_modulus=$(openssl rsa -in "$key" -noout -modulus) + + if [[ "$cert_modulus" != "$key_modulus" ]]; then + echo "ERROR: Certificate and key mismatch: $cert / $key" + ((ERRORS++)) + fi +} + +echo "==> Validating Kubernetes certificates..." +check_cert "$PKI_DIR/ca.crt" "$PKI_DIR/ca.key" +check_cert "$PKI_DIR/apiserver.crt" "$PKI_DIR/apiserver.key" + +if [[ $ERRORS -gt 0 ]]; then + echo "==> Validation FAILED with $ERRORS errors" + exit 1 +fi + +echo "==> Validation PASSED" +``` + +## Next Steps + +1. Implement `tools/generate-build-certs.sh` +2. Implement `tools/generate-node-certs.sh` +3. Implement `tools/generate-ceph-keys.sh` +4. Update `cluster-detect.service` to call cert generation +5. Update service units to validate certs in `ExecStartPre=` +6. Test certificate generation in VM +7. Document CA key removal procedure for production + +--- + +**P.S.** This gives you a working PKI. It's not production-grade (CA keys in ISO is a compromise), but it unblocks development. You can swap to Vault/external PKI later without changing the service units—they just read from `/etc/kubernetes/pki/`. diff --git a/ISO-BUILDER.md b/ISO-BUILDER.md new file mode 100644 index 0000000..a1c5ebd --- /dev/null +++ b/ISO-BUILDER.md @@ -0,0 +1,849 @@ +# ISO Builder + +## Overview + +This document specifies how to build a **bootable ISO image** that contains the base OS, all binaries, configuration files, systemd units, and scripts required for cluster-from-systemd. + +## Design Decision: Kickstart + Live Image + +**Approach**: Create a Rocky Linux 9 live ISO with embedded kickstart for automated installation. + +**Rationale**: +- **Kickstart**: Automates installation (partitioning, package selection, post-install scripts) +- **Live CD tooling**: Well-established RHEL ecosystem tools (lorax, livecd-tools) +- **Single ISO**: All nodes boot from the same image +- **Reproducible**: Version-controlled build process + +**Alternative approaches considered**: +- **cloud-init/ignition**: Better for cloud, but requires external metadata service +- **PXE boot**: Network dependency, more complex infrastructure +- **Container-based build (mkosi)**: Bleeding edge, less RHEL-native + +## Build System Requirements + +**Host OS**: Fedora 39 or Rocky Linux 9 (build environment) + +**Required packages**: +```bash +dnf install -y \ + lorax \ + anaconda \ + pykickstart \ + createrepo_c \ + genisoimage \ + isomd5sum \ + syslinux \ + git +``` + +**Disk space**: ~30 GB (for build artifacts, repos, and final ISO) + +## Directory Structure + +``` +iso-build/ +├── kickstart/ +│ └── cluster-node.ks # Main kickstart file +├── repo/ +│ ├── binaries/ # Downloaded binaries (k8s, etcd, etc.) +│ ├── rpms/ # Custom RPM packages +│ └── repodata/ # RPM repository metadata +├── overlay/ +│ ├── etc/ +│ │ ├── cluster-config/ # Cluster YAML configs +│ │ │ ├── cluster.yaml +│ │ │ ├── services/ +│ │ │ └── nodes/ +│ │ ├── systemd/system/ # Systemd units +│ │ └── kubernetes/ +│ │ ├── pki/ # Pre-generated certificates +│ │ └── manifests/ +│ ├── usr/local/bin/ # Cluster scripts +│ └── var/www/html/ # Join info distribution +├── output/ +│ └── cluster-node.iso # Final bootable ISO +└── build.sh # Main build script +``` + +## Kickstart File + +**File**: `iso-build/kickstart/cluster-node.ks` + +```kickstart +#version=RHEL9 + +# Install mode (not upgrade) +install + +# Use CDROM installation media +cdrom + +# Text mode installation (no GUI) +text + +# System language +lang en_US.UTF-8 + +# Keyboard layouts +keyboard us + +# Network configuration (temporary DHCP, will be reconfigured on boot) +network --bootproto=dhcp --device=link --activate + +# Root password (change this!) +rootpw --plaintext cluster-password + +# System authorization +authselect select sssd + +# SELinux mode +selinux --enforcing + +# Firewall configuration +firewall --enabled --service=ssh + +# System timezone +timezone America/New_York --utc + +# Disk partitioning +# WARNING: This will ERASE the entire disk! +ignoredisk --only-use=sda +clearpart --all --initlabel --drives=sda +part /boot --fstype=xfs --size=1024 +part / --fstype=xfs --size=20480 --grow +part swap --fstype=swap --size=4096 + +# Bootloader +bootloader --location=mbr --boot-drive=sda + +# Reboot after installation +reboot + +# Package selection +%packages +@core +@standard +kernel +systemd +systemd-networkd +NetworkManager +vim +tmux +htop +curl +wget +jq +python3 +python3-pip +python3-pyyaml +openssl +iproute +iputils +bind-utils +iptables +nftables +%end + +# Pre-installation script +%pre --log=/tmp/ks-pre.log +#!/bin/bash +echo "==> Starting pre-installation" +# Could add disk detection logic here +%end + +# Post-installation script +%post --log=/root/ks-post.log --erroronfail +#!/bin/bash +set -x + +echo "==> Starting post-installation configuration" + +# 1. Copy cluster configuration files +echo "==> Installing cluster configuration..." +mkdir -p /etc/cluster-config/{services,nodes,environment} +cp -r /mnt/install/repo/overlay/etc/cluster-config/* /etc/cluster-config/ + +# 2. Copy systemd units +echo "==> Installing systemd units..." +cp /mnt/install/repo/overlay/etc/systemd/system/* /etc/systemd/system/ +systemctl daemon-reload + +# 3. Install cluster scripts +echo "==> Installing cluster scripts..." +cp /mnt/install/repo/overlay/usr/local/bin/* /usr/local/bin/ +chmod +x /usr/local/bin/*.sh + +# 4. Install binaries +echo "==> Installing Kubernetes binaries..." +BINARIES="kubelet kubectl kubeadm kube-apiserver kube-controller-manager kube-scheduler" +for binary in $BINARIES; do + if [[ -f /mnt/install/repo/binaries/$binary ]]; then + cp /mnt/install/repo/binaries/$binary /usr/local/bin/ + chmod +x /usr/local/bin/$binary + fi +done + +echo "==> Installing etcd..." +if [[ -f /mnt/install/repo/binaries/etcd ]]; then + cp /mnt/install/repo/binaries/etcd /usr/local/bin/ + cp /mnt/install/repo/binaries/etcdctl /usr/local/bin/ + chmod +x /usr/local/bin/etcd /usr/local/bin/etcdctl +fi + +echo "==> Installing CoreDNS..." +if [[ -f /mnt/install/repo/binaries/coredns ]]; then + cp /mnt/install/repo/binaries/coredns /usr/local/bin/ + chmod +x /usr/local/bin/coredns +fi + +echo "==> Installing CNI plugins..." +if [[ -d /mnt/install/repo/binaries/cni ]]; then + mkdir -p /opt/cni/bin + cp -r /mnt/install/repo/binaries/cni/* /opt/cni/bin/ + chmod +x /opt/cni/bin/* +fi + +echo "==> Installing Kafka..." +if [[ -d /mnt/install/repo/binaries/kafka ]]; then + mkdir -p /opt + cp -r /mnt/install/repo/binaries/kafka /opt/ + ln -sf /opt/kafka /opt/kafka-current +fi + +# 5. Install certificates +echo "==> Installing certificates..." +mkdir -p /etc/kubernetes/pki/etcd +if [[ -d /mnt/install/repo/overlay/etc/kubernetes/pki ]]; then + cp -r /mnt/install/repo/overlay/etc/kubernetes/pki/* /etc/kubernetes/pki/ + chmod 600 /etc/kubernetes/pki/*.key + chmod 600 /etc/kubernetes/pki/etcd/*.key +fi + +# 6. Create data directories +echo "==> Creating data directories..." +mkdir -p /var/lib/{kubelet,etcd,kafka,ceph/{mon,osd},mosquitto} +mkdir -p /var/lib/cluster-state + +# 7. Install container runtime (containerd) +echo "==> Installing containerd..." +dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo +dnf install -y containerd.io runc + +mkdir -p /etc/containerd +containerd config default > /etc/containerd/config.toml +sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml + +systemctl enable containerd + +# 8. Install Ceph packages +echo "==> Installing Ceph packages..." +cat > /etc/yum.repos.d/ceph.repo <<'CEPH_REPO' +[ceph] +name=Ceph packages for $basearch +baseurl=https://download.ceph.com/rpm-reef/el9/$basearch +enabled=1 +gpgcheck=1 +gpgkey=https://download.ceph.com/keys/release.asc + +[ceph-noarch] +name=Ceph noarch packages +baseurl=https://download.ceph.com/rpm-reef/el9/noarch +enabled=1 +gpgcheck=1 +gpgkey=https://download.ceph.com/keys/release.asc +CEPH_REPO + +dnf install -y ceph-common ceph-mon ceph-osd ceph-mds ceph-mgr + +# 9. Install Mosquitto +echo "==> Installing Mosquitto..." +dnf install -y epel-release +dnf install -y mosquitto mosquitto-clients + +# 10. Install Java (for Kafka) +echo "==> Installing Java..." +dnf install -y java-17-openjdk-headless + +# 11. Enable cluster-detect service +echo "==> Enabling cluster services..." +systemctl enable cluster-detect.service +systemctl enable cluster-bootstrap.service + +# 12. Disable unwanted services +systemctl disable firewalld +systemctl mask firewalld + +# 13. Configure sysctl for Kubernetes +cat >> /etc/sysctl.d/99-kubernetes.conf <<'SYSCTL' +net.bridge.bridge-nf-call-iptables = 1 +net.bridge.bridge-nf-call-ip6tables = 1 +net.ipv4.ip_forward = 1 +SYSCTL + +# 14. Load required kernel modules +cat > /etc/modules-load.d/kubernetes.conf <<'MODULES' +overlay +br_netfilter +MODULES + +# 15. Install yq (YAML processor) +echo "==> Installing yq..." +YQ_VERSION="v4.40.5" +curl -L "https://github.com/mikefarah/yq/releases/download/${YQ_VERSION}/yq_linux_amd64" \ + -o /usr/local/bin/yq +chmod +x /usr/local/bin/yq + +# 16. Create version info +cat > /etc/cluster-version <<VERSION +CLUSTER_ISO_VERSION=0.1.0 +BUILD_DATE=$(date -Iseconds) +KUBERNETES_VERSION=1.29.1 +CEPH_VERSION=18.2.1 +KAFKA_VERSION=3.6.1 +VERSION + +echo "==> Post-installation complete!" +echo "==> System will reboot and detect node identity on first boot" + +%end +``` + +## Build Script + +**File**: `iso-build/build.sh` + +```bash +#!/bin/bash +# Main ISO build script + +set -euo pipefail + +BUILD_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +OUTPUT_DIR="$BUILD_DIR/output" +REPO_DIR="$BUILD_DIR/repo" +OVERLAY_DIR="$BUILD_DIR/overlay" +KICKSTART_FILE="$BUILD_DIR/kickstart/cluster-node.ks" + +ISO_NAME="cluster-node.iso" +ISO_LABEL="ClusterNode" +ISO_VERSION="0.1.0" + +echo "==> Cluster-from-SystemD ISO Builder v$ISO_VERSION" +echo "==> Build directory: $BUILD_DIR" + +# Check prerequisites +command -v lorax >/dev/null || { echo "ERROR: lorax not found. Install with: dnf install lorax"; exit 1; } +command -v createrepo_c >/dev/null || { echo "ERROR: createrepo_c not found"; exit 1; } + +# Create directories +mkdir -p "$OUTPUT_DIR" "$REPO_DIR"/{binaries,rpms} "$OVERLAY_DIR" + +# Step 1: Download binaries +echo "" +echo "==> Step 1: Downloading binaries..." +bash "$BUILD_DIR/../tools/download-kubernetes.sh" "$REPO_DIR/binaries" +bash "$BUILD_DIR/../tools/download-etcd.sh" "$REPO_DIR/binaries" +bash "$BUILD_DIR/../tools/download-cni-plugins.sh" "$REPO_DIR/binaries" +bash "$BUILD_DIR/../tools/download-coredns.sh" "$REPO_DIR/binaries" +bash "$BUILD_DIR/../tools/install-kafka.sh" "$REPO_DIR/binaries" + +# Step 2: Copy overlay files +echo "" +echo "==> Step 2: Copying overlay files..." + +# Copy configs +mkdir -p "$OVERLAY_DIR/etc/cluster-config" +cp -r "$BUILD_DIR/../configs"/* "$OVERLAY_DIR/etc/cluster-config/" + +# Copy systemd units +mkdir -p "$OVERLAY_DIR/etc/systemd/system" +cp "$BUILD_DIR/../systemd"/*.service "$OVERLAY_DIR/etc/systemd/system/" +cp "$BUILD_DIR/../systemd"/*.target "$OVERLAY_DIR/etc/systemd/system/" + +# Copy scripts +mkdir -p "$OVERLAY_DIR/usr/local/bin" +cp "$BUILD_DIR/../tools"/*.sh "$OVERLAY_DIR/usr/local/bin/" +cp "$BUILD_DIR/../tools"/*.py "$OVERLAY_DIR/usr/local/bin/" + +# Step 3: Generate certificates +echo "" +echo "==> Step 3: Generating certificates..." +bash "$BUILD_DIR/../tools/generate-build-certs.sh" "$OVERLAY_DIR/etc/kubernetes/pki" + +# Step 4: Create local repository +echo "" +echo "==> Step 4: Creating local repository..." +createrepo_c "$REPO_DIR/rpms" + +# Step 5: Build ISO using lorax +echo "" +echo "==> Step 5: Building ISO with lorax..." + +# Download Rocky Linux boot files +ROCKY_VERSION="9.3" +ROCKY_BOOT_ISO="Rocky-${ROCKY_VERSION}-x86_64-boot.iso" +ROCKY_URL="https://download.rockylinux.org/pub/rocky/${ROCKY_VERSION}/isos/x86_64/${ROCKY_BOOT_ISO}" + +if [[ ! -f "$REPO_DIR/$ROCKY_BOOT_ISO" ]]; then + echo "==> Downloading Rocky Linux boot ISO..." + curl -L "$ROCKY_URL" -o "$REPO_DIR/$ROCKY_BOOT_ISO" +fi + +# Mount boot ISO to extract vmlinuz and initrd +MOUNT_DIR="/tmp/rocky-mount-$$" +mkdir -p "$MOUNT_DIR" +mount -o loop "$REPO_DIR/$ROCKY_BOOT_ISO" "$MOUNT_DIR" + +# Create working directory for ISO build +ISO_WORK_DIR="/tmp/iso-build-$$" +mkdir -p "$ISO_WORK_DIR"/{isolinux,images,LiveOS} + +# Copy boot files +cp "$MOUNT_DIR/isolinux/vmlinuz" "$ISO_WORK_DIR/isolinux/" +cp "$MOUNT_DIR/isolinux/initrd.img" "$ISO_WORK_DIR/isolinux/" +cp "$MOUNT_DIR/isolinux/isolinux.bin" "$ISO_WORK_DIR/isolinux/" +cp "$MOUNT_DIR/isolinux/ldlinux.c32" "$ISO_WORK_DIR/isolinux/" +cp "$MOUNT_DIR/isolinux/libcom32.c32" "$ISO_WORK_DIR/isolinux/" +cp "$MOUNT_DIR/isolinux/libutil.c32" "$ISO_WORK_DIR/isolinux/" + +umount "$MOUNT_DIR" +rmdir "$MOUNT_DIR" + +# Create isolinux.cfg with kickstart +cat > "$ISO_WORK_DIR/isolinux/isolinux.cfg" <<'ISOLINUX' +default vesamenu.c32 +timeout 100 + +display boot.msg + +label install + menu label ^Install Cluster Node + kernel vmlinuz + append initrd=initrd.img inst.stage2=hd:LABEL=ClusterNode inst.ks=hd:LABEL=ClusterNode:/ks.cfg + +label rescue + menu label ^Rescue installed system + kernel vmlinuz + append initrd=initrd.img inst.stage2=hd:LABEL=ClusterNode rescue +ISOLINUX + +# Copy kickstart file to ISO root +cp "$KICKSTART_FILE" "$ISO_WORK_DIR/ks.cfg" + +# Copy overlay and repo to ISO +cp -r "$OVERLAY_DIR" "$ISO_WORK_DIR/overlay" +cp -r "$REPO_DIR" "$ISO_WORK_DIR/repo" + +# Create ISO +echo "==> Creating bootable ISO..." +genisoimage \ + -o "$OUTPUT_DIR/$ISO_NAME" \ + -b isolinux/isolinux.bin \ + -c isolinux/boot.cat \ + -no-emul-boot \ + -boot-load-size 4 \ + -boot-info-table \ + -J -R -v \ + -V "$ISO_LABEL" \ + "$ISO_WORK_DIR" + +# Make ISO bootable +isohybrid "$OUTPUT_DIR/$ISO_NAME" + +# Add MD5 checksum +implantisomd5 "$OUTPUT_DIR/$ISO_NAME" + +# Cleanup +rm -rf "$ISO_WORK_DIR" + +echo "" +echo "==> ISO build complete!" +echo "==> Output: $OUTPUT_DIR/$ISO_NAME" +echo "==> Size: $(du -h "$OUTPUT_DIR/$ISO_NAME" | cut -f1)" +echo "" +echo "Test with:" +echo " qemu-system-x86_64 -cdrom $OUTPUT_DIR/$ISO_NAME -m 4096 -smp 2" +``` + +## Simplified Build Script (Alternative: mkosi) + +For a more modern, declarative approach: + +**File**: `iso-build/mkosi.conf` + +```ini +[Output] +ImageId=cluster-node +ImageVersion=0.1.0 +Format=disk + +[Distribution] +Distribution=rocky +Release=9 + +[Content] +Packages= + systemd + kubernetes-kubeadm + etcd + vim + tmux + +ExtraTree=/path/to/overlay/ + +[Host] +QemuHeadless=yes +``` + +**Note**: mkosi is cleaner but less mature for RHEL-based distros. Recommend starting with lorax/kickstart. + +## Modified Download Scripts for Build + +Update download scripts to accept output directory: + +**Example**: `tools/download-kubernetes.sh` + +```bash +#!/bin/bash +# Modified to accept output directory for ISO build + +KUBE_VERSION="v1.29.1" +ARCH="amd64" +BASE_URL="https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/${ARCH}" +OUTPUT_DIR="${1:-/usr/local/bin}" + +mkdir -p "$OUTPUT_DIR" + +BINARIES=( + "kubeadm" + "kubectl" + "kubelet" + "kube-apiserver" + "kube-controller-manager" + "kube-scheduler" +) + +for binary in "${BINARIES[@]}"; do + echo "Downloading $binary to $OUTPUT_DIR..." + curl -L "${BASE_URL}/${binary}" -o "$OUTPUT_DIR/${binary}" + chmod +x "$OUTPUT_DIR/${binary}" +done + +echo "==> Kubernetes binaries downloaded to $OUTPUT_DIR" +``` + +## Testing the ISO + +### QEMU Testing + +```bash +#!/bin/bash +# tools/test-iso-qemu.sh + +ISO_PATH="iso-build/output/cluster-node.iso" +DISK_IMG="test-disk.qcow2" + +# Create virtual disk +qemu-img create -f qcow2 "$DISK_IMG" 40G + +# Boot from ISO +qemu-system-x86_64 \ + -cdrom "$ISO_PATH" \ + -hda "$DISK_IMG" \ + -m 4096 \ + -smp 2 \ + -boot d \ + -netdev user,id=net0 \ + -device virtio-net-pci,netdev=net0 \ + -serial stdio \ + -display none + +# After installation, boot from disk +# qemu-system-x86_64 -hda "$DISK_IMG" -m 4096 -smp 2 -boot c +``` + +### VirtualBox Testing + +```bash +# Create VM +VBoxManage createvm --name "cluster-test" --ostype RedHat_64 --register + +# Configure VM +VBoxManage modifyvm "cluster-test" \ + --memory 4096 \ + --cpus 2 \ + --nic1 bridged \ + --bridgeadapter1 eth0 + +# Create disk +VBoxManage createhd --filename cluster-test.vdi --size 40960 + +# Attach storage +VBoxManage storagectl "cluster-test" --name "SATA" --add sata --controller IntelAhci +VBoxManage storageattach "cluster-test" --storagectl "SATA" --port 0 --device 0 --type hdd --medium cluster-test.vdi +VBoxManage storageattach "cluster-test" --storagectl "SATA" --port 1 --device 0 --type dvddrive --medium cluster-node.iso + +# Start VM +VBoxManage startvm "cluster-test" +``` + +### Multi-Node Test + +```bash +#!/bin/bash +# tools/test-multi-node.sh +# Create 3 VMs for full cluster test + +for i in 1 2 3; do + DISK="test-node-${i}.qcow2" + qemu-img create -f qcow2 "$DISK" 40G + + qemu-system-x86_64 \ + -name "node-${i}" \ + -cdrom iso-build/output/cluster-node.iso \ + -hda "$DISK" \ + -m 4096 \ + -smp 2 \ + -boot d \ + -netdev tap,id=net0,ifname=tap${i},script=no \ + -device virtio-net-pci,netdev=net0,mac=52:54:00:12:34:1${i} \ + -daemonize \ + -vnc :${i} + + echo "Node $i started on VNC port 590${i}" +done +``` + +## Build Optimization + +### Caching Downloaded Files + +```bash +# Use a persistent cache directory +CACHE_DIR="$HOME/.cache/cluster-iso-build" +mkdir -p "$CACHE_DIR"/{binaries,rpms} + +# In build.sh, check cache before downloading +if [[ ! -f "$CACHE_DIR/binaries/kubelet" ]]; then + bash tools/download-kubernetes.sh "$CACHE_DIR/binaries" +fi + +cp -r "$CACHE_DIR/binaries"/* "$REPO_DIR/binaries/" +``` + +### Parallel Downloads + +```bash +# Download binaries in parallel +bash tools/download-kubernetes.sh "$REPO_DIR/binaries" & +bash tools/download-etcd.sh "$REPO_DIR/binaries" & +bash tools/download-coredns.sh "$REPO_DIR/binaries" & +wait + +echo "==> All downloads complete" +``` + +## Reproducible Builds + +### Version Lock File + +**File**: `iso-build/versions.lock` + +```yaml +kubernetes: + version: "1.29.1" + sha256: + kubelet: "abc123..." + kubectl: "def456..." + +etcd: + version: "3.5.11" + sha256: "789ghi..." + +ceph: + version: "18.2.1" + +kafka: + version: "3.6.1" + scala_version: "2.13" + +base_iso: + name: "Rocky-9.3-x86_64-boot.iso" + sha256: "..." +``` + +### Checksum Verification + +```bash +#!/bin/bash +# tools/verify-downloads.sh + +set -euo pipefail + +VERSIONS_FILE="iso-build/versions.lock" +BINARIES_DIR="$1" + +while read -r binary expected_sha; do + actual_sha=$(sha256sum "$BINARIES_DIR/$binary" | awk '{print $1}') + + if [[ "$actual_sha" != "$expected_sha" ]]; then + echo "ERROR: Checksum mismatch for $binary" + echo " Expected: $expected_sha" + echo " Actual: $actual_sha" + exit 1 + fi + + echo "✓ $binary: checksum OK" +done < <(yq eval '.kubernetes.sha256 | to_entries | .[] | .key + " " + .value' "$VERSIONS_FILE") +``` + +## CI/CD Integration + +### GitHub Actions Example + +```yaml +# .github/workflows/build-iso.yml +name: Build ISO + +on: + push: + tags: + - 'v*' + +jobs: + build: + runs-on: ubuntu-latest + container: + image: rockylinux:9 + + steps: + - name: Checkout code + uses: actions/checkout@v3 + + - name: Install build dependencies + run: | + dnf install -y lorax createrepo_c genisoimage isomd5sum + + - name: Run build + run: | + bash iso-build/build.sh + + - name: Upload ISO artifact + uses: actions/upload-artifact@v3 + with: + name: cluster-node-iso + path: iso-build/output/cluster-node.iso + + - name: Create release + uses: softprops/action-gh-release@v1 + with: + files: iso-build/output/cluster-node.iso +``` + +## Troubleshooting + +### Build Fails: "lorax command not found" + +```bash +# On Fedora +dnf install lorax anaconda-tui + +# On Rocky Linux +dnf install epel-release +dnf install lorax +``` + +### ISO Won't Boot: "Missing operating system" + +```bash +# Ensure isohybrid was run +isohybrid output/cluster-node.iso + +# Verify boot sector +file output/cluster-node.iso +# Should show: "DOS/MBR boot sector" +``` + +### Kickstart Fails During Post-Install + +```bash +# Boot ISO in rescue mode +# Check logs +cat /tmp/ks-post.log +cat /tmp/anaconda.log + +# Common issues: +# - Missing files in overlay +# - Incorrect paths in kickstart +# - Network not available during post-install +``` + +## Next Steps + +1. Create `iso-build/` directory structure +2. Implement build.sh script +3. Test ISO build on clean Fedora VM +4. Test installation in QEMU +5. Verify cluster-detect runs on first boot +6. Test multi-node cluster formation +7. Document customization procedures + +## Customization Guide + +### Adding Custom Packages + +Edit kickstart file: +```kickstart +%packages +@core +my-custom-package +%end +``` + +### Changing Root Password + +In kickstart: +```kickstart +rootpw --iscrypted $6$rounds=4096$salt$hash +``` + +Generate hash: +```bash +python3 -c 'import crypt; print(crypt.crypt("your-password", crypt.mkmeth(crypt.METHOD_SHA512)))' +``` + +### Embedding Custom Files + +Add to overlay directory: +```bash +mkdir -p overlay/etc/my-app/ +cp my-config.yaml overlay/etc/my-app/ +``` + +## Maintenance + +### Updating Base OS + +```bash +# Update ROCKY_VERSION in build.sh +ROCKY_VERSION="9.4" # When Rocky 9.4 is released + +# Re-download boot ISO +rm repo/Rocky-*.iso +bash iso-build/build.sh +``` + +### Updating Kubernetes Version + +```bash +# Edit versions.lock +kubernetes: + version: "1.30.0" + +# Re-run build +bash iso-build/build.sh +``` + +--- + +**P.S.** This is the glue that makes everything real—from pile of configs to bootable artifact. Kickstart is ugly but battle-tested; lorax is RHEL-native. The build script is bash-heavy but transparent. You can see exactly what's happening, which beats "magic container build tool" when debugging at 2 AM. Test in QEMU first, VirtualBox second, bare metal third. |
