Kubernetes Cluster
One Ubuntu host (k8s-worker-3090 at 192.168.1.253), bootstrapped with
kubeadm, doubling as control plane and the only worker. Pod network is
kube-flannel (default kubeadm-friendly CNI, no extras).
Why single-node
The cluster trades the ergonomics of HA for the cost and complexity floor of a single box. Most failure modes that a multi-node cluster catches through redundancy, this one catches through Velero backups plus a tested restore drill (see Backup & DR).
Single-node has two practical consequences worth knowing:
- Some chart-default alerts are wrong by construction. etcdMembersDown and etcdInsufficientMembers from kube-prometheus-stack assume HA quorum and fire after every reboot. Both are disabled via defaultRules.disabled in infrastructure/monitoring/kube-prometheus-stack/values.yaml.
- An etcd restart blips the apiserver for ~5–10 seconds. There is no second control-plane node to serve the API while etcd is starting. Plan maintenance windows around it.
Control-plane metric ports
By default, kubeadm binds the four metrics endpoints to localhost only:
| Component | Port | Where the flag lives |
|---|---|---|
| kube-controller-manager | 10257 | /etc/kubernetes/manifests/kube-controller-manager.yaml |
| kube-scheduler | 10259 | /etc/kubernetes/manifests/kube-scheduler.yaml |
| etcd | 2381 | /etc/kubernetes/manifests/etcd.yaml |
| kube-proxy | 10249 | kube-proxy ConfigMap in kube-system |
Until each is rebound to 0.0.0.0, Prometheus shows them as perpetual
TargetDown. Static-pod manifests pick up changes automatically (kubelet
watches /etc/kubernetes/manifests/); kube-proxy needs a DaemonSet
restart after the ConfigMap edit.
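The controller-manager, scheduler, and kube-proxy steps might be sketched as two small helpers. This is a sketch under assumptions, not the exact commands used on this host: it assumes both static-pod manifests carry kubeadm's usual --bind-address=127.0.0.1 flag, and the helper names rebind_all_interfaces and restart_kube_proxy are made up for illustration.

```shell
# Sketch only: helper names are hypothetical, not part of kubeadm.

# Rewrite a loopback-only --bind-address flag so the component listens on
# all interfaces. Assumes the manifest contains --bind-address=127.0.0.1
# (the usual kubeadm value for kube-controller-manager and kube-scheduler).
# Needs root; kubelet restarts the static pod once the file changes.
rebind_all_interfaces() {
  local manifest="$1"
  sed -i 's|--bind-address=127\.0\.0\.1|--bind-address=0.0.0.0|' "$manifest"
}

# Restart kube-proxy after its ConfigMap has been edited by hand, e.g. with
#   kubectl -n kube-system edit configmap kube-proxy
# then wait for the DaemonSet to settle.
restart_kube_proxy() {
  kubectl -n kube-system rollout restart daemonset kube-proxy &&
    kubectl -n kube-system rollout status daemonset kube-proxy
}
```

From a root shell this would be called as rebind_all_interfaces /etc/kubernetes/manifests/kube-scheduler.yaml, ideally after taking the same kind of dotfile backup used for etcd.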
# Backups (dotfiles so kubelet ignores them)
sudo cp -a /etc/kubernetes/manifests/etcd.yaml \
/etc/kubernetes/manifests/.bak-etcd-$(date +%F).yaml
sudo sed -i 's|--listen-metrics-urls=http://127.0.0.1:2381|--listen-metrics-urls=http://0.0.0.0:2381|' \
  /etc/kubernetes/manifests/etcd.yaml
The kube-proxy inotify gotcha
A workstation-grade host (Docker Desktop + dev containers + VS Code) burns
through inotify instances fast. kube-proxy exits on bounce with
"command failed" err="failed complete: too many open files" if
fs.inotify.max_user_instances is still at the Ubuntu default of 128.
sudo tee /etc/sysctl.d/99-kubernetes-inotify.conf <<'EOF'
fs.inotify.max_user_instances = 8192
fs.inotify.max_user_watches = 524288
EOF
sudo sysctl --system
After bumping, delete the failing kube-proxy pod so a fresh one starts under the new limits.
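The check-then-bounce step could look roughly like this. It is a minimal sketch: the helper names are hypothetical, and it assumes the kube-proxy pods carry the k8s-app=kube-proxy label that kubeadm normally applies.

```shell
# Sketch only: helper names are hypothetical.

# Succeed if the inotify instance limit is at least the wanted value.
# The optional second argument points the check at a different file.
inotify_limit_ok() {
  local want="$1"
  local file="${2:-/proc/sys/fs/inotify/max_user_instances}"
  local have
  have=$(cat "$file") || return 2
  [ "$have" -ge "$want" ]
}

# Delete the kube-proxy pod(s); the DaemonSet recreates them.
# Assumes the k8s-app=kube-proxy label (an assumption about this cluster).
bounce_kube_proxy() {
  kubectl -n kube-system delete pod -l k8s-app=kube-proxy
}

# Only bounce once the new limit is actually in effect:
#   inotify_limit_ok 8192 && bounce_kube_proxy
```

Gating the delete on the limit check avoids bouncing the pod straight back into the same "too many open files" failure if the sysctl file was written but not yet loaded.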
Certificate renewal
kubeadm issues client certs with a 1-year lifetime. The admin.conf
cert is the one you’ll feel first when it expires (kubectl starts
returning “credentials” errors).
sudo kubeadm certs check-expiration | grep admin.conf
sudo kubeadm certs renew admin.conf
sudo install -o $(id -un) -g $(id -gn) -m 600 /etc/kubernetes/admin.conf ~/.kube/config
A kubeadm-cert-renew.timer systemd unit runs weekly and auto-renews
anything within 30 days of expiry; it also pages via ntfy at 60 days.
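The two thresholds can be illustrated with a small decision helper. This is not the script behind kubeadm-cert-renew.timer, which is not shown here; the function names, the renew/page/ok labels, and the inclusive boundaries are all assumptions made for the sketch. It also assumes a date command that accepts -d.

```shell
# Sketch only: names and labels are hypothetical, not taken from the
# real kubeadm-cert-renew unit.

# Whole days from now until an expiry timestamp (any string that date -d
# accepts, e.g. the notAfter value printed by openssl x509 -enddate).
days_until() {
  local expiry_epoch now_epoch
  expiry_epoch=$(date -u -d "$1" +%s) || return 2
  now_epoch=$(date -u +%s)
  echo $(( (expiry_epoch - now_epoch) / 86400 ))
}

# Map remaining days onto the two thresholds: renew at 30 days or less,
# page at 60 days or less, otherwise do nothing.
cert_action() {
  local days="$1"
  if [ "$days" -le 30 ]; then
    echo renew
  elif [ "$days" -le 60 ]; then
    echo page
  else
    echo ok
  fi
}
```

A weekly job could feed each certificate's expiry string through days_until and act on the result of cert_action, calling kubeadm certs renew for "renew" and sending the ntfy notification for "page".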