Adding and Setting Up a New Node¶
All cluster nodes are k3s server nodes (etcd members). There are no agent-only nodes currently.
Prerequisites¶
- Node is on the same L2 network as existing nodes
- SSH access as root
- Node IP is known and reachable from the existing nodes
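The prerequisites can be sanity-checked before joining. A minimal sketch (all IPs are placeholders; `nc` is assumed to be available):

```shell
# From an existing node: confirm the new node is reachable and accepts SSH.
ping -c 3 <new-node-ip>
nc -zv <new-node-ip> 22

# From the new node: confirm the API port on the first node is reachable,
# since the join command below connects to it.
nc -zv <first-node-ip> 6443
```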
Step 1 — Join the Cluster¶
Run the following on the new node, substituting the first node's IP and the cluster token (found at /var/lib/rancher/k3s/server/token on any existing server node):
curl -sfL https://get.k3s.io | sh -s - server \
--server https://<first-node-ip>:6443 \
--token "<token>" \
--flannel-backend=none \
--disable-network-policy \
--disable=servicelb \
--disable-kube-proxy \
--node-ip=<this-node-ip>
From any existing node, verify the new node appears:
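For example:

```shell
# Run on any existing server node; the new node should be listed.
kubectl get nodes -o wide
```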
The new node will show NotReady for a short time while Calico rolls out to it.
Step 2 — Verify Calico and WireGuard¶
Wait for the Calico node pod to be Running:
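One way to watch for it (the `calico-system` namespace matches the `kubectl exec` command in Step 4; the `k8s-app=calico-node` label and the timeout value are assumptions, adjust to your install):

```shell
# List the calico-node pods; the one on the new node should reach Running.
kubectl get pods -n calico-system -l k8s-app=calico-node -o wide

# Or block until all calico-node pods report Ready.
kubectl wait -n calico-system --for=condition=Ready pod \
  -l k8s-app=calico-node --timeout=120s
```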
Then confirm the node has received a WireGuard key:
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.projectcalico\.org/WireguardPublicKey}{"\n"}{end}'
Every node in the list must show a key. A blank entry means WireGuard has not activated on that node yet — wait and recheck.
Step 3 — Add TLS SAN (optional but recommended)¶
The API server certificate is generated at bootstrap with a --tls-san entry for each node IP. Adding the new node's IP avoids TLS verification errors if the API is ever contacted via that node's address directly.
On every existing server node, add the new IP to /etc/rancher/k3s/config.yaml:
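A sketch of the relevant config.yaml fragment (the existing entries and surrounding keys are assumptions; keep whatever the file already contains and append the new IP):

```yaml
# /etc/rancher/k3s/config.yaml
tls-san:
  - <existing-node-ip-1>
  - <existing-node-ip-2>
  - <new-node-ip>   # append the new node's IP
```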
Then restart k3s on each node one at a time (allow it to become Ready before moving to the next):
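For example, on each existing server node in turn (assuming the script-installed k3s systemd unit name):

```shell
# Restart the k3s server service on this node.
systemctl restart k3s

# Confirm this node is Ready again before restarting the next one.
kubectl get nodes
```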
Step 4 — Verify Node is Healthy¶
# Node is Ready
kubectl get node <new-node-name>
# No stuck pods scheduled to this node
kubectl get pods -A --field-selector spec.nodeName=<new-node-name> | grep -Ev 'Running|Completed'
# Calico BGP peer established
kubectl exec -n calico-system ds/calico-node -- calico-node -show-status
Maintenance: Draining and Uncordoning¶
Before rebooting or performing maintenance on a node:
# Evict all pods, mark unschedulable
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
# After maintenance is done, re-enable scheduling
kubectl uncordon <node-name>
Warning
DaemonSet pods (Calico, Traefik) are not evicted by drain — they will restart on their own after the node comes back. Pass --ignore-daemonsets to avoid the drain failing.
Removing a Node¶
Drain the node, then delete its node object with kubectl; k3s removes the corresponding etcd member automatically when the node object is deleted. With 4 server nodes, etcd quorum is 3: the cluster tolerates losing 1 node and remains available, while losing 2 drops below quorum and leaves the cluster read-only at best.
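A sketch of the removal sequence (the uninstall script path is the default for script-based k3s installs; verify it on your machines):

```shell
# From any other server node: evict workloads, then remove the node object.
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
kubectl delete node <node-name>
# k3s removes the matching etcd member once the node object is deleted.

# On the removed machine itself: wipe the k3s install.
/usr/local/bin/k3s-uninstall.sh
```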