After an unscheduled reboot of the VMs that host my K8s cluster, I was struggling to work out why the kubelet wasn't starting properly.
I ran systemctl start kubelet.service to start it and then checked the status with systemctl status kubelet.service which showed: -
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Sun 2021-06-06 00:35:01 PDT; 3s ago
Docs: https://kubernetes.io/docs/home/
Main PID: 82478 (kubelet)
Tasks: 7 (limit: 2279)
Memory: 14.6M
CGroup: /system.slice/kubelet.service
└─82478 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/conf>
Jun 06 00:35:01 garble1.domain.com systemd[1]: Started kubelet: The Kubernetes Node Agent.
Jun 06 00:35:01 garble1.domain.com kubelet[82478]: I0606 00:35:01.836881 82478 server.go:197] "Warning: For remote container runtime, --pod-infra-container-image is i>
Jun 06 00:35:01 garble1.domain.com kubelet[82478]: I0606 00:35:01.866762 82478 server.go:440] "Kubelet version" kubeletVersion="v1.21.0"
Jun 06 00:35:01 garble1.domain.com kubelet[82478]: I0606 00:35:01.867455 82478 server.go:851] "Client rotation is on, will bootstrap in background"
Jun 06 00:35:01 garble1.domain.com kubelet[82478]: I0606 00:35:01.870367 82478 certificate_store.go:130] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-clie>
Jun 06 00:35:01 garble1.domain.com kubelet[82478]: I0606 00:35:01.873004 82478 dynamic_cafile_content.go:167] Starting client-ca-bundle::/etc/kubernetes/pki/ca.crt
which looked OK.
I checked again: -
systemctl status kubelet.service
and saw: -
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since Sun 2021-06-06 00:35:22 PDT; 8s ago
Docs: https://kubernetes.io/docs/home/
Process: 82505 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/FAILURE)
Main PID: 82505 (code=exited, status=1/FAILURE)
Jun 06 00:35:22 garble1.domain.com systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Jun 06 00:35:22 garble1.domain.com systemd[1]: kubelet.service: Failed with result 'exit-code'.
which did not look so good.
I then checked the syslog with: -
tail -f /var/log/syslog
and saw, amongst many other things, this: -
Jun 6 00:40:27 garble1 kubelet[83211]: E0606 00:40:27.104582 83211 server.go:292] "Failed to run kubelet" err="failed to run Kubelet: running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false. /proc/swaps contained: [Filename\t\t\t\tType\t\tSize\tUsed\tPriority /swap.img file\t\t4194300\t0\t-2]"
Of course, the VMs were rebooted ... and turning swap off with swapoff only lasts until the next boot, so swap was back on ....
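For next time, here is a quick way to confirm that swap really is the culprit before reaching for swapoff. This is only a sketch: swap_is_on is a made-up helper name, and it assumes the usual /proc/swaps layout of one header line followed by one line per active swap area, which is the same content quoted in the kubelet error above.

```shell
# swap_is_on: hypothetical helper, not part of any Kubernetes tooling.
# Exits 0 if the given /proc/swaps-style file lists at least one swap area.
swap_is_on() {
    # Line 1 is the column header; any non-blank line after it is an active swap area.
    awk 'NR > 1 && NF { found = 1 } END { exit !found }' "${1:-/proc/swaps}"
}

if swap_is_on; then
    echo "swap is on"
else
    echo "swap is off"
fi
```

With no argument the helper reads /proc/swaps; passing a file path instead makes it easy to try against a saved copy.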
A quick trip to swapoff with: -
swapoff -a
and we're back in the game.
kubectl get nodes
NAME                 STATUS   ROLES                  AGE     VERSION
garble1.domain.com   Ready    control-plane,master   3d13h   v1.21.0
garble2.domain.com   Ready    <none>                 3d13h   v1.21.0
crictl pods
POD ID          CREATED          STATE   NAME                                         NAMESPACE     ATTEMPT
c3969548182d6   17 seconds ago   Ready   calico-node-nl2g2                            kube-system   0
bd06ccb126620   18 seconds ago   Ready   kube-proxy-ht4mq                             kube-system   0
5a31b04c1d01a   18 seconds ago   Ready   kube-scheduler-garble1.domain.com            kube-system   0
ac6e59ccb87f1   25 seconds ago   Ready   kube-controller-manager-garble1.domain.com   kube-system   0
d2ece5d26441e   35 seconds ago   Ready   kube-apiserver-garble1.domain.com            kube-system   0
10019ac4de96d   45 seconds ago   Ready   etcd-garble1.domain.com                      kube-system   0
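One caveat: swapoff -a will not survive the next reboot either. One way to make it stick is to comment out the swap entry in /etc/fstab. This assumes swap is declared there, which I have not checked here, and comment_out_swap is just an illustrative name. A minimal sketch, which keeps a .bak copy of the file:

```shell
# comment_out_swap: hypothetical helper that comments out active swap entries
# in an fstab-style file. The original is preserved as <file>.bak.
comment_out_swap() {
    # Match lines that are not already commented and that contain "swap"
    # as a whitespace-delimited field, then prefix them with "#".
    sed -i.bak -E '/^[^#].*[[:space:]]swap[[:space:]]/s/^/#/' "$1"
}

# Usage (as root, after reviewing the file first):
#   comment_out_swap /etc/fstab
```

It is worth trying this against a copy of the file and comparing the result with diff before touching the real thing.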