In this article, we will see how to take a Kubernetes cluster node out for maintenance. Node maintenance is a standard task that needs to be done periodically. Taking a node out of a cluster and joining it back can sometimes be tricky and cumbersome. Fortunately, it is easy to do in a Kubernetes cluster as long as you follow the process. I will walk through the entire process using an example node so that it is easy to understand and repeat in a production-like environment. Before that, let's quickly go through our lab setup.
We have one worker node called "node-1" and one master node called "master" in our Kubernetes cluster. Three test pods are currently running on node-1 and none on master. Our task is to take "node-1" out of the cluster for maintenance and then join it back once the maintenance is complete. It is important that when we take the node out, all of its workload is transferred to master without any hiccups.
How to take a Kubernetes Cluster node out for Maintenance
Step 1: Check all the Pods
First, verify that all the pods are running fine using the kubectl get pods -o wide command. It shows detailed output that includes each pod's status, the node it is running on, its number of restarts, age, IP address and so on.
root@master:~# kubectl get pods -o wide
NAME                    READY   STATUS    RESTARTS   AGE   IP           NODE     NOMINATED NODE   READINESS GATES
test-746c87566d-6fzjx   1/1     Running   0          11m   10.253.1.2   node-1   <none>           <none>
test-746c87566d-j49rx   1/1     Running   0          11m   10.253.1.4   node-1   <none>           <none>
test-746c87566d-xbnj5   1/1     Running   0          11m   10.253.1.3   node-1   <none>           <none>
Step 2: Verify all the Nodes
The next step is to verify the health of the nodes using kubectl get nodes. Every node should show its status as Ready. If any node says otherwise, fix that node first and then proceed with the next step.
root@master:~# kubectl get nodes
NAME     STATUS   ROLES    AGE   VERSION
master   Ready    master   20m   v1.20.0
node-1   Ready    <none>   19m   v1.20.0
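The health check above can also be scripted. The following is a minimal sketch, not part of the original walkthrough: not_ready_nodes is a hypothetical helper name, and it assumes the column layout of kubectl get nodes --no-headers matches the output shown in this step (node name first, status second).

```shell
# List every node whose STATUS column is not exactly "Ready".
# Reads the output of `kubectl get nodes --no-headers` on stdin.
# not_ready_nodes is a hypothetical helper name, not a kubectl feature.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 " (" $2 ")" }'
}

# Intended use against a live cluster (left commented out so the sketch has no side effects):
# kubectl get nodes --no-headers | not_ready_nodes
```

If the function prints nothing, every node reported Ready and it should be safe to continue with the drain.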
Step 3: Drain out the Node
Now you can move ahead with the maintenance and drain node-1 using the kubectl drain node-1 command as shown below. The idea of draining is to make sure all the workloads on node-1 get transferred to the other available nodes in the cluster. In our case, since we have only two nodes, i.e. master and node-1, a successful drain should move all the workloads of node-1 to master. But hey, what is that in the output below? There is an error and kubectl is unable to drain node "node-1". If you face this kind of error too, you need to add one more option, i.e. --ignore-daemonsets, to get through it.
If there are DaemonSet-managed pods on the node, drain will not proceed without --ignore-daemonsets. Even with the option, drain does not delete DaemonSet-managed pods, because they would be immediately replaced by the DaemonSet controller, which ignores unschedulable markings. That is why we need this option. You can read more in the official Kubernetes documentation.
root@master:~# kubectl drain node-1
node/node-1 cordoned
error: unable to drain node "node-1", aborting command...

There are pending nodes to be drained:
 node-1
cannot delete Pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet (use --force to override): default/simple-webapp-1
cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): kube-system/kube-flannel-ds-xngfr, kube-system/kube-proxy-hsq9m
So now you have to run the kubectl drain node-1 --ignore-daemonsets command as shown below. It marks the node unschedulable to prevent new pods from arriving and then evicts the existing pods. Wait until all the pods are evicted.
root@master:~# kubectl drain node-1 --ignore-daemonsets
node/node-1 cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/kube-flannel-ds-g2v5g, kube-system/kube-proxy-fsfjt
evicting pod default/test-746c87566d-xbnj5
evicting pod default/test-746c87566d-6fzjx
evicting pod default/test-746c87566d-j49rx
pod/test-746c87566d-j49rx evicted
pod/test-746c87566d-6fzjx evicted
pod/test-746c87566d-xbnj5 evicted
node/node-1 evicted
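If you drain nodes regularly, the command from this step can be wrapped in a small function. This is only a sketch under stated assumptions: drain_for_maintenance and the KUBECTL override are hypothetical names introduced here, and the 120s default timeout is an arbitrary value rather than a recommendation. The --timeout flag belongs to kubectl drain and makes it give up instead of waiting forever.

```shell
# Hypothetical wrapper around the drain command used in this step.
# KUBECTL can be overridden (for example with `echo`) to preview the command line.
KUBECTL="${KUBECTL:-kubectl}"

drain_for_maintenance() {
  node="$1"
  timeout="${2:-120s}"   # assumed default; pick a value that suits your workloads
  # --ignore-daemonsets: skip DaemonSet-managed pods instead of aborting
  # --timeout: stop waiting if a pod cannot be evicted in time
  "$KUBECTL" drain "$node" --ignore-daemonsets --timeout="$timeout"
}

# Example (commented out so the sketch does not touch a cluster):
# drain_for_maintenance node-1
```

Setting KUBECTL=echo before calling the function prints the command that would be run, which is a cheap way to check the arguments before touching a real node.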
Step 4: Verify all the apps
Now you can verify that all the pods were moved to master using the kubectl get pods -o wide command. You can see that the replacement pods are indeed running fine on master. This confirms that node-1 is no longer running any apps, and you can proceed with the maintenance of the node.
root@master:~# kubectl get pods -o wide
NAME                    READY   STATUS    RESTARTS   AGE   IP           NODE     NOMINATED NODE   READINESS GATES
test-746c87566d-8h8gz   1/1     Running   0          80s   10.253.0.5   master   <none>           <none>
test-746c87566d-k5mxz   1/1     Running   0          80s   10.253.0.6   master   <none>           <none>
test-746c87566d-zslkf   1/1     Running   0          80s   10.253.0.4   master   <none>           <none>
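With more than a handful of pods, reading the NODE column by eye gets tedious. Here is a minimal sketch that counts pods per node. pods_per_node is a hypothetical helper name, and it assumes NODE is the seventh column, as in the kubectl v1.20 output shown in this article; other kubectl versions may format the RESTARTS column differently and shift the columns.

```shell
# Count pods per node.
# Reads the output of `kubectl get pods -o wide --no-headers` on stdin.
# Assumes NODE is the 7th whitespace-separated column, as in the output above.
pods_per_node() {
  awk '{ count[$7]++ } END { for (n in count) print n, count[n] }' | sort
}

# Intended use against a live cluster (commented out here):
# kubectl get pods -o wide --no-headers | pods_per_node
```

After a successful drain, node-1 should not appear in the result at all for the default namespace.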
Step 5: Schedule the Node
After the maintenance is complete, we need to make the node schedulable again using the kubectl uncordon node-1 command as shown below.
root@master:~# kubectl uncordon node-1
node/node-1 uncordoned
Step 6: Verify the Cluster
To verify that the node is back in the cluster, run the kubectl get nodes command. If the status shows plain Ready, without the SchedulingDisabled marker that a cordoned node carries, node-1 is again available for scheduling new pods.
root@master:~# kubectl get nodes
NAME     STATUS   ROLES    AGE   VERSION
master   Ready    master   27m   v1.20.0
node-1   Ready    <none>   27m   v1.20.0
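The same check can be done in a script. A cordoned node reports its status as Ready,SchedulingDisabled, so comparing the STATUS column against the exact string Ready tells the two states apart. The helper name node_is_schedulable below is hypothetical, and the sketch assumes the column layout shown above.

```shell
# Succeeds (exit status 0) only if the named node is listed as exactly "Ready".
# A cordoned node shows "Ready,SchedulingDisabled" and therefore fails the check.
# Reads the output of `kubectl get nodes --no-headers` on stdin.
node_is_schedulable() {
  awk -v node="$1" '
    $1 == node { found = 1; ok = ($2 == "Ready") }
    END { exit !(found && ok) }
  '
}

# Intended use against a live cluster (commented out here):
# kubectl get nodes --no-headers | node_is_schedulable node-1 && echo "node-1 accepts new pods"
```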
Step 7: Verify Pods Status
There is another interesting thing you can notice when you run the kubectl get pods -o wide command again: there are still no pods running on node-1, even after it has rejoined the cluster. That is because Kubernetes does not move running pods back automatically. node-1 will receive pods only when new ones are scheduled.
root@master:~# kubectl get pods -o wide
NAME                    READY   STATUS    RESTARTS   AGE     IP           NODE     NOMINATED NODE   READINESS GATES
test-746c87566d-8h8gz   1/1     Running   0          9m30s   10.253.0.5   master   <none>           <none>
test-746c87566d-k5mxz   1/1     Running   0          9m30s   10.253.0.6   master   <none>           <none>
test-746c87566d-zslkf   1/1     Running   0          9m30s   10.253.0.4   master   <none>           <none>
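If you want pods back on node-1 without waiting for new workloads, one option is a rolling restart, which creates fresh pods that the scheduler places again. This is a sketch under assumptions, not a step from the original walkthrough: it assumes the test pods belong to a Deployment named test (inferred from the pod name prefix, not confirmed above), and rebalance_deployment and KUBECTL are hypothetical names. The scheduler decides placement, so landing on node-1 is likely but not guaranteed.

```shell
# Hypothetical helper: trigger a rolling restart so replacement pods are scheduled afresh.
# KUBECTL can be overridden (for example with `echo`) to preview the command line.
KUBECTL="${KUBECTL:-kubectl}"

rebalance_deployment() {
  deployment="$1"
  "$KUBECTL" rollout restart "deployment/$deployment"
}

# Example, assuming a Deployment called "test" (commented out here):
# rebalance_deployment test
```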
You might also wonder why all the pods were placed on master when node-1 was taken out for maintenance. It is simply because the master node did not have any taint. You can verify this by grepping for the taint keyword in the output of the kubectl describe nodes master command as shown below.
root@master:~# kubectl describe nodes master | grep -i taint
Taints:             <none>
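For scripting, the taint check can be reduced to a single value. The sketch below is an illustration only: first_taint is a hypothetical helper name, and it prints only the first entry on the Taints: line, so a node with several taints would need a fuller parser. On many kubeadm clusters of this version the master carries a node-role.kubernetes.io/master:NoSchedule taint, which is why regular pods normally stay off it. In this lab that taint is absent.

```shell
# Print the first taint listed in `kubectl describe node <name>` output read from stdin.
# Prints "<none>" when the node has no taints.
first_taint() {
  awk '/^Taints:/ { print $2 }'
}

# Intended use against a live cluster (commented out here):
# kubectl describe nodes master | first_taint
```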