Cyberithub

Kubernetes pod stuck in Terminating state for a long time


In this article, we will see how to delete a Kubernetes pod that has been stuck in the Terminating state for a long time. A common case is a new release: the deployment creates a new pod from the latest image, but the older pod never finishes terminating. Pods can also get stuck in the Terminating state in many other scenarios.

A pod that sits in the Terminating state is frustrating to see, but it is not difficult to deal with. Here I am going to explain why a pod ends up in this state and the possible solutions you can apply, depending on your use case.

 


Recently, I deployed the latest release of my FSCM application to a Kubernetes cluster. After the deployment succeeded, I noticed that the new pod had been created but the older one had not terminated and was still showing as Terminating, as you can see below. This problem can happen for many reasons, so here we will go through the possible solutions one by one.

cyberithub@node1 % kubectl get pods -n kronos
NAME                             READY  STATUS         RESTARTS  AGE
hcm-app-768stccb98-kjgi6         1/1    Running        0         53m
portal-app-8hkloaau67-lhjk6      1/1    Running        0         7d1h
fscm-app-74oip845kj-gbbco        1/1    Running        0         2h
fscm-app-76u878jjg9-kjigv0       1/1    Terminating    0         5d8h
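
In a busy namespace it can help to filter the listing down to the stuck pods only. The following is a minimal sketch and not part of my original troubleshooting: terminating_pods is a hypothetical helper name, and it assumes the default kubectl get pods column layout, where STATUS is the third column.

```shell
# Hypothetical helper: print the names of pods whose STATUS column is Terminating.
# Assumes the default `kubectl get pods` columns (NAME READY STATUS RESTARTS AGE).
terminating_pods() {
  kubectl get pods -n "$1" --no-headers | awk '$3 == "Terminating" { print $1 }'
}

# Usage: terminating_pods kronos
```

Because it parses the human-readable output, treat it as a convenience for the terminal rather than something to build automation on.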

 

Solution 1: Check Node

If the node the pod is running on is down or unreachable, the pod can stay in the Terminating state indefinitely, because the kubelet on that node cannot confirm that the containers have stopped. You will have to bring the node back into the cluster for the pod deletion to complete. To check the node status, run the kubectl get nodes command.

cyberithub@node1 % kubectl get nodes
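
If you are not sure which node the stuck pod belongs to, you can read it from the pod itself and then look at that node's Ready condition. This is only a sketch: pod_node and node_ready are hypothetical helper names I am introducing for illustration, while spec.nodeName and the Ready condition are standard fields.

```shell
# Hypothetical helper: print the node a pod is scheduled on (spec.nodeName).
pod_node() {
  kubectl get pod "$1" -n "$2" -o jsonpath='{.spec.nodeName}'
}

# Hypothetical helper: print the status of the node's Ready condition
# (True, False or Unknown).
node_ready() {
  kubectl get node "$1" -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}

# Usage with the example pod from this article:
#   node=$(pod_node fscm-app-76u878jjg9-kjigv0 kronos)
#   node_ready "$node"
```

Anything other than True for the Ready condition suggests that the node, not the pod, is what needs attention first.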

 

Solution 2: Restart Kubelet service

In a few cases, the kubelet service on the affected node runs into a problem or becomes unresponsive, which stalls pod operations on that node. To fix this, restart the service on that node with the sudo systemctl restart kubelet command. Be careful before running it, as it might affect the other pods running on that node.

NOTE:

Please note that you will require sudo or root access to restart the kubelet service.
cyberithub@node1 % sudo systemctl restart kubelet
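
Before restarting, it may be worth confirming that the kubelet really is the problem. The sketch below assumes the kubelet runs as a systemd unit named kubelet, which the restart command above already implies. kubelet_health is a hypothetical helper name, and reading the journal may need sudo on your system.

```shell
# Hypothetical helper: report whether the kubelet unit is active and
# print its most recent log lines (50 by default).
kubelet_health() {
  systemctl is-active kubelet
  journalctl -u kubelet --no-pager -n "${1:-50}"
}

# Usage on the affected node: kubelet_health 100
```

Repeated errors that mention the stuck pod in these logs are a reasonable hint that a restart is justified.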

 

Solution 3: Delete the pod

If the node and the kubelet are healthy, try removing the stuck pod by deleting it with the kubectl delete pod <pod_name> -n <namespace> command. If the pod is deleted, you are done. Otherwise the command prints the deleted message but never returns, as you can see below, and you have to press Ctrl+C to exit. In that case, try the next solution.

cyberithub@node1 % kubectl delete pod fscm-app-76u878jjg9-kjigv0 -n kronos
pod "fscm-app-76u878jjg9-kjigv0" deleted
^C
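
If you would rather not sit in front of a hanging command, one option is to request the deletion without blocking and then wait for a bounded time. This is a sketch under assumptions: delete_with_timeout is a hypothetical helper name and 60s is an arbitrary default. The --wait=false flag of kubectl delete and the kubectl wait --for=delete command are standard, though how kubectl wait behaves when the pod is already gone can differ between kubectl versions.

```shell
# Hypothetical helper: request deletion without blocking, then wait up to
# a timeout (default 60s) for the pod to disappear.
# Returns non-zero if the pod still exists when the timeout expires.
delete_with_timeout() {
  kubectl delete pod "$1" -n "$2" --wait=false
  kubectl wait --for=delete "pod/$1" -n "$2" --timeout="${3:-60s}"
}

# Usage: delete_with_timeout fscm-app-76u878jjg9-kjigv0 kronos 30s
```

A non-zero exit status here tells you the pod is still stuck without you having to press Ctrl+C.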

 

Solution 4: Patch the pod

If the pod's metadata lists any finalizers, you can patch the pod and set the field to null so that Kubernetes stops waiting for them and can delete the pod. Keep in mind that setting finalizers to null does not complete the tasks the finalizers were guarding. It only tells Kubernetes to stop waiting, so any cleanup they were meant to guarantee is skipped.

NOTE:

Finalizers are namespaced keys in an object's metadata that tell Kubernetes to wait until a certain task is completed before the object can be deleted. Until the finalizers are removed, the pod will remain in the Terminating state.
cyberithub@node1 % kubectl patch pod fscm-app-76u878jjg9-kjigv0 --patch '{"metadata":{"finalizers":null}}' -n kronos
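
Before removing anything, it is worth checking whether the pod has finalizers at all and which ones they are. A minimal sketch follows: pod_finalizers is a hypothetical helper name, and the jsonpath expression prints one finalizer per line, or nothing if the field is empty.

```shell
# Hypothetical helper: print each entry of metadata.finalizers on its own line.
# Prints nothing when the pod has no finalizers.
pod_finalizers() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{range .metadata.finalizers[*]}{@}{"\n"}{end}'
}

# Usage: pod_finalizers fscm-app-76u878jjg9-kjigv0 kronos
```

If this prints nothing, finalizers are not what is holding the pod and patching them will not change anything.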

 

Solution 5: Delete pod by setting --grace-period=0

If the previous solution does not work, you can try setting --grace-period=0 during pod deletion. The command is kubectl delete pod <pod_name> --grace-period=0 -n <namespace>. If you check the description of your pod with the kubectl describe pod <pod_name> -n <namespace> command, you will notice that the Termination Grace Period is set to some number of seconds (30 by default). Normally the containers are killed once that period expires. Since the pod is still stuck well past it, shortening the period asks Kubernetes to terminate the pod right away instead of waiting. Note that without --force, current kubectl versions treat --grace-period=0 as a grace period of 1 second.

cyberithub@node1 % kubectl delete pod fscm-app-76u878jjg9-kjigv0 --grace-period=0 -n kronos
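
To see how long the pod has already been waiting, you can read the deletion fields straight from the pod object. This is only a sketch: pod_termination_info is a hypothetical helper name, while the three fields it prints are standard pod fields.

```shell
# Hypothetical helper: print, tab-separated, when deletion was requested
# (metadata.deletionTimestamp), the grace period applied to that request
# (metadata.deletionGracePeriodSeconds) and the pod's configured
# spec.terminationGracePeriodSeconds.
pod_termination_info() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{.metadata.deletionTimestamp}{"\t"}{.metadata.deletionGracePeriodSeconds}{"\t"}{.spec.terminationGracePeriodSeconds}{"\n"}'
}

# Usage: pod_termination_info fscm-app-76u878jjg9-kjigv0 kronos
```

A deletion timestamp that lies far in the past confirms that the grace period expired long ago and the pod is genuinely stuck.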

But mind you, this does not always work. Sometimes, after running the command below, the pod is still stuck in the Terminating state and the command never completes. Again you have to press Ctrl+C to get out of it and then try the next solution.

cyberithub@node1 % kubectl delete pod fscm-app-76u878jjg9-kjigv0 --grace-period=0 -n kronos
pod "fscm-app-76u878jjg9-kjigv0" deleted 
^C

 

Solution 6: Delete pod forcefully using --force

Finally, if none of the above solutions worked, you can fall back on the risky option of deleting the pod forcefully with the --force option. The command to run is kubectl delete pod <pod_name> --grace-period=0 --force -n <namespace>, as shown below. This is risky because it removes the pod object without waiting for confirmation from the node, so underlying resources used by the pod, such as a stuck container, may be left running. Please be careful before applying this solution on critical or production clusters.

cyberithub@node1 % kubectl delete pod fscm-app-76u878jjg9-kjigv0 --grace-period=0 --force -n kronos
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "fscm-app-76u878jjg9-kjigv0" force deleted
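
After a force delete, you may want to check on the node whether any container of that pod is still running. The sketch below assumes the node uses a CRI runtime and has crictl installed, and that you run it on the node itself, usually as root. leftover_containers is a hypothetical helper name.

```shell
# Hypothetical helper: list running containers that still belong to a pod.
# Looks up the pod sandbox IDs by name and namespace, then lists the
# containers in each sandbox. Run on the node where the pod was scheduled.
leftover_containers() {
  for sandbox in $(crictl pods --name "$1" --namespace "$2" -q); do
    crictl ps --pod "$sandbox"
  done
}

# Usage: leftover_containers fscm-app-76u878jjg9-kjigv0 kronos
```

If this still lists containers, the workload is running even though the API server no longer shows the pod.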
