[nodes] How to restart Kubernetes nodes?

The status of the nodes is reported as Unknown:

"conditions": [
          {
            "type": "Ready",
            "status": "Unknown",
            "lastHeartbeatTime": "2015-11-12T06:03:19Z",
            "lastTransitionTime": "2015-11-12T06:04:03Z",
            "reason": "Kubelet stopped posting node status."
          }

while kubectl get nodes returns a NotReady status. What does this imply, and how can I fix it?
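(For reference, the conditions shown above can be read straight off the node object; something like the following returns them, with <node-name> as a placeholder for the actual node name:)

kubectl get node <node-name> -o jsonpath='{.status.conditions}'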

This question is related to: nodes, kubernetes

The answer is


Get nodes

kubectl get nodes

Result:

NAME            STATUS     AGE
192.168.1.157   NotReady   42d
192.168.1.158   Ready      42d
192.168.1.159   Ready      42d

Describe node

There is a NotReady status on the node 192.168.1.157. To debug this NotReady node, you can read the official documentation - Application Introspection and Debugging.

kubectl describe node 192.168.1.157

Partial Result:

Conditions:
Type          Status          LastHeartbeatTime                       LastTransitionTime                      Reason                  Message
----          ------          -----------------                       ------------------                      ------                  -------
OutOfDisk     Unknown         Sat, 28 Dec 2016 12:56:01 +0000         Sat, 28 Dec 2016 12:56:41 +0000         NodeStatusUnknown       Kubelet stopped posting node status.
Ready         Unknown         Sat, 28 Dec 2016 12:56:01 +0000         Sat, 28 Dec 2016 12:56:41 +0000         NodeStatusUnknown       Kubelet stopped posting node status.

There is an OutOfDisk condition on my node, and then the kubelet stopped posting its node status. So I must free some disk space. On my Ubuntu 14.04 node I can check disk usage with the df command, and as root I can remove unused images with docker rmi image_id/image_name.
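A rough sequence for reclaiming space might look like the following (this assumes Docker is the container runtime; the image ID is a placeholder):

# Check which filesystem is full
df -h

# As root, remove an unused image by ID or name
docker rmi <image_id>

# Or remove all unused images in one go
docker image prune -a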

Log in to the node

Log in to 192.168.1.157 via SSH (for example ssh <user>@192.168.1.157, where <user> is your account on that node), and switch to root with sudo su.

Restart kubelet

/etc/init.d/kubelet restart

Result:

stop: Unknown instance: 
kubelet start/running, process 59261
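On distributions that use systemd rather than the legacy init scripts, the equivalent would presumably be:

sudo systemctl restart kubelet
sudo systemctl status kubelet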

Get nodes again

On the master:

kubectl get nodes

Result:

NAME            STATUS    AGE
192.168.1.157   Ready     42d
192.168.1.158   Ready     42d
192.168.1.159   Ready     42d

OK, that node works fine now.

Here is a reference: the Kubernetes documentation.


If a node is so unhealthy that the master can't get status from it -- Kubernetes may not be able to restart the node. And if health checks aren't working, what hope do you have of accessing the node by SSH?

In this case, you may have to hard-reboot -- or, if your hardware is in the cloud, let your provider do it.

For example, the AWS EC2 Dashboard allows you to right-click an instance to pull up an "Instance State" menu -- from which you can reboot/terminate an unresponsive node.
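If you prefer the command line and have the AWS CLI configured, something along these lines should do the same (the instance ID is a placeholder):

aws ec2 reboot-instances --instance-ids i-0123456789abcdef0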

Before doing this, you might choose to kubectl cordon node for good measure. And you may find kubectl delete node to be an important part of the process for getting things back to normal -- if the node doesn't automatically rejoin the cluster after a reboot.
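A rough sequence, with <node-name> standing in for the affected node, might be:

# Stop new pods from being scheduled on the unhealthy node
kubectl cordon <node-name>

# Optionally evict what is still running there (flags for handling local data vary by kubectl version)
kubectl drain <node-name> --ignore-daemonsets

# After the reboot, if the node does not rejoin on its own, remove the stale node object
kubectl delete node <node-name>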


Why would a node become unresponsive? Probably some resource has been exhausted in a way that prevents the host operating system from handling new requests in a timely manner. This could be disk, or network -- but the more insidious case is out-of-memory (OOM), which Linux handles poorly.

To help Kubernetes manage node memory safely, it's a good idea to do both of the following:

  • Reserve some memory for the system.
  • Be very careful with (avoid) opportunistic memory specifications for your pods. In other words, don't allow different values of requests and limits for memory.

The idea here is to avoid the complications associated with memory overcommit, because memory is incompressible, and both Linux and Kubernetes' OOM killers may not trigger before the node has already become unhealthy and unreachable.
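As a sketch of both points (the values and names here are illustrative, not recommendations): the kubelet can reserve memory with flags such as --system-reserved=memory=512Mi, --kube-reserved=memory=512Mi and --eviction-hard=memory.available<500Mi, and a pod whose memory request equals its limit avoids overcommit (with CPU set the same way it lands in the Guaranteed QoS class):

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-demo
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "256Mi"
        cpu: "250m"
      limits:
        memory: "256Mi"
        cpu: "250m"
EOF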


You can delete the node from the master by issuing:

kubectl delete node hostname.company.net

The NotReady status probably means that the master can't reach the kubelet service on that node. Check that everything is OK on the node itself.
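On the affected node, something like the following usually tells you whether the kubelet is running and why it might have stopped reporting:

systemctl status kubelet
journalctl -u kubelet --no-pager | tail -n 50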


In my case I am running 3 nodes in VMs using Hyper-V. By using the following steps I was able to "restart" the cluster after restarting all of the VMs.

  1. (Optional) Turn swap off

    $ swapoff -a

  2. You have to restart all Docker containers

    $ docker restart $(docker ps -a -q)

  3. Check the node status after you have performed steps 1 and 2 on all nodes (the status will be NotReady)

    $ kubectl get nodes

  4. Restart the kubelet on each node

    $ systemctl restart kubelet

  5. Check the status again (it should now be Ready)

Note: I do not know if the order in which the nodes are restarted matters, but I chose to start with the k8s master node and then the minions. It also takes a little while for the node state to change from NotReady to Ready.


I had an on-premises HA installation; a master and a worker stopped working, returning a NotReady status. Checking the kubelet logs on the nodes, I found this problem:

failed to run Kubelet: Running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false

Disabling swap on nodes with

swapoff -a

and restarting the kubelet

systemctl restart kubelet

did the trick.
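To keep swap disabled across reboots as well, one common (if blunt) approach is to comment out the swap entries in /etc/fstab before restarting the kubelet; review the file before running this:

swapoff -a
sed -i.bak '/\sswap\s/ s/^/#/' /etc/fstab
systemctl restart kubelet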


I had this problem too, but it looks like it depends on the Kubernetes offering and how everything was installed. In Azure, if you are using an acs-engine install, you can find the shell script that is actually run to provision the cluster at:

/opt/azure/containers/provision.sh

To get a more fine-grained understanding, just read through it and run the commands that it specifies. For me, I had to run the following as root:

systemctl enable kubectl
systemctl restart kubectl

I don't know if the enable is necessary and I can't say if these will work with your particular installation, but it definitely worked for me.
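Whatever the service ends up being called in your installation, once it is back up you can confirm from the master that the node has returned to Ready:

kubectl get nodes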