If you want to reboot a bare metal server that is registered as a node in your cluster, the easiest way is to reprovision it.
You can do this by deleting the Machine object associated with the bare metal node.
However, in some cases it is useful to reboot the machine in place, for example when the local disk contains data that would take too long to re-sync.
This guide explains how to do that.
In the Autopilot cluster, set the paused annotation for the machine:
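A minimal sketch of this step. It assumes the machine is paused with the standard Cluster API annotation `cluster.x-k8s.io/paused`; the annotation key is not spelled out in this guide, so verify it against your platform's documentation before relying on it. The helper name `pause_machine` and its two arguments are hypothetical, and `kubectl` is assumed to point at the Autopilot cluster.

```shell
# Hypothetical helper: set the paused annotation on one Machine object.
# Assumption: the key is the standard Cluster API one, cluster.x-k8s.io/paused.
# Run with kubectl pointing at the Autopilot cluster.
pause_machine() {
  machine_name="$1"   # name of the Machine object
  namespace="$2"      # namespace that contains the Machine object
  kubectl annotate machine "$machine_name" \
    --namespace "$namespace" \
    cluster.x-k8s.io/paused=""
}
```

For example, `pause_machine my-machine my-namespace` (both names are placeholders). Afterwards, `kubectl get machine my-machine --namespace my-namespace -o yaml` should list the annotation under `metadata.annotations`.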
Double-check the annotation for typos. If it is misspelled, the node will be flagged as not functional and the machine will be reprovisioned.
In your workload cluster, drain the node:
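A minimal sketch of the drain step, assuming `kubectl` points at the workload cluster. The helper name `drain_node` is hypothetical, and the two flags are an assumption about what a typical node needs; adjust them to your workloads.

```shell
# Hypothetical helper: cordon the node and evict its pods.
# Run with kubectl pointing at the workload cluster.
drain_node() {
  node_name="$1"
  # --ignore-daemonsets: DaemonSet-managed pods cannot be evicted, so skip them.
  # --delete-emptydir-data: allow evicting pods that use emptyDir volumes
  #   (the contents of those volumes are lost).
  kubectl drain "$node_name" --ignore-daemonsets --delete-emptydir-data
}
```

For example, `drain_node my-node`, where `my-node` is a placeholder for the name shown by `kubectl get nodes`.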
Wait until all pods have terminated. You can check with this command:
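One way to check, sketched under the assumption that `kubectl` points at the workload cluster, is to list the pods still scheduled on the node with a field selector. The helper name `list_pods_on_node` is hypothetical.

```shell
# Hypothetical helper: list every pod that is still scheduled on the node.
# Run with kubectl pointing at the workload cluster.
list_pods_on_node() {
  node_name="$1"
  kubectl get pods --all-namespaces --output wide \
    --field-selector "spec.nodeName=$node_name"
}
```

If the node was drained with DaemonSets ignored, DaemonSet-managed pods will still appear in this list; that is expected.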
Now, you can SSH into the server, and perform any needed maintenance tasks.
If you are unsure how to do that, refer to the "How to SSH into nodes" guide.
To reboot the server, run this in the server shell:
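A minimal sketch, assuming the server runs systemd and your user is allowed to use `sudo`. It is wrapped in a hypothetical helper, `reboot_server`, so that nothing happens until you call it.

```shell
# Hypothetical helper, defined in the SSH session on the server itself.
# Assumption: the server uses systemd and your user may use sudo.
reboot_server() {
  sudo systemctl reboot
}
```

Calling `reboot_server` ends the SSH session as the server goes down.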
Once the server is back up, uncordon the node in your workload cluster:
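A minimal sketch of the uncordon step, assuming `kubectl` points at the workload cluster. The helper name `uncordon_node` is hypothetical.

```shell
# Hypothetical helper: mark the node as schedulable again.
# Run with kubectl pointing at the workload cluster.
uncordon_node() {
  node_name="$1"
  kubectl uncordon "$node_name"
}
```

Afterwards, `kubectl get nodes` should show the node as `Ready` without the `SchedulingDisabled` marker.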
Then, in the Autopilot cluster, remove the paused annotation:
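A minimal sketch of removing the annotation, under the same assumption as when pausing: that the key is the standard Cluster API annotation `cluster.x-k8s.io/paused`. If a different key was used to pause the machine, remove that one instead. The helper name `unpause_machine` and its arguments are hypothetical.

```shell
# Hypothetical helper: remove the paused annotation from one Machine object.
# Assumption: the key is the standard Cluster API one, cluster.x-k8s.io/paused.
# Run with kubectl pointing at the Autopilot cluster.
unpause_machine() {
  machine_name="$1"   # name of the Machine object
  namespace="$2"      # namespace that contains the Machine object
  # A trailing "-" after the key tells kubectl annotate to remove the annotation.
  kubectl annotate machine "$machine_name" \
    --namespace "$namespace" \
    cluster.x-k8s.io/paused-
}
```

For example, `unpause_machine my-machine my-namespace`, using the same placeholder names as when the machine was paused.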
Now the node is a functional member of the cluster again.