Like any other software system, a Kubernetes cluster needs regular cleanup to keep performing at an optimal level. As the cluster grows, so does the number of API objects, volumes, services, and other resources, which can push the cluster into its resource limits, degrade performance, and undermine stability.
Why Clean Kubernetes?
Many kinds of resources can sit unused within a cluster, from pods, persistent volumes, and secrets to role bindings and service accounts, and they can lead to various issues.
Forgotten pods can consume the resources available on a node and starve other pods. The problem is more pronounced if you have dedicated Kubernetes nodes bound to a special resource such as a GPU. The same applies to storage. An unused persistent volume still occupies storage capacity and can contribute to insufficient storage on the nodes. All these wasted resources mean higher costs for maintaining the cluster.
Numerous objects accumulating in the etcd database can directly affect the overall performance of the cluster. Additionally, from a security standpoint, keeping unused role bindings and service accounts increases the attack surface of the cluster.
Ways to Keep Kubernetes Clean
There are two ways Kubernetes can be cleaned. The simplest way is to clean the cluster environment manually. However, it can be a time-consuming process as an admin has to go through all the objects within the cluster to identify things that can be cleaned.
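As a rough illustration of what that manual review involves, the sketch below lists ConfigMaps that no pod references. It assumes the objects were exported beforehand (for example, the `items` list from `kubectl get pods --all-namespaces -o json`), and the function names are hypothetical. It only checks volumes, `envFrom`, and `env` references, so its output is a list of candidates to review, not a list that is safe to delete.

```python
from typing import Iterable, List, Set, Tuple

Key = Tuple[str, str]  # (namespace, name)


def referenced_configmaps(pods: Iterable[dict]) -> Set[Key]:
    """Collect the (namespace, name) of every ConfigMap the given pods reference."""
    used: Set[Key] = set()
    for pod in pods:
        ns = pod["metadata"]["namespace"]
        spec = pod.get("spec", {})
        # ConfigMaps mounted as volumes.
        for volume in spec.get("volumes", []):
            config_map = volume.get("configMap")
            if config_map:
                used.add((ns, config_map["name"]))
        # ConfigMaps consumed through envFrom or individual env entries.
        containers = spec.get("containers", []) + spec.get("initContainers", [])
        for container in containers:
            for source in container.get("envFrom", []):
                ref = source.get("configMapRef")
                if ref:
                    used.add((ns, ref["name"]))
            for env in container.get("env", []):
                ref = (env.get("valueFrom") or {}).get("configMapKeyRef")
                if ref:
                    used.add((ns, ref["name"]))
    return used


def unused_configmaps(configmaps: Iterable[dict], pods: Iterable[dict]) -> List[Key]:
    """Return ConfigMaps that no pod references, sorted by namespace and name."""
    used = referenced_configmaps(pods)
    keys = ((cm["metadata"]["namespace"], cm["metadata"]["name"]) for cm in configmaps)
    return sorted(key for key in keys if key not in used)
```

A ConfigMap may still be needed even when no running pod mentions it, for instance by a workload that is currently scaled to zero or by a controller that reads it through the API, which is part of why manual review takes time.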
The other option is to set up preventive measures such as limit ranges and resource quotas, which cap how many objects can be created within a namespace and how much CPU, memory, and storage those objects can consume. This ensures that only a specified number of objects can be created and forces the cluster admin to reevaluate the existing objects when a limit is reached.
If you already have a cluster running without these preventive measures, you need to clean it up manually before configuring the limits. Otherwise, the cluster may hit the new quotas and limits right away, causing issues in the functionality of the cluster.
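To make the preventive measures concrete, here is a minimal sketch that builds a ResourceQuota and a LimitRange manifest as plain dictionaries and prints them as JSON, which `kubectl apply -f -` accepts on standard input. The namespace `team-a`, the object names, and every numeric value are placeholders chosen for illustration, not recommendations.

```python
import json


def resource_quota(namespace: str, name: str = "team-quota") -> dict:
    """ResourceQuota capping object counts and total compute in one namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "hard": {
                # Object-count limits (placeholder values).
                "pods": "20",
                "configmaps": "30",
                "secrets": "30",
                "persistentvolumeclaims": "10",
                # Aggregate compute limits across the namespace (placeholder values).
                "requests.cpu": "8",
                "requests.memory": "16Gi",
                "limits.cpu": "16",
                "limits.memory": "32Gi",
            }
        },
    }


def limit_range(namespace: str, name: str = "container-defaults") -> dict:
    """LimitRange giving containers default requests/limits and a per-container cap."""
    return {
        "apiVersion": "v1",
        "kind": "LimitRange",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "limits": [
                {
                    "type": "Container",
                    "defaultRequest": {"cpu": "100m", "memory": "128Mi"},
                    "default": {"cpu": "500m", "memory": "512Mi"},
                    "max": {"cpu": "2", "memory": "2Gi"},
                }
            ]
        },
    }


def manifests(namespace: str) -> dict:
    """Bundle both objects in a v1 List so they can be applied together."""
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [resource_quota(namespace), limit_range(namespace)],
    }


if __name__ == "__main__":
    # "team-a" is a hypothetical namespace name.
    print(json.dumps(manifests("team-a"), indent=2))
```

Saved under a hypothetical name such as `quota.py`, the output could be applied with `python quota.py | kubectl apply -f -`. The two objects are paired on purpose: once a quota covers CPU or memory requests and limits, pods that do not declare them are rejected, and the LimitRange defaults fill that gap.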
Automating Kubernetes Cleanup
Manual cleaning is not a scalable solution, yet there will be instances where cleaning is required even after setting up limits and quotas. This is where automated cleaning tools come into play. For a simple cleaning requirement, one can use a tool such as k8spurger that will look for unused role bindings, service accounts, configmaps, etc., and remove them in bulk.
A cluster administrator can consider kube-janitor as a more comprehensive solution. This customizable tool can be deployed as any other workload within the cluster. What separates this tool from others is that it allows users to set up custom cleaning rules that dictate what can and should be cleaned. This tool helps achieve all cluster cleaning requirements, from setting up simple rules to clean up stale or unused resources to cleaning entire namespaces after a set amount of time.
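The idea behind a time-based cleaning rule can be sketched in a few lines. kube-janitor's documentation describes a time-to-live annotation (`janitor/ttl`) on resources; the code below is not the tool's implementation, only an illustration of how such a rule might be evaluated. The accepted unit suffixes and the function names are assumptions made for this example.

```python
import re
from datetime import datetime, timedelta, timezone

# Unit suffixes accepted by this sketch (an assumption, not the tool's spec).
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days", "w": "weeks"}
_TTL_RE = re.compile(r"^(\d+)([smhdw])$")


def parse_ttl(value: str) -> timedelta:
    """Turn a value such as '30m' or '7d' into a timedelta."""
    match = _TTL_RE.match(value.strip())
    if not match:
        raise ValueError(f"unsupported TTL value: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def is_expired(obj: dict, now: datetime, annotation: str = "janitor/ttl") -> bool:
    """True if the object carries a TTL annotation and has outlived it."""
    meta = obj.get("metadata", {})
    ttl = (meta.get("annotations") or {}).get(annotation)
    if ttl is None:
        return False  # objects without the annotation are never touched
    # Kubernetes serializes creationTimestamp as RFC 3339 in UTC, e.g. 2024-01-01T00:00:00Z.
    created = datetime.strptime(
        meta["creationTimestamp"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return now >= created + parse_ttl(ttl)
```

Treating unannotated objects as never expired keeps a rule like this opt-in, which is a cautious default for anything that deletes resources.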
Automation does not mean you should ignore the above-mentioned preventive measures. The best approach will be a mixture of both. One thing to be aware of when using automated tools that live inside the cluster is that they will also consume resources. Thus, properly allocating the necessary resources for the tool is also an essential step in setting up automated cleaning within the cluster.
Using Cluster Monitoring in Kubernetes Cluster Cleanup
Even with manual and automated cleaning, there will be instances where some unused objects are missed by the cleaning process. Implementing proper monitoring within the cluster is the best way to mitigate this issue while ensuring that the objects are kept within the defined resource quotas and limits.
Tools like Prometheus can be used to collect cluster metrics such as the etcd database size, object counts, resource usage for each pod (CPU, memory, storage), the number of pods on each node, and log sizes for pods. These metrics give a cluster admin a top-down view of the entire cluster. They are useful not only for identifying unused or orphaned objects during cleanup, but also for optimizing and troubleshooting clusters.
Monitoring should not be configured for the sake of monitoring. There should be automated alerts, with cluster administrators proactively monitoring the environment to gain the full advantage of monitoring.
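As one possible shape for such an alert, the sketch below inspects the `status` of a ResourceQuota object, which reports `hard` and `used` values per resource, and flags anything at or above a threshold. The 80% threshold and the function name are assumptions for illustration. To stay short, it only handles plain integer counts such as pods or configmaps and skips quantities like `16Gi` or `500m`.

```python
from typing import List, Tuple


def quota_alerts(quota: dict, threshold: float = 0.8) -> List[Tuple[str, int, int]]:
    """Return (resource, used, hard) for count-based resources at or above the threshold."""
    status = quota.get("status", {})
    hard = status.get("hard", {})
    used = status.get("used", {})
    alerts = []
    for resource, limit in hard.items():
        current = used.get(resource, "0")
        # Skip quantities with unit suffixes; only plain integers are compared here.
        if not (limit.isdigit() and current.isdigit()):
            continue
        limit_n, current_n = int(limit), int(current)
        if limit_n == 0:
            continue  # avoid dividing by zero on a quota that forbids the resource
        if current_n / limit_n >= threshold:
            alerts.append((resource, current_n, limit_n))
    return sorted(alerts)
```

In a real setup the same comparison would more likely live in the monitoring system's alerting rules; the point here is only that an alert should fire before a quota is exhausted, not after.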
Keeping the Kubernetes cluster neat and tidy is an important aspect of cluster management. However, cleanup within the cluster is an afterthought for most Kubernetes administrators, which can negatively impact the cluster performance and cause a waste of resources that could have been better utilized. Therefore, a proper cleaning strategy must be a part of any well-managed Kubernetes cluster.
With simple yet powerful methods natively supported by Kubernetes and free third-party tools available for automation, Kubernetes administrators have no excuse not to implement a proper cleaning strategy for their clusters.