Horizontal pod autoscaler in Kubernetes

Checkmate Global Technologies helps companies with their digital transformation journey by employing cutting-edge technologies. We are an ISO-certified, CMM level-3 business, and our emphasis on client satisfaction is evident in how we approach digital transformation.


A HorizontalPodAutoscaler in Kubernetes automatically scales a workload to match demand by updating a workload resource such as a Deployment or StatefulSet. When load increases, more Pods are deployed; this is referred to as horizontal scaling. When load drops and the number of Pods is above the configured minimum, the HorizontalPodAutoscaler instructs the workload resource (the Deployment, StatefulSet, or similar resource) to scale back down.

To use the horizontal pod autoscaler in Kubernetes, we first need to install the Kubernetes Metrics Server, which collects the resource usage of Pods and nodes, such as CPU and memory utilization.

To install the Metrics Server on EKS, use the following command.

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

If you are using a local Kubernetes cluster such as minikube, you will need to add the --kubelet-insecure-tls argument to the container args in the Metrics Server Deployment manifest.
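One way to add that argument without editing the manifest by hand is a JSON patch against the running Deployment. This is only a sketch: it assumes the Metrics Server was installed from the upstream components.yaml, so that its container is the first one in the Pod template and already has an args list.

```shell
# Append --kubelet-insecure-tls to the first container's args in the
# metrics-server Deployment. Assumption: the container is at index 0 and
# already defines an args list, as in the upstream components.yaml.
# Intended for local test clusters only, since the flag disables
# verification of the kubelet's serving certificate.
kubectl patch deployment metrics-server -n kube-system --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'
```

Patching the Deployment triggers a rollout, so the Metrics Server Pod restarts with the new argument.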

Use the following command to verify that the metrics server deployment is running.

kubectl get deployment metrics-server -n kube-system

Now we will deploy an Nginx web server using a Kubernetes Deployment. To use the autoscaler we need to define resource requests, because the autoscaler measures CPU utilization as a percentage of the requested CPU. Requests define the minimum amount of resources required by the Pods.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deploy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:1.23
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: 200Mi
            cpu: 100m
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: web
  type: NodePort
  ports:
  - name: http
    port: 80
    nodePort: 30080

Next, deploy the horizontal pod autoscaler manifest. Here we specify that if average CPU utilization rises above 50% of the requested CPU, the autoscaler will scale up the replicas, within a minimum of 1 and a maximum of 10.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deploy
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60
      policies:
      - type: Pods
        value: 1
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Pods
        value: 2
        periodSeconds: 15
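The Deployment, Service, and autoscaler manifests can be applied and checked roughly as follows. The file names nginx.yaml and nginx-hpa.yaml are hypothetical; use whatever names you saved the manifests under.

```shell
# Hypothetical file names for the Deployment/Service and HPA manifests.
kubectl apply -f nginx.yaml
kubectl apply -f nginx-hpa.yaml

# Show the autoscaler's targets and its current and desired replica counts.
kubectl get hpa nginx-hpa

# Show conditions and recent scaling events for the autoscaler.
kubectl describe hpa nginx-hpa
```

Right after creation, the TARGETS column may show an unknown value until the Metrics Server has reported usage for the Pods.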

To test the autoscaler, we will use a busybox Deployment. It calls the Nginx Service in a loop, which increases the CPU load on the Nginx Pods.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-load
  labels:
    app: nginx-load
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-load
  template:
    metadata:
      name: nginx-load
      labels:
        app: nginx-load
    spec:
      containers:
      - name: busybox
        image: busybox
        command:
        - /bin/sh
        - -c
        - "while true; do wget -q -O- nginx-service; done"

After deploying the busybox Pod, we will see CPU utilization increase. In our case it rose to 80%, which is above the 50% threshold that we defined.
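To observe the utilization yourself, commands along these lines should work, assuming the object names and the app: web label from the manifests in this article.

```shell
# Current CPU and memory usage of the Nginx Pods (selected by label app=web).
kubectl top pods -l app=web

# Watch the autoscaler; the TARGETS column shows current vs. target utilization.
# Press Ctrl+C to stop watching.
kubectl get hpa nginx-hpa --watch
```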

After some time, the autoscaler notices the increased CPU utilization and increases the number of replicas to 3, which brings the average CPU utilization down.
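The replica count the autoscaler aims for follows the rule given in the Kubernetes documentation: desired replicas = ceil(current replicas × current utilization / target utilization), kept within minReplicas and maxReplicas. The following is a minimal sketch of that arithmetic in shell. The function name and the input figures are illustrative assumptions, not values taken from the run described here, and the sketch leaves out details of the real controller such as its tolerance band and the behavior policies.

```shell
# Sketch of the HPA scaling rule (illustrative, not the controller's code):
#   desired = ceil(current_replicas * current_utilization / target_utilization)
# clamped to [min_replicas, max_replicas]. Utilization values are percentages.
desired_replicas() {
  current=$1; utilization=$2; target=$3; min=$4; max=$5
  # Ceiling division using integer arithmetic only.
  desired=$(( (current * utilization + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

desired_replicas 2 75 50 1 10   # 2 Pods averaging 75% vs. a 50% target: prints 3
desired_replicas 3 20 50 1 10   # load has dropped, 3 Pods averaging 20%: prints 2
```

The count you actually observe depends on the utilization sampled at each sync and on the scaleUp and scaleDown policies, so it may be reached in several steps.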

Now we will delete the busybox Deployment, and after some time the number of replicas will scale back down as well.
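Scale-down is gradual because the scaleDown policy in the autoscaler manifest removes at most 1 Pod per 60-second period, after the 60-second stabilization window. As a rough sketch, the number of policy periods needed to move between two replica counts can be estimated as below. The helper name is hypothetical, and the result is a lower bound on the number of steps rather than an exact timing.

```shell
# Rough lower bound on the number of policy periods needed to move from one
# replica count to another when a "Pods" policy allows at most `per_period`
# Pods to be added or removed in each period. Hypothetical helper.
policy_periods() {
  from=$1; to=$2; per_period=$3
  if [ "$from" -ge "$to" ]; then
    delta=$(( from - to ))
  else
    delta=$(( to - from ))
  fi
  # Ceiling division using integer arithmetic only.
  echo $(( (delta + per_period - 1) / per_period ))
}

policy_periods 3 1 1   # scale down 3 -> 1 at 1 Pod per period: prints 2
policy_periods 1 5 2   # scale up 1 -> 5 at 2 Pods per period: prints 2
```

The load generator itself can be removed with kubectl delete deployment nginx-load, after which kubectl get hpa nginx-hpa --watch shows the replica count stepping down.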

Please contact our technical consultants if you would like to discuss anything related to cloud infrastructure.
