6. How to configure HI GIO Kubernetes cluster autoscale


Overview

Step-by-step guide on how to configure HI GIO Kubernetes cluster autoscale

  • Install tanzu-cli

  • Create cluster-autoscaler deployment from tanzu package using tanzu-cli

  • Enable cluster autoscale for your cluster

  • Test cluster autoscale

  • Delete cluster-autoscaler deployment and clean up test resource

 

Procedure

  1. Pre-requisites:
  • An Ubuntu bastion host that can connect to your Kubernetes cluster

  • Permission to access your Kubernetes cluster

 

  2. Procedure:
  1. Install tanzu-cli

    # Install tanzu-cli on Ubuntu
    sudo apt update
    sudo apt install -y ca-certificates curl gpg
    sudo mkdir -p /etc/apt/keyrings
    curl -fsSL https://storage.googleapis.com/tanzu-cli-installer-packages/keys/TANZU-PACKAGING-GPG-RSA-KEY.gpg | sudo gpg --dearmor -o /etc/apt/keyrings/tanzu-archive-keyring.gpg
    echo "deb [signed-by=/etc/apt/keyrings/tanzu-archive-keyring.gpg] https://storage.googleapis.com/tanzu-cli-installer-packages/apt tanzu-cli-jessie main" | sudo tee /etc/apt/sources.list.d/tanzu.list
    sudo apt update
    sudo apt install -y tanzu-cli

    # Verify the tanzu-cli installation
    tanzu version
    image-20241223-082306.png

To install tanzu-cli in other environments, please refer to the documentation below:

Installing and Using VMware Tanzu CLI v1.5.x

(Optional) If you want to configure tanzu shell completion, run the command below and follow the printed instructions

tanzu completion --help
image-20241223-082817.png
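
For example, a minimal bash setup (a sketch only; zsh and fish are covered by the same help output, so check it for the exact instructions for your shell) adds one line to your ~/.bashrc:

    # Hypothetical example; follow the output of `tanzu completion --help` for your shell
    source <(tanzu completion bash)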

 

  2. Create cluster-autoscaler deployment from tanzu package using tanzu-cli

  • Switch to your Kubernetes cluster context

    kubectl config use-context <your context name>
    image-20241223-083216.png
  • List the available cluster-autoscaler versions in the tanzu package repository and note the version name

    tanzu package available list cluster-autoscaler.tanzu.vmware.com
    image-20241223-083428.png
  • Create a kubeconfig secret named cluster-autoscaler-mgmt-config-secret in the cluster's kube-system namespace

    kubectl create secret generic cluster-autoscaler-mgmt-config-secret \
      --from-file=value=<path to your kubeconfig file> \
      -n kube-system
    image-20241223-084449.png

Please do not change the secret name (cluster-autoscaler-mgmt-config-secret) or the namespace (kube-system)
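
For reference, the command above produces an Opaque Secret holding your kubeconfig under the value key, roughly of this shape (base64 content omitted):

    apiVersion: v1
    kind: Secret
    metadata:
      name: cluster-autoscaler-mgmt-config-secret
      namespace: kube-system
    type: Opaque
    data:
      value: <base64-encoded kubeconfig>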

  • Create cluster-autoscaler-values.yaml file

    arguments:
      ignoreDaemonsetsUtilization: true
      maxNodeProvisionTime: 15m
      maxNodesTotal: 0 # Leave this value as 0. We will define the max and min number of nodes later.
      metricsPort: 8085
      scaleDownDelayAfterAdd: 10m
      scaleDownDelayAfterDelete: 10s
      scaleDownDelayAfterFailure: 3m
      scaleDownUnneededTime: 10m
    clusterConfig:
      clusterName: "demo-autoscale-tkg" # adjust here
      clusterNamespace: "demo-autoscale-tkg-ns" # adjust here
    paused: false

Required values:

  • clusterName: your cluster name

  • clusterNamespace: your cluster namespace

  • Install cluster-autoscaler

    # Use the version listed above that matches your Kubernetes version.
    # Do not change the namespace; tkg-system is the default namespace for tanzu packages.
    tanzu package install cluster-autoscaler \
      --package cluster-autoscaler.tanzu.vmware.com \
      --version <version available> \
      --values-file 'cluster-autoscaler-values.yaml' \
      --namespace tkg-system
    image-20241223-085721.png

The cluster-autoscaler will deploy into the kube-system namespace.

Run the command below to verify the cluster-autoscaler deployment

kubectl get deployments.apps -n kube-system cluster-autoscaler
image-20241223-090422.png
  • Configure the minimum and maximum number of nodes in your cluster

    • Get the machinedeployment name and namespace

      kubectl get machinedeployments.cluster.x-k8s.io -A
      image-20241223-090941.png
    • Set cluster-api-autoscaler-node-group-min-size and cluster-api-autoscaler-node-group-max-size

      kubectl annotate machinedeployment <machinedeployment name> cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size=<number min> -n <machinedeployment namespace>
      kubectl annotate machinedeployment <machinedeployment name> cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size=<number max> -n <machinedeployment namespace>
      image-20241223-091454.png
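
As an illustration with the example names used in this guide, a MachineDeployment scaled between 1 and 5 nodes would end up carrying annotations like the following (the values here are placeholders; use your own min/max):

      metadata:
        annotations:
          cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "1"
          cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "5"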

       

  3. Enable cluster autoscale for your cluster

Because this step requires provider permissions, please contact the cloud provider to perform it.

 

  4. Test cluster autoscale

  • Get the current number of nodes

    kubectl get nodes
    image-20241223-092206.png

There is currently only one worker node.

  • Create test-autoscale.yaml file

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx
      namespace: default
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx
            ports:
            - containerPort: 80
          topologySpreadConstraints: # Spreads pods across different nodes (ensures no node has more pods than others)
          - maxSkew: 1
            topologyKey: kubernetes.io/hostname
            whenUnsatisfiable: DoNotSchedule
            labelSelector:
              matchLabels:
                app: nginx
  • Apply the test-autoscale.yaml file to deploy 2 replicas of the nginx pod in the default namespace (this will trigger the creation of a new worker node)

    kubectl apply -f test-autoscale.yaml
    image-20241223-093311.png
  • Get nginx deployment

    kubectl get pods
    image-20241223-093659.png
    kubectl describe pod nginx-589656b9b5-mcm5j | grep -A 10 Events
    image-20241223-093958.png

You can see there is a new nginx pod with a status of Pending, and the events show FailedScheduling and TriggeredScaleUp:

Warning  FailedScheduling  2m53s  default-scheduler   0/2 nodes are available: 1 node(s) didn't match pod topology spread constraints, 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }. preemption: 0/2 nodes are available: 1 No preemption victims found for incoming pod, 1 Preemption is not helpful for scheduling.
Normal   TriggeredScaleUp  2m43s  cluster-autoscaler  pod triggered scale-up: [{MachineDeployment/demo-autoscale-tkg-ns/demo-autoscale-tkg-worker-node-pool-1 1->2 (max: 5)}]
  • Wait for the new node to be provisioned; you will then see a new worker node and the new nginx pod in Running status

    image-20241223-094250.png
  • Clean up test resource

    kubectl delete -f test-autoscale.yaml
    image-20241223-095032.png

After deleting the nginx test deployment, the cluster waits a few minutes before deleting the unneeded node (see the scaleDownUnneededTime value in the cluster-autoscaler-values.yaml file)

  5. Delete cluster-autoscaler deployment (Optional)

If you no longer want your cluster to auto-scale, you can delete the cluster-autoscaler deployment using tanzu-cli

tanzu package installed delete cluster-autoscaler -n tkg-system -y
image-20241223-100123.png

End.
