6. How to configure HI GIO Kubernetes cluster autoscale
Overview
Step-by-step guide on how to configure HI GIO Kubernetes cluster autoscale
Install tanzu-cli
Create cluster-autoscaler deployment from tanzu package using tanzu-cli
Enable cluster autoscale for your cluster
Test cluster autoscale
Delete cluster-autoscaler deployment and clean up test resource
Procedure
- Pre-requisites:
An Ubuntu bastion host that can connect to your Kubernetes cluster
Permission to access your Kubernetes cluster
- Procedure:
Install tanzu-cli
# Install tanzu-cli on Ubuntu
sudo apt update
sudo apt install -y ca-certificates curl gpg
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://storage.googleapis.com/tanzu-cli-installer-packages/keys/TANZU-PACKAGING-GPG-RSA-KEY.gpg | sudo gpg --dearmor -o /etc/apt/keyrings/tanzu-archive-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/tanzu-archive-keyring.gpg] https://storage.googleapis.com/tanzu-cli-installer-packages/apt tanzu-cli-jessie main" | sudo tee /etc/apt/sources.list.d/tanzu.list
sudo apt update
sudo apt install -y tanzu-cli

# Verify tanzu-cli installation
tanzu version
To install tanzu-cli in other environments, please refer to the documentation below:
https://techdocs.broadcom.com/us/en/vmware-tanzu/cli/tanzu-cli/1-5/cli/index.html
(Optional) To configure tanzu completion, run the command below and follow the instructions in its output
tanzu completion --help
Create cluster-autoscaler deployment from tanzu package using tanzu-cli
Switch to your Kubernetes context
kubectl config use-context <your context name>
List the available cluster-autoscaler versions in the tanzu package and note the version name
tanzu package available list cluster-autoscaler.tanzu.vmware.com
Create a kubeconfig secret named cluster-autoscaler-mgmt-config-secret in the cluster's kube-system namespace
kubectl create secret generic cluster-autoscaler-mgmt-config-secret \
  --from-file=value=<path to your kubeconfig file> \
  -n kube-system
Please do not change the secret name (cluster-autoscaler-mgmt-config-secret) and namespace (kube-system)
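The secret creation fails with a less obvious error if the kubeconfig path is wrong, so a minimal sketch of a guard is shown here. The function name create_autoscaler_secret is hypothetical (it is not part of kubectl or tanzu-cli); the secret name and namespace are the fixed values from this guide.

```shell
# Hypothetical helper: create the kubeconfig secret only if the
# kubeconfig file actually exists on the bastion.
create_autoscaler_secret() {
  kubeconfig_path="$1"
  if [ ! -f "$kubeconfig_path" ]; then
    echo "kubeconfig file not found: $kubeconfig_path" >&2
    return 1
  fi
  # Secret name and namespace must stay exactly as written in this guide.
  kubectl create secret generic cluster-autoscaler-mgmt-config-secret \
    --from-file=value="$kubeconfig_path" \
    -n kube-system
}

# Usage (the path is a placeholder):
#   create_autoscaler_secret "$HOME/.kube/config"
```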
Create the cluster-autoscaler-values.yaml file
arguments:
  ignoreDaemonsetsUtilization: true
  maxNodeProvisionTime: 15m
  maxNodesTotal: 0 #Leave this value as 0. We will define the max and min number of nodes later.
  metricsPort: 8085
  scaleDownDelayAfterAdd: 10m
  scaleDownDelayAfterDelete: 10s
  scaleDownDelayAfterFailure: 3m
  scaleDownUnneededTime: 10m
clusterConfig:
  clusterName: "demo-autoscale-tkg" #adjust here
  clusterNamespace: "demo-autoscale-tkg-ns" #adjust here
paused: false
Required values:
clusterName: your cluster name
clusterNamespace: your cluster namespace
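If you manage several clusters, the values file can be generated instead of edited by hand. The sketch below writes the same settings as the file in this guide and only substitutes the two required values. The function name write_autoscaler_values and its argument order are assumptions made for illustration.

```shell
# Hypothetical helper: write cluster-autoscaler-values.yaml for a given
# cluster name and namespace. All other settings are copied from this guide.
write_autoscaler_values() {
  cluster_name="$1"
  cluster_namespace="$2"
  out_file="${3:-cluster-autoscaler-values.yaml}"
  if [ -z "$cluster_name" ] || [ -z "$cluster_namespace" ]; then
    echo "usage: write_autoscaler_values <cluster name> <cluster namespace> [file]" >&2
    return 1
  fi
  cat > "$out_file" <<EOF
arguments:
  ignoreDaemonsetsUtilization: true
  maxNodeProvisionTime: 15m
  maxNodesTotal: 0 #Leave this value as 0. Min and max are set per machinedeployment.
  metricsPort: 8085
  scaleDownDelayAfterAdd: 10m
  scaleDownDelayAfterDelete: 10s
  scaleDownDelayAfterFailure: 3m
  scaleDownUnneededTime: 10m
clusterConfig:
  clusterName: "$cluster_name"
  clusterNamespace: "$cluster_namespace"
paused: false
EOF
}

# Usage (example names from this guide):
#   write_autoscaler_values demo-autoscale-tkg demo-autoscale-tkg-ns
```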
Install cluster-autoscaler
For --version, use the version from the list above that matches your Kubernetes version. Please do not change the namespace: tkg-system is the default namespace for tanzu packages.
tanzu package install cluster-autoscaler \
  --package cluster-autoscaler.tanzu.vmware.com \
  --version <version available> \
  --values-file 'cluster-autoscaler-values.yaml' \
  --namespace tkg-system
The cluster-autoscaler will deploy into the kube-system namespace.
Run the command below to verify the cluster-autoscaler deployment
kubectl get deployments.apps -n kube-system cluster-autoscaler
Configure the minimum and maximum number of nodes in your cluster
Get the machinedeployments name and namespace
kubectl get machinedeployments.cluster.x-k8s.io -A
Set cluster-api-autoscaler-node-group-min-size and cluster-api-autoscaler-node-group-max-size
kubectl annotate machinedeployment <machinedeployment name> cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size=<number min> -n <machinedeployment namespace>
kubectl annotate machinedeployment <machinedeployment name> cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size=<number max> -n <machinedeployment namespace>
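A minimum larger than the maximum is an easy mistake to make when typing the two annotate commands separately. As a rough sketch, both annotations can be set in one call after a simple sanity check. The function name set_autoscaler_bounds is hypothetical, and the --overwrite flag is an addition not used in this guide; it lets the command be re-run when the annotations already exist.

```shell
# Hypothetical helper: validate the bounds, then set both autoscaler
# annotations on one machinedeployment with a single kubectl call.
set_autoscaler_bounds() {
  md_name="$1"
  md_namespace="$2"
  min_size="$3"
  max_size="$4"
  # Both sizes must be non-empty and contain digits only.
  for value in "$min_size" "$max_size"; do
    case "$value" in
      ''|*[!0-9]*)
        echo "min and max must be non-negative integers" >&2
        return 1
        ;;
    esac
  done
  if [ "$min_size" -gt "$max_size" ]; then
    echo "min size ($min_size) must not exceed max size ($max_size)" >&2
    return 1
  fi
  kubectl annotate machinedeployment "$md_name" -n "$md_namespace" --overwrite \
    "cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size=$min_size" \
    "cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size=$max_size"
}

# Usage (names and sizes are placeholders):
#   set_autoscaler_bounds <machinedeployment name> <machinedeployment namespace> 1 5
```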
Enable cluster autoscale for your cluster
This step requires provider-level permission, so please ask your cloud provider to perform it.
Test cluster autoscale
Get the current number of nodes
kubectl get nodes
In this example, there is currently only one worker node.
Create the test-autoscale.yaml file
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
      topologySpreadConstraints: #Spreads pods across different nodes (ensures no node has more pods than others)
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: nginx
Apply the test-autoscale.yaml file to deploy 2 replicas of the nginx pod in the default namespace (this triggers the creation of a new worker node)
kubectl apply -f test-autoscale.yaml
Get the nginx pods and inspect the events of the Pending pod (replace nginx-589656b9b5-mcm5j with your own pod name)
kubectl get pods
kubectl describe pod nginx-589656b9b5-mcm5j | grep -A 10 Events
One nginx pod has a status of Pending, and its events show FailedScheduling and TriggeredScaleUp:
Warning FailedScheduling 2m53s default-scheduler 0/2 nodes are available: 1 node(s) didn't match pod topology spread constraints, 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }. preemption: 0/2 nodes are available: 1 No preemption victims found for incoming pod, 1 Preemption is not helpful for scheduling.
Normal TriggeredScaleUp 2m43s cluster-autoscaler pod triggered scale-up: [{MachineDeployment/demo-autoscale-tkg-ns/demo-autoscale-tkg-worker-node-pool-1 1->2 (max: 5)}]
Wait for the new node to be provisioned. You will then see that a new worker node has been added and the new nginx pod status is Running.
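Instead of polling by hand, the wait can be sketched with kubectl rollout status, which returns once both nginx replicas are available (and therefore once the new node has accepted the Pending pod). The function name wait_for_scale_up is hypothetical, and the 15m default is an assumption chosen to match maxNodeProvisionTime in the values file.

```shell
# Hypothetical helper: block until the nginx test deployment is fully
# rolled out, then list the nodes so the new worker node is visible.
wait_for_scale_up() {
  wait_timeout="${1:-15m}"
  kubectl rollout status deployment/nginx -n default --timeout="$wait_timeout" &&
    kubectl get nodes
}

# Usage:
#   wait_for_scale_up        # wait up to 15 minutes
#   wait_for_scale_up 5m     # wait up to 5 minutes
```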
Clean up test resource
kubectl delete -f test-autoscale.yaml
After the nginx test deployment is deleted, the cluster waits a few minutes before deleting the unneeded node (see the scaleDownUnneededTime value in the cluster-autoscaler-values.yaml file).
Delete cluster-autoscaler deployment (Optional)
If you no longer want your cluster to auto-scale, you can delete the cluster-autoscaler deployment using tanzu-cli
tanzu package installed delete cluster-autoscaler -n tkg-system -y
End.