# Kubernetes Autoscaling Guide

## Cluster Autoscaling

The Cluster Autoscaler allows us to scale cluster nodes when they become full.
I would recommend learning about scaling your cluster nodes before scaling pods.
Video: [Kubernetes cluster auto scaling](https://youtu.be/jM36M39MA3I)

## Horizontal Pod Autoscaling

The Horizontal Pod Autoscaler (HPA) allows us to scale pods when their resource utilisation goes over a threshold.
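To make the threshold behaviour concrete: the HPA control loop compares observed usage against the target and resizes proportionally. Here is a worked example of the documented scaling formula, using the 95% CPU target this guide sets later:

```
# HPA scaling formula (from the Kubernetes documentation):
#   desiredReplicas = ceil( currentReplicas * currentMetricValue / targetMetricValue )
#
# example: 2 replicas averaging 190% of their CPU request, target 95%:
#   ceil( 2 * 190 / 95 ) = ceil(4.0) = 4 replicas
```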
## Requirements

### A Cluster

* For both autoscaling guides, we'll need a cluster.
* For `Cluster Autoscaler`, you need a cloud-based cluster that supports the cluster autoscaler.
* For `HPA`, we'll use [kind](http://kind.sigs.k8s.io/).

### Cluster Autoscaling - Creating an AKS Cluster

```
# azure example
NAME=aks-getting-started
RESOURCEGROUP=aks-getting-started
SERVICE_PRINCIPAL=
SERVICE_PRINCIPAL_SECRET=

az aks create -n $NAME \
--resource-group $RESOURCEGROUP \
--location australiaeast \
--kubernetes-version 1.16.10 \
--nodepool-name default \
--node-count 1 \
--node-vm-size Standard_F4s_v2 \
--node-osdisk-size 250 \
--service-principal $SERVICE_PRINCIPAL \
--client-secret $SERVICE_PRINCIPAL_SECRET \
--output none \
--enable-cluster-autoscaler \
--min-count 1 \
--max-count 5
```

### Horizontal Pod Autoscaling - Creating a Kind Cluster

My node has 6 CPU cores for this demo.
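If you're not sure how many cores Docker (and therefore kind) will see on your machine, you can check first; this quick check is my addition, not part of the original guide:

```
# show how many CPUs Docker can use
docker info --format '{{.NCPU}}'
```

With that confirmed, create the cluster: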
```
kind create cluster --name hpa --image kindest/node:v1.18.4
```

### Metric Server

* For `Cluster Autoscaler` - On cloud-based clusters, Metric Server may already be installed.
* For `HPA` - We're using kind.

[Metric Server](https://github.com/kubernetes-sigs/metrics-server) provides container resource metrics for use in autoscaling pipelines.
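For reference, Metric Server serves these metrics through the `metrics.k8s.io` aggregated API. Once it's deployed (below), you can query that API directly as a sanity check:

```
# raw queries against the resource metrics API (only works after Metric Server is running)
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods
```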
Because I run K8s `1.18` in `kind`, the Metric Server version I need is `0.3.7`.
We will need to deploy Metric Server [0.3.7](https://github.com/kubernetes-sigs/metrics-server/releases/tag/v0.3.7).
I used `components.yaml` from the release page link above.
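If you'd rather fetch the manifest yourself instead of using the copy in this repo, the release asset can be downloaded directly (standard GitHub release-asset URL); note you'll still need the kind-specific TLS tweaks described next before applying it:

```
# download the v0.3.7 manifest from the GitHub release page
curl -LO https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.7/components.yaml
```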
Important note: for demo clusters (like `kind`), you will need to disable TLS.
You can disable TLS by adding the following to the metrics-server container args.
For production, make sure you remove the following:
```
- --kubelet-insecure-tls
- --kubelet-preferred-address-types="InternalIP"
```

Deployment:
```
cd kubernetes\autoscaling
kubectl -n kube-system apply -f .\components\metric-server\metricserver-0.3.7.yaml

# test
kubectl -n kube-system get pods

# note: wait for metrics to populate!
kubectl top nodes
```

## Example Application

For all autoscaling guides, we'll need a simple app that generates some CPU load.

* Build the app
* Push it to a registry
* Ensure resource requirements are set
* Deploy it to Kubernetes
* Ensure metrics are visible for the app

```
# build
cd kubernetes\autoscaling\components\application
docker build . -t aimvector/application-cpu:v1.0.0

# push
docker push aimvector/application-cpu:v1.0.0

# resource requirements
resources:
  requests:
    memory: "50Mi"
    cpu: "500m"
  limits:
    memory: "500Mi"
    cpu: "2000m"

# deploy
kubectl apply -f deployment.yaml

# metrics
kubectl top pods
```

## Cluster Autoscaler

For cluster autoscaling, you should be able to scale the pods manually and watch the cluster scale.
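A quick way to exercise this (a sketch, assuming the `application-cpu` deployment from above is running on a cloud cluster with the autoscaler enabled; the replica count is just an example large enough to outgrow one node):

```
# scale well past what the current nodes can hold
kubectl scale deploy/application-cpu --replicas 20

# watch pods go Pending, then new nodes join
kubectl get pods
kubectl get nodes -w

# the autoscaler records its decisions as events (event reasons may vary by version)
kubectl get events | grep -i TriggeredScaleUp
```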
The cluster autoscaling guide stops here.
For Pod Autoscaling (HPA), continue below.
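One note before we start: the `kubectl autoscale` command used below is the imperative shorthand for creating an HPA object. For reference only (don't apply both), a minimal declarative equivalent looks roughly like this; the file name is hypothetical:

```
# for reference: declarative equivalent of the "kubectl autoscale" step below
cat <<'EOF' > application-cpu-hpa.yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: application-cpu
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: application-cpu
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 95
EOF
```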
## Generate some traffic

Let's deploy a simple traffic generator pod:

```
cd kubernetes\autoscaling\components\application
kubectl apply -f .\traffic-generator.yaml

# get a terminal to the traffic-generator
kubectl exec -it traffic-generator -- sh

# install wrk
apk add --no-cache wrk

# simulate some load
wrk -c 5 -t 5 -d 99999 -H "Connection: Close" http://application-cpu

# you can scale the pods manually and see that roughly 6-7 pods will satisfy resource requests
kubectl scale deploy/application-cpu --replicas 2
```

## Deploy an autoscaler

```
# scale the deployment back down to 2
kubectl scale deploy/application-cpu --replicas 2

# deploy the autoscaler
kubectl autoscale deploy/application-cpu --cpu-percent=95 --min=1 --max=10

# pods should scale to roughly 6-7 to match the criteria of 95% of resource requests
kubectl get pods
kubectl top pods
kubectl get hpa/application-cpu -owide
kubectl describe hpa/application-cpu
```

## Vertical Pod Autoscaling

The Vertical Pod Autoscaler allows us to automatically set request values on our pods based on recommendations.
This helps us tune the request values based on actual CPU and memory usage.
More [here](./vertical-pod-autoscaling/readme.md)
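As a quick preview, a minimal VPA object in recommendation-only mode might look like this. This is a sketch assuming the VPA components from the linked guide are installed; it is not part of this repo's manifests:

```
# recommendation-only VPA for the demo deployment (updateMode "Off" means
# it only publishes recommendations and never evicts or resizes pods)
cat <<'EOF' | kubectl apply -f -
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: application-cpu
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: application-cpu
  updatePolicy:
    updateMode: "Off"
EOF

# view the recommendations once they populate
kubectl describe vpa application-cpu
```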