Kubernetes Autoscaling Guide
Cluster Autoscaling
Cluster autoscaler allows us to scale cluster nodes when they become full
Horizontal Pod Autoscaling
HPA allows us to scale pods when their resource utilisation goes over a threshold
Requirements
A Cluster
- For both autoscaling guides, we'll need a cluster.
- For
Cluster Autoscaler
You need a cloud based cluster that supports the cluster autoscaler - For
HPA
We'll use kind
Cluster Autoscaling - Creating an AKS Cluster
# azure example
NAME=aks-getting-started
RESOURCEGROUP=aks-getting-started
SERVICE_PRINCIPAL=
SERVICE_PRINCIPAL_SECRET=
az aks create -n $NAME \
--resource-group $RESOURCEGROUP \
--location australiaeast \
--kubernetes-version 1.16.10 \
--nodepool-name default \
--node-count 1 \
--node-vm-size Standard_F4s_v2 \
--node-osdisk-size 250 \
--service-principal $SERVICE_PRINCIPAL \
--client-secret $SERVICE_PRINCIPAL_SECRET \
--output none \
--enable-cluster-autoscaler \
--min-count 1 \
--max-count 5
Horizontal Pod Autocaling - Creating a Kind Cluster
My Node has 6 CPU cores for this demo
kind create cluster --name hpa --image kindest/node:v1.18.4
Metric Server
- For
Cluster Autoscaler
- On cloud-based clusters, Metric server may already be installed. - For
HPA
- We're using kind
Metric Server provides container resource metrics for use in autoscaling pipelines
Because I run K8s 1.18
in kind
, the Metric Server version i need is 0.3.7
We will need to deploy Metric Server 0.3.7
I used components.yaml
from the release page link above.
Important Note : For Demo clusters (like kind
), you will need to disable TLS
You can disable TLS by adding the following to the metrics-server container args
For production, make sure you remove the following :
- --kubelet-insecure-tls
- --kubelet-preferred-address-types="InternalIP"
Deployment:
cd kubernetes\autoscaling
kubectl -n kube-system apply -f .\metric-server\metricserver-0.3.7.yaml
#test
kubectl -n kube-system get pods
#wait for metrics to populate
kubectl top nodes
Example Application
For all autoscaling guides, we'll need a simple app, that generates some CPU load
- Build the app
- Push it to a registry
- Ensure resource requirements are set
- Deploy it to Kubernetes
- Ensure metrics are visible for the app
# build
cd kubernetes\autoscaling\application-cpu
docker build . -t aimvector/application-cpu:v1.0.0
# push
docker push aimvector/application-cpu:v1.0.0
# resource requirements
resources:
requests:
memory: "50Mi"
cpu: "500m"
limits:
memory: "500Mi"
cpu: "2000m"
# deploy
kubectl apply -f deployment.yaml
# metrics
kubectl top pods
Generate some traffic
# get a terminal to the traffic-generator
kubectl exec -it traffic-generator sh
# install wrk
apk add --no-cache wrk curl
# simulate some load
wrk -c 5 -t 5 -d 99999 -H "Connection: Close" http://application-cpu
kubectl scale deploy/application-cpu --replicas 2