Vertical Pod Autoscaling

The Vertical Pod Autoscaler (VPA) provides recommendations for CPU and memory request values.

Understanding Resources

In this example, I'll be focusing on CPU for scaling.
We first need to understand the compute resources available to us:

  1. How many CPU cores do we have?
  2. How many cores does our application use?
  3. Observe our application's usage
  4. Use the VPA to recommend resource request values for our application

Create a cluster

My node has 6 CPU cores for this demo.

kind create cluster --name vpa --image kindest/node:v1.18.4
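
Once the cluster is up, you can confirm the core count the node reports. A minimal check, assuming the default kind node name (<cluster-name>-control-plane):

# show the CPU capacity of the kind node
kubectl get node vpa-control-plane -o jsonpath='{.status.capacity.cpu}'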

Deploy Metrics Server

Metrics Server provides container resource metrics for use in autoscaling pipelines.

We will need to deploy Metrics Server 0.3.7.
I used components.yaml from the 0.3.7 release page.

Note: for demo clusters (like kind), you will need to disable TLS.
You can disable TLS by adding the following args to the metrics-server container.

For production, make sure you remove these args:

- --kubelet-insecure-tls
- --kubelet-preferred-address-types="InternalIP"
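
For reference, a minimal sketch of where these args sit in the metrics-server container spec inside components.yaml (surrounding fields follow the 0.3.7 release defaults; adjust to match your copy of the file):

containers:
- name: metrics-server
  image: k8s.gcr.io/metrics-server/metrics-server:v0.3.7
  args:
  - --cert-dir=/tmp
  - --secure-port=4443
  # demo-only flags: skip kubelet cert verification and talk to kubelets by internal IP
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types="InternalIP"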

Deploy it:

cd kubernetes\autoscaling
kubectl -n kube-system apply -f .\metric-server\metricserver-0.3.7.yaml

# test that the metrics-server pod is running
kubectl -n kube-system get pods

#wait for metrics to populate
kubectl top nodes
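
If `kubectl top nodes` reports that metrics are not yet available, give it a minute or two; you can also check the logs for errors (deployment name per the 0.3.7 components.yaml):

# check metrics-server logs if metrics never appear
kubectl -n kube-system logs deploy/metrics-server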

Example App

We have an app that simulates CPU usage.

# build

cd kubernetes\autoscaling\application-cpu
docker build . -t aimvector/application-cpu:v1.0.0

# push
docker push aimvector/application-cpu:v1.0.0
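
Pushing makes the image pullable from Docker Hub. Alternatively, since this is a kind cluster, you can side-load the locally built image into the cluster nodes instead of pushing:

# load the local image directly into the kind cluster
kind load docker-image aimvector/application-cpu:v1.0.0 --name vpa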

# resource requirements (set in deployment.yaml)
resources:
  requests:
    memory: "50Mi"
    cpu: "500m"
  limits:
    memory: "500Mi"
    cpu: "2000m"
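
A quick capacity check: each pod requests 500m, so my 6-core node can schedule at most 6000m / 500m = 12 replicas before CPU requests are exhausted (fewer in practice, since system pods also reserve CPU).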

# deploy 
kubectl apply -f deployment.yaml

# metrics
kubectl top pods

Generate some CPU load

# Deploy a tester to run traffic from

cd kubernetes\autoscaling
kubectl apply -f .\autoscaler-vpa\tester.yaml
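
For context, the tester just needs to be a long-running container we can exec into. A minimal sketch of what tester.yaml can look like (the repo's actual manifest may differ; an alpine image matches the apk commands below):

apiVersion: v1
kind: Pod
metadata:
  name: tester
spec:
  containers:
  - name: tester
    image: alpine
    # keep the container alive so we can exec into it
    command: ["sleep", "999999999"]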

# get a terminal
kubectl exec -it tester -- sh
# install wrk
apk add --no-cache wrk curl

# simulate some load
wrk -c 5 -t 5 -d 99999 -H "Connection: Close" http://application-cpu
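
Here, -c 5 opens 5 concurrent connections, -t 5 uses 5 threads, and -d 99999 runs the test for 99999 seconds, effectively until you stop it with Ctrl+C. The Connection: Close header forces a new connection per request, which keeps the app doing extra work on every hit.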

# scale up and keep checking `kubectl top pods`
# every time we add a pod, the CPU load per pod should drop dramatically
# at roughly 8 pods, each pod should use around 400m

kubectl scale deploy/application-cpu --replicas 2
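
Finally, to get recommendations, a VPA object needs to target the deployment. A minimal sketch in recommendation-only mode (this assumes the VPA CRDs and controller are installed in the cluster; the repo's autoscaler-vpa folder contains the actual manifests):

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: application-cpu
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: application-cpu
  updatePolicy:
    updateMode: "Off"   # recommend only; don't evict or update pods

# once load has been running for a while, read the recommendations
kubectl describe vpa application-cpu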