Introduction to Kubernetes Probes

Create a kubernetes cluster

In this guide we we''ll need a Kubernetes cluster for testing. Let's create one using kind

cd kubernetes/probes
kind create cluster --name demo --image kindest/node:v1.28.0

Test the cluster:

kubectl get nodes
NAME                 STATUS   ROLES           AGE   VERSION
demo-control-plane   Ready    control-plane   59s   v1.28.0

Applications

Client app is used to act as a client that sends web requests :

kubectl apply -f client.yaml

The server app is the app that will receive web requests:

kubectl apply -f server.yaml

Test making web requests constantly:

while true; do curl http://server; sleep 1s; done

Bump the server version label up and apply to force a new deployment
Notice the client throws an error, so traffic is interupted, not good!

This is because our new pod during deployment is not ready to take traffic!

Readiness Probes

Let's add a readiness probe that tells Kubernetes when we are ready:

readinessProbe:
  httpGet:
    path: /
    port: 5000
  initialDelaySeconds: 3
  periodSeconds: 3
  failureThreshold: 3

Automatic failover with Readiness probes

Let's pretend our application starts hanging and not longer returns responses
This is common with some web servers and may need to be manually restarted

kubectl exec -it podname -- sh -c "rm /data.txt"

Now we will notice our client app starts getting errors.
Few things to notice:

Our readiness probe detected an issue and removed traffic from the faulty pod.
We should be running more than one application so we would be highly available

kubectl scale deploy server --replicas 2

Notice traffic comes back as its routed to the healthy pod

Fix our old pod: kubectl exec -it podname -- sh -c "echo 'ok' > /data.txt"

If we do this again with 2 pods, notice we still get an interuption but our app automaticall stabalises after some time
This is because readinessProbe has failureThreshold and some failure will be expected before recovery
Do not set this failureThreshold too low as you may remove traffic frequently. Tune accordingly!

Readiness probes help us automatically remove traffic when there are intermittent network issues

Liveness Probes

Liveness probe helps us when we cannot automatically recover.
Let's use the same mechanism to create a vaulty pod:

kubectl exec -it podname -- sh -c "rm /data.txt"

Our readiness probe has saved us from traffic issues.
But we want the pod to recover automatically, so let's create livenessProbe:

livenessProbe:
  httpGet:
    path: /
    port: 5000
  initialDelaySeconds: 3
  periodSeconds: 4
  failureThreshold: 8

Scale back up: kubectl scale deploy server --replicas 2 Create a vaulty pod: kubectl exec -it podname -- sh -c "rm /data.txt"

If we observe we will notice the readinessProbe saves our traffic, and livenessProbe will eventually replace the bad pod

Startup Probes

The startup probe is for slow starting applications
It's important to understand difference between start up and readiness probes.
In our examples here, readiness probe acts as a startup probe too, since our app is fairly slow starting!
This difference is explained in the video.