diff --git a/kubernetes/probes/README.md b/kubernetes/probes/README.md
new file mode 100644
index 0000000..ed8b380
--- /dev/null
+++ b/kubernetes/probes/README.md
@@ -0,0 +1,121 @@
+# Introduction to Kubernetes Probes
+
+
+## Create a Kubernetes cluster
+
+In this guide we'll need a Kubernetes cluster for testing. Let's create one using [kind](https://kind.sigs.k8s.io/).
+
+```
+cd kubernetes/probes
+kind create cluster --name demo --image kindest/node:v1.28.0
+```
+
+Test the cluster:
+```
+kubectl get nodes
+NAME                 STATUS   ROLES           AGE   VERSION
+demo-control-plane   Ready    control-plane   59s   v1.28.0
+```
+
+## Applications
+
+The client app is used to send web requests:
+
+```
+kubectl apply -f client.yaml
+```
+
+The server app is the one that will receive the web requests:
+
+```
+kubectl apply -f server.yaml
+```
+
+Exec into the client pod and make web requests constantly (on `alpine` you may need to install curl first with `apk add curl`):
+
+```
+while true; do curl http://server; sleep 1s; done
+```
+
+Bump the server `version` label up and apply to force a new deployment.
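+
+For example, you can edit `server.yaml`, change the pod template label `version: "1"` to `"2"`, and run `kubectl apply -f server.yaml` again. A one-liner sketch that does the same (the new label value is arbitrary):
+
+```
+kubectl patch deployment server -p '{"spec":{"template":{"metadata":{"labels":{"version":"2"}}}}}'
+```
+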
+Notice the client throws an error, so traffic is interrupted. Not good!
+
+This is because, during the deployment, our new pod is not yet ready to take traffic!
+
+## Readiness Probes
+
+Let's add a readiness probe that tells Kubernetes when we are ready:
+
+```
+readinessProbe:
+  httpGet:
+    path: /
+    port: 5000
+  initialDelaySeconds: 3
+  periodSeconds: 3
+  failureThreshold: 3
+```
+
+### Automatic failover with Readiness probes
+
+Let's pretend our application starts hanging and no longer returns responses.
+This is common with some web servers, which may then need a manual restart.
+Our server reads `/data.txt` on every request, so we can simulate the fault by deleting that file (replace `podname` with the name of your server pod):
+
+```
+kubectl exec -it podname -- sh -c "rm /data.txt"
+```
+
+Now we will notice our client app starts getting errors.
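+
+You can watch the readiness probe take effect: after `failureThreshold` consecutive failures the pod is marked NotReady and taken out of service. This is also a handy way to find the real pod name to substitute for `podname` (the name and output below are illustrative):
+
+```
+kubectl get pods -l app=server -w
+NAME                      READY   STATUS    RESTARTS   AGE
+server-xxxxxxxxxx-xxxxx   0/1     Running   0          2m
+```
+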
+A few things to notice:
+
+* Our readiness probe detected an issue and removed traffic from the faulty pod.
+* We should be running more than one replica of our app so that we stay highly available. Let's scale up:
+
+```
+kubectl scale deploy server --replicas 2
+```
+
+* Notice traffic comes back as it's routed to the healthy pod.
+
+Fix our old pod: `kubectl exec -it podname -- sh -c "echo 'ok' > /data.txt"`
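+
+Behind the scenes, readiness drives the Service's endpoints: a NotReady pod is removed from the list, and it reappears once it passes the probe again. You can watch that happen (the IP shown is illustrative):
+
+```
+kubectl get endpoints server
+NAME     ENDPOINTS         AGE
+server   10.244.0.5:5000   10m
+```
+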
+* If we do this again with 2 pods, notice we still get an interruption, but our app automatically stabilises after some time.
+* This is because the readinessProbe has a `failureThreshold`, so some failures are expected before traffic is removed.
+* Do not set this `failureThreshold` too low or you may remove traffic too frequently. Tune accordingly!
+
+Readiness probes help us automatically remove traffic when there are intermittent network issues.
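+
+If you want to see the probe failures themselves, they are recorded as events on the pod (a sketch; the exact wording varies by Kubernetes version, but an `Unhealthy` event with `Readiness probe failed` is typical):
+
+```
+kubectl describe pod podname
+...
+Events:
+  Warning  Unhealthy  ...  Readiness probe failed: HTTP probe failed with statuscode: 500
+```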
+
+## Liveness Probes
+
+A liveness probe helps us when an application cannot recover on its own.
+Let's use the same mechanism to create a faulty pod:
+
+```
+kubectl exec -it podname -- sh -c "rm /data.txt"
+```
+
+Our readiness probe has saved us from traffic issues.
+But we want the pod to recover automatically, so let's add a livenessProbe:
+
+```
+livenessProbe:
+  httpGet:
+    path: /
+    port: 5000
+  initialDelaySeconds: 3
+  periodSeconds: 4
+  failureThreshold: 8
+```
+
+Scale back up: `kubectl scale deploy server --replicas 2`
+Create a faulty pod: `kubectl exec -it podname -- sh -c "rm /data.txt"`
+ +## Startup Probes + +The [startup probe](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-startup-probes) is for slow starting applications
+It's important to understand difference between start up and readiness probes.
+In our examples here, readiness probe acts as a startup probe too, since our app is fairly slow starting!
+This difference is explained in the video.
\ No newline at end of file diff --git a/kubernetes/probes/client.yaml b/kubernetes/probes/client.yaml new file mode 100644 index 0000000..34c4f50 --- /dev/null +++ b/kubernetes/probes/client.yaml @@ -0,0 +1,22 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: client + labels: + app: client +spec: + selector: + matchLabels: + app: client + replicas: 1 + template: + metadata: + labels: + app: client + spec: + containers: + - name: client + image: alpine:latest + command: + - sleep + - "9999" diff --git a/kubernetes/probes/server.yaml b/kubernetes/probes/server.yaml new file mode 100644 index 0000000..23f4953 --- /dev/null +++ b/kubernetes/probes/server.yaml @@ -0,0 +1,83 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: server + labels: + app: server +spec: + selector: + matchLabels: + app: server + replicas: 1 + template: + metadata: + labels: + app: server + version: "1" + spec: + containers: + - name: server + image: python:alpine + workingDir: /app + command: ["/bin/sh"] + args: + - -c + - "pip3 install --disable-pip-version-check --root-user-action=ignore flask && echo 'ok' > /data.txt && flask run -h 0.0.0.0 -p 5000" + ports: + - containerPort: 5000 + volumeMounts: + - name: app + mountPath: "/app" + readinessProbe: + httpGet: + path: / + port: 5000 + initialDelaySeconds: 3 + periodSeconds: 3 + failureThreshold: 3 + livenessProbe: + httpGet: + path: / + port: 5000 + initialDelaySeconds: 3 + periodSeconds: 4 + failureThreshold: 8 + volumes: + - name: app + configMap: + name: server-code +--- +apiVersion: v1 +kind: Service +metadata: + name: server + labels: + app: server +spec: + type: ClusterIP + selector: + app: server + ports: + - protocol: TCP + name: http + port: 80 + targetPort: 5000 +--- +apiVersion: v1 +kind: ConfigMap +metadata: + name: server-code +data: + app.py: | + import time + import logging + import os.path + + logging.basicConfig(level=logging.DEBUG) + + from flask import Flask + app = Flask(__name__) + @app.route("/") + def hello(): + with open('/data.txt') as data: + return data.read()