Running PostgreSQL in Kubernetes (Basic)

In chapters one, two and three we've managed to stand up a Primary and Stand-By PostgreSQL instances using containers.

We've learnt the fundamentals of how to persist and store data, how to configure instances and how to setup streaming replication from a Primary container to a Stand-by container.

The challenges

We have encountered a few challenges along the way, but running PostgreSQL in a container is pretty similar to just running it on a server outside of a container.

Kubernetes will add a bunch more complexity which we'll cover in this chapter.
A few points to note:

If you are not familiar with running PostreSQL in a container, this chapter is not for you. Please go back to Chapter 1
If you are not familiar with configuration of PostreSQL, do not attempt to run it in Kubernetes. Please go back to Chapter 2
If you are not familiar with Streaming Replication, Do not attempt to run PostreSQL in Kubernetes. Please go back to Chapter 3
If you are not familiar with StatefulSets, Do not attempt to run PostreSQL in Kubernetes
We will not be using Popular PostgreSQL controllers\operators or Helm charts in this guide. Operators and controllers simply automate things, and those open source tooling assumes you understand all the above mentioned tech.

One caveat to think of before running PostgreSQL in Kubernetes, or any database for that matter, is how would you handle cluster upgrades?
Most cloud providers uprade by rolling new nodes and deleting old nodes, meaning your primary server may be deleted and start on a new node without any data.
If you don't have a strategy here, you will lose your data.

If something goes wrong and you're using operators or controllers and don't have a background in how PostgreSQL works, you will lose data.

And finally - The work in this guide has not been tested for Production workloads and written purely for educational purposes.

Create a Kubernetes cluster

In this chapter, we will start by creating a test Kubernetes cluster using kind

kind create cluster --name postgresql --image kindest/node:v1.28.0

kubectl get nodes
NAME                       STATUS   ROLES                  AGE   VERSION
postgresql-control-plane   Ready    control-plane,master   31s   v1.28.0

Setting up our PostgreSQL environment

Deploy a namespace to hold our resources:

kubectl create ns postgresql

In Chapter 3, we defined a few environment variables in our docker run command.
Some of those values are sensitive, so in Kubernetes we'll place sensitive values in a Kubernetes secret.

Create our secret for our first PostgreSQL instance:

kubectl -n postgresql create secret generic postgresql `
  --from-literal POSTGRES_USER="postgresadmin" `
  --from-literal POSTGRES_PASSWORD='admin123' `
  --from-literal POSTGRES_DB="postgresdb" `
  --from-literal REPLICATION_USER="replicationuser" `
  --from-literal REPLICATION_PASSWORD='replicationPassword'

Deploy our first PostgreSQL instance

Statefulsets

As we know we're going to need state and persist data, we'll go create a StatefulSet

I've taken a copy of the Statefulset from the Kubernetes site and created a statefulset.yaml for reference.

In the video, we'll replace some of these values so our PostgreSQL will fit.

We replace the name of nginx with postgres
Tweak the k8s service
Tweak the statefulset.yaml
Add environment variables and secret mappings.
Add our configurations in a Configmap

We will take a look at Replication in the following chapter, so our replication user will not exist in the database just yet.

Deploy our PostgreSQL instance:

kubectl -n postgresql apply -f storage/databases/postgresql/4-k8s-basic/yaml/statefulset.yaml

Check our installation

kubectl -n postgresql get pods

# check the database logs
kubectl -n postgresql logs postgres-0

You will notice archive errors in the logs, because the archive directory does not exist in our volume.
We will address this soon.

Let's check our instance further:

kubectl -n postgresql exec -it postgres-0 -- bash

# login to postgres
psql --username=postgresadmin postgresdb

# see our replication user created
\du

#create a table
CREATE TABLE customers (firstname text, customer_id serial, date_created timestamp);

#show the table
\dt

# quit out of postgresql
\q

# check the data directory
ls -l /data/pgdata

# check the archive (does not exist!)
ls -l /data/archive

Init containers

Init containers play a big role in fulfilling specific needs in IT workloads

Init containers run before other containers in our pods. It can greatly assist when we need to do manual tasks. Like creating users, setting up tables, etc.
Init containers can help us initialise things, like creating this /data/archive directory

Now it may seem overkill for simply creating a directory, however this init container will play a big role in our next chapter on replication.

We can use init containers to setup our postgres as a primary, or a standby server. Stay tuned!

Let's create our init container:

initContainers:
- name: init
  image: postgres:15.0
  command: [ "bash", "-c" ]
  args:
  - |
    #create archive directory
    mkdir -p /data/archive && chown -R 999:999 /data/archive

This init container also needs to share the volume of the database container:

volumeMounts:
- name: data
  mountPath: /data
  readOnly: false

And redeploy!

kubectl -n postgresql apply -f storage/databases/postgresql/4-k8s-basic/yaml/statefulset.yaml

# check our install

kubectl -n postgresql get pods
kubectl -n postgresql logs postgres-0
kubectl -n postgresql exec -it postgres-0 -- bash
ls /data
ls /data/archive/

# check if our table was persisted!
psql --username=postgresadmin postgresdb
\dt
\q

That's it for chapter four!
Now we've successfully managed to lift our PostgreSQL container and deploy it to Kubernetes.
In the next chapter we'll take what we've learnt here and combine our previous studies to setup a Primary and Stand-By instance of PostgreSQL in Kubernetes