How to configure PostgreSQL
This is part 2 of our PostgreSQL series.
In this chapter, we cover the fundamentals of PostgreSQL configuration.
Many people make the mistake of relying directly on Kubernetes PostgreSQL controllers
and Helm charts without any understanding of databases.
Let's start where we left off, and review our simple PostgreSQL database:
Run a simple PostgreSQL database (docker)
cd storage/databases/postgres/2-configuration
docker run -it --rm --name postgres `
-e POSTGRES_PASSWORD=admin123 `
-v ${PWD}/pgdata:/var/lib/postgresql/data `
-p 5000:5432 `
postgres:15.0
Environment Variables
Many settings can be specified using environment variables.
I generally recommend not relying on default values, and instead setting as many options explicitly as is practical.
I personally prefer to keep most or all settings in a configuration file, so that it can be committed to source control.
Secrets are the exception, and this is where environment variables are great: we can inject secrets through them
and keep passwords out of our configuration files and out of source control.
This will be important in Kubernetes later on.
We will not learn all or even most of the configurations in this chapter, as PostgreSQL has a lot of depth. So we will only learn what we need, one step at a time.
Let's start with a few basic settings:
Environment Variable | Meaning |
---|---|
POSTGRES_USER | Username for the Postgres Admin |
POSTGRES_PASSWORD | Password for the Postgres Admin |
POSTGRES_DB | Default database for your Postgres Server |
PGDATA | Path where data is stored |
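One way to keep the password out of both the command line and source control is an env file passed to `docker run --env-file`. The sketch below is only an illustration: the file name `postgres.env`, the function names, and the values are assumptions, and the file should be listed in `.gitignore`. It uses bash line continuations (`\`) rather than the PowerShell backticks used elsewhere in this guide.

```shell
#!/usr/bin/env bash
# Sketch: keep Postgres credentials in an env file that is NOT committed.
# File name, function names and values are illustrative assumptions.

write_env_file() {
  # $1 = path of the env file to create
  cat > "$1" <<'EOF'
POSTGRES_USER=postgresadmin
POSTGRES_PASSWORD=admin123
POSTGRES_DB=postgresdb
EOF
  chmod 600 "$1"   # readable and writable by the owner only
}

run_postgres() {
  # $1 = path of the env file; docker reads KEY=VALUE pairs from it
  docker run -it --rm --name postgres \
    --env-file "$1" \
    -v "${PWD}/pgdata:/var/lib/postgresql/data" \
    -p 5000:5432 \
    postgres:15.0
}

# Usage:
#   write_env_file postgres.env
#   echo "postgres.env" >> .gitignore
#   run_postgres postgres.env
```

Keep in mind that the official image only applies these variables when the data directory is empty, so they have no effect on a `pgdata` folder that was already initialized.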
Configuration files
If we take a look at the docker mount that we defined in our docker run command:
-v ${PWD}/pgdata:/var/lib/postgresql/data
The ${PWD}/pgdata folder that we have mounted contains not only data, but also some default configuration files that we can explore.
Three files are important here:
Configuration file | Meaning | Documentation |
---|---|---|
pg_hba.conf | Host Based Authentication file | Official Documentation |
pg_ident.conf | User Mappings file | Official Documentation |
postgresql.conf | PostgreSQL main configuration |
The pg_hba.conf File
We'll start this guide with the host based authentication file.
This file is created automatically in the data directory, as we saw above.
We should create a copy of this file and configure it ourselves.
It controls who can access our PostgreSQL server.
Let's refer to the official documentation as well as walk through the config.
The config file itself has a great description of the contents.
As mentioned in the previous chapter, it's always good not to rely on default configurations, so let's create our own pg_hba.conf file.
We can grab the content from the default configuration and edit it as we go.
# TYPE DATABASE USER ADDRESS METHOD
# "local" is for Unix domain socket connections only
local all all trust
# IPv4 local connections:
host all all 127.0.0.1/32 trust
# IPv6 local connections:
host all all ::1/128 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all trust
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
host all all all scram-sha-256
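To check how PostgreSQL actually parsed these rules, the `pg_hba_file_rules` system view can be queried from inside the container. This is a minimal sketch, assuming the container is named `postgres` as in the `docker run` command above and that the superuser is the image default `postgres` (pass another name if `POSTGRES_USER` was set). The function name is hypothetical.

```shell
show_hba_rules() {
  # $1 = database superuser (defaults to the image's default user "postgres")
  local db_user="${1:-postgres}"
  # The "postgres" database is always created by initdb, so connect to it.
  docker exec postgres psql -U "$db_user" -d postgres -c \
    "SELECT line_number, type, database, user_name, address, auth_method, error
       FROM pg_hba_file_rules;"
}

# Usage, while the container is running:
#   show_hba_rules
#   show_hba_rules postgresadmin
```

Rows come back in file order, which matters because PostgreSQL uses the first rule that matches a connection. A non-empty `error` column points at a line that could not be parsed.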
The pg_ident.conf File
This config file is a mapping file between system users and database users.
Let's refer to the official documentation and walk through the config.
This is not a feature that we will need in this series, so we will skip this config for the time being.
The postgresql.conf File
This configuration file is the main one for PostgreSQL.
As you can see, this is a large file with in-depth tuning and customization capabilities.
File Locations
Let's set our data directory location as well as our config file locations.
Our volume mount path in the container is also short and simple.
Note that we also split config from data, so they have separate paths:
data_directory = '/data'
hba_file = '/config/pg_hba.conf'
ident_file = '/config/pg_ident.conf'
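Because these settings expect the files under a separate config path, the defaults generated in the data directory have to be copied out first. Here is a rough sketch, assuming the `pgdata` folder from the first `docker run` and a new `config` folder next to it. The function name is hypothetical, and on Linux reading `pgdata` may need `sudo` because the directory is owned by the container's postgres user.

```shell
copy_default_configs() {
  # $1 = initialized data directory, $2 = target config directory
  local src="$1" dst="$2" f
  mkdir -p "$dst"
  for f in postgresql.conf pg_hba.conf pg_ident.conf; do
    # Skip files that already exist so customized configs are never overwritten
    [ -e "$dst/$f" ] || cp "$src/$f" "$dst/$f"
  done
}

# Usage, from the chapter directory:
#   copy_default_configs ./pgdata ./config
```

Running it a second time is harmless, since existing files in `config` are left untouched.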
Connection and Authentication
The shared_buffers parameter determines how much memory is dedicated to the server for caching data. A common starting point is 15% to 25% of the machine's total RAM. For example, at 25%, a machine with 32 GB of RAM would use 8 GB for shared_buffers. The configuration below keeps the default of 128MB.
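The percentage guideline is simple arithmetic, sketched below. The function names are hypothetical, and `total_ram_mb` assumes a Linux host where `/proc/meminfo` is available.

```shell
shared_buffers_mb() {
  # $1 = total RAM in MB, $2 = percentage of RAM to dedicate (e.g. 25)
  echo $(( $1 * $2 / 100 ))
}

total_ram_mb() {
  # Linux only: MemTotal in /proc/meminfo is reported in kB
  awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}

# Usage on a Linux host:
#   echo "shared_buffers = $(shared_buffers_mb "$(total_ram_mb)" 25)MB"
```

For a 32 GB machine (32768 MB), `shared_buffers_mb 32768 25` gives 8192, which matches the 8 GB figure above. Treat the result as a starting value to tune, not a final answer.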
We will take a look at WAL (Write Ahead Log), archiving, primary, and standby configurations in a future chapter on replication.
port = 5432
listen_addresses = '*'
max_connections = 100
shared_buffers = 128MB
dynamic_shared_memory_type = posix
max_wal_size = 1GB
min_wal_size = 80MB
log_timezone = 'Etc/UTC'
datestyle = 'iso, mdy'
timezone = 'Etc/UTC'
#locale settings
lc_messages = 'en_US.utf8' # locale for system error message
lc_monetary = 'en_US.utf8' # locale for monetary formatting
lc_numeric = 'en_US.utf8' # locale for number formatting
lc_time = 'en_US.utf8' # locale for time formatting
default_text_search_config = 'pg_catalog.english'
We can also include configuration from other locations with the include_dir and include options.
We will skip these for the sake of keeping things simple.
Nested configurations can overcomplicate a setup and make it hard to troubleshoot when issues occur.
Specifying Custom Configuration
If we run on Linux, we need to ensure that the postgres user, which has a user ID of 999 by default, has access to the configuration files.
sudo chown 999:999 config/postgresql.conf
sudo chown 999:999 config/pg_hba.conf
sudo chown 999:999 config/pg_ident.conf
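Rather than assuming the ID, it can be read from the image itself. This is a small sketch, assuming the same `postgres:15.0` image. The function name is hypothetical.

```shell
postgres_uid_gid() {
  # $1 = image tag; prints "<uid>:<gid>" of the postgres user inside the image
  local image="${1:-postgres:15.0}"
  docker run --rm "$image" sh -c 'echo "$(id -u postgres):$(id -g postgres)"'
}

# Usage:
#   owner="$(postgres_uid_gid)"
#   sudo chown "$owner" config/postgresql.conf config/pg_hba.conf config/pg_ident.conf
```

This keeps the `chown` commands correct if a different image or tag uses another ID for the postgres user.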
There is another important gotcha here.
The PGDATA variable tells PostgreSQL where our data directory is.
Similarly, we've learnt that our configuration file also has data_directory, which tells PostgreSQL the same.
However, the latter is only read by PostgreSQL after initialization has occurred.
PostgreSQL's initialization phase sets up directory permissions on the data directory.
If we leave out PGDATA, we will get errors that the data directory is invalid.
Hence PGDATA is important here, and it should point to the same path as data_directory.
Running our PostgreSQL
Finally, we can run our database with our custom configuration files:
docker run -it --rm --name postgres `
-e POSTGRES_USER=postgresadmin `
-e POSTGRES_PASSWORD=admin123 `
-e POSTGRES_DB=postgresdb `
-e PGDATA="/data" `
-v ${PWD}/pgdata:/data `
-v ${PWD}/config:/config `
-p 5000:5432 `
postgres:15.0 -c 'config_file=/config/postgresql.conf'
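To confirm that the custom files were actually picked up, the effective paths can be queried from a second terminal while the container is running. This is a minimal sketch using the user and database from the command above. The function name is hypothetical.

```shell
show_effective_paths() {
  # -A: unaligned output, -t: rows only; each -c runs one statement
  docker exec postgres psql -U postgresadmin -d postgresdb -At \
    -c "SHOW config_file;" \
    -c "SHOW hba_file;" \
    -c "SHOW ident_file;" \
    -c "SHOW data_directory;"
}

# Usage, in a second terminal:
#   show_effective_paths
```

If everything is wired up as described in this chapter, the output should list the three files under /config followed by /data. Any path under /var/lib/postgresql/data suggests that a default is still in use.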
That's it for chapter two!
In chapter 3, we will take a look at Replication and how to replicate our data to another PostgreSQL instance for better availability.