# How to configure PostgreSQL This is part 2 of our PostgreSQL series.
In this chapter, we learn about fundamentals of the Postgres configuration.
Many people make the mistakes of relying directly on Kubernetes PostgreSQL controllers and Helm charts without having any understanding of Databases.
Let's start where we left off, and review our simple PostgreSQL database: ## Run a simple PostgreSQL database (docker) ``` cd storage/databases/postgresql/2-configuration docker run -it --rm --name postgres ` -e POSTGRES_PASSWORD=admin123 ` -v ${PWD}/pgdata:/var/lib/postgresql/data ` -p 5000:5432 ` postgres:15.0 ``` ## Environment Variables Many settings can be specified using environment variables.
I generally recommend not relying on default values and set most of the settings possible.
I personally prefer most or all settings in a configuration file, so it can be committed to source control.
This is where Environment variables are great because we can inject secrets there and keep passwords out of our configuration files and out of source control.
This will be important in Kubernetes later on.
We will not learn all or even most of the configurations in this chapter, as PostgreSQL has a lot of depth. So we will only learn what we need, one step at a time.
Let's take a look at some basic configurations [here](https://hub.docker.com/_/postgres) Let's set a few things here: | Environment Variable | Meaning | |----------------------|---------| | POSTGRES_USER | Username for the Postgres Admin | | POSTGRES_PASSWORD | Password for the Postgres Admin | | POSTGRES_DB | Default database for your Postgres Server | | PGDATA | Path where data is stored | ## Configuration files If we take a look at our `docker` mount that we defined in our `docker run` command:
`-v ${PWD}/pgdata:/var/lib/postgresql/data `
The `{PWD}/pgdata` folder that we have mounted contains not only data, but some default configuration files that we can explore.
Three files are important here: |Configuration file | Meaning | Documentation |----------------------|---------|-------| | pg_hba.conf | Host Based Authentication file | [Official Documentation](https://www.postgresql.org/docs/current/auth-pg-hba-conf.html) | | pg_ident.conf | User Mappings file | [Official Documentation](https://www.postgresql.org/docs/current/auth-username-maps.html) | postgresql.conf | PostgreSQL main configuraiton | ## The pg_hba.conf File We'll start this guide with the host based authentication file.
This file is automatically created in the data directory as we see.
We should create a copy of this file and configure it ourselves.
It controls who can access our PostgreSQL server.
Let's refer to the official documentation as well as walk through the config.
The config file itself has a great description of the contents.
As mentioned in the previous chapter, it's always good not to rely on default configurations. So let's create our own `pg_hba.conf` file.
We can grab the content from the default configuration and we may edit it as we go. ``` # TYPE DATABASE USER ADDRESS METHOD # "local" is for Unix domain socket connections only local all all trust # IPv4 local connections: host all all 127.0.0.1/32 trust # IPv6 local connections: host all all ::1/128 trust # Allow replication connections from localhost, by a user with the # replication privilege. local replication all trust host replication all 127.0.0.1/32 trust host replication all ::1/128 trust host all all all scram-sha-256 ``` ## The pg_ident.conf File This config file is a mapping file between system users and database users.
Let's refer to the official documentation and walk through the config.
This is not a feature that we will need in this series, so we will skip this config for the time being.
## The postgresql.conf File This configuration file is the main one for PostgreSQL.
As you can see this is a large file with in-depth tuning and customization capability.
### File Locations Let's set our data directory locations as well as config file locations
Our volume mount path in the container is also short and simple.
Note that we also split config from data so we have separate paths : ``` data_directory = '/data' hba_file = '/config/pg_hba.conf' ident_file = '/config/pg_ident.conf' ``` ### Connection and Authentication The shared_buffers parameter determines how much memory is dedicated to the server for caching data. The value should be set to 15% to 25% of the machine's total RAM. For example: if your machine's RAM size is 32 GB, then the recommended value for shared_buffers is 8 GB
We will take a look at `WAL` (Write Ahead Log), Archiving, Primary, and Standby configurations in a future chapter on replication
``` port = 5432 listen_addresses = '*' max_connections = 100 shared_buffers = 128MB dynamic_shared_memory_type = posix max_wal_size = 1GB min_wal_size = 80MB log_timezone = 'Etc/UTC' datestyle = 'iso, mdy' timezone = 'Etc/UTC' #locale settings lc_messages = 'en_US.utf8' # locale for system error message lc_monetary = 'en_US.utf8' # locale for monetary formatting lc_numeric = 'en_US.utf8' # locale for number formatting lc_time = 'en_US.utf8' # locale for time formatting default_text_search_config = 'pg_catalog.english' ``` We can also include other configurations from other locations with the `include_dir` and `include` options.
We will skip these for the sake of keeping things simple.
Nested configurations can over complicate a setup and makes it hard to troubleshoot when issues occur.
### Specifying Custom Configuration If we run on Linux, we need to ensure that the `postgres` user which has a user ID of `999` by default, should have access to the configuration files.
``` sudo chown 999:999 config/postgresql.conf sudo chown 999:999 config/pg_hba.conf sudo chown 999:999 config/pg_ident.conf ``` There is another important gotcha here.
The `PGDATA` variable tells PostgreSQL where our data directory is.
Similarly, we've learnt that our configuration file also has `data_directory` which tells PostgreSQL the same.
However, the latter is only read by PostgreSQL after initialization has occurred.
PostgreSQL's initialization phase sets up directory permissions on the data directory.
If we leave out `PGDATA`, then we will get errors that the data directory is invalid.
Hence `PGDATA` is important here.
## Running our PostgreSQL Finally, we can run our database with our custom configuration files: ``` docker run -it --rm --name postgres ` -e POSTGRES_USER=postgresadmin ` -e POSTGRES_PASSWORD=admin123 ` -e POSTGRES_DB=postgresdb ` -e PGDATA="/data" ` -v ${PWD}/pgdata:/data ` -v ${PWD}/config:/config ` -p 5000:5432 ` postgres:15.0 -c 'config_file=/config/postgresql.conf' ``` That's it for chapter two!
In [chapter 3](../3-replication/README.md), we will take a look at Replication and how to replicate our data to another PostgreSQL instance for better availability.