Deploy Pgpool-II on Kubernetes to Achieve Query Load Balancing and Monitoring

In my previous blog post, I have introduced how to monitor Pgpool-II and PostgreSQL cluster using Pgpool-II Exporter. As I mentioned in the previous post, Pgpool-II Exporter exposes metrics that Prometheus can collect. Prometheus is one of the most popular monitoring tools used on Kubernetes. In this post, I would like to introduce how to deploy Pgpool-II and Pgpool-II Exporter on Kubernetes.

Architecture

A PostgreSQL operator is required to manage PostgreSQL clusters on Kubernetes. Therefore, you need to combine Pgpool-II with a PostgreSQL operator. Below is the architecture:

Prerequisites

Before you start deploying Pgpool-II, please check the following prerequisites:

Make sure you have a Kubernetes cluster, and kubectl is installed.
PostgreSQL Operator and a PostgreSQL cluster are installed. In the blog, I used the PostgreSQL Operator Crunchy PostgreSQL Operator to create a PostgreSQL cluster.

Deploy Pgpool-II and Pgpool-II Exporter

We deploy the Pgpool-II pod that contains a Pgpool-II container and a Pgpool-II Exporter container. You need to specify the Docker images (see more information, Pgpool-II Dockerfile and Pgpool-II Exporter Dockerfile).

apiVersion: apps/v1
kind: Deployment
metadata:
name: pgpool
spec:
replicas: 1
selector:
    matchLabels:
      app: pgpool
template:
    metadata:
      labels:
        app: pgpool
    spec:
      containers:
      - name: pgpool
        image: pgpool/pgpool
      ...
      - name: pgpool-stats
        image: pgpool/pgpool2_exporter
      ...

Configure Pgpool-II

Pgpool-II's health check, automatic failover, Watchdog and online recovery features aren't required on Kubernetes. You need to only enable load balancing and connection pooling.

Here, we configure Pgpool-II using environment variables. If you are using a production environment, we recommend using a ConfigMap to configure Pgpool-II's config files, i.e. pgpool.conf, pcp.conf, pool_passwd and pool_hba.conf. You can find more information in the documentation.

Environment variables starting with PGPOOL_PARAMS_ can be converted to Pgpool-II's configuration parameters and these values can override the default configurations. You can download the pgpool_deploy.yaml and specify environment variables in this manifest file.

The Pgpool-II container Docker images is build with streaming replication mode. By default, load balancing, connection pooling and streaming replication check is enabled.
Specify only two backend nodes. Specify the Primary Service name to backend_hostname0. Specify the Replica Service name to backend_hostname1.
Because failover is managed by Kubernetes, specify DISALLOW_TO_FAILOVER flag to backend_flag for both of the two nodes and ALWAYS_PRIMARY flag to backend_flag0.

env:
- name: PGPOOL_PARAMS_BACKEND_HOSTNAME0
value: "hippo"
- name: PGPOOL_PARAMS_BACKEND_HOSTNAME1
value: "hippo-replica"
- name: PGPOOL_PARAMS_BACKEND_FLAG0
value: "ALWAYS_PRIMARY|DISALLOW_TO_FAILOVER"
- name: PGPOOL_PARAMS_BACKEND_FLAG1
value: "DISALLOW_TO_FAILOVER"

You need to specify sr_check_user and sr_check_password to perform streaming replication check. For the security reasons, we recommend that you specify either an encrypted password or a value that refers to the Secret's value. For example, if you are using Crunchy PostgreSQL Operator, by default the postgres user password is stored in hippo-postgres-secret. You can define the environment variables PGPOOL_PARAMS_SR_CHECK_USER and PGPOOL_PARAMS_SR_CHECK_PASSWORD:

- name: PGPOOL_PARAMS_SR_CHECK_USER
valueFrom:
    secretKeyRef:
      name: hippo-postgres-secret
      key: username
- name: PGPOOL_PARAMS_SR_CHECK_PASSWORD
valueFrom:
    secretKeyRef:
      name: hippo-postgres-secret
      key: password

Configure Pgpool-II Exporter

To configure Pgpool-II Exporter container, you need to specify the environment variables POSTGRES_USER, POSTGRES_PASSWORD and PGPOOL_SERVICE to define DATA_SOURCE_NAME.

- name: pgpool-stats
image: pgpool/pgpool2_exporter
env:
- name: POSTGRES_USER
    valueFrom:
      secretKeyRef:
        name: hippo-postgres-secret
        key: username
- name: POSTGRES_PASSWORD
    valueFrom:
      secretKeyRef:
        name: hippo-postgres-secret
        key: password
- name: PGPOOL_SERVICE
    value: "localhost"

The Pgpool-II and Pgpool-II Exporter should work with the minimal configuration above. Once you have defined the required environment variables in Deployment manifest pgpool_deploy.yaml, you can deploy Pgpool-II pod.

$ kubectl apply -f pgpool_deploy.yaml

After deploying Pgpool-II, you can see the Pgpool-II pod and services using kubectl get pod and kubectl get svc command.

$ kubectl get pod -n pgo -l app=pgpool
NAME READY STATUS RESTARTS AGE
pgpool-9694c4548-7dsqj 2/2 Running 0 37m

$ kubectl get svc -n pgo -l 'app in (pgpool, pgpool-stats)'
NAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
pgpool         ClusterIP   10.100.83.171   <none>        9999/TCP   52m
pgpool-stats   ClusterIP   10.100.44.229   <none>        9719/TCP   52m

Try queries load balancing

Let's connect to Pgpool-II and try query load balancing.

$ kubectl -n pgo port-forward svc/pgpool 9999:9999 &
$ psql -h 127.0.0.1 -p 9999 -U postgres -c "show pool_nodes"
node_id |   hostname    | port | status | lb_weight | role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state |
--------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+
0       | hippo         | 5432 | up     | 0.500000 | primary | 0          | false             | 0                 |                   |                        |
1       | hippo-replica | 5432 | up     | 0.500000 | standby | 0          | true              | 0                 |                   |                        |

Run SELECT query several times.

psql -h 127.0.0.1 -p 9999 -U postgres -c "select 1"

As you can see the select_cnt of both nodes are increasing. Pgpool-II can distribute SELECT queries among the multiple PostgreSQL pods.

$ psql -h 127.0.0.1 -p 9999 -U postgres -c "show pool_nodes"
node_id |   hostname    | port | status | lb_weight | role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state |
--------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+
0       | hippo         | 5432 | up     | 0.500000 | primary | 3          | false             | 0                 |                   |                        |
1       | hippo-replica | 5432 | up     | 0.500000 | standby | 6          | true              | 0                 |                   |                        |

Search This Blog

Playing with Elephants ~PostgreSQL and Pgpool-II~