Deploy Pgpool-II on Kubernetes to Achieve Query Load Balancing and Monitoring

In my previous blog post, I have introduced how to monitor Pgpool-II and PostgreSQL cluster using Pgpool-II Exporter. As I mentioned in the previous post, Pgpool-II Exporter exposes metrics that Prometheus can collect. Prometheus is one of the most popular monitoring tools used on Kubernetes. In this post, I would like to introduce how to deploy Pgpool-II and Pgpool-II Exporter on Kubernetes.

Architecture

A PostgreSQL operator is required to manage PostgreSQL clusters on Kubernetes. Therefore, you need to combine Pgpool-II with a PostgreSQL operator. Below is the architecture:

Prerequisites

Before you start deploying Pgpool-II, please check the following prerequisites:

  • Make sure you have a Kubernetes cluster, and kubectl is installed.
  • PostgreSQL Operator and a PostgreSQL cluster are installed. In the blog, I used the PostgreSQL Operator Crunchy PostgreSQL Operator to create a PostgreSQL cluster.

Deploy Pgpool-II and Pgpool-II Exporter

We deploy the Pgpool-II pod that contains a Pgpool-II container and a Pgpool-II Exporter container. You need to specify the Docker images (see more information, Pgpool-II Dockerfile and Pgpool-II Exporter Dockerfile).

apiVersion: apps/v1
kind: Deployment
metadata:
  name: pgpool
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pgpool
  template:
    metadata:
      labels:
        app: pgpool
    spec:
      containers:
      - name: pgpool
        image: pgpool/pgpool
        ...
      - name: pgpool-stats
        image: pgpool/pgpool2_exporter
        ...

Configure Pgpool-II

Pgpool-II's health check, automatic failover, Watchdog and online recovery features aren't required on Kubernetes. You need to only enable load balancing and connection pooling

Here, we configure Pgpool-II using environment variables. If you are using a production environment, we recommend using a ConfigMap to configure Pgpool-II's config files, i.e. pgpool.conf, pcp.conf, pool_passwd and pool_hba.conf. You can find more information in the documentation.

Environment variables starting with PGPOOL_PARAMS_ can be converted to Pgpool-II's configuration parameters and these values can override the default configurations. You can download the pgpool_deploy.yaml and specify environment variables in this manifest file.

  • The Pgpool-II container Docker images is build with streaming replication mode. By default, load balancing, connection pooling and streaming replication check is enabled. 
  • Specify only two backend nodes. Specify the Primary Service name to backend_hostname0. Specify the Replica Service name to backend_hostname1. 
  • Because failover is managed by Kubernetes, specify DISALLOW_TO_FAILOVER flag to backend_flag for both of the two nodes and ALWAYS_PRIMARY flag to backend_flag0.

env:
- name: PGPOOL_PARAMS_BACKEND_HOSTNAME0
  value: "hippo"
- name: PGPOOL_PARAMS_BACKEND_HOSTNAME1
  value: "hippo-replica"
- name: PGPOOL_PARAMS_BACKEND_FLAG0
  value: "ALWAYS_PRIMARY|DISALLOW_TO_FAILOVER"
- name: PGPOOL_PARAMS_BACKEND_FLAG1
  value: "DISALLOW_TO_FAILOVER"

  • You need to specify sr_check_user and sr_check_password to perform streaming replication check. For the security reasons, we recommend that you specify either an encrypted password or a value that refers to the Secret's value. For example, if you are using Crunchy PostgreSQL Operator, by default the postgres user password is stored in hippo-postgres-secret. You can define the environment variables PGPOOL_PARAMS_SR_CHECK_USER and PGPOOL_PARAMS_SR_CHECK_PASSWORD:

- name: PGPOOL_PARAMS_SR_CHECK_USER
  valueFrom:
    secretKeyRef:
      name: hippo-postgres-secret
      key: username
- name: PGPOOL_PARAMS_SR_CHECK_PASSWORD
  valueFrom:
    secretKeyRef:
      name: hippo-postgres-secret
      key: password

Configure Pgpool-II Exporter

To configure Pgpool-II Exporter container, you need to specify the environment variables POSTGRES_USER, POSTGRES_PASSWORD and PGPOOL_SERVICE to define DATA_SOURCE_NAME

- name: pgpool-stats
  image: pgpool/pgpool2_exporter
  env:
  - name: POSTGRES_USER
    valueFrom:
      secretKeyRef:
        name: hippo-postgres-secret
        key: username
  - name: POSTGRES_PASSWORD
    valueFrom:
      secretKeyRef:
        name: hippo-postgres-secret
        key: password
  - name: PGPOOL_SERVICE
    value: "localhost"

The Pgpool-II and Pgpool-II Exporter should work with the minimal configuration above. Once you have defined the required environment variables in Deployment manifest pgpool_deploy.yaml, you can deploy Pgpool-II pod. 

$ kubectl apply -f pgpool_deploy.yaml

After deploying Pgpool-II, you can see the Pgpool-II pod and services using kubectl get pod and kubectl get svc command.

$  kubectl get pod -n pgo -l app=pgpool
NAME                     READY   STATUS    RESTARTS   AGE
pgpool-9694c4548-7dsqj   2/2     Running   0          37m

$ kubectl get svc -n pgo -l 'app in (pgpool, pgpool-stats)'
NAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
pgpool         ClusterIP   10.100.83.171   <none>        9999/TCP   52m
pgpool-stats   ClusterIP   10.100.44.229   <none>        9719/TCP   52m

Try queries load balancing

Let's connect to Pgpool-II and try query load balancing.

$ kubectl -n pgo port-forward svc/pgpool 9999:9999 &
$ psql -h 127.0.0.1 -p 9999 -U postgres -c "show pool_nodes"
node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state |
--------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+
0       | hippo         | 5432 | up     | 0.500000  | primary | 0          | false             | 0                 |                   |                        |
1       | hippo-replica | 5432 | up     | 0.500000  | standby | 0          | true              | 0                 |                   |                        |

Run SELECT query several times.

psql -h 127.0.0.1 -p 9999 -U postgres -c "select 1"

As you can see the select_cnt of both nodes are increasing. Pgpool-II can distribute SELECT queries among the multiple PostgreSQL pods. 

$ psql -h 127.0.0.1 -p 9999 -U postgres -c "show pool_nodes"
node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state |
--------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+
0       | hippo         | 5432 | up     | 0.500000  | primary | 3          | false             | 0                 |                   |                        |
1       | hippo-replica | 5432 | up     | 0.500000  | standby | 6          | true              | 0                 |                   |                        |

Comments

Popular posts from this blog

Installing Pgpool-II on Debian/Ubuntu

Query Load Balancing in Pgpool-II

Connection Pooling in Pgpool-II