Deploy Pgpool-II on Kubernetes to Achieve Query Load Balancing and Monitoring
Architecture
A PostgreSQL operator is required to manage PostgreSQL clusters on Kubernetes, so you need to combine Pgpool-II with a PostgreSQL operator. The architecture is shown below:
Prerequisites
Before you start deploying Pgpool-II, please check the following prerequisites:
- Make sure you have a Kubernetes cluster, and kubectl is installed.
- A PostgreSQL Operator and a PostgreSQL cluster are installed. In this blog, I used the Crunchy PostgreSQL Operator to create a PostgreSQL cluster.
Deploy Pgpool-II and Pgpool-II Exporter
We deploy a Pgpool-II pod that contains a Pgpool-II container and a Pgpool-II Exporter container. You need to specify the Docker images for both (for more information, see the Pgpool-II Dockerfile and the Pgpool-II Exporter Dockerfile).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pgpool
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pgpool
  template:
    metadata:
      labels:
        app: pgpool
    spec:
      containers:
      - name: pgpool
        image: pgpool/pgpool
        ...
      - name: pgpool-stats
        image: pgpool/pgpool2_exporter
        ...
Configure Pgpool-II
Pgpool-II's health check, automatic failover, Watchdog and online recovery features aren't required on Kubernetes, because Kubernetes itself handles pod failures. You only need to enable load balancing and connection pooling.
Here, we configure Pgpool-II using environment variables. In a production environment, we recommend using a ConfigMap to manage Pgpool-II's config files, i.e. pgpool.conf, pcp.conf, pool_passwd and pool_hba.conf. You can find more information in the documentation.
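For example, pgpool.conf could be stored in a ConfigMap and mounted into the Pgpool-II container. The sketch below is only an illustration: the ConfigMap name, the config fragment and the mount path are assumptions (the Pgpool-II documentation describes where the image expects mounted config files), so adjust them to your environment.

apiVersion: v1
kind: ConfigMap
metadata:
  name: pgpool-config
data:
  pgpool.conf: |
    # fragment for illustration only
    backend_hostname0 = 'hippo'
    backend_hostname1 = 'hippo-replica'

and then, in the pod spec of the Deployment above:

      containers:
      - name: pgpool
        image: pgpool/pgpool
        volumeMounts:
        - name: pgpool-config
          mountPath: /config   # assumed mount path; check the Pgpool-II docs
      volumes:
      - name: pgpool-config
        configMap:
          name: pgpool-config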
Environment variables starting with PGPOOL_PARAMS_ are converted into Pgpool-II configuration parameters, and their values override the default configuration. You can download the pgpool_deploy.yaml manifest and specify the environment variables in it.
- The Pgpool-II container Docker image is built with streaming replication mode. By default, load balancing, connection pooling and the streaming replication check are enabled.
- Specify only two backend nodes: set backend_hostname0 to the Primary Service name and backend_hostname1 to the Replica Service name.
- Because failover is managed by Kubernetes, set the DISALLOW_TO_FAILOVER flag in backend_flag for both nodes, and additionally set the ALWAYS_PRIMARY flag in backend_flag0.
env:
- name: PGPOOL_PARAMS_BACKEND_HOSTNAME0
  value: "hippo"
- name: PGPOOL_PARAMS_BACKEND_HOSTNAME1
  value: "hippo-replica"
- name: PGPOOL_PARAMS_BACKEND_FLAG0
  value: "ALWAYS_PRIMARY|DISALLOW_TO_FAILOVER"
- name: PGPOOL_PARAMS_BACKEND_FLAG1
  value: "DISALLOW_TO_FAILOVER"
- You need to specify sr_check_user and sr_check_password to perform the streaming replication check. For security reasons, we recommend specifying either an encrypted password or a value that refers to a Secret. For example, if you are using the Crunchy PostgreSQL Operator, the postgres user's password is stored in hippo-postgres-secret by default. You can define the environment variables PGPOOL_PARAMS_SR_CHECK_USER and PGPOOL_PARAMS_SR_CHECK_PASSWORD:
- name: PGPOOL_PARAMS_SR_CHECK_USER
  valueFrom:
    secretKeyRef:
      name: hippo-postgres-secret
      key: username
- name: PGPOOL_PARAMS_SR_CHECK_PASSWORD
  valueFrom:
    secretKeyRef:
      name: hippo-postgres-secret
      key: password
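If your PostgreSQL operator does not create such a Secret for you, you can create one manually; the Secret name and keys below mirror the Crunchy example, and the password is a placeholder.

$ kubectl -n pgo create secret generic hippo-postgres-secret \
    --from-literal=username=postgres \
    --from-literal=password=<your-password>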
Configure Pgpool-II Exporter
To configure the Pgpool-II Exporter container, you need to specify the environment variables POSTGRES_USER, POSTGRES_PASSWORD and PGPOOL_SERVICE, which are used to build DATA_SOURCE_NAME.
- name: pgpool-stats
  image: pgpool/pgpool2_exporter
  env:
  - name: POSTGRES_USER
    valueFrom:
      secretKeyRef:
        name: hippo-postgres-secret
        key: username
  - name: POSTGRES_PASSWORD
    valueFrom:
      secretKeyRef:
        name: hippo-postgres-secret
        key: password
  - name: PGPOOL_SERVICE
    value: "localhost"
Pgpool-II and Pgpool-II Exporter should work with the minimal configuration above. Once you have defined the required environment variables in the Deployment manifest pgpool_deploy.yaml, you can deploy the Pgpool-II pod.
$ kubectl apply -f pgpool_deploy.yaml
After deploying Pgpool-II, you can see the Pgpool-II pod and services using the kubectl get pod and kubectl get svc commands.
$ kubectl get pod -n pgo -l app=pgpool
NAME                     READY   STATUS    RESTARTS   AGE
pgpool-9694c4548-7dsqj   2/2     Running   0          37m
$ kubectl get svc -n pgo -l 'app in (pgpool, pgpool-stats)'
NAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
pgpool         ClusterIP   10.100.83.171   <none>        9999/TCP   52m
pgpool-stats   ClusterIP   10.100.44.229   <none>        9719/TCP   52m
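The pgpool and pgpool-stats Services shown above are usually defined alongside the Deployment. If you need to write them yourself, a minimal sketch could look like the following; the names, labels and selectors are assumptions chosen to match the Deployment above and the ports listed by kubectl get svc.

apiVersion: v1
kind: Service
metadata:
  name: pgpool
  labels:
    app: pgpool
spec:
  selector:
    app: pgpool
  ports:
  - port: 9999
    targetPort: 9999
---
apiVersion: v1
kind: Service
metadata:
  name: pgpool-stats
  labels:
    app: pgpool-stats
spec:
  selector:
    app: pgpool
  ports:
  - port: 9719
    targetPort: 9719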
Try query load balancing
Let's connect to Pgpool-II and try query load balancing.
$ kubectl -n pgo port-forward svc/pgpool 9999:9999 &
$ psql -h 127.0.0.1 -p 9999 -U postgres -c "show pool_nodes"
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state |
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+
       0 | hippo         | 5432 | up     | 0.500000  | primary | 0          | false             | 0                 |                   |                        |
       1 | hippo-replica | 5432 | up     | 0.500000  | standby | 0          | true              | 0                 |                   |                        |
Run a SELECT query several times.
$ psql -h 127.0.0.1 -p 9999 -U postgres -c "select 1"
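For instance, a quick way to issue the query repeatedly is a small shell loop (just a convenience; any number of repetitions works):

$ for i in $(seq 1 9); do
    psql -h 127.0.0.1 -p 9999 -U postgres -c "select 1" > /dev/null
  done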
As you can see, the select_cnt of both nodes is increasing, which shows that Pgpool-II is distributing SELECT queries across the PostgreSQL pods.
$ psql -h 127.0.0.1 -p 9999 -U postgres -c "show pool_nodes"
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state |
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+
       0 | hippo         | 5432 | up     | 0.500000  | primary | 3          | false             | 0                 |                   |                        |
       1 | hippo-replica | 5432 | up     | 0.500000  | standby | 6          | true              | 0                 |                   |                        |
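To check the monitoring side as well, you can look at the metrics exposed by Pgpool-II Exporter. The sketch below assumes the exporter's default port 9719 and the usual Prometheus /metrics path:

$ kubectl -n pgo port-forward svc/pgpool-stats 9719:9719 &
$ curl -s http://localhost:9719/metrics | head -n 20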