Pgpool-II Configuration Parameters - auto_failback

Pgpool-II can monitor the status of PostgreSQL and automatically perform failover or degeneration if a failure is detected. 

A healthy standby PostgreSQL server may be detached from Pgpool-II due to a temporary network problem. To re-attach a detached standby PostgreSQL server, you need to make sure that it is a healthy standby server and then manually run the pcp_attach_node command or restart Pgpool-II.
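
For example, a detached standby can be re-attached manually like this. The pcp port 9898, node id 1, and a PCP user named pgpool registered in pcp.conf are all assumptions for illustration and depend on your setup:

[root@server1 ~]# pcp_attach_node -h server1 -p 9898 -U pgpool -n 1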

Since Pgpool-II 4.1, it is possible to automatically re-attach a healthy standby PostgreSQL server by enabling auto_failback. In this post, I'll describe how to enable the automatic failback feature.

Automatic failback

To use this automatic failback feature, Pgpool-II's health check and streaming replication check must be enabled.

Because Pgpool-II periodically retrieves information from pg_stat_replication to check the connectivity between primary and standby servers, PostgreSQL 9.1 or later is required.

Pgpool-II will re-attach a PostgreSQL server if all of the following conditions are satisfied:

  • it is a standby PostgreSQL server
  • its status managed by Pgpool-II is down
  • its pg_stat_replication.state is streaming (see the query below)
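
For reference, you can query the WAL sender state that Pgpool-II checks directly on the primary:

[root@server1 ~]# psql -h server1 -U postgres -c "SELECT application_name, state FROM pg_stat_replication"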

To enable automatic failback you need to configure the following parameters:

auto_failback = on
auto_failback_interval
health_check_user
health_check_password
sr_check_user
sr_check_password
backend_application_name*

Configuring automatic failback

This section shows how to configure automatic failback.

This tutorial assumes you have already set up streaming replication between two PostgreSQL servers with the following hostnames:

  • server1: Primary and Pgpool-II
  • server2: Standby

Creating a PostgreSQL user

The PostgreSQL user used for the streaming replication check must be a superuser or a member of the pg_monitor role. For security reasons, it is not recommended to use a superuser. Here we create a dedicated user "pgpool" and grant it the minimum required privileges.

[root@server1 ~]# psql -h server1 -U postgres
postgres=# CREATE ROLE pgpool WITH LOGIN IN ROLE pg_monitor;
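
You can verify the new role and its membership with psql's \du meta-command:

postgres=# \du pgpool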

Editing pg_hba.conf

In this tutorial we use trust authentication for testing purposes only. In a production environment, please follow the PostgreSQL and Pgpool-II documentation to configure proper authentication settings.

Add the following entry to pg_hba.conf:

host    all         pgpool             samenet                 trust
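
Changes to pg_hba.conf take effect after a configuration reload on each PostgreSQL server, for example:

[root@server1 ~]# psql -h server1 -U postgres -c "SELECT pg_reload_conf()"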

Setting application_name

We set the hostname of the standby server as application_name in the primary_conninfo setting. This application_name must also be specified in backend_application_name* of Pgpool-II, which uses it to identify the standby server.

primary_conninfo = 'host=server1 port=5432 user=repl application_name=server2'
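
Since primary_conninfo is reloadable in PostgreSQL 13 and later, a reload on the standby is enough to apply this change (older versions require a restart of the standby):

[root@server2 ~]# su - postgres -c "/usr/pgsql-14/bin/pg_ctl reload -D 14/data"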

Editing pgpool.conf

Edit pgpool.conf and add the following configurations:

auto_failback = on
auto_failback_interval = 1min

health_check_period = 10
health_check_user = 'pgpool'
sr_check_period = 10
sr_check_user = 'pgpool'

backend_hostname0 = 'server1'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/var/lib/pgsql/14/data'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_application_name0 = 'server1'

backend_hostname1 = 'server2'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/var/lib/pgsql/14/data'
backend_flag1 = 'ALLOW_TO_FAILOVER'
backend_application_name1 = 'server2'

Because trust authentication is specified in PostgreSQL, the settings of health_check_password and sr_check_password are not required. If you wish to use other authentication methods, please follow this tutorial to set user passwords in Pgpool-II.

To check pg_stat_replication.state (the current WAL sender state) of each standby server, backend_application_name* must be configured. It must be the same name as the application_name specified in the primary_conninfo setting. In this tutorial we use the hostname of each server.
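
After editing pgpool.conf, restart Pgpool-II so that all of the settings above take effect. The systemd service name below assumes an RPM installation and may differ in your environment:

[root@server1 ~]# systemctl restart pgpool-II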

Verifying automatic failback

Once PostgreSQL and Pgpool-II are started, you can run show pool_nodes to check the initial cluster status. Both the primary and the standby are up.

[root@server1 ~]# psql -h server1 -U postgres -p 9999 -c "show pool_nodes"
 node_id | hostname | port | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
---------+----------+------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | server1  | 5432 | up     | up        | 0.500000  | primary | primary | 0          | false             | 0                 |                   |                        | 2022-02-28 12:03:18
 1       | server2  | 5432 | up     | up        | 0.500000  | standby | standby | 0          | true              | 0                 | streaming         | async                  | 2022-02-28 12:04:09

Then, let's stop the standby PostgreSQL server.

[root@server2 ~]# su - postgres -c "/usr/pgsql-14/bin/pg_ctl stop -D 14/data"

The status of server2 has changed from up to down.

[root@server2 ~]# psql -h server1 -U postgres -p 9999 -c "show pool_nodes"
 node_id | hostname | port | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
---------+----------+------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | server1  | 5432 | up     | up        | 0.500000  | primary | primary | 0          | true              | 0                 |                   |                        | 2022-02-28 12:03:18
 1       | server2  | 5432 | down   | down      | 0.500000  | standby | unknown | 0          | false             | 0                 |                   |                        | 2022-02-28 12:05:30

Next, restart the standby PostgreSQL server. 

If you are using replication slots, you may need to create the replication slot before restarting the standby PostgreSQL server.

[root@server2 ~]# psql -h server1 -U postgres -c "select * from pg_create_physical_replication_slot('server2')"
[root@server2 ~]# su - postgres -c "/usr/pgsql-14/bin/pg_ctl start -D 14/data"

Wait about one minute (the auto_failback_interval configured above); the standby server rejoins the cluster and its status changes to up.

[root@server2 ~]# psql -h server1 -U postgres -p 9999 -c "show pool_nodes"
 node_id | hostname | port | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
---------+----------+------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | server1  | 5432 | up     | up        | 0.500000  | primary | primary | 0          | true              | 0                 |                   |                        | 2022-02-28 12:03:18
 1       | server2  | 5432 | up     | up        | 0.500000  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2022-02-28 12:12:12
