Pgpool-II Logging and Debugging

black and yellow computer mouse on white desk

Logging and debugging help to monitor and identify issues or problems occurring in your program. Sometimes we need to log debug information to figure out the problems during software development and testing. However, if debug is enabled,  a large number of debug messages are generated and it is hard to read. Proper logging and debugging configurations are important.

There are a number of ways to retrieve debug information from Pgpool-II. In this post, I will describe the various ways for logging and debugging Pgpool-II. 

Logging

Before Pgpool-II 4.1, some log processing tools (e.g. rsyslog) are required to store Pgpool-II logs and rotate them.  

For example, below are the relevant configuration parameters for logging to syslog.

log_destination = 'syslog'
syslog_facility = 'LOCAL1'
syslog_ident = 'pgpool' 

Since Pgpool-II 4.2, logging collector process has been implemented. The logging collector process collects log messages sent to stderr and redirects them into the specified log file. And the built-in log rotation functionality is available by setting the configuration parameter log_rotation_age and log_rotation_size.

Below are the relevant configuration parameters:

log_destination = 'stderr'
logging_collector = on
log_directory = '/tmp/pgpool_log'
log_filename = 'pgpool-%Y-%m-%d_%H%M%S.log'
log_file_mode = 0600
log_truncate_on_rotation = off
log_rotation_age = 1d
log_rotation_size = 10MB

Useful logging parameters for debugging

Debugging messages can be enabled by starting Pgpool-II with "-d" option or configuring log_min_messages. However, it generates a large number of debug massages.

Proper logging and debugging configurations are important. in this section I will describe some useful configuration parameters for troubleshooting.

log_line_prefix

log_line_prefix outputs a printf-style string at the beginning of each log line. %a in log_line_prefix parameter outputs the application name. 

Since Pgpool-II 4.2, %a allows Pgpool-II internal process names to be logged in the log files.

log_line_prefix = '%t %a [%p] ' 

The log below shows the process names of the main process, child process, health check process, etc.

2021-04-29 20:59:50 main [10249] LOG:  
2021-04-
29 20:59:50 main [10249] DEBUG:  
2021-04-
29 20:59:50 child [10280] DEBUG:  
2021-04-
29 20:59:50 sr_check_worker [10287] DEBUG:
2021-04-
29 20:59:50 health_check0 [10288] DEBUG:
2021-04-
29 20:59:50 health_check1 [10289] DEBUG:
2021-04-
29 20:59:50 pcp_main [10286] DEBUG:

if you specify %a in log_line_prefix, Pgpool-II outputs the application name to the log file same as PostgreSQL. Before Pgpool-II 4.1, only the application name included in the startup message could be logged. Since Pgpool-II 4.2, this parameter supports the application name specified in connection parameter PGAPPNAME and SET command.

log_line_prefix = '%t %a [%p] ' 

For example, it is able to easily identify the log messages output by the currently connected process by specify a specific application name using SET command. Sometime it could be helpful to test the child process behavior.

$ PGAPPNAME=myapp1 psql -h /tmp -p 11000
postgres=# select 1;
postgres=# SET application_name TO 'myapp2';
postgres=# select 2;

Log output:

main [11423] LOG:  pgpool-II successfully started. version 4.2.1
main [11423] LOG:  node status[0]: 1
main [11423] LOG:  node status[1]: 2
myapp1 [11445] LOG:  DB node id: 0 backend pid: 11469 statement: SELECT
myapp1 [11445] LOG:  pool_reuse_block: blockid: 0
myapp1 [11445] CONTEXT:  while searching system catalog, When relcache is
myapp1 [11445] LOG:  DB node id: 1 backend pid: 11470 statement: select 1;
myapp1 [11445] LOG:  DB node id: 0 backend pid: 11469 statement:  DISCARD
myapp1 [11445] LOG:  DB node id: 1 backend pid: 11470 statement:  DISCARD
myapp1 [11445] LOG:  DB node id: 0 backend pid: 11625 statement:  DISCARD
myapp1 [11445] LOG:  DB node id: 1 backend pid: 11651 statement: select 1;
myapp1 [11445] LOG:  DB node id: 0 backend pid: 11650 statement: SET
myapp1 [11445] LOG:  DB node id: 1 backend pid: 11651 statement: SET
myapp2 [11445] LOG:  DB node id: 1 backend pid: 11651 statement: select 2;

log_per_node_statement

log_per_node_statement logs the SQL statements for each DB node separately. 

log_per_node_statement = on

Below is an example of streaming replication mode. In this session DB node1 is the load balancing node.

postgres=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node
---------+----------+-------+--------+-----------+---------+------------+-------------------
  0       | /tmp     | 11002 | up     | 0.500000  | primary | 0          | false             
  1       | /tmp     | 11003 | up     | 0.500000  | standby | 6          | true

Let's run several READ and WRITE queries to verify load balancing.  As shown in the result, the WRITE query is sent to "DB node id: 0" (primary) and the READ query is sent to "DB node id: 1" (standby).

LOG:  DB node id: 0 backend pid: 2649 statement: insert into t1 values (0);
LOG:  DB node id: 1 backend pid: 2650 statement: select * from t1;

Next, let's see another example of native replication mode. In native replication mode, Pgpool-II replicates data between DB nodes. log_per_node_statement is useful to verify Pgpool-II's replication feature. In addition, it provides an easy way to verify if the same query is executed on each DB node. 

Execute a query with now() function.

postgres=# insert into t1 values (0, now());
INSERT 0 1 

As shown in the result, now() function is rewritten to timestamp and the same query is sent to each DB node.

LOG:  DB node id: 0 backend pid: 4099 statement: INSERT INTO "t1" VALUES (0,"pg_catalog"."timestamptz"('2021-04-29 23:43:50.485898+09'::text))
LOG:  DB node id: 1 backend pid: 4100 statement: INSERT INTO "t1" VALUES (0,"pg_catalog"."timestamptz"('2021-04-29 23:43:50.485898+09'::text))

log_client_messages

Sometimes to figure out the problems in extended query protocol, we need to check the client messages.
However, if debug is enabled, a large number of debug messages are output.
log_client_messages allows you to log only client messages without debug messages. These parameter makes log files more readable.

log_client_messages = on

Below are the client messages output in extended query protocol.

LOG:  Parse message from frontend.
DETAIL:  statement: "S2", query: "SELECT 1 FROM pgbench_accounts"
LOG:  Bind message from frontend.
DETAIL:  portal: "P1", statement: "S2"
LOG:  DB node id: 0 backend pid: 24797 statement: B message
LOG:  Execute message from frontend.
DETAIL:  portal: "P1"
LOG:  DB node id: 0 backend pid: 24797 statement: Execute: SELECT 1 …
LOG:  Close message from frontend.
DETAIL:  statement: "S2"
LOG:  DB node id: 0 backend pid: 24797 statement: C message

log_connections/log_disconnections

log_connections/log_disconnections parameters have the same behavior as PostgreSQL. log_connections logs each attempted connection and log_disconnections (Since Pgpool-II 4.2) logs the session termination to the log file. These parameters are useful when you want to log the connections and disconnections for auditing purposes. 

log_connections = on
log_disconnections= on  

Conslusion

Pgpool-II built-in logging mechanism added in 4.2 simplifies Pgpool-II management.
In addition, Pgpool-II provides various parameters for different testing purposes.
In this blog, I have explained several parameters for debugging load balancing, extended query protocol, etc. See Error Reporting and Logging for more information.

Comments

Popular posts from this blog

Installing Pgpool-II on Debian/Ubuntu

Query Load Balancing in Pgpool-II

Connection Pooling in Pgpool-II