Comments (11)
Looks like the autoctl user is created outside of pg_autoctl. When you create it, you should grant the usage privilege on the pgautofailover schema and execute privileges on the functions you want to call.
Could you confirm?
from pg_auto_failover.
I am not comfortable with granting all users permission to perform failovers.
We already have a user, autoctl_node, that can only do node-related operations. If we grant failover capability to all users, we would be enabling the autoctl_node user to do those operations, meaning anybody who can execute the pg_autoctl command line can execute any function on the monitor node.
@DimCitus what is your take on this?
from pg_auto_failover.
Hi @danielwestermann, @mtuncer,
The pgautofailover extension creates the autoctl_node user. We use that user from the pg_autoctl processes to call in the public API, mainly pgautofailover.register_node and pgautofailover.node_active, plus some other entry points.
We exclude the operators API from the privileges of the autoctl_node user on purpose, for security reasons. With the default settings, we don't want any node in the system to be allowed to manually perform a failover at any point in time.
The pg_autoctl create monitor command creates the autoctl user as the owner of the pg_auto_failover database, and then creates the pgautofailover extension in there. The CREATE EXTENSION has to be run as a superuser though, and then the ops API is not granted to the autoctl user.
I think it's fair to qualify this as a bug, and we should GRANT EXECUTE to our autoctl user on the following API functions:
pgautofailover.perform_failover
pgautofailover.start_maintenance
pgautofailover.stop_maintenance
pgautofailover.enable_secondary
pgautofailover.disable_secondary
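As a rough illustration of the grants discussed in this thread, here is a minimal Python sketch that only builds the SQL text for the schema and the five functions listed above. The helper name `build_grants` is hypothetical, and the statements use the argument-less form of GRANT EXECUTE ON FUNCTION, which PostgreSQL 10 and later accept only when the function name is not overloaded; that the names are unique in the pgautofailover schema is an assumption. Privileges on tables that these functions touch are not covered here.

```python
# Hypothetical helper, not part of pg_auto_failover: it only formats SQL text.
OPERATOR_FUNCTIONS = [
    "perform_failover",
    "start_maintenance",
    "stop_maintenance",
    "enable_secondary",
    "disable_secondary",
]


def build_grants(role="autoctl", schema="pgautofailover",
                 functions=OPERATOR_FUNCTIONS):
    """Return GRANT statements for schema usage and function execution.

    Identifiers are inserted without quoting, so this is only suitable for
    the simple lowercase names used in this thread.
    """
    statements = [f"GRANT USAGE ON SCHEMA {schema} TO {role};"]
    for name in functions:
        # Assumption: the function name is not overloaded, so the argument
        # list can be omitted (PostgreSQL 10 and later).
        statements.append(
            f"GRANT EXECUTE ON FUNCTION {schema}.{name} TO {role};"
        )
    return statements


if __name__ == "__main__":
    # Print the statements so they can be reviewed before running them
    # as a superuser in the pg_auto_failover database on the monitor.
    print("\n".join(build_grants()))
```

If one of the functions turns out to be overloaded, the corresponding statement would need the full argument list, which you can look up with \df pgautofailover.* in psql.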
from pg_auto_failover.
I tried to fix the formatting to make it easier to read the logs, but it seems GitHub didn't care for the edits. Sorry about that. Thanks for debugging the GRANT idea to its full picture with the dependencies (schema, table, etc).
The HBA should be handled by pg_auto_failover so that the failover can happen automatically. That said, the errors you have are about the pgautofailover_monitor user, which is used from the monitor to the node. And then it's all about your DNS settings, where it seems like you don't have reverse DNS set up. You can debug that situation with pg_autoctl in the following way:
$ PG_AUTOCTL_DEBUG=1 pg_autoctl -vv do show nodename
Of course the usual DNS tooling is going to be even more useful; the previous debug command lets you reproduce the internal nodename detection implemented at pg_autoctl create time.
Can you share the HBA contents added by pg_auto_failover in your nodes?
Also, you might want to use the --listen option of the pg_autoctl create command so that you can have different listen address settings on each node, controlled by pg_autoctl and kept out of postgresql.conf, so that we listen to the right address after a failover.
from pg_auto_failover.
hba.conf monitor node:
[postgres@pg-af3 ~]$ cat /u02/pgdata/12/af/pg_hba.conf | egrep -v "^$|^#"
local all all trust
host all all 127.0.0.1/32 trust
host all all ::1/128 trust
local replication all trust
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
host "pg_auto_failover" "autoctl_node" 192.168.22.0/24 trust # Auto-generated by pg_auto_failover
hba.conf primary:
[postgres@pg-af1 ~]$ cat /u02/pgdata/12/PG1/pg_hba.conf | egrep -v "^$|^#"
local all all trust
host all all 127.0.0.1/32 trust
host all all ::1/128 trust
local replication all trust
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
host "postgres" "postgres" pg-af1.it.dbi-services.com trust # Auto-generated by pg_auto_failover
host all "pgautofailover_monitor" pg-af3 trust # Auto-generated by pg_auto_failover
host replication "pgautofailover_replicator" pg-af2.it.dbi-services.com trust # Auto-generated by pg_auto_failover
host "postgres" "pgautofailover_replicator" pg-af2.it.dbi-services.com trust # Auto-generated by pg_auto_failover
hba.conf replica:
[postgres@pg-af2 ~]$ cat /u02/pgdata/12/PG1/pg_hba.conf | egrep -v "^$|^#"
local all all trust
host all all 127.0.0.1/32 trust
host all all ::1/128 trust
local replication all trust
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
host "postgres" "postgres" pg-af1.it.dbi-services.com trust # Auto-generated by pg_auto_failover
host all "pgautofailover_monitor" pg-af3 trust # Auto-generated by pg_auto_failover
host replication "pgautofailover_replicator" pg-af2.it.dbi-services.com trust # Auto-generated by pg_auto_failover
host "postgres" "pgautofailover_replicator" pg-af2.it.dbi-services.com trust # Auto-generated by pg_auto_failover
Hostname, monitor:
[postgres@pg-af3 ~]$ PG_AUTOCTL_DEBUG=1 pg_autoctl -vv do show nodename
14:58:44 DEBUG cli_do_show.c:209: cli_show_nodename: ip 10.0.2.15
14:58:44 DEBUG cli_do_show.c:220: cli_show_nodename: host pg-af3
14:58:44 DEBUG cli_do_show.c:231: cli_show_nodename: ip 192.168.22.72
pg-af3
Hostname, master:
[postgres@pg-af1 ~]$ PG_AUTOCTL_DEBUG=1 pg_autoctl -vv do show nodename
14:58:46 DEBUG cli_do_show.c:209: cli_show_nodename: ip 10.0.2.15
14:58:46 DEBUG cli_do_show.c:220: cli_show_nodename: host pg-af1
14:58:46 DEBUG cli_do_show.c:231: cli_show_nodename: ip 192.168.22.70
pg-af1
Hostname, replica:
[postgres@pg-af2 ~]$ PG_AUTOCTL_DEBUG=1 pg_autoctl -vv do show nodename
14:58:49 DEBUG cli_do_show.c:209: cli_show_nodename: ip 10.0.2.15
14:58:49 DEBUG cli_do_show.c:220: cli_show_nodename: host pg-af2
14:58:49 DEBUG cli_do_show.c:231: cli_show_nodename: ip 192.168.22.71
pg-af2
from pg_auto_failover.
There seems to be a mix of local DNS entry names in the HBA config files (pg-af3) and also full DNS names in other places (pg-af2.it.dbi-services.com). I'm not sure why that would be the case; maybe that's what you used with the --nodename option at pg_autoctl create time?
The do show nodename subcommand first finds the local IP address that allows you to connect to a default external service (we use 8.8.8.8:53 in UDP). Then it does a reverse DNS lookup on that address, and finally it does a forward DNS lookup on the hostname returned by the reverse lookup. If the forward DNS lookup answer contains a local IP address on the node, we keep the hostname.
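The three steps just described can be sketched in Python as follows. This is only an illustration of the idea, not the C code in pg_autoctl: the function names (`local_ip_for`, `detect_nodename`, and the two lookup helpers) are made up for this example, and what pg_autoctl does when the check fails is not covered, so the sketch simply returns None in that case.

```python
import socket


def local_ip_for(target=("8.8.8.8", 53)):
    """Return the local IP the OS would pick to reach target.

    Connecting a UDP socket only selects a route and a source address;
    no packet is sent.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.connect(target)
        return sock.getsockname()[0]


def reverse_lookup(ip):
    """Reverse name resolution: IP address to primary host name."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname):
    """Forward name resolution: host name to list of IPv4 addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def detect_nodename(local_ips, route_ip,
                    reverse=reverse_lookup, forward=forward_lookup):
    """Return the hostname if reverse and forward lookups agree, else None.

    local_ips: addresses configured on this node (supplied by the caller).
    route_ip: the address found in step 1, e.g. local_ip_for().
    """
    try:
        hostname = reverse(route_ip)      # step 2: reverse lookup
        addresses = forward(hostname)     # step 3: forward lookup
    except OSError:                       # herror and gaierror are OSErrors
        return None
    # Keep the hostname only if it resolves back to an address on this node.
    if any(ip in local_ips for ip in addresses):
        return hostname
    return None
```

With the values from the monitor's debug output earlier in this thread, calling `detect_nodename({"10.0.2.15", "192.168.22.72"}, "10.0.2.15")` with resolvers that map 10.0.2.15 to pg-af3 and pg-af3 to 192.168.22.72 would keep pg-af3. The resolvers are parameters so the logic can be tried without touching real DNS.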
The reason we do all of this is to make sure that the HBA rules are going to be aligned with Postgres' own matching rules. Postgres does DNS lookups of the client hostname to match with the HBA rules entries. Here it seems the entry has pg-af3 and the DNS reverse lookup has pg-af3.it.dbi-services.com, which is not the same thing, and the connection fails for the monitor process.
Did you use --nodename in the setup, or did you use the default from pg_autoctl automatic network discovery? The automatic discovery goes only so far, but should work in many cases. Is it possible that you can fix the --nodename in a way that is compatible with both your DNS setup and Postgres expectations?
from pg_auto_failover.
Have a look at the address section of https://www.postgresql.org/docs/12/auth-pg-hba-conf.html, where you will find the following text:
If a host name is specified (anything that is not an IP address range or a special key word is treated as a host name), that name is compared with the result of a reverse name resolution of the client's IP address (e.g., reverse DNS lookup, if DNS is used). Host name comparisons are case insensitive. If there is a match, then a forward name resolution (e.g., forward DNS lookup) is performed on the host name to check whether any of the addresses it resolves to are equal to the client's IP address. If both directions match, then the entry is considered to match. (The host name that is used in pg_hba.conf should be the one that address-to-name resolution of the client's IP address returns, otherwise the line won't be matched. Some host name databases allow associating an IP address with multiple host names, but the operating system will only return one host name when asked to resolve an IP address.)
That's why we're doing all that dance in the --nodename default value. Short of having proper DNS setup, please consider using --nodename <IP address> to make things work.
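To make the quoted rule concrete, here is a small Python sketch of the two-direction check as the documentation describes it. It is a simplified model for illustration, not Postgres code: the function name `hba_hostname_matches` is made up, the resolvers are passed in as plain callables, and the suffix form of host names (entries starting with a dot) is left out.

```python
import socket


def hba_hostname_matches(entry_name, client_ip, reverse=None, forward=None):
    """Model of host name matching for one pg_hba.conf entry.

    entry_name: the host name written in pg_hba.conf.
    client_ip: the address the client connects from.
    reverse, forward: resolver callables; default to the system resolver.
    """
    if reverse is None:
        reverse = lambda ip: socket.gethostbyaddr(ip)[0]
    if forward is None:
        forward = lambda name: socket.gethostbyname_ex(name)[2]

    try:
        resolved = reverse(client_ip)
    except OSError:
        return False
    # Direction 1: the reverse lookup result must equal the entry,
    # compared case-insensitively.
    if resolved.lower() != entry_name.lower():
        return False
    # Direction 2: the entry must resolve back to the client's address.
    try:
        return client_ip in forward(entry_name)
    except OSError:
        return False
```

For the situation in this thread, if reverse resolution of 192.168.22.72 returns pg-af3.it.dbi-services.com, then an entry written as pg-af3 fails the first comparison even though both names may resolve to the same address, while an entry written as pg-af3.it.dbi-services.com passes both directions.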
from pg_auto_failover.
Thanks for the hint. Converted all to IP addresses but it still does not work.
State before the failover:
[postgres@pg-af1 pg_autoctl]$ pg_autoctl show state
Name | Port | Group | Node | Current State | Assigned State
--------------+--------+-------+-------+-------------------+------------------
192.168.22.70 | 5432 | 0 | 1 | wait_primary | primary
192.168.22.71 | 5432 | 0 | 2 | secondary | secondary
After granting the missing permissions and initiating the failover these are the state changes:
[postgres@pg-af1 pg_autoctl]$ pg_autoctl show state
Name | Port | Group | Node | Current State | Assigned State
--------------+--------+-------+-------+-------------------+------------------
192.168.22.70 | 5432 | 0 | 1 | primary | primary
192.168.22.71 | 5432 | 0 | 2 | secondary | secondary
[postgres@pg-af1 pg_autoctl]$ pg_autoctl show state
Name | Port | Group | Node | Current State | Assigned State
--------------+--------+-------+-------+-------------------+------------------
192.168.22.70 | 5432 | 0 | 1 | primary | draining
192.168.22.71 | 5432 | 0 | 2 | secondary | prepare_promotion
[postgres@pg-af1 pg_autoctl]$ pg_autoctl show state
Name | Port | Group | Node | Current State | Assigned State
--------------+--------+-------+-------+-------------------+------------------
192.168.22.70 | 5432 | 0 | 1 | primary | demote_timeout
192.168.22.71 | 5432 | 0 | 2 | prepare_promotion | stop_replication
[postgres@pg-af1 pg_autoctl]$ pg_autoctl show state
Name | Port | Group | Node | Current State | Assigned State
--------------+--------+-------+-------+-------------------+------------------
192.168.22.70 | 5432 | 0 | 1 | demote_timeout | demote_timeout
192.168.22.71 | 5432 | 0 | 2 | stop_replication | stop_replication
[postgres@pg-af1 pg_autoctl]$ pg_autoctl show state
Name | Port | Group | Node | Current State | Assigned State
--------------+--------+-------+-------+-------------------+------------------
192.168.22.70 | 5432 | 0 | 1 | demote_timeout | demote_timeout
192.168.22.71 | 5432 | 0 | 2 | stop_replication | stop_replication
[postgres@pg-af1 pg_autoctl]$ pg_autoctl show state
Name | Port | Group | Node | Current State | Assigned State
--------------+--------+-------+-------+-------------------+------------------
192.168.22.70 | 5432 | 0 | 1 | demote_timeout | demote_timeout
192.168.22.71 | 5432 | 0 | 2 | stop_replication | stop_replication
[postgres@pg-af1 pg_autoctl]$ pg_autoctl show state
Name | Port | Group | Node | Current State | Assigned State
--------------+--------+-------+-------+-------------------+------------------
192.168.22.70 | 5432 | 0 | 1 | demote_timeout | demote_timeout
192.168.22.71 | 5432 | 0 | 2 | stop_replication | stop_replication
[postgres@pg-af1 pg_autoctl]$ pg_autoctl show state
Name | Port | Group | Node | Current State | Assigned State
--------------+--------+-------+-------+-------------------+------------------
192.168.22.70 | 5432 | 0 | 1 | demote_timeout | demote_timeout
192.168.22.71 | 5432 | 0 | 2 | stop_replication | stop_replication
[postgres@pg-af1 pg_autoctl]$ pg_autoctl show state
Name | Port | Group | Node | Current State | Assigned State
--------------+--------+-------+-------+-------------------+------------------
192.168.22.70 | 5432 | 0 | 1 | demote_timeout | demote_timeout
192.168.22.71 | 5432 | 0 | 2 | stop_replication | stop_replication
[postgres@pg-af1 pg_autoctl]$ pg_autoctl show state
Name | Port | Group | Node | Current State | Assigned State
--------------+--------+-------+-------+-------------------+------------------
192.168.22.70 | 5432 | 0 | 1 | demote_timeout | demoted
192.168.22.71 | 5432 | 0 | 2 | wait_primary | wait_primary
[postgres@pg-af1 pg_autoctl]$ pg_autoctl show state
Name | Port | Group | Node | Current State | Assigned State
--------------+--------+-------+-------+-------------------+------------------
192.168.22.70 | 5432 | 0 | 1 | demoted | catchingup
192.168.22.71 | 5432 | 0 | 2 | wait_primary | wait_primary
[postgres@pg-af1 pg_autoctl]$ pg_autoctl show state
Name | Port | Group | Node | Current State | Assigned State
--------------+--------+-------+-------+-------------------+------------------
192.168.22.70 | 5432 | 0 | 1 | demoted | catchingup
192.168.22.71 | 5432 | 0 | 2 | wait_primary | wait_primary
And then it stays like that. What caught my attention is that, from time to time, you see two pg_basebackup processes on the old master:
postgres 2636 1 0 18:35 ? 00:00:00 /u01/app/postgres/product/12/db_0/bin/pg_autoctl run
postgres 2954 2636 20 18:37 ? 00:00:00 /u01/app/postgres/product/12/db_0/bin/pg_basebackup -w -h 192.168.22.71 -p 5432 --pgdata /u02/pgdata/12/backup -U pgautofailo
postgres 2955 2954 0 18:37 ? 00:00:00 /u01/app/postgres/product/12/db_0/bin/pg_basebackup -w -h 192.168.22.71 -p 5432 --pgdata /u02/pgdata/12/backup -U pgautofailo
I don't think that is intended.
hba file on the old master:
host "postgres" "postgres" 192.168.22.70/32 trust # Auto-generated by pg_auto_failover
host all "pgautofailover_monitor" 192.168.22.72/32 trust # Auto-generated by pg_auto_failover
host replication "pgautofailover_replicator" 192.168.22.71/32 trust # Auto-generated by pg_auto_failover
host "postgres" "pgautofailover_replicator" 192.168.22.71/32 trust # Auto-generated by pg_auto_failover
host replication "pgautofailover_replicator" 192.168.22.70/32 trust # Auto-generated by pg_auto_failover
host "postgres" "pgautofailover_replicator" 192.168.22.70/32 trust # Auto-generated by pg_auto_failover
hba file on the new master:
host "postgres" "postgres" 192.168.22.70/32 trust # Auto-generated by pg_auto_failover
host all "pgautofailover_monitor" 192.168.22.72/32 trust # Auto-generated by pg_auto_failover
host replication "pgautofailover_replicator" 192.168.22.71/32 trust # Auto-generated by pg_auto_failover
host "postgres" "pgautofailover_replicator" 192.168.22.71/32 trust # Auto-generated by pg_auto_failover
host replication "pgautofailover_replicator" 192.168.22.70/32 trust # Auto-generated by pg_auto_failover
host "postgres" "pgautofailover_replicator" 192.168.22.70/32 trust # Auto-generated by pg_auto_failover
hba file on the monitor:
host "pg_auto_failover" "autoctl_node" 192.168.22.0/24 trust # Auto-generated by pg_auto_failover
From journalctl on the old master:
Nov 05 18:50:19 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:19 INFO Calling node_active for node default/1/0 with current state: demoted, PostgreSQL is not running, sync_state is "", current lsn is "0/0".
Nov 05 18:50:19 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:19 INFO FSM transition from "demoted" to "catchingup": A new primary is available. First, try to rewind. If that fails, do a pg_basebackup.
Nov 05 18:50:19 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:19 INFO The primary node returned by the monitor is 192.168.22.71:5432
Nov 05 18:50:19 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:19 INFO Rewinding PostgreSQL to follow new primary 192.168.22.71:5432
Nov 05 18:50:19 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:19 ERROR Connection to database failed: could not connect to server: No such file or directory
Nov 05 18:50:19 pg-af1.it.dbi-services.com pg_autoctl[2636]: Is the server running locally and accepting
Nov 05 18:50:19 pg-af1.it.dbi-services.com pg_autoctl[2636]: connections on Unix domain socket "/tmp/.s.PGSQL.5432"?
Nov 05 18:50:19 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:19 ERROR Failed to get the postgresql.conf path from the local postgres server, see above for details
Nov 05 18:50:19 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:19 WARN Failed to rewind demoted primary to standby, trying pg_basebackup instead
Nov 05 18:50:19 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:19 INFO Initialising PostgreSQL as a hot standby
Nov 05 18:50:19 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:19 INFO Target directory exists: "/u02/pgdata/12/PG1", stopping PostgreSQL
Nov 05 18:50:19 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:19 INFO pg_ctl: no server running
Nov 05 18:50:19 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:19 ERROR could not identify current directory: No such file or directory
Nov 05 18:50:19 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:19 INFO pg_ctl stop failed, but PostgreSQL is not running anyway
Nov 05 18:50:19 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:19 INFO Running /u01/app/postgres/product/12/db_0/bin/pg_basebackup -w -h 192.168.22.71 -p 5432 --pgdata /u02/pgdata/12/backup -U pgautofailover_replicator --write-recovery-conf --max-rate 100M --wal-method=stream --slot pgautofailover_standby ...
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:20 INFO could not identify current directory: No such file or directory
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: pg_basebackup: initiating base backup, waiting for checkpoint to complete
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: pg_basebackup: checkpoint completed
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: pg_basebackup: write-ahead log start point: 0/A1000028 on timeline 2
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: pg_basebackup: starting background WAL receiver
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: 0/23702 kB (0%), 0/1 tablespace (/u02/pgdata/12/backup/backup_label )
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: 23711/23711 kB (100%), 0/1 tablespace (...data/12/backup/global/pg_control)
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: 23711/23711 kB (100%), 1/1 tablespace
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: pg_basebackup: write-ahead log end point: 0/A1000100
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: pg_basebackup: waiting for background process to finish streaming ...
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: pg_basebackup: syncing data to disk ...
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: pg_basebackup: base backup completed
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:20 INFO Postgres is not running, starting postgres
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:20 INFO /u01/app/postgres/product/12/db_0/bin/pg_ctl --pgdata /u02/pgdata/12/PG1 --options "-p 5432" --options "-h *" --wait start
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:20 ERROR Failed to start PostgreSQL. pg_ctl start returned: 1
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:20 ERROR Failed to become standby server, see above for details
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:20 ERROR Failed to transition from state "demoted" to state "catchingup", see above.
Nov 05 18:50:20 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:20 ERROR Failed to transition to state "catchingup", retrying...
Nov 05 18:50:25 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:25 INFO Calling node_active for node default/1/0 with current state: demoted, PostgreSQL is not running, sync_state is "", current lsn is "0/0".
Nov 05 18:50:25 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:25 INFO FSM transition from "demoted" to "catchingup": A new primary is available. First, try to rewind. If that fails, do a pg_basebackup.
Nov 05 18:50:25 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:25 INFO The primary node returned by the monitor is 192.168.22.71:5432
Nov 05 18:50:25 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:25 INFO Rewinding PostgreSQL to follow new primary 192.168.22.71:5432
Nov 05 18:50:25 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:25 ERROR Connection to database failed: could not connect to server: No such file or directory
Nov 05 18:50:25 pg-af1.it.dbi-services.com pg_autoctl[2636]: Is the server running locally and accepting
Nov 05 18:50:25 pg-af1.it.dbi-services.com pg_autoctl[2636]: connections on Unix domain socket "/tmp/.s.PGSQL.5432"?
Nov 05 18:50:25 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:25 ERROR Failed to get the postgresql.conf path from the local postgres server, see above for details
Nov 05 18:50:25 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:25 WARN Failed to rewind demoted primary to standby, trying pg_basebackup instead
Nov 05 18:50:25 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:25 INFO Initialising PostgreSQL as a hot standby
Nov 05 18:50:25 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:25 INFO Target directory exists: "/u02/pgdata/12/PG1", stopping PostgreSQL
Nov 05 18:50:25 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:25 INFO pg_ctl: no server running
Nov 05 18:50:25 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:25 ERROR could not identify current directory: No such file or directory
Nov 05 18:50:25 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:25 INFO pg_ctl stop failed, but PostgreSQL is not running anyway
Nov 05 18:50:25 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:25 INFO Running /u01/app/postgres/product/12/db_0/bin/pg_basebackup -w -h 192.168.22.71 -p 5432 --pgdata /u02/pgdata/12/backup -U pgautofailover_replicator --write-recovery-conf --max-rate 100M --wal-method=stream --slot pgautofailover_standby ...
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:27 INFO could not identify current directory: No such file or directory
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: pg_basebackup: initiating base backup, waiting for checkpoint to complete
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: pg_basebackup: checkpoint completed
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: pg_basebackup: write-ahead log start point: 0/A2000028 on timeline 2
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: pg_basebackup: starting background WAL receiver
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: 0/23702 kB (0%), 0/1 tablespace (/u02/pgdata/12/backup/backup_label )
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: 11378/23702 kB (48%), 0/1 tablespace (...pgdata/12/backup/base/12723/4150)
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: 23711/23711 kB (100%), 0/1 tablespace (...data/12/backup/global/pg_control)
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: 23711/23711 kB (100%), 1/1 tablespace
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: pg_basebackup: write-ahead log end point: 0/A2000100
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: pg_basebackup: waiting for background process to finish streaming ...
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: pg_basebackup: syncing data to disk ...
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: pg_basebackup: base backup completed
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:27 INFO Postgres is not running, starting postgres
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:27 INFO /u01/app/postgres/product/12/db_0/bin/pg_ctl --pgdata /u02/pgdata/12/PG1 --options "-p 5432" --options "-h *" --wait start
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:27 ERROR Failed to start PostgreSQL. pg_ctl start returned: 1
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:27 ERROR Failed to become standby server, see above for details
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:27 ERROR Failed to transition from state "demoted" to state "catchingup", see above.
Nov 05 18:50:27 pg-af1.it.dbi-services.com pg_autoctl[2636]: 18:50:27 ERROR Failed to transition to state "catchingup", retrying...
from pg_auto_failover.