Comments (7)
Thanks Dimitri,
I was not aware of this documentation. Maybe it would be a good idea to add it to the README.md. I will check what you recommended and let you know how it worked out.
Best regards
Daniel
from pg_auto_failover.
Hi Dimitri,
When installing from source on CentOS/RedHat 8, the following should be added to the documentation; otherwise the systemd service will not start:
sudo semanage fcontext -a -t bin_t [PATH_TO]/pg_autoctl
restorecon -v [PATH_TO]/pg_autoctl
The state is reported fine now; that was an issue with the DNS.
Best regards
Daniel
from pg_auto_failover.
Hi Daniel! Thanks for your interest in pg_auto_failover and for your patience while the team was busy and traveling!
The CATCHINGUP state is documented with the others at https://pg-auto-failover.readthedocs.io/en/latest/fsm.html#state-reference and we can read:
The monitor assigns catchingup to the standby node when the primary is ready for a replication connection (pg_hba.conf has been properly edited, connection role added, etc).
The standby node keeper runs pg_basebackup, connecting to the primary’s nodename and port. The keeper then edits recovery.conf and starts PostgreSQL in hot standby node.
So the replication is now ongoing and the monitor needs some more information before it can ask the primary to add the secondary to the synchronous_standby_names. Namely:
- the monitor health check must succeed; you can have a look at the pgautofailover.node table for more information
- the WAL lag must be kept under 1 WAL segment by default; you can change this setting on the Monitor side (https://pg-auto-failover.readthedocs.io/en/latest/ref/configuration.html#pg-auto-failover-monitor).
I guess that one of those elements had not yet converged to an acceptable state while you were still in catchingup. As soon as they resolve, the monitor will assign the roles primary and secondary and the replication will switch to SYNC.
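To make the two conditions concrete, here is a minimal sketch of the check in Python. It is not the monitor's actual code: the function name and parameters are hypothetical, and the threshold assumes PostgreSQL's default WAL segment size of 16 MB.

```python
# Illustrative only: a hypothetical helper, not part of pg_auto_failover.
# Assumes the default PostgreSQL WAL segment size of 16 MB.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def can_leave_catchingup(health_check_ok: bool,
                         wal_lag_bytes: int,
                         max_lag_bytes: int = WAL_SEGMENT_BYTES) -> bool:
    """Return True when both conditions from the comment above hold:
    the health check succeeds and the WAL lag is under the threshold."""
    return health_check_ok and wal_lag_bytes < max_lag_bytes


if __name__ == "__main__":
    # A healthy standby lagging by 1 MB satisfies both conditions.
    print(can_leave_catchingup(True, 1024 * 1024))
    # A healthy standby lagging by two full segments does not.
    print(can_leave_catchingup(True, 2 * WAL_SEGMENT_BYTES))
```

If a node stays in catchingup, checking these two inputs separately (health check result first, then lag) should show which condition is holding it back.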
from pg_auto_failover.
I'm not sure where to add the extra SELinux setup commands in the documentation. Would that be somewhere in https://pg-auto-failover.readthedocs.io/en/latest/install.html#installing-a-pgautofailover-systemd-unit then?
from pg_auto_failover.
Now I remember why I used the --nodename parameter:
[postgres@pg-af3 ~]$ ./_reset.sh
17:51:00 INFO Using --nodename "pg-af3", which resolves to IP address "192.168.22.72"
17:51:02 INFO Initialising a PostgreSQL cluster at "/u02/pgdata/12/af"
17:51:02 INFO /u01/app/postgres/product/12/db_0/bin/pg_ctl --pgdata /u02/pgdata/12/af --options "-p 5432" --options "-h *" --wait start
17:51:03 INFO Granting connection privileges on 192.168.22.0/24
17:51:03 INFO Your pg_auto_failover monitor instance is now ready on port 5432.
17:51:03 INFO pg_auto_failover monitor is ready at postgres://autoctl_node@pg-af3:5432/pg_auto_failover
17:51:03 INFO Monitor has been succesfully initialized.
17:51:03 INFO Found pg_ctl for PostgreSQL 12.0 at /u01/app/postgres/product/12/db_0/bin/pg_ctl
17:51:03 INFO Registered node pg-af1.it.dbi-services.com:5432 with id 1 in formation "default", group 0.
17:51:03 INFO Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/u02/pgdata/12/PG1/pg_autoctl.init"
17:51:03 INFO Successfully registered as "single" to the monitor.
17:51:05 INFO Initialising a PostgreSQL cluster at "/u02/pgdata/12/PG1"
17:51:05 INFO Postgres is not running, starting postgres
17:51:05 INFO /u01/app/postgres/product/12/db_0/bin/pg_ctl --pgdata /u02/pgdata/12/PG1 --options "-p 5432" --options "-h *" --wait start
17:51:05 INFO CREATE DATABASE postgres;
17:51:05 INFO The database "postgres" already exists, skipping.
17:51:06 INFO FSM transition from "init" to "single": Start as a single node
17:51:06 INFO Initialising postgres as a primary
17:51:06 INFO Transition complete: current state is now "single"
17:51:06 INFO Keeper has been succesfully initialized.
17:51:06 WARN Failed to connect: Permission denied
17:51:06 FATAL Failed to find a local IP address, please provide --nodename.
17:51:06 FATAL Failed to auto-detect the hostname of this machine, please provide one via --nodename
17:51:07 WARN Failed to connect: Permission denied
17:51:07 FATAL Failed to find a local IP address, please provide --nodename.
17:51:07 FATAL Failed to auto-detect the hostname of this machine, please provide one via --nodename
Name | Port | Group | Node | Current State | Assigned State
---------------------------+--------+-------+-------+-------------------+------------------
pg-af1.it.dbi-services.com | 5432 | 0 | 1 | single | single
[postgres@pg-af3 ~]$
Without the nodename the initialization of the replica fails with:
17:51:06 WARN Failed to connect: Permission denied
17:51:06 FATAL Failed to find a local IP address, please provide --nodename.
17:51:06 FATAL Failed to auto-detect the hostname of this machine, please provide one via --nodename
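Since the successful run logs the IP address that the --nodename value resolves to, it can help to confirm up front that the chosen name resolves on each node. Below is a small sketch using only Python's standard library. The helper name is hypothetical, and this is not how pg_autoctl performs its own detection.

```python
import socket
from typing import Optional


def resolve_nodename(nodename: str) -> Optional[str]:
    """Return the IPv4 address the name resolves to, or None on failure.

    Hypothetical helper for a pre-flight check; it only asks the local
    resolver (DNS, /etc/hosts) and does not contact the node itself.
    """
    try:
        return socket.gethostbyname(nodename)
    except socket.gaierror:
        return None


if __name__ == "__main__":
    # "pg-af3" is the node name used in the log output above.
    for name in ("pg-af3", "127.0.0.1"):
        print(name, "->", resolve_nodename(name))
```

A result of None for the name you intend to pass to --nodename points to a DNS or /etc/hosts problem rather than a pg_auto_failover one.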
from pg_auto_failover.
Closing this issue for triaging purposes. Feel free to re-open if more actions are expected.
from pg_auto_failover.